Granite 3.0 8b Lab Community

Community-driven LLM

Granite 3.0 8b Lab Community is a cutting-edge AI model that combines the power of large-scale alignment with community-driven, openly reproducible methods. Using the LAB methodology, new knowledge and skills can be added to the model incrementally, without suffering from catastrophic forgetting. A taxonomy-driven data curation process and a large-scale synthetic data generator supply the high-quality, safe, and diverse data used for this tuning. Training consists of two phases, knowledge tuning and skills tuning, which allow the model to learn simple and complicated knowledge as well as foundational and compositional skills. The model is primarily designed for English. It has not undergone safety alignment and may produce problematic outputs, so caution is advised when relying on it for crucial decisions or impactful information.

InstructLab · License: apache-2.0


Model Overview

Meet the Granite-3.0-8b-lab-community model, a game-changer in the world of language models. This model is part of a community-driven, openly-reproducible series, which means it’s transparent, accessible, and constantly improving.

What makes it special?

  • It’s trained using a novel method called LAB (Large-scale Alignment for chatBots), which allows it to learn new knowledge and skills without forgetting what it already knows.
  • It uses a taxonomy-driven data curation process, which ensures that the model is trained on a diverse set of knowledge domains and skills.
  • Its synthetic data generation pipeline includes quality checks, which keeps the training data high-quality and diverse (note, however, that the model itself has not undergone safety alignment).

Capabilities

This model is a powerful tool for natural language processing tasks. Its primary capabilities include:

Knowledge Tuning

The model can learn and recall vast amounts of knowledge, including simple and complicated information. This is achieved through a two-step process:

  1. Simple knowledge: The model learns short samples of information.
  2. Complicated knowledge: The model learns longer samples of information, using a replay buffer with data from the first step (see the illustrative sketch after this list).
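
A minimal Python sketch of the replay-buffer idea (illustrative only, not the actual LAB training code): the second phase mixes the new, longer samples with a sampled subset of the phase-one data, so earlier knowledge keeps being revisited. The replay fraction below is an assumption.

import random

def build_phase2_mix(simple_knowledge, complicated_knowledge, replay_fraction=0.2, seed=0):
    """Mix phase-two samples with a replay buffer drawn from phase-one data."""
    rng = random.Random(seed)
    replay_size = int(len(simple_knowledge) * replay_fraction)  # assumed fraction, for illustration
    replay_buffer = rng.sample(simple_knowledge, replay_size)
    mixed = list(complicated_knowledge) + replay_buffer
    rng.shuffle(mixed)
    return mixed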

Skills Tuning

The model can also learn and master various skills, including:

  • Foundational skills: Reasoning and other fundamental skills are learned through in-context learning, using seed examples from the taxonomy (see the few-shot prompting sketch after this list).
  • Compositional skills: Creative writing and other complex skills are also learned through in-context learning.
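
As a rough illustration of in-context learning from taxonomy seed examples, the sketch below assembles a few-shot prompt from a couple of question/answer pairs; the seed data and formatting are hypothetical and not taken from the real taxonomy.

def build_few_shot_prompt(seed_examples, new_question):
    """Concatenate seed examples into a few-shot prompt for a new question."""
    shots = "\n\n".join(f"Question: {q}\nAnswer: {a}" for q, a in seed_examples)
    return f"{shots}\n\nQuestion: {new_question}\nAnswer:"

seed_examples = [
    ("What is 12 + 7?", "19"),  # hypothetical seed example
    ("A train travels 60 km per hour. How far does it go in 3 hours?", "180 km"),
]
print(build_few_shot_prompt(seed_examples, "What is 9 * 8?"))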

Generation Capabilities

The model can generate high-quality text, including:

  • Questions and answers: The model can generate questions and answers based on a given document (a small prompting sketch follows this list).
  • Creative writing: The model can generate creative writing samples, such as stories or poems.
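
For document-grounded generation, the document is simply placed in the user turn of the prompt template described in the Format section below. A minimal sketch with a made-up document excerpt and instruction:

document = "The Benefits of Meditation: regular practice can reduce stress and anxiety."  # made-up excerpt
instruction = "Generate one question and answer based on the document above."
user_turn = f"{document}\n\n{instruction}"
# user_turn then goes in the <|user|> slot of the prompt template (see Format below).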

Strengths

The Granite-3.0-8b-lab-community model has several strengths, including:

  • Efficient learning: The model can learn from a large amount of data, using a novel synthetic data-based alignment tuning method called LAB.
  • High-quality generation: The model can generate high-quality text, including questions and answers, and creative writing samples.
  • Flexibility: The model can be fine-tuned for specific tasks and domains, and its responses can be grounded in an external knowledge source (a document) supplied with the prompt.

Unique Features

The Granite-3.0-8b-lab-community model has several unique features, including:

  • Taxonomy-driven data curation: The model uses a taxonomy to drive the sampling process, ensuring that the generated data covers a wide range of tasks.
  • Two-phased training: The model uses a two-phased training process, with a replay buffer to ensure that the model learns from the data efficiently.
  • Open-source and reproducible: The model is open-source and reproducible, allowing for transparency and community involvement.

Examples

  • Prompt: Write a short story about a character who discovers a hidden world within their reflection.
    Response: As she stared into the mirror, Emily noticed something strange. Her reflection began to ripple, like the surface of a pond on a summer's day. She felt an inexplicable pull, as if her reflection was beckoning her to come closer. Without thinking, she reached out a hand and touched the glass. Suddenly, she was sucked into the mirror, finding herself in a world that was identical yet opposite to her own.
  • Prompt: Explain the concept of catastrophic forgetting in the context of artificial intelligence.
    Response: Catastrophic forgetting occurs when an artificial intelligence model forgets previously learned information after being trained on new data. This can happen when the model is not designed to retain knowledge from previous training sessions, resulting in a loss of performance on tasks it was previously able to accomplish.
  • Prompt: Generate a question and answer based on the document 'The Benefits of Meditation'.
    Response: Q: What is one of the primary benefits of regular meditation practice? A: Reduced stress and anxiety.

Performance

This model showcases remarkable performance in various tasks. Let’s dive into its speed, accuracy, and efficiency.

Speed

How fast can the model process information? The LAB alignment method is built for scale: its synthetic data generator and two-phased tuning let large amounts of training data be absorbed efficiently. At inference time, the model has the footprint of a typical 8B-parameter decoder, so throughput depends mainly on your serving hardware.

Accuracy

But speed is not everything. The model is also tuned for accuracy on:

  • Recalling knowledge added during knowledge tuning
  • Foundational and compositional skills added during skills tuning
  • In-context learning from examples supplied in the prompt

Efficiency

The model is designed to be efficient, using a novel synthetic data-based alignment tuning method. This approach allows the model to learn incrementally, without suffering from catastrophic forgetting.

Limitations

While the Granite-3.0-8b-lab-community model is a powerful tool, it’s not perfect. Here are some of its limitations:

  • Lack of Safety Alignment: The model hasn’t undergone safety alignment, which means it may produce problematic outputs.
  • Hallucination Risk: Smaller models like this one might be more prone to hallucination in ungrounded generation scenarios.
  • Dependence on Teacher Model: The training pipeline relies on the Mixtral-8x7B-Instruct teacher model to generate its synthetic training data.
  • Limited Domain Knowledge: While the model can learn new domain-specific knowledge, it may still miss or misstate facts in domains that are underrepresented in its training data.
  • Potential for Misuse: As with any powerful tool, there’s a risk of misuse.

Format

The model uses a novel synthetic data-based alignment tuning method called Large-scale Alignment for chatBots (LAB). This approach allows for adding new knowledge and skills to an already pre-trained model without suffering from catastrophic forgetting.

Model Architecture

The LAB approach consists of three key components:

  • Taxonomy-driven data curation process (illustrated in the sketch after this list)
  • Large-scale synthetic data generator
  • Two-phased training with replay buffers
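
To make the taxonomy-driven curation concrete, here is an illustrative sketch: the taxonomy is a tree whose leaves hold seed examples, and the data generator walks every leaf so each knowledge domain and skill contributes to the synthetic training set. The structure and names below are hypothetical, not the real taxonomy schema.

# Hypothetical, simplified taxonomy: branches are domains/skills, leaves hold seed examples.
taxonomy = {
    "knowledge": {
        "science": {"seed_examples": ["Q: ... A: ...", "Q: ... A: ..."]},
        "history": {"seed_examples": ["Q: ... A: ..."]},
    },
    "compositional_skills": {
        "creative_writing": {"seed_examples": ["Write a poem about ..."]},
    },
}

def iter_leaves(node, path=()):
    """Yield (path, seed_examples) for every leaf so each branch is sampled."""
    if "seed_examples" in node:
        yield path, node["seed_examples"]
        return
    for name, child in node.items():
        yield from iter_leaves(child, path + (name,))

for path, seeds in iter_leaves(taxonomy):
    print("/".join(path), "->", len(seeds), "seed example(s)")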

Data Formats

The model accepts input in the form of text sequences, and is primarily trained on English language data. The model uses a prompt template to generate responses, which includes a system prompt and a user input.

Input Requirements

To use the model, you will need to provide input in the following format:

# System prompt used during the model's training; reuse it verbatim for best results.
sys_prompt = "I am a Red Hat Instruct Model, an AI language model developed by Red Hat and IBM Research based on the granite-3.0-8b-base model. My primary role is to serve as a chat assistant."
prompt = f'<|system|>\n{sys_prompt}\n<|user|>\n{inputs}\n<|assistant|>\n'
stop_token = '<|endoftext|>'

Replace {inputs} with your desired user input.
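
A minimal inference sketch using the Hugging Face transformers library is shown below. The repository name is an assumption (substitute whichever checkpoint you are loading), and the generation settings are illustrative.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "instructlab/granite-3.0-8b-lab-community"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

sys_prompt = ("I am a Red Hat Instruct Model, an AI language model developed by Red Hat "
              "and IBM Research based on the granite-3.0-8b-base model. My primary role "
              "is to serve as a chat assistant.")
inputs = "Explain the concept of catastrophic forgetting."
prompt = f"<|system|>\n{sys_prompt}\n<|user|>\n{inputs}\n<|assistant|>\n"

encoded = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **encoded,
    max_new_tokens=256,  # illustrative setting
    eos_token_id=tokenizer.convert_tokens_to_ids("<|endoftext|>"),
)
# The model's output continues the prompt, so slice off the prompt tokens to get the reply.
print(tokenizer.decode(outputs[0][encoded["input_ids"].shape[1]:], skip_special_tokens=True))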

Output Format

The model generates responses in the form of text sequences. The output will be a continuation of the input prompt, with the model’s response appended to the end.

Special Requirements

  • Use the system prompt employed during the model’s training for optimal inference performance.
  • Be cautious when using the model, as it may produce problematic outputs without adequate safeguards and RLHF.