Jais 13b Chat

Arabic-centric chat model

Have you ever wondered how a large language model can understand and respond to both Arabic and English? Meet Jais 13b Chat, a 13 billion parameter model that's been fine-tuned to handle both languages with ease. Based on the transformer-based decoder-only architecture, this model uses SwiGLU non-linearity and ALiBi position embeddings to provide improved context handling and precision. But what really sets it apart is its ability to converse on a wide range of topics, with a focus on the Arab world. Whether you're looking for a helpful assistant or a model that can handle complex conversations, Jais 13b Chat is up to the task. With its efficient design and ability to generate accurate responses, this model is perfect for researchers, developers, and businesses looking to integrate Arabic language capabilities into their apps. Just remember to use it responsibly and within its limitations.

inceptionai · apache-2.0

Model Overview

The Jais-13b-chat model is a bilingual Arabic and English language model. It is a 13-billion-parameter model, fine-tuned from the Jais-13b base on curated prompt-response pairs so that it can follow instructions and converse naturally in both languages.

Capabilities

The model is capable of generating human-like responses in both Arabic and English, with a particular focus on the Arab world. It can converse on a wide range of topics, from politics and history to entertainment and culture.

  • Primary Tasks
    • Answering questions on various topics
    • Generating text in Arabic and English
    • Engaging in conversations
  • Strengths
    • Large-scale training data: The model was trained on a massive dataset of 116 billion Arabic tokens and 279 billion English tokens.
    • Fine-tuned for safety: The model is fine-tuned with safety-oriented instructions to ensure respectful and honest responses.
    • Improved context handling: The model uses ALiBi position embeddings, enabling it to extrapolate to long sequence lengths and provide more accurate responses.
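
To make the ALiBi mechanism concrete, here is a minimal PyTorch sketch (not from the Jais codebase; the head count, shapes, and slope formula for power-of-two head counts are illustrative assumptions) of how the linear position bias is built and added to attention logits:

import torch

def alibi_bias(num_heads, seq_len):
    # Head-specific slopes from the ALiBi paper: a geometric sequence
    # 2^(-8/n), 2^(-16/n), ... for n heads (assumes a power-of-two head count)
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])
    # distance[i, j] = j - i: zero on the diagonal and increasingly negative for
    # keys further in the past (future positions are removed by the causal mask)
    positions = torch.arange(seq_len)
    distance = positions[None, :] - positions[:, None]
    # Each head adds slope * distance to its raw attention logits before softmax,
    # penalizing distant tokens linearly; no learned position embeddings are needed
    return slopes[:, None, None] * distance[None, :, :]  # (num_heads, seq_len, seq_len)

# Usage inside attention (shapes illustrative):
# logits = q @ k.transpose(-1, -2) / d_head**0.5 + alibi_bias(num_heads, seq_len)

Because the bias is a fixed linear function of distance rather than a learned embedding table, it extrapolates naturally to sequence lengths longer than those seen during training.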

Performance

The model showcases remarkable performance in various tasks. Let’s dive into its speed, accuracy, and efficiency.

  • Speed: The model can process and generate text at an impressive pace, making it an excellent choice for applications that require fast and accurate responses.
  • Accuracy: The model has been fine-tuned on a massive dataset of 4 million Arabic and 6 million English prompt-response pairs, making it highly accurate in understanding and responding to user queries.
  • Efficiency: The model is designed to be efficient in its responses, providing relevant and helpful answers while avoiding harmful or unethical content.

Examples

  • Q: What is the capital of the United Arab Emirates? A: The capital of the United Arab Emirates is Abu Dhabi.
  • Q: What is the highest mountain in the UAE? A: The highest mountain in the UAE is Jebel Jais.
  • Q: How do I greet someone in Arabic? A: You can greet someone in Arabic by saying 'Marhaba' (مرحبا), which means 'hello'.

Use Cases

The model can be used for a variety of applications, including:

  • Chat assistants
  • Customer service
  • Research and development in Arabic natural language processing
  • Commercial use

Limitations

While the model is powerful, it’s essential to understand its limitations and potential risks.

  • Out-of-Scope Use: The model should not be used in any manner that violates applicable laws or regulations.
  • Bias, Risks, and Limitations: The model may still exhibit some bias, as with all large language models.
  • Potential Risks: The model may generate incorrect or misleading content, or produce content that is offensive or inappropriate.

Format

The model uses a transformer-based decoder-only (GPT-3) architecture with SwiGLU non-linearity and ALiBi position embeddings. This enables the model to extrapolate to long sequence lengths, improving context handling and precision. A short sketch of the SwiGLU block follows the list below.

  • Supported Data Formats: Text-only data
  • Output: The model generates text
  • Special Requirements: The model uses a custom model class, so users must pass trust_remote_code=True when loading it.
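
As context for the SwiGLU non-linearity mentioned above, here is a minimal PyTorch sketch of a gated feed-forward block of that form. The layer sizes and the omission of biases are illustrative assumptions, not Jais's actual configuration:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    # Feed-forward block with SwiGLU gating: out(SiLU(x W) * (x V))
    def __init__(self, d_model, d_hidden):
        super().__init__()
        self.w = nn.Linear(d_model, d_hidden, bias=False)    # gate branch
        self.v = nn.Linear(d_model, d_hidden, bias=False)    # value branch
        self.out = nn.Linear(d_hidden, d_model, bias=False)  # projection back to d_model

    def forward(self, x):
        # SiLU (a.k.a. Swish) gates the value branch element-wise
        return self.out(F.silu(self.w(x)) * self.v(x))

# Example usage with illustrative sizes:
# ffn = SwiGLU(d_model=512, d_hidden=2048)
# y = ffn(torch.randn(1, 16, 512))  # (batch, seq_len, d_model)

The multiplicative gate lets the network modulate the feed-forward path per element, which has been reported to train better than a plain ReLU MLP at comparable parameter counts.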

Example Code

Here’s an example of how to use the model:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "core42/jais-13b-chat"
# The full system instruction is truncated here ("....") for brevity; see the model card
# for the complete text. The template must end with the input/response markers so that
# format_map() and the response split below work.
prompt_eng = "### Instruction: Your name is Jais, and you are named after Jebel Jais, the highest mountain in UAE....\n### Input: [|Human|] {Question}\n### Response: [|AI|]"

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)

def get_response(text, tokenizer=tokenizer, model=model):
    # Tokenize the prompt and move it to the model's device
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    inputs = input_ids.to(device)
    input_len = inputs.shape[-1]
    generate_ids = model.generate(
        inputs,
        top_p=0.9,
        temperature=0.3,
        max_length=2048,  # total cap (prompt + generation); Jais-13b-chat has a 2048-token context
        min_length=input_len + 4,
        repetition_penalty=1.2,
        do_sample=True,
    )
    # Decode the full sequence, then keep only the text after the response marker
    response = tokenizer.batch_decode(
        generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
    )[0]
    response = response.split("### Response: [|AI|]")[-1].strip()
    return response

ques = "What is the capital of UAE?"
text = prompt_eng.format_map({'Question': ques})
print(get_response(text))

Note that the model can be exposed via Hugging Face inference endpoints, and the recommended instance type is GPU (large) with 4x Nvidia Tesla T4 or greater.
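
As a sketch of querying such an endpoint once deployed (the URL and token below are placeholders, and the payload follows the standard text-generation request schema):

import requests

API_URL = "https://your-endpoint.endpoints.huggingface.cloud"  # placeholder: your endpoint URL
headers = {"Authorization": "Bearer hf_..."}  # placeholder: your Hugging Face access token

payload = {
    "inputs": text,  # the formatted prompt built above
    "parameters": {"top_p": 0.9, "temperature": 0.3, "repetition_penalty": 1.2, "do_sample": True},
}
response = requests.post(API_URL, headers=headers, json=payload)
print(response.json())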

Dataloop's AI Development Platform

Build end-to-end workflows

Dataloop is a complete AI development stack, allowing you to make data, elements, models and human feedback work together easily.

  • Use one centralized tool for every step of the AI development process.
  • Import data from external blob storage, internal file system storage or public datasets.
  • Connect to external applications using a REST API & a Python SDK (a minimal SDK sketch follows this list).
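
As an illustration of the Python SDK route, here is a minimal sketch using Dataloop's dtlpy package; the project and dataset names are placeholders:

import dtlpy as dl

# Authenticate (opens a browser login if the cached token is missing or expired)
if dl.token_expired():
    dl.login()

# Fetch an existing project and dataset (placeholder names)
project = dl.projects.get(project_name="my-project")
dataset = project.datasets.get(dataset_name="my-dataset")

# Upload a local file as a new item in the dataset
item = dataset.items.upload(local_path="/path/to/file.jpg")
print(item.id)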

Save, share, reuse

Every single pipeline can be cloned, edited and reused by other data professionals in the organization. Never build the same thing twice.

  • Use existing, pre-created pipelines for RAG, RLHF, RLAIF, Active Learning & more.
  • Deploy multi-modal pipelines with one click across multiple cloud resources.
  • Use versions for your pipelines to make sure the deployed pipeline is the stable one.

Easily manage pipelines

Spend less time dealing with the logistics of owning multiple data pipelines, and get back to building great AI applications.

  • Easy visualization of the data flow through the pipeline.
  • Identify & troubleshoot issues with clear, node-based error messages.
  • Use scalable AI infrastructure that can grow to support massive amounts of data.