TowerInstruct 13B V0.1

Multilingual translator

Have you ever struggled with translation-related tasks? TowerInstruct-13B-v0.1 is here to help. This 13B-parameter model is fine-tuned on a diverse range of data sources, including translation, automatic post-editing, and named-entity recognition. It supports 10 languages: English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, and Russian. With its ability to handle tasks like general machine translation, terminology-aware translation, and paraphrase generation, TowerInstruct-13B-v0.1 is a powerful tool for anyone working with languages. By leveraging a mix of publicly available and synthetic datasets, the model delivers strong translation quality, making it a valuable asset for both technical and non-technical users. Whether you're working on a translation project or just need help with a specific task, TowerInstruct-13B-v0.1 is designed to make your life easier.

Unbabel cc-by-nc-4.0 Updated 7 months ago

Model Overview

The TowerInstruct-13B-v0.1 model is a powerful language model that can handle various translation-related tasks. It was developed by a team of researchers from Unbabel, Instituto Superior Técnico, and CentraleSupélec (Université Paris-Saclay).

Capabilities

This model can perform tasks such as:

  • General machine translation (e.g., sentence- and paragraph/document-level translation)
  • Terminology-aware translation
  • Context-aware translation
  • Automatic post-editing
  • Named-entity recognition
  • Grammatical error correction
  • Paraphrase generation

It supports 10 languages: English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, and Russian.

How was it trained?

The model was fine-tuned on a mix of publicly available, synthetic datasets on translation-related tasks, as well as conversational datasets and code instructions.

How can I use it?

You can use the model with the pipeline() function from the Transformers library. Here’s an example:

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="Unbabel/TowerInstruct-13B-v0.1", torch_dtype=torch.bfloat16, device_map="auto")

Performance

The model is designed to handle various translation-related tasks efficiently. At 13B parameters, it is large enough to deliver strong translation quality while still fitting on a single high-memory GPU when loaded in bfloat16.

Speed

Inference speed depends primarily on your hardware, batch size, and generation settings such as max_new_tokens. Translation-style prompts usually need only short completions, which keeps latency modest.

Accuracy

The model was fine-tuned on a diverse range of translation-related data, including translation, automatic post-editing, and named-entity recognition. This fine-tuning enables it to achieve high accuracy on tasks such as:

  • General machine translation
  • Terminology-aware translation
  • Context-aware translation
  • Paraphrase generation

Efficiency

The reported total_train_batch_size of 256 and learning_rate of 7e-06 are training hyperparameters: they describe how the model was fine-tuned, not how fast it runs at inference time. For efficient inference, load the model in bfloat16 (as in the pipeline example above) and batch requests where possible.
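For reference, the two published hyperparameters can be collected in a small configuration dictionary. In the sketch below, only total_train_batch_size and learning_rate come from the model card; the per-device batch size and GPU count are illustrative assumptions used to show how the effective batch size is typically reached via gradient accumulation:

```python
# Fine-tuning configuration. Only total_train_batch_size and learning_rate
# come from the model card; the rest are hypothetical placeholders.
train_config = {
    "total_train_batch_size": 256,
    "learning_rate": 7e-06,
}

per_device_batch_size = 8  # assumption, not from the model card
num_gpus = 8               # assumption, not from the model card

# Gradient accumulation steps needed to reach the effective batch size:
grad_accum_steps = train_config["total_train_batch_size"] // (
    per_device_batch_size * num_gpus
)
print(grad_accum_steps)  # 4
```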

Limitations

The model has not been aligned to human preferences, so it may generate problematic outputs (e.g., hallucinations, harmful content, or false statements). Additionally, the model is not intended to be used as a document-level translator.

Language Support

The model only supports 10 languages: English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, and Russian. If you try to use it with other languages, it may not perform well.

Conversational Limitations

Although the model was trained on conversational data and code instructions, it’s not designed to be a conversational chatbot or code assistant.

Document-Level Translation

The model is not intended for document-level translation. It’s better suited for shorter texts, like sentences or paragraphs.
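Since the model works best at sentence or paragraph level, a simple workaround for longer documents is to split them into paragraphs, translate each one, and rejoin the results. A minimal sketch follows; the translate callable is a hypothetical stand-in for a wrapper around the pipeline call:

```python
def translate_document(text, translate):
    """Translate a long document paragraph by paragraph.

    `translate` is any callable mapping one source paragraph to its
    translation (e.g., a wrapper around the Transformers pipeline).
    Note: paragraphs are translated independently, so cross-paragraph
    context is lost -- a known limitation of this approach.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return "\n\n".join(translate(p) for p in paragraphs)

# Usage with a dummy "translator" that just tags each paragraph:
doc = "First paragraph.\n\nSecond paragraph."
print(translate_document(doc, lambda p: "[pt->en] " + p))
```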

Format

The model accepts input in the form of text sequences, supporting multiple languages, including:

  • English
  • Portuguese
  • Spanish
  • French
  • German
  • Dutch
  • Italian
  • Korean
  • Chinese
  • Russian

Input Requirements

To use the model, you need to format your input using the ChatML prompt templates. Here’s an example:

<|im_start|>user
{USER PROMPT}
<|im_end|>
<|im_start|>assistant
{MODEL RESPONSE}
<|im_end|>
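In practice, pipe.tokenizer.apply_chat_template builds this string for you. For illustration, a hand-rolled equivalent of the template shown above could look like the following sketch (this mirrors the layout printed above, not the tokenizer's actual implementation):

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} messages as a ChatML prompt,
    following the template layout shown above."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}\n<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = to_chatml([{"role": "user", "content": "Translate: Olá"}])
print(prompt)
```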

Output Format

The model generates text sequences as output. You can access the generated text using the generated_text key in the output dictionary.
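With the Transformers text-generation pipeline, the output is a list of dictionaries, and by default generated_text contains the prompt followed by the completion (return_full_text=True). A small helper can strip the prompt off; this is a sketch operating on the output structure, using mock data rather than a real pipeline call:

```python
def extract_completion(outputs, prompt):
    """Return only the newly generated text from a text-generation result."""
    text = outputs[0]["generated_text"]
    # By default the pipeline echoes the prompt before the completion.
    return text[len(prompt):] if text.startswith(prompt) else text

# Mock pipeline output for illustration:
mock = [{"generated_text": "PROMPT A group of researchers..."}]
print(extract_completion(mock, "PROMPT "))
```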

Examples
  • Translation
    Input: Translate the following text from Portuguese into English. Portuguese: Um grupo de investigadores lançou um novo modelo para tarefas relacionadas com tradução. English:
    Output: A group of researchers has launched a new model for translation-related tasks.
  • Named-entity recognition
    Input: Recognize named entities in the following sentence: 'The new CEO of Microsoft, Satya Nadella, was born in India.'
    Output: The recognized named entities are: Organization - Microsoft, Person - Satya Nadella, Location - India.
  • Paraphrase generation
    Input: Generate a paraphrase for the sentence: 'The company will release a new product next quarter.'
    Output: A new product is scheduled to be launched by the company in the upcoming quarter.

Example Use Case

You can use the model to translate text from Portuguese to English. Here’s an example:

messages = [{"role": "user", "content": "Translate the following text from Portuguese into English.\nPortuguese: Um grupo de investigadores lançou um novo modelo para tarefas relacionadas com tradução.\nEnglish:"}]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"])

This example demonstrates translating a Portuguese sentence into English. Note that the printed generated_text includes the prompt followed by the model’s translation.
