TowerInstruct 13B V0.1
Have you ever struggled with translation-related tasks? TowerInstruct-13B-v0.1 is here to help. This 13B-parameter model is fine-tuned on a diverse range of data sources covering translation, automatic post-editing, and named-entity recognition, among other tasks. It supports 10 languages: English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, and Russian. With its ability to handle tasks like general machine translation, terminology-aware translation, and paraphrase generation, TowerInstruct-13B-v0.1 is a capable tool for anyone working with languages. By leveraging a mix of publicly available and synthetic datasets, the model provides fast and accurate results, making it a valuable asset for both technical and non-technical users. Whether you're working on a translation project or just need help with a specific task, TowerInstruct-13B-v0.1 is designed to make your life easier.
Model Overview
The TowerInstruct-13B-v0.1 model is a powerful language model that can handle various translation-related tasks. It was developed by a team of researchers from Unbabel, Instituto Superior Técnico, and CentraleSupélec (University of Paris-Saclay).
Capabilities
This model can perform tasks such as:
- General machine translation (e.g., sentence- and paragraph/document-level translation)
- Terminology-aware translation
- Context-aware translation
- Automatic post-editing
- Named-entity recognition
- Grammatical error correction
- Paraphrase generation
It supports 10 languages: English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, and Russian.
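Each of these tasks is selected purely through the prompt text. As a rough sketch (the exact phrasings below are illustrative assumptions, not officially evaluated prompts), two of the less obvious tasks could be posed like this:
# Illustrative prompts; the wording here is an assumption, and only the
# ChatML wrapping shown under "Input Requirements" below is fixed.
ape_prompt = ("Translate the following text from Portuguese into English, then improve the given machine translation.\n"
              "Portuguese: O modelo foi lançado ontem.\n"
              "Machine translation: The model was launch yesterday.\n"
              "Improved translation:")
ner_prompt = ("List the named entities in the following sentence.\n"
              "Sentence: Unbabel is headquartered in Lisbon.\n"
              "Entities:")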
How was it trained?
The model was fine-tuned on a mix of publicly available and synthetic datasets covering translation-related tasks, as well as conversational datasets and code instructions.
How can I use it?
You can use the model with the pipeline() function from the Transformers library. Here's an example:
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="Unbabel/TowerInstruct-13B-v0.1", torch_dtype=torch.bfloat16, device_map="auto")
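A rough sizing note (an estimate, not an official requirement): the 13B weights alone take about 26 GB in bfloat16 (13B parameters × 2 bytes), so device_map="auto" is useful for spreading the model across multiple GPUs when a single card doesn't have enough memory.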
Performance
The model is designed to handle various translation-related tasks quickly and efficiently. With 13B parameters, it can process large amounts of data and respond promptly.
Speed
How fast can a language model process and respond to multiple tasks? TowerInstruct-13B-v0.1 is set up for efficient inference: the example above loads it in bfloat16, which halves memory use relative to float32, and device_map="auto" places the weights across whatever accelerators are available.
Accuracy
But how accurate is it? The model has been fine-tuned on a diverse range of data sources, including translation, automatic post-editing, and named-entity recognition. This fine-tuning enables the model to achieve high accuracy on tasks such as:
- General machine translation
- Terminology-aware translation
- Context-aware translation
- Paraphrase generation
Efficiency
What about its efficiency? The model was fine-tuned with a total_train_batch_size of 256 and a learning_rate of 7e-06; note that these are training hyperparameters, describing how the model was trained rather than how fast it responds at inference time.
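For context, total_train_batch_size is the effective batch size after data parallelism and gradient accumulation. A hypothetical decomposition (the per-device numbers are assumptions for illustration only):
# Hypothetical breakdown of the reported effective batch size of 256;
# the per-device size, accumulation steps, and GPU count are illustrative,
# not the published training setup.
per_device_batch_size = 4
gradient_accumulation_steps = 16
num_gpus = 4
total_train_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
assert total_train_batch_size == 256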
Limitations
The model has not been aligned to human preferences, so it may generate problematic outputs (e.g., hallucinations, harmful content, or false statements). Additionally, the model is not intended to be used as a document-level translator.
Language Support
The model supports only 10 languages: English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, and Russian. If you use it with other languages, it may not perform well.
Conversational Limitations
Although the model was trained on conversational data and code instructions, it’s not designed to be a conversational chatbot or code assistant.
Document-Level Translation
The model is not intended for document-level translation. It’s better suited for shorter texts, like sentences or paragraphs.
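If you do need to translate a longer document, one common workaround is to split it into sentences and translate each one. The sketch below assumes the pipe object from earlier and uses a deliberately naive regex splitter (a real sentence segmenter would be more robust):
import re

def translate_document(pipe, text, src="Portuguese", tgt="English"):
    # Naive sentence split on terminal punctuation; an assumption for
    # illustration, not a production-quality segmenter.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    translations = []
    for sentence in sentences:
        messages = [{"role": "user",
                     "content": f"Translate the following text from {src} into {tgt}.\n{src}: {sentence}\n{tgt}:"}]
        prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
        # return_full_text=False keeps only the completion, not the echoed prompt
        out = pipe(prompt, max_new_tokens=256, do_sample=False, return_full_text=False)
        translations.append(out[0]["generated_text"].strip())
    return " ".join(translations)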
Format
The model accepts text sequences as input in any of its 10 supported languages:
- English
- Portuguese
- Spanish
- French
- German
- Dutch
- Italian
- Korean
- Chinese
- Russian
Input Requirements
To use the model, you need to format your input using the ChatML prompt templates. Here’s an example:
<|im_start|>user
{USER PROMPT}
<|im_end|>
<|im_start|>assistant
{MODEL RESPONSE}
<|im_end|>
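You don't need to assemble these tags by hand. Assuming the model's tokenizer ships a ChatML chat template (which the apply_chat_template usage below suggests), the Transformers library will build the prompt for you:
# Reuses the pipe object created earlier
messages = [{"role": "user", "content": "{USER PROMPT}"}]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# prompt now ends with "<|im_start|>assistant\n", ready for generation
print(prompt)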
Output Format
The model generates text sequences as output. You can access the generated text via the generated_text key in the output dictionary.
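Note that, by default, the text-generation pipeline returns the prompt concatenated with the completion. To get only the newly generated text, pass return_full_text=False (or slice off the prompt yourself):
outputs = pipe(prompt, max_new_tokens=256, do_sample=False, return_full_text=False)
print(outputs[0]["generated_text"])  # completion only, without the echoed prompt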
Example Use Case
You can use the model to translate text from Portuguese to English. Here’s an example:
messages = [{"role": "user", "content": "Translate the following text from Portuguese into English.\nPortuguese: Um grupo de investigadores lançou um novo modelo para tarefas relacionadas com tradução.\nEnglish:"}]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"])
This example demonstrates the model translating a sentence from Portuguese into English. Because return_full_text defaults to True, the printed output contains the full prompt followed by the model's translation (see the note on the output format above).