OpenThaiGPT 1.5 72B Instruct
OpenThaiGPT 1.5 is a state-of-the-art Thai language chat model with 72 billion parameters, built to handle complex conversations and Thai-specific domain questions. It supports multi-turn conversation, Retrieval Augmented Generation (RAG), and tool calling, which lets the model invoke external functions from within its responses. OpenThaiGPT 1.5 was fine-tuned on over 2 million Thai instruction pairs and achieves the highest average scores across a range of Thai language exams among open-source Thai LLMs. It can process up to 131,072 tokens of input and generate up to 8,192 tokens, making it well suited to long, detailed interactions. You can try it through the online demo or the example code for API calling. Keep in mind that performance may vary with input and context.
Model Overview
The OpenThaiGPT 72b Version 1.5 model is a cutting-edge Thai language chat model with 72 billion parameters. It’s based on the Qwen v2.5 architecture and has been fine-tuned on over 2 million Thai instruction pairs.
Capabilities
Primary Tasks
The model excels in various tasks, including:
- Multi-turn conversation support: Engage in extended dialogues with users, understanding context and responding accordingly.
- Retrieval Augmented Generation (RAG): Enhance response generation by retrieving relevant information from external sources.
- Tool calling support: Efficiently call various functions through intelligent responses, enabling users to access real-time data and perform tasks.
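To make the tool calling idea concrete, here is a minimal sketch of the usual pattern with an OpenAI-style function schema: the application declares tools, the model emits a structured call, and the application dispatches it to real code. The `get_current_weather` function, its return values, and the dispatch logic are illustrative assumptions, not part of the model's API.

```python
import json

# Illustrative OpenAI-style tool schema (the tool itself is a made-up example).
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch_tool_call(tool_call):
    """Route a model-emitted tool call to a local Python function."""
    registry = {"get_current_weather": lambda city: {"city": city, "temp_c": 32}}
    fn = registry[tool_call["name"]]
    args = json.loads(tool_call["arguments"])  # arguments arrive JSON-encoded
    return fn(**args)

# A tool call shaped like the model might emit it:
result = dispatch_tool_call(
    {"name": "get_current_weather", "arguments": '{"city": "Bangkok"}'}
)
```

The result would then be appended to the conversation as a tool message so the model can compose its final answer from it.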
Strengths
- State-of-the-art Thai language LLM: Achieves the highest average scores across various Thai language exams compared to other open-source Thai LLMs.
- Impressive context handling: Processes up to 131,072 tokens of input and generates up to 8,192 tokens, enabling detailed and complex interactions.
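The two limits interact: input and output share the context window, so a long prompt leaves less room for generation. The 131,072 and 8,192 figures below come from this card; the helper function itself is just an illustration of the arithmetic.

```python
CONTEXT_WINDOW = 131_072   # maximum input tokens (from the model card)
MAX_NEW_TOKENS = 8_192     # maximum generated tokens (from the model card)

def generation_budget(prompt_tokens: int) -> int:
    """Return how many new tokens can still be generated for a given prompt length."""
    if prompt_tokens > CONTEXT_WINDOW:
        raise ValueError("prompt exceeds the context window")
    return min(MAX_NEW_TOKENS, CONTEXT_WINDOW - prompt_tokens)
```

For example, a 1,000-token prompt still allows the full 8,192-token generation, while a prompt near the window's edge shrinks the budget accordingly.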
Performance Benchmarks
The model has been evaluated on a suite of Thai language exams, where it achieves the highest average score among the open-source models compared:

| Model | Average Score |
|---|---|
| openthaigpt/openthaigpt1.5-72b-instruct | 76.67 |
| Qwen/Qwen2.5-72B-Instruct | 75.00 |
| scb10x/llama-3-typhoon-v1.5x-70b-instruct | 64.07 |
| Qwen/Qwen2-72B-Instruct | 58.76 |
| meta-llama/Meta-Llama-3.1-70B-Instruct | 58.23 |
| Qwen/Qwen2.5-14B-Instruct | 57.35 |
Usage and Integration
The model can be used through various APIs and libraries, including:
- Free API Service: hosted by Siam.AI and Float16.cloud.
- OpenAI Client Library: self-hosted via vLLM's OpenAI-compatible server.
- Hugging Face: using the `transformers` library.
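When the model is served through vLLM's OpenAI-compatible endpoint, a chat request is an ordinary JSON POST to `/v1/chat/completions`. The sketch below only builds the payload; the localhost URL is a placeholder you would replace with your own server, and the sampling parameters are illustrative.

```python
import json

payload = {
    "model": "openthaigpt/openthaigpt1.5-72b-instruct",
    "messages": [
        # "You are a smart and honest question-answering assistant."
        {"role": "system", "content": "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"},
        # "What is Thailand?"
        {"role": "user", "content": "ประเทศไทยคืออะไร"},
    ],
    "max_tokens": 512,
    "temperature": 0.7,
}
body = json.dumps(payload, ensure_ascii=False)

# To actually send it (placeholder URL for a local vLLM server):
# requests.post("http://localhost:8000/v1/chat/completions",
#               data=body.encode("utf-8"),
#               headers={"Content-Type": "application/json"})
```

The same payload works unchanged with the official `openai` Python client by pointing its `base_url` at the vLLM server.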
Performance
The model showcases remarkable performance in various tasks, especially in handling the Thai language.
Context Length
With support for up to 131,072 input tokens and 8,192 generated tokens, the model is suitable for complex and detailed interactions.
Accuracy
In multiple Thai language exams, the model achieved the highest average scores compared to other open-source Thai LLMs.
Limitations
While the model is advanced, it’s not perfect. Some limitations include:
- Context Handling: May struggle with extremely long or complex conversations.
- Domain Knowledge: Limited to specific domains or topics.
- Zero-Shot Learning: May not always generalize well to completely new or unfamiliar tasks.
Format
The model uses a transformer architecture and supports input in the form of tokenized text sequences, with a specific format for prompts.
Prompt Format
The prompt format is based on ChatML. It consists of three parts:
- system: defines the role of the system.
- user: defines the role of the user.
- assistant: defines the role of the assistant.
Here’s an example of a single-turn conversation prompt:
<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\n
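This format can be assembled mechanically: each message becomes an `<|im_start|>role ... <|im_end|>` block, and a trailing `<|im_start|>assistant` invites the model to reply. The small helper below is only a sketch that reproduces the string above for this simple case; in practice `tokenizer.apply_chat_template` does this for you.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {role, content} messages into the ChatML prompt format."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

chatml_prompt = to_chatml([
    # "You are a smart and honest question-answering assistant."
    {"role": "system", "content": "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"},
    # "Hello!"
    {"role": "user", "content": "สวัสดีครับ"},
])
```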
Code Examples
Here’s an example of how to use the model with the Hugging Face `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "openthaigpt/openthaigpt1.5-72b-instruct"
# A 72B model needs substantial GPU memory; device_map="auto" shards it across devices.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "ประเทศไทยคืออะไร"  # "What is Thailand?"
messages = [
    # "You are a smart and honest question-answering assistant."
    {"role": "system", "content": "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"},
    {"role": "user", "content": prompt},
]

# Render the messages into the ChatML prompt format, tokenize, and generate.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=512)

# Strip the prompt tokens from the output before decoding the reply.
generated_ids = [out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
Note that this is just an example, and you may need to modify the code to suit your specific use case.


