KORani V3 13B
KORani V3 13B is a large language model for Korean and English tasks. It is built on the LLaMA 13B model and fine-tuned on the ShareGPT and KoVicuna datasets. The model targets translation, question answering, and summarization, and its outputs have been scored alongside models such as GPT-3.5-turbo, Vicuna-13B, and KoAlpaca-13B. With its blend of Korean and English capabilities, KORani V3 13B is a useful tool for bilingual work, from research to applied settings.
Model Overview
The KORani-v3-13B model is designed for Korean and English language tasks. The KORani family builds on the LLaMA 13B and Polyglot 12.8B base models; KORani-v3-13B itself is based on LLaMA 13B and fine-tuned using the ShareGPT and KoVicuna datasets.
Capabilities
The model handles a range of tasks, including translation, question answering, and summarization; the examples below illustrate each.
Translation Capabilities
KORani-v3-13B can translate English text into Korean. For instance, given the input “Hey! I have some Kissflow Legal Review requests that are blocked by Tax and Accounting, can this be looked at?”, the model returns a natural Korean rendering of the request. A rough sketch of how such a request might be issued programmatically follows.
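As a minimal sketch, a translation request like this could be sent through the Hugging Face transformers text-generation pipeline. The Hub repository name (KRAFTON/KORani-v3-13B) and the prompt wording are assumptions, not an official template:

from transformers import pipeline

# Minimal sketch: translation by prompting a text-generation pipeline.
# The Hub repository name and the prompt wording are assumptions.
generator = pipeline(
    "text-generation",
    model="KRAFTON/KORani-v3-13B",
    device_map="auto",
    torch_dtype="auto",
)

english = ("Hey! I have some Kissflow Legal Review requests that are blocked "
           "by Tax and Accounting, can this be looked at?")
prompt = f"Translate the following English sentence into Korean.\nEnglish: {english}\nKorean:"

result = generator(prompt, max_new_tokens=128, do_sample=False)
# The pipeline echoes the prompt, so keep only the newly generated part.
print(result[0]["generated_text"][len(prompt):].strip())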
Question Answering Capabilities
The model can also answer questions based on a given text. When asked “헨리 구스타프의 뇌는 현재 어디에 위치해 있는가?” (Where is Henry Gustav’s brain currently located?), the model responds with “헨리 구스타프의 뇌는 현재 미국 샌디에이고에 있습니다.” (Henry Gustav’s brain is currently located in San Diego, USA).
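A similar prompting pattern can be used for question answering over a passage. The Korean prompt layout below is only an illustrative assumption, and the context placeholder stands in for the source passage:

from transformers import pipeline

# Minimal sketch: question answering by prompting with a context passage.
# The Hub repository name and the Korean prompt layout are assumptions.
generator = pipeline("text-generation", model="KRAFTON/KORani-v3-13B",
                     device_map="auto", torch_dtype="auto")

context = "..."  # the source passage about Henry Gustav Molaison goes here
question = "헨리 구스타프의 뇌는 현재 어디에 위치해 있는가?"  # "Where is Henry Gustav's brain currently located?"
prompt = f"다음 글을 읽고 질문에 답하세요.\n\n{context}\n\n질문: {question}\n답변:"  # "Read the passage and answer the question."

result = generator(prompt, max_new_tokens=64, do_sample=False)
print(result[0]["generated_text"][len(prompt):].strip())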
Summarization Capabilities
KORani-v3-13B can summarize long documents into shorter, more digestible text. When given a Korean document about the increase in public servants in Gwangyang City, the model outputs a three-sentence summary:
“광양시의 공무원 정원이 전년도 대비 79명 늘어난 1019명으로 증가했다. 이로 인해 행정서비스 향상이 기대되며, 지역과 나라 발전을 뒷받침하는 경제도시로서의 위상이 상승할 것으로 전망된다. 한편, 광양시는 전라남도로부터 4급 상설기구인 ‘산단녹지관리센터’를 승인받아 도시 개발에 탄력을 받을 것으로 예상된다.” (Gwangyang City’s civil-servant quota increased by 79 from the previous year to 1,019. This is expected to improve administrative services and raise the city’s standing as an economic city underpinning regional and national development. In addition, Gwangyang City received approval from Jeollanam-do for the ‘Sandan Green Space Management Center’, a grade-4 permanent body, which is expected to give momentum to urban development.)
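The same prompting pattern extends to summarization, with the instruction constraining the output to three sentences. As before, the Hub repository name and the Korean instruction are assumptions rather than an official template:

from transformers import pipeline

# Minimal sketch: three-sentence summarization by prompting.
# The Hub repository name and the Korean instruction are assumptions.
generator = pipeline("text-generation", model="KRAFTON/KORani-v3-13B",
                     device_map="auto", torch_dtype="auto")

document = "..."  # the Korean article about Gwangyang City's staffing increase goes here
prompt = f"다음 문서를 세 문장으로 요약하세요.\n\n{document}\n\n요약:"  # "Summarize the following document in three sentences."

result = generator(prompt, max_new_tokens=256, do_sample=False)
print(result[0]["generated_text"][len(prompt):].strip())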
Performance
The tables below compare KORani against several reference models on translation, question answering, and summarization. Each score is the average over five examples.
Translation (ENG -> KOR)
Model | Score (averaged over 5 examples) |
---|---|
GPT-4 | - |
DeepL | 9.4 |
GPT-3.5-turbo | 8.6 |
Vicuna-13B | 3.8 |
KoAlpaca-13B | 5.6 |
KORani-v1 | 7.5 |
KORani-v2 | 5.4 |
KORani-v3 | 7.1 |
QA (Korean)
Model | Score (averaged over 5 examples) |
---|---|
GPT-4 | - |
GPT-3.5-turbo | 7.1 |
Vicuna-13B | 7.2 |
KoAlpaca-13B | 2.6 |
KORani-v1 | 6.6 |
KORani-v2 | 8.0 |
KORani-v3 | 6.4 |
Summarization (Korean)
Model | Score (averaged over 5 examples) |
---|---|
GPT-4 | - |
GPT-3.5-turbo | 8.5 |
Vicuna-13B | 7.4 |
KoAlpaca-13B | 5.7 |
KORani-v1 | 7.0 |
KORani-v2 | 6.2 |
KORani-v3 | 6.9 |
Limitations
While the model is capable, it has several limitations worth keeping in mind.
Limited Domain Knowledge
The model’s domain knowledge is limited to the data it was trained on. This means that it may not perform well on tasks that require specialized knowledge or expertise.
Lack of Common Sense
The model doesn’t have the same level of common sense as a human. It may not always understand the nuances of human language or behavior, which can lead to errors or misunderstandings.
Vulnerability to Adversarial Attacks
Like other language models, the model is vulnerable to adversarial attacks. This means that it can be tricked into generating incorrect or misleading outputs by carefully crafted input.
Alternatives
If you’re looking for alternative models, you may want to consider the following options:
- GPT-4
- Vicuna-13B
- KoAlpaca-13B
Examples
Here is an example of how you might load and run the model with the Hugging Face transformers library:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the pre-trained model and tokenizer.
# KORani-v3-13B is a causal (decoder-only) language model, so the generic
# Auto* classes are used; the Hub repository name below is assumed.
model_name = "KRAFTON/KORani-v3-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Tokenize the input prompt
input_text = "안녕하세요! 저는 Kissflow Legal Review 요청이 세금과 회계에 의해 차단되고 있는데, 이 문제가 살펴볼 수 있을까요?"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate a continuation
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens
output_text = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(output_text)
Note: This code example assumes that you have the transformers library installed and can download the pre-trained model and tokenizer for KORani-v3-13B; the Hub repository name used above (KRAFTON/KORani-v3-13B) is an assumption and may need to be adjusted.