KORani V3 13B

Korean language model

KORani V3 13B is a large language model designed for Korean and English tasks. It is built on the LLaMA 13B model and fine-tuned using the ShareGPT and KoVicuna datasets. The model targets translation, question answering, and summarization. In the evaluation reported below, it scores well above Vicuna-13B on English-to-Korean translation (7.1 vs. 3.8) and somewhat below it on Korean QA (6.4 vs. 7.2) and summarization (6.9 vs. 7.4); no score is reported for GPT-4. Its combined Korean and English coverage makes it useful for those working with both languages, in research and in applied settings.

KRAFTON apache-2.0 Updated 7 months ago

Model Overview

The KORani-v3-13B model is designed for Korean and English language tasks. It belongs to the KORani family, which builds on the LLaMA 13B and Polyglot 12.8B models; KORani-v3-13B itself uses LLaMA 13B as its base and is fine-tuned using the ShareGPT and KoVicuna datasets.

Capabilities

The model handles three kinds of tasks: translation, question answering, and summarization. Each is illustrated below.

Translation Capabilities

KORani-v3-13B translates English text into Korean. For instance, it can be given the input “Hey! I have some Kissflow Legal Review requests that are blocked by Tax and Accounting, can this be looked at?”; the corresponding Korean output is not reproduced here.

Question Answering Capabilities

The model can also answer questions based on a given text. When asked “헨리 구스타프의 뇌는 현재 어디에 위치해 있는가?” (Where is Henry Gustav’s brain currently located?), the model responds with “헨리 구스타프의 뇌는 현재 미국 샌디에이고에 있습니다.” (Henry Gustav’s brain is currently located in San Diego, USA).

Summarization Capabilities

KORani-v3-13B can summarize long documents into shorter, more digestible text. When given a Korean document about the increase in public servants in Gwangyang City, the model outputs a three-sentence summary:

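“광양시의 공무원 정원이 전년도 대비 79명 늘어난 1019명으로 증가했다. 이로 인해 행정서비스 향상이 기대되며, 지역과 나라 발전을 뒷받침하는 경제도시로서의 위상이 상승할 것으로 전망된다. 한편, 광양시는 전라남도로부터 4급 상설기구인 ‘산단녹지관리센터’를 승인받아 도시 개발에 탄력을 받을 것으로 예상된다.”

(The number of public servants in Gwangyang City rose by 79 from the previous year to 1,019. This is expected to improve administrative services and raise the city’s standing as an economic city that supports regional and national development. Meanwhile, Gwangyang City received approval from Jeollanam-do for the ‘Industrial Complex Green Space Management Center’, a grade-4 permanent organization, and is expected to gain momentum in urban development.)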
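The three tasks above differ mainly in how the request is phrased. As a minimal sketch, the helper below builds one instruction string per task. The translation and summarization wording mirrors the sample prompts in the Examples section; the QA layout, the function name `build_prompt`, and its parameters are assumptions made for illustration, not a prompt template documented for KORani-v3-13B.

```python
from typing import Optional


def build_prompt(task: str, text: str, question: Optional[str] = None) -> str:
    """Return an instruction prompt for one of the three tasks.

    The wording is illustrative; it is not an official KORani template.
    """
    if task == "translation":
        return f"Translate this sentence from English to Korean: '{text}'"
    if task == "summarization":
        return f"Summarize the following text in 3 Korean sentences: '{text}'"
    if task == "qa":
        if not question:
            raise ValueError("the 'qa' task needs a question")
        # Hypothetical layout: passage first, then the question about it.
        return f"Text: {text}\n\nQuestion: {question}"
    raise ValueError(f"unknown task: {task!r}")


print(build_prompt("translation", "I would like to know the current weather in Seoul."))
```

The returned string can be passed to whatever generation call you use; keeping prompt construction in one place makes it easier to adjust the wording if the model responds better to a different phrasing.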

Performance

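The tables below report scores for translation, question answering, and summarization, each averaged over 5 examples. KORani-v3 scores 7.1 on translation, 6.4 on QA, and 6.9 on summarization, below GPT-3.5-turbo on all three tasks. No score is listed for GPT-4.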

Translation (ENG -> KOR)

Model            Score (averaged over 5 examples)
GPT-4            -
DeepL            9.4
GPT-3.5-turbo    8.6
Vicuna-13B       3.8
KoAlpaca-13B     5.6
KORani-v1        7.5
KORani-v2        5.4
KORani-v3        7.1

QA (Korean)

Model            Score (averaged over 5 examples)
GPT-4            -
GPT-3.5-turbo    7.1
Vicuna-13B       7.2
KoAlpaca-13B     2.6
KORani-v1        6.6
KORani-v2        8.0
KORani-v3        6.4

Summarization (Korean)

Model            Score (averaged over 5 examples)
GPT-4            -
GPT-3.5-turbo    8.5
Vicuna-13B       7.4
KoAlpaca-13B     5.7
KORani-v1        7.0
KORani-v2        6.2
KORani-v3        6.9
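To compare models across all three tasks at a glance, the scores can be combined into one number per model. The sketch below takes an unweighted mean of the three task scores; that aggregate is an assumption made here for illustration and is not a metric reported by the source. GPT-4 and DeepL are left out because they lack a score in at least one table.

```python
# Scores copied from the three tables above.
scores = {
    "GPT-3.5-turbo": {"translation": 8.6, "qa": 7.1, "summarization": 8.5},
    "Vicuna-13B":    {"translation": 3.8, "qa": 7.2, "summarization": 7.4},
    "KoAlpaca-13B":  {"translation": 5.6, "qa": 2.6, "summarization": 5.7},
    "KORani-v1":     {"translation": 7.5, "qa": 6.6, "summarization": 7.0},
    "KORani-v2":     {"translation": 5.4, "qa": 8.0, "summarization": 6.2},
    "KORani-v3":     {"translation": 7.1, "qa": 6.4, "summarization": 6.9},
}


def mean_score(task_scores: dict) -> float:
    """Unweighted mean over the tasks a model was scored on."""
    return sum(task_scores.values()) / len(task_scores)


# Round to two decimals to match the precision of the tables.
averages = {model: round(mean_score(s), 2) for model, s in scores.items()}

# Sort models from highest to lowest average.
ranking = sorted(averages, key=averages.get, reverse=True)

for model in ranking:
    print(f"{model:<14} {averages[model]:.2f}")
```

Because each table entry is itself an average over only 5 examples, small differences between models in this ranking should be read with caution.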

Limitations

While the model is capable, it has limitations in three areas: domain knowledge, common sense, and robustness to adversarial input.

Limited Domain Knowledge

The model’s domain knowledge is limited to the data it was trained on. This means that it may not perform well on tasks that require specialized knowledge or expertise.

Lack of Common Sense

The model doesn’t have the same level of common sense as a human. It may not always understand the nuances of human language or behavior, which can lead to errors or misunderstandings.

Vulnerability to Adversarial Attacks

Like other language models, the model is vulnerable to adversarial attacks. This means that it can be tricked into generating incorrect or misleading outputs by carefully crafted input.

Examples

Prompt: Translate this sentence from English to Korean: 'I would like to know the current weather in Seoul.'
Response: 서울의 현재 날씨를 알고 싶습니다.

Prompt: Summarize the following text in 3 Korean sentences: 'The Korean government has announced plans to increase the number of public holidays. The new holidays will include a day off on the spring equinox and a day off on the autumnal equinox. This change is expected to boost tourism and improve work-life balance.'
Response: 1) 한국 정부는 공휴일을 늘리기로 했습니다. 2) 봄과 가을 분기에 각각 1일씩 추가로 휴일이 생깁니다. 3) 이로 인해 관광업이 활성화되고, 일과 삶의 균형이 개선될 것으로 기대됩니다.

Prompt: What is the current location of Henry Gustav Molaison's brain?
Response: 헨리 구스타프의 뇌는 현재 미국 샌디에이고에 위치해 있다.

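The following example shows one way to load the model and generate text:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hugging Face Hub ID (assumed; replace with a local checkpoint path if needed)
model_id = "KRAFTON/KORani-v3-13B"

# Load the pre-trained tokenizer and causal language model
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Half precision reduces memory use; a GPU is assumed for practical speed
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

# Build the prompt (plain instruction; the exact prompt template is an assumption)
input_text = "Translate this sentence from English to Korean: 'I would like to know the current weather in Seoul.'"
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512).to(device)

# Generate new tokens; a single forward pass would only return logits
with torch.no_grad():
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=128,
    )

# Decode only the tokens generated after the prompt
prompt_length = inputs["input_ids"].shape[1]
output_text = tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
print(output_text)

Note: This example assumes that the torch and transformers libraries are installed and that the KORani-v3-13B weights and tokenizer can be downloaded or are already available locally. KORani is a LLaMA-based generative model, so it is loaded with the generic Auto classes; transformers has no KORani-specific model or tokenizer class. A 13B-parameter model needs roughly 26 GB of memory in half precision.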
