Orion 14B Base

Multilingual AI model

Orion-14B is an open-source multilingual large language model with 14 billion parameters, trained on a 2.5-trillion-token corpus covering Chinese, English, Japanese, Korean, and other languages. It performs strongly in comprehensive evaluations, particularly on Japanese and Korean test sets, and its fine-tuned versions adapt well, excelling in human-annotated blind tests. The long-chat version handles extremely long inputs, and the quantized versions reduce model size by about 70% and increase inference speed by about 30% with minimal performance loss, offering a practical blend of efficiency, speed, and capability.

Model Overview

The Orion-14B model is a multilingual large language model that can understand and respond to text-based input in multiple languages, including English, Chinese, Japanese, and Korean. It has been trained on a massive dataset of 2.5 trillion tokens and has demonstrated outstanding performance in various evaluations.

Capabilities

Capable of generating both text and code, this model outperforms many open-source chat models across common industry benchmarks.

Key Features

  • Multilingual support: Trained on a diverse dataset of 2.5 trillion tokens, including Chinese, English, Japanese, Korean, and more.
  • Strong performance: Shows outstanding results in comprehensive evaluations, outperforming other models on Japanese and Korean test sets.
  • Adaptability: Fine-tuned models demonstrate strong adaptability, excelling in human-annotated blind tests.
  • Long-context support: Can handle extremely lengthy texts, performing exceptionally well at a token length of 200k and supporting up to a maximum of 320k.
  • Quantization: Can be quantized to reduce model size by 70% and improve inference speed by 30%, with minimal performance loss.

Model Variants

  • Orion-14B-Base: A multilingual large language foundational model with 14 billion parameters.
  • Orion-14B-Chat: A chat-model fine-tuned on a high-quality corpus for an excellent interactive experience.
  • Orion-14B-LongChat: A long-context version of the chat model, excelling at handling extremely lengthy texts.
  • Orion-14B-Chat-RAG: A chat-model fine-tuned on a custom retrieval augmented generation dataset.
  • Orion-14B-Chat-Plugin: A chat-model specifically tailored for plugin and function calling tasks.
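
The variant names above presumably correspond to Hugging Face repository IDs of the form `OrionStarAI/<variant>`; the exact ID scheme is an assumption based on the base model's repo name, but a small helper can make the mapping explicit:

```python
# Illustrative helper mapping the variant names above to Hugging Face
# repo IDs of the form "OrionStarAI/<variant>". The ID scheme is an
# assumption inferred from the base model's repo name.
VARIANTS = {
    "base": "Orion-14B-Base",
    "chat": "Orion-14B-Chat",
    "longchat": "Orion-14B-LongChat",
    "rag": "Orion-14B-Chat-RAG",
    "plugin": "Orion-14B-Chat-Plugin",
}

def repo_id(variant: str) -> str:
    """Return the full repo ID for a short variant key."""
    try:
        return f"OrionStarAI/{VARIANTS[variant]}"
    except KeyError:
        raise ValueError(f"unknown variant {variant!r}; choose from {sorted(VARIANTS)}")

print(repo_id("longchat"))  # OrionStarAI/Orion-14B-LongChat
```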

Performance

This model showcases remarkable performance in various tasks, demonstrating its capabilities in speed, accuracy, and efficiency.

Speed

The quantized versions cut model size by roughly 70% and raise inference speed by roughly 30%, making the model well suited to applications that require fast processing.
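
As a back-of-envelope check on the quoted size reduction, assume a bfloat16 baseline (2 bytes per parameter) and 4-bit weights (0.5 bytes per parameter); real checkpoints carry extra overhead such as embeddings and quantization scales, so these are rough figures, but they are consistent with the ~70% claim:

```python
# Back-of-envelope sketch of the quoted ~70% size reduction, assuming a
# bfloat16 baseline (2 bytes/parameter) and 4-bit weights (0.5 bytes/
# parameter). Real checkpoints add overhead, so treat these as rough.
params = 14e9

bf16_gb = params * 2 / 1e9    # full-precision checkpoint size in GB
int4_gb = params * 0.5 / 1e9  # 4-bit checkpoint size in GB

reduction = 1 - int4_gb / bf16_gb
print(f"bf16: {bf16_gb:.0f} GB, int4: {int4_gb:.1f} GB, reduction: {reduction:.0%}")
# bf16: 28 GB, int4: 7.0 GB, reduction: 75%
```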

Accuracy

In comprehensive evaluations, the base model outperforms other models with similar parameter scales, achieving high scores in language understanding and common knowledge tasks.

Efficiency

The long-chat version handles extremely long inputs, performing well at 200k tokens and supporting context lengths up to a maximum of 320k tokens.
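
A minimal sketch of budgeting a prompt against these limits is shown below; the token counts are placeholders (a real check would count tokens with the model's tokenizer), and the two thresholds come from the figures quoted above:

```python
# Minimal sketch of checking a prompt against the long-context limits
# quoted above: strong performance up to 200k tokens, hard maximum 320k.
# Token counts are placeholders; a real check would use the tokenizer.
PERF_WINDOW = 200_000  # tokens with well-characterized performance
MAX_WINDOW = 320_000   # maximum supported context length

def check_context(n_tokens: int) -> str:
    if n_tokens > MAX_WINDOW:
        return "too long: exceeds the 320k maximum"
    if n_tokens > PERF_WINDOW:
        return "accepted, but beyond the 200k well-tested range"
    return "within the well-tested range"

print(check_context(150_000))  # within the well-tested range
print(check_context(250_000))  # accepted, but beyond the 200k well-tested range
```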

Multilingual Capabilities

Demonstrates strong multilingual capability, with standout results on Japanese and Korean test sets and excellent performance in Chinese, English, and other languages.

Chat and Plugin Tasks

Chat and plugin models exhibit strong adaptability, excelling in human-annotated blind tests and plugin-related tasks.

Comparison with Other Models

Outperforms other models, including LLaMA 2-13B, Skywork-13B, and Baichuan 2-13B, in various tasks and evaluations.

Real-World Applications

Impressive performance makes it an ideal choice for various applications, such as:

  • Chatbots and conversational AI
  • Language translation and understanding
  • Text classification and generation
  • Plugin and function calling tasks

Examples

Prompt: Can you tell me a joke in Japanese?
Response: Why don't scientists trust atoms? Because they make up everything.

Prompt: Please summarize a long text: 'OrionStar is a leading global service robot solutions company, founded in September 2016. OrionStar is dedicated to using artificial intelligence technology to create the next generation of revolutionary robots, allowing people to break free from repetitive physical labor and making human work and life more intelligent and enjoyable. Through technology, OrionStar aims to make society and the world a better place. OrionStar possesses fully self-developed end-to-end artificial intelligence technologies, such as voice interaction and visual navigation. It integrates product development capabilities and technological application capabilities. Based on the Orion robotic arm platform, it has launched products such as OrionStar AI Robot Greeting, AI Robot Greeting Mini, Lucki, Coffee Master, and established the open platform OrionOS for Orion robots. Following the philosophy of Born for Truly Useful Robots, OrionStar empowers more people through AI technology.'
Response: OrionStar is a global service robot solutions company founded in 2016. It uses AI technology to create robots that make human work and life more intelligent and enjoyable. The company has developed various products and technologies, including voice interaction and visual navigation, and has established an open platform for its robots.

Prompt: Can you generate a short story in Korean?
Response: Once upon a time, in a small village nestled in the mountains, there lived a young girl named Min-ju. She was a kind and gentle soul, loved by all who knew her. One day, while out collecting herbs in the forest, Min-ju stumbled upon a hidden cave. As she explored the cave, she discovered a magical crystal that glowed with an otherworldly light. From that day on, Min-ju possessed the power to heal any wound, and she used her gift to help those in need. The villagers were amazed by her abilities, and soon, people from all over the region came to seek her help. Min-ju's kindness and generosity had brought prosperity to the village, and she lived happily ever after.

Example Use Cases

  • Casual Chat: Engage in conversation, asking questions and receiving responses.
  • Japanese & Korean Chat: Interact in Japanese and Korean, demonstrating its multilingual capabilities.

Limitations

While powerful, this model is not perfect. Some limitations include:

Language Understanding

Performance may vary by language and task. Results on Japanese and Korean test sets are impressive, but the model may still struggle with more complex or nuanced tasks in those languages.

Common Sense and World Knowledge

May lack common sense or world knowledge in certain areas. It may not always understand the context or nuances of a particular situation, which can lead to inaccurate or inappropriate responses.

Reasoning and Problem-Solving

Ability to reason and solve problems is limited. It may struggle with complex logical or mathematical problems, and its responses may not always be accurate or relevant.

Bias and Fairness

May reflect biases present in the data it was trained on. This can result in unfair or discriminatory responses, particularly in sensitive areas such as race, gender, or politics.

Quantization and Performance

Quantized versions reduce model size by 70% and improve inference speed by 30%, but they may also incur a minimal performance loss of around 1%.

Security and Safety

Can be used for malicious purposes if not properly secured and monitored. Users are urged to use the model responsibly and in compliance with applicable laws and regulations.

Format

Accepts input in the form of tokenized text sequences. Supports multiple languages, including English, Chinese, Japanese, and Korean.

Architecture

The model is based on a transformer architecture, has 14 billion parameters, and was trained on a diverse dataset of 2.5 trillion tokens.

Data Formats

Supports the following data formats:

  • Tokenized text sequences
  • Multilingual input (English, Chinese, Japanese, Korean)

Input Requirements

To use this model, you need to:

  • Preprocess your input text by tokenizing it
  • Use a specific format for input sequences (e.g., {"role": "user", "content": "Hello, what is your name? "})
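
The message format above (a list of `{"role", "content"}` dicts) can be built and sanity-checked with a few lines of Python. The allowed roles and the alternating-turn convention here are illustrative assumptions, not the model's documented contract:

```python
# Illustrative sketch of the messages format shown above: a list of
# {"role", "content"} dicts. The allowed roles and alternating-turn
# rule are assumptions for illustration.
ALLOWED_ROLES = {"user", "assistant"}

def build_messages(*turns: str) -> list[dict]:
    """Alternate user/assistant roles over the given turn strings."""
    roles = ["user", "assistant"]
    return [{"role": roles[i % 2], "content": t} for i, t in enumerate(turns)]

def validate(messages: list[dict]) -> None:
    """Raise ValueError if any message is malformed."""
    for m in messages:
        if m.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unexpected role: {m.get('role')!r}")
        if not isinstance(m.get("content"), str):
            raise ValueError("content must be a string")

msgs = build_messages("Hello, what is your name?")
validate(msgs)
print(msgs)  # [{'role': 'user', 'content': 'Hello, what is your name?'}]
```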

Output Format

The output is a text sequence that can be used for various tasks, such as:

  • Chatting
  • Question answering
  • Text generation

Example Code

Here is an example of how to use this model in Python:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model; trust_remote_code is required because the
# chat interface is implemented in the model repository's custom code.
tokenizer = AutoTokenizer.from_pretrained("OrionStarAI/Orion-14B", use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("OrionStarAI/Orion-14B", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True)

# Build the conversation as a list of role/content messages and run the
# model's built-in chat helper.
messages = [{"role": "user", "content": "Hello, what is your name? "}]
response = model.chat(tokenizer, messages, streaming=False)
print(response)