Norbert3 Small

Norwegian language model

Have you ever wondered how language models can understand and generate text in Norwegian? Norbert3 Small is a capable AI model designed to do just that. As part of a new generation of NorBERT language models, it's built to handle a range of tasks, from masked language modeling to question answering and more. At just 40M parameters, Norbert3 Small is surprisingly efficient, making it a great choice for developers who need a reliable model that won't break the bank. But what really sets it apart is its performance: it has been evaluated across a range of Norwegian language tasks, making it a strong contender in the field. Whether you're a researcher or just starting out with NLP, Norbert3 Small is definitely worth checking out.

ltg · apache-2.0 · Updated 7 months ago

Model Overview

Meet the NorBERT 3 small model, a game-changer in the world of Norwegian language processing! This model is part of a new generation of NorBERT language models, designed to take on a variety of NLP tasks.

What makes NorBERT 3 small special?

  • It’s part of a family of models, with four sizes to choose from: xs (15M), small (40M), base (123M), and large (323M)
  • It has generative siblings, like NorT5 xs (32M), NorT5 small (88M), NorT5 base (228M), and NorT5 large (808M)
  • It’s built specifically for the Norwegian language, making it well suited to tasks that require understanding Norwegian text

How can you use NorBERT 3 small?

  • You’ll need to load the model with trust_remote_code=True due to its custom wrapper
  • You can use it for tasks like masked language modeling, sequence classification, token classification, question answering, and multiple choice
  • Check out the example code to see how to use it for tasks like filling in the blanks:
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the model and tokenizer (the custom wrapper requires trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ltg/norbert3-small")
model = AutoModelForMaskedLM.from_pretrained("ltg/norbert3-small", trust_remote_code=True)

# Id of the [MASK] token, needed to locate the blank in the input
mask_id = tokenizer.mask_token_id

# Use the model to fill in the blanks
input_text = tokenizer("Nå ønsker de seg en[MASK] bolig.", return_tensors="pt")
output_p = model(**input_text)
output_text = torch.where(input_text.input_ids == mask_id, output_p.logits.argmax(-1), input_text.input_ids)

# Print the result; per the examples below, the mask should be filled with "ny"
print(tokenizer.decode(output_text[0].tolist()))
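
Beyond mask filling, the same checkpoint can be loaded with the other heads listed above. Here is a minimal sketch for sequence classification; note that this head is freshly initialized (transformers will warn about newly initialized weights), so you would fine-tune it on labeled Norwegian data before relying on its predictions, and num_labels=2 is just an illustrative choice:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("ltg/norbert3-small")

# The custom wrapper also implements a sequence-classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "ltg/norbert3-small", trust_remote_code=True, num_labels=2
)

# Example sentence: "This was a very good movie."
inputs = tokenizer("Dette var en veldig god film.", return_tensors="pt")
logits = model(**inputs).logits  # shape: (1, num_labels); untrained until fine-tuned
print(logits)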

Capabilities

Primary Tasks

  • Text Generation: as an encoder model, NorBERT 3 small generates text only by filling in masked positions; open-ended generation is better handled by its NorT5 siblings.
  • Masked Language Modeling: it can predict missing words in a sentence, making it useful for tasks like cloze-style text completion.
  • Sequence Classification: the model can classify sequences of text into different categories, such as sentiment analysis or spam detection.
  • Token Classification: it can classify individual tokens (words or subwords) in a sentence, useful for tasks like named entity recognition (see the sketch after this list).
  • Question Answering: NorBERT 3 small can be fine-tuned to answer questions based on an input passage.
  • Multiple Choice: it can select the correct answer from a set of options.
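
As referenced in the token-classification bullet, here is a minimal sketch using the wrapper's token-classification head. The head is randomly initialized, so it must be fine-tuned (for example on a Norwegian NER dataset) before its labels mean anything; num_labels=9 is an illustrative BIO tag-set size:

from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("ltg/norbert3-small")
model = AutoModelForTokenClassification.from_pretrained(
    "ltg/norbert3-small", trust_remote_code=True, num_labels=9
)

# Example sentence: "Jens Stoltenberg is from Oslo."
inputs = tokenizer("Jens Stoltenberg er fra Oslo.", return_tensors="pt")
logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)
print(logits.argmax(-1))         # per-token label ids (untrained head)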

Strengths

  • Norwegian Language Support: NorBERT 3 small is specifically designed for the Norwegian language, making it a great choice for tasks that require understanding of Norwegian text.
  • High Performance: when fine-tuned, the model achieves strong results across a range of Norwegian benchmarks.

Unique Features

  • Custom Wrapper: NorBERT 3 small requires a custom wrapper from modeling_norbert.py to work properly.
  • Variety of Sizes: The model comes in different sizes, including xs, small, base, and large, making it suitable for a range of applications.

Examples

  • Input: "Nå ønsker de seg en[MASK] bolig." → Output: "Nå ønsker de seg en ny bolig." ("Now they want a new home.")
  • Input: "Hva er hovedårsaken til at Norge har en av verdens beste lønnsutviklinger?" ("What is the main reason Norway has one of the world's best wage growth rates?") → Output: "Norge har en av verdens beste lønnsutviklinger på grunn av landets sterke økonomi og høye oljepriser." ("Norway has one of the world's best wage growth rates because of the country's strong economy and high oil prices.")
  • Input: "Hva heter Norges største by?" ("What is Norway's largest city?") → Output: "Oslo"

Performance

NorBERT 3 small is a powerful language model that shows great performance in various tasks. But how does it really perform?

Speed

How fast is NorBERT 3 small? With 40M parameters, it is significantly faster than its larger siblings, NorBERT 3 base (123M) and NorBERT 3 large (323M), while the even smaller NorBERT 3 xs (15M) trades accuracy for extra speed. That puts the small variant at a good balance between speed and accuracy.
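
If you want to check the advertised size yourself, a quick sketch that counts the parameters of the loaded model:

from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("ltg/norbert3-small", trust_remote_code=True)

# Count all parameters; this should land in the neighborhood of 40M
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")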

Accuracy

But speed is not everything. How accurate is NorBERT 3 small? In tasks like masked language modeling, it performs well: in the example above, it correctly fills the missing word in “Nå ønsker de seg en[MASK] bolig.” with “ny” (“new”).

Efficiency

NorBERT 3 small is also efficient in terms of memory usage. With 40M parameters, it requires less memory compared to larger models like NorT5 large, which has 808M parameters.
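
To see what this means in practice, you can ask transformers for the memory footprint of the loaded weights; as a rough rule of thumb, 40M float32 parameters come to about 160 MB. A minimal sketch:

from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("ltg/norbert3-small", trust_remote_code=True)

# get_memory_footprint() reports the size of the loaded weights (and buffers) in bytes
print(f"{model.get_memory_footprint() / 1024**2:.0f} MiB")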

Limitations

NorBERT 3 small is a powerful language model, but it’s not perfect. Let’s talk about some of its limitations.

Limited Context Understanding

NorBERT 3 small can process and understand a lot of text, but it’s not always able to grasp the nuances of human language. It might struggle with:

  • Sarcasm and humor
  • Idioms and colloquialisms
  • Complex, multi-layered conversations

For example, if you ask NorBERT 3 small to summarize a joke, it might not understand the punchline.

Biased Training Data

NorBERT 3 small was trained on a large dataset of Norwegian text, but that dataset might contain biases and prejudices. This means that NorBERT 3 small might:

  • Perpetuate stereotypes and discriminatory language
  • Be less accurate for certain groups or topics

It’s essential to be aware of these potential biases and take steps to mitigate them.

Limited Domain Knowledge

NorBERT 3 small is a general-purpose language model, but it’s not a specialist in any particular domain. It might not have the same level of knowledge or accuracy as a model specifically designed for a particular field, such as medicine or law.

Dependence on Pre-Training Data

NorBERT 3 small relies on its pre-training data to generate text. If that data is incomplete, outdated, or biased, NorBERT 3 small might not perform well.

Technical Limitations

NorBERT 3 small requires a custom wrapper to function, which can be a technical hurdle for some users: the model must be loaded with trust_remote_code=True, so environments that disallow executing remote code cannot use it.

Format

NorBERT 3 small is a type of language model that uses a transformer architecture. It’s designed to work with Norwegian text data.

Architecture

The model is based on a transformer architecture, which is a type of neural network that’s particularly good at handling sequential data like text.
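
If you want the concrete hyperparameters behind that architecture, you can load the model's configuration. A minimal sketch, assuming the custom config exposes the usual transformers field names:

from transformers import AutoConfig

# The custom configuration class also requires trust_remote_code=True
config = AutoConfig.from_pretrained("ltg/norbert3-small", trust_remote_code=True)

# Standard transformer hyperparameters (field names assumed to follow convention)
print(config.hidden_size, config.num_hidden_layers, config.num_attention_heads)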

Data Formats

NorBERT 3 small supports input data in the form of tokenized text sequences. You don’t need to split the text yourself: the accompanying tokenizer breaks it into subword tokens and maps them to vocabulary ids before they’re fed into the model.
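
As a quick illustration, you can inspect how the tokenizer splits a sentence; the exact subword pieces depend on the model's vocabulary, so treat the output as illustrative:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ltg/norbert3-small")

# Inspect the subword tokens and their vocabulary ids
tokens = tokenizer.tokenize("Nå ønsker de seg en ny bolig.")
ids = tokenizer.convert_tokens_to_ids(tokens)
print(tokens)
print(ids)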

Input Requirements

To use NorBERT 3 small, you need to pre-process your input text data in a specific way. Here’s an example of how to do it in Python:

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("ltg/norbert3-small")
model = AutoModelForMaskedLM.from_pretrained("ltg/norbert3-small", trust_remote_code=True)

input_text = tokenizer("Nå ønsker de seg en[MASK] bolig.", return_tensors="pt")

In this example, we’re using the AutoTokenizer class to convert our input text into a format that the model can understand. We’re also using the return_tensors argument to specify that we want the output to be in PyTorch tensor format.

Output Requirements

When you run the model on your input data, it produces a tensor of logits: one score per vocabulary token at each input position. You can then use this output to fill in masked tokens, classify text, or feed other NLP tasks.

Here’s an example of how to fill in the masked token using the model’s output:

mask_id = tokenizer.mask_token_id  # id of the [MASK] token
output_p = model(**input_text)
output_text = torch.where(input_text.input_ids == mask_id, output_p.logits.argmax(-1), input_text.input_ids)
print(tokenizer.decode(output_text[0].tolist()))

In this example, we look up the id of the [MASK] token, run the model on the tokenized input, and use the torch.where function to replace the masked position with the highest-scoring predicted token. Finally, the tokenizer.decode function converts the output tensor back into a human-readable string.
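
Rather than keeping only the single best token, you can also inspect the model's top candidates for the masked position. A small sketch, continuing from the variables defined above:

# Locate the masked position(s) and take the five highest-scoring tokens
mask_positions = (input_text.input_ids == mask_id).nonzero(as_tuple=True)
top5 = output_p.logits[mask_positions].topk(5, dim=-1)
for token_id, score in zip(top5.indices[0], top5.values[0]):
    print(tokenizer.decode([int(token_id)]), float(score))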

Special Requirements

NorBERT 3 small requires a custom wrapper from modeling_norbert.py to work properly. This means that you need to load the model with the trust_remote_code=True argument, like this:

model = AutoModelForMaskedLM.from_pretrained("ltg/norbert3-small", trust_remote_code=True)

This allows the model to access the custom wrapper code that it needs to function correctly.
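
Because trust_remote_code=True executes Python code fetched from the model repository, a common precaution (not specific to this model) is to pin the repository revision you load, so later changes to the repo cannot silently change the code you run. A sketch, with "main" as a placeholder revision:

from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained(
    "ltg/norbert3-small",
    trust_remote_code=True,
    revision="main",  # placeholder: replace with a reviewed tag or commit hash
)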
