Norbert3 Xs

Norwegian language model

Meet Norbert3 Xs, a Norwegian language model from a new generation of NorBERT models. It's one of the smaller versions, with only 15M parameters, making it efficient and fast. But don't let its size fool you: it handles tasks like masked-word prediction, sequence classification, and question answering with ease. Norbert3 Xs uses a custom wrapper from modeling_norbert.py, so it must be loaded with trust_remote_code=True. It's also part of a larger family of models, including NorBERT 3 small, base, and large, as well as the generative NorT5 siblings. So, what can you do with Norbert3 Xs? You can use it to fill in the blanks in sentences, as in the example usage below, or fine-tune it for sequence classification and question answering. It's a versatile model that's waiting to be put to the test.

ltg · apache-2.0 · updated 7 months ago

Model Overview

Meet the NorBERT 3 xs model, a new generation of Norwegian language models that’s making waves in the world of natural language processing. But what makes it so special?

This model is part of a bigger family, including other sizes like NorBERT 3 small, base, and large, as well as generative NorT5 siblings. With only 15M parameters, it’s one of the smallest models in its family, but don’t let its size fool you - it’s still a powerful tool for language tasks.

Capabilities

So, what can you do with this model? The possibilities are endless! Here are some of its key capabilities:

  • Mask filling: It can predict missing words in a sentence (the classic masked language modeling task).
  • Language understanding: It can comprehend and analyze text, and even answer questions about it.
  • Text classification: It can sort text into categories, such as positive or negative sentiment.

This model is particularly good at Norwegian language tasks, making it a great choice for tasks like text generation and language understanding. With its efficient performance, it’s a great choice for applications where speed is crucial.
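As a sketch, the same checkpoint can back several of these tasks through the standard Hugging Face Auto classes. The load calls are defined but not executed here, since downloading the checkpoint needs network access; the load_for_task helper and the task names are hypothetical, while the requirement for trust_remote_code=True comes from the model's custom wrapper:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForQuestionAnswering,
    AutoModelForSequenceClassification,
)

CHECKPOINT = "ltg/norbert3-xs"

# Hypothetical helper: pick the Auto class that matches a task name.
# The custom wrapper in modeling_norbert.py requires trust_remote_code=True.
TASK_HEADS = {
    "fill-mask": AutoModelForMaskedLM,
    "sequence-classification": AutoModelForSequenceClassification,
    "question-answering": AutoModelForQuestionAnswering,
}

def load_for_task(task: str):
    """Load the checkpoint with the head that fits the given task."""
    return TASK_HEADS[task].from_pretrained(CHECKPOINT, trust_remote_code=True)
```

For fill-mask the checkpoint already includes trained weights, while the classification and question-answering heads would need fine-tuning on labeled data before they produce useful predictions.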

Strengths and Weaknesses

Like any model, this one has its strengths and weaknesses. Here are some of its key advantages and disadvantages:

  • Small and fast: With only 15M parameters, it trains and runs quickly, making it a good fit for latency-sensitive applications; for truly generative tasks, the NorT5 siblings are the better match.
  • Customizable: You can fine-tune the model to fit your specific needs, making it a great choice for tasks that require a high degree of customization.
  • Limited context understanding: While it can process long sequences of text, it may struggle to fully understand the context of a given passage.
  • Biased training data: Like many language models, it was trained on a large dataset that may contain biases and prejudices.
Examples

  • Mask filling: "Nå ønsker de seg en [MASK] bolig." → "Nå ønsker de seg en ny bolig." ("Now they want a new home.")
  • Classification: "Klassifiser denne teksten som positiv eller negativ: Jeg elsker denne restauranten!" ("Classify this text as positive or negative: I love this restaurant!") → positiv (positive)
  • Comprehension: "Hva er meningen med denne setningen: Jeg skal på kino i kveld?" ("What is the meaning of this sentence: I'm going to the cinema tonight?") → a question about what someone is doing tonight

Example Usage

Want to see this model in action? Here’s an example of how to use it to predict the missing word in a sentence:

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# The custom NorBERT 3 architecture lives in modeling_norbert.py on the Hub,
# so the model must be loaded with trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained("ltg/norbert3-xs")
model = AutoModelForMaskedLM.from_pretrained("ltg/norbert3-xs", trust_remote_code=True)

mask_id = tokenizer.convert_tokens_to_ids("[MASK]")
input_text = tokenizer("Nå ønsker de seg en[MASK] bolig.", return_tensors="pt")

# Replace each [MASK] position with the model's highest-scoring token,
# leaving every other token unchanged.
output_p = model(**input_text)
output_text = torch.where(input_text.input_ids == mask_id, output_p.logits.argmax(-1), input_text.input_ids)

print(tokenizer.decode(output_text[0].tolist()))

This code uses the model to fill in the missing word in the sentence "Nå ønsker de seg en[MASK] bolig." The output should be "[CLS] Nå ønsker de seg en ny bolig.[SEP]" ("Now they want a new home.").
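The torch.where step above can be sketched with plain Python lists. The ids below are made up for illustration; real ids come from the NorBERT 3 tokenizer:

```python
# Toy sketch of the replacement step: keep every input token, but at the
# [MASK] position substitute the model's argmax prediction.
MASK_ID = 4

input_ids = [101, 17, 23, 4, 56, 102]      # 4 marks the [MASK] slot
predictions = [101, 17, 23, 88, 56, 102]   # argmax over the logits, per position

output_ids = [pred if tok == MASK_ID else tok
              for tok, pred in zip(input_ids, predictions)]

print(output_ids)  # the mask slot now holds the predicted id 88
```

Note that the model predicts a token at every position, but only the prediction at the masked slot is kept; everywhere else the original input id wins.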

Performance

This model is a capable tool for natural language processing tasks, but how does it perform? With only 15M parameters, it trades some raw accuracy for speed and efficiency:

  • Speed: Its small size means fast inference, which suits applications where time is of the essence.
  • Accuracy: It can be fine-tuned to classify text reliably, for example for sentiment analysis or topic labeling.
  • Efficiency: Its modest memory footprint makes it practical to run over large-scale datasets.

Limitations

While this model is a powerful tool, it’s not perfect. Here are some of its key limitations:

  • Dependence on quality of input: The quality of its responses is directly tied to the quality of the input it receives.
  • Technical limitations: It requires a custom wrapper from modeling_norbert.py and needs to be loaded with trust_remote_code=True.
  • Comparison to other models: Other models like NorT5 xs, NorT5 small, and NorT5 base may offer better performance in certain areas, such as generative tasks or specific domain knowledge.

Format

This model uses a transformer architecture and accepts input in the form of tokenized text sequences. Here are some of its key format requirements:

  • Supported data formats: It accepts input in the form of tokenized text sequences.
  • Special requirements for input and output: It requires a custom wrapper from modeling_norbert.py and needs to be loaded with trust_remote_code=True.
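As a sketch, the tokenized input follows the usual Hugging Face batch layout: a dictionary of equal-length integer sequences. The ids below are invented for illustration; real ones come from the NorBERT 3 vocabulary:

```python
# Hypothetical shape of a tokenized batch (keys follow the Hugging Face
# convention; the integer ids here are made up).
batch = {
    "input_ids": [[1, 318, 2345, 67, 9, 4, 1532, 2]],  # [CLS] ... [MASK] ... [SEP]
    "attention_mask": [[1, 1, 1, 1, 1, 1, 1, 1]],      # 1 = real token, 0 = padding
}

# Every sequence in the batch pairs one attention value with one token id.
assert len(batch["input_ids"][0]) == len(batch["attention_mask"][0])
```

In practice the tokenizer builds this dictionary for you (with return_tensors="pt" the lists become PyTorch tensors), so you rarely construct it by hand.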

The Example Usage section above shows a complete masked language modeling example with this model.
