MS Marco TinyBERT-L-2

Passage Ranking Model

Meet MS Marco TinyBERT-L-2, a compact cross-encoder designed for information retrieval tasks. Given a query, it scores candidate passages and sorts them in decreasing order of relevance, so you can quickly surface the most relevant results. It reaches 67.43 NDCG@10 on the TREC Deep Learning 2019 dataset and 30.15 MRR@10 on the MS Marco Dev dataset while processing around 9,000 documents per second, making it a strong choice when you need fast, accurate re-ranking. Whether you're working with large datasets or just need to find the right information quickly, MS Marco TinyBERT-L-2 is an excellent tool to have in your toolkit.

Model Overview

The Cross-Encoder for MS Marco model is a powerful tool for information retrieval tasks. It’s trained to help you find the most relevant passages in a large text database, given a specific query.

Capabilities

Imagine you have a huge library with millions of books, and you’re looking for a specific piece of information. This model can encode your query and all the possible passages, and then sort them in order of relevance.

What can it do?

  • Rank passages: Given a query, the model encodes it with every candidate passage and sorts the passages in decreasing order of relevance.
  • Improve search results: By re-ranking passages, the model can help you find the most accurate and relevant information for your query.

How does it work?

The model is a cross-encoder: the query and a candidate passage are fed through the network together, and the network outputs a single relevance score for that pair. Scoring every candidate against the query and sorting by score produces the final ranking.
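
Concretely, the pair is packed into one input sequence ([CLS] query [SEP] passage [SEP]) and the network emits one relevance logit for it. Here is a minimal sketch of that joint encoding, assuming the checkpoint id cross-encoder/ms-marco-TinyBERT-L-2 on the Hugging Face Hub (substitute whichever checkpoint you actually use):

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Assumed checkpoint id; replace with the model you are using.
name = 'cross-encoder/ms-marco-TinyBERT-L-2'
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

# The pair is encoded jointly: [CLS] query [SEP] passage [SEP]
features = tokenizer('How many people live in Berlin?',
                     'Berlin has a population of 3,520,031 registered inhabitants.',
                     return_tensors='pt')

model.eval()
with torch.no_grad():
    score = model(**features).logits[0, 0].item()  # one relevance score per pair
print(score)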

What makes it special?

  • High performance: The model delivers strong results on the TREC Deep Learning 2019 and MS Marco Passage Reranking benchmarks while running far faster than larger cross-encoders (see the table below).
  • Fast and efficient: It can process a large number of passages quickly, making it suitable for real-time applications.

Performance

But how does it stack up against other models? Let’s take a look at its performance metrics:

| Model Name    | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs/Sec (V100 GPU) |
|---------------|----------------------|-----------------------|---------------------|
| Current Model | 69.84                | 32.56                 | 9,000               |
| MiniLM-L-2-v2 | 71.01                | 34.85                 | 4,100               |
| MiniLM-L-4-v2 | 73.04                | 37.70                 | 2,500               |

Limitations

While the model is powerful, it’s not perfect. Let’s talk about some of its limitations.

Limited Context Understanding

The model is trained specifically for MS Marco Passage Ranking, so it may not perform as well on other tasks, domains, or datasets.

Dependence on Passage Retrieval

The model relies on a good passage retrieval system, like ElasticSearch, to provide it with relevant passages to rank.
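
Below is a minimal retrieve-then-re-rank sketch. The retrieve_candidates function is a hypothetical stand-in for your first-stage retriever (ElasticSearch, BM25, and so on), and the checkpoint id is assumed:

from sentence_transformers import CrossEncoder

def retrieve_candidates(query, k=100):
    # Hypothetical stand-in for a real first-stage retriever.
    # Here it just returns a tiny in-memory corpus.
    corpus = [
        'Mont Blanc is the highest mountain in the Alps at 4,810 meters.',
        'Berlin has a population of 3,520,031 registered inhabitants.',
        'New York City is famous for the Metropolitan Museum of Art.',
    ]
    return corpus[:k]

# Assumed checkpoint id; replace with the model you are using.
reranker = CrossEncoder('cross-encoder/ms-marco-TinyBERT-L-2', max_length=512)

query = 'What is the highest mountain in the Alps?'
candidates = retrieve_candidates(query)

# Score each (query, candidate) pair, then sort best-first.
scores = reranker.predict([(query, c) for c in candidates])
for score, passage in sorted(zip(scores, candidates), key=lambda x: x[0], reverse=True):
    print(f'{score:.2f}  {passage}')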

Limited Scalability

Because a cross-encoder scores every query-passage pair individually, ranking cost grows linearly with the number of candidates: at roughly 9,000 passages per second on a V100 GPU, re-ranking 100 candidates takes about 0.01 seconds, but scoring an entire corpus of millions of passages directly is impractical. This is why the model is typically paired with a fast first-stage retriever.

Examples
  • Query: "How many people live in Berlin?"
    Passage: "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."
  • Query: "What is the average temperature in Paris in January?"
    Passage: "The average temperature in Paris in January is around 3°C (37°F)."
  • Query: "What is the highest mountain in the Alps?"
    Passage: "The highest mountain in the Alps is Mont Blanc, with a height of 4,810 meters (15,781 feet)."

Usage

You can use this model with popular libraries like Transformers and SentenceTransformers. Here’s an example code snippet:

from sentence_transformers import CrossEncoder

# 'model_name' is a placeholder, e.g. 'cross-encoder/ms-marco-TinyBERT-L-2'.
model = CrossEncoder('model_name', max_length=512)
# predict() returns one relevance score per (query, passage) pair.
scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2')])

Note that the performance metrics were computed on a V100 GPU.

Format

The model accepts input in the form of tokenized text sequences. This means you’ll need to pre-process your text data before feeding it into the model.

Input Requirements

To use this model, you’ll need to provide two inputs:

  • A query (e.g. "How many people live in Berlin?")
  • A passage (e.g. "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.")

You can provide multiple passages for a single query, and the model will rank them in order of relevance.

Output

The model outputs a score for each passage, indicating how relevant it is to the query. You can use these scores to rank the passages and select the most relevant ones.

Code Example

Here’s an example of how to use the model with the Transformers library:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# 'model_name' is a placeholder, e.g. 'cross-encoder/ms-marco-TinyBERT-L-2'.
model = AutoModelForSequenceClassification.from_pretrained('model_name')
tokenizer = AutoTokenizer.from_pretrained('model_name')

# Each query in the first list is paired with the passage at the same
# position in the second list; every pair is encoded jointly.
features = tokenizer(['How many people live in Berlin?', 'How many people live in Berlin?'],
                     ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.',
                      'New York City is famous for the Metropolitan Museum of Art.'],
                     padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    scores = model(**features).logits  # one relevance logit per pair
print(scores)
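
To turn those logits into a ranking, sort the passages by score. Here is a short continuation of the snippet above (the passages list restates the two passages that were scored):

# `scores` from the snippet above holds one logit per (query, passage) pair.
passages = ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.',
            'New York City is famous for the Metropolitan Museum of Art.']

# Higher logit = more relevant; iterate indices in descending score order.
for i in torch.argsort(scores.squeeze(-1), descending=True):
    print(passages[i])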