
All You Need to Know About Labeling Instructions

Labeling instructions are key to ensuring data labeling quality. In fact, most data quality issues can be traced back to ambiguities in the labeling instructions. The challenge intensifies when the instructions change during the labeling process itself.

Solid labeling instructions are therefore the first step in setting the bar for quality annotations. They also make each labeling task run smoothly and save you time and money. You can make things simple for your annotators by following these clear steps.

Step 1: Define Your Labels

The first step in writing labeling instructions is defining your labels. Define them in a way that leaves no ambiguity for the annotator on the other side. A good exercise is to think of each label as the answer to a clear Yes/No question. Let’s take a look at an example:

In the example below, the annotator is required to mark the subject’s eyes with the point tool, annotate the “glasses” with the box tool, and set the glasses’ color using the green/black/blue attributes (see the picture and chart below).

Label Name | Tool | Description
“Glasses” | Box | Annotate all visible glasses.
“Eyes” | Point | Annotate the center of all of the eyes.

Attribute Name | Labels Relevant For | Description
Green | “Glasses” | For each “Glasses” label, add the color attribute.
Black | “Glasses” |
Blue | “Glasses” |
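
One way to keep these definitions unambiguous is to write them down as a small machine-readable schema next to the human-readable chart. The sketch below is illustrative only: the names LabelDef, AttributeDef, and validate_annotation are assumptions made for this example and do not come from any labeling platform's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelDef:
    name: str         # label name shown to the annotator
    tool: str         # annotation tool: "box" or "point"
    description: str  # what the annotator should mark


@dataclass(frozen=True)
class AttributeDef:
    name: str            # attribute value, here a color
    relevant_for: tuple  # label names the attribute applies to


# Labels and attributes copied from the chart above.
LABELS = {
    "Glasses": LabelDef("Glasses", "box", "Annotate all visible glasses."),
    "Eyes": LabelDef("Eyes", "point", "Annotate the center of all of the eyes."),
}
ATTRIBUTES = {
    name: AttributeDef(name, ("Glasses",)) for name in ("Green", "Black", "Blue")
}


def validate_annotation(label, tool, attribute=None):
    """Return a list of problems; an empty list means the annotation fits the schema."""
    definition = LABELS.get(label)
    if definition is None:
        return [f"unknown label: {label}"]
    problems = []
    if tool != definition.tool:
        problems.append(f"{label} must use the {definition.tool} tool, not {tool}")
    if attribute is not None:
        attr = ATTRIBUTES.get(attribute)
        if attr is None or label not in attr.relevant_for:
            problems.append(f"attribute {attribute} is not defined for {label}")
    return problems
```

Each check corresponds to a Yes/No question: is this label defined, is this the right tool for it, and is this attribute allowed on it?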

Step 2: Create a Special Instructions Section 

Often, there are subtleties around the required annotation. When they exist, it’s very important to include clear references to positive and negative examples. We recommend adding a special instructions section to clarify the labels.

For example:

Make sure to annotate the object as tightly and accurately as possible!

As we’ll see in the following step, more instructions will need to be added…

Let’s investigate.

Step 3: Get to Know Your Data

Never send data to an external vendor for labeling before labeling some of it yourself. Labeling can be tedious, and many developers skip this step. You are guaranteed to learn new things about your data by doing so, and as a bonus you will learn how long each item takes, which is a strong indication of your expected labeling cost.
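
As a rough illustration of how self-timed labeling translates into a cost estimate, here is a minimal sketch. The function name estimate_labeling_cost and every number in the usage lines (the timings, the item count, the hourly rate) are hypothetical and chosen only to show the arithmetic.

```python
from statistics import mean


def estimate_labeling_cost(sample_seconds, total_items, hourly_rate):
    """Extrapolate total hours and cost from a few self-timed items."""
    if not sample_seconds:
        raise ValueError("time at least one item before estimating")
    seconds_per_item = mean(sample_seconds)
    total_hours = seconds_per_item * total_items / 3600
    return total_hours, total_hours * hourly_rate


# Hypothetical inputs: five items timed by hand (in seconds), 10,000 items
# to label, and an assumed rate of 12 currency units per annotator hour.
hours, cost = estimate_labeling_cost([40, 55, 35, 60, 50], 10_000, 12.0)
print(f"about {hours:.1f} hours, about {cost:.0f} in labeling cost")
```

The estimate is only as good as the sample: time items that cover easy and hard cases, and re-run it whenever the instructions change what annotators have to do.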

We recommend annotating a few items yourself and adding clarifications to the instructions. Let’s take a look at the “Glasses” example to see how this works.

Labeling Glasses Image 1:

Description: In this image, there are glasses without a person wearing them.
Re-defined: There is no need to annotate the glasses because no person is wearing them.

Labeling Glasses Image 2:

Description: In this image, the woman’s hair is hiding part of the glasses.
Re-defined: If another object is hiding/occluding the glasses, annotate them anyway and estimate their full size.

Labeling Glasses Image 3:

Description: These are sunglasses and not glasses.
Re-defined: If the glasses are sunglasses, there is no need to annotate them.

Labeling Glasses Image 4:

Description: In this image, the glasses are red, but we only defined green, black, and blue color attributes.
Re-defined: Add an “Other” attribute for glasses whose color is not one of the defined attributes.

Step 4: Update Your Instructions

After you get to know your data, it is important to correct and update your instructions accordingly.

Label Name | Tool | Description
“Glasses” | Box | Annotate all the visible eyeglasses.
“Eyes” | Point | Annotate the eyes’ center.

Attribute Name | Labels Relevant For | Description
Green | “Glasses” | For each “glasses” add the color attribute.
Black | “Glasses” |
Blue | “Glasses” |
Other | “Glasses” |

Special Instructions:

  • There is no need to annotate glasses if a person is not wearing them. (Image 1)
  • If another object is hiding/occluding the glasses, annotate them anyway and estimate their full size. (Image 2)
  • If the glasses are sunglasses, there is no need to annotate them. (Image 3)
  • If the glasses are a color other than green, black, or blue, use the “Other” attribute. (Image 4)
  • Please annotate as tightly and as accurately as possible.
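
Special instructions like these are easiest to follow when each one reduces to a decision the annotator can make without guessing. As a minimal sketch, the rules for Images 1, 3, and 4 can be written as two small functions. The function and parameter names are hypothetical, and mapping unlisted colors to “Other” assumes the updated attribute list in this step.

```python
DEFINED_COLORS = {"green", "black", "blue"}


def should_annotate_glasses(worn_by_person, is_sunglasses):
    """Images 1 and 3: skip glasses nobody is wearing, and skip sunglasses."""
    return worn_by_person and not is_sunglasses


def color_attribute(observed_color):
    """Image 4: a color outside the defined list gets the "Other" attribute."""
    color = observed_color.strip().lower()
    return color.capitalize() if color in DEFINED_COLORS else "Other"


# Glasses lying on a table (Image 1) are skipped; red frames map to "Other".
assert should_annotate_glasses(worn_by_person=False, is_sunglasses=False) is False
assert color_attribute("red") == "Other"
```

If a rule cannot be written as a check this simple, that is usually a sign the instruction itself still leaves room for interpretation.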

Step 5: Adding Examples

Another way to strengthen your instructions is to add examples that help annotators avoid future mistakes. Always add both positive and negative examples, and as your labeling progresses, make it a practice to use real-world results as examples.

Negative Example:

In the following example, you will see an illustration of a wrong annotation.
Here, the bounding box is inaccurate as there is a gap.

Positive Example:

The annotation is tight around the eyeglasses, which is correct.
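
The difference between the loose box in the negative example and the tight box in the positive one can also be expressed as a number. One common measure is intersection over union (IoU) against a reference box drawn by a reviewer. The sketch below assumes boxes in (x_min, y_min, x_max, y_max) pixel coordinates; the 0.9 threshold and the sample coordinates are illustrative assumptions, not values from this post.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_tight(annotated, reference, min_iou=0.9):
    """True when the annotated box overlaps the reference closely enough."""
    return iou(annotated, reference) >= min_iou


reference = (100, 80, 200, 120)  # hypothetical reviewer box around the glasses
loose = (90, 70, 210, 130)       # leaves a 10-pixel gap on every side
tight = (101, 80, 200, 121)      # off by one pixel on two edges

print(is_tight(loose, reference), is_tight(tight, reference))
```

A check like this does not replace visual examples for annotators, but it gives reviewers a consistent way to decide which results to use as negative examples.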

Example of What NOT to Do (“Negative”):

In this example, there is no need to annotate the glasses, since there isn’t a person wearing them.


Step 6: Instructions Using Dataloop’s Annotation Studio

We recommend creating your instructions in the form of a presentation, which is more user-friendly for the annotator. In addition, annotators will be able to review the instructions at any time during the annotation project using the PDF tool on the Dataloop platform. Use the bug report mechanism to harvest problematic examples.

Be sure to stay tuned as more exciting capabilities come your way!

Conclusion

In this post, we reviewed how to easily create better annotation instructions by simply following some basic rules. This can help you improve your communication and prevent future mistakes.

By using our annotation instructions template, you’ll be able to quickly create your own simplified instructions.
Click here to download the full PDF instructions example.
