Workflow Overview

Overview

Annotation work in the Dataloop system is organized in a "Task" - a process that contains all of its required elements:

  • Data: the data that needs to be annotated
  • Team: the team that does the work (annotators)
  • Instructions: a recipe which includes the labels/attributes to use (ontology/taxonomy), the labeling tools to use (e.g., classification, bounding box, polygon, etc.), work instructions (a PDF attachment), and task-specific settings.
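
For illustration, the following is a minimal sketch of creating such a task with Dataloop's dtlpy Python SDK. The project, dataset, and assignee names are placeholders, and the exact parameters may vary between SDK versions.

    import datetime
    import dtlpy as dl

    # Log in (opens a browser flow if no cached token) and locate the project
    # and dataset - all names below are placeholders
    dl.login()
    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')

    # Create an annotation task over the dataset; the recipe linked to the
    # dataset supplies the labels, tools, and instructions
    task = dataset.tasks.create(
        task_name='annotation-task-1',
        due_date=datetime.datetime(2025, 12, 31).timestamp(),
        assignee_ids=['annotator1@example.com', 'annotator2@example.com'])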

Tasks & Assignments

We distinguish between two types of tasks:

  1. **Annotation task** - a task in which items are annotated.
  2. **QA task** - a task for validating annotation work done in an annotation task, with the option to flag annotations with an "Issue".

A task is broken down into "Assignments" - a sub-task for each assignee in the task. Items from the task are allocated to assignments (in advance or based on a queue, as decided in the task settings).
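
As a sketch with the dtlpy SDK (names are placeholders, and the assignment attributes shown are a best-effort assumption), the assignments generated for a task can be inspected programmatically:

    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')
    task = dataset.tasks.get(task_name='annotation-task-1')

    # Each assignment is the sub-task of a single assignee
    for assignment in task.assignments.list():
        print(assignment.annotator, assignment.id)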

Actions & Statuses

When a worker finishes working on an item in their assignment, they perform an action that sets a status on the item (e.g., the Complete action gives the item a "Completed" status).

  • **In annotation tasks** - the default statuses are "Discarded" (for disqualified items) and "Completed" (for items ready for the next phase).
  • **In QA tasks** - the default statuses are "Discarded" and "Approved".
    When looking at an item with an "Approved" status, you know that it went through at least two tasks: annotation and QA.
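
These statuses can also be set programmatically with the dtlpy SDK, as in this sketch (the item ID is a placeholder):

    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')
    item = dataset.items.get(item_id='my-item-id')

    # Mark the item as done in an annotation task...
    item.update_status(status=dl.ItemStatus.COMPLETED)

    # ...or approve it in a QA task (dl.ItemStatus.DISCARDED is also available)
    item.update_status(status=dl.ItemStatus.APPROVED)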

Continuous Workflow / Micro-Tasking

The labeling workflow in the Dataloop system can include any number of annotation and QA tasks, and you can add your own custom statuses to follow an item's progress through the workflow. For example, you can introduce a status such as "For expert review", create a new task containing only items with that status, and assign the expert as its sole assignee.
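
A sketch of how such a custom status could be attached when creating the task via the dtlpy SDK; the available_actions parameter and dl.ItemAction are assumptions about the SDK interface and may differ between versions:

    import datetime
    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')

    # Assumption: custom actions become selectable statuses for items in the task
    task = dataset.tasks.create(
        task_name='expert-review',
        due_date=datetime.datetime(2025, 12, 31).timestamp(),
        assignee_ids=['expert@example.com'],
        available_actions=[dl.ItemAction(action='For expert review')])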

Data items can go through multiple annotation and QA tasks, facilitating "micro-tasking" – breaking a large, complex annotation process involving many labels and annotation tools into multiple micro-tasks that are easier to complete, allowing the annotator to focus on very specific labels, objects, or annotation tools.

The QA Process

A QA task can be created for some or all of the items previously completed in an annotation task. The QA task has a manager and assigned contributors, each with an individual QA assignment. By default, QA tasks include “Approve” and “Discard” actions to set the corresponding statuses on items; these can be modified to suit the annotation workflow.

Annotations failing the QA process are flagged with “Issues,” and the item is returned to the annotator who created them. Flagging an annotation with an issue removes the “Completed” status from the item.
A. Once annotators fix an issue, it is marked “For review” and the item returns to the contributor who flagged the issue.
B. The QA contributor can reopen the issue or set the item’s status to “Approved,” a status used to show the progress of the QA task and the overall annotation process.
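
As a sketch with the dtlpy SDK (placeholder names), a QA task can be created on top of an existing annotation task:

    import datetime
    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')
    task = dataset.tasks.get(task_name='annotation-task-1')

    # Create a QA task that covers items completed in the annotation task
    qa_task = task.create_qa_task(
        due_date=datetime.datetime(2025, 12, 31).timestamp(),
        assignee_ids=['reviewer@example.com'])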

Sample Workflow

A complex labeling job is broken down into multiple “microtasks” where annotators can focus on specific labels, complete annotation quickly, and send the image to the next microtask in line. A labeling task is created using the default “Completed” and “Discarded” statuses. Annotation managers can query for items completed in task-A and assign them to task-B (or automate this process using pipelines).
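
A sketch of that hand-off with the dtlpy SDK; the task names are placeholders and the filter field used to match items completed in task-A is an assumption:

    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')
    task_b = dataset.tasks.get(task_name='task-B')

    # Assumption: match items by the status they received in task-A
    filters = dl.Filters()
    filters.add(field='metadata.system.refs.metadata.status', values='completed')

    # Add the matching items to the next microtask in line
    task_b.add_items(filters=filters)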

Items that went through all microtasks are passed to a QA task, where they can receive “Approved” and “Discarded” statuses. Another status can be added (e.g., through a pipeline) to send items for expert review as another QA task. At the end, all items with an “Approved” status are extracted and used for model training.
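
Likewise, the items that ended up with an “Approved” status can be collected for training; the status filter field below is an assumption:

    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')

    # Assumption: approved items are matched via their task status reference
    filters = dl.Filters()
    filters.add(field='metadata.system.refs.metadata.status', values='approved')

    # Download the approved items for the training set
    pages = dataset.items.list(filters=filters)
    for item in pages.all():
        item.download(local_path='/tmp/approved-items')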

A dataset of images from dashboard cameras, for example, can be assigned to a number of annotators: the first annotates cars, the second annotates traffic signs, the third annotates pedestrians, the fourth annotates buildings, and so on.

Finally, in addition to the default statuses of Approved/Discarded in QA tasks and Completed/Discarded in labeling tasks, additional statuses can be created when tasks are created in Dataloop’s pipelines.

Annotation Task

From the dataset browser, you can create an annotation task over an entire dataset or a subset of its items, such as specific folders (e.g., WorkWeek25-Uploads) or search criteria (e.g., pictures taken from camera 41).
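
A sketch of creating a task over such a subset with the dtlpy SDK (the folder name and assignee are placeholders):

    import datetime
    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')

    # Limit the task to items under a specific folder
    filters = dl.Filters()
    filters.add(field='dir', values='/WorkWeek25-Uploads')

    task = dataset.tasks.create(
        task_name='workweek25-annotation',
        due_date=datetime.datetime(2025, 12, 31).timestamp(),
        assignee_ids=['annotator1@example.com'],
        filters=filters)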

Using a task, the annotation manager can get analytics about working hours, the average time per item, and more.

As an annotator, you will usually have access only to the tasks you need to annotate.

Adding Items to Existing Tasks

When repeating this process, you can choose to open new annotation tasks or add the selected items to an existing task.
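
As a sketch with the dtlpy SDK, newly uploaded items can be appended to an existing task instead of opening a new one (the folder and task names are placeholders, and task.add_items accepting a filter is an assumption):

    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')
    task = dataset.tasks.get(task_name='annotation-task-1')

    # Append every item from a newly uploaded folder to the existing task
    filters = dl.Filters()
    filters.add(field='dir', values='/WorkWeek26-Uploads')
    task.add_items(filters=filters)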

Breaking your work into small tasks enables better control over progress, rather than having long-lasting, continuously growing tasks.

Distributing and Redistributing Tasks

Items in a task are distributed among all contributors. The automatic option assigns an equal portion of items to each contributor, while the manual option gives you more control (some contributors are more experienced than others, and some may not be fully available for the assigned task).
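
A sketch of the manual option with the dtlpy SDK; dl.Workload.generate and the load percentages are assumptions about the SDK interface:

    import datetime
    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')

    # Assumption: an uneven split - the experienced contributor gets 70% of the items
    workload = dl.Workload.generate(
        assignee_ids=['senior@example.com', 'junior@example.com'],
        loads=[70, 30])

    task = dataset.tasks.create(
        task_name='manually-distributed-task',
        due_date=datetime.datetime(2025, 12, 31).timestamp(),
        workload=workload)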

While the task is actively being worked on by annotators, you can choose to redistribute the remaining items and rebalance them between annotators.

Reassigning tasks

Because changes in the workforce can occur while the task is active, you have the option to reassign an assignment from one annotator to another.
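
A sketch of such a reassignment with the dtlpy SDK; assignment.reassign and its parameters are assumptions and may differ between SDK versions:

    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')
    dataset = project.datasets.get(dataset_name='My-Dataset')
    task = dataset.tasks.get(task_name='annotation-task-1')

    # Assumption: move an entire assignment from a departing annotator to a new one
    for assignment in task.assignments.list():
        if assignment.annotator == 'leaving@example.com':
            assignment.reassign(assignee_id='replacement@example.com')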

Tasks/Assignments Statuses

  • To-Do - the task/assignment was created, but work has not started yet.
  • In-Progress - the assignment is being worked on by its assignees. A task is in progress when at least one of its assignments is being worked on.
  • Completed - all items in the task/assignment have a status set on them.
  • Completed with issues - all items have statuses, but one or more items have an issue on at least one of their annotations, so the task/assignment is not considered completed.

Users & Roles

These are the main users that typically take part in a task, correlated with their roles in a project:

  • Task Owner - typically has the role of project manager, and is responsible for arranging and preparing the data, delivering it to tasks, and creating those tasks.
  • Task Manager - usually has the role of annotation manager in the project, and is responsible for task execution, workforce management, and accuracy assurance.
  • Contributor - responsible for the actual labeling or quality assurance work, and has the role of "annotator" in the project, limited to such labeling/QA work.
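
A sketch of granting these roles with the dtlpy SDK; project.add_member and the dl.MemberRole values are assumptions about the SDK interface:

    import dtlpy as dl

    project = dl.projects.get(project_name='My-Project')

    # Assumption: project membership grants the role used inside tasks
    project.add_member(email='manager@example.com', role=dl.MemberRole.ANNOTATION_MANAGER)
    project.add_member(email='annotator@example.com', role=dl.MemberRole.ANNOTATOR)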
