Overview
Dataloop is an enterprise-grade data engine for managing the lifecycle of AI/ML projects that work with unstructured data (computer vision, NLP), from development to production at scale.
The core platform modules (Data Management, Workforce Management, Taxonomy Management, Compute Management, and Model Management) are the building blocks for:
- Data pipelines - automate and streamline data processing with human-in-the-loop at scale.
- Data applications - manually label and process data at high quality and lower cost.
- Custom applications - let developers build project-specific solutions, either new or forked from Dataloop applications, to maximize project efficiency.

Data Management - The platform lets you manage and version unstructured data (such as images, videos, audio, text, and LIDAR) without affecting the binaries. Everyone in the data organization, including data scientists, data engineers, and data operators, can search, filter, sort, clone, merge, query, and annotate data from a single location.
- Ingest (index only) your data into Dataloop by connecting your cloud storage, such as AWS, GCP, or Azure.
- Manage datasets structured with folders, and run subsecond queries on millions of files by any item attribute, item metadata, or user metadata.
- Set up upstream and downstream sync to keep your cloud storage in sync with your datasets, no matter where you make changes.
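To make querying by item attribute, item metadata, or user metadata concrete, here is a minimal in-memory sketch. It is not the Dataloop SDK and says nothing about how the platform indexes data. The item layout (`metadata.system`, `metadata.user`), the dotted-path syntax, and the helper names `get_path` and `query` are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical item records; the field layout is assumed for illustration only.
ITEMS: List[Dict[str, Any]] = [
    {"name": "cat_001.jpg", "dir": "/train",
     "metadata": {"system": {"mimetype": "image/jpeg"}, "user": {"reviewed": True}}},
    {"name": "cat_002.jpg", "dir": "/train",
     "metadata": {"system": {"mimetype": "image/jpeg"}, "user": {"reviewed": False}}},
    {"name": "clip_001.mp4", "dir": "/val",
     "metadata": {"system": {"mimetype": "video/mp4"}, "user": {"reviewed": True}}},
]

_MISSING = object()  # sentinel so that a missing field never matches a real value


def get_path(record: Dict[str, Any], dotted: str) -> Any:
    """Follow a dotted path such as 'metadata.user.reviewed' through nested dicts."""
    current: Any = record
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def query(items: Iterable[Dict[str, Any]], conditions: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the items for which every dotted path equals its expected value."""
    return [
        item for item in items
        if all(get_path(item, path) == expected for path, expected in conditions.items())
    ]


reviewed_train = query(ITEMS, {"dir": "/train", "metadata.user.reviewed": True})
print([item["name"] for item in reviewed_train])  # ['cat_001.jpg']
```

The sketch scans every item linearly, which is fine for a handful of records; queries over millions of files would need an index rather than a scan.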
Workforce Management - Manage your project workforce by adding your own domain experts, working with labeling service providers, or crowdsourcing. You can group workers by domain expertise or any other context.
- Create workflows for annotation work and QA/QC.
- Workflows can include any number of steps, so you can break large, complex annotation tasks into small micro-tasks that are easier to work on and control.
Compute Management (FaaS) - Function as a Service (FaaS) lets users deploy packages and run services from them, with access to computing resources and data from the Dataloop system.
- Extend Dataloop's functions or develop your project in Dataloop by running code and models in the FaaS module.
- Connect your code from Git or upload a package, select compute resources, and debug at runtime.
Model Management - Manage your models' lifecycle and ongoing development alongside your data.
- Connect your model architecture (with weight files, if pre-trained) through Git to use it with your services (FaaS) and pipelines.
- Train it with versions of your data, or set up continuous-learning pipelines with human-in-the-loop.
Data Applications / Annotation Studios - Generate the data your models require by labeling with the Dataloop studios for image, video, audio, and NLP.
- Tuned for performance and quality, our data applications can be customized to meet your project-specific needs.
- For the highest quality and accuracy at lower data labeling cost, upload model inferences, use consensus tasks, and break complex tasks into micro-task workflows.
Data Pipelines - Compose pipelines to process data with human-in-the-loop (HITL). Support your business processes and development/project pipelines with any data flow by combining functions, models, and manual annotation work. You can also add custom nodes to a pipeline, with a settings UI and node functionality that you define and build.
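As a rough illustration of combining functions, models, and manual annotation work in one flow, the sketch below chains three plain Python callables and routes low-confidence predictions to a review queue. It is a conceptual toy, not Dataloop's pipeline API: the node names, the `score` field, the fixed `"cat"` label, and the `0.8` threshold are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Item = Dict[str, Any]
Node = Callable[[Item], Item]


def normalize_name(item: Item) -> Item:
    """Function node: a simple automated processing step."""
    return {**item, "name": str(item["name"]).lower()}


def predict(item: Item) -> Item:
    """Model node stand-in: copies a made-up, precomputed score instead of running a model."""
    return {**item, "label": "cat", "confidence": item["score"]}


def route_low_confidence(threshold: float) -> Node:
    """Build a node that flags uncertain predictions for manual annotation."""
    def node(item: Item) -> Item:
        return {**item, "needs_review": item["confidence"] < threshold}
    return node


def run_pipeline(items: List[Item], nodes: List[Node]) -> Tuple[List[Item], List[Item]]:
    """Run every item through all nodes, then split by the human-in-the-loop flag."""
    accepted: List[Item] = []
    review_queue: List[Item] = []
    for item in items:
        for node in nodes:
            item = node(item)
        (review_queue if item.get("needs_review") else accepted).append(item)
    return accepted, review_queue


NODES: List[Node] = [normalize_name, predict, route_low_confidence(0.8)]
accepted, review_queue = run_pipeline(
    [{"name": "IMG_1.JPG", "score": 0.95}, {"name": "IMG_2.JPG", "score": 0.40}],
    NODES,
)
print([item["name"] for item in accepted])      # ['img_1.jpg']
print([item["name"] for item in review_queue])  # ['img_2.jpg']
```

Each node returns a new dict rather than mutating its input, which keeps the steps independent and easy to reorder or swap out.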

Taxonomy Management (Recipe & Ontology) - Your project’s taxonomy gives machines the means to understand hierarchies in the information. The Dataloop ontology (a domain-specific taxonomy) and recipe contain hierarchical labels and attributes, so you can associate multiple layers of labels with an image and store and retrieve them efficiently and cost-effectively.
- Ontology
  - Labels - hierarchically structured names used to classify annotations.
  - Attributes - a changeable property or value of a label.
- Recipe - linked to an ontology, the recipe adds labeling instructions and settings, such as the labeling tools to use, the mapping of tools to specific labels/attributes, PDF instruction files, etc.
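The relationship between labels, attributes, ontology, and recipe can be sketched as a small data model. This is only an illustration of the structure described above, not Dataloop's actual schema: the class and field names, the dotted label paths, the tool names, the file name, and the rule that a child label inherits its parent's tool are all assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Label:
    """A label that may contain child labels, forming a hierarchy."""
    name: str
    children: List["Label"] = field(default_factory=list)

    def paths(self, prefix: str = "") -> List[str]:
        """List this label and all descendants as dotted paths (assumed notation)."""
        full = f"{prefix}.{self.name}" if prefix else self.name
        found = [full]
        for child in self.children:
            found.extend(child.paths(full))
        return found


@dataclass
class Ontology:
    labels: List[Label]
    attributes: Dict[str, List[str]]  # attribute name -> allowed values

    def label_paths(self) -> List[str]:
        return [path for label in self.labels for path in label.paths()]


@dataclass
class Recipe:
    """Links an ontology to labeling settings such as tool mapping and instructions."""
    ontology: Ontology
    tool_by_label: Dict[str, str]
    instructions_pdf: str = ""

    def tool_for(self, label_path: str) -> str:
        if label_path not in self.ontology.label_paths():
            raise KeyError(f"unknown label: {label_path}")
        # Assumption of this sketch: fall back to the nearest ancestor with a tool.
        parts = label_path.split(".")
        while parts:
            key = ".".join(parts)
            if key in self.tool_by_label:
                return self.tool_by_label[key]
            parts.pop()
        raise KeyError(f"no tool mapped for: {label_path}")


ontology = Ontology(
    labels=[Label("vehicle", [Label("car"), Label("truck")]), Label("person")],
    attributes={"occluded": ["yes", "no"]},
)
recipe = Recipe(ontology, {"vehicle": "box", "person": "polygon"}, "labeling_guide.pdf")

print(ontology.label_paths())          # ['vehicle', 'vehicle.car', 'vehicle.truck', 'person']
print(recipe.tool_for("vehicle.car"))  # box
```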