- 01 May 2023
Data Versioning
- Updated On 01 May 2023
Dataloop’s powerful data versioning provides you with unique tools for data management.
Clone, merge, and slice & dice your files to create multiple versions for various applications. Sample use cases include:
- Golden training sets management
- Reproducibility (dataset training snapshot)
- Experimentation (creating subsets of different kinds)
- Task or assignment management
Data Version "Snapshot"
Use the versioning feature to save data (items, annotations, metadata, etc.) before any major process. For example, a saved version can serve as a rollback mechanism: if an error occurs, you can return to the original dataset without losing data.
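The snapshot-then-rollback idea can be illustrated with a minimal, platform-independent sketch. It does not use the Dataloop SDK: the dataset is modeled as a plain dictionary, and `run_with_snapshot` and `faulty_relabel` are hypothetical names introduced only for this example.

```python
import copy


def run_with_snapshot(dataset, process):
    """Run `process` on `dataset`; return the saved snapshot if it fails."""
    # Save a full copy (items, annotations, metadata) before the process starts.
    snapshot = copy.deepcopy(dataset)
    try:
        process(dataset)
        return dataset
    except Exception:
        # Roll back: hand the untouched snapshot to the caller.
        return snapshot


def faulty_relabel(dataset):
    """Hypothetical process that damages the data and then fails."""
    dataset["items"]["cat.jpg"]["annotations"].clear()
    raise RuntimeError("relabeling failed midway")


dataset = {
    "items": {
        "cat.jpg": {"annotations": ["box:cat"], "metadata": {"split": "train"}}
    }
}
restored = run_with_snapshot(dataset, faulty_relabel)
```

In this sketch the working copy loses its annotations when the process fails, while `restored` still holds the data as it was before the process ran.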
Cloning Dataset Items
Clone dataset items with their annotations and/or metadata. Cloning does not copy the item status (approved, completed, discarded, etc.).
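The cloning rule above can be sketched as a small model. This is an illustration only, not the Dataloop SDK: the `Item` class and `clone_item` function are hypothetical, and the two flags simply mirror the "Clone with item annotations" and "Clone with item metadata" options.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """Hypothetical stand-in for a dataset item (not an SDK class)."""
    name: str
    annotations: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    status: Optional[str] = None  # e.g. "approved", "completed", "discarded"


def clone_item(item, with_annotations=True, with_metadata=True):
    """Copy an item; annotations and metadata are optional, status is never copied."""
    return Item(
        name=item.name,
        annotations=copy.deepcopy(item.annotations) if with_annotations else [],
        metadata=copy.deepcopy(item.metadata) if with_metadata else {},
        status=None,  # the item status is not carried over to the clone
    )


original = Item(
    "cat.jpg",
    annotations=["box:cat"],
    metadata={"source": "camera-1"},
    status="approved",
)
clone = clone_item(original, with_annotations=True, with_metadata=False)
```

Here `clone` keeps the annotation, has empty metadata because that option was switched off, and has no status even though the original was approved.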
Cloning Entire Datasets
To clone an entire dataset, follow these steps:
- From the left portal menu, select Data Management > Datasets.
- Choose the Dataset from the list and click on the ellipsis icon.

- Click Clone Dataset from the list.
- From the Clone Dataset/Items window, choose to which dataset you want to clone the items:
- Existing Dataset:
- Select a dataset from the list.
- Search for and select the folder in the dataset that you want the items to be cloned to (root folder, subfolders, etc.).
- New Dataset: Enter a name for the new dataset.
- Choose the cloning options:
- Clone with item annotations
- Clone with item metadata
- Click Clone.
Cloning Items
Dataloop allows cloning items into target datasets. Cloning is supported only from internal storage (for example, Dataloop cloud storage) to internal storage, or from external storage (for example, S3) to external storage that uses the same storage driver (that is, the same integration secret and a storage driver pointing at the same location).
To clone items, follow these steps:
- From the left portal menu, select Data Management > Datasets.
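The storage restriction can be expressed as a simple compatibility check. The sketch below is a hypothetical model, not part of the Dataloop SDK: `Storage`, `driver_id`, and `can_clone` are names invented for the example, and a single `driver_id` string stands in for "same integration secret and same storage location".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Storage:
    """Hypothetical description of where a dataset's items are stored."""
    kind: str  # "internal" or "external"
    driver_id: Optional[str] = None  # only meaningful for external storage


def can_clone(source: Storage, target: Storage) -> bool:
    """Return True if items may be cloned from `source` to `target`."""
    if source.kind == "internal" and target.kind == "internal":
        return True
    if source.kind == "external" and target.kind == "external":
        # External-to-external cloning requires the same storage driver.
        return source.driver_id is not None and source.driver_id == target.driver_id
    # Mixing internal and external storage is not supported.
    return False


internal = Storage("internal")
s3_bucket_a = Storage("external", driver_id="s3-driver-a")
s3_bucket_b = Storage("external", driver_id="s3-driver-b")
```

With these definitions, `can_clone(internal, internal)` and `can_clone(s3_bucket_a, s3_bucket_a)` are allowed, while `can_clone(internal, s3_bucket_a)` and `can_clone(s3_bucket_a, s3_bucket_b)` are rejected.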
- Click on a dataset from the list.
- Use any of the following options:
- Right-click one or more items and select Clone from the list.
- Select one or more items and click the Clone Dataset icon.
- Click the Clone Dataset icon to clone all items in the dataset.

- From the Clone Dataset/Items window, choose to which dataset you want to clone the items:
- Existing Dataset:
- Select a dataset from the list.
- Search for and select the folder in the dataset that you want the items to be cloned to (root folder, subfolders, etc.).
- New Dataset: Enter a name for the new dataset.
- Choose the cloning options:
- Clone with item annotations
- Clone with item metadata
- Click Clone.

Merge Datasets
The outcome of a dataset merge depends on how similar or different the datasets are.
- Cloned datasets: Items, annotations, and metadata are merged, so annotations from different datasets appear on the same item.
Items from cloned datasets can be merged only if they were cloned from the same master item, that is, the cloned items must point to the same reference.
- Different datasets (not clones) with similar recipes: Items are summed up, and related items are duplicated.
- Datasets with different recipes: Datasets with different default recipes cannot be merged. To merge them, first use the Switch recipe option at the dataset level (ellipsis icon) to match the recipes between the datasets.
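The three merge outcomes above can be sketched with a small model. This is a simplified illustration, not the Dataloop SDK: `Item`, `Dataset`, `master_id`, and `merge_datasets` are hypothetical names, and recipe similarity is reduced to comparing a single `recipe_id`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """Hypothetical item; `master_id` is the master item a clone points to."""
    name: str
    master_id: Optional[str] = None
    annotations: list = field(default_factory=list)


@dataclass
class Dataset:
    """Hypothetical dataset with a single default recipe."""
    name: str
    recipe_id: str
    items: list = field(default_factory=list)


def merge_datasets(name, first, second):
    """Merge two datasets following the rules described above."""
    if first.recipe_id != second.recipe_id:
        raise ValueError("different recipes: switch the recipe before merging")
    merged = Dataset(name, first.recipe_id)
    by_master = {}  # master_id -> merged item
    for item in first.items + second.items:
        existing = by_master.get(item.master_id) if item.master_id else None
        if existing is not None:
            # Clones of the same master item: combine annotations on one item.
            existing.annotations.extend(item.annotations)
            continue
        # Unrelated items are kept side by side, even if they look alike.
        merged_item = Item(item.name, item.master_id, list(item.annotations))
        merged.items.append(merged_item)
        if item.master_id:
            by_master[item.master_id] = merged_item
    return merged


clone_a = Dataset("clone-a", "recipe-1", [Item("cat.jpg", "m1", ["box:cat"])])
clone_b = Dataset("clone-b", "recipe-1", [Item("cat.jpg", "m1", ["tag:indoor"])])
merged_clones = merge_datasets("merged-clones", clone_a, clone_b)

other_a = Dataset("other-a", "recipe-1", [Item("dog.jpg", None, ["box:dog"])])
other_b = Dataset("other-b", "recipe-1", [Item("dog.jpg", None, ["box:dog"])])
merged_others = merge_datasets("merged-others", other_a, other_b)
```

In this model, merging the two clones yields one item carrying both annotations, merging the two unrelated datasets yields two separate `dog.jpg` items, and merging datasets with different `recipe_id` values raises an error.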
To merge datasets, follow these steps:
- From the left portal menu, select Data Management > Datasets.
- Select the datasets from the list.
- Click the Merge icon.
- In the Merge Datasets window, enter a Name for the newly merged dataset.
- Select whether to merge With Items Annotations and/or With Items Metadata (i.e., with information entered by annotators).

Once the merge completes successfully, the new dataset is added to the list with its Dataset type shown as Merge.