Data Browser
  • 09 Jan 2024

Overview

The Dataset Browser lets you explore and navigate dataset items through an interface for searching, visualizing, and accessing the data in a dataset. You can filter, sort, and view dataset items such as images, text, audio, video, and LiDAR, which makes it easier to analyze and work with large volumes of data.


The Dataset Browser page features are explained in the following sections (marked on the screenshot):


Section 1: Access the Dataset Browser

Access the dataset browser using one of the following options:

  • In the Dataloop left-side menu, select Data.
    • For an existing dataset, double-click the dataset name to open the Dataset Browser.
    • For a new dataset, see the Create a dataset article.
  • In the Project Overview > Data Management widget, click on your Dataset name.

The Classic Browser is displayed by default. To use the new version of the Dataset Browser, click Switch to Browser 2.1 (marked as section 1 on the screenshot above).

User Roles and Permissions

The actions you can carry out in the Dataset Browser depend on your user role. See the Roles and Permissions article to understand the user roles and permissions.


Section 2: Search and Filter Items

The dataset browser filter enables you to refine datasets by filtering them based on:

  • The item features, such as filename, media type, etc.
  • Annotation features, such as filter by label, type, etc.

For more information about the Search Filter, see the Smart Search article.
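
As a rough illustration of the two filter families above, the following sketch filters an in-memory list of item records by item features and by annotation features. The record layout (`filename`, `media_type`, `annotations`, `label`, `type`) and the helper name `filter_items` are hypothetical and were chosen only for this example; they are not the platform's search syntax or SDK.

```python
from typing import Optional

# Hypothetical item records; the field names are assumptions for illustration.
ITEMS = [
    {"filename": "cat_01.jpg", "media_type": "image",
     "annotations": [{"label": "cat", "type": "box"}]},
    {"filename": "dog_01.jpg", "media_type": "image",
     "annotations": [{"label": "dog", "type": "polygon"}]},
    {"filename": "intro.mp3", "media_type": "audio", "annotations": []},
]


def filter_items(items,
                 name_contains: Optional[str] = None,
                 media_type: Optional[str] = None,
                 label: Optional[str] = None,
                 annotation_type: Optional[str] = None):
    """Return the items that match every criterion that is not None."""
    result = []
    for item in items:
        annotations = item["annotations"]
        # Item features: filename and media type.
        if name_contains is not None and name_contains not in item["filename"]:
            continue
        if media_type is not None and item["media_type"] != media_type:
            continue
        # Annotation features: at least one annotation must match.
        if label is not None and not any(a["label"] == label for a in annotations):
            continue
        if annotation_type is not None and not any(a["type"] == annotation_type for a in annotations):
            continue
        result.append(item)
    return result


images = filter_items(ITEMS, media_type="image")
cats = filter_items(ITEMS, media_type="image", label="cat")
```

Here `images` holds the two image records and `cats` holds only `cat_01.jpg`, because criteria are combined with a logical AND.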


Section 3: Viewer

The Dataset browser offers two viewing options, Thumbnail and Details. By default, the thumbnail view is shown, and you can use the respective View icons to switch between these options.

Info

When there are no items in the dataset:

  • If the dataset is stored in Dataloop storage, click Upload File or Upload Folder to add items to the dataset.
  • If the dataset is stored in external storage, click Sync Data to sync your external dataset items.


Thumbnail View

The Thumbnail view displays smaller versions of dataset items such as images, documents, and audio files in a grid of preview images. By default, this view is displayed, and the items are sorted by the latest upload date. It allows users to quickly browse and select specific content without opening each item individually.

In the Thumbnail view, you can:

  • Adjust the thumbnail size using the slider control at the bottom left of the page.
  • Use smaller thumbnails to view more items on a page.
  • Toggle the Show File Name option in Settings to show or hide file names. It is enabled by default.
  • Use the Sort By option to sort the items in the dataset. By default, the items are sorted by Creation Date.

Items color-coded as Green

Dataloop displays items with distinct colors corresponding to their annotation status:

  • Green: The Green color indicates that the item is annotated.
  • No Color: The absence of color indicates that the item is not annotated.


Details View

In this view, users can see a list of items with their associated details including File Name, File Created Date, Media Type, Item's annotation status, etc. This makes it easier to access and review specific information about each item in a structured and organized manner.

In the Details view, you can:

  • Click the checkbox next to the file name to select all the items on the current page.
  • Click Manage Columns to hide columns.
  • Sort items based on the Columns.

Common Features in the Thumbnail and Details Views

  • The number of selected items is highlighted.
  • Click the Select All option to select all the items on the current page.
  • The total number of dataset items is displayed, along with breadcrumb navigation that gives a clear path back to higher levels, such as the sub-folder and folder.
  • Settings > Show Hidden Files: By default, the Show Hidden Files option is disabled. You can toggle it to show the hidden files.
  • Items per page: By default, 100 items are displayed per page. You can select 2, 25, 50, 100, 250, 500, and 1000 items per page.
  • Use the page navigation options to view next and previous pages, or enter a specific page number and click Go to view the page.
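
To make the paging behavior concrete, the sketch below computes how many pages a dataset needs and which items appear on a given page. It assumes that pages are numbered from 1 and that the last page may be only partially full; the function names are hypothetical and not part of the product.

```python
import math


def page_count(total_items: int, per_page: int = 100) -> int:
    """Number of pages needed to show total_items at per_page items each."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return math.ceil(total_items / per_page)


def page_bounds(page: int, total_items: int, per_page: int = 100) -> tuple:
    """Return the 1-based (first, last) item positions shown on a page."""
    pages = page_count(total_items, per_page)
    if not 1 <= page <= pages:
        raise ValueError(f"page must be between 1 and {pages}")
    first = (page - 1) * per_page + 1
    last = min(page * per_page, total_items)
    return first, last
```

For example, a dataset of 1234 items at the default 100 items per page spans 13 pages, and the last page shows items 1201 through 1234.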

View Context

You can use one of the following view contexts to display items in the Dataset Browser. A filter is applied within the scope you choose, whether that is the entire dataset or a specific folder.

The Dataset Browser organizes items in a file-system-like structure:

  • Items based: This is the default view. It shows all items regardless of their folder structure, which lets you apply filters and display all items at the Root Folder (Dataloop). When you select a folder, it shows items only and does not show sub-folders, even if they exist.
  • Folders based: It shows the items in the folder or sub-folder you select. When you select the Root Folder (Dataloop), it shows the items and folders, if any, in the root folder.
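
The difference between the two view contexts can be sketched with plain path handling. The item paths and function names below are hypothetical, and the sketch assumes that the items-based view of a folder includes items nested in its sub-folders, which the description above does not state explicitly.

```python
from pathlib import PurePosixPath

# Hypothetical item paths inside a dataset; "/" stands for the root folder.
PATHS = [
    "/a.jpg",
    "/cats/c1.jpg",
    "/cats/kittens/k1.jpg",
    "/dogs/d1.jpg",
]


def items_based(paths, folder="/"):
    """Flat view: every item at or below folder, with no sub-folder entries."""
    root = PurePosixPath(folder)
    return sorted(p for p in paths if root in PurePosixPath(p).parents)


def folders_based(paths, folder="/"):
    """Hierarchical view: direct sub-folders and direct items of folder."""
    root = PurePosixPath(folder)
    subfolders, items = set(), []
    for p in paths:
        path = PurePosixPath(p)
        if root not in path.parents:
            continue  # the item lives outside the selected folder
        relative = path.relative_to(root)
        if len(relative.parts) == 1:
            items.append(p)  # item sits directly in the folder
        else:
            subfolders.add(str(root / relative.parts[0]))
    return sorted(subfolders), sorted(items)
```

With these paths, `items_based(PATHS, "/cats")` lists both `c1.jpg` and the nested `k1.jpg`, while `folders_based(PATHS, "/cats")` lists the `/cats/kittens` sub-folder and only the direct item `c1.jpg`.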

You can perform the following actions:

  • Click Folders based to view items and sub-folders in a folder.
  • Create folders: Select the root folder and click Add Folder, or click the new folder icon that appears when you hover over Root Folder (Dataloop).
  • Create sub-folders: Select the folder and click Add Folder, or click the new folder icon that appears when you hover over the selected folder.

  • Move items between folders:

    1. Select one or more items from the current page,
    2. Right-click and select File Actions > Move to Folder.
    3. Select the folder from the list.
    4. Click Move.
  • Edit a folder name: Hover over the folder and click the Pencil icon.

  • Select a folder and right-click to:
    • Rename: Rename the selected folder.
    • Move: Move the items in the selected folder to another folder.
    • Copy item path: Copies the complete item path.
    • Create Trigger: Creates a trigger function for the items in the selected folder.
    • Delete: Deletes the selected folder and the items in it.

Items can no longer be viewed based on tasks.

Section 4: Settings, Actions, and Details

Data Browser allows users to perform various actions, manage settings, and access detailed information about specific items or features.

Dataset Details

The Dataset Details panel provides the following information about the dataset. To view it, make sure no items are selected.

  • Dataset ID: The ID of the dataset. Click on the Copy icon to copy the dataset ID.
  • Recipe: The name and the link of the recipe that is configured to the dataset. Click on the link to open the recipe page.
  • Dataset Analytics: Dataset analytics refers to the process of collecting, analyzing, and deriving insights from a dataset. Click on the link to view the Progress tab of the Analytics page.
  • Project: The project name of the dataset. Click on the Copy icon to copy the project ID.
  • Owning Organization: The name of the organization to which the dataset is affiliated. Click on the Copy icon to copy the Organization ID.

Upload Items

Clicking on the Upload (Sync for external cloud storage) icon allows you to upload files and folders into Dataloop's storage.

Item Tab

Choose an item to view the following details in the right-side panel:

  • File Name: The name of the selected item. Click on the copy icon to copy the file name.
  • Created at: The creation date of the selected item.
  • Description: The text description of the item. Click on the pencil icon to add or edit descriptions. Also, item descriptions can be added during file uploads, serving as an additional way to search for items containing specific text or descriptions.
  • File path: The folder path where the file is located. Click on the copy icon to copy the file path.
  • Item ID: Unique identification for the item. Click on the copy icon to copy the item ID.
  • Item path: A URL link to the item on the Dataloop platform. Click on the copy icon to copy the item path.
  • Parent Item Link: If the item is a clone, the parent item is shown as a link. The link opens the dataset where the original item is located, filtered to that item.
  • Labeling Tasks: This section provides the number of annotations and classifications associated with the selected item.

Automation Tab

The Automation tab lets you run the selected item with FaaS, a Pipeline, or Model Predictions. Click one of the following options to create a function, pipeline, or model prediction execution:

  • Run with FaaS: Lists the functions of all activated FaaS services. Select a function to execute with the selected items.
  • Run with Pipeline: It allows you to select a pipeline to execute with the selected items.
  • Run Model Predictions: It allows you to select a model to generate predictions for the selected items. Only trained and deployed models are available for selection.

When executions are available, you can search executions by function, application, or pipeline. Also, the following details are displayed:

  • Pipeline: The name and link of the pipeline. Click on the link to view the pipeline.
  • Application name: Name of the application.
  • Function name: Name of the function.
  • Execution Status: Success, Running, and Failed.
  • Updated At: Date and time of the execution update.
  • Rerun: If needed, click the Play icon to rerun the execution.
  • Filter icon: Filter executions based on the status, such as Success, Failed, Running, and Pending.

Logs and Executions

Click the link to access a comprehensive overview on the Executions or Logs page.

Metadata Tab

Item metadata refers to the descriptive information and attributes associated with individual items within a dataset.

You can perform the following actions:

  • Click on the copy icon to copy the metadata.
  • To edit the metadata:
    1. Click the Edit icon to open the editor.
    2. Make changes as required.
    3. Click on the Save icon to save the changes.

Dataset Actions

The Dataset Browser lets you perform actions at both the dataset level and the item level. The actions described below are available when you click Dataset Actions. A few actions are not applicable if you select more than one item.

Right-click Actions

You can also use the right-click to perform the following actions. It's important to note that actions from the right-click menu cannot be applied to multiple selected items simultaneously. The menu is opened only for the individual item on which the right-click is performed.

Create a Task or Add the Dataset to an Existing Task

  1. In the Dataset Browser, click Dataset Actions.
  2. Select Labeling Tasks, and then select one of the following:
    1. Create a New Task
    2. Add to Existing Task

Items to Task or Model

When you create a task or model from the Dataset Browser, it includes all items in the dataset.

Rename an Item

  1. In the Dataset Browser, select the item you want to rename.
  2. Click Dataset Actions.
  3. Select File Actions > Rename.
  4. Edit the name and click Rename. A confirmation message is displayed.

Export an Item

  1. In the Dataset Browser, select the item you want to export.
  2. Click Dataset Actions.
  3. Select File Actions > Export.

Clone an Item

  1. In the Dataset Browser, select the item you want to clone.
  2. Click Dataset Actions.
  3. Select File Actions > Clone.

Classify an Item

  1. In the Dataset Browser, select the item you want to classify.
  2. Click Dataset Actions.
  3. Select File Actions > Classification.

Move Items to a Folder

  1. In the Dataset Browser, select the item you want to move.
  2. Click Dataset Actions.
  3. Select File Actions > Move to Folder.
  4. Select a folder from the list.
  5. Click Move. A confirmation message is displayed.

Open an Item in a New Browser Tab

This action lets you view images, play audio files, and so on in a new browser tab.

  1. In the Dataset Browser, select the item.
  2. Click Dataset Actions.
  3. Select File Actions > Open File in New Tab. The selected file will be opened in a new browser tab.

Open an Item in the Annotation Studio

In the Dataset Browser, identify the item and double-click. The item will be opened in the default annotation studio based on the type of the item, such as image, audio, video, etc.

Open an Item in a Specific Annotation Studio Version

  1. In the Dataset Browser, select the item.
  2. Click Dataset Actions.
  3. Select File Actions > Open With.
  4. Select the Annotation Studio. The item will be opened in the annotation studio based on the type of the item, such as image, audio, video, etc.

Open an Item in the Annotation Studio in a New Browser Tab

This action lets you view images, play audio files, and so on in a new browser tab.

  1. In the Dataset Browser, select the item.
  2. Click Dataset Actions.
  3. Select File Actions > Open File in New Tab. The selected file will be opened in a new browser tab.

Show Hidden Files

  1. In the Dataset Browser, click on the Settings icon.
  2. Enable the Show Hidden Files option.

Hidden items and folders display the hidden icon (crossed eye) in the corner, and their thumbnails are grayed out.

Delete Annotations from an Item

  1. In the Dataset Browser, select the item whose annotations you want to delete.
  2. Click Dataset Actions.
  3. Select File Actions > Delete Annotations. A confirmation message is displayed.

Delete Items

  1. In the Dataset Browser, select the item you want to delete.
  2. Click Dataset Actions.
  3. Select File Actions > Delete Items.
  4. Click Yes. A confirmation message is displayed.

Create a New Model Version Using a Dataset

  1. In the Dataset Browser, click Dataset Actions.
  2. Select Models, and then select Create New Model Version.

Deploy the Datasets to Export in COCO/YOLO/VOC Formats

The Dataset browser incorporates significant automation capabilities, enabling you to export dataset items in industry-standard formats through the following functions. Any function available within this application can be applied to selected items or an active query. For more information, see the Export in COCO/YOLO/VOC Format article.

  1. In the Dataset Browser, click Dataset Actions.
  2. Select Deployment Slot, and then select one of the following formats. A message is displayed stating that the execution of function <global-converter> was created successfully and prompting you to check the activity bell.
    1. COCO Converter.
    2. YOLO Converter.
    3. VOC Converter.

Use an Item to Generate Predictions with a Model

You can use the dataset items to generate predictions by using a trained and deployed model.

  1. In the Dataset Browser, select the item.
  2. Click Dataset Actions.
  3. Select Models > Predict with Model.
  4. Search and select a trained and deployed model from the list.
  5. Click Predict. A confirmation message is displayed.

Additional actions
  • Search models by model name, project name, application name, and status.
  • Use the filter to narrow the models by scope and model status.

Deploy a Trained Model for Generating Predictions

Only trained and deployed models can be used to generate predictions. To deploy a trained model, follow these steps:

  1. In the Dataset Browser, select the item.
  2. Click Dataset Actions.
  3. Select Models > Predict with Model.
  4. Identify the trained model and click Deploy. The Model Version Deployment page is displayed.
  5. In the Deployment and Service Fields tabs, make changes in the available fields as needed.
  6. Click Deploy. A confirmation message is displayed.

Assign an Item to the Model Test, Train, or Validation Dataset

You can assign the selected items to model datasets, such as test, train, and validation. When you assign, a tag (Test, Train, or Validation) will be added to the item details.

  1. In the Dataset Browser, select the item.
  2. Click Dataset Actions.
  3. Select Models.
  4. Select one of the following options as required:
    1. Assign to Test Set. The Test Dataset is used to evaluate the performance of a trained model on new, unseen data.
    2. Assign to Train Set. The Train Dataset is used to train the machine learning model, helping it learn patterns and make predictions.
    3. Assign to Validation Set. The Validation Dataset is used to fine-tune the model and optimize its hyperparameters, helping prevent overfitting.

Remove an Item from the Model Test, Train, or Validation Dataset

  1. In the Dataset Browser, select the item.
  2. Click Dataset Actions.
  3. Select Models.
  4. Select one of the following options as required. A confirmation message is displayed, and the respective tag is deleted from the item details.
    1. Remove from Test Set.
    2. Remove from Train Set.
    3. Remove from Validation Set.

Run an Item with a FaaS or Pipeline

Run a selected item with a function from a running service (FaaS) or with a running pipeline.

  1. In the Dataset Browser, select the item.
  2. Click Dataset Actions.
  3. Select:
    1. Run with FaaS: It allows you to select a function to execute with the selected items.
    2. Run with Pipeline: It allows you to select a pipeline to execute with the selected items.
  4. Select a function or pipeline from the list.
  5. Click Execute. A confirmation message will be displayed.

Additional actions
  • Search functions by function name, project name, and service name.
  • Search pipelines by pipeline name.
  • Filter functions by public functions, project functions and all functions in the user’s projects.

Automation Info and Warning Messages

The following information and warning messages are displayed when you run the item with a FaaS, Pipeline, or Model predictions.

  • When you select more than one item and a function, pipeline, or model with an item input: Triggering multiple items to a function with single-item input will execute each item separately, resulting in the creation of multiple executions.
  • When you select more than one item and a function or pipeline with an item[] input: Triggering multiple items to a function with an item[] list input will execute all items in a single execution.
  • When you select more than 1000 items and a function or pipeline with an item[] input: Functions with the item[] input are disabled, and a warning message states that a function with the item[] input cannot be executed with more than 1000 items in the list.
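
The rules above can be summarized as a small decision function. This is only a sketch of the described behavior: the name `plan_executions` and the string values used for the input type are hypothetical, and the 1000-item limit is taken from the warning message above.

```python
MAX_LIST_INPUT_ITEMS = 1000  # limit stated in the warning for item[] inputs


def plan_executions(selected_items: int, input_type: str) -> int:
    """Return how many executions a trigger would create.

    input_type is "item" for a single-item input or "item[]" for a list input.
    """
    if selected_items < 1:
        raise ValueError("select at least one item")
    if input_type == "item":
        # Each selected item is executed separately.
        return selected_items
    if input_type == "item[]":
        if selected_items > MAX_LIST_INPUT_ITEMS:
            raise ValueError(
                "item[] input cannot be executed with more than 1000 items"
            )
        # All selected items are sent in one execution.
        return 1
    raise ValueError(f"unknown input type: {input_type}")
```

Selecting 5 items yields 5 executions for an `item` input and 1 execution for an `item[]` input, while selecting 1001 items for an `item[]` input is rejected.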

Bulk Operations

The Dataset browser facilitates bulk operations within the specified context. To carry out bulk operations:

  1. Manually select one or more items using the Command or Windows key + mouse left-click.
  2. Perform the available actions for the items, such as Move to Folder, Export, Clone, Classification, etc.

Section 5: View Number of Items and Annotations

The Data Browser shows the number of items and annotations in the dataset. You can view the following information at the top right of the Data Browser page:

  • All Dataset Items: It displays the number of items available in the dataset.
  • Annotated Items: It displays the number of items that are annotated.
  • Annotations: It displays the total number of annotations across all annotated items.
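
The relationship between the three counters can be shown with a short sketch. The per-item annotation counts and the function name `dataset_counters` are hypothetical; the sketch assumes that an item counts as annotated when it has at least one annotation.

```python
# Hypothetical mapping of item name to the number of annotations on that item.
ANNOTATIONS_PER_ITEM = {"a.jpg": 3, "b.jpg": 0, "c.mp4": 2, "d.txt": 0}


def dataset_counters(annotations_per_item):
    """Compute the three counters shown at the top right of the page."""
    counts = list(annotations_per_item.values())
    return {
        "all_items": len(counts),
        "annotated_items": sum(1 for c in counts if c > 0),
        "annotations": sum(counts),
    }


counters = dataset_counters(ANNOTATIONS_PER_ITEM)
```

For the sample data, the dataset has 4 items, 2 annotated items, and 5 annotations in total.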