Data Management
  • 14 Nov 2023



DATA MANAGEMENT FAQ

Why can't I save my item/folder/dataset/project's name?
When naming items, folders, datasets, projects and more, special characters like " * / : < > ? \ | cannot be used.
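A name can be checked before it is submitted. The following is a minimal sketch, assuming the characters listed above are the complete forbidden set (the answer says "like", so the platform may reject others as well). The function names are illustrative and are not part of the Dataloop SDK.

```python
# Characters the FAQ lists as not allowed in names.
FORBIDDEN_CHARS = set('"*/:<>?\\|')


def find_forbidden_chars(name: str) -> list:
    """Return the forbidden characters found in `name`, in order of first appearance."""
    found = []
    for ch in name:
        if ch in FORBIDDEN_CHARS and ch not in found:
            found.append(ch)
    return found


def is_valid_name(name: str) -> bool:
    """True when `name` contains none of the listed characters."""
    return not find_forbidden_chars(name)


print(find_forbidden_chars("cats/dogs: v2?"))  # ['/', ':', '?']
print(is_valid_name("cats_and_dogs-v2"))       # True
```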


I uploaded Users and cannot see them all.
If you cannot see all your Users, go to your Project Dashboard and click "Show Users from All Domains" in the Users view. This displays all the Users assigned to the given project.

Learn more on how to add new Users.


How can I rename my project?
To rename your project, visit the Project Dashboard and click "Project Actions."

To learn more, visit the Project section.


I don't see the project I was looking for, what should I do?

If you get the message "You are no longer a member of this project", you were removed by the project owner. Please contact the project owner and ask them to invite you to the project. For more information, see Adding Team members to Project.


Can I make annotations without storing them on the Dataloop servers?
No.

Only the binary layer can be stored on external storage. For more information, go to our storage page.


How can I run across all items on a dataset?
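Get the pages of the dataset, then iterate over each page and over the items in it.

import dtlpy as dl

# `dataset` is an existing Dataloop dataset entity.
filters = dl.Filters()  # default item filter, no extra conditions
pages = dataset.items.list(filters=filters)
for page in pages:
    for item in page:
        item.print()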



Can I download masks without the JSON?
No. In the SDK, the mask is built from the JSON file, so the JSON is always downloaded as well.

The JSON files are downloaded to a separate folder.



Why do I need to allow Dataloop a "full write access" to my external storage (S3/Azure/GCS)?
Connecting a storage driver to Dataloop requires full write access to the bucket/container in order to enable full annotation and data management functionality, such as saving new files generated by the Dataloop platform inside your storage (including video thumbnails, .webm files, snapshots, etc.).


How do I filter out items based on approved annotations?
To filter out items based on approved annotations, follow these steps:

    1. Select Annotation Metadata in filters

    2. Select Add Field and Text

    3. In Key, enter metadata.system.status

    4. In Value, enter approved
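The same key and value can be applied to annotation JSON that has already been downloaded. The following is a minimal sketch using only the Python standard library; it does not call the Dataloop SDK, and the sample record layout (an `id` plus a nested `metadata` dict) is an assumption made for illustration.

```python
from typing import Any, Iterable, List


def get_by_path(record: dict, dotted_key: str, default: Any = None) -> Any:
    """Follow a dotted key such as 'metadata.system.status' through nested dicts."""
    current: Any = record
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


def approved_only(annotations: Iterable[dict]) -> List[dict]:
    """Keep annotations whose metadata.system.status equals 'approved'."""
    return [a for a in annotations
            if get_by_path(a, "metadata.system.status") == "approved"]


# Hypothetical sample records; real exported annotations carry more fields.
sample = [
    {"id": "a1", "metadata": {"system": {"status": "approved"}}},
    {"id": "a2", "metadata": {"system": {"status": "issue"}}},
    {"id": "a3", "metadata": {}},
]
print([a["id"] for a in approved_only(sample)])  # ['a1']
```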

To learn more, visit the Sort & Filters page.


What is the difference between annotated, completed, approved, and not annotated items?

An item is considered “annotated” once it has any annotations (including a classification, a note, or an annotation from any other tool), either when accessed directly from the dataset browser or from a task. An item can be “annotated” without having a “completed” status. This can be as a result of the item having been annotated in the dataset browser (where the annotation can be saved but no status can be given to the item) or the item has been annotated in a task but the annotator did not click the “complete” button (e.g., the annotator wanted to do more work on the item later).

An item can only have a “completed” status when it goes through a task and the annotator clicks the “complete” button to assign it “completed” status.

An item can only have an “approved” status when it goes through a QA task and the QA tester clicks the “approve” button.

An item is considered “not annotated” if it has not been annotated. An item can have a “completed” status and still be “not annotated” if the annotator clicked the “complete” button in the task but no annotations were created on the item.
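The definitions above describe two independent properties: whether an item has annotations, and which status (if any) it was given in a task. The following is a small sketch of that distinction; the function and its arguments are hypothetical and do not correspond to SDK fields.

```python
from typing import Optional


def describe_item(annotation_count: int, status: Optional[str] = None) -> str:
    """Summarize an item using the FAQ's definitions.

    annotation_count: number of annotations on the item (from any tool).
    status: None, "completed" (set in a task) or "approved" (set in a QA task).
    """
    annotated = "annotated" if annotation_count > 0 else "not annotated"
    if status is None:
        return f"{annotated}, no status"
    return f"{annotated}, {status}"


print(describe_item(3))               # annotated, no status
print(describe_item(0, "completed"))  # not annotated, completed
print(describe_item(5, "approved"))   # annotated, approved
```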

To learn more, visit the Sort & Filters page.



Setup S3 policy

Make sure that your policy includes the following permissions:

s3:ListBucket - mandatory
s3:PutObject - mandatory (write permissions)
s3:DeleteObject - needed for the "allow delete" option
s3:GetObject - mandatory

Full guide:
Go to the IAM (Identity and Access Management) section and do the following:

Go to the Policies page and press “Create policy”.

Fill in a name, and add a description if you want.

On this page, paste the following script:

Make sure to replace <bucket-name> with your bucket name.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket-name>/*",
                "arn:aws:s3:::<bucket-name>"
            ]
        }
    ]
}
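To avoid hand-editing the bucket name, the policy document above can be generated. The following is a minimal sketch using only the standard library; the function name and the `allow_delete` switch are illustrative, and "my-dataloop-bucket" is a hypothetical bucket name.

```python
import json


def build_s3_policy(bucket_name: str, allow_delete: bool = True) -> dict:
    """Build the IAM policy document shown above for one bucket."""
    actions = ["s3:PutObject", "s3:GetObject", "s3:ListBucket"]
    if allow_delete:
        # Only needed for the "allow delete" option.
        actions.append("s3:DeleteObject")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": actions,
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}/*",
                    f"arn:aws:s3:::{bucket_name}",
                ],
            }
        ],
    }


# "my-dataloop-bucket" is a hypothetical bucket name.
print(json.dumps(build_s3_policy("my-dataloop-bucket"), indent=4))
```

The printed JSON can be pasted into the policy editor as-is.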

Create a user with the bucket’s policy to get the access key.

Go to the Users page in the IAM section and create a new user.

Enter a name and select Programmatic access.

Select “Attach existing policies directly”.

The next step is optional.

Review the data.

Copy the URL and Access Key to the Dataloop platform.





Setup GCS policy

Define permissions in GCS and get your Access Key for Dataloop
Please be sure that your permissions are set to the following:

  • storage.objects.list - mandatory
  • storage.objects.delete
  • storage.objects.create - mandatory
  • storage.objects.get - mandatory
  • storage.buckets.get - mandatory
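Before creating the integration, a role's permission list can be compared against the mandatory set above. The following is a minimal sketch; the names are illustrative, and the example role is hypothetical.

```python
from typing import Iterable, List

# Permissions marked "mandatory" in the list above.
MANDATORY_GCS_PERMISSIONS = {
    "storage.objects.list",
    "storage.objects.create",
    "storage.objects.get",
    "storage.buckets.get",
}
# Listed above without the "mandatory" label.
OPTIONAL_GCS_PERMISSIONS = {"storage.objects.delete"}


def missing_mandatory(granted: Iterable[str]) -> List[str]:
    """Return the mandatory permissions absent from `granted`, sorted by name."""
    return sorted(MANDATORY_GCS_PERMISSIONS - set(granted))


# Hypothetical role that forgot storage.objects.get.
role_permissions = {
    "storage.objects.create",
    "storage.buckets.get",
    "storage.objects.delete",
    "storage.objects.list",
}
print(missing_mandatory(role_permissions))  # ['storage.objects.get']
```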

Full guide:
Go to the IAM section and open the Roles page, where you can create roles.

Fill in a name and unique ID, and add a description if you wish.

Press “Add Permission”.

Under Add permissions, select the following:

  • storage.objects.create
  • storage.objects.get
  • storage.buckets.get
  • storage.objects.delete

Filter by Bucket.

Add storage.objects.list as well, press “ADD”, and then press “Create role”.

Go to the IAM -> Service accounts page and create a service account.

Select the custom role you created.

If you created the role recently and don’t see it, note that it can take 5-10 minutes to appear on the list.

The next step is optional.

Create a new key and download it as a JSON file, which you will use for the integration.



Setup Azure policy

  1. Go to "App registrations".

  2. Add a new registration and choose a name (for example: dataloop-app).

  3. Find your "clientId" and "tenantId" in the Application Overview.

  4. Go to "Certificates & secrets" and click "New client secret".

Fill in a name in the description and define when it will expire (we recommend setting it to "Never").

NOTICE: Copy and store this secret's value right away because you can't see it later.

  5. Go to "Storage accounts".

  6. Open the desired account (create an account if you don't already have one).

  7. Go to "Containers" (create a container if you don't already have one).

  8. Go to the specific container (the one you wish to grant Dataloop permissions on).

  9. Go to "Access Control (IAM)".

  10. Click "Add role assignments".

  11. Select:

  • Role: "Storage Blob Data Contributor".
  • Select: "dataloop-app".

  12. Click on the "dataloop-app" icon and hit save.


  13. It can take 5 minutes for the permissions to be updated and available to use by Dataloop.
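Steps 3 and 4 above yield three values: the clientId, the tenantId, and the client secret. The following is a small sketch of a sanity check to run before pasting them into the integration form. It relies on the fact that Azure application (client) and directory (tenant) IDs are GUIDs; the class and method names are hypothetical, and the values shown are made up.

```python
import uuid
from dataclasses import dataclass
from typing import List


@dataclass
class AzureAppCredentials:
    client_id: str
    tenant_id: str
    client_secret: str

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the values look plausible."""
        issues = []
        for label, value in (("clientId", self.client_id),
                             ("tenantId", self.tenant_id)):
            try:
                uuid.UUID(value)  # raises ValueError if not GUID-shaped
            except ValueError:
                issues.append(f"{label} is not a valid GUID")
        if not self.client_secret.strip():
            issues.append("client secret is empty")
        return issues


# Made-up values for illustration only.
creds = AzureAppCredentials(
    client_id="11111111-2222-3333-4444-555555555555",
    tenant_id="not-a-guid",
    client_secret="example-secret",
)
print(creds.problems())  # ['tenantId is not a valid GUID']
```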




Extract ECR Parameters

Note: Details regarding the Private Container Registry integration can be found here.

To get access to your ECR private registry from Dataloop, the following parameters should be sent to the “Private Container Registry” integration:

  • Account
  • Region
  • Access key ID
  • Secret access key
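The four parameters can be kept together in one structure, as in the following minimal sketch. The class is hypothetical and is not part of the Dataloop SDK. The registry host format shown is the one AWS uses for private ECR registries in the standard partition; all values below are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcrParameters:
    account: str            # AWS account ID (12 digits)
    region: str             # for example "us-east-1"
    access_key_id: str
    secret_access_key: str

    def registry_host(self) -> str:
        """Host name of the account's private ECR registry in this region."""
        return f"{self.account}.dkr.ecr.{self.region}.amazonaws.com"


# Placeholder values; substitute the ones collected in the steps of this section.
params = EcrParameters(
    account="123456789012",
    region="us-east-1",
    access_key_id="<access-key-id>",
    secret_access_key="<secret-access-key>",
)
print(params.registry_host())  # 123456789012.dkr.ecr.us-east-1.amazonaws.com
```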


On console.aws.amazon.com:
Navigate to the user dialog and open the “Security credentials” page.

On the Security credentials page:
Under the “Account details” section, you can find your Account ID next to “AWS account ID”.

Click the “Create access key” button.

A dialog box opens, displaying the Access key ID and Secret access key.

To find your Region parameter, go to the EC2 service.

Use your default region, or find the one you are using for your registry.


