Basic SDK Sort & Filters
To learn more about Dataloop's filters go to Sort & Filters.
To access the filter's repository click here
The highlighted text requires your input
The Dataloop Query Language - DQL
The Dataloop Query Language (DQL) lets you navigate through large amounts of data.
With it you can filter, sort, and update your metadata.
Filters
Filters let you select items and get a generator of the matching items. The Filters entity helps you build such filters.
Filters - Field & Value
Filter your items or annotations by the fields of the JSON that represents them within our system.
To view the JSON of an item or annotation, call to_json() on it.
Field
Field is the attribute you filter on, that is, the kind of data to compare.
For example, "dir" is the field to use when you want to filter items by their folder.
Value
Value is the input you want the field to match.
For example, "/new_folder" is the folder whose items you want to select.
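To make the field/value pairing concrete, here is a minimal local sketch in plain Python. It does not call the Dataloop SDK: the `matches` helper and the sample records are hypothetical and only mimic what an equality filter on a top-level field such as "dir" would select.

```python
# Hypothetical local illustration of a field/value filter; not part of dtlpy.
def matches(item_json, field, value):
    """Return True when the item's top-level `field` equals `value`."""
    return item_json.get(field) == value


# Made-up item records carrying only the fields needed for the example.
items = [
    {"name": "a.jpg", "dir": "/new_folder"},
    {"name": "b.jpg", "dir": "/other_folder"},
    {"name": "c.jpg", "dir": "/new_folder"},
]

# field="dir" picks what to compare; value="/new_folder" is what it must equal.
selected = [item["name"] for item in items if matches(item, "dir", "/new_folder")]
print(selected)
```

In the SDK the same pairing is passed to a Filters object as a field and its values; the sketch only shows the selection it expresses.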
Prep
import dtlpy as dl

if dl.token_expired():
    dl.login()
project = dl.projects.get(project_name='project_name')
dataset = project.datasets.get(dataset_name='dataset_name')
Items
Filtering Fields - JSON
Use dot notation to reach fields nested inside curly brackets.
For example, use field='metadata.system.originalname' to filter by the item's original name.
{
  "id": "5f4b60848ced1d50c3df114a",
  "datasetId": "5f4b603d9825b9f191bbd3b3",
  "createdAt": "2020-08-30T08:17:08.000Z",
  "dir": "/new_folder",
  "filename": "/new_folder/optional.jpg",
  "type": "file",
  "hidden": false,
  "metadata": {
    "system": {
      "originalname": "file",
      "size": 3290035,
      "encoding": "7bit",
      "mimetype": "image/jpeg",
      "annotationStatus": [ "completed" ],
      "refs": [
        { "type": "task", "id": "5f4b61f8f81ab6238c331bd2" },
        { "type": "assignment", "id": "5f4b61f8f81ab60508331bd3" }
      ],
      "executionLogs": {
        "image-metadata-extractor": {
          "default_module": {
            "run": {
              "5f4b60841b892d82eaa2d95b": { "progress": 100, "status": "success" }
            }
          }
        }
      },
      "exif": {},
      "height": 2734,
      "width": 4096,
      "statusLog": [
        {
          "status": "completed",
          "timestamp": "2020-08-30T14:54:17.014Z",
          "creator": "user@dataloop.ai",
          "action": "created"
        }
      ],
      "isBinary": true
    }
  },
  "name": "optional.jpg",
  "url": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a",
  "dataset": "https://gate.dataloop.ai/api/v1/datasets/5f4b603d9825b9f191bbd3b3",
  "annotationsCount": 18,
  "annotated": "discarded",
  "stream": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/stream",
  "thumbnail": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/thumbnail",
  "annotations": "https://gate.dataloop.ai/api/v1/items/5f4b60848ced1d50c3df114a/annotations"
}
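The dotted field path can be read as "walk into each nested object in turn". The following sketch resolves such a path against a trimmed copy of the item JSON above, using plain Python only. The `get_by_path` helper is hypothetical and is not an SDK function; it is here only to show which value a given field string points at.

```python
from functools import reduce


def get_by_path(data, dotted_field):
    """Walk nested dictionaries following a dot-separated field path."""
    return reduce(lambda node, key: node[key], dotted_field.split("."), data)


# Trimmed copy of the item JSON shown above.
item_json = {
    "dir": "/new_folder",
    "metadata": {
        "system": {
            "originalname": "file",
            "mimetype": "image/jpeg",
            "annotationStatus": ["completed"],
        }
    },
}

print(get_by_path(item_json, "metadata.system.originalname"))  # file
print(get_by_path(item_json, "dir"))  # /new_folder
```

Checking a path locally against the output of to_json() is a quick way to confirm the field string before using it in a filter.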
Create filters
filters = dl.Filters()
# set resource - optional - default is item
filters.resource = dl.FiltersResource.ITEM
# add filter - only files
filters.add(field='type', values='file')
# add filter - only annotated items
filters.add(field='annotated', values=True)
# add filter - annotation status - only completed items
filters.add(field='metadata.system.annotationStatus', values='completed')
# add filter - filename is exactly '/dog.jpg'
filters.add(field='filename', values='/dog.jpg')
# filter by a user metadata field
filters.add(field='metadata.user.some_field', values=True)
# -- time filters --
# timestamps must be in ISO format and carry the UTC offset of the local time;
# convert using the datetime package as follows:
import datetime
import time
timestamp = datetime.datetime(year=2019, month=10, day=27, hour=15, minute=39, second=6,
                              tzinfo=datetime.timezone(datetime.timedelta(seconds=-time.timezone))).isoformat()
filters.add(field='createdAt', values=timestamp, operator=dl.FiltersOperations.GREATER_THAN)
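The time filter above builds an ISO string that carries the local UTC offset. As an alternative sketch, the same instant can be normalized to UTC before formatting. The `to_utc_iso` helper and the fixed two-hour offset are assumptions made for illustration, and the source does not state which of the two string forms the platform prefers.

```python
import datetime


def to_utc_iso(local_dt):
    """Convert a timezone-aware datetime to an ISO 8601 string in UTC."""
    if local_dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return local_dt.astimezone(datetime.timezone.utc).isoformat()


# Example wall-clock time in a zone two hours ahead of UTC (assumed for illustration).
plus_two = datetime.timezone(datetime.timedelta(hours=2))
local_time = datetime.datetime(2019, 10, 27, 15, 39, 6, tzinfo=plus_two)

timestamp = to_utc_iso(local_time)
print(timestamp)  # 2019-10-27T13:39:06+00:00
```

Rejecting naive datetimes is a deliberate choice here: without an offset there is no way to know which instant the value refers to.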
Get Filtered Items
# return results sorted by ascending filename
filters.sort_by(field='filename')
pages = dataset.items.list(filters=filters)
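The list call returns its results in pages rather than as one flat list. Assuming the returned object can be iterated page by page, with each page yielding entities, a small generator can flatten it. The `iter_items` helper and the stand-in data below are hypothetical; the sketch uses plain lists so it runs without a platform connection.

```python
def iter_items(pages):
    """Yield every entity from a paged result, page by page."""
    for page in pages:
        for item in page:
            yield item


# Stand-in for the object returned by dataset.items.list(filters=filters):
# a list of pages, each page a list of made-up filenames.
fake_pages = [["/dog.jpg", "/dog_2.jpg"], ["/dog_3.jpg"]]

names = list(iter_items(fake_pages))
print(names)
```

Because the helper is a generator, pages are consumed lazily, which matters when a filter matches a very large number of items.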
Update Filtered Items
The update_values argument must be a dictionary.
The dictionary will only update user metadata.
# add the field annotatedDogsSingJune2019 to all filtered items and set it to True
# this field will be added to user metadata
# create update order
update_values = {'annotatedDogsSingJune2019': True}
# update
pages = dataset.items.update(filters=filters, update_values=update_values)
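To picture the effect of such an update on a single item, the sketch below merges a dictionary into the user metadata of a local JSON copy. It is only an illustration: `apply_user_update` is a hypothetical helper, and the shallow merge it performs is an assumption, since the source only states that the values land in user metadata.

```python
import copy


def apply_user_update(item_json, update_values):
    """Return a copy of item_json with update_values merged into metadata.user."""
    if not isinstance(update_values, dict):
        raise TypeError("update_values must be a dictionary")
    updated = copy.deepcopy(item_json)
    user_metadata = updated.setdefault("metadata", {}).setdefault("user", {})
    user_metadata.update(update_values)
    return updated


# Made-up item record with system metadata only.
before = {"name": "optional.jpg", "metadata": {"system": {"originalname": "file"}}}
after = apply_user_update(before, {"annotatedDogsSingJune2019": True})

print(after["metadata"]["user"])  # {'annotatedDogsSingJune2019': True}
```

The system metadata is left untouched, which mirrors the rule that the dictionary only updates user metadata.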
Delete Filtered Items
dataset.items.delete(filters=filters)
Filter Items by Their Annotations
filters = dl.Filters()
# set resource
filters.resource = 'items'
# add filter - only files
filters.add(field='type', values='file')
# add annotation filter - only items with 'box' annotations
filters.add_join(field='type', values='box')
# add annotation filter - only items with annotations marked as issue
filters.add_join(field='metadata.system.status', values='issue')
# add annotation filter - only items with annotations marked for review
filters.add_join(field='metadata.system.status', values='review')
# add annotation filter - only items with approved annotations
filters.add_join(field='metadata.system.status', values='approved')
# get results
pages = dataset.items.list(filters=filters)
Annotation
Filtering Fields - JSON
Use dot notation to reach fields nested inside curly brackets.
For example, use field='metadata.system.status' to filter by the annotation's status.
{
  "id": "5f576f660bb2fb455d79ffdf",
  "datasetId": "5e368bee106a76a61cf05282",
  "type": "segment",
  "label": "Planet",
  "attributes": [],
  "coordinates": [
    [
      { "x": 856.25, "y": 1031.2499999999995 },
      { "x": 1081.25, "y": 1631.2499999999995 },
      { "x": 485.41666666666663, "y": 1735.4166666666665 },
      { "x": 497.91666666666663, "y": 1172.9166666666665 }
    ]
  ],
  "metadata": {
    "system": {
      "status": null,
      "startTime": 0,
      "endTime": 1,
      "frame": 0,
      "endFrame": 1,
      "snapshots_": [
        {
          "fixed": true,
          "type": "transition",
          "frame": 0,
          "objectVisible": true,
          "data": [
            [
              { "x": 856.25, "y": 1031.2499999999995 },
              { "x": 1081.25, "y": 1631.2499999999995 },
              { "x": 485.41666666666663, "y": 1735.4166666666665 },
              { "x": 497.91666666666663, "y": 1172.9166666666665 }
            ]
          ],
          "label": "Planet",
          "attributes": []
        }
      ],
      "automated": false,
      "isOpen": false,
      "system": false
    },
    "user": {}
  },
  "creator": "user@dataloop.ai",
  "createdAt": "2020-09-08T11:47:50.576Z",
  "updatedBy": "user@dataloop.ai",
  "updatedAt": "2020-09-08T11:47:50.576Z",
  "itemId": "5f572f4423a69b8c83408f12",
  "url": "https://gate.dataloop.ai/api/v1/annotations/5f576f660bb2fb455d79ffdf",
  "item": "https://gate.dataloop.ai/api/v1/items/5f572f4423a69b8c83408f12",
  "dataset": "https://gate.dataloop.ai/api/v1/datasets/5e368bee106a76a61cf05282",
  "hash": "11fdc816804faf0f7266b40d1cb67aff38e5c10d"
}
Create Filters
filters = dl.Filters()
# set resource
filters.resource = dl.FiltersResource.ANNOTATION
# add filter - only box annotations
filters.add(field='type', values='box')
# add filter - only note annotations
filters.add(field='type', values='note')
# add filter - only annotations labeled 'Dog' or 'cat'
filters.add(field='label', values=['Dog', 'cat'], operator=dl.FiltersOperations.IN)
# add filter - only annotations created by one of the listed users
filters.add(field='creator',
            values=['Joe@dataloop.ai', 'David@dataloop.ai', 'oa-test-1@dataloop.ai'],
            operator=dl.FiltersOperations.IN)
Get Filtered Annotations
# return results sorted by descending id
filters.sort_by(field='id', value=dl.FiltersOrderByDirection.DESCENDING)
pages = dataset.items.list(filters=filters)
Update Filtered Annotations
The update_values argument must be a dictionary.
The dictionary will only update user metadata.
# add the field annotation_quality to all filtered annotations and set it to 'high'
# this field will be added to user metadata
# create update order
update_values = {'annotation_quality': 'high'}
# update
pages = dataset.items.update(filters=filters, update_values=update_values)
Delete Filtered Annotations
dataset.items.delete(filters=filters)
Further Explore SDK Filters
Advanced Filters
See more advanced filtering options, like filtering with multiple options, using operators, deleting a filter, and ignoring SDK's defaults.
Iterator of Items
See the "Iterator of Items" page to learn more about iterating over your filtered items and annotations.
Filter by Customized Metadata
Explore this end-to-end tutorial from “adding your own metadata to an item” all the way to “filtering items” that have it.