FaaS Concepts
  • 21 Sep 2023
  • Dark
  • PDF

FaaS Concepts

  • Dark
  • PDF

Article Summary


A Package is a static code with a schema that contains all modules, functions, and the code base from which they can be taken. It also stores its revision history, allowing services to be deployed from previous package revisions.

A Package is split into Modules (not Python modules). For each Module, the Package specifies where the entry point of the Module within the Codebase is, as well as howto load it and the configuration by which the Module should be loaded, and how the Class's functions should be run. More on modules.

Only project members with the role Developer or Project Owner can push/delete a package.


The package codebase is the code you import to the platform containing all of the modules and functions.
When you upload the code to the platform, either from your computer or from Github, it is saved on the platform as an item (in a .zip file).
If you wish to download the codebase, use the following script:

# Get the service
service = dl.services.get(service_id='my_service _id')
# get the item object of the code base
code_base = dl.items.get(item_id=service.package.codebase_id)
# Download the code to a local path


Packages have a versioning mechanism.
This mechanism allows you to deploy a Service from an old revision of the Package.
Every time you modify the package attributes or codebase by pushing a package or by performing package.update(), a new version of this package is created. The package revisions field contains a list of all prevision states of the Package.
If the codebase has no changes, you can use package.update(); if not, you need to push a new one.
When you deploy a Service you can specify a specific revision from which to deploy the Service, the default is the last update.

If the codebase has no changes, you can use package.update(); if not, you need to push a new one.

# the package latest version
# List of all package versions


These are not Python modules.

Modules are a reference to a Python file containing the Python class (by default, ServiceRunner) with functions inside it.

The idea behind separating a package into modules is to allow users to develop many different services that share some mutual code using one codebase. This means you can translate and define the code-base or a substance of it to a service, without needing a new code-base.

Using the module you can load a model on the service deployment and then run any function that references it (see the init method for reference) or proceed to the next step in the pipeline.
A Module contains 2 fields that determine which code within the Codebase to load, initialize and run.
The fields are entryPoint and className.

Entry Point

The entry point is the name of the Python file containing the class and functions of the package code. It should end with .py. The default entry point is main.py.
The entry point specifies the relative path from the CodeBase root to the main file that should be loaded when loading the Module.

Class Name

The className specifies the name of the Python class within a file that should be loaded in the Service. The class referenced by className must extend dl.BaseServiceRunner to work.
The Python class (by default is ServiceRunner. You can name it however you like. Just change the class_name attribute of the Package's Module) will be loaded when the service is deployed, see the init method for reference.

Module JSON

A module does not have its own JSON file. The basic scheme looks like this:

    "name": "default_module",  # module name
    "className": "ServiceRunner",  # optional
    "entryPoint": "main.py",  # the module entry point which includes its main class and methods
    "initInputs": [],  # expected init params at deployment time
    # list of module functions that can be executed from remote
    "functions": [
            "name": "run",  # function name - must be the same as the actual method name in the signature
            "description": "",  # optional -  function description
            # expected function params
            "input": [
                    # input name - identical to input param name in the signature
                    "name": "item",
                    "type": "Item",  # Item / Dataset / Annotation / JSON
            "output": [],  # not implemented - keep blank

Class Functions

This is the basic running unit of the FaaS. You can define the functions on the class and when the service is deployed, you can run each of them.

Class Functions Input types

Each function's input has a name and a type.
Input types can be grouped into two categories:

  • JSON: This allows the function to receive all possible inputs as well as have no prepossessing done by the runner.
  • Dataloop types("item", "dataset", etc...): When invoking the function, the ID of the resource should be passed as an input. See the types referenced in the repositories.

Package JSON

  "name": "default_package",
    "modules": [
        "name": "default_module",
        "entryPoint": "main.py",
        "initInputs": [],
        "functions": [
            "name": "run",
            "description": "This description for your service",
            "input": [
                "name": "item",
                "type": "Item"
            "output": []


A Service is a running deployment of Modules within a Package. Running a service requires having a bot user.

  • The Service entity specifies the Module it deploys, the arguments passed to the Module's exported class' init method, and the infrastructural configuration for it, such as the number of replicas (machine instances) and concurrent executions.

  • Once the service is ready, its functions can be invoked with the required input parameters and entities.

  • Multiple services can be deployed from the same module.

Init Method

The init method is invoked immediately after loading the Module and is used for initializing the replica of the Service.

  • The modules' Init method inputs schema is defined in the Modules' initInputs field.
  • The actual value passed to the init method at loading time is taken from the "initParams" field of the service.

For example:

  • For the Image bounding box classifier service, the init method will download the model and load it to its memory.
  • All subsequent function invocations in a Service will use the loaded model to calculate the bounding box for the image.

Service Status

A service has an operational status assigned to it. Status can be one of the following values:

  • Active: An active service has running replicas or an autoscaler configured to create replicas automatically when executions are created for the service.
  • Inactive: When a service is paused, all replicas go down. Any trigger or UI slot set for the service will become inactive as well.
  • Initializing: Initializing one or more of the service replicas. This status is usually displayed after activating a service, or when the auto scaler is in action.
  • Error: When the service fails to deploy. The service replicas have active errors (can be found under Error info).


A deployed service gets a version that represents its state. Every time a service is updated its version number increments and its previous value is saved in the service revision. Services can be deployed using earlier versions.

To view the service revisions:

# List service revisions history from past updates using  'service.update()' command
# Current service version for currently used package codebase (services can use old package versions)

Service JSON

  "name": "service-name",           # service name
  "packageName": "default_package", # package name
  "packageRevision": "latest",      # What package version to run?
  "runtime": {
    "gpu": false,                   # Does the service require a GPU?
    "replicas": 1,                  # How many replicas should the service create
    "concurrency": 6,               # How many executions can run simultaneously?
    "runnerImage": ""               # You can provide your docker image for the service to run on.
    "triggers": [],                 # List of triggers to trigger service
    "initParams": {},               # Does your init method expect input if it does provide it here?
    "moduleName": "default_module"  # Which module to deploy?


An execution is a single function invocation with input data. As a Dataloop entity, Executions has unique IDs, statuses, and logs, which allows for monitoring it from invocation to completion.

Execution Input

The execution input is similar to the input the function requires, provided to the method the execution invokes.

For example:

dl.FunctionIO(name='model_filename', type=dl.PackageInputType.JSON),
dl.FunctionIO(name='dataset', type=dl.PackageInputType.DATASET),
dl.FunctionIO(name='item', type=dl.PackageInputType.ITEM)
#When executing:
execution = service.execute(execution_input=dl.FunctionIO(name='item', value='item-id',type=dl.PackageInputType.ITEM), project_id='project-id', function_name='function-name')

The input of Dataloop type ( item, dataset, annotation, etc.) must be passed with an ID of the corresponding entity.
The input of type Json can have any JSON serializable value and will provide it as is to the method.

Execution Status

Every execution has a status

  • execution.statusLog holds an array of all status updates of the execution
  • execution.latest_status is the latest status update of the execution
  • The execution status can be updated by using the Progress object
  • Progress updates are API calls and can only be updated in increments of five.

For example: When performing progress.update(progress=4) and then progress.update(progress=7) the update to 7 won't be in effect. However, performing progress.update(progress=9) will have an effect.

class ServiceRunner(dl.BaseServiceRunner):
    def detect(self, item: dl.Item, progress: dl.Progress): 
        progress.update(status='inProgress', progress=0, 
                        message='execution started')
        ### DO WORK ###
        progress.update(status='inProgress', progress=30)
        ### DO MORE WORK ###
        progress.update(status='inProgress', progress=80)
        ### DO MORE WORK ###
        progress.update(status='inProgress', progress=100, 
                        message='execution completed')

Retry Mechanism

Executions failure to complete may occur for various reasons, for example:

By setting the value of the max_attempts attribute, services can be configured to automatically retry failed executions.
Consequently, the number of attempts on a single execution can be read from the attempts attribute.

Waiting for Execution

Processes can be paused until an execution is completed with either status (success/failed/aborted/etc), using the execution.wait() method.

Execution cURL

To export the execution in a cURL format:

curl = service.executions.create(return_curl_only=True)

Rerun Execution

To rerun failed executions:

execution = dl.executions.get('execution_id')

Execution JSON

  "id": "string",
  "url": "string",
  "createdAt": "2020-11-18T17:41:25.740Z",
  "updatedAt": "2020-11-18T17:41:25.740Z",
  "creator": "string",
  "attempts": 0,
  "maxAttempts": 0,
  "toTerminate": true,
  "input": {},
  "output": {},
  "feedbackQueue": {
    "exchange": "string",
    "routing": "string"
  "status": [
      "timestamp": "2020-11-18T17:41:25.740Z",
      "status": "created",
      "message": "string",
      "percentComplete": 0,
      "error": {},
      "output": {}
  "latestStatus": {
    "timestamp": "2020-11-18T17:41:25.740Z",
    "status": "created",
    "message": "string",
    "percentComplete": 0,
    "error": {},
    "output": {}
  "duration": 0,
  "projectId": "string",
  "functionName": "string",
  "serviceId": "string",
  "triggerId": "string",
  "serviceName": "string",
  "packageId": "string",
  "packageName": "string",
  "packageRevision": 0,
  "serviceVersion": 0,
  "syncReplyTo": {
    "exchange": "string",
    "route": "string"


A bot is a dummy project user with 'Developer' role permissions used for running services.
A deployed service uses the associated bot credentials to log in and perform operations on the platform - if a service is not created by a bot, then the platform will create one automatically. If a service function creates an annotation as part of its operation, the bot will be recorded as the annotation creator.

  • A project can have unlimited bots.
  • A service is limited to exactly one bot, however, a service's bot can be changed.
  • A bot’s role is developer and it functions as a regular user with all its permissions.
  • Bots can be created by users with a Developer role or higher.
  • Deleting a bot, removing it from the project-users list, or demoting it to project-role lesser than Developer is likely to fail to run associated services and consequencing failed executions.

Use Service in Other Projects

For a service to run in other projects within the same organization with full access to the projects' entities (datasets, items, tasks, etc.), the bot user associated with the service needs to be added as a user (developer permissions) to the required projects. By doing so, you are providing the bot with developer access to your projects and ensuring that it won't encounter any blocks. If you need to copy the service's bot, please refer to the following guide.

Copy Service's Bot

To copy the bot user email of an existing service:

  1. Go to Application Hub > Installed tab.
  2. Click on the 3 dots button of the required service.
  3. Click Copy service bot.

Create/Set a Bot

Create Bot


Set Bot for a Service

The bot being assigned to a service must be a user to the Project and a member of the Organization owning the project.

service = package.deploy(


The Dataloop FaaS enables you to register Functions to events in the system.
Use the Triggers entity for the registration.
A Trigger contains a project on which it monitors events, a resource type, the action that happened to the resource, a DQL (The Data Query Engine) filter that checks whether or not to invoke the operation based on the resource JSON, and an operation.

Currently, the only supported operation is the creation of an execution.

Trigger Events - Resource

Events of which entities to listen to?
Resources are entities in the Dataloop system: Item, Annotation, Dataset, Task, etc. Each resource is associated with an action to define a specific event for the trigger.
Once an event trigger occurs, the resource object is delivered as an input for the execution created.

Note that you can only have one resource per trigger.
If you wish to have more than one, use a manual execution from a new function.

dl EnumAPI ValueDataloop EntityAvailable Actions (*)
dl.TriggerResource.ITEMItemPlatform itemCreated,Updated,Deleted,Clone
dl.TriggerResource.DATASETDatasetPlatform datasetCreated,Updated,Deleted
dl.TriggerResource.ANNOTATIONAnnotationPlatform item's annotationCreated,Updated,Deleted
dl.TriggerResource.ITEM_STATUSItemStatusPlatform Item's statusUpdated**, taskAssigned
dl.TriggerResource.TASKTaskPlatform taskCreated,Updated,Deleted,statusChanged
dl.TriggerResource.ASSIGNMENTAssignmentPlatform assignmentCreated,Updated,Deleted,statusChanged
dl.TriggerResource.MODELModelPlatform ModelCreated,Updated,Deleted

(*) Your function can be triggered by the following actions when taken over a resource (a specific resource must be set!):

  1. Created: run your function over any newly created resource.
  2. Update: run your function over any updated resource. See (**) for more details regarding "itemStatus.Updated".
  3. Deleted: run your function over any deleted resource.
  4. Clone: run your function over every cloned item.
  5. taskAssigned: once an item is assigned to a task, run the function over this item. To use this event, set the trigger: "ItemStatus.TaskAssigned".
  6. statusChanged: once a task/assignment status is changed, run the function over this task/assignment (task 'completed', assignment 'done').

(**) Use the trigger "itemStatus.Updated" to run your function over items once their status is changed within a task context ('approved', 'discard', etc.).

Trigger Events - Action

What event action types are available?

Actions are the events that will trigger the function.
For further explanation regarding the actions, read the explanations above (*).

headerdl EnumAPI ValueAvailable Modes
celldl.TriggerAction.UPDATEDUpdatedOnce, Always (excluding Task & Assignment)
celldl.TriggerAction.STATUS_CHANGEDstatusChangedOnce, Always (excluding Task & Assignment)
celldl.TriggerAction.TASK_ASSIGNEDtaskAssignedOnce, Always

Trigger Execution Mode

Some events, such as item updates, can happen more than once on the same entity. Trigger execution mode defines those repeating events that will trigger the service every time they happen, or only on the first time they happen.

  • Once: The function will only run once when triggered. For instance, for an "item" resource and an "Updated" action, the function will only work on the first updated item.
  • Always: The function will run each time when triggered. For instance, for an "item" resource and an "Updated" action, the function will run for every updated item.
dl Enumdl.TriggerExecutionMode.ONCE

Be careful not to loop a trigger event. Known examples:

Creating a function that adds items to a folder, while an event trigger "item.created" is set for this function.
Creating a function that updates items, while an event trigger "item.updated" with execution_mode='always' is set for this function.
The function will be triggered again and again when implementing the above functions.

Trigger JSON

  "name": "service-name",  
  "packageName": "default_package",  
  "packageRevision": "latest",  
  "runtime": {
    "gpu": false,           
    "replicas": 1,            
    "concurrency": 6,      
    "runnerImage": "" 
    "triggers": [
        "name": "trigger-name",
        "filter": {
          '$and': [{'dir': '/train'}, {'hidden': False}, {'type': 'file'}]},
        "resource": "Item",
        "actions": [
        "active": true,
        "function": "run",
        "executionMode": "Once"
    "initParams": {},                     
    "moduleName": "default_module"    

Trigger at Specific Time Patterns

Dataloop FaaS enables you to run functions at specified time patterns with constant input using the Cron syntax.

In the Cron trigger specification, you specify when you want the trigger to start, when you want it to end, the Cron spec specifying when it should run, and the input that must be sent to the action.

# start_at: iso format date string to start activating the cron trigger 
# end_at: iso format date string to end the cron activation 
# inputs: dictionary "name":"val" of inputs to the function 
import datetime
up_cron_trigger = service.triggers.create(function_name='my_function',
                                          start_at=datetime.datetime(2020, 8, 23).isoformat(),
                                          end_at=datetime.datetime(2024, 8, 23).isoformat(),
                                          cron="0 5 * * *")

Triggers Examples

Trigger for Items Uploaded to Directory /Input

filters = dl.Filters(field='dir', values='/input')
trigger = service.triggers.create(function_name='run',

Trigger for Items Updated in Project

trigger = service.triggers.create(function_name='run',

Trigger for Annotated Video Items that are Being Updated

filters = dl.Filters(field='annotated', values=True)
filters.add(field='metadata.system.mimetype', values='video/*')
trigger = service.triggers.create(function_name='run',

Trigger for Datasets that are Being Updated

trigger = service.triggers.create(function_name='run',

Trigger for Annotations Created from Box Type

filters = dl.Filters(field='type', values='box', resource=dl.FiltersResource.ANNOTATION)
trigger = service.triggers.create(function_name='run',

Trigger for Annotations Updates of Label "DOG"

filters = dl.Filters(field='label', values='DOG',resource=dl.FiltersResource.ANNOTATION)
trigger = service.triggers.create(function_name='run',

Trigger for Item Status Change in a Task

trigger = service.triggers.create(function_name='run',