- 23 May 2023
Running and Monitoring Pipelines
- Updated On 23 May 2023
Starting the Pipeline
To install and activate your pipeline, click Start Pipeline in the pipeline editor screen, or click the play button on the project's Pipeline page. If you cannot click Start Pipeline, or if the installation process fails, the cause may be a configuration issue in one of the pipeline nodes or an error in the pipeline composition:
To check node configuration issues, hover over the warning/error icons on the nodes to see what needs to be resolved. Once you resolve an issue, its warning/error icon should disappear.
To check installation errors, click the Error tab in the pipeline’s information panel on the right and review the error messages.
Invoking Data To Pipelines
Auto-Invocation: Cron & Event Triggers
Triggers can be defined in a pipeline to invoke a pipeline flow automatically, based either on events in the Dataloop system or on a cron expression.
To learn more about automatic triggers, click here.
Manual Invocation: Selected Items in Dataset Browser
You can manually trigger data items from any dataset into any pipeline. You can trigger an entire dataset, a specific folder, all items from a DQL query/filter, or manually selected items:
- Open the dataset browser.
- Filter for specific items based on any criteria.
- Click the “Create Trigger” icon, or right-click a folder and select “Create Trigger.”
- Select the destination pipeline from the list and click “Approve.”
Manual Invocation: SDK Using Filter
To add a trigger to any node in the pipeline, use the add_trigger() function, and then update the pipeline with the pipeline.update() function.
You can use the SDK to execute your pipeline flow from any starting point you wish. When creating the execution, you can specify its input (for example, a specific item ID or dataset ID), or use a DQL filter to select several items.
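As a minimal sketch of SDK invocation, the helper below starts one pipeline execution per item ID. It assumes that the Dataloop Python SDK (dtlpy) pipeline entity offers an `execute(execution_input=...)` method and that the pipeline input is named `item`; neither detail is stated on this page, so verify both against the SDK reference. The helper name `invoke_items` and the project and pipeline names in the docstring are hypothetical.

```python
from typing import Any, Iterable, List


def invoke_items(pipeline: Any, item_ids: Iterable[str]) -> List[Any]:
    """Start one pipeline execution per item ID and return the results.

    `pipeline` is expected to be a pipeline entity fetched through the SDK,
    for example (assumed dtlpy usage, requires a logged-in session):

        import dtlpy as dl
        project = dl.projects.get(project_name="my-project")
        pipeline = project.pipelines.get(pipeline_name="my-pipeline")
    """
    results = []
    for item_id in item_ids:
        # Assumption: the pipeline input is called "item" and takes an item ID.
        results.append(pipeline.execute(execution_input={"item": item_id}))
    return results
```

Because the helper only relies on an object with an `execute` method, it can be exercised with a stand-in object before it is pointed at a real pipeline.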
Pipeline Information Panel
In the Pipeline page, you will find the Pipeline Information Panel. When no specific node is selected, the panel displays pipeline-level information.
Scope – The default setting in the Information Panel is to display information from the “Last 3 hours.” However, you may select a different time frame to display information from.
Context – The side bar has breadcrumbs at the top, displaying the current context (entire pipeline, specific node, or specific cycle), which also allows for easy navigation.
Pipeline Cycles
A Pipeline cycle refers to all node executions performed on a single pipeline run (usually over a specific item); the executions are listed in the order in which they occurred. Since some items may be routed differently in the pipeline based on filters and user actions, each cycle may have a different number of executions.
Select a cycle from the list to see its details, including the first (node) execution time and the last (node) execution time.
Select an execution from the list to see its details, including the function used in the execution, the input, and the output.
Clicking the play button will show the item's progress in the pipeline, highlighting the nodes involved in processing the item.
In the Pipeline Cycle list, click on the number in the Executions column to drill-in and see each execution. This allows you to browse the executions, see the highlighted node on the Pipeline canvas (which enables you to monitor the item’s progress in the pipeline), and see the execution details (input, output, item with item link, execution time, and duration).
Use the Up and Down arrows to browse between the executions and trace the item’s progress over the canvas.
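The relationship between a cycle and its node executions can be pictured with a small data model. This is only an illustrative sketch: the class and field names (`Cycle`, `NodeExecution`, `started_at`, and so on) are hypothetical and do not reflect the platform's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class NodeExecution:
    node_name: str
    started_at: datetime
    duration: timedelta


@dataclass
class Cycle:
    """All node executions performed for one pipeline run (usually one item)."""
    item_id: str
    executions: List[NodeExecution] = field(default_factory=list)

    def ordered(self) -> List[NodeExecution]:
        # Executions are listed in the order in which they occurred.
        return sorted(self.executions, key=lambda e: e.started_at)

    def first_execution_time(self) -> Optional[datetime]:
        ordered = self.ordered()
        return ordered[0].started_at if ordered else None

    def last_execution_time(self) -> Optional[datetime]:
        ordered = self.ordered()
        return ordered[-1].started_at if ordered else None


start = datetime(2023, 5, 23, 12, 0, 0)
cycle = Cycle(item_id="item-1", executions=[
    NodeExecution("annotate", start + timedelta(minutes=5), timedelta(seconds=30)),
    NodeExecution("preprocess", start, timedelta(seconds=10)),
])
print([e.node_name for e in cycle.ordered()])  # ['preprocess', 'annotate']
```

Two cycles in the same pipeline can hold different numbers of executions, which matches the routing behavior described above.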
Node Details
Clicking a node in the pipeline brings up the Configuration tab relevant to that node. Click here to read more about nodes and their configuration.
Executions tab – When selecting a node, the Executions tab displays information about the execution of each item that went through this node. Select an execution from the list to view its details.
Logs tab – The Logs tab displays all log entries generated during executions of the respective node within the selected time frame (the default time frame is the last 3 hours).
Instances tab – This tab is only active for FaaS or Code nodes and displays information about the machines that are connected to the pipeline.
Errors When Scaling Up a Service
Service restart-loop errors may affect Code/FaaS nodes and prevent the node service (FaaS) from scaling up. Possible causes include issues with the defined requirements or the specified Docker image. When such an error occurs, an indication appears on the relevant pipeline node and on its service in the installed FaaS table (Application Hub page):
Pipeline service error indication:
FaaS page service error indication:
Overcoming Execution Errors
After resolving the root cause of any problem that caused item executions to fail (e.g., code problems in packages or insufficient compute resources), you can rerun the execution of the failed items:
- From the side panel – select the node, switch to the Executions tab, select the Failed filter option, hover over an item, and click the Rerun button.
- From the Applications (FaaS) > Executions page – in the search field, filter by pipeline and by execution.status: failed, and select Rerun All.
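The selection logic behind both options (filter on a failed status, then rerun) can be sketched as follows. The record shape (dictionaries with `id` and `status` keys) and the `rerun` callback are hypothetical placeholders for whatever mechanism actually reruns an execution; this is not the SDK's API.

```python
from typing import Any, Callable, Dict, Iterable, List


def rerun_failed(executions: Iterable[Dict[str, Any]],
                 rerun: Callable[[Dict[str, Any]], Any]) -> List[str]:
    """Call `rerun` for every execution whose status is "failed".

    Returns the IDs of the executions that were rerun.
    """
    rerun_ids = []
    for execution in executions:
        if execution.get("status") == "failed":
            rerun(execution)
            rerun_ids.append(execution["id"])
    return rerun_ids


executions = [
    {"id": "e1", "status": "success"},
    {"id": "e2", "status": "failed"},
    {"id": "e3", "status": "failed"},
]
print(rerun_failed(executions, rerun=lambda e: None))  # ['e2', 'e3']
```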
Pausing a Pipeline with Running Cycles
When you pause a pipeline that has pending or running cycles, the cycles' status is updated to "Paused" and the cycles stop running. When you resume the pipeline, a dialog opens offering two options:
- Resume all available cycles (pending/in-progress)
- Abort all available cycles (pending/in-progress); the cycles will get the "Terminated" status
Aborted Cycles: At the moment, aborted cycles are automatically filtered out of the cycles list in the side panel (they can be displayed by filtering cycles with the "Terminated" status) and are also excluded from the pipeline "statistics bar" counters.
If the pipeline was modified while paused and you choose to resume it, the resumed cycles will continue to run according to the new pipeline composition.
The "pause" action may not immediately halt all pipeline activity. Node executions that have already started are not interrupted; their cycles are paused only once the current execution completes. Additionally, cycles that are waiting in a node queue at pause time may still be executed on that node before being paused.
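The pause and resume behavior can be summarized as a small state model. This is a simplified sketch under stated assumptions: the enum and function names are hypothetical, the `SUCCESS` status stands in for any finished cycle, resumed cycles are simply marked in-progress, and the delay before a running execution actually pauses is not modeled.

```python
from enum import Enum
from typing import Dict


class CycleStatus(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in-progress"
    PAUSED = "paused"
    TERMINATED = "terminated"
    SUCCESS = "success"  # hypothetical label for a finished cycle


def pause(cycles: Dict[str, CycleStatus]) -> Dict[str, CycleStatus]:
    """Pending and in-progress cycles become paused; others are untouched."""
    active = {CycleStatus.PENDING, CycleStatus.IN_PROGRESS}
    return {cid: CycleStatus.PAUSED if status in active else status
            for cid, status in cycles.items()}


def resume(cycles: Dict[str, CycleStatus], abort: bool) -> Dict[str, CycleStatus]:
    """Resume paused cycles, or abort them so they end up terminated."""
    target = CycleStatus.TERMINATED if abort else CycleStatus.IN_PROGRESS
    return {cid: target if status is CycleStatus.PAUSED else status
            for cid, status in cycles.items()}


cycles = {"c1": CycleStatus.PENDING, "c2": CycleStatus.IN_PROGRESS,
          "c3": CycleStatus.SUCCESS}
paused = pause(cycles)
aborted = resume(paused, abort=True)
print(aborted["c1"].value, aborted["c3"].value)  # terminated success
```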
Pausing a Pipeline with Active Event Triggers
> Keep Event Triggers Active During Pipeline Pause
You have the option to keep the pipeline's event triggers active while the pipeline is paused, so you won't lose events while editing the pipeline. Please click here to read more.