Azure Datalake Gen2
To connect your Azure Datalake Gen2 to Dataloop, follow these setup instructions.
Create App Registration
- In the Azure portal, type “App registrations” in the search bar and open it
- Create a new registration and choose a meaningful name (e.g. dataloop-app)
- Locate the app's clientID (Application ID) and tenantID (Directory ID) in the application overview. You’ll need them for the integration phase
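Both identifiers are GUIDs, so a quick format check can catch a copy-paste slip before the integration phase. The sketch below is only an illustration: the helper name `is_guid` and the sample value are hypothetical, and it checks the format only, not whether the ID exists in your tenant.

```python
import uuid


def is_guid(value: str) -> bool:
    """Return True if value is a GUID in the hyphenated 8-4-4-4-12 form."""
    try:
        # uuid.UUID also accepts braces and un-hyphenated hex, so compare the
        # canonical string back against the input to require the plain form.
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


# Hypothetical sample value, not a real application ID.
print(is_guid("123e4567-e89b-12d3-a456-426614174000"))
```

A value such as the registration's display name (for example `dataloop-app`) fails this check, which is a common mix-up when copying from the overview page.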
Create a new client secret
- In the newly created registration, on the left-side panel, navigate to “Certificates & secrets”
- Click on “New client secret” to create a new secret for the application
- Fill in a meaningful name for the secret in the description field
- Define when the secret will expire (we recommend setting it to the longest period possible)
- Click “Add” to add the secret to the application
- NOTICE: Copy and store the secret's value right away. You won’t have access to it later, and you’ll need it for the integration phase.
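Because the secret expires, the integration will stop working unless the secret is rotated in time. As a rough sketch, a rotation reminder date can be computed from the creation date and the chosen lifetime. The function name `renewal_reminder`, the 30-day notice period, and the 730-day lifetime in the example are illustrative assumptions, not values taken from Azure or Dataloop.

```python
from datetime import date, timedelta


def renewal_reminder(created: date, lifetime_days: int, notice_days: int = 30) -> date:
    """Return the date to start rotating a secret that expires
    lifetime_days after it was created."""
    if lifetime_days <= notice_days:
        raise ValueError("lifetime must be longer than the notice period")
    expires = created + timedelta(days=lifetime_days)
    return expires - timedelta(days=notice_days)


# Example: a secret created on 2024-01-01 with an assumed 730-day lifetime.
print(renewal_reminder(date(2024, 1, 1), 730))  # prints 2025-12-01
```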
Create a Storage account
- In the Azure portal, type “Storage accounts” in the search bar and open it
- Choose the desired storage account, or create one if you don't already have one. The account must have hierarchical namespace enabled
- When creating a storage account, open the “Advanced” tab, go to “Data Lake Storage Gen2”, and select “Enable hierarchical namespace”
For a step-by-step guide on creating a storage account in Azure, read Azure docs.
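The storage account name chosen here is reused later as the Account Name of the integration. As a minimal sketch, the snippet below checks a name against Azure's naming rule (3 to 24 characters, lowercase letters and digits only) and builds the Data Lake (dfs) endpoint. The helper name `dfs_endpoint` and the sample account name are hypothetical, and the URL pattern assumes the public Azure cloud.

```python
import re

# Azure storage account names: 3-24 characters, lowercase letters and digits.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def dfs_endpoint(account_name: str) -> str:
    """Return the Data Lake Storage Gen2 endpoint for a storage account
    (public Azure cloud assumed)."""
    if not _ACCOUNT_NAME.fullmatch(account_name):
        raise ValueError(f"invalid storage account name: {account_name!r}")
    return f"https://{account_name}.dfs.core.windows.net"


# Hypothetical account name, used only for illustration.
print(dfs_endpoint("mystorageacct"))
```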
Create a container
- In the Azure portal, type “Storage accounts” in the search bar and open it
- Choose the desired storage account
- On the left-side panel, navigate to “Containers”
- Choose the desired container, or create one if you don't already have one
For a step-by-step guide on creating a Container in Azure, read Azure docs.
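If you create a new container, its name has to follow Azure's container naming rules: 3 to 63 characters, lowercase letters, digits, and hyphens only, starting and ending with a letter or digit, with no consecutive hyphens. A small sketch of that check follows. The function name and sample names are hypothetical.

```python
import re

# One or more alphanumeric runs separated by single hyphens.
_CONTAINER_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_container_name(name: str) -> bool:
    """Check a name against Azure's container naming rules."""
    return 3 <= len(name) <= 63 and _CONTAINER_NAME.fullmatch(name) is not None


# Hypothetical container names, used only for illustration.
for candidate in ("dataloop-data", "Dataloop", "my--container"):
    print(candidate, is_valid_container_name(candidate))
```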
Add an IAM role assignment to the container
- Select the chosen container (the one you will integrate with Dataloop)
- On the left-side panel, navigate to “Access Control (IAM)”
- Click on the “Add” button
- Choose “Add role assignment”
- Under the Role section, search for “Storage Blob Data Contributor”, select it, and click “Next”
- Under the Members section, make sure “Assign access to” is set to “User, group, or service principal”
- Under the Members section, click on “Select members”
- In the search bar, enter the name of the app registration you created earlier, choose it, and click “Select”
- Click “Review + assign”
- It can take about 5 minutes for the permissions to be updated and available for use by Dataloop
For a step-by-step guide including screenshots on setting up a policy in Azure, read here.
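Since the new role assignment is not effective immediately, any scripted follow-up step may need to wait and retry rather than fail on the first attempt. The sketch below polls a caller-supplied check until it succeeds or a timeout passes. The function `wait_until_ready` and its parameters are hypothetical; the `check` callable stands in for whatever access test you use (for example, an attempt to list the container with the app's credentials), and the 300-second default simply mirrors the 5 minutes mentioned above.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() repeatedly until it returns True or timeout_s elapses.

    Returns True on success, False if the timeout was reached first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Illustration with a stand-in check that succeeds on the third attempt.
results = iter([False, False, True])
print(wait_until_ready(lambda: next(results), interval_s=0))
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.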
Create an Azure integration on the Dataloop platform
- On the Dataloop platform, navigate to the left-side panel and choose “Integrations”
- Click on the “CREATE NEW INTEGRATION” button
- Enter a meaningful name for the integration
- Under Type, choose “Azure Blob”
- Under Account Name, enter the name of the storage account where the container is located
- Under Client Id, enter the app registration's clientID from earlier
- Under Tenant Id, enter the app registration's tenantID from earlier
- Under Client secret, enter the client secret's value from earlier
- Click “Create”
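Before filling in the form, it can help to gather the values in one place and confirm none is empty. The sketch below is a plain checklist object, not a Dataloop API: the class `AzureIntegrationForm`, its field names, and the sample values are all hypothetical. The secret is excluded from the object's printed representation so it does not leak into logs.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass(frozen=True)
class AzureIntegrationForm:
    """Values to enter in the integration form (hypothetical helper)."""
    name: str
    account_name: str
    client_id: str
    tenant_id: str
    client_secret: str = field(repr=False)  # keep the secret out of repr/logs

    def missing(self) -> List[str]:
        """Return the names of fields that are empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Hypothetical sample values; the tenant ID is left empty on purpose.
form = AzureIntegrationForm("azure-int", "mystorageacct", "my-client-id", "", "my-secret")
print(form.missing())
```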
Create an Azure storage driver on the Dataloop platform
- On the Dataloop platform, navigate to the left-side panel, choose “Data Management”, and then click on “Cloud Storage”
- Click on the “CREATE DRIVER” button
- Enter a meaningful name for the storage driver
- Under Integration, choose the Azure integration you created
- Under Type, choose “AzureDataLakeGen2”
- Enter the name of the container you assigned the role on
- (Optional) Allow delete items
- Click “TEST” to check that the connection is successful
- Click “Create”