
Developing with Templates#

This guide walks you through the essential steps to deploy your first pipeline using project templates. Follow each step to ensure your environment is ready and your bundle is validated for Databricks deployment.

General Requirements#

What You'll Learn:

  • How to set up your environment for Databricks pipeline deployment
  • How to install and verify required CLI tools
  • How to build and validate your asset bundle

1. Install Required CLI Tools#

  • Install the Azure CLI:
brew install azure-cli
# Or see: https://docs.microsoft.com/en-us/cli/azure/install-azure-cli
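  • Install the Databricks CLI (latest version recommended):
brew tap databricks/tap
brew install databricks
databricks --version  # Should be 0.267 or newer

Warning: The pip package databricks-cli installs the legacy CLI (versions 0.18 and below), which does not include the bundle commands used in this guide.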
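
If you want to check the version requirement from a script rather than by eye, the comparison could be sketched as below. This is a minimal sketch, not part of the template: the helper names are hypothetical, and it assumes only that the `databricks --version` output contains a dotted version number somewhere (for example `Databricks CLI v0.267.0`).

```python
import re
import subprocess

MIN_VERSION = (0, 267, 0)  # minimum version stated in this guide


def parse_cli_version(text: str) -> tuple[int, int, int]:
    """Extract the first MAJOR.MINOR[.PATCH] number found in `text`."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def cli_is_recent_enough(version_output: str, minimum=MIN_VERSION) -> bool:
    """Return True if the reported version is at least `minimum`."""
    return parse_cli_version(version_output) >= minimum


def installed_cli_output() -> str:
    """Run `databricks --version` and return whatever it prints."""
    result = subprocess.run(
        ["databricks", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout or result.stderr
```

Calling `cli_is_recent_enough(installed_cli_output())` would then tell you whether the CLI on your PATH meets the minimum.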

2. Set Up Your Python Environment#

Create and activate a virtual environment for your project:

python -m venv .venv
source .venv/bin/activate
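
To confirm that the interpreter you are running actually belongs to a virtual environment, a small check like the following may help. It is a sketch that relies on the standard library only; the function name is an illustrative choice.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix
```

Running `python -c "import sys; print(sys.prefix != sys.base_prefix)"` after activation performs the same check from the shell.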

3. Build Your Project Wheel#

Lock dependencies and build your package:

poetry lock
poetry build

This creates a dist/ folder with a .whl file. If you encounter SSL issues, see How to resolve SSL issue.

Your wheel file will be referenced in Databricks jobs/tasks inside databricks.yml:

libraries:
 - whl: ./dist/*.whl
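
Because `./dist/*.whl` is a glob, it matches every wheel in dist/, including stale builds from earlier versions. The sketch below shows one way to check for that before deploying; the helper names are hypothetical and the single-wheel rule is an assumption about what you want, not a requirement of the template.

```python
from pathlib import Path


def find_wheels(dist_dir="dist"):
    """Return all wheel files in `dist_dir`, sorted by name."""
    return sorted(Path(dist_dir).glob("*.whl"))


def require_single_wheel(dist_dir="dist") -> Path:
    """Return the only wheel in `dist_dir`, or raise if there are zero or several."""
    wheels = find_wheels(dist_dir)
    if len(wheels) != 1:
        names = [wheel.name for wheel in wheels]
        raise RuntimeError(f"expected exactly one wheel in {dist_dir}, found {names}")
    return wheels[0]
```

If the check fails because of leftover builds, clearing dist/ and running `poetry build` again leaves a single fresh wheel.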

4. Validate Your Asset Bundle#

Log in to Azure. If your account has no subscription, add the --allow-no-subscriptions flag:

az login --allow-no-subscriptions

Validate your Databricks bundle:

databricks bundle validate

Sample output:

(.venv) ➜  code git:(main) ✗ databricks bundle validate
Name: my-qms-dab-project
Target: dev
Workspace:
 Host: https://adb-4321798407796733.13.azuredatabricks.net/
 User: adminluzj@novonordisk.com
 Path: /Workspace/Users/adminluzj@novonordisk.com/.bundle/my-qms-dab-project/dev

Validation OK!

If validation passes, your bundle is ready for deployment!

5. Deploy Your Bundle to Databricks#

Deploy your validated bundle to the Databricks workspace:

databricks bundle deploy -t dev

Sample output:

(.venv) ➜  code git:(main) ✗ databricks bundle deploy -t dev
Building default...
Uploading .databricks/bundle/dev/patched_wheels/default_pipelines/pipelines-0.1.0+1757535806030139694-py3-none-any.whl...
Uploading bundle files to /Workspace/Users/adminluzj@novonordisk.com/.bundle/my-qms-dab-project/dev/files...
Deploying resources...
Updating deployment state...
Deployment complete!

This deploys the placeholder jobs to the workspace you have configured. To verify the deployment, open the Databricks UI and go to Jobs & Pipelines. You should see jobs similar to the screenshot below:

Databricks Jobs UI
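
The validate and deploy steps can also be chained so that deployment only runs when validation succeeds. The following is a minimal sketch of that idea using the two CLI commands from this guide; the function names and the injectable `run` parameter are illustrative assumptions, not part of the template.

```python
import subprocess


def bundle_command(action: str, target: str) -> list[str]:
    """Build the CLI invocation for a bundle action against a target."""
    if action not in {"validate", "deploy"}:
        raise ValueError(f"unsupported bundle action: {action!r}")
    return ["databricks", "bundle", action, "-t", target]


def validate_then_deploy(target: str = "dev", run=subprocess.run) -> None:
    """Validate the bundle, then deploy it.

    With check=True, subprocess.run raises CalledProcessError on a non-zero
    exit code, so a failed validation stops the loop before deploy runs.
    """
    for action in ("validate", "deploy"):
        run(bundle_command(action, target), check=True)
```

Calling `validate_then_deploy("dev")` from the project root runs both commands in order. The `run` parameter exists only so the function can be exercised without the CLI installed.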


Next Steps: Define and Deploy a Custom Job#

The default jobs are placeholders. To create your own job:

Copy and rename a job in databricks.yml:

  • Copy the structure of my_job and rename it (e.g., new_job).
  • Rename the entrypoint of the task under that job (e.g., your_task).
  • If you have an existing cluster, add existing_cluster_id under the task to use it.
new_job:
  name: new_job
  tasks:
    - task_key: your_task
      existing_cluster_id: <your_cluster_id> # replace with your cluster ID
      description: "Landing to bronze job"
      python_wheel_task:
        package_name: pipelines
        entry_point: your_task
      libraries:
        - whl: ./dist/*.whl

Create your task implementation:

  • Under pipelines/, create a new file your_task.py.
  • Implement your job logic in this file.
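
As a starting point, pipelines/your_task.py might look like the sketch below. It is hypothetical: the `--source` and `--target` arguments and the `run` helper are placeholders invented for illustration, and the body only prints a message where your real job logic would go. The one assumption it relies on is that the entry point is a no-argument function that reads any task parameters from the command line.

```python
"""pipelines/your_task.py: hypothetical sketch of a wheel-task entry point."""
import argparse
import sys


def parse_args(argv):
    """Parse task parameters; both flags are illustrative placeholders."""
    parser = argparse.ArgumentParser(description="Example wheel task")
    parser.add_argument("--source", default="landing")
    parser.add_argument("--target", default="bronze")
    return parser.parse_args(argv)


def run(source: str, target: str) -> str:
    """Placeholder for the real job logic; returns the message it prints."""
    message = f"Processing {source} -> {target}"
    print(message)
    return message


def your_task() -> None:
    """Entry point named in databricks.yml and pyproject.toml."""
    args = parse_args(sys.argv[1:])
    run(args.source, args.target)
```

Keeping the logic in `run` and the argument handling in `your_task` makes the logic easy to unit test without going through the command line.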

Update your pyproject.toml to include the new script:

  • Add an entry for your new task:
[tool.poetry.scripts]
landing_to_bronze = "pipelines.landing_to_bronze:landing_to_bronze"
your_task = "pipelines.your_task:your_task"
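
Each script entry has the form `module.path:function`. A typo on either side of the colon typically surfaces only when the job runs, so it can be worth resolving the string locally first. The helper below is a sketch with a hypothetical name; it handles only the plain `module:function` form used here.

```python
import importlib


def resolve_entry_point(spec: str):
    """Import the module in a 'module:function' spec and return the function."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_name)  # raises ImportError on a bad module path
    target = getattr(module, attr)  # raises AttributeError on a bad function name
    if not callable(target):
        raise TypeError(f"{spec!r} does not refer to a callable")
    return target
```

With the package installed in your virtual environment, `resolve_entry_point("pipelines.your_task:your_task")` should return the function without raising.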

DataPipeline Class & ETL Steps (Demo)#

TODO: Fill in info about the DataPipeline class and the corresponding ETL steps included in the demo.


Troubleshooting#

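  • Databricks CLI version error: If databricks --version reports 0.18 or lower, you have the legacy pip-installed CLI, which does not support bundles. Install the current CLI with brew tap databricks/tap followed by brew install databricks, or run brew upgrade databricks if it is already installed through Homebrew.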
  • SSL issues with Poetry: See Poetry SSL troubleshooting.
  • Azure login issues: If az login fails because your account has no subscription, run az login --allow-no-subscriptions.

Key Takeaways#

  • Use a current Databricks CLI (0.267 or newer); the legacy pip-installed CLI has no bundle support.
  • Build your wheel before referencing it in Databricks jobs.
  • Validate your bundle before deployment to catch issues early.