Orchestration & Monitoring#

This guide covers how orchestration, scheduling, and monitoring can be configured in Databricks using Databricks Asset Bundles.

Reader's Guide: This guide covers Databricks orchestration fundamentals (jobs, tasks, and triggers), followed by practical examples that use Databricks Asset Bundles to create and deploy workflows with Python. Databricks Asset Bundles let you specify Databricks resources in YAML, JSON, or Python files.

The guide consists of two main sections: the first covers the core concepts, and the second covers orchestration with Databricks Asset Bundles.

Core Concepts#

Lakeflow Jobs#

Lakeflow Jobs (jobs) are used to create and automate workflows in Databricks. A job can be as simple as a single scheduled task, or it can be a workflow made up of multiple tasks.

You can specify different properties for the job such as the ones below:

  • Trigger - this defines when to run the job.
  • Parameters - run-time parameters that are automatically pushed to tasks within the job.
  • Notifications - emails or webhooks sent when a job fails or takes too long; these can be used for monitoring.
  • Git - the version-controlled source code that the job or its tasks should run.
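
As a rough illustration, the properties listed above might appear together in one job definition. The sketch below is a plain Python dictionary and assumes the field names of the Databricks Jobs API (`trigger`, `parameters`, `email_notifications`, `health`, `git_source`). The job name, parameter, email address, repository URL, and duration threshold are hypothetical placeholders, and `configured_properties` is a helper introduced only for this example.

```python
# Sketch of job-level properties as a plain dictionary.
# Field names are assumed to follow the Databricks Jobs API; all values are placeholders.
job_settings = {
    "name": "example_job",  # hypothetical job name
    # Trigger: when to run the job.
    "trigger": {"periodic": {"interval": 1, "unit": "DAYS"}},
    # Parameters: run-time values made available to the tasks in the job.
    "parameters": [{"name": "run_date", "default": "2024-01-01"}],
    # Notifications: who to email on failure or when the duration warning fires.
    "email_notifications": {
        "on_failure": ["team@example.com"],
        "on_duration_warning_threshold_exceeded": ["team@example.com"],
    },
    # Health rule that defines "takes too long" (here: more than one hour).
    "health": {
        "rules": [
            {"metric": "RUN_DURATION_SECONDS", "op": "GREATER_THAN", "value": 3600}
        ]
    },
    # Git: where the source code for the job's tasks lives.
    "git_source": {
        "git_url": "https://github.com/example-org/example-repo",
        "git_provider": "gitHub",
        "git_branch": "main",
    },
}


def configured_properties(settings):
    """Return which of the optional job properties are present in the settings."""
    optional = ("trigger", "parameters", "email_notifications", "git_source")
    return [key for key in optional if key in settings]


print(configured_properties(job_settings))
```

A dictionary like this could be passed to a job constructor in a bundle, but the exact set of supported fields should be checked against the official documentation.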

Task#

There are many types of tasks. Here are a few examples:

  • Notebook - run a notebook as a task.
  • Python Wheel - run a Python wheel as a task.
  • dbt - run one or more dbt commands as a task.
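
The three task types listed above could be sketched as dictionaries like the ones below. The keys `notebook_task`, `python_wheel_task`, and `dbt_task` are assumed to follow the Databricks Jobs API; the task keys, package name, and dbt commands are hypothetical, and `task_type` is a helper written only for this illustration.

```python
# One example dictionary per task type.
# Field names are assumed from the Databricks Jobs API; values are placeholders.
tasks = [
    {
        "task_key": "run_notebook",
        "notebook_task": {"notebook_path": "src/notebook.ipynb"},
    },
    {
        "task_key": "run_wheel",
        "python_wheel_task": {"package_name": "my_package", "entry_point": "main"},
    },
    {
        "task_key": "run_dbt",
        "dbt_task": {"commands": ["dbt deps", "dbt run"]},
    },
]


def task_type(task):
    """Return the key that identifies the task type, e.g. 'notebook_task'."""
    matches = [key for key in task if key.endswith("_task")]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one task type, found {matches}")
    return matches[0]


for task in tasks:
    print(task["task_key"], "->", task_type(task))
```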

Trigger#

A trigger starts the execution of a job. A trigger can be scheduled, so the job runs at a fixed interval, or event-based, so the job runs when something happens, for example when a new file arrives in cloud storage.

  • Event-based

file_arrival:
  url: "/Volumes/catalog/stg/sources/file_location"
  min_time_between_triggers_seconds: 60 # 60 seconds minimum between runs
  wait_after_last_change_seconds: 60 # Wait 60 seconds after last file change

  • Scheduled

periodic:
  interval: 1
  unit: WEEKS
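
When a job is defined in Python rather than YAML, the same two triggers can be written as dictionaries under the job's trigger key. The following is a minimal sketch that reuses the values from the YAML fragments above; `describe_trigger` is a hypothetical helper added only to show how the two shapes differ.

```python
# The two triggers from the YAML fragments above, written as Python dictionaries.
file_arrival_trigger = {
    "file_arrival": {
        "url": "/Volumes/catalog/stg/sources/file_location",
        "min_time_between_triggers_seconds": 60,
        "wait_after_last_change_seconds": 60,
    }
}

periodic_trigger = {"periodic": {"interval": 1, "unit": "WEEKS"}}


def describe_trigger(trigger):
    """Return a short human-readable description of a trigger dictionary."""
    if "file_arrival" in trigger:
        settings = trigger["file_arrival"]
        return f"runs when files arrive at {settings['url']}"
    if "periodic" in trigger:
        settings = trigger["periodic"]
        return f"runs every {settings['interval']} {settings['unit'].lower()}"
    raise ValueError("unsupported trigger type")


print(describe_trigger(file_arrival_trigger))
print(describe_trigger(periodic_trigger))
```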

Orchestration with Databricks Asset Bundles#

The second part of the guide describes how to create a basic workflow with two tasks using Databricks Asset Bundles with Python.

1. Authenticate against your Databricks workspace.

Run databricks configure or databricks auth login --host https://adb-XXXX.azuredatabricks.net

2. Now create a new project by running the command below:

databricks bundle init experimental-jobs-as-code

When prompted to include the different stubs, answer yes to all of them.

3. Deploy the bundle to the dev target by running databricks bundle deploy --target dev
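
If the deployment should be scripted, for example in a CI pipeline, the CLI call can be wrapped in Python. The sketch below assumes the databricks CLI is installed and already authenticated. It adds a `databricks bundle validate` step before the deploy, which is not part of the steps in this guide, and it defaults to a dry run that only prints the commands. The function names are hypothetical.

```python
import subprocess


def bundle_commands(target):
    """Build the CLI commands that validate and then deploy a bundle to a target."""
    return [
        # Validation is an optional extra step, added here as an assumption.
        ["databricks", "bundle", "validate", "--target", target],
        ["databricks", "bundle", "deploy", "--target", target],
    ]


def run_bundle_commands(target, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in bundle_commands(target):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if the CLI exits with an error.
            subprocess.run(command, check=True)


# Dry run: prints the commands without calling the CLI.
run_bundle_commands("dev")
```

Calling `run_bundle_commands("dev", dry_run=False)` from the project directory would execute the commands.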

In the Databricks UI, under 'Jobs & Pipelines', you can see that the job has been deployed. Click the job to view the workflow.

If you open the file resources\jobs_as_code_project_job.py you can see the code that defines the job.

4. Taking a closer look at the job, we can see that it consists of two tasks: one that runs a notebook and one that runs a Python wheel. It is scheduled to run every day, as defined by the trigger key.

The Python job definition is shown below:

Example: Python Job Definition Code
"""
The main job for jobs_as_code_project.
"""

from databricks.bundles.jobs import Job


jobs_as_code_project_job = Job.from_dict(
    {
        "name": "jobs_as_code_project_job",
        "trigger": {
            # Run this job every day, exactly one day from the last run; see https://docs.databricks.com/api/workspace/jobs/create#trigger
            "periodic": {
                "interval": 1,
                "unit": "DAYS",
            },
        },
        # "email_notifications": {
        #     "on_failure": [
        #         "xxxx@novonordisk.com",
        #     ],
        # },
        "tasks": [
            {
                "task_key": "notebook_task",
                "job_cluster_key": "job_cluster",
                "notebook_task": {
                    "notebook_path": "src/notebook.ipynb",
                },
            },
            {
                "task_key": "main_task",
                "depends_on": [
                    {
                        "task_key": "notebook_task",
                    },
                ],
                "job_cluster_key": "job_cluster",
                "python_wheel_task": {
                    "package_name": "jobs_as_code_project",
                    "entry_point": "main",
                },
                "libraries": [
                    # By default we just include the .whl file generated for the jobs_as_code_project package.
                    # See https://docs.databricks.com/dev-tools/bundles/library-dependencies.html
                    # for more information on how to add other libraries.
                    {
                        "whl": "dist/*.whl",
                    },
                ],
            },
        ],
        "job_clusters": [
            {
                "job_cluster_key": "job_cluster",
                "new_cluster": {
                    "spark_version": "15.4.x-scala2.12",
                    "node_type_id": "Standard_D3_v2",
                    "data_security_mode": "SINGLE_USER",
                    "autoscale": {
                        "min_workers": 1,
                        "max_workers": 4,
                    },
                },
            },
        ],
    }
)

Here is a breakdown of the key orchestration features used in the example above:

Scheduling:

  • The trigger section uses periodic scheduling with interval: 1 and unit: "DAYS"
  • According to the comment in the code, the job runs every day, one day after the last run

Task Dependencies:

  • main_task has a depends_on configuration referencing notebook_task
  • This creates an orchestration dependency: main_task runs only after notebook_task succeeds
  • Dependencies ensure proper execution order and prevent processing incomplete data
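
To make the effect of depends_on concrete, the execution order can be derived from the task list with a topological sort. The sketch below uses Python's standard graphlib module (Python 3.9 or later) on a trimmed copy of the two tasks. It only illustrates the ordering idea and says nothing about how Databricks schedules tasks internally; `execution_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Trimmed copy of the two tasks from the job definition, keeping only the
# fields that matter for ordering.
tasks = [
    {"task_key": "notebook_task"},
    {"task_key": "main_task", "depends_on": [{"task_key": "notebook_task"}]},
]


def execution_order(tasks):
    """Return task keys in an order that respects every depends_on entry."""
    # Map each task to the set of tasks that must finish before it starts.
    graph = {
        task["task_key"]: {dep["task_key"] for dep in task.get("depends_on", [])}
        for task in tasks
    }
    return list(TopologicalSorter(graph).static_order())


print(execution_order(tasks))  # ['notebook_task', 'main_task']
```

A circular dependency would make `static_order` raise `graphlib.CycleError`, which makes this a cheap check to run before deploying.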

Task Types:

  • Notebook Task: Runs the notebook at src/notebook.ipynb
  • Python Wheel Task: Executes packaged Python code with entry point main
  • Both tasks run on the same cluster, which they reference via job_cluster_key
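
Because tasks refer to clusters only by key, a mistyped job_cluster_key is easy to miss. The following sketch checks that every key used by a task is defined under job_clusters. It works on a trimmed, plain-dictionary copy of the job definition, and `undefined_cluster_keys` is a hypothetical helper, not part of the bundles library.

```python
# Trimmed copy of the job definition: only cluster keys are kept.
job_definition = {
    "tasks": [
        {"task_key": "notebook_task", "job_cluster_key": "job_cluster"},
        {"task_key": "main_task", "job_cluster_key": "job_cluster"},
    ],
    "job_clusters": [{"job_cluster_key": "job_cluster"}],
}


def undefined_cluster_keys(definition):
    """Return cluster keys that tasks reference but job_clusters does not define."""
    defined = {
        cluster["job_cluster_key"] for cluster in definition.get("job_clusters", [])
    }
    referenced = {
        task["job_cluster_key"]
        for task in definition.get("tasks", [])
        if "job_cluster_key" in task
    }
    return sorted(referenced - defined)


print(undefined_cluster_keys(job_definition))  # [] means every reference resolves
```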

Cluster Configuration:

  • Auto-scaling cluster (1-4 workers) defined in job_clusters
  • Sharing one job cluster across both tasks avoids starting a separate cluster for each task, which can reduce cost and startup time
  • The job cluster terminates automatically once the job run completes
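
For monitoring, the job definition above leaves its email_notifications block commented out. One way to switch it on without editing the dictionary in place is a small helper like the one below. This is a sketch on a plain dictionary: the function name and the email address are hypothetical, and only the on_failure field that already appears in the commented-out block is used.

```python
import copy


def with_failure_emails(job_definition, recipients):
    """Return a copy of the job definition that emails recipients when a run fails."""
    updated = copy.deepcopy(job_definition)
    notifications = updated.setdefault("email_notifications", {})
    notifications["on_failure"] = list(recipients)
    return updated


# Trimmed stand-in for the full job dictionary shown earlier.
job_definition = {"name": "jobs_as_code_project_job", "tasks": []}

monitored = with_failure_emails(job_definition, ["team@example.com"])
print(monitored["email_notifications"])
```

Returning a copy keeps the original dictionary unchanged, so the same base definition can be reused for targets with different recipients.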

This concludes the guide.

Official Docs:

[1] https://docs.databricks.com/aws/en/jobs/

[2] https://docs.databricks.com/aws/en/dev-tools/bundles/pipelines-tutorial

[3] https://docs.databricks.com/aws/en/dev-tools/bundles/python/