Engineering Gaps - Orchestrate

Version Control

Version | Date | Owner | Change Description
0.1 | 18 March 2025 | Gareth Stretch | Initial framework created
0.2 | 23 March 2025 | Gareth Stretch | Moved gaps to their own page

Orchestration Gaps

The diagram and section below give a high-level overview of the Azure Databricks data platform described in the to-be deliverable. The orchestration area is highlighted, and its gaps and recommendations are listed in the table that follows.

[Figure: Azure Databricks data platform overview with the orchestration area highlighted]

Table: Orchestration Gaps

Area: Cross-stack orchestration
Gap: Orchestration is limited to workflows. This prevents end-to-end orchestration of pipelines that span multiple technologies, e.g. Databricks and Snowflake.
Recommendation: Use orchestration tools such as Databricks Jobs, Apache Airflow, Dagster and Databricks Workflows to manage and automate pipeline tasks.

Area: Automation
Gap: Make use of the Databricks API to improve automation and orchestration.
Recommendation:
- Workflow automation: The Databricks API allows you to automate complex workflows, reducing the need for manual intervention. This can include tasks such as starting and stopping clusters, running jobs, and managing libraries.
- Integration: The API enables integration with other tools and services in your data ecosystem, including CI/CD pipelines, monitoring tools, and data ingestion frameworks. This allows for a more cohesive and streamlined data processing environment.
- Real-time monitoring: The API provides endpoints for monitoring the status of clusters, jobs, and other resources in real time. This allows for proactive management and a quick response to any issues that arise.

Area: Automation
Gap: Automatic publish of Data Marketplace
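
As an illustration of the Databricks API recommendation above, the following minimal sketch triggers a job and polls its status until it finishes. It assumes the Jobs API 2.1 endpoints `/api/2.1/jobs/run-now` and `/api/2.1/jobs/runs/get`. The workspace URL, token, and the helper names `build_request`, `send_request` and `run_job_and_wait` are hypothetical and not part of this framework. Treat it as a starting point under those assumptions rather than a definitive implementation.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Hypothetical placeholders: substitute your own workspace URL and access token.
HOST = "https://example-workspace.azuredatabricks.net"
TOKEN = "example-token"

# Life-cycle states after which a job run will not change any further.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def build_request(method: str, path: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build an authenticated request against the Databricks REST API."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=f"{HOST}{path}",
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )


def send_request(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_job_and_wait(job_id: int,
                     send: Callable[[urllib.request.Request], dict] = send_request,
                     poll_seconds: float = 30.0,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Trigger a job run, then poll until it reaches a terminal state.

    Returns the result state (for example "SUCCESS" or "FAILED") when the
    API reports one, otherwise the terminal life-cycle state.
    """
    run = send(build_request("POST", "/api/2.1/jobs/run-now", {"job_id": job_id}))
    run_id = run["run_id"]
    while True:
        status = send(build_request("GET", f"/api/2.1/jobs/runs/get?run_id={run_id}"))
        state = status["state"]
        if state["life_cycle_state"] in TERMINAL_STATES:
            return state.get("result_state", state["life_cycle_state"])
        sleep(poll_seconds)
```

The `send` and `sleep` parameters are injectable so that an external orchestrator or a CI/CD pipeline can wrap the call with its own retry and timeout handling, and so the polling logic can be exercised without a live workspace. A call such as `run_job_and_wait(123)` would use the real HTTP sender, where `123` stands in for an existing job ID.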