Data Governance for Data Products#

Overview#

This chapter covers how to integrate data governance practices into each phase of the data product lifecycle, including principles, roles, and metadata requirements for the data product creation process. It facilitates the development of data products that are Findable, Accessible, Interoperable, and Reusable (FAIR), enhancing operational efficiency, analytics, data science, and AI use cases. It underscores the importance of effective governance for ensuring that Data Products are trustworthy, high-quality, compliant, and valuable across the organization.

What you will learn:#

After reading this chapter, you will understand:
- The significance of Data Governance in the Data Product lifecycle
- Key roles and responsibilities in Data Governance
- Metadata requirements for Data Products

Prerequisites:#

Data Governance Roles#

Objective:#

Establish governance foundation and assign key roles.

The following data governance roles are essential for ensuring effective governance of data products:

| Role | Org. Level | Primary Goal | Key Responsibilities | Accountability |
| --- | --- | --- | --- | --- |
| Executive Sponsor | EVP/SVP Level | Guides Data Governance strategy at the highest level | Defines LoB Data Governance vision, approves resources, advocates for DG across the enterprise | Ensures alignment with corporate strategy, allocates resources, drives LoB Data Governance culture |
| Business Sponsor | CVP/VP Level | Represents their business domain and ensures local alignment | Leads LoB implementation, drives collaboration, reviews KPIs | Applies framework in their LoB area, communicates with NNDGC, ensures alignment with enterprise goals |
| Data Owner | VP/VP-1 Level | Accountable for specific data assets and their full lifecycle | Defines relevant data-specific rules, ensures quality, assesses impacts of changes | Strategically accountable for data reliability, ensures LoB area compliance |
| Data Steward | SMEs at job level 7-8 | Oversees data quality and provides subject matter expertise | Fixes quality issues, ensures policy compliance, maintains metadata, supports users, monitors workflows | Maintains high-quality data, reports trends, aligns data asset practices with enterprise standards |
| Data Maintainer | Data Engineers, Data Architects, etc. | Manages technical aspects of data infrastructure | Resolves technical issues, automates workflows, supports upgrades, validates pipelines, optimizes tools | Ensures technical integrity and performance, aligns infrastructure with framework, reports issues |
| Data Producer | Any level | Captures data at point of origin to ensure complete and valid records | Ensures data accuracy and completeness at point of entry, follows standards for data capture and classification | Maintains data lineage, supports data governance by documenting source processes |
| Data Consumer | Any level | Accesses and utilizes data responsibly to support business decisions | Adheres to governance policies, reports data quality issues, validates data accuracy, provides feedback for improvement | Ensures proper data usage, highlights gaps, supports continuous improvement |

The detailed roles and responsibilities are defined in the appendices of the Novo Nordisk Data Governance Framework.

Data Governance during Design Phase#

Objective:#

Ensure data products meet FAIR principles through comprehensive metadata capture.

During the design phase, Data Governance ensures that Data Products are functional, high-quality, and secure by capturing metadata for data product attributes to ensure they are Findable, Accessible, Interoperable, and Reusable.

Metadata Requirements for FAIR Data Products#

To ensure Data Products are Findable, Accessible, Interoperable, and Reusable, follow these guidelines:
1. Governance: Assign a Data Owner and Data Steward for every Data Product.
2. Risk Assessment: Conduct and document risk and classification assessments (e.g., Data Classification, PII, GxP). Currently, use the standard risk assessment format for IT. A simplified risk assessment for data products, linked to the entity in ServiceNow, is planned for November.
3. Data Product Attributes: Maintain complete metadata, including source and product IDs, descriptions, etc.
4. Source Tagging: Tag source data with classification and domain tags from enterprise data guardrails.
5. Lineage: Ensure data lineage from source to product is captured in NNDM, with all input and output ports present, e.g., with linked sources, applications, and data contracts or consumer data products.
6. Semantically Consistent: Use defined business terms from the business glossary; link data models to the enterprise model where relevant.
7. Data Quality: Define and monitor data quality metrics (completeness, accuracy, timeliness). For more information and support, see the data quality services and support.
8. Usage Map: Map producing and consuming systems; document interface agreements (to be defined further).
9. Implementation SOP: Ensure the implementation of data products is in alignment with the applicable SOPs, e.g., that data products are correctly classified according to the Protecting and Handling Information SOP.
10. Retention Policy: Define and document retention and disposal procedures aligned with corporate policies.
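
As an illustration of guideline 5, the sketch below checks that a data product declares at least one input and one output port and that every port is linked to something (a source, application, data contract, or consumer data product). The `Port` class, its fields, and the link identifiers are hypothetical and do not represent the NNDM data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    """Hypothetical representation of a data product port."""
    name: str
    direction: str  # "input" or "output"
    linked_to: list[str] = field(default_factory=list)


def unlinked_ports(ports: list[Port]) -> list[str]:
    """Return the names of ports that have no linked source or consumer."""
    return [p.name for p in ports if not p.linked_to]


def has_complete_lineage(ports: list[Port]) -> bool:
    """True if input and output ports both exist and every port is linked."""
    directions = {p.direction for p in ports}
    return {"input", "output"} <= directions and not unlinked_ports(ports)


# Illustrative ports; names and link identifiers are made up.
ports = [
    Port("orders_raw", "input", ["source:erp_orders"]),
    Port("orders_curated", "output"),
]
print(unlinked_ports(ports))
print(has_complete_lineage(ports))
```

A check like this could run before a data product is published, so that missing links are caught while the producing team can still correct them.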

FAIR Data Product Requirements#

| Data Product Attributes | Description | Required/Optional | Rationale |
| --- | --- | --- | --- |
| 1. Governance | | | |
| Data Owner | Owner of Data Products | Required | To establish ownership and stewardship for Data Products, which is critical for governance, maintenance, and enhancements of Data Products. LoBs can further customize the roles to the required granularity, e.g., Data (product) Owner, Data (integrity) Owner. The key consideration is that the Owner is ultimately accountable, while the Steward is responsible for operations and adherence to the requirements of CRUD operations. SME roles cover extended roles specific to an LoB (data modelers, ontologists, scientific stewards, researchers, etc.). |
| Data Steward | Data Steward responsible for maintaining the data product content and quality | Required | |
| Subject Matter Expert | Responsible for questions regarding the data asset or data product | Optional | |
| 2. Risk Assessment | | | |
| Business Impact | Risk classification (High, Medium, Low) in alignment with the IT Security Risk Assessment Framework | Required | Helps assess and manage the risks associated with a data product to protect Novo Nordisk data and ensure compliance with regulations. |
| Data Sensitivity Classification | Data sensitivity classification (e.g., Public, Internal Use, Confidential, Strictly Confidential) | Required | |
| Personally Identifiable Information (PII) | Specify if the data source contains personally identifiable information (PII) (e.g., Yes or No) | Required | |
| GxP | Specify if the data source contains GxP data as part of a GxP validated process | Required | |
| 3. Attributes | | | |
| Data Product ID | Globally unique and persistent identifier from Purview and Novo Nordisk DM | Required | Identification and brief description of the Data Product for quick understanding. |
| Data Product Name | A user-oriented name of the data product. Use plain English and avoid abbreviations | Required | |
| Data Product Description | A description of the data product in non-technical language. Use plain English and avoid abbreviations unless they are more commonly used than the full term, e.g., PC, ATM | Required | |
| Source ID | Unique ID of the source system | Required | Identification and brief description of the source system where the actual data is hosted, and its owner. |
| Source System Name | Name of the software used to perform a function | Required | |
| IT-system owner | The owner of an IT system is a Line Manager who is responsible for one of the business processes supported by the IT system | Required | |
| Source Type | Type of source (e.g., Oracle, Azure SQL, SQL Server, ADLS Gen2, Databricks, PostgreSQL, Snowflake, Power BI) | Optional | Identifies the origin of the data product, helping consumers understand the underlying technology. |
| Source Connection Details | Connection details, such as an S3 bucket URL | Optional | Defines how data consumers can technically connect to the data source, e.g., PostgreSQL: Server, Port, DB Name. |
| Service Account | Azure AD identity used for accessing the source (if Azure-hosted) | Optional | To enforce secure authentication and ensure compliance with Novo Nordisk's security guidelines. |
| Access Credentials | Credentials (username/password) for accessing the data source (if not using a service principal) | Optional | |
| 4. Source Tagging | | | |
| Tag Name | Name of tags in Data Sources (e.g., Data Quality tags, Business or Domain tags such as Financial Data, Clinical Data, etc.) | Optional | Helps categorize the Data Products. |
| 5. Lineage | | | |
| Data Lineage | Lineage from source to data product in the data marketplace | Optional | Helps with auditing, troubleshooting, and understanding data flows. |
| 6. Semantically Consistent | | | |
| Business Glossary | Definition of business terms used in the Data Product in the glossary | Optional | Ensures common understanding of business terminology, enhancing communication and reducing ambiguity across teams. |
| Data Model | Data Model for Data Assets & Products linked to the enterprise data model | Required | Helps understand the underlying structure of a Data Product. |
| 7. Data Quality | | | |
| Data Quality Measures | Data should have data quality metrics and monitoring defined | Required | Ensures Data Products meet business and user needs. |
| 8. Usage Map | | | |
| Interface Agreements | Data map over producing and consuming systems | Optional | Interface Agreements act as the foundation for standardized, secure, and governed data exchanges, enabling seamless interoperability between systems while ensuring compliance, trust, and usability of Data Products. |
| 9. Implementation SOP | Ensure implementation of the data product is in alignment with the applicable SOPs, e.g., that data products are correctly classified according to the Protecting and Handling Information SOP | Optional | Ensures all compliance requirements are met. |
| 10. Retention Policy | | | |
| Retention Time | Retention time for the content of the data | Required | Ensures compliance with legal and regulatory requirements and optimal data storage management. |
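
The requirements table above lends itself to an automated completeness check. The following is a minimal sketch, assuming metadata is available as a plain dictionary. The attribute keys are hypothetical names chosen for illustration; only the set of required attributes is taken from the table.

```python
from typing import Any

# Hypothetical keys; the set mirrors the rows marked "Required" in the table.
REQUIRED_ATTRIBUTES = [
    "data_owner",
    "data_steward",
    "business_impact",
    "data_sensitivity_classification",
    "contains_pii",
    "contains_gxp",
    "data_product_id",
    "data_product_name",
    "data_product_description",
    "source_id",
    "source_system_name",
    "it_system_owner",
    "data_model",
    "data_quality_measures",
    "retention_time",
]


def missing_required_attributes(metadata: dict[str, Any]) -> list[str]:
    """Return the required attribute keys that are absent or blank."""
    missing = []
    for key in REQUIRED_ATTRIBUTES:
        value = metadata.get(key)
        # False is a valid answer for yes/no attributes such as contains_pii,
        # so only None and blank strings count as missing.
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(key)
    return missing


# Illustrative, deliberately incomplete record.
example = {
    "data_owner": "owner@example.com",
    "data_steward": "steward@example.com",
    "contains_pii": False,
    "data_product_name": "  ",
}
print(missing_required_attributes(example))
```

Optional attributes are deliberately ignored here; a stricter variant could also validate allowed values, for example that the business impact is one of High, Medium, or Low.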

Data Governance during Build & Deploy Phase#

Objective:#

Implement governance standards and prepare for production deployment.

In the Build & Deploy phases, Data Governance ensures the implementation of Data Products in alignment with established standards and best practices:
1. Perform the final risk assessment and compliance check.
2. Establish access controls per governance policies.
3. Train users on how to utilize the Data Product and relevant governance practices.
4. Ensure that metadata meets standards for Findability, Accessibility, Interoperability, and Reusability.
5. Track data lineage from source to product.
6. Define and monitor data quality metrics to ensure data meets business and user needs. For more information and support, see the data quality services and support.
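
To make step 6 more concrete, the sketch below computes two of the data quality metrics named in this chapter, completeness and timeliness. The function names, record layout, and freshness window are assumptions for illustration. Accuracy is left out because it requires a trusted reference to compare against.

```python
from datetime import datetime, timedelta, timezone


def completeness(records: list[dict], field: str) -> float:
    """Share of records in which `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def is_timely(last_loaded: datetime, max_age: timedelta, now: datetime) -> bool:
    """True if the most recent load falls within the agreed freshness window."""
    return now - last_loaded <= max_age


# Illustrative records; the fields are made up.
records = [
    {"batch_id": "B1", "site": "A"},
    {"batch_id": "B2", "site": ""},
    {"batch_id": "B3"},
    {"batch_id": "B4", "site": "C"},
]
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_loaded = datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc)

print(completeness(records, "site"))  # 2 of 4 records have a site -> 0.5
print(is_timely(last_loaded, timedelta(hours=24), now))  # True
```

Thresholds for each metric (for example, a minimum completeness per field) would be agreed between the Data Owner and Data Steward rather than hard-coded.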

Data Governance during Operate Phase#

Objective:#

Maintain ongoing governance and continuous improvement.

The operate phase involves ongoing management and monitoring of the Data Product, ensuring compliance with governance:
1. Regularly monitor data quality and compliance metrics. For more information and support, see the data quality services and support.
2. Conduct periodic reviews of the metadata attributes for recency and accuracy.
3. Gather feedback from users for continuous improvement.
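
The periodic metadata review in step 2 can be supported by a simple recency check. The sketch below flags data products whose metadata has not been reviewed within a given interval. The 180-day interval and the product IDs are assumptions for illustration, not values from the governance framework.

```python
from datetime import date, timedelta

# Assumed review interval; the governance framework may prescribe another.
REVIEW_INTERVAL = timedelta(days=180)


def overdue_reviews(last_reviewed: dict[str, date], today: date) -> list[str]:
    """Return the IDs of data products whose metadata review is overdue."""
    return sorted(
        product_id
        for product_id, reviewed_on in last_reviewed.items()
        if today - reviewed_on > REVIEW_INTERVAL
    )


# Illustrative review log keyed by hypothetical data product IDs.
review_log = {
    "DP-001": date(2024, 1, 10),
    "DP-002": date(2024, 6, 1),
}
print(overdue_reviews(review_log, today=date(2024, 9, 1)))  # ['DP-001']
```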

Success Metrics & Checkpoints#

  • Checkpoint 1: Data Governance roles assigned and documented
  • Checkpoint 2: Risk assessments completed and documented
  • Checkpoint 3: Metadata validated against FAIR Data Product requirements
  • Checkpoint 4: Access controls implemented per governance policies

Common Challenges & Solutions#

  • Challenge: Lack of clarity in roles and responsibilities
  • Solution: Regularly review and communicate roles and responsibilities
  • Prevention: Establish and document clear governance policies early in the process
  • Challenge: Incomplete metadata leading to poor discoverability
  • Solution: Implement a comprehensive checklist for required metadata attributes
  • Prevention: Conduct regular reviews of metadata against established standards

Next Steps#

After completing this chapter, you should:
- Have established clear governance roles for your data product
- Have completed all required metadata documentation
- Have implemented appropriate access controls and monitoring
- Be ready to proceed with ongoing operational governance

Additional Resources#