Overview#
This is the second phase of the Data Product Creation lifecycle. It is the first stage in which AI Foundation assumes active responsibility for the deliverables and the Product Owner assumes accountability for the solution.
Success
- Create a working project in a project management tool of choice.
- Document detailed requirements. The high-level requirement document from the previous phase acts as input to this phase.
- Data Product Registration
- Solution Design
- Risk Assessment

After the design phase, the TDA must be completed.
Note
A typical team setup for developing a use case should have the following key roles identified in the early stages of initiation.
| Function | Role to be staffed from | How to request the role |
|---|---|---|
| Solution / Product Owner | LoB | LoB Responsibility, can take guidance from GD&AI vertical |
| Solution / Product Manager | LoB | LoB Responsibility, can take guidance from GD&AI vertical |
| System Manager | GD&AI | Reach out to GD&AI vertical |
| Technical Lead | GD&AI Vertical | Reach out to GD&AI vertical |
| IT QA | GD&AI | Reach out to GD&AI vertical |
| Validation Lead | LoB | LoB Responsibility, can take guidance from GD&AI horizontal |
| Project Manager | LoB | LoB Responsibility, can take guidance from GD&AI vertical |
| Scrum Master | GD&AI | LoB Responsibility, can take guidance from GD&AI vertical |
| Lead Architect | GD&AI | LoB Responsibility, can take guidance from GD&AI vertical |
| Tester | GD&AI | LoB Responsibility, can take guidance from GD&AI vertical |
| Developers | GD&AI | LoB Responsibility, can take guidance from GD&AI vertical |
Data Governance during Design Phase#
Objective:#
Ensure data products meet FAIR Principles through comprehensive metadata capture.
During the design phase, Data Governance captures metadata for each data product attribute so that Data Products are Findable, Accessible, Interoperable, and Reusable (FAIR), as well as functional, high-quality, and secure.
Metadata Requirements for FAIR Data Products#
To ensure Data Products are Findable, Accessible, Interoperable, and Reusable, follow these guidelines:
1. Governance: Assign a Data Owner and Data Steward for every Data Product.
2. Risk Assessment: Conduct and document risk and classification assessments (e.g., Data Classification, PII, GxP). Currently, use the standard risk assessment format for IT. A simplified risk assessment for data products, linked to the entity in ServiceNow, is planned for November.
3. Data Product Attributes: Maintain complete metadata, including source and product IDs, descriptions, etc.
4. Source Tagging: Tag source data with classification and domain tags from enterprise data guardrails.
5. Lineage: Capture data lineage from source to product in NNDM, with all input and output ports present, e.g., linked sources, applications, and data contracts or consumer data products.
6. Semantically Consistent: Use defined business terms from the business glossary; link data models to the enterprise model where relevant.
7. Data Quality: Define and monitor data quality metrics (completeness, accuracy, timeliness). For more information, see the data quality services and support.
8. Usage Map: Map producing and consuming systems; document interface agreements (to be defined further).
9. Implementation SOP: Ensure the implementation of data products aligns with the applicable SOPs, e.g., that data products are correctly classified according to the Protecting and Handling Information SOP.
10. Retention Policy: Define and document retention and disposal procedures aligned with corporate policies.
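The data quality guideline names completeness, accuracy, and timeliness as metrics. As a minimal sketch of how two of them might be computed, the Python below treats completeness as the share of records with a non-empty field and timeliness as the share of records updated within a maximum age. The function names, field names, and sample records are hypothetical and not taken from any Novo Nordisk tooling; accuracy is left out because it needs a trusted reference dataset to compare against.

```python
from datetime import datetime, timedelta, timezone


def completeness(records, field):
    """Share of records where `field` is present and not empty (0.0 to 1.0)."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def timeliness(records, timestamp_field, max_age, now):
    """Share of records whose timestamp lies within `max_age` of `now`."""
    if not records:
        return 0.0
    fresh = sum(
        1
        for r in records
        if r.get(timestamp_field) is not None
        and now - r[timestamp_field] <= max_age
    )
    return fresh / len(records)


# Hypothetical sample data, for illustration only.
NOW = datetime(2024, 1, 10, tzinfo=timezone.utc)
SAMPLE = [
    {"batch_id": "B1", "updated_at": datetime(2024, 1, 9, tzinfo=timezone.utc)},
    {"batch_id": "", "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"batch_id": "B3", "updated_at": datetime(2024, 1, 10, tzinfo=timezone.utc)},
    {"batch_id": "B4", "updated_at": None},
]

print(completeness(SAMPLE, "batch_id"))                            # 0.75
print(timeliness(SAMPLE, "updated_at", timedelta(days=2), NOW))    # 0.5
```

In practice the thresholds for each metric (for example, a minimum acceptable completeness) would be agreed with the Data Owner and Data Steward rather than hard-coded.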
FAIR Data Product Requirements#
| Data Product Attributes | Description | Required/ Optional | Rationale |
|---|---|---|---|
| 1. Governance | |||
| Data Owner | Owner of Data Products | Required | To establish ownership and stewardship for Data Products critical for governance, maintenance, and enhancements of Data Products. LoBs can further customize the roles to required granularity, e.g., Data (product) Owner, Data (integrity) Owner. Key consideration is that Owner is ultimately accountable, while steward is responsible for operations and adherence to the requirement of CRUD operations. SME roles cover extended roles specific to an LoB (Data Modelers, ontologists, scientific stewards, researchers, etc.). |
| Data Steward | Data Steward responsible for maintaining the data product content and quality | Required | |
| Subject Matter Expert | Responsible for questions regarding the data asset or data product | Optional | |
| 2. Risk Assessment | |||
| Business Impact | Risk Classification (High, Medium, Low) in alignment with IT Security Risk assessment Framework | Required | Helps assess and manage the risks associated with a data product to protect NOVO NORDISK data and ensure compliance with regulations. |
| Data Sensitivity Classification | Data sensitivity classification (e.g., Public, Internal Use, Confidential, Strictly Confidential) | Required | |
| Personally Identifiable Information (PII) | Specify if the data source contains personally identifiable information (PII) (e.g., Yes or No) | Required | |
| GxP | Specify if the data source contains GxP data as part of a GxP validated process | Required | |
| 3. Attributes | |||
| Data Product ID | Globally unique and persistent identifier from Purview and NOVO NORDISK DM | Required | Identification and brief description of Data Product for quick understanding. |
| Data Product Name | A user-oriented name of the data product. Use plain English and avoid abbreviations | Required | |
| Data Product Description | A description of the data product in non-technical language. Use plain English and avoid abbreviations unless they are more commonly used than the full term, e.g., PC, ATM. | Required | |
| Source ID | Unique ID of the source system | Required | Identification and brief description of the source system where actual data is hosted and its owner. |
| Source System Name | Name of the software used to perform a function | Required | |
| IT-system owner | The owner of an IT system is a Line Manager who is responsible for one of the business processes supported by the IT system | Required | |
| Source Type | Type of source (e.g., Oracle, Azure SQL, SQL Server, ADLS Gen2, Databricks, PostgreSQL, Snowflake, Power BI) | Optional | Identifies the origin of the data product, helping consumers understand the underlying technology. |
| Source Connection Details | Connection details like S3 Bucket URL | Optional | Defines how data consumers can technically connect to the data source, e.g., PostgreSQL: Server, Port, DB Name. |
| Service Account | Azure AD identity used for accessing the source (if Azure-hosted) | Optional | To enforce secure authentication and ensure compliance with Novo Nordisk's security guidelines. |
| Access Credentials | Credentials (username/password) for accessing the data source (if not using a service principal) | Optional | |
| 4. Source Tagging | |||
| Tag Name | Name of tags in Data Sources (e.g., Data Quality tags, Business or Domain tags such as Financial Data, Clinical Data, etc.) | Optional | Helps categorize the Data Products. |
| 5. Lineage | |||
| Data Lineage | Lineage from source to data product in the data marketplace | Optional | Helps with auditing, troubleshooting, and understanding data flows. |
| 6. Semantically Consistent | |||
| Business Glossary | Definition of business terms used in Data Product in the glossary | Optional | Ensures common understanding of business terminology, enhancing communication and reducing ambiguity across teams. |
| Data Model | Data Model for Data Assets & Products linked to enterprise data model | Required | Helps understand the underlying structure of a Data Product. |
| 7. Data Quality | |||
| Data Quality Measures | Data should have data quality metrics and monitoring defined | Required | Ensures Data Products meet business and user needs. |
| 8. Usage Map | |||
| Interface Agreements | Data map over producing and consuming systems | Optional | Interface Agreements act as the foundation for standardized, secure, and governed data exchanges, enabling seamless interoperability between systems while ensuring compliance, trust, and usability of Data Products. |
| 9. Implementation SOP | Ensure the implementation of the data product aligns with the applicable SOPs, e.g., that data products are correctly classified according to the Protecting and Handling Information SOP. | Optional | Ensures all compliance requirements are met. |
| 10. Retention Policy | |||
| Retention Time | Retention time for the content of the data | Required | Ensure compliance with legal and regulatory requirements and optimal data storage management. |
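The attributes marked Required in the table above lend themselves to an automated completeness check. The following is a sketch only: it assumes metadata is available as a plain dictionary, and the snake_case keys are hypothetical names chosen for illustration, not the identifiers used in Purview or any other catalogue.

```python
# Hypothetical snake_case keys for the attributes marked "Required" in the
# table, listed in table order. These are illustrative names, not real
# catalogue field identifiers.
REQUIRED_ATTRIBUTES = [
    "data_owner",
    "data_steward",
    "business_impact",
    "data_sensitivity_classification",
    "pii",
    "gxp",
    "data_product_id",
    "data_product_name",
    "data_product_description",
    "source_id",
    "source_system_name",
    "it_system_owner",
    "data_model",
    "data_quality_measures",
    "retention_time",
]


def missing_required(metadata):
    """Return the required keys that are absent or blank, in table order."""
    missing = []
    for key in REQUIRED_ATTRIBUTES:
        value = metadata.get(key)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(key)
    return missing


# Hypothetical metadata record: the description is blank and the retention
# time is absent, so both should be reported.
EXAMPLE = {
    "data_owner": "owner@example.com",
    "data_steward": "steward@example.com",
    "business_impact": "Medium",
    "data_sensitivity_classification": "Confidential",
    "pii": "No",
    "gxp": "No",
    "data_product_id": "DP-0001",
    "data_product_name": "Example product",
    "data_product_description": " ",
    "source_id": "SRC-0001",
    "source_system_name": "Example source",
    "it_system_owner": "manager@example.com",
    "data_model": "example-model",
    "data_quality_measures": "completeness >= 0.95",
}

print(missing_required(EXAMPLE))
# ['data_product_description', 'retention_time']
```

Optional attributes are deliberately ignored here; a stricter variant could report them as warnings instead of failures.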
Data Governance Success Metrics & Checkpoints#
- Checkpoint 1: Risk assessments completed and documented
- Checkpoint 2: Metadata validated against FAIR Data Product requirements
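Checkpoint 1 could be supported by a simple check that each risk assessment field holds a recognised value. The sketch below is an assumption-laden illustration: the field keys are hypothetical, the sensitivity levels are the examples listed in the requirements table, and the Yes/No values for GxP are assumed by analogy with the PII attribute.

```python
# Allowed values per risk assessment field. Keys are hypothetical; the
# sensitivity levels come from the examples in the requirements table, and
# Yes/No for GxP is an assumption.
ALLOWED_VALUES = {
    "business_impact": {"High", "Medium", "Low"},
    "data_sensitivity_classification": {
        "Public",
        "Internal Use",
        "Confidential",
        "Strictly Confidential",
    },
    "pii": {"Yes", "No"},
    "gxp": {"Yes", "No"},
}


def risk_assessment_issues(assessment):
    """Map each field with a missing or unrecognised value to that value."""
    issues = {}
    for field, allowed in ALLOWED_VALUES.items():
        value = assessment.get(field)
        if value not in allowed:
            issues[field] = value
    return issues


# Hypothetical assessment with one unrecognised value and one missing field.
DRAFT = {
    "business_impact": "Medium",
    "data_sensitivity_classification": "Secret",
    "pii": "No",
}

print(risk_assessment_issues(DRAFT))
# {'data_sensitivity_classification': 'Secret', 'gxp': None}
```

An empty result would indicate that the assessment is ready to be recorded as completed for the checkpoint.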
Data Governance Common Challenges & Solutions#
Challenge: Incomplete metadata leading to poor discoverability
- Solution: Implement a comprehensive checklist for required metadata attributes
- Prevention: Conduct regular reviews of metadata against established standards
Read more about Data Governance
| Function | Role to be staffed from | How to request the role |
|---|---|---|
| Solution / Product Owner | LoB | LoB Responsibility, can take guidance from AI Foundation vertical |
| Solution / Product Manager | LoB | LoB Responsibility, can take guidance from AI Foundation vertical |
| System Manager | AI Foundation | Reach out to AI Foundation vertical |
| Technical Lead | AI Foundation Vertical | Reach out to AI Foundation vertical |
| IT QA | AI Foundation | Reach out to AI Foundation vertical |
| Validation Lead | LoB | LoB Responsibility, can take guidance from AI Foundation horizontal |
| Project Manager | LoB | LoB Responsibility, can take guidance from AI Foundation vertical |
| Scrum Master | AI Foundation | LoB Responsibility, can take guidance from AI Foundation vertical |
| Lead Architect | AI Foundation | LoB Responsibility, can take guidance from AI Foundation vertical |
| Tester | AI Foundation | LoB Responsibility, can take guidance from AI Foundation vertical |
| Developers | AI Foundation | LoB Responsibility, can take guidance from AI Foundation vertical |