Requirements Specification#

Overview#

The Requirements Specification phase translates business needs into detailed technical and functional requirements for your data product. This critical step ensures all stakeholders have a shared understanding of what will be delivered and establishes clear boundaries for the implementation team.

What you will learn#

After completing this chapter, you will understand how to:

  • Transform business requirements into actionable technical specifications
  • Conduct stakeholder interviews to gather comprehensive requirements
  • Create essential documentation, including data contracts, a business glossary, and gap analyses
  • Define clear scope boundaries and success criteria for your data product
  • Establish proper governance, security, and compliance requirements

Key Personas & Stakeholders - RACI Matrix#

| Activity | Product Owner | Business Analyst | Data Engineer | Solution Architect | Business Stakeholders | Governance Team |
|---|---|---|---|---|---|---|
| Requirements Gathering | A | R | C | C | C | C |
| Technical Analysis | C | R | R | A | I | C |
| Gap Analysis | A | R | C | R | C | C |
| Scope Definition | A | R | C | C | I | I |
| Business Glossary Creation/Maintenance | A | R | I | C | C | R |
| Data Contract Development/Maintenance | A | C | R | C | I | I |
| Security Requirements | I | C | C | R | C | A |

R = Responsible, A = Accountable, C = Consulted, I = Informed
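
A matrix like the one above can be checked mechanically before it is circulated. The sketch below is illustrative only: it encodes the rows as strings in column order and applies a common RACI convention (exactly one Accountable and at least one Responsible per activity). That convention and the helper names are assumptions, not rules stated in this chapter.

```python
# Role order follows the table columns above.
ROLES = [
    "Product Owner", "Business Analyst", "Data Engineer",
    "Solution Architect", "Business Stakeholders", "Governance Team",
]

# One code per role, copied row by row from the matrix.
RACI = {
    "Requirements Gathering": "ARCCCC",
    "Technical Analysis": "CRRAIC",
    "Gap Analysis": "ARCRCC",
    "Scope Definition": "ARCCII",
    "Business Glossary Creation/Maintenance": "ARICCR",
    "Data Contract Development/Maintenance": "ACRCII",
    "Security Requirements": "ICCRCA",
}


def check_raci(matrix):
    """Return a list of problems; an empty list means every row passed."""
    problems = []
    for activity, codes in matrix.items():
        if len(codes) != len(ROLES):
            problems.append(f"{activity}: expected {len(ROLES)} codes, got {len(codes)}")
            continue
        if set(codes) - set("RACI"):
            problems.append(f"{activity}: unknown code in '{codes}'")
        if codes.count("A") != 1:
            problems.append(f"{activity}: needs exactly one Accountable role")
        if "R" not in codes:
            problems.append(f"{activity}: needs at least one Responsible role")
    return problems


def accountable(activity):
    """Name of the role marked Accountable for the given activity."""
    return ROLES[RACI[activity].index("A")]


print(check_raci(RACI))
print(accountable("Security Requirements"))
```

Keeping the matrix as data makes it easy to re-run the check whenever a role or activity is added.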

Prerequisites#

  • Completed: Business Requirements Document
  • Access: Stakeholder contact information and availability for interviews
  • Tools: Access to ADO/TIMS for requirement management and dc-release framework setup
  • Knowledge: Understanding of your organization's data governance framework and compliance requirements

Before You Begin

Consider using TIMS/ADO managed projects for capturing final requirements. If your team follows the dc-release Operating Model, requirements should be captured in feature files.

Step-by-Step Process#

Step 1: Stakeholder Interview Planning#

Objective: Identify all relevant parties and plan comprehensive requirement gathering sessions.

  1. Create stakeholder matrix:

    • Primary business stakeholders (Product Owner, Business Owner)
    • Technical stakeholders (Data Engineers, Solution Architects)
    • Governance parties (Compliance, Security, Data Governance)
    • End users and consumers of the data product

  2. Schedule interview sessions with each stakeholder group

  3. Prepare interview templates focusing on their specific domain expertise

Interview Best Practices

Structure interviews around the stakeholder's expertise area. Ask governance teams about data definitions, security teams about access requirements, and business users about expected outcomes.

Expected Outcome: Comprehensive stakeholder engagement plan with scheduled interviews

Step 2: Requirements Gathering and Documentation#

Objective: Capture detailed functional and non-functional requirements from all stakeholders. Use the Business Requirements Document as the baseline.

Functional Requirements#

Document what the data product must do:

  • Data transformations and calculations
  • Expected outputs and formats
  • User interface requirements (for dashboards/reports)
  • Integration requirements with existing systems

Non-Functional Requirements#

Define how the data product should perform:

  • Performance requirements (latency, throughput)
  • Security requirements (encryption, access controls)
  • Compliance requirements (GDPR, industry regulations)
  • Scalability and availability needs

Use this Requirements Template to document your requirements.
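
Non-functional requirements are easier to sign off when each one carries a measurable target. The following is a minimal sketch of that idea; the identifier scheme, metric names, and numbers are hypothetical and not taken from the Requirements Template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NonFunctionalRequirement:
    req_id: str                      # hypothetical identifier scheme
    category: str                    # e.g. "performance", "availability"
    metric: str
    target: float
    unit: str
    higher_is_better: bool = False   # latency: lower is better; uptime: higher

    def is_met(self, measured: float) -> bool:
        """Compare a measured value against the agreed target."""
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


# Illustrative targets only.
requirements = [
    NonFunctionalRequirement("NFR-001", "performance", "end-to-end load latency", 30.0, "minutes"),
    NonFunctionalRequirement("NFR-002", "availability", "monthly uptime", 99.5, "percent",
                             higher_is_better=True),
]

# Illustrative measurements, e.g. from a test run.
measurements = {"NFR-001": 42.0, "NFR-002": 99.7}

unmet = [r.req_id for r in requirements if not r.is_met(measurements[r.req_id])]
print(unmet)
```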

Metadata Requirements for FAIR Data Products#

To ensure Data Products are Findable, Accessible, Interoperable, and Reusable, follow these guidelines:

  1. Governance: Assign a Data Owner and Data Steward for every Data Product.
  2. Risk Assessment: Conduct and document risk and classification assessments (e.g., Data Classification, PII, GxP). For now, use the standard IT risk assessment format. A simplified risk assessment for data products, linked to the entity in ServiceNow, is planned for November.
  3. Data Product Attributes: Maintain complete metadata, including source and product IDs, descriptions, etc.
  4. Source Tagging: Tag source data with classification and domain tags from enterprise data guardrails.
  5. Lineage: Ensure data lineage from source to product is captured in NNDM, where all input and output ports are present, e.g., with linked sources, applications, and data contracts or consumer data products.
  6. Semantically Consistent: Use defined business terms from the business glossary; link data models to the enterprise model where relevant.
  7. Data Quality: Define and monitor data quality metrics (completeness, accuracy, timeliness). For more information and support, check the services and support from data quality here.
  8. Usage Map: Map producing and consuming systems; document interface agreements. (To be defined further.)
  9. Implementation SOP: Ensure the implementation of data products is in alignment with the applicable SOPs, e.g., that data products are correctly classified according to Protecting and Handling Information SOP.
  10. Retention Policy: Define and document retention and disposal procedures aligned with corporate policies.

| Data Product Attributes | Description | Required/Optional | Rationale |
|---|---|---|---|
| 1. Governance | | | |
| Data Owner | Owner of Data Products | Required | To establish ownership and stewardship for Data Products, which is critical for their governance, maintenance, and enhancement. LoBs can further customize the roles to the required granularity, e.g., Data (product) Owner, Data (integrity) Owner. The key consideration is that the Owner is ultimately accountable, while the Steward is responsible for operations and adherence to the requirements of CRUD operations. SME roles cover extended roles specific to an LoB (data modelers, ontologists, scientific stewards, researchers, etc.). |
| Data Steward | Data Steward responsible for maintaining the data product content and quality | Required | |
| Subject Matter Expert | Responsible for questions regarding the data asset or data product | Optional | |
| 2. Risk Assessment | | | |
| Business Impact | Risk classification (High, Medium, Low) in alignment with the IT Security Risk Assessment Framework | Required | Helps assess and manage the risks associated with a data product to protect NOVO NORDISK data and ensure compliance with regulations. |
| Data Sensitivity Classification | Data sensitivity classification (e.g., Public, Internal Use, Confidential, Strictly Confidential) | Required | |
| Personally Identifiable Information (PII) | Specify whether the data source contains personally identifiable information (PII) (e.g., Yes or No) | Required | |
| GxP | Specify whether the data source contains GxP data as part of a GxP validated process | Required | |
| 3. Attributes | | | |
| Data Product ID | Globally unique and persistent identifier from Purview and NOVO NORDISK DM | Required | Identification and brief description of the Data Product for quick understanding. |
| Data Product Name | A user-oriented name of the data product. Use plain English and avoid abbreviations | Required | |
| Data Product Description | A non-technical description of the data product. Use plain English and avoid abbreviations unless they are commonly used (e.g., PC, ATM) | Required | |
| Source ID | Unique ID of the source system | Required | Identification and brief description of the source system where the actual data is hosted, and its owner. |
| Source System Name | Name of the software used to perform a function | Required | |
| IT-system owner | The owner of an IT system is a Line Manager who is responsible for one of the business processes supported by the IT system | Required | |
| Source Type | Type of source (e.g., Oracle, Azure SQL, SQL Server, ADLS Gen2, Databricks, PostgreSQL, Snowflake, Power BI) | Optional | Identifies the origin of the data product, helping consumers understand the underlying technology. |
| Source Connection Details | Connection details, such as an S3 bucket URL | Optional | Defines how data consumers can technically connect to the data source, e.g., PostgreSQL: Server, Port, DB Name. |
| Service Account | Azure AD identity used for accessing the source (if Azure-hosted) | Optional | To enforce secure authentication and ensure compliance with Novo Nordisk's security guidelines. |
| Access Credentials | Credentials (username/password) for accessing the data source (if not using a service principal) | Optional | |
| 4. Source Tagging | | | |
| Tag Name | Name of tags in data sources (e.g., data quality tags, or business or domain tags such as Financial Data, Clinical Data) | Optional | Helps categorize the Data Products. |
| 5. Lineage | | | |
| Data Lineage | Lineage from source to data product in the data marketplace | Optional | Helps with auditing, troubleshooting, and understanding data flows. |
| 6. Semantically Consistent | | | |
| Business Glossary | Definitions in the glossary of the business terms used in the Data Product | Optional | Ensures common understanding of business terminology, enhancing communication and reducing ambiguity across teams. |
| Data Model | Data model for Data Assets & Products, linked to the enterprise data model | Required | Helps understand the underlying structure of a Data Product. |
| 7. Data Quality | | | |
| Data Quality Measures | Data should have data quality metrics and monitoring defined | Required | Ensures Data Products meet business and user needs. |
| 8. Usage Map | | | |
| Interface Agreements | Data map over producing and consuming systems | Optional | Interface Agreements act as the foundation for standardized, secure, and governed data exchanges, enabling seamless interoperability between systems while ensuring compliance, trust, and usability of Data Products. |
| 9. Implementation SOP | Ensure the implementation of the data product is in alignment with the applicable SOPs, e.g., that data products are correctly classified according to the Protecting and Handling Information SOP | Optional | Ensures all compliance requirements are met. |
| 10. Retention Policy | | | |
| Retention Time | Retention time for the content of the data | Required | Ensures compliance with legal and regulatory requirements and optimal data storage management. |

Expected Outcomes:

  • Complete requirements documentation in your chosen format (ADO stories, TIMS requirements, or markdown).
  • Risk assessments completed and documented.
  • Metadata validated against FAIR Data Product requirements
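
Validating metadata against the required attributes can be partly automated. The sketch below assumes metadata is held as a plain dictionary; the snake_case keys are hypothetical names for the attributes marked Required in the table above, and the function is not part of NNDM, Purview, or dc-release.

```python
# Hypothetical keys, one per attribute marked "Required" in the table above.
REQUIRED_ATTRIBUTES = [
    "data_owner", "data_steward", "business_impact",
    "data_sensitivity_classification", "contains_pii", "contains_gxp",
    "data_product_id", "data_product_name", "data_product_description",
    "source_id", "source_system_name", "it_system_owner",
    "data_model", "data_quality_measures", "retention_time",
]

# The table lists High, Medium, and Low as the risk classification levels.
BUSINESS_IMPACT_LEVELS = {"High", "Medium", "Low"}


def validate_metadata(metadata: dict) -> list[str]:
    """Return human-readable issues; an empty list means the draft passed."""
    issues = []
    for key in REQUIRED_ATTRIBUTES:
        value = metadata.get(key)
        # None and blank strings count as missing; False is a valid answer.
        if value is None or (isinstance(value, str) and not value.strip()):
            issues.append(f"missing required attribute: {key}")
    impact = metadata.get("business_impact")
    if impact and impact not in BUSINESS_IMPACT_LEVELS:
        issues.append(
            f"business_impact must be one of {sorted(BUSINESS_IMPACT_LEVELS)}: got {impact!r}"
        )
    return issues


# Illustrative draft with three deliberate problems.
draft = {
    "data_owner": "owner@example.com",
    "data_steward": "   ",                 # blank, so treated as missing
    "business_impact": "Critical",         # not one of High/Medium/Low
    "data_sensitivity_classification": "Confidential",
    "contains_pii": False,
    "contains_gxp": False,
    "data_product_id": "dp-0001",
    "data_product_name": "Sales summary",
    "data_product_description": "Monthly net sales by region.",
    "source_id": "src-0042",
    "source_system_name": "Example ERP",
    "it_system_owner": "line.manager@example.com",
    "data_model": "logical-model-v1",
    "data_quality_measures": ["completeness", "timeliness"],
    # retention_time intentionally left out
}

issues = validate_metadata(draft)
for issue in issues:
    print(issue)
```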

Step 3: Data Modelling#

Objective: Capture data modelling requirements for conceptual and logical data models. Physical data models are purely a design choice, determined by the underlying storage formats and data loading logic.

  1. Conceptual Data Modelling (if one already exists for the data domain, it can be reused)

    Conceptual Data Model

    • Identify and define key business entities within the data domain
    • Capture their high-level relationships and dependencies
    • Represent these entities and associations using any diagramming tool of choice, such as Miro, Whiteboard, etc. Optionally start in Erwin, as it preserves metadata and eases extension into the logical model
  2. Logical Data Modelling (mandatory; outlines the schematic representation of the critical data elements and their relationships)

    Logical Data Model

    • Refer to the conceptual model and expand its entities with attributes, data types, and business rules
    • Define primary and foreign key relationships and cardinality
    • Document assumptions and design decisions for traceability
    • Represent the above in a diagramming tool of choice, such as Miro, Whiteboard, etc. Optionally start in Erwin, as it preserves metadata
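
Alongside the diagram, the logical model can be written down as data so that key relationships are checked automatically. This is a minimal sketch under stated assumptions: the `Entity`, `Attribute`, and `ForeignKey` classes and the example entities (`Product`, `Batch`) are hypothetical and are not an Erwin export format.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    data_type: str
    nullable: bool = True


@dataclass
class ForeignKey:
    column: str        # column in the referencing entity
    ref_entity: str    # name of the referenced entity
    ref_column: str    # column in the referenced entity


@dataclass
class Entity:
    name: str
    attributes: list[Attribute]
    primary_key: list[str]
    foreign_keys: list[ForeignKey] = field(default_factory=list)

    def column_names(self) -> set[str]:
        return {a.name for a in self.attributes}


def check_model(entities: list[Entity]) -> list[str]:
    """Check that keys point at defined columns and existing primary keys."""
    by_name = {e.name: e for e in entities}
    errors = []
    for e in entities:
        missing_pk = [c for c in e.primary_key if c not in e.column_names()]
        if missing_pk:
            errors.append(f"{e.name}: primary key column(s) not defined: {missing_pk}")
        for fk in e.foreign_keys:
            if fk.column not in e.column_names():
                errors.append(f"{e.name}: foreign key column '{fk.column}' not defined")
            target = by_name.get(fk.ref_entity)
            if target is None:
                errors.append(f"{e.name}: references unknown entity '{fk.ref_entity}'")
            elif fk.ref_column not in target.primary_key:
                errors.append(f"{e.name}: '{fk.ref_entity}.{fk.ref_column}' is not a primary key")
    return errors


# Hypothetical one-to-many relationship: one Product has many Batches.
product = Entity(
    "Product",
    [Attribute("product_id", "string", nullable=False), Attribute("product_name", "string")],
    primary_key=["product_id"],
)
batch = Entity(
    "Batch",
    [
        Attribute("batch_id", "string", nullable=False),
        Attribute("product_id", "string", nullable=False),
        Attribute("manufactured_on", "date"),
    ],
    primary_key=["batch_id"],
    foreign_keys=[ForeignKey("product_id", "Product", "product_id")],
)

print(check_model([product, batch]))
```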

Step 4: Business Glossary and Data Contracts#

Objective: Establish standardized definitions and data agreements.

  1. Create Business Glossary:

    • Define all business terms and KPIs
    • Establish calculation methods for metrics
    • Document data lineage for key attributes
    • Create managed tags for NNDM compliance
  2. Develop Data Contracts:

    • Ingestion contracts: Define source data expectations
    • Processing contracts: Specify transformation rules
    • Output contracts: Define target data product structure
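
To make the idea of an output contract concrete, the sketch below expresses one as a small Python structure and checks sample rows against it. The contract layout, the `sales_summary` name, and the column names are illustrative assumptions; they do not represent the dc-release contract format.

```python
# Hypothetical output contract: column name -> expected type and nullability.
OUTPUT_CONTRACT = {
    "name": "sales_summary",
    "columns": {
        "region": {"type": str, "nullable": False},
        "period": {"type": str, "nullable": False},
        "net_sales": {"type": float, "nullable": True},
    },
}


def violations(contract, rows):
    """Return a description of every row-level breach of the contract."""
    found = []
    columns = contract["columns"]
    for i, row in enumerate(rows):
        for extra in sorted(set(row) - set(columns)):
            found.append(f"row {i}: unexpected column '{extra}'")
        for name, rule in columns.items():
            if name not in row:
                found.append(f"row {i}: missing column '{name}'")
                continue
            value = row[name]
            if value is None:
                if not rule["nullable"]:
                    found.append(f"row {i}: '{name}' must not be null")
            elif not isinstance(value, rule["type"]):
                expected = rule["type"].__name__
                actual = type(value).__name__
                found.append(f"row {i}: '{name}' expected {expected}, got {actual}")
    return found


# Illustrative sample: the second row breaks the contract twice.
sample_rows = [
    {"region": "EMEA", "period": "2024-Q1", "net_sales": 1250.0},
    {"region": None, "period": "2024-Q1", "net_sales": "n/a"},
]

result = violations(OUTPUT_CONTRACT, sample_rows)
for line in result:
    print(line)
```

The same structure could describe ingestion expectations; processing contracts would additionally need the transformation rules, which are out of scope for this sketch.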

Critical Success Factor

Ensure all stakeholders agree on business definitions before proceeding. Misaligned definitions are a leading cause of data product failures.

Expected Outcome: Approved business glossary and comprehensive data contracts

Step 5: Gap Analysis and Technical Assessment#

Objective: Identify potential implementation challenges and dependencies that arise after the feasibility phase.

Note

These requirements go beyond the high-level feasibility phase. They cover the detailed specifications teams need when designing, modifying, or expanding the solution.

Evaluate the following areas:

  1. Data Availability, Volume and Sensitivity:

    • Are required data sources and assets accessible? Could any of the required data assets (columns or tables) change the data classification?
    • Do source systems have necessary APIs or export capabilities?
    • Has the data volume changed since the initial feasibility assessment?
    • Is historical data available for the required timeframe?
  2. Infrastructure Readiness:

    • Are compute resources sufficient? (E.g., the data volume has changed.)
    • Do you need additional storage capacity?
    • Are networking connections established? (E.g., new sources are being added, or new services that can act as sources have been discovered.)
  3. Organizational Capabilities:

    • Are governance processes established? (E.g., if the data classification has changed because sensitive attributes were added that were not identified during feasibility, an appropriate governance process is required.)
    • Is a support structure in place? (E.g., if your product requires 24x5 support but no team exists, either lower the criticality or establish support processes first.)

Gap Analysis Template

For each identified gap:

  • Gap Description: What is missing?
  • Impact: How does this affect delivery?
  • Resolution: What needs to be done?
  • Owner: Who is responsible for resolution?
  • Timeline: When must this be resolved?
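
The template fields above map naturally onto a small record type, which makes it possible to list gaps that still need attention. This is a sketch only; the `Gap` class, the `needs_attention` rule, and the example entries are hypothetical and not an ADO or TIMS schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Gap:
    description: str   # What is missing?
    impact: str        # How does this affect delivery?
    resolution: str    # What needs to be done?
    owner: str         # Who is responsible for resolution?
    due: date          # When must this be resolved?
    resolved: bool = False


def needs_attention(gaps, today):
    """Open gaps that lack an owner or resolution, or are past their due date."""
    flagged = []
    for g in gaps:
        if g.resolved:
            continue
        if not g.owner or not g.resolution or g.due < today:
            flagged.append(g)
    return flagged


# Illustrative entries only.
gaps = [
    Gap("No API on source system", "Blocks ingestion design",
        "Request nightly file export", "Data Engineer", date(2025, 2, 14)),
    Gap("Support team not defined", "No 24x5 coverage", "", "", date(2025, 4, 1)),
    Gap("Historical data limited to 2 years", "Trend KPIs shortened",
        "Agree reduced scope", "Product Owner", date(2025, 3, 31)),
]

flagged = [g.description for g in needs_attention(gaps, today=date(2025, 3, 1))]
print(flagged)
```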

Expected Outcome: Documented gaps with resolution plans, and an updated project scope in the project management tool of choice (e.g., ADO).

Step 6: Scope Definition and Finalization#

Objective: Establish clear project boundaries and deliverable definitions.

  1. Define what's included:

    • Specific data sources to be integrated
    • Transformations to be implemented
    • Outputs to be delivered
    • Support and maintenance responsibilities
  2. Define what's excluded:

    • Out-of-scope data sources or requirements
    • Future enhancements not part of initial delivery
    • Dependencies resolved by other teams
  3. Create acceptance criteria for each major deliverable

  4. Establish success metrics for the overall data product

Expected Outcome: Signed-off requirements specification with clear scope boundaries

Step 7: dc-release/CI-CD Feature File Creation#

Objective: Translate requirements into dc-release-compatible feature files for CI/CD implementation.

  1. Structure feature files following dc-release standards
  2. Define scenarios for each major requirement
  3. Link to supporting documentation
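
The three steps above can be illustrated with a small generator. This sketch assumes feature files use Gherkin-style `Feature` / `Scenario` / `Given` / `When` / `Then` keywords; this chapter does not describe the dc-release standards, so treat the layout, the `render_feature` helper, and the example requirement `REQ-012` as hypothetical and check them against the dc-release documentation.

```python
def render_feature(title, description, scenarios):
    """Render a Gherkin-style feature file as text (assumed layout)."""
    lines = [f"Feature: {title}", f"  {description}", ""]
    for s in scenarios:
        lines.append(f"  Scenario: {s['name']}")
        for keyword in ("given", "when", "then"):
            lines.append(f"    {keyword.capitalize()} {s[keyword]}")
        if s.get("docs"):
            # Link to supporting documentation, kept as a comment line.
            lines.append(f"    # See: {s['docs']}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# One scenario for one major requirement; all values are illustrative.
text = render_feature(
    "Daily sales summary",
    "Requirement REQ-012 (hypothetical): net sales are aggregated per region.",
    [
        {
            "name": "Aggregate net sales by region",
            "given": "validated order lines for the reporting day",
            "when": "the summary transformation runs",
            "then": "one row per region is written to the output port",
            "docs": "docs/requirements/REQ-012.md",
        }
    ],
)
print(text)
```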

Expected Outcome: Complete feature files ready for dc-release pipeline integration

Success Metrics & Checkpoints#

  • Stakeholder Sign-off: All key stakeholders have approved their respective requirement areas
  • Complete Documentation: Business glossary, data contracts, and technical specifications are documented
  • Gap Resolution: All identified gaps have approved resolution plans with assigned owners
  • Scope Clarity: Project boundaries are clearly defined and accepted by all parties
  • dc-release Integration: Feature files are created and validated in the dc-release pipeline
  • Compliance Review: Security and governance requirements have been reviewed and approved

Next Steps#

After completing requirements specification:

  1. Begin technical design based on approved requirements
  2. Start data marketplace registration process
  3. Initialize dc-release pipeline with your feature files

The next chapter will guide you through Data Marketplace Registration to ensure your data product is properly catalogued and discoverable.

Additional Resources#