Design#

By now the team is aligned and the stakeholders have spoken. It is time for the Solution Architects to step into the arena and build a scalable, secure system that satisfies the business, governance, and technical requirements.

Roles Involved#

Accountable: Product Owner
Responsible: Solution Architect
Consulted: Business Analyst, Data Governance Lead, Data Engineer

Key Ideas

Solution Design is conducted to:
Address architectural gaps (including data modeling) that prevent implementation from meeting the Requirements Specification.
Apply best engineering and data practices to enhance the solution's supportability, data quality, performance, and readiness for consumption.
Solution Design results are documented as an Architecture Decision Record (ADR).
Every ADR must undergo Design Authority review, a collective evaluation by solution or platform architecture owners and authorities.
All Design Authority outcomes, especially approvals, must be documented with clear confirmation, such as an email or pull request approval.

1. Inputs#

What you need to begin designing your MDM system

Category	Input	Owner/POC
Approved Requirements	- URS - Functional Requirements Specification (FS)	Business Analyst
Data Assets	- Data profiling reports - Source system metadata and entity models	Business Analyst
Technical Architecture	- High-level infrastructure and integration architecture - API and connector specifications	Architect
Security & Compliance	- Data classification - User access requirements - Regulatory constraints (e.g., GDPR)	Business Analyst
Use Cases & Scope	- Defined MDM domains (Customer, Product, Vendor, etc.) - Golden record rules - Stewardship requirements	Business Analyst
Best Practices and Guardrails	- Documented best practices and design guardrails for key processes, including data quality rules, ingestion, match and merge, and data publishing - Naming standards adopted across Novo Nordisk for MDM design consistency - Related pages: integration-guardrails, match-and-merge-best-practices, naming-standards	Architect

2. Process#

Activities that define the technical and logical architecture of your MDM solution

Step	Description	Know more
Architecture Decision Record (ADR)	- Record all architectural decisions.	ADR process flow developed by the GDAI Architecture team: ADR process
Logical Data Modeling	- Define core entities, attributes, relationships (Customer, Product, etc.) - Normalize vs. denormalize structure	Data-modelling
Physical Data Modeling	- Translate the logical model into a document model for MDM SaaS	Data-modelling
Design Ingestion	- Define how data from source systems is brought into the MDM system - Design ingestion data flows using CAI/CDI/CDQ based on the source-to-target mapping	Ingestion pattern
Design Data Quality Rules	- Configure data quality rules, metadata-driven data quality rules, and data quality KPIs	Data-Quality
Design Error Management	- Error Categorization, Error Collection, Error Notification, Error Resolution	Error-management
Design Match & Merge Rules	- Configure match keys, rulesets, and survivorship strategies	match-and-merge
Security & Roles	- Design user roles, data access layers, and stewardship access controls	Security-and-roles
Workflow & UI Design	- Design custom workflows and task flows for data stewardship - Page layouts in Informatica MDM UI	business-events
Publish Design	- API design, real-time sync, events, batch mode - Data flow between MDM and consuming apps	Publish-pattern
Test Plan	- Identify test objectives based on the business and technical requirements - Define test scope, strategy, and types - Outline test environment, roles, tools, and schedules for execution - Document entry/exit criteria, risks, and deliverables in the test plan	Test-Plan

3. Output#

Artifacts produced and reviewed at the end of the Design phase

Output	Description	Documentation Template
Logical & Physical Data Models	- ER diagrams, data definitions, data types	Data-modelling
Solution Design Document	- Consolidated blueprint of architecture, data flow, rules, match rules, interfaces, dependencies, role matrix, and access policies	Solution-Design-Document
Integration Design Document	- API specs, Swagger documents, Kafka schemas, data contracts, orchestration flows	API-Specification-GxP
Design Specification Document (DS)	- Document the low-level design, where each document covers a separate component.	DS-GxP
Data Contracts	- Formal document that defines the structure, format, semantics, quality, and terms of use for exchanging data between a data provider and its consumers.	Data-contract-sample-country
Design Sign-Off	- Stakeholder and IT leadership approval to proceed to the build phase.	Solution-Design-Document
Test Plan	- Define test plan and scope	Test-Plan-GxP
Requirements Traceability Matrix	- Update requirements traceability matrix to map and track the design and test cases to the requirements.	Requirement Traceability Matrix

What Happens Next?#

With the Requirements Specification and Design Documents finalized, the team proceeds to:
- Shape the delivery backlog based on the defined scope and technical design.
- Begin detailed Implementation planning and execution.

This phase ensures that development activities are grounded in clear, approved specifications and aligned with stakeholder expectations.

Guardrails#

Guardrails for Master Data Management (MDM) integrations ensure that integrating, synchronizing, and maintaining master data across systems follows best practices and avoids common pitfalls. They serve as guiding principles, technical standards, and operational practices that keep the integration process secure, scalable, and aligned with organizational goals. The key guardrails that guide and govern MDM integrations are listed below.

1. System and Data Architecture Guardrails#

Core Principle: Design MDM integrations to be consistent, modular, and scalable, enabling seamless usage across the enterprise.
- Centralized Data Models: Standardize data models and schemas for integration to prevent mismatched definitions across teams or regions. Establish and use a canonical model for data movement between systems.
- Scalable Infrastructure: Design for scalability to accommodate growing master data volumes, expansion into new geographies, and onboarding of additional data domains.
- Decoupling of Systems: Avoid tight coupling of MDM and other operational systems, allowing systems to evolve independently.
- Initial Data Load: Build an initial data load framework to be reusable, enabling ad hoc re-execution when needed. Ideally, align it with the delta load framework so that future change management requires code updates in a single, centralized location.

2. Quality and Validation Guardrails#

Core Principle: Enforce data quality and validation checks at every stage of the integration lifecycle to maintain trusted and reliable master data.
- Metadata Driven: Create modular services based on rules that are defined as metadata. Orchestrate the business rules by reusing those services.
- Data Profiling: Assess source systems' data quality before integration; identify duplicates, missing fields, and inconsistencies.
- Validation Rules: Implement metadata-driven validation logic to clean bad data or prevent it from entering the MDM system. For example, verify the format and validity of email addresses and phone numbers. Use data standardization rules to format the data and error rules to reject invalid data.
- Search Before Create: Search for matches before creating a new record in MDM for further consolidation.
- Data Enrichment: Supplement incomplete master records by integrating with third-party enrichment services (e.g., Informatica Address Verification DaaS).
- Continuous Monitoring: Automate quality checks across the entire integration lifecycle using tools like Informatica Data Quality (IDQ) to ensure consistency and reduce manual oversight.

3. Security and Access Control Guardrails#

Core Principle: Ensure that the integration process adheres to enterprise security policies and protects sensitive master data.
- Access Control: Implement role-based access control (RBAC) for users accessing the MDM system and associated integrations.
- Data Encryption: All data at rest is encrypted, managed by the AWS cloud provider. Encryption in transit must be enforced based on the integration patterns, endpoints, and use cases involved.
- API Security: Secure the APIs used for integration with MuleSoft.
- Audit Logging: Log all integration activities and maintain immutable records for audits.
- Compliance with Privacy Regulations: Mask or anonymize personally identifiable information (PII) or sensitive data during integration if required by GDPR, HIPAA, or other regulations.

4. Integration Process Guardrails#

Core Principle: Build robust and adaptable integration pipelines to streamline data movement between systems.
- Real-Time vs. Batch: Use real-time integration for time-sensitive updates and batch integration for large-scale periodic updates.
  - For integration with operational systems such as CRM or ERP, consider real-time or near real-time integration patterns.
  - For analytical systems like data warehouses, batch integration is generally preferred.
- Error Handling: Implement robust error-handling mechanisms to address failures within ETL processes, API calls, or data loads.
- Retry Logic: Use automatic retry mechanisms for integration tasks that fail due to temporary issues (e.g., network outages or unavailable source systems).
- Monitoring and Alerts: Use proactive monitoring and alerting tools to identify integration issues early (e.g., failures in data synchronization or delayed workflows).
- Versioning and Backward Compatibility: Design integrations to handle changes in source/target systems by maintaining API versioning and backward compatibility.
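
The retry guardrail can be illustrated with a minimal Python sketch. It assumes that transient failures are signalled by a dedicated exception type; `TransientError`, `call_with_retry`, and the default attempt count and delays are hypothetical names and values chosen for illustration, not part of any Informatica or MuleSoft API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g., a network timeout)."""


def call_with_retry(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run the task, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except TransientError:
            if attempt == max_attempts:
                raise  # Retries exhausted: hand over to error handling.
            sleep(base_delay_s * 2 ** (attempt - 1))  # Waits 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```

Only errors classified as transient are retried; data or business rule errors would fail again on every attempt and are better routed straight to error management.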

5. Data Synchronization Guardrails#

Core Principle: Keep master data in sync between the MDM system and all connected source/target systems.
- Hub-and-Spoke Architecture: Use MDM as the hub that serves as the central repository for all systems, rather than creating direct integrations between source systems (this avoids data silos and duplication).
- Bidirectional Synchronization: Enable two-way data synchronization, where both the MDM and source/target systems can update each other, if required by the use case.
- Conflict Resolution Rules: Define and enforce business rules for resolving conflicting updates across systems.
- Event-Driven Architecture: Implement event-driven updates (e.g., using Apache Kafka or Informatica's Real-Time Integration) for immediate synchronization when updates occur.
- Checkpoints and Reconciliation: Include checkpoints and reconciliation phases in the integration process for alignment verification.
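
A conflict resolution rule can be as simple as "the most trusted source wins, and the latest update breaks ties". The Python sketch below shows that one possible rule. The source ranking in `SOURCE_PRIORITY`, the `Update` type, and the `resolve_conflict` function are assumptions made for illustration; the actual rules must come from the business and are configured in the MDM platform, not in custom code like this.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical source ranking: a lower number wins. Unknown sources rank last.
SOURCE_PRIORITY = {"CRM": 1, "ERP": 2, "WEB": 3}


@dataclass(frozen=True)
class Update:
    source: str
    value: str
    updated_at: datetime


def resolve_conflict(updates: list[Update]) -> Update:
    """Pick the surviving update: best-ranked source first, then most recent."""
    if not updates:
        raise ValueError("no updates to resolve")
    return min(
        updates,
        key=lambda u: (SOURCE_PRIORITY.get(u.source, 99), -u.updated_at.timestamp()),
    )
```

For example, given an ERP update from 2 May and two CRM updates from 1 May and 3 May, the CRM update from 3 May survives under this rule.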

7. Performance Optimization Guardrails#

Core Principle: Ensure integration pipelines deliver consistent and scalable performance as data volume grows.
- Pushdown Optimization: Optimize ETL/ELT processes to harness database engines for better performance (e.g., database-native data transformation in Informatica).
- Partitioning and Parallelism: Distribute large master data processing workloads across multiple processes to ensure timely delivery.
- Incremental Loading: Avoid full data loads unless necessary; use incremental data loads based on timestamps (e.g., only load updated or newly added records).
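
Timestamp-based incremental loading is commonly implemented with a stored watermark. The sketch below assumes each source row carries an `updated_at` timestamp; the `select_delta` function and the row layout are hypothetical and stand in for whatever filter the ingestion tool applies.

```python
from datetime import datetime


def select_delta(
    rows: list[dict], last_watermark: datetime
) -> tuple[list[dict], datetime]:
    """Return rows changed after the watermark, plus the new watermark."""
    delta = [r for r in rows if r["updated_at"] > last_watermark]
    # If nothing changed, keep the old watermark so no records are skipped later.
    new_watermark = max((r["updated_at"] for r in delta), default=last_watermark)
    return delta, new_watermark
```

The new watermark should be persisted only after the delta has loaded successfully, so a failed run picks up the same records on the next attempt.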

8. Stakeholder Alignment Guardrails#

Core Principle: Ensure seamless collaboration between all business and IT stakeholders.
- Defined Roles and Responsibilities: Clearly articulate roles for IT teams, system owners, and business users involved in master data integration.
- Business Alignment: Regularly revisit business case objectives and confirm that MDM integrations are meeting KPIs (e.g., reduction of data duplication or regulatory compliance metrics).
- Collaboration Between Teams: Facilitate ongoing communication between IT architects, data engineers, and end users regarding integration workflows and any changes.

9. Continuous Improvement and Documentation Guardrails#

Core Principle: Promote ongoing refinement and maintain comprehensive documentation.
- Document Integration Pipelines: Keep an up-to-date inventory of all data sources, integration workflows, and system touchpoints for troubleshooting and future updates.
- Analyze Metrics: Regularly review metrics such as data synchronization speed, error rates, and volume processed to identify optimization opportunities.
- Feedback Loops: Act on feedback from business users regarding integration challenges or gaps in functionality.
- Iterative Updates: Adopt an agile, iterative approach to fine-tune integration logic over time.

Data Quality Rules#

Data quality rules must be centrally defined and stored as metadata (e.g., rule name, logic, threshold, severity).
Rules should be domain-aware and mapped to business entities and attributes (e.g., HCP.Email must be valid format).
Enable rule cataloging and version control, making them easy to govern, audit, and extend.
Rules should be atomic and reusable across data domains (e.g., email validation used for both HCP and HCO).
Use a library-based approach where common DQ patterns (e.g., null checks, range checks, regex) are predefined.
Allow rules to be invoked via APIs or service endpoints (e.g., DQ-as-a-Service) so external systems can reuse them for validation at the source.
Records that fail rules should be logged and reported.
Design rules to run both inline (real-time) for API-based ingestion and offline (batch) for bulk loads and periodic checks.
Maintain full audit logs of rule execution results, rule changes, and remediation actions.
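
The points above can be sketched in Python as a small metadata-driven rule engine: rules are stored as data, mapped to an entity and attribute, and built from a shared library of atomic checks. All names here (`CHECKS`, `Rule`, `RULES`, `evaluate`) and the simplified email pattern are assumptions for illustration only; in the MDM platform the rules would be configured in the data quality tooling, not hand-coded.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable

# Library of reusable, atomic checks (illustrative names).
CHECKS: dict[str, Callable[[Any, dict], bool]] = {
    "not_null": lambda v, p: v is not None and v != "",
    "regex": lambda v, p: isinstance(v, str)
    and re.fullmatch(p["pattern"], v) is not None,
    "range": lambda v, p: v is not None and p["min"] <= v <= p["max"],
}


@dataclass(frozen=True)
class Rule:
    name: str
    entity: str
    attribute: str
    check: str
    params: dict
    severity: str


# Deliberately simplified email pattern, shared by the HCP and HCO rules.
EMAIL = {"pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"}

RULES = [
    Rule("hcp_email_format", "HCP", "Email", "regex", EMAIL, "High"),
    Rule("hco_email_format", "HCO", "Email", "regex", EMAIL, "High"),
    Rule("hcp_name_required", "HCP", "Name", "not_null", {}, "Critical"),
]


def evaluate(entity: str, record: dict) -> list[dict]:
    """Return one failure entry per rule that the record violates."""
    failures = []
    for rule in RULES:
        if rule.entity != entity:
            continue
        value = record.get(rule.attribute)
        if not CHECKS[rule.check](value, rule.params):
            failures.append(
                {"rule": rule.name, "attribute": rule.attribute, "severity": rule.severity}
            )
    return failures
```

Because `evaluate` is a plain function over one record, the same rules can be called inline from an API-based ingestion path or looped over a batch file.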

Error Management#

Key Ideas

Use consistent error schema across ingestion, match, workflow, and publishing stages.
Expose key errors through dashboards and APIs for visibility.
Ensure traceability of errors back to the record and source system.
Design for resilience—fail gracefully and isolate issues to prevent cascading failures.

This section describes the approach for designing an error management framework, which is a critical part of any Master Data Management (MDM) system. The framework should be designed to:

Capture Errors Across All Stages
* Ensure errors are captured at all key stages: ingestion, consolidation, and publish.
* All errors (system, data, integration, rule violations) must be captured with context, including source system, timestamp, entity and attribute, and processing stage (e.g., ingestion, consolidation, publish).
* Use structured log formats (e.g., tables or JSON) for analysis and integration with monitoring tools (Splunk).
Categorize Errors
- Classify errors to support proper handling and resolution. Common categories include:
  - System Errors – Connectivity, infrastructure failures
  - Data Errors – Invalid formats, missing required fields
  - Business Rule Errors – DQ rule failures, reference lookup issues
  - Process Errors – Workflow routing, match/merge issues
- Include severity levels (Critical, High, Medium, Low) and disposition flags (retryable, needs manual fix, ignorable).
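
A consistent error schema that carries the context, category, severity, and disposition described above could look like the following Python sketch. The field names and the `ErrorRecord` and `to_json` helpers are hypothetical; the actual schema should be agreed in the design and aligned with the monitoring tool's expected format.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class Category(Enum):
    SYSTEM = "system"
    DATA = "data"
    BUSINESS_RULE = "business_rule"
    PROCESS = "process"


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class Disposition(Enum):
    RETRYABLE = "retryable"
    MANUAL_FIX = "needs_manual_fix"
    IGNORABLE = "ignorable"


@dataclass(frozen=True)
class ErrorRecord:
    source_system: str
    stage: str  # e.g., ingestion, consolidation, publish
    entity: str
    attribute: str
    record_id: str  # Keeps the error traceable to the source record.
    category: Category
    severity: Severity
    disposition: Disposition
    message: str
    timestamp: str  # ISO 8601 string, assumed to be in UTC.


def to_json(err: ErrorRecord) -> str:
    """Serialize to one JSON line for a log shipper or an error table."""
    payload = asdict(err)
    for key in ("category", "severity", "disposition"):
        payload[key] = payload[key].value  # Replace enum members with their values.
    return json.dumps(payload, sort_keys=True)
```

Using the same record shape in every stage lets dashboards and reprocessing jobs filter on category, severity, and disposition without stage-specific parsing.
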
Report Errors
- Design a reporting framework to:
  - Notify source systems of data issues
  - Alert data stewards to address and resolve errors
  - Inform consumer systems when applicable
- Include error resolution guidelines and history for audit and reuse.
Reprocess Resolved Errors
Once an error is fixed, an automated mechanism or pipeline should reprocess the affected records.
Example: If records are rejected due to missing reference data, and the reference data is subsequently added, the system should re-ingest those records from either the source or a staging layer.
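
The missing reference data example can be sketched as follows. This Python snippet assumes rejected records are held in a staging list and that each record carries a `country_code` that must exist in the reference data; the `reprocess` function, the field name, and the `ingest` callback are hypothetical.

```python
from typing import Callable


def reprocess(
    staged: list[dict],
    reference_codes: set[str],
    ingest: Callable[[dict], None],
) -> list[dict]:
    """Re-ingest staged records whose reference code now exists.

    Returns the records that are still blocked and stay in staging.
    """
    still_blocked = []
    for record in staged:
        if record["country_code"] in reference_codes:
            ingest(record)  # The reference data was added, so retry the record.
        else:
            still_blocked.append(record)
    return still_blocked
```

A scheduled job or a reference data change event could trigger this step, so stewards do not need to resubmit records manually.
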
Archive Errors
Archiving helps keep the error data store clean and manageable, improving performance and ensuring maintainability.
* Archive resolved errors with full context and resolution logs in a read-only audit store (e.g., cloud object storage or database).
* Define retention periods based on compliance needs (e.g., 2 years for GxP).
* Periodically purge soft errors or low-priority alerts to maintain performance.

Security and Roles#

Platform Level Access - IDMC Security#

User	Group	Role
A user is an individual account used to access IDMC. Users are authenticated via single sign-on.	A group is a collection of users with similar responsibilities. Groups act as an intermediary between users and roles. A user can belong to multiple groups and inherit combined privileges.	A role is a predefined set of privileges within the IDMC platform. Roles determine what actions can be performed.

Key Ideas

Always assign roles to groups, and not individual users, to ensure scalable and manageable access control across the platform.
A group is assigned one or more roles.
A user is a member of at least one group.
No user should have roles assigned directly ("direct" access).
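
The user, group, and role relationship can be modelled in a few lines of Python. This is a conceptual sketch only: the group names and user names are hypothetical, and in practice the assignments are managed in the Administrator service, not in code.

```python
# Hypothetical groups and users; the role names are taken from this page.
GROUP_ROLES = {
    "mdm-designers": {"Designer", "MDM Designer"},
    "mdm-operators": {"Job Executor"},
}
USER_GROUPS = {
    "alice": {"mdm-designers", "mdm-operators"},
    "bob": {"mdm-operators"},
}


def effective_roles(user: str) -> set[str]:
    """Union of the roles of every group the user belongs to."""
    roles: set[str] = set()
    for group in USER_GROUPS.get(user, set()):
        roles |= GROUP_ROLES.get(group, set())
    return roles


def users_without_group(users: list[str]) -> list[str]:
    """Flag accounts that break the 'at least one group' rule."""
    return [u for u in users if not USER_GROUPS.get(u)]
```

Because roles attach only to groups, changing what a team can do is a single edit to the group, and a periodic check like `users_without_group` can flag accounts that bypass the model.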

Business 360 Access and Role Assignment in SaaS#

A user is an individual account in Informatica Intelligent Cloud Services.
Customer 360 and Business 360 Console users must have an Informatica Intelligent Cloud Services user account.
You can create and manage users in the Administrator service.
A role is a collection of privileges that you assign to users to allow access to Customer 360 and Business 360 Console. It controls access to business entities, relationships, and hierarchies, and grants permissions to create, read, update, and delete records.
Data Access Rules (Record-based) limit access to a subset of records in a record store based on conditions.
Data Access Rules (Attribute-based) restrict access to specific attributes based on user roles and attribute values. Attribute-based rules are always deny rules.

In SaaS, after a user is created, the user must be assigned the relevant roles to access services. Developing solutions requires access to the Business 360 service. The user roles listed below have the privileges to access the Business 360 service.

Admin#

Provides full access to Business 360 Console.
Allows users to perform solution upgrades.
Allows users to create reports in business applications.

Designer#

Allows users to perform tasks in Business 360 Console that require integration with other services such as Data Integration and Data Quality.

MDM Designer#

Allows users to manage MDM SaaS-specific assets in Business 360 Console such as:
Modelling
Match-merge configuration
Ingress/egress job definition

Business 360 Process Executor#

Allows users with custom roles to perform tasks in Business 360 Console and business applications such as:
Import records
View hierarchy
Workflow tasks

Job Executor#

Allows users to run ingress and egress jobs in Business 360 Console.

To access and act on the Master Data—whether mastered in a pre-configured OOTB service (e.g., Customer360) or a custom-built service (e.g., Multidomain MDM)—roles must be assigned. These may be predefined or custom.

Example Roles for Customer 360 Service#

Customer 360 Analyst#

Create and edit records in Customer 360.
Changes trigger a review process requiring approval from a Customer 360 Manager.

Customer 360 Manager#

Review, approve, or update customer records.
Can also create/edit records without needing approval.

Customer 360 Data Steward#

Create records and hierarchies.
Can edit records without approval, run jobs, and approve customer records.

MDM Business User#

View-only access to records in Customer 360.
Cannot create or edit records.

Key Considerations:

Ensure all Business 360 processes under Application Integration are in published state.
Org should be upgraded to the latest release of Business 360.
Org state should ideally be ORG_PROVISIONED (optional).
Activate the Admin user via CSM and assign:
The admin role
A group to users setting up the application.
Create new projects for additional assets like:
Mappings
Mapping tasks
Taskflows
Other MDM SaaS assets
Avoid using predefined projects like Business360, Reference360, or Customer360 for this purpose.
Do not add any permissions to the predefined projects on the Explorer page in MDM SaaS.
Data access rules are only available for custom roles, not out-of-the-box roles.
Fields used in access rule conditions must be searchable.
Use only one access type when configuring rules.
The simplest and most transparent configuration for record-based access is to use an "allow all" rule.
Attribute-based rules are always deny rules.
The role must have access to the smart field or field group before configuring attribute-based rules.
Excessive rules on a business entity may degrade search performance.
With data access rules defined, the search result page shows up to 1,000 records (versus 10,000 without rules).
Multiple rules can be combined using the AND condition.
For OR conditions, create separate roles, each with relevant data access rules. Assign both roles to the user or group to simulate OR logic.
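
The AND/OR behaviour described in the last two points can be modelled conceptually in Python. This sketch assumes that rules within a role are combined with AND and that a user holding several roles gets access when any one role allows it; the role names, record fields, and `can_access` function are hypothetical and do not represent the platform's actual evaluation engine.

```python
from typing import Callable

AccessRule = Callable[[dict], bool]

# Hypothetical custom roles; each role's rules are combined with AND.
ROLE_RULES: dict[str, list[AccessRule]] = {
    "dk_customers": [
        lambda r: r["country"] == "DK",
        lambda r: r["type"] == "Customer",
    ],
    "se_customers": [
        lambda r: r["country"] == "SE",
        lambda r: r["type"] == "Customer",
    ],
}


def can_access(record: dict, roles: list[str]) -> bool:
    """Access is granted if ANY assigned role passes ALL of its rules."""
    return any(all(rule(record) for rule in ROLE_RULES[role]) for role in roles)
```

A user with both roles can see Danish or Swedish customers, which is the OR effect obtained by assigning two roles instead of writing one rule.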
