First Accommodate Master Data, Then Clean It

In this blog post, I want to challenge a deeply held notion of Data Quality and Master Data Management.

I have had many, many conversations with technology professionals seeking to implement MDM in their organizations. One of the first questions they ask is a complex one disguised as a simple one: How can I start with clean data?

Listen: if you try to start your Master Data implementation with clean master data – you will never get started!

Instead, you need to embrace two fundamental realities of Master Data Management. First, there is no clear authoritative source for your master data (if there were, you wouldn’t have a problem). Second, Data Quality is “Front of House” work. The IT department may have data integration, data profiling, third-party reference data and matching algorithms in its toolbox, but those tools can only do so much. IT tools are Back Office tools, and IT data cleanups happen in the shadows. When IT gets a cleanup wrong, it gets it very wrong and comprehensively wrong (and the explanation is hard to understand).

The following sequence of events is straightforward, enables the business to take ownership, and provides a clear path to getting started.

  • Accommodate your Data – for business people to steward and govern their own data, they need to see it with their own eyes, and they need to see all of it, even the data they don’t like. To do this, you must:
    • Maintain a clear relationship between data in the MDM hub and its source – don’t attempt to reduce the volume of records. The Federated approach to MDM does this best.
    • Keep rationalization/mapping to a minimum – avoid cleaning the data as you load it. It’s wasteful to do it in ETL code when your MDM toolset is ready to do it for you much more easily.
    • Take a “Come as You Are” approach – avoid placing restrictions on the data at this stage of the project, because this only serves to keep data out of your system. We want the data in.
  • Establish Governance of your Data – once you have all of the data loaded into a Federated data model, you have the opportunity to start addressing the gaps.
    • First, take some baseline measurements. How incomplete is your data?
    • Next, begin developing rules which can be enforced in the MDM Hub. These rules should be comprehensible to a business user. Ideally, your toolset integrates the rules into the stewardship experience, so that rules declared in the hub are readily available to the data stewards. Once you have a healthy body of rules, validate the data and take another measurement to compare against the baseline.
    • Now your data stewardship team can get to work, and you’ll have real metrics to share with the business with regard to the progress you are making towards data compliance.
  • Improve your Data – MDM toolsets automate the process of improving master data sourced from different systems. They do this in three ways:
    • Standardize your Data – MDM tools help data stewards establish standards and check data against those standards
    • Match your Data – MDM tools help data stewards find similar records from multiple systems and group them into clusters. The group becomes the “Golden Record” – none of the sources gets to be the boss!
    • Harmonize your Data – MDM tools help data stewards make decisions about which sources are most authoritative and can automate the harmonization of data within a grouping
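
To make the Accommodate and Govern steps concrete, here is a minimal sketch in Python. It is not taken from any particular MDM product: the `SourceRecord` class, the field names, the sample records and the two rules are all hypothetical, chosen only to illustrate the idea. Records are loaded exactly as the sources supplied them, each one keeps a pointer to its source, and measurement and rule checks happen after loading rather than during it.

```python
from dataclasses import dataclass


@dataclass
class SourceRecord:
    source: str      # originating system, kept so stewards can trace the record
    source_id: str   # the record's key in that system
    attrs: dict      # attributes exactly as the source supplied them


def is_filled(value):
    """True when a value is present and not just whitespace."""
    return bool(str(value or "").strip())


def completeness(records, fields):
    """Fraction of records with a filled value, per field."""
    total = len(records)
    if total == 0:
        return {f: 0.0 for f in fields}
    return {f: sum(is_filled(r.attrs.get(f)) for r in records) / total
            for f in fields}


# Hypothetical rules, named in plain language so a business user can read them.
RULES = {
    "email is present": lambda r: is_filled(r.attrs.get("email")),
    "country is a 2-letter code":
        lambda r: len(str(r.attrs.get("country") or "").strip()) == 2,
}


def validate(records, rules):
    """Map each rule name to the (source, source_id) pairs that fail it."""
    return {name: [(r.source, r.source_id) for r in records if not check(r)]
            for name, check in rules.items()}


# "Come as You Are": nothing is rejected or cleaned on the way in.
records = [
    SourceRecord("crm", "C-1", {"name": "Acme Corp", "email": "ap@acme.example", "country": "US"}),
    SourceRecord("erp", "900", {"name": "ACME Corporation", "email": "", "country": "United States"}),
    SourceRecord("billing", "B7", {"name": "Globex", "email": None, "country": "DE"}),
]

baseline = completeness(records, ["name", "email", "country"])
failures = validate(records, RULES)
print(baseline)
print(failures)
```

Because every failure is reported with its source and source key, a steward can see which system a non-compliant record came from, and rerunning `completeness` and `validate` after stewardship work gives the follow-up measurement to compare against the baseline.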

Organizations whose starting approach with MDM is “Get the data clean and Keep the data clean” often fail to even get started. Or worse, they spend a lot of time and money requiring IT to clean the data, and then abandon the project after six months with nothing to show for it. Clean, then Load is the wrong order. Flip the Script and stick to these principles.

  1. Design a Federated MDM Data model which simplifies identity management for the master data.
  2. Identify where your master data lives and understand the attributes you want to govern initially.
  3. Bring the master data in as it exists in the source systems.
  4. Remove restrictions to loading your data.
  5. Establish some baseline measurements.
  6. Devise your initial rule set.
  7. Use MDM Stewardship tools to automate standardizing, matching and harmonizing.
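
As a rough illustration of the last step, the sketch below standardizes, matches and harmonizes a handful of records in Python. Everything in it is an assumption made for the example: the `SOURCE_PRIORITY` ranking, the suffix list, the field names and the sample records are hypothetical, and matching on an exact standardized name is a deliberate simplification (real MDM toolsets typically offer fuzzy matching and steward review).

```python
from collections import defaultdict

# Hypothetical authority ranking agreed by the stewards; lower number wins.
SOURCE_PRIORITY = {"crm": 0, "erp": 1, "billing": 2}
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "ltd", "llc"}


def standardize_name(name):
    """Lowercase, replace punctuation with spaces, drop common legal suffixes."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " "
                      for ch in name.lower())
    return " ".join(w for w in cleaned.split() if w not in LEGAL_SUFFIXES)


def match(records):
    """Group records whose standardized names are identical."""
    groups = defaultdict(list)
    for rec in records:
        groups[standardize_name(rec["name"])].append(rec)
    return dict(groups)


def harmonize(group, fields):
    """Build a golden record: per field, take the first non-empty value
    from the most authoritative source. Source records are left untouched."""
    ranked = sorted(
        group,
        key=lambda r: SOURCE_PRIORITY.get(r["source"], len(SOURCE_PRIORITY)),
    )
    golden = {f: next((r[f] for r in ranked if r.get(f)), None) for f in fields}
    # Keep the link from the golden record back to every contributing source.
    golden["members"] = [(r["source"], r["source_id"]) for r in ranked]
    return golden


records = [
    {"source": "erp", "source_id": "900", "name": "ACME Corporation", "phone": "555-0100"},
    {"source": "crm", "source_id": "C-1", "name": "Acme Corp.", "phone": ""},
    {"source": "billing", "source_id": "B7", "name": "Globex Ltd", "phone": "555-0199"},
]

groups = match(records)
golden = harmonize(groups["acme"], ["name", "phone"])
print(golden)
```

Note that the golden record here takes its name from one source and its phone number from another: authority is decided attribute by attribute within the group, and no single source record is overwritten or discarded.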
