1 Master Data Management (MDM)
ARKANSAS BLUE CROSS BLUE SHIELD (501)
Tuesday 9/25, 8:40 - 9:20
2 Robert Fox - Data Architect, Arkansas Blue Cross Blue Shield
- 17 years of data architecture and warehousing experience in the finance, telecom, and health insurance industries.
- Installed over 50 data warehouses worldwide, some loading more than 4.5 terabytes of new transactional data per day.
- Guest lecturer in information management master’s degree programs at 3 universities.
- Speaker at national conferences for Oracle, Fidelity, and MDM-CDI.
3 Abstract – MDM is bigger than you think
Master Data Management (MDM) is critical to effectively managing the information assets of your company. Unfortunately, the meaning of the term Master Data Management has become more limited in scope over time.
To understand the full scope of MDM, you must ask yourself two questions:
- What is “master data”?
- What is involved in “managing” it?
How you answer these two questions determines whether your MDM program is a departmental or enterprise solution, and whether you are managing the information or managing a process.
This presentation will propose a framework for MDM that is broad in terms of both data and functionality.
4 Why is Information Important? The Evolution of Competition
The Land Grab
- Acquiring unclaimed customers
- Investment in infrastructure, name recognition, and geographic territory
The Killer App
- Stealing customers from the competition by offering new and better products and services
- Investment in software development
The Information War
- Stealing customers from the competition, and retaining them, by knowing more about the customer and giving them individually customized treatment
- Investment in business intelligence, customizable offerings
We are now in the age of information.
If your nephew was going to college to study “computers,” you would tell him to…
5 First: What is “Master Data”?
[Diagram: applications App A, App B, App C … App Z, with three regions of data: application-specific, non-duplicated, non-shared data; data duplicated in multiple apps; other data shared with the enterprise.]
There seem to be three different definitions:
1. Master data is all the data, everywhere.
2. Master data is just the data that is duplicated in multiple applications.
3. Master data is any data that is exposed outside of an application, whether it is duplicated or not.
There has never been 100% consensus on what “master data” really is. In the early days of MDM, most people tended to think along the lines of definition 1. This proved, however, to be too far-reaching, both for the companies that were attempting to control data at this level and for the consultants and tool vendors trying to provide products and services.
In an effort to reduce the scope to something more manageable, the companies, consultants, and tool vendors narrowed the definition of master data dramatically. The result was a well-defined, manageable scope of data problems: the problem of having the same data (typically customer names and addresses) stored and independently maintained in multiple applications. As products and services became available to address this problem, the definition of master data became synonymous with this reduced scope. “We can help you with this master data problem” quickly changed to “we can help you with the master data management problem.”
The solutions themselves were not bad things. Synchronizing duplicate data is a very pressing need in almost every business, and a very complex problem to solve. If you are not solving this problem within your infrastructure, you are not doing a good job of supporting your business. The problem was not in the products themselves, but in the danger of thinking that solving this problem was the extent of master data management.
There are some consulting companies and tool vendors out there who still cast MDM as composed entirely of solving the customer name and address synchronization problem. After all, if that’s the tool they have in their toolbelt, then it’s to their advantage to describe MDM in this way. They can then say they have a complete solution. But this is not a complete solution.
Most professionals these days have broadened the definition of master data back out to definition 3, striking a healthy balance between too broad a scope (definition 1) and too narrow a scope (definition 2). Part of the drive came from businesses that, having solved the immediate problem of name and address synchronization, realized that there were still many other data management issues to address. Another driver of the expansion of the definition of master data came from vendors who were developing products and services for these other data management issues.
Gartner, Forrester, Information Management Magazine, the CDI-MDM group, and more advanced consultants and vendors such as Oracle and IBM now agree on definition 3. Internal data that is never exposed outside of an application is not considered master data, but any information that is exposed outside of the source application is master data, an enterprise asset that must be managed. If a particular data element is needed outside of an application for reporting, for pricing, for billing, for forecasting, for risk analysis, for commissions, or for any other purpose, then the data is an enterprise asset, not an application-specific asset.
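To make the three definitions concrete, here is a minimal sketch that classifies a few data elements under each one. The element names, application names, and the `DataElement` structure are hypothetical, and the sketch assumes that data duplicated across applications also counts as exposed under definition 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    name: str
    stored_in: frozenset  # applications that keep their own copy
    exposed: bool         # is it used outside its source application?


def is_master_data(element: DataElement, definition: int) -> bool:
    """Apply one of the three definitions of master data."""
    if definition == 1:   # all the data, everywhere
        return True
    if definition == 2:   # only data duplicated in multiple applications
        return len(element.stored_in) > 1
    if definition == 3:   # any data exposed outside an application
        # Assumption: duplicated data is treated as exposed.
        return element.exposed or len(element.stored_in) > 1
    raise ValueError(f"unknown definition: {definition}")


# Hypothetical data elements, for illustration only.
elements = [
    DataElement("customer_address", frozenset({"App A", "App B"}), True),
    DataElement("claim_paid_amount", frozenset({"App C"}), True),
    DataElement("screen_layout_preference", frozenset({"App A"}), False),
]

# Which elements fall in scope under each definition?
scope = {
    d: [e.name for e in elements if is_master_data(e, d)]
    for d in (1, 2, 3)
}
for d, names in scope.items():
    print(f"definition {d}: {names}")
```

Under definition 2 only the duplicated address is in scope, while definition 3 also picks up the claim amount that other functions need for reporting or billing, and still leaves purely internal data out.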
6 Second: What does it mean to “manage” data?
If someone asked you to describe your organization, you would know, because you learned it in kindergarten, that to completely answer the question you would have to address the who, what, where, when, why, and how, right? All of these questions can’t be answered in a single picture. There are, however, standard modeling techniques to depict the answers to each of those questions. To answer the “where” and “how,” you would use an environment diagram showing your various platforms and how data flows between them. The “why” should be in your mission statement. For the “who” and “what,” you would begin with a functional area diagram like the one shown here.
This is a very simplified version of our functional areas. I cut out a lot, and it’s still probably too small to read. But that isn’t really important. What’s important is that you document all the functions your organization performs. When you “manage” data, you are doing far more than just synchronizing it across multiple platforms. You also “manage” the security of the data, the quality of the data, the delivery of the data, the metadata describing the business data, and many more management functions.
This is a very useful diagram. When you do your strategic planning, you need to pull out this functional area diagram and make sure you have a strategy for ALL the functional areas in your organization. When you are assigning roles and responsibilities, you need to make sure everyone knows what they are and are not responsible for.
And when it comes to master data management, you need to make sure that you are managing ALL of these aspects of data. This conference is called the IM symposium. IM: Information Management. This diagram describes the breadth of what it means to manage information. If you don’t have this clearly documented somewhere, then I suggest that you literally do not know what you are doing.
So, now we’ve discussed what “master data” is, and what it means to “manage” data. We are now ready to discuss “master data management.”
7 First Generation IT Architecture
[Diagram: Application A, Application B … Application Z, each with its own User Interface, Business Logic, and Data (Customer, Product, Reporting), joined by Integration links.]
- Monolithic applications with proprietary, embedded data
- Custom point-to-point integration
- Very little ability to establish enterprise architecture
- Very expensive to add new application to mix
- Very expensive to test
- Very difficult to manage at enterprise level
Data is “owned” by the application. There is no such thing as enterprise data, data standards, standard integration interfaces, etc. There is a lot of “re-inventing the wheel” in each application.
8 The Evolution of Data Architecture
[Diagram: four stages. Monolithic: Custom Application, Custom Data. Modular: DB, Data Access, Business Logic, Presentation. 3 Tier Client/Server: DB Server, App Server, Client; DB, Data Access, Business Logic, Presentation. 5 Tier SOA: DB Server, App Server, Integration Bus, Web Server, Client; DB, Data Access Services, Business Services, Presentation.]
Modular: because everyone was tired of re-inventing the wheel in each application, standard tools were developed. Proprietary data stores were replaced by database APIs. Proprietary reporting engines were replaced by reporting APIs. Proprietary user interfaces were replaced by standard GUI APIs. And so on. But applications were still relatively monolithic, just built from standard components that were “compiled” into the application.
Client/Server: at this point, the embedded, compiled APIs for the various subsystems were replaced with discrete components, coupled together by interfaces. These interfaces were initially vendor-specific, but over time standards emerged: ODBC to connect database servers to app servers, HTML to connect app servers to front-end applications.
SOA: interfaces continued to standardize until all components at all layers communicated through the same standard interface: XML services. This standardization effectively decoupled data from applications (Data as a Service, or DaaS). Data can now be read directly by the presentation layer, or by other applications. Applications can even share the same data stores. Instead of each application having its own copy of customer names and addresses, there can theoretically be one single storage domain for customer name and address, existing independent of application business logic.
Monolithic architecture has evolved into loosely coupled functional specialty areas. Unfortunately, IT department organization charts have not kept pace with this architectural evolution.
Early on in this evolution, IT departments developed into Application Development groups and Technology/Infrastructure groups. Application Development did the software development and configuration, and the Technology groups handled all the hardware and the networking. At this point in the evolution of IT architecture, data management was buried within application development.
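The idea of one storage domain for customer name and address, shared by several applications, can be sketched in a few lines. This is an illustration only: `CustomerDataService`, `BillingApp`, and `ClaimsApp` are hypothetical names, and an in-memory dictionary stands in for what would really be a networked data service.

```python
class CustomerDataService:
    """Hypothetical shared service: the only store of names and addresses."""

    def __init__(self):
        self._records = {}

    def upsert(self, customer_id, name, address):
        self._records[customer_id] = {"name": name, "address": address}

    def get(self, customer_id):
        # Return a copy so callers cannot change the store by accident.
        return dict(self._records[customer_id])


class BillingApp:
    """Keeps no customer copy of its own; it reads through the service."""

    def __init__(self, customers):
        self._customers = customers

    def invoice_header(self, customer_id):
        c = self._customers.get(customer_id)
        return f"Invoice for {c['name']}, {c['address']}"


class ClaimsApp:
    """Also keeps no local copy; it writes changes through the service."""

    def __init__(self, customers):
        self._customers = customers

    def change_address(self, customer_id, new_address):
        current = self._customers.get(customer_id)
        self._customers.upsert(customer_id, current["name"], new_address)


service = CustomerDataService()
service.upsert("C001", "Jane Smith", "12 Main St")
billing = BillingApp(service)
claims = ClaimsApp(service)

# An update made through one application is seen by the other at once,
# because neither holds a private copy to fall out of sync.
claims.change_address("C001", "40 Elm Ave")
header = billing.invoice_header("C001")
print(header)
```

Because neither application owns the data, there is nothing to synchronize between them. That is the property the SOA stage makes possible.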
9 Effective MDM cannot be a component of application development
[Diagram: two IT organization charts, labeled “Traditional Location of Information Management Responsibility” and “Emerging Location of Information Management Responsibility.” Each shows IT with Application Development, Technology Systems, and Information Management.]
Application developers have an application-centric view. If they think about data at all, it is in the context of an application, not in the context of a separately manageable enterprise asset. Even if the application development manager “gets” MDM, when conflicts arise, their ultimate priority is to put out an application, not an enterprise data infrastructure.
Recognizing the emerging importance of information as a competitive differentiator and corporate asset requires recognizing the importance of information management.
Information management is an industry-recognized discipline, responsible for the collection, organization, quality, and delivery of a company’s information assets. The scope of responsibility is not limited to a development project or to a particular department’s information. It’s the whole enterprise.
2. Make sure your MDM program has the authority to be effective
10 Three legs on the IT “stool”
[Diagram: three views. Financial View: Profitability Now, Investment for the Future. Business View: Customer Value, Operational Efficiency, Strategic Planning; Increase Revenue, Reduce Expense, Grow the Company. IT View: Application (How?), Information (What?), Technology (Where?).]
The goal of IT is to support the business. The goal of the business is to make money.
Remember the evolution of competitiveness? Infrastructure, Feature/Function, Information? These competitive differentiators still exist within your IT departments. All three are necessary in order to support the business’s goals. If you don’t have information management working at this level of your organization, not only is your organization chart not aligned with your architecture, but you are not allowing your business to effectively compete on information.
You may have noticed that I’m no longer talking about MDM in the limited scope of an enterprise data warehouse. Ideally, MDM is not limited to analytical copies of data. A responsible IT organization is going to have a master data management program that manages ALL the enterprise information assets, not just the analytical information. That said, information management almost always begins with analytical data, probably because that’s the one area of the company whose main focus is on information, not on applications or technology.
If you are here representing an MDM program that is limited in scope to analytical information, and your company doesn’t have an enterprise MDM program, you should be thinking about setting up your program in such a way that, once you’ve proven your ability to manage information assets in the analytical space, you can expand your MDM program to the operational space without completely throwing it away and starting over.
11 MDM Components
OK, so now that we’ve defined what master data is and what it means to manage it, and now that we’ve stressed the need for MDM to exist outside of individual projects and outside of just analytical repositories, we can begin to discuss some of the components of MDM. We won’t have time to discuss all of them, of course, so I’m just going to pick a few that I think are particularly interesting or misunderstood.
12 Customer Data Integration (CDI)
[Diagram: three CDI configurations, numbered 1. Analytical CDI, 2. Synchronization, 3. System of Record. Labels: CDI, App, App A, App B, App C, WH.]
First, I do want to discuss customer data integration, or CDI. CDI comprises the tools and processes that address the issue of names and addresses being independently updated within the data stores of multiple independent applications.
Whether right or wrong, most companies initially implement CDI for analytic purposes only. The duplications and the missing or inaccurate data in the source systems are left unchanged, but the names and addresses are imported into the enterprise data warehouse, where they are cleansed, de-duplicated, and merged into a single golden copy of the name and address of each unique individual for analytic purposes.
Once the value of this cleansed data is realized, companies often see the next logical step as pushing the golden copy of the name and address back to each application. This usually begins as a batch process, but there is great value in performing CDI functions immediately as part of every name and address insert and update. The cost of going from batch updates to real-time updates lies less in the vendor solution than in the effort it takes to add the real-time capability within the application. In many cases, however, the CDI tools can now interface directly with the database via triggers or change log capture.
Of course, in order to provide this “golden copy” functionality, these CDI tools are usually configured in a way that implements a name and address database for matching and merging purposes. Since these tools generally have the capability to provide real-time synchronization services, they can often be accessed directly by applications and front-end interfaces as the system of record for name and address information. This leads to the final development in CDI architecture implementation, where you begin developing new applications without a name and address repository of their own, gradually decommission your application-local copies of name and address, and redirect the applications to the CDI hub as the system of record.
I’ve said that, in my experience, this is the way CDI typically rolls out. That doesn’t mean that this is the path every institution takes, or even that it is the best path to take. There is a cost in all these iterations that can be avoided if you have the foresight and wherewithal to jump directly over some of the intermediate steps. Your path may look different, but one way or the other, you need to solve this problem.
I should also say that vendor solutions vary. Some, rather than implementing a local “golden copy” database in the CDI solution, implement a federated solution instead, where the data remains spread across all of the source systems. I admit to a bias against this approach. I feel that it is too dependent on all of the connections being available all the time, I’ve found that federated solutions don’t perform as well as centralized ones, and I feel that a federated solution can never become the final system of record.
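As a rough illustration of the analytical step (cleanse, de-duplicate, merge into a golden copy), here is a minimal sketch. It is not how any particular CDI product works: commercial tools generally use fuzzy or probabilistic matching, while this sketch assumes an exact match on a normalized name and address. The field names, sample records, and the rule "newest non-empty value wins" are all assumptions for illustration.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = text.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def match_key(record: dict) -> tuple:
    # Assumption: two records are the same individual when their
    # normalized name and address are identical.
    return (normalize(record["name"]), normalize(record["address"]))


def build_golden_records(records: list) -> list:
    """Group matching records and merge each group into one golden copy."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)

    golden = []
    for group in groups.values():
        # Survivorship rule (an assumption): newest non-empty value wins.
        newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
        merged = {"sources": sorted(r["source"] for r in group)}
        for field in ("name", "address", "phone"):
            merged[field] = next(
                (r[field] for r in newest_first if r.get(field)), ""
            )
        golden.append(merged)
    return golden


# Hypothetical extracts from three source applications.
source_records = [
    {"source": "App A", "name": "Jane Q. Smith",
     "address": "12 Main St., Little Rock", "phone": "",
     "updated": "2012-03-01"},
    {"source": "App B", "name": "jane q smith",
     "address": "12 Main St Little Rock", "phone": "555-0100",
     "updated": "2011-07-15"},
    {"source": "App C", "name": "John Doe",
     "address": "9 Oak Ave", "phone": "",
     "updated": "2012-01-10"},
]

golden = build_golden_records(source_records)
for g in golden:
    print(g)
```

The source records are left unchanged, which corresponds to the analytical stage. Pushing each golden record back to App A and App B would be the synchronization stage, and having the applications read from the golden store directly would be the system-of-record stage.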
13 MDM Components: Enterprise Logical Data Model
[Diagram: enterprise logical data model. Subject areas: Party, Claim, Product. Entity labels: Member, Claim, Drug, Name, Header, Demographic, Service Line, Risk Adjustment, Provider, Payment, NPI, Specialty, Diagnosis, Clinic, Procedure, Location. Surrounding uses: DB Design, SOA Services, Mapping Sources, Defining System of Record, Define Stewards, Global Dictionary, Standard Terms, Document Bus. Relationships.]
Base it on the business view of data, not the application view.
Useful for:
- Designing a normalized warehouse distribution tier
- Designing SOA services
- Mapping existing app DBs to a common schema
- Defining system of record
- Defining data stewards
- Creating an enterprise data dictionary
- Agreeing on terminology, standardizing names
- Agreeing on business concepts
An ELDM is not a database. It is a conceptual model from which databases and services are derived.
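One of the uses listed above, mapping existing application databases to a common schema, can be sketched as a simple column mapping. Everything here is hypothetical: the application names, the column names, and the three-attribute Member entity are invented for illustration and are not taken from an actual ELDM.

```python
# Attributes of a hypothetical enterprise-level Member entity.
ENTERPRISE_MEMBER = ("member_id", "name", "birth_date")

# Per-application mapping from local column names to enterprise names.
SOURCE_MAPPINGS = {
    "claims_app": {
        "MBR_NO": "member_id",
        "MBR_NM": "name",
        "DOB": "birth_date",
    },
    "enrollment_app": {
        "subscriber_id": "member_id",
        "full_name": "name",
        "birthdate": "birth_date",
    },
}


def to_enterprise(source: str, row: dict) -> dict:
    """Rename a source row's columns to the enterprise model's terms."""
    mapping = SOURCE_MAPPINGS[source]
    # Columns with no enterprise mapping are application-specific
    # and are dropped.
    mapped = {mapping[col]: val for col, val in row.items() if col in mapping}
    missing = [attr for attr in ENTERPRISE_MEMBER if attr not in mapped]
    if missing:
        raise ValueError(f"{source} row is missing {missing}")
    return mapped


claims_row = {"MBR_NO": "M100", "MBR_NM": "Jane Smith",
              "DOB": "1970-01-31", "PLAN_CD": "X"}
enrollment_row = {"subscriber_id": "M100", "full_name": "Jane Smith",
                  "birthdate": "1970-01-31"}

from_claims = to_enterprise("claims_app", claims_row)
from_enrollment = to_enterprise("enrollment_app", enrollment_row)
print(from_claims == from_enrollment)
```

Once both applications are expressed in the same terms, the same mapping table can also serve as a starting point for the data dictionary and for deciding which source is the system of record for each attribute.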
14 Data Standards
- No duplicate data
- SoA same as SoR unless proven untenable
- Only one warehouse
- All data is owned by the enterprise, not individuals or departments
- Data storage and transmission must comply with security standards
- Services used by multiple applications must use data structures derived from the enterprise data model
- All data structures must be documented
- All new data structures must be derived from enterprise data model
- Data quality should be addressed at the point of entry into corporate network
- Information should be as accurate and current as possible
- Data access should always be directed to the SoA
- The data model should drive the choice and architecture of applications, and should be based in turn on business requirements
- Any exceptions should go through Data Governance
15 Dimensions of Data Quality for an Analytical Repository
- INTEGRITY – does the data accurately reflect the source system?
- VALIDATION – is the data valid for the field?
- VERIFICATION – is the data actually true?
- BALANCING – does the data compare to other systems (e.g., GL)?
- AUDITING – are there outlying values?
There is no universally agreed-upon list of the various types of data quality functions. Do a Google search on “data quality dimensions” and look at each of the results returned on the first page. No two of them will give the same list, or even lists of the same length. I’ve seen lists that defined three dimensions, lists that defined 25 dimensions, and just about everything in between.
I personally think that all but the most esoteric suggestions, though, can be grouped into these five areas.
While a data quality manager may define the standards for all of these dimensions, some of these are best implemented by development staff, e.g., batch integrity checks for completeness of file transmission. At the other end of the spectrum, auditing is a statistical analysis process best performed outside the batch process by data quality professionals.
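Four of the five dimensions lend themselves to simple automated checks, sketched below under stated assumptions. The row layout, the valid state codes, the GL total, and the "ten times the median" outlier rule are all hypothetical. Verification (is the data actually true?) is left out because it needs an external source of truth rather than a rule.

```python
from statistics import median


def integrity_ok(loaded_rows: list, source_row_count: int) -> bool:
    """INTEGRITY: did every source row arrive in the load?"""
    return len(loaded_rows) == source_row_count


def validation_errors(rows: list, valid_states: set) -> list:
    """VALIDATION: rows whose values are not valid for their fields."""
    return [
        r for r in rows
        if r["state"] not in valid_states or r["amount"] < 0
    ]


def balances(rows: list, gl_total: float, tolerance: float = 0.01) -> bool:
    """BALANCING: does the loaded total agree with the general ledger?"""
    return abs(sum(r["amount"] for r in rows) - gl_total) <= tolerance


def audit_outliers(rows: list, factor: float = 10) -> list:
    """AUDITING: rows far above the typical amount (a crude rule)."""
    typical = median(r["amount"] for r in rows)
    return [r for r in rows if r["amount"] > factor * typical]


# Hypothetical claim rows loaded into the warehouse.
rows = [
    {"claim_id": 1, "state": "AR", "amount": 100.0},
    {"claim_id": 2, "state": "AR", "amount": 120.0},
    {"claim_id": 3, "state": "TX", "amount": 95.0},
    {"claim_id": 4, "state": "ZZ", "amount": 5000.0},
]

complete = integrity_ok(rows, source_row_count=4)
invalid = validation_errors(rows, valid_states={"AR", "TX"})
balanced = balances(rows, gl_total=5315.0)
outliers = audit_outliers(rows)

print(complete, [r["claim_id"] for r in invalid], balanced,
      [r["claim_id"] for r in outliers])
```

The first three checks fit naturally inside a batch load, which is why development staff are well placed to implement them. The outlier check is the kind of statistical analysis that is better run and interpreted outside the batch by data quality staff.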
16 Software Development Life Cycle (SDLC)
[Diagram labels, in extracted order: Project Pipeline, Business Requirements, Escalation Process, Data Quality, Project Supervisors, Senior Leadership Team, Executive Sponsor, Architectural Approach, Resource Assignment, Technical Design, Project Process, Design Review, Waiver Process, Development, Standards Review Board, Standards Committee, Executive Sponsor, Unit Testing / QA Testing, Code Review, System/User Testing, Implementation.]
- Develop a system that has the minimal overhead necessary for administrative oversight
- Make it as easy to use as possible – invest in tools
- See Agile Development presentation for a discussion of Waterfall vs Agile development methodologies
- Escalation – when team members cannot agree on which approach is best
- Waiver – when team members agree, but approach violates standards
- Your SDLC may have more than one project path (strategic vs fasttrac)
4. Develop an SDLC and infrastructure to support it
18 Maturity Model For Business Intelligence
5. Real-Time Decisioning
4. Batch Campaigns
3. Enriching with 3rd-party data
2. Modeling, Scoring, and Segmentation
1. Enterprise Reporting
19 Summary
- Figure out what “Master Data Management” means in your organization.
- Think larger than the warehouse.
- Make sure your program is empowered with authority to be effective within the enterprise.
- Develop a functional area framework and assign responsibilities.
- Develop efficient MDM strategies for each functional area: software development process, standards and waivers, aging and archiving, data quality, data delivery, security, etc.
- Always remember that you work for the business!