Master Data Management (MDM) Using SQL Server

转载: https://www.red-gate.com/simple-talk/sql/database-delivery/master-data-management-mdm-using-sql-server/
So many of the problems that organisations have with their IT applications are due to the struggle with data, in the absence of overall organization-wide control and supervision of data and its progress through the various parts of the organization. Master data management (MDM) offers a solution to the many data woes by controlling data change, It does it in an analogous way to Version Control, so that changes are cleansed, checked, tracked and audited, and any named version can be published to other services . Now Microsoft has an implementation as part of the data platform.

In this article I will describe the basics of Master Data Management and its importance for any organization. After sketching out the concepts of MDM, I will summarize the problems you can solve proactively by keeping your enterprise master data at the highest level of current practice.

I will also explain how MDM is implemented for SQL Server and point out the major services that are provided, and I’ll give a brief overview of how MDM solutions are designed and implemented using MDS, DQS, and SSIS. Since each of these services are quite complex I will tackle a more detailed description of each service in subsequent separate articles.

Context

It is the data flowing through applications that defines the business value of an enterprise systems. All organizations of any size will have a complex set of disparate systems and technologies, including Enterprise Resource Planning (ERP), Human Resource Management (HRM), Customer Relationship Management (CRM), Supply Chain Management (SCM), and Financial system. These systems create a tendency toward data silos. The consequence of this is that data can end up being duplicated, sometimes inconsistent, occasionally incomplete, and sometimes inaccurate. The symptoms of these problems are:

  • Fragmented data
    Because data is handled by several poorly-connected systems that don’t share a common understanding of the data, it is fragmented across the organization. Even if this data is unambiguously related to a specific customer or product, none of the systems have a complete set of information about that customer or product.
  • Inconsistent data
    Because the organization cannot implement all their systems together at one time, and because data is created by different departments and users, there is no simple automated way to make these data points consistent with other systems. Inconsistent data creates integration problems because the various other systems don’t necessarily understand the data exported by a particular system even if they have data about the same customer or product.
  • Inaccurate data state
    In general, business systems do not provide a way for other systems to retrieve the data as it was at a particular point in time: only the final, current state of the data. There is no way to capture the data state at point in time. This can make even simple business-intelligence questions impossible to answer.
  • Data Correlation
    Business transactions are recorded in various systems for each customer/vendor at a different point in time without a unique, consistent and commonly-held customer ID. Due to absence of common identifier this, or any other, business entity, it becomes hard to analyze customer behavior stored in various systems and opportunity to cross-sell/up-sell is lost.
  • Domain based data security Though almost all enterprise-level systems come with security features, most of them neglect to provide any security at domain level. This means that anyone with application-level access will than have access to all data whether it is appropriate to their role and level of responsibility or not. There is often no concept of domain-specific security in most enterprise applications and they often lack granular security controls: In consequence, data is at risk. Security should allow sufficient access to data for the user to perform their role in the organization, but no more than that.
  • Data-centric ownership
    Most enterprise applications do not provide a way of allowing organizations to impose ownership rules on data across applications, identifying business domains like product, customer, etc. thereby allowing domain-based ownership. Anyone with application access can modify any and all data. This makes audit and security far more difficult to impose and track.
  • No business process
    Most systems do not allow the organizations that use them to comply with life-cycle rules and processes for data governance. This means that, even where industry standards and the regulatory framework demands that changes to data require a subsequent approval process, sign-off or quality check. This cannot be done because of the rigidity of the system.
  • Slow response to changing business need
    It’s hard to react quickly to the changing business requirements of most enterprises when change has to be made in many separate systems. Not only is this prone to errors but it is also very time consuming. You have to make changes to each of the systems affected, and manage a coordinated release: This is because there is no mechanism defined for automatically distributing the changed data to all consumers.
  • Multiple data formats
    Businesses handle increasing amounts of data, in more formats and from various sources. Because there is no centralized single standard format for the import of data, the enterprise has to use a lot of resources in converting data to, and from, a variety of formats and then distribute it this data to all the systems that need it.
  • Mergers and Acquisitions Mergers and acquisitions happen frequently, and when they do, it means that all the various systems have to be merged together into a single logical entity. Where this process is fudged, more data inconsistency is added and there are then more disparate systems, even for a single business domain, that have somehow to be kept in sync.
  • Regulatory compliance
    When a business has to have an external audit, this can be a very expensive operation where an extra day spent can break a budget. Where an organization has disparate systems, with different ways of handling changes, and where not all support audit capability, external audit can get slowed down to a crawl. Without a centralized system for auditing data, it is hard to collect and merge reports from various systems for the purposes of audit and regulatory compliance.

These are some of the more significant problems that have led to the creation of Master Data Management. This is a solution that aims to help organizations to achieve and maintain a single view of master data across the organization. To handle these difficulties that I’ve outlined, a there must be a reasonable way to consolidate, cleanse, enrich, manage and ultimately distribute this data to downstream systems.

What is Master Data Management

Master data management (MDM), defines a process of collecting enterprise data from various sources, applying standard rules and business processes, building a single view of the data, and finally distributing this ‘golden’ version of data to various systems in the enterprise, thereby making it accessible to all consumers.

This is different from existing data warehousing systems. The purpose of data warehousing is to make it easier to perform analytics and business intelligence on historical data from transactional systems as well as from an MDM system. Master Data Management (MDM) reconciles data from various systems to create a single view of the master data, more usually for operational purposes. MDM stores data about entities: in other words

  • People (Customers, Vendors, Employees, Patients, etc.)
  • Things (Products, Business Units, Parts, Equipment, etc.)
  • Places (Locations, Stores, Geo Areas, etc.)
  • Abstractions (Accounts, Contracts, Time, etc.) which is less frequently changed.

MDM provides a way to import data from various sources using different formats into staging tables: It then can map this staged data to domain attributes for standardization and normalization, cleanse data, apply business rules, enrich data, and finally mark it as a ‘named’ version. This named version of data is ready for transferring to downstream systems, usually via web service endpoints, and provide a mechanism for publishing data to subscribed consumers.

MDM creates a new version of data every time changes are made, along with information about who is making the change. You can trace it back and look for differences (delta) between various versions, when it was made and who did it. By having this level of auditing history, your organization is helped to achieve complete regulatory compliance and also to provide an overall enterprise information management program.

Because MDM comes with industry-standard access controls, only authenticated and authorized users can see and make changes to data. MDM doesn’t only capture an organizations master data at one place but it also provides security to data.

Features of a typical MDM system

  • Domain
    A logical way to keep master data separated by business domains such as Customer, Vendor or Product.
  • Repository/Entity
    A repository defines the structure of your master data. The structure is determined by using pre-defined attributes and user-defined attributes. It is analogous to database tables.
  • Attributes
    Attributes define the structure of a repository. These is best thought of as being like database table columns. Every repository comes with one or more predefined attributes and you can add custom attributes as and when you need them.
  • Attribute Groups
    It’s a way to logically group similar types of attributes together, based on a business area. An example might be having a ‘Tax Info’ attribute group that might include attributes related to taxation USFED, VAT, etc.
  • Business Processes
    A business process models the real-world processes to perform various tasks such as email, approval notification, etc. A well-defined business process can be implemented as a set of procedures to govern the data.
  • Business Rules
    Repository attributes are defined as being of a particular type and with a certain length, but business rules are used for more complex constraints such as where number values should be in range of 5 and 250 or where medicine storage temperature should be between -20F and 80F.
  • Permissions – Authentication and authorization
    Apart from credential-based authentication, most MDM solutions come with role-based security where you can create as many roles as you want based on business domains. Typical roles will be Data steward, approvers, requesters, etc. for data management and Administrator, domain owner, repository owner, deployment, etc. for administration purpose.
  • User Interface (UI)
    A user interface for MDM administrator to create and deploy domains, repositories, etc. and for data stewards to add or update data.
  • Web Services
    Master data is not very useful if you do not make it accessible and web services are one of the better ways to access master data and administrative functions.
  • Data Publish
    This is another way, other than Web Services, of making data available for consumption by various current and future consumers in bulk. The difference is that, compared to a pull-based web service approach, this is a push mechanism where MDM systems can publish modified data and interested parties can subscribe to channels.
  • Data Quality
    Data Quality is a tool to profile, cleanse and masks data, while monitoring data quality over time regardless of format or size. By using data de-duplication, validation, standardization, and enrichment you create clean data for access. Data Quality System (DQS) is totally separate from MDM systems and can be used with and without MDM. Because data sources and data formats for MDM are increasing day-by-day, we will be exploring DQS as part of this MDM solution series.

MDM architecture

Here are the various components of typical MDM systems. Components can differ from vendor to vendor.

image

MDM in the context of SQL Server

Different vendors implement the MDM concepts that I’ve already explained. They use a range of different technologies such as java, .NET, database, etc. Microsoft implemented MDM concepts with a SQL Server approach using Master Data Services (MDS), Data Quality Service (DQS), and Integration Services (SSIS). Although MDS is a core service to implement MDM, both DQS and SSIS can supplement MDS if required. SQL Server 2008 R2 was the first release with MDS. This was drastically improved in 2012 and improvements have been continuing since then.

This image depicts an example where data is stored in various systems across the enterprise where data quality leaves a lot to be desired. Data is managed using MDS before making available for consumption.

image

Master Data Services (MDS)

Master Data Services is a Microsoft product for developing MDM solutions, it is built on top of SQL Server database technology for back-end processing and provides service-oriented architecture endpoints using Windows Communication Foundation (WCF). You can implement hub architecture using MDS to create centralized and synchronized data sources to reduce data redundancies across systems.

MDS provides following tools and components to implement MDM solutions:

Configuration Manager

This is the starting point to configure Master Data Services and you can create and configure databases using Configuration Manager. This database comes with lots of stored procedures, database tables, and functions which collectively support back-end processing. You can also create a web application called ‘Master Data Manager’. You can associate the database and web application into a single MDM solution.

Master Data Manager

A web application used to perform administrative tasks such creating models, entities, business rules, hierarchies, users, and roles for authorized access. Users can access Master Data Manager to update data. You can also create versions, subscription views, and enable DQS and SSIS integration.

MDSModelDeploy.exe

The typical database lifecycle for any enterprise solution requires several server environments for such activities as development, testing and production. **MDSModelDeploy.exe **utility is a tool to use to create packages of your model objects and data so you can deploy them to other environments.

MDS Web Service

Service-Oriented Architecture (SOA) is a standard way to tackle an enterprise solution with disparate tools and technologies. MDS provides a web service, which can be used to extend MDS capabilities or develop custom solutions.

Add-in for Excel

Microsoft Excel is powerful tool: Business users understand it well as they use it quite often for day-to-day tasks. Master Data Services Add-in for Excel, allows business users to manage data while also allowing administrators to create new entities and attributes with ease. Most of the features available via Master Data Manager are also possible using the Excel plugin.

Data Quality Services (DQS)

Due to an exponential increase in data sources, it’s highly likely that the data from them is incomplete, inaccurate, duplicate, and possibly missing important business attributes as well. The reasons could be anything from user entry error, by way of corruption in transmission or storage to the use of different data standard by different sources.

Data Quality Services (DQS) provides a way to build a knowledge-base using various data points over time and then it can be used for data correction, enrichment, standardization, and de-duplication of data coming from various data sources. DQS maintains a knowledge base in various domains and each domain is specific for data fields. DQS also supports cloud-based reference data services to cleanse enterprise data using this external knowledge bases.

SQL Server Data Quality Service (DQS) provides following features:

  • Knowledge base creation
  • Data Cleansing
  • Data Matching
  • Data De-duplication
  • Data Profiling

Conclusions

Over many years, organizations have been beset with problems that come from the fact that there is no overall organization-wide control and supervision of data and its progress through the various parts of the organization. The problems are made worse by the increasing ‘commoditization’ of organizational functions such as payroll, stock control, and accounting. This has inevitably let to a tendency toward data silos. On top of this, there is the trend towards Service-oriented architectures and micro-service architectures that require far greater coordination of data and a far better self-service data-broking system. To cap these pressures, the legislative overhead within which organizations must operate mean that audit must not only be possible but must be extremely efficient. By adopting Master Data Management, organizations can tackle these three major problems, and also provide way of managing data that is far more appropriate for the changing needs of organizations

©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 215,384评论 6 497
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 91,845评论 3 391
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 161,148评论 0 351
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 57,640评论 1 290
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 66,731评论 6 388
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 50,712评论 1 294
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 39,703评论 3 415
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 38,473评论 0 270
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,915评论 1 307
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 37,227评论 2 331
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 39,384评论 1 345
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 35,063评论 5 340
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 40,706评论 3 324
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 31,302评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,531评论 1 268
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 47,321评论 2 368
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 44,248评论 2 352

推荐阅读更多精彩内容

  • rljs by sennchi Timeline of History Part One The Cognitiv...
    sennchi阅读 7,320评论 0 10
  • **2014真题Directions:Read the following text. Choose the be...
    又是夜半惊坐起阅读 9,460评论 0 23
  • 1.postnatal产后的 2.breastfeeding母乳喂养 3.covert隐蔽的 4.weigh in...
    Maei阅读 202评论 0 0
  • 今晨美梦鼾睡中,敲门如雷忽惊醒。 好友告知缘由后,同乘铁驴搞服务。 今是令爱结婚日,亲朋好友贺喜事。...
    阿山老师阅读 227评论 0 1
  • 标 题2####### 标题3 标题4 标题5 标题6 #######标题7?#########标题9?看来只支持...
    IT书生阅读 652评论 0 0