How to standardize data using Data Quality Services?

I was playing with Data Quality Services when I though that it would be nice if I could implement what I had learned and create a quick demo that shows how to standardize data using the technology. So here it goes:

Note: This is not a step by step Guide – I am assuming that you are aware of different pieces of DQS. I am going to post a brief outline of how did I go about implementing a quick demo.

Technology used:

1) SQL Server 2012 Data Quality Services

2) SQL Server Integration services to show the DQS cleansing  component in action.

Problem:

We want to standardize to Gender to have only two values M & F – but now data coming from two files creates data inconsistency issues:

Two files that look like this:

File 1: (Excel)

1 sql server data quality services data source 1 excel file

File 2: (CSV)

2 sql server data quality services data source 2 csv file

Solution:

Let’s see how SQL server 2012’s data quality services would help us solve this problem:

1) I created a Knowledge base with a domain Gender and created domain values that looks like this:

3 data quality services domain management values sql server

2) Tested the Knowledge base quickly by creating a Data Quality Project. It worked!

3) Switched to SSIS and created a Data Flow which uses following components:

4 data quality services sql server integration services

What would be interesting to you to see mapping tab in the DQS cleansing transformation component, so here it is:

5 data quality SSIS data cleaning transformation

5) And let me show you the Target File – note that you can choose to NOT have Gender_Source, Gender_Status & Record Status. The column of interest is Gender_Output

6 standardized output SSIS gender data quality services

That’s about it for this post. I hope this helps!

9 thoughts on “How to standardize data using Data Quality Services?

  1. Pingback: Difference between Term based relations and Domain values in SQL server 2012 Data Quality Services | Paras Doshi – Blog

  2. Pingback: How to clean address records using third-party reference data-sets in Data Quality Services? | Paras Doshi – Blog

  3. Pingback: How to clean records using Regular Expressions in Data Quality Services? | Paras Doshi – Blog

  4. Pingback: SQL Server 2012 Data Quality Services Term based Relation’s in action! | Paras Doshi – Blog

  5. Pingback: Resource: Learn to build an Enterprise Information Management system using Data Quality Services, Master Data Services and SQL Server Integration Services | Paras Doshi – Blog

  6. Pingback: Data Quality Service’s Composite Domains in action! | Paras Doshi – Blog

  7. Pingback: Data Profiling and SQL Server 2012 Data Quality Services | Paras Doshi – Blog

  8. Pingback: Matching activity in Data Quality Services in action! | Paras Doshi – Blog

  9. Pingback: Seven Demo’s for SQL Server 2012 Data Quality Services: | Paras Doshi – Blog

Thank this author by sharing the article on social media. If you have any questions or comments, please leave a reply below:

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s