DQ Global

Data Quality

08 May

The Usefulness of Fuzzy Matching for Data Quality

When British Gas was privatised at the end of 1986, the public was offered a limited number of shares per person. Unscrupulous individuals who saw the chance to earn a fast buck tried to fool the offering's auditors and get more than they should by using variations of their names and other personal details. For the first time, the artificial intelligence of fuzzy matching was used successfully to catch out the fraudsters. Now it has many more uses and is one of the common tools in master data management.
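As a rough illustration of how fuzzy matching can flag name variations like those described above, here is a minimal sketch using Python's standard-library difflib. The sample names, the normalise helper and the 0.85 threshold are assumptions made for this example; it is not the method used in the British Gas share offer.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(name: str) -> str:
    """Lower-case, drop punctuation and collapse whitespace."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def likely_duplicates(names, threshold=0.85):
    """Return pairs of names whose similarity meets the (assumed) threshold."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if similarity(a, b) >= threshold
    ]


# Hypothetical applicant names, invented for this example.
applicants = ["Jonathan Smith", "Jonathon Smith", "Jon. Smith", "Mary O'Brien"]
pairs = likely_duplicates(applicants)
```

With this threshold the two spellings of Jonathan are paired, while "Jon. Smith" falls short, which is a reminder that any threshold needs tuning against real data.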

Posted in Data Quality

24 April

It’s All or Nothing for Data Quality

It's no good doing things by halves if you want data quality. When you've done a data profiling exercise, you know where your data problems lie. That's half the exercise. Next you have to sort them out and deal with what might have a detrimental effect on the business. Change the metadata that doesn't match the data, and use data quality software to make sure the data is verified, complete and unique, with no duplicates.

Posted in Data Quality

17 April

Ride Your Quality Data to the Winning Post

Just as a horse has to be in tip-top condition to win its race, so your data needs to be of really good quality to beat the competition. But managing the condition of your data is not as costly as maintaining a fit and healthy racehorse. Data scrubbing, which means removing duplicate records and maintaining integrity and consistency, can be achieved without a massive weekly bill on top of an initial investment.

Posted in Data Quality

10 April

How to Guarantee Failure in your Data Quality Initiatives!

Is there anyone out there who doesn't want clean data? If so, you're in luck. Here is the definitive checklist of things NOT to do.

Posted in Data Quality

03 April

When do you Get More from Less?

Like solving riddles? This one should be easy for any IT specialist familiar with this website. We are all about data quality, which comes from clean data; and clean data means fewer records because all the duplicates have been weeded out.

Posted in Data Quality

20 March

IT Knights Need Data Quality Champions

Sometimes IT managers feel they could be losing the joust in trying to get the clean data message across. Everyone is so focused on their own aspects of winning the competitive edge that they don't see or understand that faulty tools could be undermining their efforts. Or if they do, the response is "Why can't IT get it right for a change?" or "We've already invested in all these information systems, so why do we need to spend even more money?"

Posted in Data Quality

06 March

A Single Customer View for Success in Business

How loyal are your customers? Do your senior managers know? Can you measure this? If your data gives you a single customer view you can see at a glance how much repeat business you are getting, and what is attracting the most loyal customers.
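As a small illustration of measuring repeat business from a single customer view, here is a minimal sketch in Python. The order log, the customer IDs and the repeat_customer_rate helper are all hypothetical, and it assumes duplicate customer records have already been merged under one ID.

```python
from collections import Counter

# Hypothetical order log after customer records have been merged:
# each entry is (customer_id, order_value).
orders = [
    ("C001", 120.0),
    ("C002", 45.5),
    ("C001", 80.0),
    ("C003", 300.0),
    ("C001", 60.0),
    ("C002", 20.0),
]


def repeat_customer_rate(orders):
    """Share of customers who have placed more than one order."""
    counts = Counter(customer_id for customer_id, _ in orders)
    if not counts:
        return 0.0
    repeaters = sum(1 for n in counts.values() if n > 1)
    return repeaters / len(counts)


rate = repeat_customer_rate(orders)
```

If the same customer appeared under two different IDs, their orders would be split across both and the rate could be understated, which is why the single view matters.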

All businesses need to be able to adapt to market demand. Many astute business leaders seem to have an instinct for what is going to work well and what direction to take next. That's the way it appears to outsiders, but the truth is that their gut feel has most likely evolved from the accurate information from clean data available for them to study.

Posted in Data Quality

27 February

Data Quality: Making a Culture of it

Sometimes, no matter how much hard work IT is putting in on data scrubbing and using data quality software, you keep getting complaints that what's coming up in reports is not helpful. Unless everyone who touches data has a good understanding of how it is used to support the business activities, and how vital it is to get it right, things will continue to go wrong.

Posted in Data Quality

14 February

The 'Benefits' of Data Matching

A data matching project has uncovered more than 5,000 ineligible immigrants receiving benefits in the UK. Since the government recently undertook this exercise of using matching software on benefits, tax and border control data for the first time, it has released figures stating that around 371,000 benefit claimants were non-UK nationals when they applied for a National Insurance number.

Posted in Data Quality

31 January

What has chaotic Data Quality got in common with Entropy (the 2nd Law of Thermodynamics)?

Firstly, I promise you won’t need to be a scientist or engineer to understand this. And yes, it is relevant to how data decays from order (high quality) to chaos (low quality).

There is a ubiquitous phenomenon we all instinctively accept: data has an unerring ability to go from high quality to low quality.

With energy, this phenomenon is described by the 2nd Law of Thermodynamics, which loosely states that energy has an absolute and unfailing tendency to go from "more concentrated" to "less concentrated". It kind of "spreads out" and gets "diluted". Some examples are:

  • Energy flows from a higher temperature to a lower temperature (heat exchange).
  • Energy flows from a higher pressure to a lower pressure (expansion).
  • Energy flows from a higher voltage potential to a lower voltage potential (electric current).
  • Energy flows from a higher gravitational potential to a lower gravitational potential (falling objects).
  • Water flows and falls from higher elevation to a lower elevation (downhill).

Basically, energy always goes from high concentrations to low concentrations. When the transfer stops there is a state of equilibrium, at which point the system is said to be at its maximum entropy.

In science, "Entropy" is defined as a measure of unusable energy. As usable energy decreases and unusable energy increases, "entropy" increases. So, as usable energy is irretrievably lost, disorganization, randomness and chaos increase.

In the context of this article, it sort of explains why our once orderly databases, if left to their own devices, rapidly decay into a disorderly, untrusted, fragmented and duplicated mess.

Entropy may therefore be thought of as a measure of how unusable data or information has become. Eventually all of the data in our organizations just gets less useful until, finally, it becomes mostly useless. It has then reached a point of equilibrium, or its maximum entropy, where it has no further potential to be actively used for, say, marketing or informed decision making.

To me, that sounds like what happens to any database when it is neglected and left to decay naturally.

Unlike with energy, unfortunately, we cannot yet scientifically measure the degree of data entropy, as I don't believe there are any universally accepted units of data chaos or disorder. It does sound like a good legal term for disciplinary action though… “You are guilty of generating 3.5 units of disorder in my CRM and 4.2 units in my ERP system; you are sentenced to x years of data entry”.

So what can we learn from this?

Well, if we borrow from science and again stretch the energy metaphors to apply to data and information, it seems pretty obvious that if we wish to reverse data chaos and overcome data decay, we need to apply some effort and actually do some work!

In science, work is defined as force × distance moved, e.g. the work or effort required to lift a weight, compress a gas or pump water uphill. In the case of data, we might consider it the work or effort required to change its state from “A RIGHT STATE” to “THE RIGHT STATE”.

Basically, if we are to change the state of data within business applications into a state which is fit for use, there is hard work to be done! There can be no more excuses or corporate slacking, because when it comes to refreshing, standardizing, formatting, validating, suppressing, deduping and enhancing your data (all of which, incidentally, are verbs), action is the key.
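To make a few of those verbs concrete, here is a minimal sketch of standardising, validating and deduping a handful of contact records in Python. The record fields, the sample values and the deliberately simple email check are assumptions for illustration only; real data quality tooling does far more.

```python
import re

# Deliberately simple pattern (an assumption, not a full email validator):
# something@something.tld with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def standardise(record: dict) -> dict:
    """Collapse whitespace, title-case the name and lower-case the email."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }


def is_valid(record: dict) -> bool:
    """Keep only records with a non-empty name and a plausible email."""
    return bool(record["name"]) and bool(EMAIL_PATTERN.match(record["email"]))


def dedupe(records: list) -> list:
    """Keep the first record seen for each email address."""
    seen = set()
    unique = []
    for record in records:
        if record["email"] not in seen:
            seen.add(record["email"])
            unique.append(record)
    return unique


def clean(records: list) -> list:
    """Standardise, then validate, then dedupe."""
    standardised = [standardise(r) for r in records]
    return dedupe([r for r in standardised if is_valid(r)])


# Hypothetical records, invented for this example.
raw = [
    {"name": "  alex   taylor ", "email": "Alex@Example.com "},
    {"name": "Alex Taylor", "email": "alex@example.com"},
    {"name": "No Email", "email": "not-an-email"},
]
cleaned = clean(raw)
```

The order of the steps matters here: standardising first means the two Alex Taylor records end up with the same email and are recognised as duplicates.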

Data does not clean itself

Unless you take action, things simply stop happening, or never start, once there is equilibrium or maximum entropy. Putting data back into a fit-for-use state requires work, hard work.

It will be worth the input of physical and emotional energy, though, as businesses will be rewarded with high-value data yielding high-value returns. Basically, things happen when high-energy, high-value data is allowed to move from high potential to low potential through its use.

Action is always the key.

In the case of corporate data, it requires effort from everyone:

  • Business leaders need to lead a culture of Corporate Data Responsibility (CDR), where trusted data is the norm and accurate information a corporate imperative.
  • Management needs to implement CDR through a data governance culture where data is skilfully curated to deliver business information and organisational insight.
  • I.T. needs to ensure CDR wherever data migrations, data integrations and data processing take place, so that they are co-ordinated, repeatable and correct all of the time.
  • Data workers need to ensure CDR by capturing data correctly, first time and every time, so it is fit for use by all upstream consumers in the data and information demand chain.

All of this combined effort means better business: reduced entropic waste, reduced operational friction, and reduced data scrap and re-work. It leads to actionable information, which in turn drives better decisions, which creates greater shareholder value, greater sustainability, happier employees and much, much higher profits!

Posted in Data Quality

26 January

Listen to the interview with Martin Doyle from DQ Global on OCDQ Radio

Martin Doyle is a Data Quality Improvement Evangelist and the CEO of DQ Global, which is a UK-based data quality software and services vendor providing data cleansing, international address and email verification, data deduplication, and data matching solutions for Customer Relationship Management, Single Customer View, and Master Data Management. DQ Global has worked with over 500 businesses worldwide on a variety of projects, providing their clients with improved data quality, making their data fit for business use, and enabling them to trust their data and make decisions based on a foundation of fact.

Listen to the full interview here:

http://www.ocdqblog.com/home/the-johari-window-of-data-quality.html

Posted in Data Quality

24 January

Data Quality: Diagnose Before You Prescribe

When it comes to data quality improvement, I believe you must take the approach a doctor might take, in that you must diagnose before you prescribe.

Posted in Data Quality

19 January

Good v Bad Data Quality

Good content is data which is fit for purpose.  It should be:

Posted in Data Quality

12 January

What's In / What's Out for Data Quality

A recent report from the Enterprise Data Management Council covers “What's In and What's Out” in data quality. Take a look at our DQ360 product, which will help you with your data quality issues.


Posted in Data Quality

04 January

Our Data Quality Predictions for 2012

As we move into 2012, businesses that capitalise on their data, create single customer views and master their enterprise data will have a distinct advantage over their competition. They will survive and thrive, whilst those who don’t will be at a severe disadvantage and could become extinct.

Posted in Data Quality