
4 Red Flags That Dirty Data Is Impacting Your Company

Time spent planning and documenting your data will pay off, either by saving your team hours of clean-up or by preventing poor decisions based on misleading information.

I regularly uncover these four issues when I help clients use their data. Each can affect your company in many ways, and all of them can be resolved.

Red Flag #1 - You collect contact information manually.

This is common in call centers and customer support teams. Sales agents or customer service representatives enter the customer’s name while speaking and listening, and there’s often little incentive to get it right. Accurate data collection typically isn’t the priority because it doesn’t affect how success is measured for these team members. Names and addresses can easily be misspelled or replaced with generic (e.g., “JOHN DOE”) or incomplete (e.g., “Texas”) data.


Service representatives enter inaccurate customer contact info during a call.


  • You send communications to the wrong place.

  • You send communications to the right place but with an inaccurate (or even offensive) name.

  • You unknowingly have multiple but differing instances of the same customer in your system and can’t properly tailor the customer's experience with your company.


Consider a national consumer database provider. They can help you clean up existing contact information in one sweep and can also help you validate or correct your new entries in real-time via API.
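Before bringing in a provider, a quick in-house check can show how widespread the problem is. The following is a minimal sketch, not a substitute for a real validation service. The placeholder list, the field names (`name`, `address`, `email`), and the rule that a usable address contains at least one digit are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical placeholder names; build the real list from your own data.
PLACEHOLDER_NAMES = {"john doe", "jane doe", "test test", "unknown"}


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't hide matches."""
    return " ".join(text.lower().split())


def flag_record(record):
    """Return a list of reasons a contact record looks suspicious."""
    reasons = []
    if normalize(record["name"]) in PLACEHOLDER_NAMES:
        reasons.append("placeholder name")
    # Assumption: a usable mailing address contains at least one digit
    # (street number or postal code). "Texas" alone fails this check.
    if not any(ch.isdigit() for ch in record["address"]):
        reasons.append("incomplete address")
    return reasons


def possible_duplicates(records):
    """Group record ids that share the same normalized email address."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize(record["email"])].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up sample records for illustration only.
records = [
    {"id": 1, "name": "JOHN DOE", "address": "Texas",
     "email": "pat@example.com"},
    {"id": 2, "name": "Pat Smith", "address": "12 Main St, Austin, TX 78701",
     "email": "Pat@Example.com "},
    {"id": 3, "name": "Lee Wong", "address": "9 Oak Ave, Dallas, TX 75201",
     "email": "lee@example.com"},
]

flags = {r["id"]: flag_record(r) for r in records}
duplicates = possible_duplicates(records)
```

Here record 1 is flagged for both a placeholder name and an incomplete address, and records 1 and 2 surface as a likely duplicate pair because their email addresses match once case and whitespace are ignored.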

Red Flag #2 - Your data values aren't clearly defined.

This can be a huge time waster for anyone who uses your data and can also lead to bad decision-making. Often, companies will have a few people who have become deeply familiar with the data over time, but reliance on this legacy knowledge is dangerous as companies grow.


‘0’ could be interpreted as “unknown” OR as “$0.”


  • You’re unaware that you are missing key information.

  • Your key metrics are inflated/deflated depending on how each report and analyst treats the value “0.”


  • Put a system in place to assign values appropriately and consistently.

  • Document the system, the values, and the reasons for the approach.
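To see how much the treatment of “0” can move a metric, here is a small sketch using made-up order values. It assumes unknown amounts are stored as `None` rather than as 0; the variable names and figures are hypothetical.

```python
from statistics import mean

# Hypothetical order values: None marks "unknown", 0.0 marks a true $0 order.
order_values = [120.0, 0.0, None, 80.0, None]

# Approach A: exclude unknowns, keep the genuine $0 order.
known = [v for v in order_values if v is not None]
avg_known_only = mean(known)

# Approach B: what the same metric looks like if unknowns are stored as 0.
as_zero = [0.0 if v is None else v for v in order_values]
avg_unknown_as_zero = mean(as_zero)

# Explicit missing markers also make gaps countable.
missing_count = sum(1 for v in order_values if v is None)
```

With these sample values, the average is about 66.67 when unknowns are excluded and 40.0 when they are counted as $0. Neither figure is wrong on its own, but two analysts using different conventions would report different numbers from the same data.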


Red Flag #3 - You have legacy data.

Many companies have been collecting and tracking data for years, and the way that data is defined and categorized must evolve along with the business. This is another case where systems, documentation, and communication are critical, especially following an acquisition.


You’ve established a new way to categorize your services but don’t want to lose track of the previous assignments, so you create “legacy” and “current” service category assignments. Not everyone with data access is made aware of the difference between these two assignments.


  • Some reports and recommendations are based on current data, and some are not.

  • You have conflicting information guiding your business decisions.

  • When discrepancies are discovered, your team must recreate the analysis based on the appropriate information. This wastes resources and creates delays.

  • Your teams stop trusting the reports.


  • Isolate the legacy data if it’s not needed for regular operations. This data shouldn’t be easily available to be used accidentally or by default.

  • If legacy data must be readily available, update all automated reporting as soon as changes are made.

  • Maintain a list of all people with access to this data so you can notify them when updates are made.

  • Clearly define old and new metrics in documentation that is available to everyone with access to the data.


Red Flag #4 - Default values aren't well-documented.

Using default values to represent unknown or off-scale values for some data points is helpful and appropriate. An example is a numeric field (e.g., INCOME) that must also capture non-numeric responses (no income to date, unemployed, prefer not to answer). In these cases, numeric codes (e.g., 99999) can be assigned as flags that represent information other than their literal numeric values. The key is to assign codes that couldn't be misinterpreted as literal values.


Your business runs credit checks to verify eligibility for special pricing terms. In cases where credit isn’t run, you assign a value of ‘999’ to indicate “unknown.” This is appropriate because ‘999’ isn’t a valid credit score, so there’s no danger of confusing it with a true score. However, this decision is made without awareness of an automated process that groups anyone with a score greater than or equal to 720 into a “high credit” segment.


  • All customers with unknown (“999”) scores are now inadvertently considered to have high credit.

  • Often credit isn’t run because the customer knows that the score won’t be high and prefers not to have this information pulled. In these cases, this grouping is especially unfortunate.

  • You unknowingly make offers to those who are ineligible.

  • You draw inaccurate conclusions in your analyses of your high-credit customers.


  • Clearly define data values and processes that use them in documentation that is available to everyone with access to the data.
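The credit score example can be sketched in a few lines. The 999 code and the 720 threshold come from the example above; the function names and the “standard” and “unknown” segment labels are hypothetical, and this is only one way such a check might be written.

```python
UNKNOWN_SCORE = 999    # documented code meaning "credit was not run"
HIGH_CREDIT_MIN = 720  # threshold used by the automated segmentation


def naive_segment(score):
    """Segmentation written without knowledge of the 999 code."""
    return "high credit" if score >= HIGH_CREDIT_MIN else "standard"


def safe_segment(score):
    """Check the documented code first, then apply the threshold."""
    if score == UNKNOWN_SCORE:
        return "unknown"
    return "high credit" if score >= HIGH_CREDIT_MIN else "standard"


# Made-up scores: one high, one below the threshold, one where credit wasn't run.
scores = [760, 640, UNKNOWN_SCORE]
naive_result = [naive_segment(s) for s in scores]
safe_result = [safe_segment(s) for s in scores]
```

The naive version places the 999 customer in “high credit,” while the safe version keeps that customer in a separate “unknown” group. Keeping the code in a single named constant also gives the documentation one place to point to.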


How can you create accountability for your data?

Data quality is a passion of mine. Even if you can’t say the same, it should be top-of-mind for someone in every business. If you could use some guidance, contact me here!

