The Business Intelligence Guide

Data Cleansing


Data must be cleansed before it is integrated. Data cleansing is the process of detecting and correcting [or deleting] corrupt or inaccurate records in a record set.

Data inconsistencies may arise from:

  • Different data dictionary definitions of similar entities in different data stores.
  • User entry errors.
  • Data corruption during transmission or storage.
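
As an illustration of the first cause, the sketch below shows how two data stores might encode the same entity differently and how the values could be mapped to one canonical form. The mapping table, the country codes and the function name harmonize_country are hypothetical and chosen only for this example.

```python
# Hypothetical mapping: different source systems spell the same country
# in different ways; all spellings map to one canonical code.
CANONICAL_COUNTRY = {
    "us": "US",
    "usa": "US",
    "united states": "US",
    "uk": "GB",
    "gb": "GB",
    "united kingdom": "GB",
}


def harmonize_country(raw):
    """Return the canonical code, or None if the value is not recognised."""
    key = raw.strip().lower()  # ignore case and surrounding whitespace
    return CANONICAL_COUNTRY.get(key)


# Values as they might arrive from two different data stores.
store_a = ["USA", "UK"]
store_b = ["united states", " United Kingdom "]
harmonized = [harmonize_country(v) for v in store_a + store_b]
```

Values that return None are not silently dropped; they would typically be set aside for correction or review.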

Data cleansing is also known as data scrubbing, a less frequently used term.

 

Data Cleansing Process

The data cleansing process involves:

  1. Removing typographical errors.
  2. Validating** data definitions against those of the destination data warehouse. The validation may be strict [rejecting any item that does not have a valid value] or fuzzy [correcting records that partially match existing, known records].
  3. Correcting values against a known list of entities.
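
The difference between strict and fuzzy validation can be sketched in a few lines of Python. This is a minimal illustration, not a production cleansing routine: the reference list KNOWN_CITIES, the function names and the 0.8 similarity cutoff are all assumptions made for the example. Fuzzy matching here uses get_close_matches from Python's standard difflib module, which is one of several possible techniques.

```python
from difflib import get_close_matches

# Hypothetical list of known entities in the destination data warehouse.
KNOWN_CITIES = ["London", "Paris", "Berlin", "Madrid"]


def validate_strict(value, known=KNOWN_CITIES):
    """Strict: accept only exact matches; return None to reject."""
    return value if value in known else None


def validate_fuzzy(value, known=KNOWN_CITIES, cutoff=0.8):
    """Fuzzy: correct to the closest known entity if it is similar enough.

    The cutoff (0 to 1) is an assumed threshold; returns None to reject.
    """
    matches = get_close_matches(value, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def cleanse_batch(records, validator):
    """Apply a validator to a batch; return (cleaned, rejected) lists."""
    cleaned, rejected = [], []
    for record in records:
        fixed = validator(record.strip())
        if fixed is None:
            rejected.append(record)
        else:
            cleaned.append(fixed)
    return cleaned, rejected


batch = ["Paris", "Londn", "Tokyo"]
strict_result = cleanse_batch(batch, validate_strict)
fuzzy_result = cleanse_batch(batch, validate_fuzzy)
```

With strict validation the misspelt record is rejected along with the unknown one; with fuzzy validation it is corrected to the closest entry in the known list, and only the value with no close match is rejected.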

**Data cleansing differs from data validation in that validation usually rejects invalid data at the point of entry, whereas cleansing is performed on batches of data.
