Data Match 2010 - Word Smith ™

Data Match's trademarked Word Smith ™ is essential for cleaning up common text issues and harmonizing your data.

  • Groups and counts all the words in your data set.
  • Replace or remove unwanted text. Ideal for fixing spelling errors and harmonizing your data without having to touch every occurrence of a misspelling. Example: Change "Streeet" or "STrt" to "Street".
  • Comes with several standard Word Smith ™ files, saving you the time of coming up with them yourself.
    • Common company suffixes: "LLC", "Corporation", "Corp", etc.
    • Common misspellings: "thhe", "nineteeen", etc.
    • Save your Word Smith ™ files for future use across projects.
  • Place keywords into new fields. Example: identify the keywords "Blue" and "Purple" and place them into a new field called "Color". Ideal for creating new fields to match and harmonize your data.
  • Extract numbers with common prefixes or suffixes. Example: extract the text "1 Gallon" or "3 Gallons" to a new field called "Volume". Ideal for creating new fields to match and harmonize your data.
  • Preview your Word Smith actions in the dynamic data window. Apply actions to all tables, or selected tables/fields.
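
The actions listed above can be illustrated with a short Python sketch. This is not Data Match code and does not reflect how Word Smith ™ is implemented; it only shows the general idea of each action on plain text. Every name here (REPLACEMENTS, COLOR_KEYWORDS, VOLUME_PATTERN, and the four functions) is a hypothetical chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical replacement table (not a real Word Smith file).
# Mapping a word to "" would remove it instead of replacing it.
REPLACEMENTS = {"streeet": "Street", "strt": "Street", "thhe": "the"}

# Hypothetical keyword list for a new "Color" field.
COLOR_KEYWORDS = ["Blue", "Purple"]

# A number followed by a unit suffix, e.g. "1 Gallon" or "3 Gallons".
VOLUME_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\s*Gallons?\b", re.IGNORECASE)


def count_words(values):
    """Group and count every word across a list of text values."""
    counts = Counter()
    for value in values:
        counts.update(w.lower() for w in re.findall(r"[A-Za-z']+", value))
    return counts


def replace_words(value, replacements=REPLACEMENTS):
    """Replace or remove whole words, matching case-insensitively."""
    def swap(match):
        word = match.group(0)
        return replacements.get(word.lower(), word)

    result = re.sub(r"[A-Za-z']+", swap, value)
    return " ".join(result.split())  # collapse gaps left by removed words


def extract_keyword(value, keywords=COLOR_KEYWORDS):
    """Return the first listed keyword found in the text, or ""."""
    for keyword in keywords:
        if re.search(r"\b" + re.escape(keyword) + r"\b", value, re.IGNORECASE):
            return keyword
    return ""


def extract_volume(value):
    """Return the first number-plus-unit match found in the text, or ""."""
    match = VOLUME_PATTERN.search(value)
    return match.group(0) if match else ""


if __name__ == "__main__":
    rows = ["Blue paint 1 Gallon", "Purple paint 3 Gallons", "12 Main Streeet"]
    print(count_words(rows).most_common(3))
    for row in rows:
        cleaned = replace_words(row)
        print(cleaned, "|", extract_keyword(cleaned), "|", extract_volume(cleaned))
```

Keeping the replacement table and keyword list as plain data, separate from the functions, mirrors the idea of saving a word list once and reusing it across projects.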

Word Smith Screenshot

 

Pattern Extraction Example

Data Match 2010 Screenshots

1. Data In

2. Data Quality

3. Clean

4. Word Smith ™

5. Match Definition

6. Results

7. Deduplicate Preview