Edit Duplicates

 

After a successful duplicate search, the program switches to the duplicate view and shows the duplicates in groups. You can return to this view later via the menu item "View->Dubletten".

Here you can:

  1. Manually review the duplicates to check the results.
  2. During the review, unassign individual records that turn out not to be duplicates.
  3. Automatically delete all but one record from each group of duplicates.

Manual Review of Duplicates

When you are searching for duplicates in important customer databases, a manual review of the results is advisable. In this case, run the search with a somewhat lower threshold value (80-85%), which deliberately reports too many duplicates. During the manual review you can then unassign records that are not actually duplicates (see below). (You will find that for some records even a human can hardly tell whether they are duplicates.) Because the duplicates are presented clearly in groups, a manual review can be carried out quickly and is well worth the effort.

For marketing addresses or similar data, where it does not matter if a few records are eliminated by mistake, a manual review is probably not necessary. In this case, select a somewhat higher threshold value (90-95%) for the duplicate search so that only near-certain duplicates are found. You can then have the program clean these up automatically (i.e. automatically delete all but one record in each group of duplicates, see below).
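The effect of the threshold value can be illustrated with a small sketch. It does not reproduce the program's own matching method, which is not described here. As a stand-in it uses the similarity ratio from Python's standard difflib module, and the function names and sample records are made up for the illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity score in percent (0-100) for two strings.

    Assumption: difflib's ratio stands in for the program's real matching.
    """
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_pairs(records, threshold):
    """Return index pairs (i, j) whose similarity reaches the threshold."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical sample data: the first two entries differ by one letter.
records = ["John Smith", "Jon Smith", "Mary Jones"]

low = find_pairs(records, 80)   # lower threshold: more candidates to review
high = find_pairs(records, 95)  # higher threshold: only near-identical pairs

print("threshold 80:", low)
print("threshold 95:", high)
```

With this sample data the near match is reported only at the lower threshold, which is why a lower value suits a manual review and a higher value suits automatic cleanup.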

Unassigning Individual Records

To remove the duplicate status of two records (in other words, to say "this is not a duplicate"), select one of the records and choose "Review->Unassign Duplicate".

TIP: You can also open this menu by right-clicking a record.

Create Deletion List

After performing a manual review and unassigning the invalid duplicates, the last step is to delete the redundant records so that only one record remains in each group of duplicates. After this deletion your database will be free of duplicates.

To do this, select "Duplicates->Create Deletion List" from the main menu. This creates a list of the records that can be deleted in the next step.

To keep the oldest or newest record of each group, give the program a column to sort by. For example, select a column named created, provided that your database stores the creation date of every record in such a column. The Autonumber column in Access and SQL Server databases also contains values in chronologically ascending order and can be used for this purpose.
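The idea behind the deletion list can be sketched as follows. This is only an illustration of the principle, not the program's internal logic. The record layout (fields id, group and created) and the function name build_deletion_list are assumptions made for the example.

```python
from collections import defaultdict


def build_deletion_list(records, sort_column, keep="oldest"):
    """Return the ids of all records except one per duplicate group.

    records     -- list of dicts with hypothetical keys "id" and "group"
    sort_column -- column that decides which record is kept
    keep        -- "oldest" keeps the smallest value, "newest" the largest
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["group"]].append(rec)

    to_delete = []
    for members in groups.values():
        ordered = sorted(members, key=lambda r: r[sort_column],
                         reverse=(keep == "newest"))
        # The first record in the sort order is kept, the rest are listed.
        to_delete.extend(r["id"] for r in ordered[1:])
    return sorted(to_delete)


# Hypothetical duplicate groups; ISO dates sort correctly as plain strings.
records = [
    {"id": 1, "group": 1, "created": "2020-01-05"},
    {"id": 2, "group": 1, "created": "2019-03-01"},
    {"id": 3, "group": 2, "created": "2021-06-10"},
    {"id": 4, "group": 2, "created": "2021-07-01"},
    {"id": 5, "group": 2, "created": "2022-02-15"},
]

print(build_deletion_list(records, "created", keep="oldest"))
print(build_deletion_list(records, "created", keep="newest"))
```

Sorting by the id column instead of created would work the same way when the ids are assigned in ascending order, as with an Autonumber column.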

Delete Duplicate Records

After creating the deletion list, select "Duplicates->Delete Duplicate Records" from the main menu to physically delete these records from your database or to create a purged temporary list.

Check beforehand whether deletion from your table is permitted at all. This may not be the case if, for example, the table has relations to other tables in your database. Also be aware that deleting a record may cause related data in other tables to be deleted as well.

Check your data model or clarify with the database developer whether you can simply delete records in the table without further problems.
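How a relation can block a deletion is shown in the following sketch. It uses an in-memory SQLite database from Python's standard library purely for demonstration. Access and SQL Server have their own ways of inspecting relations, and the tables customers and orders are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces relations only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("INSERT INTO customers VALUES (1, 'John Smith'), (2, 'Jon Smith')")
conn.execute("INSERT INTO orders VALUES (10, 2)")  # an order refers to customer 2


def referencing_tables(conn, target):
    """List the tables that declare a foreign key pointing at `target`."""
    found = []
    names = [row[0] for row in
             conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
    for name in names:
        for fk in conn.execute(f"PRAGMA foreign_key_list({name})"):
            if fk[2] == target:  # column 2 holds the referenced table
                found.append(name)
    return found


def try_delete(conn, customer_id):
    """Attempt the deletion; return False if a relation forbids it."""
    try:
        conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))
        return True
    except sqlite3.IntegrityError:
        return False


print(referencing_tables(conn, "customers"))
print(try_delete(conn, 2))  # customer 2 still has an order
print(try_delete(conn, 1))  # customer 1 has no related rows
```

If the relation had been declared with ON DELETE CASCADE, the deletion would succeed and silently remove the related order as well, which is the second risk mentioned above.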

Creating a purged list, by contrast, carries no risk, because it makes no changes to your database.
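Why a purged list is safe can be seen in a minimal sketch: the result is a new list, and the source data stays as it was. The function name purged_copy and the record layout are assumptions for the illustration, not part of the program.

```python
def purged_copy(records, deletion_ids):
    """Return a new list without the records named in the deletion list.

    The input list and its records are left unchanged.
    """
    skip = set(deletion_ids)
    return [dict(rec) for rec in records if rec["id"] not in skip]


# Hypothetical source records and deletion list.
source = [
    {"id": 1, "name": "John Smith"},
    {"id": 2, "name": "Jon Smith"},
    {"id": 3, "name": "Mary Jones"},
]

purged = purged_copy(source, [2])

print(len(source), "records in the source")
print(len(purged), "records in the purged list")
```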