Thursday, December 30, 2010

Removing duplicate records with an Excel VBA script

Let’s say you have a set of data where there are known duplicates.  Now, if you had a small listing, you might be able to pick out the duplicates manually.  The problem is, often you are handed much larger sets of data.  Below are two methods that can be used to remove them:
  1. The easiest method is to use the =countif(B:B,B1) function already built into Excel.
    1. B:B is the range you are looking for the duplicate in.
    2. B1 is the value you are referencing in that range.
    3. By inserting a new column A then populating all rows with this formula your duplicates can be found by clicking on data, filter, then filtering for any items greater than 1 in that column.
    4. Highlight all cells and click Edit | Goto | Special | Visible Cells Only as shown here:

    1. Right click, select delete then select entire row and you will be left with only unique records.
    2. Delete the column containing your formulas and you are finished!
  1. If you find yourself doing the above often (or the data set of is too large) your best bet is to create the following script:
    1. From tools select macro and visual basic editor

    1. Creating the following script will deleted your duplicate records automatically:

Tip:  By going into tools, options, calculation and switching to manual you may speed this procedure up dramatically.

You can now run the script anytime you encounter records where column 2 contains the same value more than once!

No comments: