RSS

Arrays and ranges in Excel VBA - delete rows

Many, who start using VBA in Excel, quickly learn how to use ranges, because they are really handy and relatively easy to understand and use.

This page shows examples of how you can speed things up a lot by using arrays. This could be deleting rows in a range (table) or other operations.

You CAN loop through a range and remove or insert e.g. rows, but it is tricky, because your range suddenly changes, and if for instance you remove the row you are in, you will suddenly be in the row that was the row below just a moment ago! It is both easier and faster with an array.

The first example shows how to remove every second row, and the next how to delete rows going from the last row upwards, if the row above has identical/duplicate values in selected columns.

You can use other criteria, and the task doesn't have to be the removal of rows - that is just the objective for my examples. You can also download a spreadsheet with the examples.

The advantage of using arrays

If you work with big tables/ranges, you can usually speed up the code a lot, if you copy the range to an array. Once you have finished operating on the array, you copy the array to the spreadsheet in one lightning fast operation.

You get the speed increase, because there is a large overhead writing to and from a range in a spreadsheet, whereas it is very fast to operate on a virtual table (the array).

The method isn't suitable for all tasks, but I think that many people work with very big tables in Excel (e.g. log files or reports) and often need to "weed" the data using certain criteria.

If the task is sorting or filtering, you should always use Excel's built in functions (e.g. auto filtering), because no VBA code will ever match the speed of those; but if you have special needs, VBA can be a great help.

Delete every second row

The following example is fairly simple: It removes every second row in a range. I have no idea if anybody ever needs to do this, but because it is simple, it is good for showing the technique.

  1. Copy the range to a two-dimensional array in one (very fast)operation. The array will automatically get the same dimensions as the range (number of rows and columns)
  2. Loop through the array and copy the chosen records (rows - in this case every second) to a new array.
  3. Insert the new array as a table in the spreadsheet in one fast operation - no slow looping writing cell by cell.

For the example to work the sheet with the table must be the active sheet, and the table must start in cell A1. You can highlight the code with the mouse, copy it (CTRL+C) and paste it into a VBA module with CTRL+V. You can also download a spreadsheet with the examples.


Sub DeleteEverySecondRow()
'This procedure removes every second row
'in a table and shift the rows up. First the
'table is copied to an array, and next every
'second item is writte to a new array. When done
'the new array is copied to a table on a new
'sheet in one fast operation.
Dim MyArray() As Variant    'The first array
Dim NewArray() As Variant   'The second arrayet
Dim rTable As Range         'Range variable
Dim lCount As Long          'Counter
Dim lCount2 As Long         'Counter
Dim lCount3 As Long         'Counter
Dim lRows As Long           'Number of rows
Dim lCols As Long           'Number of columns
Dim lNewRows As Long        'Rows in second array

On Error GoTo ErrorHandle

'Switch off screen updating for speed
Application.ScreenUpdating = False

If Len(Range("A1")) = 0 Then
   MsgBox "Celle A1 is empty. For the example to work " & _
   "the table must start in cell A1."
   GoTo BeforeExit
End If

'Sets rTable = the table. "CurrentRegion" is convenient,
'when the table has empty cells as "border". If there are
'neighbooring cells with content, the range must be defined
'some other way.
Set rTable = Range("A1").CurrentRegion

'The table/range is copied to MyArray in one operation. There is
'no need to loop the range cell by cell and fill the array.
'It is much simpler to make a copy.
MyArray = rTable.Value

'The number of rows and columns is saved in the variables,
'lRows og lCols.
With rTable
   lRows = .Rows.Count
   lCols = .Columns.Count
End With

'Before we dimension the new array that will contain the
'final result, we calculate the number of rows, and this
'depends on whether the original table has an even or uneven
'number of rows.
If lRows Mod 2 = 0 Then
   lNewRows = lRows / 2
Else
   lNewRows = lRows / 2 + 1
End If

'The new array is dimensioned to have the number of rows, that
'will NOT be deleted. The number of columns is unchanged.
ReDim NewArray(lNewRows, lCols)

'We now loop through the first array, and the "uneven" rows
'are copied to the second array. Notice the use of "Step 2"
'to get every second. The outer loop finds the rows,
'and the inner loop copies the cells from left to right.
For lCount = 1 To lRows Step 2
   lCount2 = lCount2 + 1
   'Cells are now copied from left to right.
   For lCount3 = 1 To lCols
      NewArray(lCount2, lCount3) = MyArray(lCount, lCount3)
   Next
Next

'We add a new sheet
Sheets.Add

'rTable is resized to the dimensions of NewArray
Set rTable = Range("A1").Resize(lNewRows, lCols)

'We insert the sorted table in one operation.
rTable.Value = NewArray

BeforeExit:
On Error Resume Next
Set rTable = Nothing
'Switch screen updating back on
Application.ScreenUpdating = True

Exit Sub
ErrorHandle:
MsgBox Err.Description & " Error in procedure DeleteEverySecondRow."
Resume BeforeExit
End Sub

Delete identical rows in a table

The next example is more complex, but probably more useful! It shows how you can delete rows by using certain criteria.

I use a very similar macro at work to "weed" daily reports with chemical analyses. Not all products are analysed every hour, and if there isn't a new analysis on the hour, the last one will be repeated in the report.

The daily report has 24 rows, one for every hour, and in my tables there can be data for maybe a year.

So to have unique analyses only I want to remove the rows that are mere repetitions of the row above. If for instance a production unit is stopped, the same analysis will be there for days, and that is not clever, if you want to find standard deviation and stuff like that.

This is what happens:

  1. The table is defined as a range (here: rTable). The example requires that the table is on the active sheet and starts in cell A1.
  2. The range is copied to the array, MyArray, in one fast operation.
  3. The user is asked to select the column(s) that must be compared.
  4. We then loop through the array from bottom to top, row by row, and if a row is a copy, we insert an identifier (in this case the word "delete") in what matches the first cell in the row.
  5. A new array, NewArray, is defined.
  6. All items (rows) in MyArray that do not contain the word "delete" in the first column are copied to NewArray.
  7. The new array with the "weeded" table is inserted on a new sheet in one, fast operation.

Here we go:


Sub RemoveDuplicates()
'Finds and removes rows that contain data
'identical to the row above.
Dim bSame As Boolean        'Flag
Dim MyArray() As Variant    'The array
Dim NewArray() As Variant   'The result array
Dim rCell As Range          'Range variable
Dim rTable As Range         'Range variable
Dim rIsect As Range         'Range variable
Dim lRows As Long           'Number of rows
Dim lCols As Long           'Number of columns
Dim lCount As Long          'Counter
Dim lCount2 As Long         'Counter
Dim lCount3 As Long         'Counter
dim lDel as Long            'Counter
'colColumns holds the column numbers,
'where data must be compared.
Dim colColumns As Collection

On Error GoTo ErrorHandle

If Len(Range("A1")) = 0 Then
   MsgBox "Cell A1 is empty. The example requires " & _
   "that the table begins in cell A1."
   GoTo BeforeExit
End If

'Sets rTable = the table. CurrentRegion is convenient,
'if the table's bordering cells are empty. If not
'you must define the range some other way.
Set rTable = Range("A1").CurrentRegion

'The range (the table) is copied to MyArray
MyArray = rTable.Value

On Error Resume Next

'The user must now select the columns with
'values to compare. It is enough to select
'one cell per column.
Set rCell = Application.InputBox(prompt:="Select the columns " & _
" you want to compare. Selecting one cell per column " & _
"is enough.", Type:=8)

'If the user cancels and doesn't select any column, we cancel.
If rCell Is Nothing Then
   MsgBox "You didn't select any column - the program stops."
   GoTo BeforeExit
End If

On Error GoTo ErrorHandle

'Switch off screen updating
Application.ScreenUpdating = False

'If the selected area is outside the table, we cancel.
Set rIsect = Application.Intersect(rCell, rTable)
If rIsect Is Nothing Then
   MsgBox "The selected area is not in the table"
   GoTo BeforeExit
End If

'colColumns is initialized.
Set colColumns = New Collection

'We now find the selected column numbers in the table (rTable).
'We use the numbers, when we loop and compare the values
'in the array, MyArray.
With rTable
   For lCount = 1 To .Columns.Count
      Set rIsect = Application.Intersect(rCell, .Columns(lCount))
      If Not rIsect Is Nothing Then
         'The column number is saved in colColumns.
         colColumns.Add lCount
      End If
   Next
End With

'The number of rows and columns in the table is saved
'in the variables lRows and lCols.
With rTable
   lRows = .Rows.Count
   lCols = .Columns.Count
End With

lDel = 0

'We now loop through the array, and duplicate rows
'get the word "delete" in the first item. We loop
'from the bottom and stop in row 2, because there
'is no row above row 1.
For lCount = lRows To 2 Step -1
   bSame = True
   With colColumns
      For lCount2 = 1 To .Count
         'If the row is different, we set bSame = false
         If MyArray(lCount, .Item(lCount2)) <> _
         MyArray(lCount - 1, .Item(lCount2)) Then
            bSame = False
            Exit For
         End If
      Next

      'If values are identical, we set bSame = True,
      'and put in the word "slet" in the first item.
      If bSame Then
         MyArray(lCount, 1) = "delete"
         lDel = lDel + 1
      End If
   End With
Next

lCount2 = 0
lDel = lRows - lDel

'The new array is dimensioned to have the same number of
'rows as the old one minus the number of rows that must
'be deleted. The number of columns is unchanged.
ReDim NewArray(lDel, lCols)

'We loop through the old array, and rows that
'don't have the word "delete" in the first item
'are copied to the new array.
'The outer loop loops the rows, and the inner loop
'copies from left to right, when a new row has been
'added to NewArray.
For lCount = LBound(MyArray) To UBound(MyArray)
   If MyArray(lCount, 1) <> "delete" Then
      lCount2 = lCount2 + 1
      'We now copy from left to right.
      For lCount3 = 1 To lCols
         NewArray(lCount2, lCount3) = MyArray(lCount, lCount3)
      Next
   End If
Next

'We add a new sheet
Sheets.Add

'We define a range, rCell, and give it the same dimensions
'as NewArray
Set rCell = Range("A1").Resize(lDel, lCols)

'The sorted table is inserted in one operation.
rCell.Value = NewArray

BeforeExit:
'We set our objects to Nothing
'to prevent memory leak.
Set rCell = Nothing
Set rTable = Nothing
Set rIsect = Nothing
Set colColumns = Nothing
'Switch screen updating back on
Application.ScreenUpdating = True

Exit Sub
ErrorHandle:
MsgBox Err.Description & " Error in procedure RemoveDuplicates."
Resume BeforeExit
End Sub

To top

Related: