Excel Tip of the Week #231 - VBA case study: Text handling and COMMAEXTRACT

Hello and welcome back to the Excel Tip of the Week!  This week we have a Developer-level post in which we're going to build a custom function that replicates some of the functionality of the Text to Columns feature.  If you want a refresher on that, it was covered in TOTW #105.

The goal

We want to be able to take text which is broken up with commas, and extract the text between the nth and n + 1th commas.  This is essentially a formula equivalent to text to columns.

For example, we might have some cells with addresses in them:

The town names (when they are present) are between the 1st and 2nd commas.  Extracting them without extremely complex formula gymnastics is quite tricky, so we want to create a custom function to do the job for us.

Building up our code

We have two key assets to help us build our VBA code:

  • VBA permits us to use several Excel formulas related to text - specifically LEFT, RIGHT, MID, and LEN (see TOTW #161 for a recap on those)
  • We can also use concatenation with the & operator (see TOTW #70 for more)

Let's talk through what we need our function to do in English first of all:

  1. Take as input a piece of text and a number telling us which section (i.e. which comma) we care about
  2. Comb through the text letter by letter until the appropriate "start" comma is reached
  3. Extract the content from that point until the next comma (or the end of the text) is reached

To do this, we are going to use a few extra variables:

  • i, to count how far through the text our search has gone
  • j, to count how many commas we have encountered
  • Mode, which will track whether or not we have reached our target text or not

The skeleton of the function is now set to go:

Public Function COMMAEXTRACT(InputCell As Variant, StartComma As Variant) As String
Dim i As Integer
Dim j As Integer
Dim Mode As String
 
j = 0
 
If StartComma = 0 Then
    Mode = "Extract"
    Else
    Mode = "CommaFind"
End If
 
[main code goes here]
 
End Function

So here we are setting up the function and its variables, and setting the initial values for those variables.  A few notes:

  • The syntax of our function will be =COMMAEXTRACT(text, number)
    • Both inputs can be either entered directly into the function or obtained via a reference to a single cell - that's what the 'Variant' type is for
    • The number is indexed from 0 - i.e. 0 will extract text from the beginning of the string until the first comma, 1 for between the 1st and 2nd commas, and so on
  • We have made a little if-then-else block to set which mode the function will start in; if we have a StartComma of 0 then we want to be extracting straight away; otherwise we will start by finding our starting comma's position.

Let's add in the code for what we will be doing while in 'CommaFind' mode:

Public Function COMMAEXTRACT(InputCell As Variant, StartComma As Variant) As String
Dim i As Integer
Dim j As Integer
Dim Mode As String
 
j = 0
 
If StartComma = 0 Then
    Mode = "Extract"
    Else
    Mode = "CommaFind"
End If
 
For i = 1 To Len(InputCell)
    If Mode = "CommaFind" Then
        If Mid(InputCell, i, 1) = "," Then
            j = j + 1
            If j = StartComma Then
                Mode = "Extract"
                i = i + 1
                Else
            End If
            Else
        End If
        Else
    End If
    [Extract mode code goes here]
Next i
 
End Function

We have set up a loop over the i variable, which will run until the length of the string.  Each step of the loop uses the Mid function to examine the ith character in the string.  If it's a comma, then we:

  • Up our comma count by 1
  • Check if we've reached the target starting comma yet
  • If we have, switch into 'Extract' mode
    • (and if we have, increase i by 1 so we don't immediately execute the 'Extract' code that comes next and output the comma itself)

Finally, to complete our code, we add in what will happen when we are in Extract mode:

Public Function COMMAEXTRACT(InputCell As Variant, StartComma As Variant) As String
Dim i As Integer
Dim j As Integer
Dim Mode As String
 
j = 0
 
If StartComma = 0 Then
    Mode = "Extract"
    Else
    Mode = "CommaFind"
End If
 
For i = 1 To Len(InputCell)
    If Mode = "CommaFind" Then
        If Mid(InputCell, i, 1) = "," Then
            j = j + 1
            If j = StartComma Then
                Mode = "Extract"
                i = i + 1
                Else
            End If
            Else
        End If
        Else
    End If
    If Mode = "Extract" Then
        If Mid(InputCell, i, 1) = "," Then
            Exit Function
            Else
            COMMAEXTRACT = COMMAEXTRACT & Mid(InputCell, i, 1)
        End If
        Else
    End If
Next i
 
End Function

In Extract mode, we first make sure we haven't hit another comma - and if we have, we use Exit Function to stop there.  Otherwise, we use the & operator to glue the current character on to our result and then iterate to the next character.

And that's it!  Here's an example of our function in action (combined with the TRIM function to remove any leading spaces):

If you want to try some coding yourself, try altering this function to work with any single-character delimiter instead of only commas!

Previous post - Revisiting Excel options
TOTW index 
Next post - Analysing frequency tables

This blog is brought to you by the Excel Community where you can find additional blogs, extended articles and webinar recordings on a variety of Excel related topics. In addition to live training events, Excel Community members have access to a full suite of online training modules from Filtered. There is also an online forum where you can ask questions and share ideas with other community members.

Anonymous