Read badly structured worksheet data into Power Query

It's the time of year when you start to think deeply about suitable gifts for the spreadsheet users in your life. What better than a gift that has the potential to deliver lifelong spreadsheet time-savings? In the build up to Christmas we will be posting 15 brief, practical guides to ways in which the Excel Power Query feature can automate the manual and solve the insoluble. The posts will concentrate on Power Query as an additional spreadsheet tool rather than as a data tool.

Most of the posts are on the Excel Community, but we are posting a few here. This one shows how Power Query can be used to extract data from existing Excel worksheets.

Order from chaos – estimated time: 25 seconds

One of the most useful applications of Power Query is to rescue poorly structured spreadsheets. Because of the lack of any enforced structure in cell-based Excel, many spreadsheets resort to complex and inconsistent formulae in an effort to compensate for deficiencies in their design. We are going to use a simple example to show how Power Query can create consistent and usable data from less than ideal spreadsheet content. We will be trying to construct a lookup table to enable us to categorise the transactions that we have been working with up to now. This is the lookup information in our spreadsheet:

As you can see, it has been set up as an Excel Table but in a way that would make it difficult to create an Excel formula that would allow Code AB to be included in the Electronics Type. We will be using two simple Power Query techniques to turn this into a usable lookup table.

Step 1 – load our Table into Power Query

We can click on any cell in our Table and use Data Ribbon tab, Get & Transform Data group, From Table/Range to read our Table into the query editor. As we saw last time, if our data hadn't been set up as a Table, we could have applied a Range Name and used that as the basis for reading in our data. However, using a Range Name rather than a Table would not automatically incorporate adjacent rows and columns added to our range in the way that using a Table would. If our worksheet was in a different workbook, it might even be better to bring in the entire sheet and work from there.

Here is our data in the Query Editor:

We have two issues to deal with. To use it as a lookup table we need the Type column to be complete and we need to remove the rows that don't contain a Code value. We need to address these two issues in the correct order.

Step 2 – Fill Down

Right-click in the Type column heading and choose Fill, Down. This will overwrite any null values with the first actual value reading upwards in the column from the current row:

Step 3 – filter

Our fill down operation has copied out Type heading to each of our rows. We are now just left with the heading rows that we no longer need. We can remove these using the dropdown in the Code column heading and deselecting (null):

Our table is now structured correctly for use as a lookup table and we can use Home Ribbon tab, Close and Load To… to Only Create Connection in order to make our Table available for use tomorrow:

Previous (Excel Community and Tech Faculty only)

Next (Excel Community and Tech Faculty only)

Index to other articles in the series:

Part 1 – introduction and combining Tables

Part 2 – coping with US date formats

Part 3 – correcting errors

Part 4 – rounding and formatting

Part 5 – using a single cell value in Power Query

This blog is brought to you by the Excel Community where you can find additional blogs, extended articles and webinar recordings on a variety of Excel related topics. In addition to live training events, Excel Community members have access to a full suite of online training modules from Excel with Business. There is also an online forum where you can ask questions and share ideas with other community members.