Hello all and welcome back to the Excel Tip of the Week! This week, we actually have a guest post, a Developer-level post written by former IT Faculty chairman, Jonathan Teller. Jonathan kindly shared the following article on the Excel Data Model late last year, and I realised it was a good fit for this series. I'll let Jonathan take it from here (his words in non-italics):
Whilst listening to sessions from the Microsoft Data Insight Summit back in March I had one of those light bulb moments.
Excel had acquired an important extra dimension called the Data Model. Although it seemed sudden, it wasn’t. The features and tool sets thus far making up the Data Model had been added over the last few versions. It was just that it hadn’t been that explicit. I had gleefully been using many of the new features as they were added to Excel. Tables, Power Pivot with the Dax language and Power Query with the M language to name but a few. But never thought of it in terms of a unified whole or “data model”.
A spreadsheet has always seemed like a simple, almost natural, place to hold lists, sort them, filter them, analyse them, amend them etc. However, in reality, lists were best held in a database application such as Access where there was more structure and attention paid to robustness. The Data Model toolset has shifted the balance towards Excel.
This article highlights some of the elements of Excel that interact and access the data model. This is based on the 2016 version. Some features are backward compatible others are not. The toolset is also constantly evolving.
The Data Model exists in every spreadsheet and there is only one per spreadsheet.
It enhances the power of pivot tables as it enables analysis to be performed across multiple tables of data. (Without using VLOOKUP()).
These tables can be held inside the spreadsheet or “adjacent” to the spreadsheet, in the computer’s memory. Holding the data adjacent to the spreadsheet can save space and allows the volume of data to easily exceed the spreadsheet’s one million row limit.
The tables that form the model can be generated within the spreadsheet or originated from a large number of other sources including other Excel files. The ability to create and maintain tables locally provides for substantial flexibility. Although a word of caution, this can give rise to one of the standard analysis problems where a proliferation of local spreadsheets can give rise to many variations of the truth.
In the past, one could use the Get External Data functionality to bring data into the spreadsheet. Now, whilst building a “Connection” to an external database there is a point when you are asked whether the data is to be imported into a table or a pivot table/chart or whether you only want to create the connection and whether the data should be added to the data model.
With version 2016, there is an additional, adjacent set of buttons on the Data ribbon, called Get and Transform (previously an add-in called Power Query or previous to that Data Explorer) that enables one to not only create the connection for importing data but also to transform it in a myriad of ways before it is loaded to the spreadsheet/ data model. As with the Get External Data there is an option to load the data to the data model without importing it into the spreadsheet. Note that for the moment, this tool can only access data within the current Data Model if it is also held in the spreadsheet or is the result of another “Power Query” query.
Managing the Data Model is carried out via the Power Pivot add-in. When Power Pivot is launched one will see individual tabs for each of the tables held within the spreadsheets data model including those not held physically in the spreadsheet.
PowerPivot provides facilities for importing data directly into the data model and does not add the data as a table within the spreadsheet. Instead it makes it available for analysis using pivot tables. Although one can’t make cell like amendments to that data there are very powerful ways to enhance the data with additional columns and calculations. This is where the very powerful DAX language makes its appearance.
Simply adding tables to the data model and then choosing elements from them in a pivot table or in a pivot chart doesn’t work. The answers will be wrong and often many orders of magnitude greater than expected.
Therefore, a key feature of the data model is the ability to link items in the tables to provide the answer. These are called relationships and an in depth look at creating useful relationships is a subject in itself.
The facilities for creating relationships include the previously mentioned importing tools ability to “automatically” recognise relationships as implied by the originating sources, PowerPivot’s facility to specify table relationships either using lists or using a diagram view and the Relationships button within the Pivot Table Analyze tab. In addition, the DAX language can be used to create links where simply highlighting a column in each of two tables isn’t flexible enough.
Excel is used for many, many diverse tasks. The addition of the Data Model and its toolkit significantly enhances its capabilities. Particularly, but not only, in its use for data analysis. There has always be more than one right way to do things within Excel and now there are even more.
Many thanks to Jonathan for sharing his insights. If you're interested in more data model-related thinking and learning more about Power Query, check out Rory Neary's blog posts on the Excel Community.
Previous post - VBA case study: TimestampsTOTW index Next post - Basic statistics functions
This blog is brought to you by the Excel Community where you can find additional blogs, extended articles and webinar recordings on a variety of Excel related topics. In addition to live training events, Excel Community members have access to a full suite of online training modules from Excel with Business. There is also an online forum where you can ask questions and share ideas with other community members.