Understanding your data in Tableau – A step by step guide

Tableau is a great tool for creating charts at impressive speed. But with this trend towards fast analytics and the focus on impressive visuals, checking the data, and with it accuracy, is sometimes neglected. Fundamentally, you need to understand your data in order to make an effective dashboard.

Right at the start of training the team at Operation Fistula in Tableau, we used a #MakeoverMonday dataset on Nike factories. Of course the team focused on the variable showing the proportion of women working in the factories. When they presented their visualisations at the end of our hour, I quickly realised two things:

  1. They were presenting the numbers at the country level, although the data was at the factory level
  2. They were presenting the sum of this measure, even though it was a percentage

Taken together, these two issues meant that the visualisations were obviously not showing a meaningful or accurate picture of the situation.
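To make the problem concrete, here is a minimal Python sketch of what goes wrong when a factory-level percentage is summed at the country level. The rows, the field names and the idea of weighting by number of workers are my own assumptions for illustration; they are not taken from the actual Nike dataset.

```python
# Hypothetical factory-level rows: (country, workers, percent_women).
factories = [
    ("Vietnam", 1000, 80.0),
    ("Vietnam", 3000, 60.0),
    ("China", 2000, 50.0),
]

def naive_sum(rows):
    """Add up the percentage per country, as SUM() on the field would."""
    totals = {}
    for country, _, pct_women in rows:
        totals[country] = totals.get(country, 0.0) + pct_women
    return totals

def country_percent_women(rows):
    """Aggregate to country level by weighting each factory by its workers."""
    women = {}
    workers = {}
    for country, n_workers, pct_women in rows:
        women[country] = women.get(country, 0.0) + n_workers * pct_women / 100
        workers[country] = workers.get(country, 0) + n_workers
    return {c: 100 * women[c] / workers[c] for c in workers}

summed = naive_sum(factories)
weighted = country_percent_women(factories)
print(summed["Vietnam"])    # 140.0, which is not a valid percentage
print(weighted["Vietnam"])  # 65.0
```

Whether a weighted average, a plain average or the factory-level view is the right choice depends on the question you are asking; the point is only that a sum of percentages is not meaningful.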

To help avoid similar mistakes in the future and approach the data in a more systematic manner, I set up a checklist that the team can work through in order to arrive at a visualisation that is an accurate reflection of the data while also being an effective tool of communication. Apart from the first step (look at the data preview or Excel file), all steps should be completed using charts made in Tableau. This is far more effective than trying to figure something out by looking at a table of data.

I hope this checklist can be of use to others as well. I would love to hear from you if you have any steps that you would add to this list!

Steps to building a Tableau visualisation

Step 1 – Familiarise yourself with your data

First look at the raw data to get a general overview of the fields. Sometimes the dataset is so small you can understand it with one glance. Then use visual analytics in Tableau to answer the following questions:

  • How many rows of data do you have?
  • What is the lowest level of detail? I.e. what does Number of records count?
  • Are there duplicate entries?
  • What are the members of the different dimensions?
  • If there are several levels of detail, how do they relate? Are there hierarchies? (E.g. category and subcategory)
  • What do numeric fields mean? Are they percentages or whole numbers?
  • Are all fields assigned their correct data type?
  • Are all fields correctly classified as dimensions or measures?
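The checklist is meant to be worked through with charts in Tableau, but if you want a rough cross-check outside of it, a few of the questions can be sketched in plain Python. The rows and field names below are made up for illustration, and the helper functions are hypothetical, not part of any library.

```python
from collections import Counter

# Hypothetical rows standing in for a small dataset; field names are made up.
rows = [
    {"factory": "A", "country": "Vietnam", "pct_women": "80"},
    {"factory": "B", "country": "Vietnam", "pct_women": "60"},
    {"factory": "B", "country": "Vietnam", "pct_women": "60"},  # duplicate entry
    {"factory": "C", "country": "China", "pct_women": "50"},
]

def profile(rows, key_field):
    """Return the row count, duplicated keys and the members of every field."""
    key_counts = Counter(row[key_field] for row in rows)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)
    members = {}
    for row in rows:
        for field, value in row.items():
            members.setdefault(field, set()).add(value)
    return {"row_count": len(rows), "duplicates": duplicates, "members": members}

def looks_numeric(values):
    """True if every value parses as a number, hinting that the field is a measure."""
    try:
        for value in values:
            float(value)
    except ValueError:
        return False
    return True

summary = profile(rows, key_field="factory")
print(summary["row_count"])                            # 4
print(summary["duplicates"])                           # ['B']
print(sorted(summary["members"]["country"]))           # ['China', 'Vietnam']
print(looks_numeric(summary["members"]["pct_women"]))  # True
```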

Step 2 – Start asking questions

Using Tableau, explore the data to find first insights. There are likely hundreds of ways you can combine the fields in the data, and it might take some time until you find a combination that is meaningful. Some of the things you can do are:  

  • Break down your continuous measures with the different dimensions
  • Look for correlations between your continuous measures by creating scatterplots
  • Identify outliers
  • Look for trends over time
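In Tableau you would answer these questions with scatterplots and other charts. As a rough illustration of what "correlation" and "outlier" mean numerically, here is a small Python sketch. The 1.5 × IQR rule for outliers is one common convention that I am assuming here, not something the checklist prescribes, and the numbers are invented.

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def iqr_outliers(values):
    """Values more than 1.5 * IQR below the first or above the third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]

# Invented example data.
correlation = pearson([1, 2, 3, 4], [2, 4, 6, 8])
outliers = iqr_outliers([10, 12, 11, 13, 12, 50])
print(round(correlation, 6))  # 1.0
print(outliers)               # [50]
```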

Step 3 – Build your visualisation

Once you have identified your story, think about how to best communicate it. Your story can be the answer to a very specific question you can ask about the data or it can guide the user in answering their own questions:

  • What are you trying to say with your visualisation? What is your message?
  • Are you creating an exploratory or explanatory visualisation?
  • Do you need just one chart or will more charts add to your story?
  • What chart types are best suited for your data and story?
  • How can colours help you communicate your insights?
  • What is the best dashboard layout for your message?
  • Does your dashboard need additional explanation?
  • What is a concise, informative title that expresses the purpose of the dashboard?

Change your mark colour when using an action filter in Tableau

What we want to do

I recently had the chance to get quite colourful for a dashboard that we are developing at Operation Fistula. The landing page gives the user an overview of the issue areas different entrepreneurs work in, and where in the world they are active. An action filter connects the two sheets so that we can filter the map to just one issue area.

[Screenshot: the landing page with the map and the issue area bar chart]

In order to make it clear that this is what is happening, I wanted the marks on the map to change to the colour of the issue area when it is filtered, so that it is clear that the green dots on the map correspond to the green bar below.

[Screenshot: the map filtered to one issue area, with the marks in the colour of the selected bar]

How to do it

We need to create a calculation to get our action filter to behave this way. What we are telling Tableau is basically this: when just one of the issue areas is selected, return the one value associated with that filter; when more than one issue area is selected, show the default black.

In a calculation it looks like this:

[Screenshot of the calculated field]

Let’s take that apart.

1. We are creating an IF statement that tests a condition and returns one of two values: the string ‘All’ or the string that corresponds to the issue area.

2. When there is more than one issue area, we want to return the value ‘All’ (this could be any other string, though). The view is broken down to the country level, and each country has a different number of issue areas associated with it. So we need to use a level of detail calculation to exclude the [Country] field from this part of the calculation; otherwise, countries that have only one issue area associated with them would not return ‘All’. By excluding [Country], Tableau now looks at all of the issue areas in the view and counts each unique issue area once. At the outset this is 23, as this is the total number of issue areas we have in the data.

3. When we apply the action filter, the whole view will only show one issue area. At this point the first condition of the IF statement is no longer TRUE and the ELSE branch takes over. Now that we only have one issue area, the calculated field will return the [Issue area and group] field.
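The logic of the three points above can be mimicked outside Tableau. The following Python sketch is not the Tableau calculation itself; it only reproduces the behaviour described, with invented countries and issue areas. Counting distinct issue areas across all rows in the view, regardless of country, plays the role of excluding [Country].

```python
# Hypothetical rows of (country, issue area); the names are illustrative only.
data = [
    ("Kenya", "Health"),
    ("Kenya", "Education"),
    ("Nepal", "Health"),  # Nepal has only one issue area
]

def colour_values(rows_in_view):
    """Return the colour value for every row that is currently in the view."""
    # Distinct count over the whole view, ignoring the country of each row.
    distinct_issue_areas = {issue_area for _, issue_area in rows_in_view}
    if len(distinct_issue_areas) > 1:
        return ["All" for _ in rows_in_view]
    return [issue_area for _, issue_area in rows_in_view]

# No filter applied: every mark gets 'All', including Nepal's.
unfiltered = colour_values(data)
# Action filter applied: only one issue area is left in the view.
filtered = colour_values([row for row in data if row[1] == "Health"])
print(unfiltered)  # ['All', 'All', 'All']
print(filtered)    # ['Health', 'Health']
```

If the count were taken per country instead, Nepal would return 'Health' even without a filter, which is the behaviour the exclusion is there to prevent.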

So much for the calculation. The next steps are needed to complete the dashboard action setup.

1. Place the newly created calculation on the colour area of the marks card on the map sheet.

2. Place both your sheets onto a dashboard.

3. Set up your action filter so that when you select a bar the map is filtered to just the category that you have selected.

Now, you should see that the map colour has changed, but it will probably not be the same as the colour of the bar that you selected. That is because the colour assignment is tied to a specific field, and while you are referencing the original field [Issue area and group] in your calculation, it is still a new field.

So you need to go through and assign the same colours to the new field as well. This is a bit tricky, as you need to have the action filter activated and need to have the reference colour available to know what you are setting it to. There might be better ways of doing this, and if you can think of one please leave a comment, but I went with the following steps: 

1. On the dashboard, reveal the colour legend for the map. 

2. Select your first bar to activate the filter; the colour legend will change to your selection.

3. Double-click the legend, then double-click the dimension member and use the colour picker to get the right colour from your bar. 

4. Repeat for every filter selection. 

5. Delete the colour legend from your dashboard.

Hopefully you won’t normally be using as many colours as I am, so this should be a fairly efficient process. The caveat, of course, is that if a new dimension member is added to your data, you will need to match its colour manually.

Tableau Conference Recap – #data17

It’s been a week now since my first Tableau conference and I have arrived back in reality. I was lucky enough to be invited along to represent my current placement company Exasol at this year’s Tableau on Tour Conference at Tobacco Dock in East London. So I got to spend three days demonstrating the power of Exasol, meeting great people from the Tableau community and attending breakout sessions.

Before the conference I had extended my original Makeover Monday viz for week 16 (see original here) on English prescription data so that it could fully show off the quick insights possible when connecting to your data directly from Exasol in Tableau. This was on display live for anyone interested to interact with. As the Exasol expert, Johannes Meier led the on-the-spot demos, analysing a dataset of over 3 billion rows in Tableau and answering technical questions.

On both Tuesday and Wednesday I was invited to the Alteryx area to demonstrate the newly improved integration (to be released for general download in the next few weeks) of Exasol and Alteryx to conference visitors. In-DB tools now support a connection to Exasol, which will be great news to those Tableau analysts who use Alteryx to clean their data and perform more complex analysis to then show in Tableau. If you have ever tried to import data from Exasol to Alteryx with the import tool you will know that for large data volumes this can be a painfully slow process. The new In-DB support brings this import down to just a few seconds.

Then of course I had the opportunity to visit keynotes and sessions to hear more about the ways in which others use Tableau and the sort of problems they face with their data, as well as to refresh and develop my technical skills. I want to highlight three of these in particular.

Of course I had to see my former team leader at Leicestershire County Council (and the person who helped me discover both Tableau and the Data School), Rob Radburn, talk about the way in which he and his 8-year-old son use Tableau. It was a great talk to end the conference, which I am sure many regretted having to miss due to the late time slot. Rob reflected on what we can learn from people with new approaches to things that are familiar to us.

Another one I couldn’t miss was the talk by Inmarsat’s Laura Schofield, who actually started her Tableau journey with the Data School. A little over a year ago, in our second week at the Data School, I was the project leader for our very first client project, and it was an intense week for us, trying to wrap our heads around the complex data that Inmarsat presented us with. While I have had the opportunity to catch up with Laura a few times at Tableau User Groups since then, it was great to get a closer insight into the journey that Inmarsat has been on in the last year and to see how a very complex analysis can be made so easy with Tableau that you can discover it by accident.

And my personal favourite was the Wednesday morning keynote speech by David Spiegelhalter, who is a fantastic speaker with a great sense of humour and who touched on a lot of really important topics. As a tool for self-service analytics Tableau is sometimes a little removed from the academic side of things and he managed to convey these topics in approachable ways that I am sure motivated not just me to revisit some of my stats sessions from university. As a psychology nerd I also appreciated his references to research into heuristics and biases. Basically, if you mention Tversky in your presentation you have me sold.

Thanks again to Exasol for giving me this opportunity for development at this year’s Tableau Conference on Tour in London. I am sure it won’t be my last.  

Getting started with Exasol for Tableau and Alteryx users – Connecting to Tableau

Using Exasol allows us to process large amounts of data at very high speeds and it works great in combination with Tableau. When you open Tableau and navigate to the connection types in the welcome view you will see that Tableau has a native connector (called EXASOL in v10.2 and EXASolution in previous versions). Once selected, a dialogue will prompt you to enter your credentials.

Tableau connection
  • Server: This is the IP address of the database
  • Port: This will usually be 8563 for Exasol
  • Username: The user name that was assigned by your database manager or that you specified when installing your personal version of Exasol
  • Password: The password associated with this user name. Again, this will either have been given to you when you received information about your schema, or you will have specified it yourself during installation.
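If the connection fails, it can help to first rule out a wrong server address or port before looking at drivers and credentials. The following Python sketch only checks whether a TCP connection to the server and port can be opened; it does not log in to Exasol or test your user name and password. The function name is hypothetical, and the default port is the 8563 mentioned above.

```python
import socket

def can_reach(host, port=8563, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

You would call it with the value you enter in the Server field, for example can_reach("192.0.2.10"), where the address is a placeholder and not a real database.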

If you are using this connection for the first time, you will probably receive an error message informing you that you haven’t got the necessary drivers to establish the connection. The message includes a link to a page on the Exasol website where you can download the driver.

Windows users: Navigate to the section “Download ODBC Driver” and select the file called EXASOL_ODBC-6.1.3-x86_64.msi or EXASOL_ODBC-6.1.3-x86.msi. The version number in this case is 6.1.3, but there are other versions available as well.

Mac users: You are looking for a file called EXASOL_ODBC-6.1.0-MacOS.dmg or EXASOL_ODBC-6.2.rc1-MacOS.dmg. You will need to use the menu on the left of the page to navigate to either version 6.1.0 or 6.2.rc1 in order to find these files. (At the time of writing these were the only two versions available for Mac, but this might change in the future.)

Once installed, return to the connection page in Tableau and try again. If everything is entered correctly you should now be connected to the database and will be shown the typical data source screen in Tableau, where you can select the schema you would like to connect to and interact with the tables just as you would with any flat file.

If you are having issues connecting even when entering the correct credentials, or you cannot see any tables when connecting to the schema you are working with, you may not have been given sufficient permissions by your database administrator.

Exasol will speed up your processes in Tableau and provide better performance than you will have when using flat files. However, aspects of Tableau’s performance are still dictated by rendering speeds within Tableau. Some views, for instance when creating a map with many individual locations, might still take a while to set up.