Tableau: Data Cleaning for Geographic Maps

Data cleaning is a major part of any analytics or data-visualization project. If it is ignored, the result is inaccurate reporting and, in turn, suboptimal business decisions.

To that end, if you create a geographic map in Tableau, please check the accuracy of your data by going to:

Menu Bar > Map > Edit Locations

Let me give you some examples:

I had “State/Province” as the geographic role for my variable, but when I created a geographic map, it didn’t show anything for New York State! See Before:

[Screenshot: geographic map before data cleaning]

So what did I do?

I navigated to Menu Bar > Map > Edit Locations:

[Screenshot: Edit Locations dialog for State]

There, I matched the unrecognized value to the correct location, and that fixed it!

[Screenshot: corrected location in Tableau]

And After:

[Screenshot: geographic map after data cleaning]

Note that New York State is now lit up!
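The same kind of check can be sketched outside Tableau, before the data is loaded. The snippet below is only an illustration and not part of Tableau: the `KNOWN_STATES` set is a tiny made-up subset, the sample values are hypothetical, and `unmatched_locations` is a helper name invented for this example. It flags values that would probably need attention in the Edit Locations dialog.

```python
# Hypothetical pre-check, run before the data reaches Tableau.
# KNOWN_STATES is a deliberately tiny illustrative subset, not a full list.
KNOWN_STATES = {"New York", "Texas", "Washington"}


def unmatched_locations(values, known=KNOWN_STATES):
    """Return distinct values that match no known name.

    Matching ignores letter case and surrounding whitespace only.
    """
    known_normalized = {name.casefold() for name in known}
    return sorted(
        {v for v in values if v.strip().casefold() not in known_normalized}
    )


# Made-up sample column; "new-york" and "NY State" are not recognized.
sample = ["New York", "new-york", "Texas ", "Washington", "NY State"]
flagged = unmatched_locations(sample)
print(flagged)
```

Anything the function returns is a candidate for manual correction, either in the source data or through Map > Edit Locations.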

In the past, I’ve also entered Latitude & Longitude when needed. That was when Tableau could not recognize a few US cities and reported them as “ambiguous”, so I entered the Latitude & Longitude myself to clean the data:

[Screenshot: Edit Locations dialog for City]


In this post, I described how to check the data accuracy of a Tableau geographic map.

Visualizing MapReduce Algorithm with an Example: Finding Max Temperature

Problem Statement: Find the maximum temperature for each city in the input data.

Step 1: Input Files

File 1:

New-york, 25

Seattle, 21

New-york, 28

Dallas, 35

File 2:

New-york, 20

Seattle, 21

Seattle, 22

Dallas, 23

File 3:

New-york, 31

Seattle, 33

Dallas, 30

Dallas, 19

Step 2: Map Function

Let’s say Map1, Map2 & Map3 run on File 1, File 2 & File 3 in parallel. Here is their output:

(Note how each map outputs “Key – Value” pairs and already keeps only the highest temperature per city within its own file. The key is used later by the reduce function to do a “group by”.)

Map 1:

Seattle, 21

New-york, 28

Dallas, 35

Map 2:

New-york, 20

Seattle, 22

Dallas, 23

Map 3:

New-york, 31

Seattle, 33

Dallas, 30
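The map step above can be sketched in a few lines of plain Python. This is a minimal single-machine illustration, not a real MapReduce job: the function name `map_local_max` and the `FILES` dictionary are made up for this example, and the three files are processed one after another rather than in parallel. In a framework such as Hadoop, this kind of per-file pre-aggregation is typically done by a combiner that runs alongside the mapper.

```python
from typing import Dict, List, Tuple

# Input records copied from File 1, File 2 and File 3 above.
FILES: Dict[str, List[Tuple[str, int]]] = {
    "File 1": [("New-york", 25), ("Seattle", 21), ("New-york", 28), ("Dallas", 35)],
    "File 2": [("New-york", 20), ("Seattle", 21), ("Seattle", 22), ("Dallas", 23)],
    "File 3": [("New-york", 31), ("Seattle", 33), ("Dallas", 30), ("Dallas", 19)],
}


def map_local_max(records: List[Tuple[str, int]]) -> Dict[str, int]:
    """Emit one (city, temperature) pair per city: the highest in this file."""
    local_max: Dict[str, int] = {}
    for city, temp in records:
        if city not in local_max or temp > local_max[city]:
            local_max[city] = temp
    return local_max


# Each file is handled independently, which is what allows parallel execution.
map_outputs = {name: map_local_max(records) for name, records in FILES.items()}
for name, output in map_outputs.items():
    print(name, output)
```

Because each call only looks at its own file, the three calls do not depend on each other and could run on three different machines.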

Step 3: Reduce Function

The Reduce function takes the output of Map1, Map2 & Map3 as its input, groups it by city, and keeps the maximum temperature for each city:

New-york, 31

Seattle, 33

Dallas, 35
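The reduce step can be sketched the same way. Again, this is a simplified illustration under stated assumptions: `shuffle` and `reduce_max` are hypothetical names chosen for this example, and the grouping that a real framework performs between map and reduce is imitated here with an ordinary dictionary.

```python
from collections import defaultdict

# Map outputs copied from Map 1, Map 2 and Map 3 above.
MAP_OUTPUTS = [
    [("Seattle", 21), ("New-york", 28), ("Dallas", 35)],
    [("New-york", 20), ("Seattle", 22), ("Dallas", 23)],
    [("New-york", 31), ("Seattle", 33), ("Dallas", 30)],
]


def shuffle(map_outputs):
    """Group all values by key: the 'group by' that happens before reduce."""
    grouped = defaultdict(list)
    for output in map_outputs:
        for city, temp in output:
            grouped[city].append(temp)
    return grouped


def reduce_max(city, temps):
    """Collapse all temperatures seen for one city into their maximum."""
    return city, max(temps)


result = dict(reduce_max(city, temps) for city, temps in shuffle(MAP_OUTPUTS).items())
print(result)
```

The reduce function only ever sees the values for a single key, so different cities could be reduced on different machines.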


In this post, we visualized the MapReduce programming model with an example: finding the maximum temperature for a city. As you can imagine, you can extend this post to visualize:

1) Find Minimum Temperature for a city.

2) In this post, the key was City, but you could substitute any other relevant real-world entity to solve similar-looking problems.
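Both extensions can be sketched by making the key and the reducer pluggable. The snippet below is a simplified, single-process illustration; `run_job`, `key_of`, `value_of` and `reducer` are hypothetical names chosen for this example and do not belong to any MapReduce library.

```python
from collections import defaultdict

# Input records copied from File 1, File 2 and File 3 above.
FILES = [
    [("New-york", 25), ("Seattle", 21), ("New-york", 28), ("Dallas", 35)],
    [("New-york", 20), ("Seattle", 21), ("Seattle", 22), ("Dallas", 23)],
    [("New-york", 31), ("Seattle", 33), ("Dallas", 30), ("Dallas", 19)],
]


def run_job(files, reducer, key_of=lambda r: r[0], value_of=lambda r: r[1]):
    """Group every record by key_of(record), then reduce each group."""
    grouped = defaultdict(list)
    for records in files:
        for record in records:
            grouped[key_of(record)].append(value_of(record))
    return {key: reducer(values) for key, values in grouped.items()}


# Extension 1: swap the reducer to get the minimum instead of the maximum.
min_temps = run_job(FILES, reducer=min)
max_temps = run_job(FILES, reducer=max)
print(min_temps)
print(max_temps)
```

For the second extension, passing a different `key_of` function (for example, one that returns a region or a product ID from each record) groups the data by that entity instead of by city.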

I hope this helps.

Related Articles:

Visualizing MapReduce Algorithm with WordCount Example

Seven Interesting Google Projects that a Data Professional may not have heard about:

Here’s the list:

1. Google Refine

2. Google Prediction API

3. Google Trends

4. Google Chart Tools

5. Google Big Query

6. Google Correlate

7. Google Fusion Tables

Note: These projects may not be ready for use in your production environment, as some of them are in Beta/Experimental stages and their support/development may be discontinued in the future.

Thanks: I thought of writing this blog post after a discussion I had with Parth Acharya about Google and its projects for Data Professionals. He pointed me to some of the most interesting samples that used Google Fusion Tables, and here’s one of his blog posts on a related topic: Google Fusion Table & Data Visualization

Mapping “Facebook Page Likes vs Country” using PowerView in Office 2013

Just a quick note that you can quickly create maps with PowerView in Office 2013. I created this one in 2 minutes:

Facebook Page likes VS Country

This seems like a great way to visualize where your fans are from. In my case, most of them are from India, so one actionable insight would be to schedule posts based on the time zone in India. I can imagine that such reports could be very helpful to brands that have a sizable fan following on Facebook.

Here’s the screenshot:

[Screenshot: map in Power View, Excel 2013, social media analytics]

Thanks to the following blog posts for inspiration:

1. Google Fusion Table & Data Visualization (He used Google Fusion Tables, I used PowerView!)

2. Creating Maps in Excel 2013 using Power View