Back to basics: continuous Vs. Discrete variables and their importance in Data Visualization.

Take a look at the following chart, do you see any issues with it?

month trend chart line chart string to date

Notice that the month values are shown as “distinct” values instead of shown as a “continuous” values and it misleads the person looking at the chart.  Agree? Great! You already know based on your instincts what continuous and discrete values are, it’s just that we will need to label what you already know.

In the example used above, the “Date & Time” shown as a “Sales Date” is a continuous value since you can’t never say the “Exact” time that the event occurred…1/1/2008 22 hours, 15 minutes, 7 seconds, 5 milliseconds…and it goes on…it’s continuous.

But let’s say you wanted to see Number of Units Sold Vs Product Name. now that’s countable, isn’t it? You can say that we sold 150 units of Product X and 250 units of product Y. In this case, Units sold becomes discrete value.

The chart shown above was treating Sales Date as discrete values and hence causing confusion…let’s fix it since now you the difference between continuous and discrete variables:

Statistics Discrete Continuos Variable Data Visualization

Conclusion:

To develop effective data visualizations, it’s important to understand the data types of your data. In this post, you saw the difference between continuous and discrete variables and their importance in data visualization.

SQL Server Analysis services – How to set the order by attribute sort key?

Problem:

How to sort the dimension attribute by something other than the key and name column? How do you set the “OrderBy” property?

Example: You have created an Inventory age buckets 1-50,51-100,101-150 and so if a business user uses this dimension attribute then the sorting won’t be logical. It would be 1-50, 101-150,51-100 – so how to show the buckets in the logical order?

Solution:

1. make sure that the table/view that you are bringing in has the sort key.

Example:

1 SSAS Attribute order by sort key2. Now, switch to SSAS and open your dimension. I am assuming that you’ve already configured your data source views and you are already bringing in these columns in the dimension:

Dim Inventory SSAS SSIS VIEW Data source VIEW

3. Let’s start with hiding Aging Bucket Sort key so that it’s not visible to user. Change the AttributeHierarchyVisible to False

4. Now, switch to Attribute Relationships – Right Click on Aging Bucket and click on New Attribute Relationship. And set the attribute relanship between Aging bucket and Aging Bucket Sort Key

Attribute Relationships SSAS

And you should see something like this in your attribute relationship section:

SSAS Attribute Relationship Sort Key

5. Now, one more thing to configure. Go back to dimension structure section. Open the properties section for the Aging Bucket Attribute and change the OrderBy property to AttributeKey. Also, change the orderByAttribute property to Aging Bucket Sort Key (in your case, choose the sort key that you have)

SSAS Order Sort by attribute property

That’s it, after you process the model then you should see the attribute being sorted based on the sort key that you had.

Conclusion:

In this post, you saw how to configure sort/order property of a dimension attribute.

How to train your users to create their own Business Intelligence reports. #2 of 5: Pre Training Prep

In part 1, I wrote about why is it important to enable business users to create their own BI reports. In this post, part 2, I am going to share the pre-training preparations before you start training. I’ve classified into three categories: 1) Data 2) Tools 3) Culture. Let’s go through them:

1. Data

Data assets needs to be published before you start training. It should be a business friendly analytic layer on top of your data sources. It could something as simple as a Power Pivot Model to a SQL server analysis cube. As long as you have an analytics layer – you’re good! Do NOT grant access to transactional systems. I’ve seen a business analyst who was considered the go-to-business-expert of a system having issues trying to create reports using the system’s relational data source – He had challenge trying to get his head around multiple tables, keys, unfriendly field names. He got something up & running but it was hard for him! What’s the lesson here? Try to make it as easy as possible for business users to use data – create an analytic layer over your data sources.

Apart from this, Data Integrity is very important! If the users don’t trust data, they are not going to use it. Invite selected set of business users to test the integrity of the data before you publish the data assets.

Also, the analytic layer that you developed should perform well. if it takes a minute to return fairly simple result, then you will have challenge driving adoption.

2. Tools

tools Business Intelligence reporting dashboarding

What tools would you use to teach business users reporting? Of course, Excel is a top choice since many of the users are already familiar with using excel. Also, Show them a Power View using YOUR data – that may get them excited enough to learn Power View.

How about SSRS report builder & performance point dashboard designer? This is mainly targeted for IT developers so it won’t be great idea to train business users using this tools.

What about Power Pivot/ Power Query? from #1, business users face challenge trying to analyze relational data sources, so test your audience to see Power Pivot/ Power Query is a fit or not. It might work well if they want to combine from couple of spreadsheets with IT’s data assets, that would work! But don’t expect business users to spend time trying to learn Power Pivot & Power Query to analyze data. Again, test it with your audience, see if they pick it up, some of your users may pick it up, great! But usually, you’ll have to create data assets (cubes/power pivot models/tabular models) to reach the masses!

3. Culture

Data Driven Culture Business Intelligence

Image Source: Economist & Tableau

The more time that you spend understanding the culture, the more successful you are going to get in training users. It’s because the things you’ll learn while trying to understanding an organization’s culture will be useful in content creation, delivery mechanisms, target audience selection & communicating business value of data driven decisions.

let’s step back. what is culture? It is a characteristics of a group of people. What characteristics are you trying to find before you start training? Try answering following questions:

a. Where are the “analysts”?

- Are they part of IT teams? Who requests reports from “IT analysts”. (they are your target audience!)

- Are they part of business units? (Great! Make them efficient by removing manual data copy-pasting from their to-do list)

- Do they report to CxO’s/presidents/VP’s? (Great! request examples of data driven decisions)

b. Are there examples of value generated using data driven decisions?

- understanding how business uses data to generate value is very important. you will create content using these examples!

c. How comfortable are users learning new technology?

- Have they shown resistance in learning new systems?

- Are they used to receiving ready-to-consume reports! (don’t expect them to change their behavior. But figure out the person creating reports for them. Train them! Make them better)

In summary, understand the culture of the organization, it would help you prepare before you start the training.

Conclusion:

In this post, we saw three pre training preparations (1. Data 2. Tool Selection 3. Culture) before you start training users.

Questions Power Users Ask about Excel: #3 of N

In this series, I intend to document common questions asked by Power users about Excel connected to SSAS cubes (or data warehouse) after they go beyond the basic stage of understanding & using Row labels, column labels, report filter & values in Pivot Tables. This post is #3 of N:

a. How to remove Grand Totals & Sub Totals?

1. Select the Pivot Table.

2. From the toolbar, go to Pivot Table Tools > Design > Sub Totals > Do not Show Sub Totals.

you can also choose to remove it just for rows or columns. it depends on your requirement on how to layout data.

Pivot Tables Sub Totals How to remove

Similarly, you can remove Grand Totals:

Pivot Tables Turn off Grand TotalsAlternatively, you can remove grant totals from Pivot Table options:

Grant Totals Pivot Table Options

There’s also an alternate method for removing subtotals. Put your mouse over the hierarchy level for which you do not need subtotals. Right click > remove subtotals “field name”. But you’ll have to do that for all hierarchy levels if you need to remove all subtotals.

b. How to configure Automatic Data Refresh?

A common question asked by power user is how to make sure that the excel file is pulling the latest data from the cube? Good news for them is that Excel files that are configured to connect to a data source like SSAS cube can be configured to automatically refresh. Here’s how:

1. From the toolbar, Go to Data > Connections

Excel Data Connections Cube Properties2. Select the connection that interested for automatic data refresh > properties

Excel SSAS Cube Property connection3. From here, you can configure the file to do an automatic data refresh every xyz minute and/or configure the file to refresh data every time you open it.

Excel Data Refresh Cubeclick ok when you’re done and close the workbook connections after you’ve configured the data sources that you needed for automatic data refresh.

Conclusion:

In this post, we saw how to remove grand totals & subtotals and, how to configure the automatic data refresh.

In previous articles we saw:

#1: How to sort data? How to add slicers? How to change Pivot table Layout

#2: How to add calculated measures?

How to strip double quotes while importing data from CSV or TSV using SSMS Import Data wizard OR SSIS?

Long Title! Let me explain. This post will help you solve following problem if you run into it:

1) You are using SSMS Import data wizard to load data from a comma (or tab) separated value (CSV/TSV) file into SQL Server Table & you find that your source data values has double quotes and so you want to strip them before loading to destination table.

2) You are using SSIS to load data from a CSV/TSV file into SQL Server Table & you want to strip the double quotes in source table fields before you load the data to destination table.

Double Quotes CSV file SSMS SSIS LoadSolution:

1. After you’ve configured the Flat File connection. you’ll reach to a point where you’ll see “Flat File Connection Manager” in SSIS. Or in the SQL Server Import & Export data wizard, you’ll see a dialog box to configure flat file connections.

2. In the Text Qualifier, enter

Strip Double Quotes SSMS SSIS Import Wizard3. Make sure to Preview the data to verify that the double quotes around data fields have been trimmed.

4. That’s it! You’ve successfully configured the flat file connection manager to strip double quotes.

What is the purpose of creating Tables & Graphs?

Knowing why we do what we do is important. Stephen Few lists four reason for creating Tables & Graphs in his book “Show me the number”. I really liked them so I am posting it here for your reference:

  1. it helps us communicate. It helps present information to others.
  2. it helps us analyze data. it helps us find the insights in the data.
  3. It helps us Monitor Performance. It helps us keep track information about performance e.g. Sales Performance, Speed of Manufacturing, etc.
  4. It helps us Plan. It helps us predict and prepare for the future.

Slicing/Dicing data in multiple Excel Pivot Tables with ONE slicer:

Background:

This week I created a couple of Excel Dashboard’s for a project that I am working on. As a part of the that, I decided to include slicers for these Excel Dashboards. And the reason I did that was because data in multiple pivot tables needed to be sliced with one click. In this blog post, I’ll show you steps to connect an Excel slicer to multiple pivot tables:

Steps for Excel 2010:

1. Based on your requirements, decide if multiple items on a dashboard (excel sheet) need to be affected with one click

2. Now, for the purpose of this blog post, let’s assume that you have two Pivot Tables like shown below:

two excel reports slicer 1

Note: the Two Excel Pivot Tables are created by connecting to same cube. And the slicer comes from the same cube.

3. And what if you need to slice data in both pivot tables by one common slicer? Let’s say we want to slice the data in the two pivot tables that we have by Product Category – here’s what you do (assuming that the tables are related in your data source and slicing makes sense)

4. click on any of the pivot table and you’ll see PivotChart Tools in the toolbar

pivot table charts tools5. Switch to Analyze Tab & click on insert slicer & Select the field that you want as slicer & click OK

pivot table insert slicers6. you’ll see a slicer on your sheet now:

slicer on pivot tables

7. Let’s format the report to make it look better:

#1: move the slicer such that it does not overlap any of the pivot tables.

#2: In this case, I want to move the slicer to the top of the sheet and change the slicer to have 5 columns. Here’s how you can do that:

select the slicer > from the toolbar > slicer tools > column

slicer tools option excel

#3 I also changed the slicer style from the slicer tools to match its color palette with that of the Pivot Tables:

8. Note that the slicer is connected to just ONE pivot table. Let’s connect it to both pivot tables

9. From Slicer Tools options > Pivot Table connections > check all pivot tables that you want the slicer to be connected to:

pivot table connections excel

10. Done! Test your slicers, with one click you should be able to see that the data gets sliced in both pivot tables:

#1: Sliced by Bikes

excel pivot analysis data 1

#2: Sliced by Accessories

excel pivot analysis data 2That’s about it for this post. Your comments are very welcome!

Related Posts:

Comparing Slicers in Excel 2010 to Standard PivotTable Filters

 

How to change the layout of the Excel Pivot Tables?

Pivot Tables are powerful mechanism to analyze data. And as you may know, it’s really popular among business professionals that use excel for their day-to-day data analysis needs. With that context, Here’s the blog post on Pivot Tables Layout:

Situation:

I got a question from my client who had a specific need on how she wanted her Pivot Tables to look. Here’s how the Pivot Table should look like:

excel pivot table power pivotSolution:

I am using Excel 2010 Pro Plus to demo the solution, please use this solution to adapt to use it with other versions of excel.

Without formatting and changing the layout, the excel pivot table looked like this:

step 1 excel pivot table from power pivot

Now, let’s work step by step to meet the requirement:

Step 1: click somewhere on the Pivot Table and from the Toolbar, Switch to the Design Tab under PivotTable Tools

Pivot Table Tools Design ExcelStep 2: Now here, Go to Report Layout > Show in Tabular Form

Pivot Table Layout Tabular Form

Step 3: As you might have noticed the “hierarchical” structure is now broken up into multiple structure getting us closer to meet the requirement.

half way pivot table layout

Step 4: In our requirement, you can see that it does not give the user to see the “+” or “-” (expand, collapse) buttons. So, let’s hide these buttons from the report from Step #3

Right click somewhere on the pivot table > PivotTable Options > Display tab > uncheck the box that says “Show expand/collapse buttons“:

pivot table hide expand collapse button

Step 5: so now it looks as follows:

excel pivot table power pivot

So as you can see, it meets our requirement now.

To recap, Here’s what we had to do:

#1: Change the Layout of the Pivot Table

#2: We saw how to hide the expand/collapse buttons too.

 

That’s about it for the post! Thanks for reading.

your comments are most welcome!

 

Resource: Introduction to Data Science by Prof Bill Howe, UW

Introduction to Data Science course taught by Bill Howe just started on coursera platform. Having studied the Data Intensive Computing in Cloud course at UW taught by Prof Bill Howe, I can say that this course would be great resource too!

Check it out: https://www.coursera.org/course/datasci

Introduction to Data Science

PASS Business Analytics Conference Keynote Day #2

Dr. Steven Levitt’s (Indiana Jones of economics & Author of Freakonomics) work involves finding insights from data. In the keynote, he shared some of the interesting & fun insights that he found from data.

One Example: Dr. Levitt: According to the data, It was 7 times more dangerous to sell crack in Chicago than it was being in combat in Iraq. https://twitter.com/markvsql/status/322707949158006786

He also talked about other insights that he found which could also be found in his book Freakonomics. After getting audience fascinated about what analyzing data can do – he moved to his real world experiences of analyzing data for businesses. And tied all these fascinating insights back to some tips he had for the audience. Here is a brief recap of the tips he shared:

> “Ideas don’t come out of the blue. Almost always ideas come out of the data” – Dr. Steven Levitt

> “You guys are the future. What you’re doing is the key to a business’ success or failure.”

> Experiment & Test Hypothesis using DATA

> Misconceptions can cripple you. Let the data speak, even when it might be difficult

> Most important people = who understand and know what to do with data, not those who pretend they know the answer.

> Dr. Levitt: without data any biz will be left behind, must experiment and accept failure

*Above text is linked to tweets.

That’s about it for this post. What do you think about the tips that Dr Levitt shared?