How to get descriptive statistics in Excel?


you are analyzing a dataset and before modeling/analyzing you need to generate descriptive statistics on a field. you have the data loaded in Excel and wondered if there’s a way to do that in Excel.


There’s an out of the box solution that will support your needs to generate descriptive statistics on a field. Here are the steps:

Note: for the purpose of this blog post, I am using Excel 2013 but data analysis toolpak is available in Excel 2007+.

1. Active “Data Analysis” toolpak.

Follow this steps:  File > Options > Add-ins > Manage: Excel Addins > “GO”

excel data analysis toolpak

2. make sure to check the “analysis toolpak” checkbox.

3. Now you should see a “data analysis” option under the “Data” pane:

Excel Data Analysis Descriptive Statistics

4. Now click on “Data Analysis” and select one of the following options:

Anova, Correlation, Covariance, Descriptive Statistics, Exponential Smoothing, F-Test Two-Sample for Variances, Fourier Analysis, Histogram, Moving Average, Random Number Generation, Rank and Percentile, Regression, Sampling, t-Test, z-Test.

in this case, let’s go with descriptive statistics but you can see that you can perform other tasks as well.

5. Once you click on the descriptive statistics, a dialog box will show up and you will have to enter some data like your input range to generate descriptive statistics. Once you have filled the data needed, click on OK and it should generate descriptive statistics for you in EXCEL!

I hope that helps!


In this post, we saw how to generate descriptive statistics in Microsoft Excel.

Author: Paras Doshi

Back to basics: Design your Business Intelligence system to have lowest level data even if it’s not asked!

Here’s a scenario:

A Business Intelligence (BI) system for Sales is being developed at a company. Here are the events that occur:

1) Based on the requirements, It is documented that the Business needs to analyze Sales numbers by product, month, customer & employee

2) While designing the system IT learns that the data is stored at each Invoice Level but since the requirements document doesn’t say anything about having details down to invoice level, they decide to aggregate data before bringing in their system.

3) They develop the BI system within the time frame and sends it to business for data validation.

4) Business Analysts starts looking at the BI system and finds some numbers that don’t look right for a few products and need to see Invoices for those products to make sure that the data is right so they ask IT to give them invoice level data.

5) IT realizes that even though business had not requested Invoice Level data explicitly but they do NEED the lowest level data! They realize it’s crucial to pass data validation. Also, they talk with their business analysts and found out that they may  sometimes need to drill down to lowest level data to find insights that may be hidden at the aggregate level.

6) so IT decides to re-work on their solution. This increases the timeline & budget set for the project. Not only that they have lost the opportunity to gain the confidence of business by missing the budget and timeline.

7) They learn to “Design BI system to have the lowest level data even if it’s not asked!” and decides to never make this mistake again in the future!

This concludes the post and it’s important to include lowest level data in your BI system even if it’s not explicitly requested – this will save you time & build your credibility as a Business Intelligence developer/architect.

Business Intelligene Dashboard for Quality Managers

Business Intelligene Dashboard for Quality Managers

Business Goal:

Need to understand the patterns in Quality test results data across all plants.


- The solution involved creating a Business Intelligence system that gathered data from multiple plants. I was involved in mentoring IT team, development and end-user training of a Business Intelligence Dashboard that used SQL server analysis services as it’s data source.

- Dashboard development involved multiple checkpoint meetings with business leaders since this was the first time they had a chance to visualize quality test results data consolidated from multiple plants. Since they were new to data visualization, I used to prepare in advance and create 3-4 relevant visualization templates to kick off meetings.


(it is intended to look generic since I can’t discuss details. Also, drill down capabilities had been added to the dashboard to go down to the lowest granularity if needed)

Quality Test Results Dashboard

Back to basics: continuous Vs. Discrete variables and their importance in Data Visualization.

Take a look at the following chart, do you see any issues with it?

month trend chart line chart string to date

Notice that the month values are shown as “distinct” values instead of shown as a “continuous” values and it misleads the person looking at the chart.  Agree? Great! You already know based on your instincts what continuous and discrete values are, it’s just that we will need to label what you already know.

In the example used above, the “Date & Time” shown as a “Sales Date” is a continuous value since you can’t never say the “Exact” time that the event occurred…1/1/2008 22 hours, 15 minutes, 7 seconds, 5 milliseconds…and it goes on…it’s continuous.

But let’s say you wanted to see Number of Units Sold Vs Product Name. now that’s countable, isn’t it? You can say that we sold 150 units of Product X and 250 units of product Y. In this case, Units sold becomes discrete value.

The chart shown above was treating Sales Date as discrete values and hence causing confusion…let’s fix it since now you the difference between continuous and discrete variables:

Statistics Discrete Continuos Variable Data Visualization


To develop effective data visualizations, it’s important to understand the data types of your data. In this post, you saw the difference between continuous and discrete variables and their importance in data visualization.

SQL Server Analysis services – How to set the order by attribute sort key?


How to sort the dimension attribute by something other than the key and name column? How do you set the “OrderBy” property?

Example: You have created an Inventory age buckets 1-50,51-100,101-150 and so if a business user uses this dimension attribute then the sorting won’t be logical. It would be 1-50, 101-150,51-100 – so how to show the buckets in the logical order?


1. make sure that the table/view that you are bringing in has the sort key.


1 SSAS Attribute order by sort key2. Now, switch to SSAS and open your dimension. I am assuming that you’ve already configured your data source views and you are already bringing in these columns in the dimension:

Dim Inventory SSAS SSIS VIEW Data source VIEW

3. Let’s start with hiding Aging Bucket Sort key so that it’s not visible to user. Change the AttributeHierarchyVisible to False

4. Now, switch to Attribute Relationships – Right Click on Aging Bucket and click on New Attribute Relationship. And set the attribute relanship between Aging bucket and Aging Bucket Sort Key

Attribute Relationships SSAS

And you should see something like this in your attribute relationship section:

SSAS Attribute Relationship Sort Key

5. Now, one more thing to configure. Go back to dimension structure section. Open the properties section for the Aging Bucket Attribute and change the OrderBy property to AttributeKey. Also, change the orderByAttribute property to Aging Bucket Sort Key (in your case, choose the sort key that you have)

SSAS Order Sort by attribute property

That’s it, after you process the model then you should see the attribute being sorted based on the sort key that you had.


In this post, you saw how to configure sort/order property of a dimension attribute.

How to train your users to create their own Business Intelligence reports. #2 of 5: Pre Training Prep

In part 1, I wrote about why is it important to enable business users to create their own BI reports. In this post, part 2, I am going to share the pre-training preparations before you start training. I’ve classified into three categories: 1) Data 2) Tools 3) Culture. Let’s go through them:

1. Data

Data assets needs to be published before you start training. It should be a business friendly analytic layer on top of your data sources. It could something as simple as a Power Pivot Model to a SQL server analysis cube. As long as you have an analytics layer – you’re good! Do NOT grant access to transactional systems. I’ve seen a business analyst who was considered the go-to-business-expert of a system having issues trying to create reports using the system’s relational data source – He had challenge trying to get his head around multiple tables, keys, unfriendly field names. He got something up & running but it was hard for him! What’s the lesson here? Try to make it as easy as possible for business users to use data – create an analytic layer over your data sources.

Apart from this, Data Integrity is very important! If the users don’t trust data, they are not going to use it. Invite selected set of business users to test the integrity of the data before you publish the data assets.

Also, the analytic layer that you developed should perform well. if it takes a minute to return fairly simple result, then you will have challenge driving adoption.

2. Tools

tools Business Intelligence reporting dashboarding

What tools would you use to teach business users reporting? Of course, Excel is a top choice since many of the users are already familiar with using excel. Also, Show them a Power View using YOUR data – that may get them excited enough to learn Power View.

How about SSRS report builder & performance point dashboard designer? This is mainly targeted for IT developers so it won’t be great idea to train business users using this tools.

What about Power Pivot/ Power Query? from #1, business users face challenge trying to analyze relational data sources, so test your audience to see Power Pivot/ Power Query is a fit or not. It might work well if they want to combine from couple of spreadsheets with IT’s data assets, that would work! But don’t expect business users to spend time trying to learn Power Pivot & Power Query to analyze data. Again, test it with your audience, see if they pick it up, some of your users may pick it up, great! But usually, you’ll have to create data assets (cubes/power pivot models/tabular models) to reach the masses!

3. Culture

Data Driven Culture Business Intelligence

Image Source: Economist & Tableau

The more time that you spend understanding the culture, the more successful you are going to get in training users. It’s because the things you’ll learn while trying to understanding an organization’s culture will be useful in content creation, delivery mechanisms, target audience selection & communicating business value of data driven decisions.

let’s step back. what is culture? It is a characteristics of a group of people. What characteristics are you trying to find before you start training? Try answering following questions:

a. Where are the “analysts”?

- Are they part of IT teams? Who requests reports from “IT analysts”. (they are your target audience!)

- Are they part of business units? (Great! Make them efficient by removing manual data copy-pasting from their to-do list)

- Do they report to CxO’s/presidents/VP’s? (Great! request examples of data driven decisions)

b. Are there examples of value generated using data driven decisions?

- understanding how business uses data to generate value is very important. you will create content using these examples!

c. How comfortable are users learning new technology?

- Have they shown resistance in learning new systems?

- Are they used to receiving ready-to-consume reports! (don’t expect them to change their behavior. But figure out the person creating reports for them. Train them! Make them better)

In summary, understand the culture of the organization, it would help you prepare before you start the training.


In this post, we saw three pre training preparations (1. Data 2. Tool Selection 3. Culture) before you start training users.

Questions Power Users Ask about Excel: #3 of N

In this series, I intend to document common questions asked by Power users about Excel connected to SSAS cubes (or data warehouse) after they go beyond the basic stage of understanding & using Row labels, column labels, report filter & values in Pivot Tables. This post is #3 of N:

a. How to remove Grand Totals & Sub Totals?

1. Select the Pivot Table.

2. From the toolbar, go to Pivot Table Tools > Design > Sub Totals > Do not Show Sub Totals.

you can also choose to remove it just for rows or columns. it depends on your requirement on how to layout data.

Pivot Tables Sub Totals How to remove

Similarly, you can remove Grand Totals:

Pivot Tables Turn off Grand TotalsAlternatively, you can remove grant totals from Pivot Table options:

Grant Totals Pivot Table Options

There’s also an alternate method for removing subtotals. Put your mouse over the hierarchy level for which you do not need subtotals. Right click > remove subtotals “field name”. But you’ll have to do that for all hierarchy levels if you need to remove all subtotals.

b. How to configure Automatic Data Refresh?

A common question asked by power user is how to make sure that the excel file is pulling the latest data from the cube? Good news for them is that Excel files that are configured to connect to a data source like SSAS cube can be configured to automatically refresh. Here’s how:

1. From the toolbar, Go to Data > Connections

Excel Data Connections Cube Properties2. Select the connection that interested for automatic data refresh > properties

Excel SSAS Cube Property connection3. From here, you can configure the file to do an automatic data refresh every xyz minute and/or configure the file to refresh data every time you open it.

Excel Data Refresh Cubeclick ok when you’re done and close the workbook connections after you’ve configured the data sources that you needed for automatic data refresh.


In this post, we saw how to remove grand totals & subtotals and, how to configure the automatic data refresh.

In previous articles we saw:

#1: How to sort data? How to add slicers? How to change Pivot table Layout

#2: How to add calculated measures?

How to strip double quotes while importing data from CSV or TSV using SSMS Import Data wizard OR SSIS?

Long Title! Let me explain. This post will help you solve following problem if you run into it:

1) You are using SSMS Import data wizard to load data from a comma (or tab) separated value (CSV/TSV) file into SQL Server Table & you find that your source data values has double quotes and so you want to strip them before loading to destination table.

2) You are using SSIS to load data from a CSV/TSV file into SQL Server Table & you want to strip the double quotes in source table fields before you load the data to destination table.

Double Quotes CSV file SSMS SSIS LoadSolution:

1. After you’ve configured the Flat File connection. you’ll reach to a point where you’ll see “Flat File Connection Manager” in SSIS. Or in the SQL Server Import & Export data wizard, you’ll see a dialog box to configure flat file connections.

2. In the Text Qualifier, enter

Strip Double Quotes SSMS SSIS Import Wizard3. Make sure to Preview the data to verify that the double quotes around data fields have been trimmed.

4. That’s it! You’ve successfully configured the flat file connection manager to strip double quotes.

What is the purpose of creating Tables & Graphs?

Knowing why we do what we do is important. Stephen Few lists four reason for creating Tables & Graphs in his book “Show me the number”. I really liked them so I am posting it here for your reference:

  1. it helps us communicate. It helps present information to others.
  2. it helps us analyze data. it helps us find the insights in the data.
  3. It helps us Monitor Performance. It helps us keep track information about performance e.g. Sales Performance, Speed of Manufacturing, etc.
  4. It helps us Plan. It helps us predict and prepare for the future.

Slicing/Dicing data in multiple Excel Pivot Tables with ONE slicer:


This week I created a couple of Excel Dashboard’s for a project that I am working on. As a part of the that, I decided to include slicers for these Excel Dashboards. And the reason I did that was because data in multiple pivot tables needed to be sliced with one click. In this blog post, I’ll show you steps to connect an Excel slicer to multiple pivot tables:

Steps for Excel 2010:

1. Based on your requirements, decide if multiple items on a dashboard (excel sheet) need to be affected with one click

2. Now, for the purpose of this blog post, let’s assume that you have two Pivot Tables like shown below:

two excel reports slicer 1

Note: the Two Excel Pivot Tables are created by connecting to same cube. And the slicer comes from the same cube.

3. And what if you need to slice data in both pivot tables by one common slicer? Let’s say we want to slice the data in the two pivot tables that we have by Product Category – here’s what you do (assuming that the tables are related in your data source and slicing makes sense)

4. click on any of the pivot table and you’ll see PivotChart Tools in the toolbar

pivot table charts tools5. Switch to Analyze Tab & click on insert slicer & Select the field that you want as slicer & click OK

pivot table insert slicers6. you’ll see a slicer on your sheet now:

slicer on pivot tables

7. Let’s format the report to make it look better:

#1: move the slicer such that it does not overlap any of the pivot tables.

#2: In this case, I want to move the slicer to the top of the sheet and change the slicer to have 5 columns. Here’s how you can do that:

select the slicer > from the toolbar > slicer tools > column

slicer tools option excel

#3 I also changed the slicer style from the slicer tools to match its color palette with that of the Pivot Tables:

8. Note that the slicer is connected to just ONE pivot table. Let’s connect it to both pivot tables

9. From Slicer Tools options > Pivot Table connections > check all pivot tables that you want the slicer to be connected to:

pivot table connections excel

10. Done! Test your slicers, with one click you should be able to see that the data gets sliced in both pivot tables:

#1: Sliced by Bikes

excel pivot analysis data 1

#2: Sliced by Accessories

excel pivot analysis data 2That’s about it for this post. Your comments are very welcome!

Related Posts:

Comparing Slicers in Excel 2010 to Standard PivotTable Filters