Titanic Data


Here’s a link to download the Titanic data — http://lib.stat.cmu.edu/S/Harrell/data/descriptions/titanic.html — it’s really useful in analytics and data science projects. You can:

  1. Build a predictive model. Example: https://www.kaggle.com/c/titanic
  2. I also use this data set to create interactive dashboards on tools like Qlik and Tableau to understand their features.


If you liked this, you may also like other data sets that I have here: http://parasdoshi.com/2012/07/31/where-can-we-find-datasets-that-we-can-play-with-for-business-intelligence-data-mining-data-analysis-projects/

Productivity Tip: Learn to Comment/Uncomment SQL code using shortcuts


I spend a lot of time writing SQL code — and as a reader of this blog, You might be in the same boat. So any productivity gains that we could get here could go a long way. On that note, here’s a quick productivity tip: Learn to comment/uncomment multiple lines of SQL code using keyboard shortcut.


If you are using SQL Server Management Studio, it’s “CTRL-K followed by CTRL+C” for commenting AND “CTRL+K followed by CTRL+U” for uncommenting.

If you are using some other Data Management Software tool, I am sure you can find it using their HELP section or googling around.

Either ways, these shortcuts go a long way in making you more productive! What is your favorite productivity tip?

Qlik sense: How to see Data Load Editor scripts for apps developed by your Team members?


(This post first appeared on the Qlik Community. here)


So you just joined a Business Intelligence Team and one of the responsibilities include building apps for your business users. Eventually, you would have a need to see Data Load editor scripts for apps developed by other members in the team. So what permission do you need to be able to do that?

Credits: darkhorse

Qliksense Version: Enterprise Server 2.0

Source: can’t see a peer’s data load editor scripts


This a two-step process.

1) Get “content admin” access (or “higher” level access)

2) Double check if you have access to see data load scripts for ALL apps

Step 1:

The short answer is that you need “Content Admin” permission from your Qlik sense admin…But with this access level, you will have access to other developer’s app via QMC. If you need to do this via HUB as well then you will have to change the content admin role.

Here’s how Serhan ( darkhorse ) explained how to get this done:

QMC–> Security Rules–>Content Admin–> Edit–> Context–> Both in Hub and QMC

Qlik sense management console

Step 2:

Now, once you get the “content” admin access, you might want to double two things:

1) You can get access to data load scripts on published apps — (I was able to do this but there still seems to some open questions around some folks not being able to see the data load scripts for published apps. If this is the case for you, you need to duplicate the app on your “my work” area and see the scripts)

2) You can duplicate apps on your “my Work” area and see scripts — this is also useful if you want to make changes to published apps that are out there.


I hope this helps you resolve the permission issues and help you collaborate with your team members!

Data puking and how T-mobile alienated a potential customer:


I saw this ad on a highway earlier today and my reaction: why would I switch to a network that has just “96%” coverage.

T mobile ad — example of data puking

…instead of converting a potential buyer, this ad actually made me more nervous. You know why? Its a case of what I like to call “data puking” where you throw bunch of numbers/stats/data at someone hoping that they will take action based off of it. So what would have helped in this ad? It would have been great to see it compared against someone else. Something like: we have the largest coverage compared to xyz. My ATT connection is spotty in downtown areas so if it said something like we have 96% coverage compared to ATT’s 80% then I would have been much more likely to make the switch.

I wrote about this adding benchmark in your analysis here

Takeaway from this blog: don’t throw data points at your customers. Give them the context and guide them through the actions that you want them to take.

How to add Sparkline data visualization to Google spread sheets?


I like using spark lines data viz when it makes sense! It’s a great way to visualize trends in the data without taking too much space. Now, I knew how to add sparklines in Excel but recently, I wanted to use that on Google sheet and I had to figure it out so here are my notes:

1. Google has an inbuilt function called “SPARKLINE” to do this.

2. Sample usage: =SPARKLINE(B2:G2) — by default you can put line chart in your cells.

3. Then there are other options including changing the chart type. You can find them documented here:  https://support.google.com/docs/answer/3093289

4. One of the best practices that I advocate when you spark-line to “compare” trends is to make sure that you have the consistent axis definition. So the sample usage for that could like this:


(if you want to do this for excel then here’s the post: http://parasdoshi.com/2015/03/10/how-to-assign-same-axis-values-to-a-group-of-spark-lines-in-excel/ )

After you’re done, here’s what a finished version could like on Google sheet:

Google Sheet Data visualization spark line

Here’s the working google sheet: https://docs.google.com/a/parasdoshi.com/spreadsheets/d/1EJYDTxOifeEL-YwW1a0oxXw7tFG1iAVQlwjo4EU8R-s/edit?usp=sharing

Data -> Insights -> ?


I was at the HP Big data conference last week and I heard something during the keynote that’s worth sharing with you.

As Data & Analytics professionals, we spend a lot of our time on finding insights, trends & patterns out of the data but the keynote speaker (Ken Rudin, Facebook) encouraged everyone to take that a step further = Think about Driving impact based on the insights. It’s simple yet a powerful idea! Over past few months, I have started working closely with decision makers and helping drive impact vs just “handing-off” insights.

I hope that helps! Just wanted to share that with you. What do you think?


PASS Business Analytics VC has grown 123% in a year! #sqlpass @passbavc


It’s been amazing to see the growth of Business Analytics community over the past couple of years as one of the chapter leaders on the PASS Business Analytics Virtual chapter…Here’s a data viz that I put together to analyze effectiveness of our marketing campaigns:

Here’s the chart: 

PASS Business Analytics Virtual Chapter Marketing Effectiveness Chart

It shouldn’t come as a surprise that an “Analytics” virtual chapter is using data-driven marketing techniques! ;)


May’14 = 100 attendees. Jun’15 = 223 attendees. % Diff = 123%


With this growth rate, we should have ~500 attendees in our future virtual chapter meeting in Jun 2016. Can’t wait! :)


A lot of work by Dan English (current president) and Melissa Demcsak (Immediate past president) went into growing this chapter!

How to add custom User-ID to your Universal Analytics (Google) implementation?


There are three different implementations that you could have with Google Analytics:

  1. Classic Google Analytics
  2. Universal analytics
  3. Universal analytics with Google Tag manager.

If you fall under “2. Universal analytics” then this post is for you since I’ll walk you through steps that you need to take see UserID’s on your google analytics report.

First why do you want UserID on your GA reports? 

1) Data blending is powerful. If you can combine your transactional datasets with web traffic data then you can extract some powerful insights! To be able to join your transactions data with web traffic data, you need some field that is common across those datasets. UserID might be one of the most useful fields that you could use for data blending.

2) Also having UserID in your reports let’s you perform some user behavior analysis at individual level and that could be pretty powerful too!

Now, How?

There are three steps:

  1. Identify the User ID
  2. Create a Custom Dimension on Google Analytics
  3. Modify your tracking code to send data for the custom dimension

Step 1. Identify the User ID

The first step is to identify the ID that you are going to send to Google analytics. Remember that you can’t send PII (personally identifiable information) so you can’t send something like an email id but you can send other ID’s that may be used in your database.

Step 2: Create a custom dimension on Google Analytics

Go to google analytics admin section > Select your account & property > Go to Custom dimensions

Custom dimensions google analytics

Now, create a new custom dimension. Give it a name, scope it and make sure it’s marked active.

Custom dimension universal analytics User ID

Step 3: Modify your tracking code to send data for the custom dimension

Notice that once you create the dimension, it will also show the example code (on the right side in the picture above). Send this to your web developer!

Just a note on this: The custom dimensions for which you are passing value using the tracking code are referenced as dimensionxx and you don’t use the Name like “Custom User ID” — if it’s the first custom dimension that you will refer to it as dimension1 in the code.

Next steps: Let me know if you have any further questions and if you are interested in seeing the steps for classic GA and Universal analytics w/ Google Tag manager then here’s a blog for you: http://dan-russell.com/2014/03/how-to-set-a-userid-as-a-custom-variable-using-the-google-analytics-cookie/

I hope this was helpful!

An amazing framework to solve business problems:


There are two type of things I learned in Graduate school:

1. Useful

2. Not Useful! (useless)

This post is NOT about discussing useless learning’s! So let me share one of most useful things that I learned in my two years at a School of management: How to solve business problems? Sounds cliched but that was, I think, one of the most important skills I picked up there. In particular, I learned about Frameworks used to solve business problems, one of them was called “MECE” which is what I want to share with you in this post.

(Side-note: Most folks learn this at some strategy consulting firm like McKinsey but unlike them, I learned about it in school)

Before we begin, I want to share about why you should care and then I’ll talk about what is it.


No matter which team you work for, you are solving problems. You wouldn’t have a job if you’re not doing that — so why not get better at it?

If you want to find a root cause of a business problem (& find the solution faster!) then you need to break it down…to break it down, you need to structure it. Now, they are many ways (or Frameworks) to structure a problem — MECE is one of the most effective frameworks out there. So lets learn about that:

(side-note: MECE framework may sound like a simple idea BUT it’s NOT easy to apply!)

What is MECE?

It’s an acronym and it stands for “Mutually Exclusive and Collectively Exhaustive” which means that when you break a problem into sub-items then they should be:

1. Not overlap with each other (mutually exclusive)

2.  If you add up all sub-items then it should represent all possible solutions (collectively exhaustive)

Let’s take an example:

Say that you are asked to analyze “why is Profitability declining?”

Here’s a non-MECE way:

  • Find Top 10% profitable products [does NOT pass the collectively exhaustive test]
  • Out of them find products that are have declining profits
  • Try to find reasons why those products would have declining profits

Here’s a MECE way:

Visual for MECE principle

  • Break it down to Revenue & Cost
  • let’s start with cost, let’s say it’s constant = revenue must be going down for declining profits
  • further break down revenue into 1) Revenue from all non-usa locations 2) USA locations (Note the use of MECE principle here)
  • let’s say that revenue for non-usa locations is increasing, then it must be USA locations that’s the problem! (Note how effectively are able to narrow down and find the root cause faster!)
  • Let’s further break down to product categories for USA locations…Continue breaking down the sub-items in a MECE way till you find the root cause

I hope that gives you a good overview of MECE principle.


MECE is one of the few effective frameworks that you can use to solve a business problem. If you want to get better at structuring your ideas (to solve business problems), consider practicing MECE as there are ample resources available online that would help you master this!

What percentage of users are authenticated? (Google Universal Analytics)


You’re using Google’s Universal Analytics — That’s great! They key to make sure that you get the most out of it is to make sure that you incentivize your users to log-in aka authenticate. First step in doing that is to figure out percentage of users that are authenticated…Here’s how you can see that report:

1. Login to Google Analytics

2. Select your view > Go to “Reporting” section

3. Navigate to Audience > Behavior > User-ID coverage

Google Analytics User ID Universal

4. On this report, you can see authenticated vs unauthenticated sessions:

Percentage of authenticated users google analytics Universal


In this post, we talked about how to run a report that shows you percentage of authenticated users. (In google’s Universal analytics)