Data -> Insights -> ?


I was at the HP Big data conference last week and I heard something during the keynote that’s worth sharing with you.

As Data & Analytics professionals, we spend a lot of our time on finding insights, trends & patterns out of the data but the keynote speaker (Ken Rudin, Facebook) encouraged everyone to take that a step further = Think about Driving impact based on the insights. It’s simple yet a powerful idea! Over past few months, I have started working closely with decision makers and helping drive impact vs just “handing-off” insights.

I hope that helps! Just wanted to share that with you. What do you think?


PASS Business Analytics VC has grown 123% in a year! #sqlpass @passbavc


It’s been amazing to see the growth of Business Analytics community over the past couple of years as one of the chapter leaders on the PASS Business Analytics Virtual chapter…Here’s a data viz that I put together to analyze effectiveness of our marketing campaigns:

Here’s the chart: 

PASS Business Analytics Virtual Chapter Marketing Effectiveness Chart

It shouldn’t come as a surprise that an “Analytics” virtual chapter is using data-driven marketing techniques! ;)


May’14 = 100 attendees. Jun’15 = 223 attendees. % Diff = 123%


With this growth rate, we should have ~500 attendees in our future virtual chapter meeting in Jun 2016. Can’t wait! :)


A lot of work by Dan English (current president) and Melissa Demcsak (Immediate past president) went into growing this chapter!

How to add custom User-ID to your Universal Analytics (Google) implementation?


There are three different implementations that you could have with Google Analytics:

  1. Classic Google Analytics
  2. Universal analytics
  3. Universal analytics with Google Tag manager.

If you fall under “2. Universal analytics” then this post is for you since I’ll walk you through steps that you need to take see UserID’s on your google analytics report.

First why do you want UserID on your GA reports? 

1) Data blending is powerful. If you can combine your transactional datasets with web traffic data then you can extract some powerful insights! To be able to join your transactions data with web traffic data, you need some field that is common across those datasets. UserID might be one of the most useful fields that you could use for data blending.

2) Also having UserID in your reports let’s you perform some user behavior analysis at individual level and that could be pretty powerful too!

Now, How?

There are three steps:

  1. Identify the User ID
  2. Create a Custom Dimension on Google Analytics
  3. Modify your tracking code to send data for the custom dimension

Step 1. Identify the User ID

The first step is to identify the ID that you are going to send to Google analytics. Remember that you can’t send PII (personally identifiable information) so you can’t send something like an email id but you can send other ID’s that may be used in your database.

Step 2: Create a custom dimension on Google Analytics

Go to google analytics admin section > Select your account & property > Go to Custom dimensions

Custom dimensions google analytics

Now, create a new custom dimension. Give it a name, scope it and make sure it’s marked active.

Custom dimension universal analytics User ID

Step 3: Modify your tracking code to send data for the custom dimension

Notice that once you create the dimension, it will also show the example code (on the right side in the picture above). Send this to your web developer!

Just a note on this: The custom dimensions for which you are passing value using the tracking code are referenced as dimensionxx and you don’t use the Name like “Custom User ID” — if it’s the first custom dimension that you will refer to it as dimension1 in the code.

Next steps: Let me know if you have any further questions and if you are interested in seeing the steps for classic GA and Universal analytics w/ Google Tag manager then here’s a blog for you:

I hope this was helpful!

An amazing framework to solve business problems:


There are two type of things I learned in Graduate school:

1. Useful

2. Not Useful! (useless)

This post is NOT about discussing useless learning’s! So let me share one of most useful things that I learned in my two years at a School of management: How to solve business problems? Sounds cliched but that was, I think, one of the most important skills I picked up there. In particular, I learned about Frameworks used to solve business problems, one of them was called “MECE” which is what I want to share with you in this post.

(Side-note: Most folks learn this at some strategy consulting firm like McKinsey but unlike them, I learned about it in school)

Before we begin, I want to share about why you should care and then I’ll talk about what is it.


No matter which team you work for, you are solving problems. You wouldn’t have a job if you’re not doing that — so why not get better at it?

If you want to find a root cause of a business problem (& find the solution faster!) then you need to break it down…to break it down, you need to structure it. Now, they are many ways (or Frameworks) to structure a problem — MECE is one of the most effective frameworks out there. So lets learn about that:

(side-note: MECE framework may sound like a simple idea BUT it’s NOT easy to apply!)

What is MECE?

It’s an acronym and it stands for “Mutually Exclusive and Collectively Exhaustive” which means that when you break a problem into sub-items then they should be:

1. Not overlap with each other (mutually exclusive)

2.  If you add up all sub-items then it should represent all possible solutions (collectively exhaustive)

Let’s take an example:

Say that you are asked to analyze “why is Profitability declining?”

Here’s a non-MECE way:

  • Find Top 10% profitable products [does NOT pass the collectively exhaustive test]
  • Out of them find products that are have declining profits
  • Try to find reasons why those products would have declining profits

Here’s a MECE way:

Visual for MECE principle

  • Break it down to Revenue & Cost
  • let’s start with cost, let’s say it’s constant = revenue must be going down for declining profits
  • further break down revenue into 1) Revenue from all non-usa locations 2) USA locations (Note the use of MECE principle here)
  • let’s say that revenue for non-usa locations is increasing, then it must be USA locations that’s the problem! (Note how effectively are able to narrow down and find the root cause faster!)
  • Let’s further break down to product categories for USA locations…Continue breaking down the sub-items in a MECE way till you find the root cause

I hope that gives you a good overview of MECE principle.


MECE is one of the few effective frameworks that you can use to solve a business problem. If you want to get better at structuring your ideas (to solve business problems), consider practicing MECE as there are ample resources available online that would help you master this!

What percentage of users are authenticated? (Google Universal Analytics)


You’re using Google’s Universal Analytics — That’s great! They key to make sure that you get the most out of it is to make sure that you incentivize your users to log-in aka authenticate. First step in doing that is to figure out percentage of users that are authenticated…Here’s how you can see that report:

1. Login to Google Analytics

2. Select your view > Go to “Reporting” section

3. Navigate to Audience > Behavior > User-ID coverage

Google Analytics User ID Universal

4. On this report, you can see authenticated vs unauthenticated sessions:

Percentage of authenticated users google analytics Universal


In this post, we talked about how to run a report that shows you percentage of authenticated users. (In google’s Universal analytics)

A key driver for business intelligence adoption: Embedded analytics.


Did you know most business intelligence (BI) solutions are under-utilized? Your BI solution might be one of them — I definitely had some BI solutions that were not as widely used as I had imagined! Don’t believe me? Take a guess at “number of active users” for your BI solution and then look up that number by using your BI server logs. Invariably, this is Shocking to most BI project leaders = Their BI solution is not as widely used as they had imagined! Ok, so what can you do? Let me share one key driver to drive business intelligence adoption: Embedded analytics.

Embedded analytics

#1: what is Embedded analytics? 

Embedded analytics is a technology practice to integrate analytics inside software applications. In the context of this post, it means integrating BI reports/dashboards in most commonly used apps inside your organization.

#2: why should you care? 

You should care because it increase your business intelligence adoption. I’ve seen x2 gains in number of active users just by embedding analytics. if you want to understand why it’s effective at driving adoption, here’s my interpretation:

Change is hard. You know that — then why do you ask your business users to “change” their workflow and come to your BI solution to access the data that they need. Let’s consider an alternative — put data left, right & center of their workflow!

Example: You are working with a team that spends most of their time on a CRM system then consider putting your reports & dashboards inside the CRM system and not asking them to do this:

Open a new tab > Enter your BI tool URL > Enter User Name > Enter Password > Oops wrong password > Enter password again > Ok, I am in > Search for the Report > Oops, not this one! > Ok go back and search again > Open report > loading…1….2….3…. > Ok, here’s the report!  

You see, that’s painful! Here’s an alternative user experience with embedded analytics:

They are in their favorite CRM system! And see a nice little report embedded inside their system and they can click on that report to open that report for deeper analysis in your BI solution.

How easy* was that?

*Some quick notes from the field:

1) it’s easy for users but It’s not easy to implement! But well — there’s ROI if you invest your resources in setting up embedded analytics correctly!

2) Don’t forget context! example: if a user is in their CRM system and is looking at one of their problem customers — then wouldn’t it be great if your reports would display key data points filtered for that customer! So context. Very important!

3) Start small. Implement embedded analytics for one subject area (e.g. customer analysis) for one business team inside one app! Learn from that. Adjust according to your specific needs & company culture AND if that works — then do a broad roll out!

Now, think of all the places you can embed analytics in your organization. Give your users an easy way to get access to the reports. Don’t build it and wait for them to come to you — go embed your analytics anywhere and everywhere it makes sense!

#3: Stepping back

Other than Embedded analytics — you need to take a look at providing user support and training as well…And continue monitoring usage! (if you’re trying to spread data driven culture via your BI solution then you should “eat at your own restaurant” and base your adoption efforts on your usage numbers and not guesses!)


In this post, I shared why embedded analytics can be a key drive for driving business intelligence adoption.

SQL: How to get first/last transaction details for each user account?


Looking at user’s first/last transaction details is a common requirement. So given that you have a “user_transaction_details” table which looks something like:

Transaction_id | user_id | transaction_date | purchase amount

How would get first/last transaction details for each user account?

Here’s a design pattern:

select AllTransactions.user_id,AllTransactions.purchase_amt from user_transaction_details AllTransactions
inner join 
select user_id, min(transaction_timestamp) as first_transaction_timestamp from user_transaction_details
group by user_id
) FirstTransaction
on AllTransactions.user_id = FirstTransaction.user_id and AllTransactions.transaction_timestamp = FirstTransaction.first_transaction_timestamp

To get the last transaction details for each user account, replace min function with max and that should be it.

Note: Depending on your data model and how you used it in the join, it might be that there would be multiple rows marked as “first/last” transaction and so would want to spend some time figuring out how to deal with these multiple row situation especially if you’re planning to further join this data.

In this post, I shared a design pattern to write a SQL query to get first/last transaction details for each user account type.

Question for you:
How would you solve this problem? Share you query in the comments section!

PASS Business Analytics conference 2015 – Keynote Day 1. #passbac


I’m at PASS BA conference this week to learn about current & future state of business analytics world. I would categorize today’s keynote as a session that gave me an insight what’s our “future”.

Before I begin describing my takeaway’s, I would provide some context by providing keynote speaker’s background.

The keynote speaker was Carlo Ratti who directs MIT’s seanseable city — a group that focuses on how to use data to understand our cities and works on research projects to make it better!

So, the theme of the keynote was to show some of the research projects and inspire audience on how big data was used to come up with actionable insights to make our cities better. The projects are usually at the intersection of Internet of things and Big Data.

With that context, here’s are my notes:

– What is Big Data? 

(I thought that was funny but behind it, there’s an interesting takeaway: we need NEW tools and approach to deal with Big Data! btw, MIT has their in-house tools to generate data viz’s on bigdata — one such example: Data Collider

– Internet of things & Big data

Carlo also showed the relation between Internet of things and Big data by using data to show exponential growth that we have seen in mobile connections and how it’s poised to grow further in coming decade because of internet of things.

– Big Data = opportunistic + user-generated + purposely sensed. 

Carlo then categorized his projects in three categories: Opportunistic, User-generated & Purposely sensed.

– Opportunistic

couple of examples shared by the speaker:

1. Re-drawing Great Britain’s map from a network of human interactions:

2. A better way to handle traffic at traffic lights. (Project Wave)


1. using user-generated content to figure out where’s the best party in Barcelona? (project World’s eye)

2. analyzing user tweets during an event. (Project: Tweetbursts)

– purposely sensed

1. Having sensors in the bike that gives you info like pollution, traffic congestion and road congestions in real-time.

(Project: Copenhagen wheel.)

copenhagen wheel

2. Following e-waste around the world. Sometimes the energy put into discarding the waste is more than we can ever get out of it after its recycled. This should be improved! (Project: Trash Track)

That’s it for the notes.

You can check out all of their projects on their site!

After seeing the demo’s, I can’t wait to live in one of this “smarter” city. What a fascinating application of data! — The future is awesome!

How to analyze “new users” metric by specific pages in Google Analytics?



you want to create a funnel chart of how your new users move from their landing page to your desired destination. Ideally it’s goes something like this:

Stage 1) lands on your home/landing page

Stage 2) goes to a product page

stage 3) goes to a checkout page

stage 4) sees a thank you page

Now, if you want to analyze the conversion among these stages for a “new” user then you will need create custom reports in google analytics. You will basically need to create a report for each specific page that you want to analyze. So how to set one up?


1) Navigate to Google analytics profile

2) On the top of the go to “customization” section and click on create a new custom report

3) here’s how you can set up a custom report that will use you new users by a specific page (notice the page filter?).

New users by specific pages funnel visualization google analytics


In this post, I outlined the steps that you need to take to setup a custom report in google analytics that shows you new users by specific pages.

How to assign same axis values to a group of spark-lines in Excel?


Spark-line is a very handy data visualization technique! It’s great when you are space constrained to show trends among multiple data points.

Here’s an example:

Spark Line Trend Excel Data Visualization

But there’s an issue with above chart! Axis values for these group of spark-lines do not seem match – it could throw someone off if they didn’t pay close attention. So a good practice – when you know users are going to compare segments based on the spark-lines – is to assign them same axis values so it’s easier to compare. Here’s the modified version:

Excel Sparkline data visualization same axis

And…here are the steps:

1. Make sure that spark-lines are grouped.

Select the spark-lines > go to toolbar > Sparkline Tools > Design > Group

Excel Sparkline Group

2. On the “group” section, you’ll also find the “Axis” option – select that and make sure that “same for all axis” is selected for Vertical axis minimum and maximum values:

Excel Spark Line Data Viz same min max value


That’s about it. Just a quick formatting option that makes your spark-lines much more effective!

Author: Paras Doshi