PASS Business Analytics Conference Keynote Day #1

In this post, I’ll summarize the PASS Business Analytics Conference’s Keynote Day #1:

The structure of the Keynote:

PASS Business Analytics Conference

One of the NEW challenges that Data Pros face today is the complexity involved in building a BI solution. The following slide nicely represents the challenge from the tools standpoint:

pass business analytics conference keynote hadoop

Image Courtesy: https://twitter.com/SQLGal/status/322342662013321216

Microsoft’s Goal is to SIMPLIFY the above situation

NEW Tools:

> Data Explorer (Excel add-in)

> Power View in Excel 2013

> GeoFlow

The key takeaway from the demos was:

Power View is a great tool that you could use to extract insights from data.

E.g. Insights about Music Charts from Germany:

Now combine the power of Power View w/ new capabilities like Data Explorer, which lets you find, combine & refine data.

In the demo, they combined data in Hadoop w/ data in relational sources. This is powerful!

And Also

The preview of GeoFlow for Excel was announced!

They had a great demo on a pretty big touch device:

GeoFlow for Excel

Sorry for the poor image – but imagine a touch device of that size w/ an interactive data visualization that has 3D geo maps!

Conclusion:

They had a nice message at the end of the keynote:

 

Excel: Swapping (reversing) the Axes of Table Data

Data preparation (or call it pre-processing) is an essential and time-consuming part of any data analytics project. To that end, I was working on a data set that needed some changes before I could plot it in an effective data visualization. Here’s what I did:

My Challenge:

I was working on a data set that looked like this:

Date | Abu Dhabi, United Arab Emirates | Adalaj, Gujarat, India | Addison, TX
1/1/2013 | 1 | 4 | 2
1/2/2013 | 1 | 4 | 2
1/3/2013 | 1 | 4 | 3
1/4/2013 | 3 | 3 | 3
1/5/2013 | 2 | 2 | 4
1/6/2013 | 2 | 3 | 4
1/7/2013 | 2 | 3 | 3
1/8/2013 | 2 | 2 | 4
1/9/2013 | 2 | 2 | 3

BUT: I wanted my data to look like this:

Date | 1/1/2013 | 1/2/2013 | 1/3/2013 | 1/4/2013 | 1/5/2013 | 1/6/2013 | 1/7/2013 | 1/8/2013 | 1/9/2013
Abu Dhabi, United Arab Emirates | 1 | 1 | 1 | 3 | 2 | 2 | 2 | 2 | 2
Adalaj, Gujarat, India | 4 | 4 | 4 | 3 | 2 | 3 | 3 | 2 | 2
Addison, TX | 2 | 2 | 3 | 3 | 4 | 4 | 3 | 4 | 3

What did my real data look like?

It had 380 columns and 500+ rows, so MANUAL copy-pasting was NOT an option!

Excel 2010 Solution:

It’s so simple!

Step 1: Select the data > Copy (Shortcut: Ctrl + C)

Step 2: Switch to a new/different Excel sheet

Step 3: Paste Special > Transpose (T)

excel paste special transpose swap axis data

After doing this, here’s how the input & output look:

excel paste special reverse axis
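If your data lives outside Excel, the same swap can be sketched in a few lines of Python. This is only a minimal sketch and not part of the Excel steps above: it assumes Python 3, the `transpose` helper is a name I made up for illustration, and the sample rows are just the first three dates from my table.

```python
# Sample rows copied from the first three dates of the table in this post.
rows = [
    ["Date", "Abu Dhabi, United Arab Emirates", "Adalaj, Gujarat, India", "Addison, TX"],
    ["1/1/2013", 1, 4, 2],
    ["1/2/2013", 1, 4, 2],
    ["1/3/2013", 1, 4, 3],
]


def transpose(table):
    """Swap the rows and columns of a rectangular list-of-lists table."""
    # zip(*table) groups the first items of every row, then the second items, etc.
    return [list(column) for column in zip(*table)]


transposed = transpose(rows)
for row in transposed:
    print(row)
```

Each city now sits on its own row with one value per date, which is the same shape that Paste Special > Transpose produces.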

Conclusion:

In this post, we saw how to swap or reverse the axes of table data in Excel 2010.

Tableau: Data Cleaning for Geographic Maps

Data cleaning is a major part of any analytics/data-visualization undertaking. If data cleaning is ignored, it leads to inaccurate data reporting & thus suboptimal business decisions.

To that end, if you create a geographic map in Tableau, please check the accuracy of your data by going to:

Menu Bar > Map > Edit Locations

Let me give you some examples:

I had “State/Province” as the geographic role for my variable, and when I created a geographic map, it didn’t show anything for New York State! See before:

data cleaning geographic map before

So what did I do?

I navigated to Menu bar > Map > Edit locations:

data cleaning geographic map State

So I fixed it!

data cleaning geographic map Tableau

And after:

data cleaning geographic map after

Note that New York State is now lit up!

In the past, I’ve also entered Latitude & Longitude when needed. This was when Tableau was not able to recognize a few US cities and flagged them as “ambiguous”, so I entered the Latitude & Longitude to clean the data:

data cleaning geographic map city

Conclusion:

In this post, I described how you should check the data accuracy of a Tableau Geographic Map.

Business Metrics #2 of N: Customer Retention Rate

In this post, we’ll explore a business metric called “Customer Retention Rate”.

What is it?

It is a metric that helps an organization monitor the % of customers retained.

Let me give you an example:

Year | Number of Customers | Retention Rate
0 | 100 | 100%
1 | 85 | 85%
2 | 70 | 70%
3 | 65 | 65%
4 | 61 | 61%

Do you notice the third column, which keeps tabs on the percentage of customers retained? This is the basic idea behind customer retention rate.
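Here is a minimal Python sketch of how the third column can be derived from the second. It assumes Python 3 and that retention is measured against the year-0 customer base, as in the table above; the `retention_rates` helper is a name I made up for illustration.

```python
# Customer counts per year, taken from the example table above.
customers_by_year = {0: 100, 1: 85, 2: 70, 3: 65, 4: 61}


def retention_rates(counts):
    """Return each year's retention as a percentage of the starting customer base."""
    base = counts[min(counts)]  # customers in the earliest year (year 0 here)
    return {year: 100.0 * n / base for year, n in sorted(counts.items())}


rates = retention_rates(customers_by_year)
for year, rate in rates.items():
    print(f"Year {year}: {rate:.0f}%")
```

With a starting base of 100 customers the percentages equal the counts, but the same function works for any starting base.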

How is it used?

This metric correlates with other key business performance measures like customer service, product quality and customer loyalty. Think about it. If the customer retention rate is high, then the organization must be doing “something” right. That something could be a great loyalty program, great customer service or great product quality! If it’s low, then it requires some action from decision makers: they would want to know the reasons so that they could fix the situation.

In an earlier post, we talked about Customer Lifetime Value. A higher customer retention rate would also help us have a higher customer lifetime value.

Also, it’s important to realize that the cost of acquiring a new customer is typically higher than the cost of keeping an existing customer, and so organizations that sell products/services like to measure the customer retention rate.

Also, if you have customer data, then you can drill down to find trends in the retention rate. You can ask questions like: Which age group has the highest retention rate? Which has the lowest? What is the retention rate for male customers? You can also try to predict the retention of a new customer.

Conclusion:

In this post, we learned about a business metric “customer retention rate”.

And as a reminder, this series is meant to help you understand business metrics from an analytics perspective.

Beginner’s Guide: Sentiment Analysis using Python on Windows

This is a beginner’s guide to sentiment analysis using Python and NLTK on Windows. We’ll start w/ installing Python and NLTK and then see how to perform sentiment analysis.

Step 1: Install Python & NLTK

I followed the steps listed on http://nltk.org/install.html

1. Search for Python 2.7.3 for Windows and install it.

2. Search for Python Setuptools for Windows and install it.

3. Install pip (for 64-bit Windows), NLTK and PyYAML.

4. Test installation: Start>All Programs>Python27>IDLE, then type import nltk

5. Now, also type:

>>> import random

6. And also install the movie_reviews corpus by typing:

>>> nltk.download()

In the new window that opens, install the movie_reviews corpus.

python nltk download data

Step 2: Sentiment Analysis

I followed the code explained in the NLTK book, in the “Document Classification” section of chapter 6, “Learning to Classify Text”. Here is the section: http://nltk.org/book/ch06.html#document-classification

Using the code I was able to run the Naive Bayes Classifier to categorize text:

python sentiment analysis
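To give a feel for what a Naive Bayes classifier does under the hood, here is a tiny from-scratch sketch. To be clear, this is NOT the NLTK book’s code and not NLTK’s implementation: it assumes Python 3, uses only the standard library, and trains on four toy reviews that I made up instead of the movie_reviews corpus. The `train` and `classify` helpers are hypothetical names for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy reviews; the post itself uses NLTK's movie_reviews corpus.
TRAIN = [
    ("a great and moving film", "pos"),
    ("great acting and a wonderful story", "pos"),
    ("a boring and terrible film", "neg"),
    ("terrible acting and a dull story", "neg"),
]


def train(labeled_docs):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        # Start from the prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify("a wonderful film", model))
print(classify("dull and boring", model))
```

On real data you would use the NLTK code from the book section linked above, which trains on far more text and reports accuracy on a held-out test set.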

Conclusion:

In this post, we learned how to perform sentiment analysis using Python on the Windows platform. NLTK supports classifiers other than Naive Bayes, and there are also resources that will help you increase the accuracy of the classifier. I hope that this post acts as a starting guide for you!


Three Data Collection Tips for Social Media Analytics

Data integrity is important, especially if critical business decisions are based on data. To that end, in this post, I’ll write about three data collection tips to help you have accurate data for “social media analytics”. These tips are applicable to social media analytics irrespective of the tool you are using:

1. Social Media Platform

social_media

Select the right social media platforms for capturing data. You do not want to select so few that you miss data. And you do not want to select irrelevant social media platforms, because if you do, you’ll introduce noise in the data. Let me take an example: if your project is focused on the USA only, then you do not need to add “Sina Weibo” (a Chinese social network) to your social media sources.

Now, based on the business need behind your “social media analytics” campaign, you should test all possible social media platforms, because you never know who might be talking about things that you are interested in. After you have selected the right social media platforms for your project, let’s go to the next step:

2. “Search Keyword” Selection

Some social media platforms let you collect data via “search keywords”. For example, Twitter allows you to collect data via “hashtags” and/or keywords. So if you want to collect all social media posts mentioning “american airlines”, then you should not collect data using:

American OR Airlines

If you select the above rule, it will introduce a LOT of noise, because we’ll collect posts from people talking about just “American” PLUS posts from people talking about just “airlines”. That’s bad! What you want is rules like these:

1. American AND airlines

2. “American Airlines” (as a phrase)

american airlines social media

Now, I can’t stress enough the importance of selecting the right search keywords. Choosing the wrong keywords will add noise, which is bad for analytics. So choose keywords such that you are neither adding noise nor missing out on conversations. There’s no secret formula here; continuous improvement is the way to go!
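To see the difference between the OR, AND and phrase rules, here is a minimal Python sketch. It assumes Python 3 and does not use any particular platform’s search API: the `matches_any`, `matches_all` and `matches_phrase` helpers and the sample posts are all made up for illustration.

```python
import re


def words(text):
    """Lowercase a post and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def matches_any(text, keywords):
    """OR rule: at least one keyword appears."""
    found = set(words(text))
    return any(k.lower() in found for k in keywords)


def matches_all(text, keywords):
    """AND rule: every keyword appears, in any order."""
    found = set(words(text))
    return all(k.lower() in found for k in keywords)


def matches_phrase(text, phrase):
    """Phrase rule: the words appear next to each other, in order."""
    tokens, target = words(text), words(phrase)
    return any(tokens[i:i + len(target)] == target
               for i in range(len(tokens) - len(target) + 1))


# Hypothetical posts for illustration.
posts = [
    "Flying American Airlines to Dallas tonight",
    "Proud to be American",
    "Which airlines serve Boston?",
    "American carriers and European airlines compared",
]
for post in posts:
    print(matches_any(post, ["american", "airlines"]),
          matches_all(post, ["american", "airlines"]),
          matches_phrase(post, "American Airlines"),
          post)
```

The OR rule lets all four posts through, the AND rule keeps the first and the last, and only the phrase rule isolates the one post that is actually about the airline.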

3. Language & country Filtering

global-social-network

Social networks are GLOBAL in nature, and so it’s important to filter by language and country based on the project that you’re working on. Not doing so would add noise to your data. Also remember to include every country and language that is relevant to your project, because you do not want to miss out on conversations either.
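As a rough illustration of this kind of filtering, here is a minimal Python sketch. It assumes Python 3 and that each collected post already carries language and country metadata; the `lang` and `country` field names, the `filter_posts` helper and the sample records are all hypothetical, since every platform and tool names these differently.

```python
# Hypothetical post records; real platforms expose language and country
# metadata under their own field names.
posts = [
    {"text": "Great flight today", "lang": "en", "country": "US"},
    {"text": "Vuelo retrasado otra vez", "lang": "es", "country": "US"},
    {"text": "Lovely crew on board", "lang": "en", "country": "GB"},
    {"text": "Bon vol ce matin", "lang": "fr", "country": "FR"},
]


def filter_posts(records, languages, countries):
    """Keep only posts whose language and country are both in scope."""
    return [r for r in records
            if r["lang"] in languages and r["country"] in countries]


# A US-only project that still listens to English and Spanish conversations.
in_scope = filter_posts(posts, languages={"en", "es"}, countries={"US"})
for record in in_scope:
    print(record["text"])
```

Filtering on English alone would have dropped the Spanish-language post from the USA, which is exactly the kind of missed conversation to watch out for.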

Conclusion:

Three Data Collection Tips for Social media analytics that I shared in this post are:

1. Select Right Social Media Platform

2. Select Right search keywords

3. Select Right Country and Language.

Data Reporting ≠ Data Analysis

One of the key things I’ve learned is the importance of differentiating the concepts of “Data Reporting” and “Data Analysis”. So, let’s first see them visually:

data analysis and data reporting

Here’s the logic for putting Data Reporting INSIDE Data Analysis: if you need to do “analysis” then you need reports. But you do not have to necessarily do data analysis if you want to do data reporting.

From a process standpoint, Here’s how you can visualize Data Reporting and Data Analysis:

data analysis and data reporting process

Let’s think about this for a moment: Why do we need “analysis”?

We need it because TOOLS are really great at generating data reports. But it requires a HUMAN BRAIN to translate those “data points/reports” into “business insights”. This process of seeing the data points and translating them into business insights is the core of Data Analysis. Here’s how it looks visually:

Data analysis Data Reporting

Note that after performing data analysis, we have information like trends and insights, action items or recommendations, and the estimated impact on the business, all of which create business value.

Conclusion:

Data Reporting ≠ Data Analysis

Guest Blog: How to measure ROI of Social Media Marketing?

Introduction:

This is a guest blog by Jugal Shah. Jugal is pursuing an MBA w/ a focus on Marketing from a premier university in India. He shares his views on marketing, sales and strategy via his Blog & Facebook. In this post, he briefly comments on “How to measure Social Media Marketing ROI”.

Jugal Shah’s Short post on Measuring Social Media Marketing ROI:

In social media marketing, ROI is not just in monetary terms. So, for social media ROI, my focus would be on:
1) How many people I have reached
2) How many people I have engaged through online activities
3) Becoming a conversation enabler and perception driver

Then focus on

1) How much increased revenue is due to social media reach (you can do this by tracking referral links)
2) How many leads you generated through social media
3) How social media efforts helped to resolve customer queries/problems and led to more customer satisfaction (remember, customer acquisition costs 10 times more than customer retention).

In a nutshell, it’s of utmost importance to use Social Media as a:

  • conversation enabler
  • perception driver
  • customer retention tool

Conclusion:

Paras: Jugal, thanks for this post. I am sure this short post will be great food for thought for readers who are interested in Digital Marketing Analytics or analytics in general. Readers, feel free to reach out to him on his blog and/or Facebook page.

Business Analytics Continuum:

Think of a “continuum” as something you start and never stop improving upon. In my mind, the Business Analytics Continuum is the continuous investment of resources to take business analytics capabilities to the next level. So what are these levels? Douglas McDowell explained this concept in a recent post here. It was great food for thought for me, and hence I am posting about this particular concept here.

Here is the visual representation of the concept:

business analytics continuum

And I would encourage you to read the entire post and other posts in the series here: PASS BAC Preview Series: Business Analytics Defined

Business Metrics #1 of N: Customer Lifetime Value

This post will briefly describe an important marketing metric called “Customer Lifetime value”.

What is it?

It’s an important metric in the world of marketing. It helps businesses measure a customer’s worth to a business during the entire business relationship. In other words, it helps a business calculate the net profit associated with a customer’s relationship, starting from the first purchase and including subsequent purchases along with expected future purchases.

How is it used?

It’s used to measure return on investment when formulating marketing strategies. Here’s an example: If your strategy costs $100 to acquire a customer and the average lifetime value of customer is $400 – then well, that’s a great thing, isn’t it?

It also helps a business focus on making the most out of its existing customer relationships.

To extend these examples in the Internet marketing world, let’s take an example:

Suppose that the cost of acquiring a customer via Internet marketing is $25. The customer buys $10 worth of goods. Is this good? Not from what we’ve seen so far. But if the lifetime value of the customer is $120, then it does make sense to spend $25 to acquire that customer.
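The arithmetic behind this example can be sketched in a few lines of Python. This is only a minimal sketch: it assumes Python 3, treats the dollar figures as directly comparable (just as the example does), and the `net_value` and `value_to_cost_ratio` helpers are names I made up for illustration.

```python
def net_value(customer_value, acquisition_cost):
    """Value left over after paying to acquire the customer."""
    return customer_value - acquisition_cost


def value_to_cost_ratio(customer_value, acquisition_cost):
    """How many dollars of customer value each acquisition dollar returns."""
    return customer_value / acquisition_cost


# Figures from the Internet marketing example above.
acquisition_cost = 25
first_purchase = 10
lifetime_value = 120

# Judged on the first purchase alone, the acquisition loses money.
print(net_value(first_purchase, acquisition_cost))

# Judged on lifetime value, the same acquisition pays off.
print(net_value(lifetime_value, acquisition_cost))
print(value_to_cost_ratio(lifetime_value, acquisition_cost))
```

The same helpers cover the earlier $100 vs. $400 example: `value_to_cost_ratio(400, 100)` returns four dollars of lifetime value for every dollar spent on acquisition.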

Conclusion:

In this post, I wrote about a key business metric that should be of help when you work on your marketing analytics project. Note that accurately measuring this metric is NOT a matter of adding up a couple of numbers; there is some thinking involved. To that end, I will leave you thinking about this critical business metric that could be used in a marketing analytics project! Your comments are very welcome!