Examples to help you differentiate between Business Intelligence and Data Science problems:

In this post, I’ll list few examples from various industries to help you differentiate between business intelligence and data science problems.

Sometime back, I blogged about “Business Analytics Continuum” and in the post we saw that Every Organization has DATA but they use their business data at different levels because of their maturity level. Excel (or other transactional reporting tools) is usually the starting point for any organization – it helps them see WHAT happened. They advance to the next stage, where they get capabilities to slice and dice their data – To find out WHY – and usually this capability is delivered using Business Intelligence tools & techniques. Once the data culture spreads – Thanks to a successful Business Intelligence project – then they soon start to outgrow their business intelligence capabilities by asking problems that need predictive capabilities. This is advanced analytics and Data Science stage. To that end, here are 5 examples to help you differentiate between business intelligence and data science problems:

Business Intelligence.(WHAT & WHY) Data Science & advanced analytics.
Bike Rentals
  1. How many bikes did we rent in Q3 2014? How does that compare to Q3 2013?
  2. What is the trend of total bike rentals at week level? Can you break it down by geography?
Can you predict bike rentals on an hourly basis?
Credit Risk
  1. How many customers have a credit risk of ‘C’?
  2. Can you rank customers by their payments due amount that have a credit risk ‘C’?
Can you predict the credit risk of the customer during contract negotiations stage?
Customer relationship management
  1. How many account cancellations occurred this year (broken down by month and customer segmentation)?
  2. How does percentage of account cancellations this year compare to that previous year?
 Can you predict customer churn?
Flight Delays
  1. What is the trend of % of flight delayed this year?
  2. Can you break down flight delays this year by their reasons?
Can you predict whether a scheduled flight will be delayed by more than 15 minutes?
Customer feedback
  1. What is the customer satisfaction % trend this year?
  2. What is the customer satisfaction % broken down by customer segments and product segments?
Can you classify a customer feedback comment into “positive”, “negative” or “neutral”?

I hope this helps!

How to get descriptive statistics in Excel?

Problem:

you are analyzing a dataset and before modeling/analyzing you need to generate descriptive statistics on a field. you have the data loaded in Excel and wondered if there’s a way to do that in Excel.

Solution:

There’s an out of the box solution that will support your needs to generate descriptive statistics on a field. Here are the steps:

Note: for the purpose of this blog post, I am using Excel 2013 but data analysis toolpak is available in Excel 2007+.

1. Active “Data Analysis” toolpak.

Follow this steps:  File > Options > Add-ins > Manage: Excel Addins > “GO”

excel data analysis toolpak

2. make sure to check the “analysis toolpak” checkbox.

3. Now you should see a “data analysis” option under the “Data” pane:

Excel Data Analysis Descriptive Statistics

4. Now click on “Data Analysis” and select one of the following options:

Anova, Correlation, Covariance, Descriptive Statistics, Exponential Smoothing, F-Test Two-Sample for Variances, Fourier Analysis, Histogram, Moving Average, Random Number Generation, Rank and Percentile, Regression, Sampling, t-Test, z-Test.

in this case, let’s go with descriptive statistics but you can see that you can perform other tasks as well.

5. Once you click on the descriptive statistics, a dialog box will show up and you will have to enter some data like your input range to generate descriptive statistics. Once you have filled the data needed, click on OK and it should generate descriptive statistics for you in EXCEL!

I hope that helps!

Conclusion:

In this post, we saw how to generate descriptive statistics in Microsoft Excel.

Author: Paras Doshi

Business Metric #5 of N: “Conversion rate” (online marketing)

Summary:

In this post, we will see an important metric in online marketing called “conversion rate”.

Description:

so, what is conversion rate?

Conversion rate = (Number of Goals Achieved)/(Total Visitors)

why is this important to track?

In my previous blog on leads (marketing), I mentioned it’s important to track number of people interested in your products/services but along with that it’s important to provide context while reporting on Leads – this is where conversion rate comes into picture and provides the necessary context. Conversion rate can tell us the Quality of the leads & visitors that you get from your online marketing efforts.

Let’s take an example of an e-commerce site:

An e-commerce site decides to increase their monthly online marketing budget and they see a spike in the number of visitors – so that’s great, right? They should continue to increase their marketing budget, right? Well – that might not be true. While the number of visitors may have increased how do we know that increased number of visitors results in increased revenue? It all depends on the quality of the visitors that’s being generated – so how do you quantify the quality of the visitors? That’s right – conversion rate with the goal: number of visitors clicking “buy” button. So you want to make sure that with increased online marketing budget 1) Conversion rate is good or better 2) Number of visitors/leads have grown.

This was a basic scenario helping you appreciate the power of tracking the conversion rate for your online marketing efforts.

Now, If your marketing funnel is more complex then you might also create multiple conversion rate metrics to track conversions at each stage of a marketing funnel. This is VERY powerful. Here’s an Example:

Conversion Rate #1: (Number of Leads)/(Number of Total Visitors)

Note: your marketing team would define a “lead” based on their criteria(s) like downloads a newsletter, submits a contact us form, favorites a product, etc.

Conversion Rate #2: (Number of Actual Customers)/(Number of Leads)

Note: Conversion Rate #1 is great to evaluate effectiveness of marketing campaigns and conversion rate #2 is great to evaluate sales effectiveness.

How can you capture this data?

A good web analytics tool (like Google analytics) should help you track your conversion rates.

Conclusion:

In this post, we saw that tracking conversion rate is very important metric to track your online marketing efforts.

Business Metric #4 of N: “Leads” (marketing)

Summary:

In this post, we will discuss about a common metric used by Sales & Marketing teams called “leads”.

Description:

In simple terms,

Leads = number of individuals (or companies) that have expressed an interest in your goods or services.

why do we want to measure this?

For a business to grow, it’s important that the sales & marketing department work to make sure that there is a growing interest in company’s goods or services. It’s important to track this metric to make sure that it’s a positive upward trend!

Word of caution: It’s important to also note that this metric on its own can be misleading. It might be a good idea to also track “conversion ratios” (converting leads or potential customers into actual customers) to make sure that high-quality leads are being generated.

where can you get this data?

Depending on the channel that you use to capture potential customer’s information & the technology maturity of the company, it varies. I’ve seen CRM systems used to report “leads” data and I’ve also seen manual excel files that are used to generate leads report.

Are there any sub-categories?

Yes, it’s usually subdivided into 1) Marketing Qualified Leads and 2) Sales Marketing Leads.

usually, Marketing Qualified lead (MQL) is someone who has shown interest in your product or service but you don’t know if they fulfill your qualifications to buy your products or services. out of all MQL’s, those leads that qualify your criteria and are identified are someone who is ready to buy your products or services becomes your Sales Qualified Lead (SQL) and sales department get’s ready to engage with these leads to make them an actual customer.

Marketing Funnel Sales Qualified Lead

Conclusion:

In this post, we saw a high level overview of a business metric used in marketing and sales called “leads”.  As mentioned earlier, don’t report on just “leads” – it can be misleading for marketing & sales executives since upward trend in number of leads doesn’t necessarily result in increased sales unless the quality of new leads is same or better. Marketing and sales executives would really appreciate any context  (example: conversions) that you can provide to their “leads” report. I hope that helps!

How does Internet of Things (#IoT) impact data professionals?

Internet enabled computers to be connected with each other.

Internet enabled Mobile Devices to be connected with each other.

Now, Internet will be used to enable physical things to be connected with each other. This is what is called “Internet of things” (IoT).

So what happens?

since more devices are connected with internet – we will able to generate more data! This is usually good if there’s a business vision around how to make sense of data to increase efficiency of all these things.

Here’s a nice case study from Microsoft (focus on the business case – the things in this case is “elevator” to drive reliability)

 

This is all good news for data professionals! There will be increased demand for professionals who can help businesses make sense of data generated via IoT.

Also beware of the “hype” around this technology. It’s important to take incremental steps to achieve the vision – Instead of trying to analyze data from ALL devices in your organization, start with one physical thing that matter the most for your organization or start with data that you have and take incremental steps to spread data culture in your organization!

Now that Big Data has become a mainstream word in IT and business, we have a new buzzword to learn/talk about IoT – but remember it’s all about making sense of data and your skills would be more valuable than ever!

PASS Business Analytics VC: Insider’s Introduction to Microsoft Azure Machine Learning (#AzureML). #sqlpass

RSVP: http://bit.ly/PASSBAVC091814


Session Abstract:
Microsoft has introduced a new technology for developing analytics applications in the cloud. The presenter has an insider’s perspective, having actively provided feedback to the Microsoft team which has been developing this technology over the past 2 years. This session will 1) provide an introduction to the Azure technology including licensing, 2) provide demos of using R version 3 with AzureML, and 3) provide best practices for developing applications with Azure Machine Learning.
Speaker BIO:
Mark is a consultant who provides enterprise data science analytics advice and solutions. He uses Microsoft Azure Machine Learning, Microsoft SQL Server Data Mining, SAS, SPSS, R, and Hadoop (among other tools). He works with Microsoft Business Intelligence (SSAS, SSIS, SSRS, SharePoint, Power BI, .NET). He is a SQL Server MVP and has a research doctorate (PhD) from Georgia Tech.

RSVP: http://bit.ly/PASSBAVC091814

Hope to see you there!

Paras Doshi
Business Analytics Virtual Chapter’s Co-Leader

 

Back to basics: Multi Class Classification vs Two class classification.

Classification algorithms are commonly used to build predictive models. Here’s what they do (simplified!):

Machine Learning Predictive Algorithms analytics Introduction

Now, here’s the difference between Multi Class and Two Class:

if your Test Data needs to be classified into two classes then you use a two-class classification model.

Examples:

1. Is it going to Rain today? YES or NO

2. Will the buyer renew his soon-to-expire subscription? YES or NO

3. What is the sentiment of this text? Positive OR Negative

As you can see from above examples the test data needs to be classified in two classes.

Now, look at example #3 – What is the sentiment of the text? What if you also want an additional class called “neutral” – so now there are three classes and we’ll need to use a multi-class classification model. So, If your test data needs to be classified into more than two classes then you use a multi-class classification model.

Examples:

1. Sentiment analysis of customer reviews? Positive, Negative, Neutral

2. What is the weather prediction for today? Sunny, Cloudy, Rainy, Snow

I hope the examples helped, so next time you have to choose between multi class and two class classification models, ask yourself – does the problem ask you to predict two classes or more? based on that, you’ll need to pick your model.

Example: Azure Machine Learning (AzureML) studio’s classifier list:

Azure Machine Learning classifiers list

I hope this helps!

Business Intelligene Dashboard for Quality Managers

Business Intelligene Dashboard for Quality Managers

Business Goal:

Need to understand the patterns in Quality test results data across all plants.

Summary:

- The solution involved creating a Business Intelligence system that gathered data from multiple plants. I was involved in mentoring IT team, development and end-user training of a Business Intelligence Dashboard that used SQL server analysis services as it’s data source.

- Dashboard development involved multiple checkpoint meetings with business leaders since this was the first time they had a chance to visualize quality test results data consolidated from multiple plants. Since they were new to data visualization, I used to prepare in advance and create 3-4 relevant visualization templates to kick off meetings.

Mockup:

(it is intended to look generic since I can’t discuss details. Also, drill down capabilities had been added to the dashboard to go down to the lowest granularity if needed)

Quality Test Results Dashboard

Achievement Unlocked: Tableau Desktop 8 Qualified Associate!

To test my Tableau knowledge, I attempted the Tableau product certification and got the “Tableau Desktop 8 Qualified Associate” certificate.

Tableau 8 Qualified associate Certificate paras doshi

 

Back to basics: continuous Vs. Discrete variables and their importance in Data Visualization.

Take a look at the following chart, do you see any issues with it?

month trend chart line chart string to date

Notice that the month values are shown as “distinct” values instead of shown as a “continuous” values and it misleads the person looking at the chart.  Agree? Great! You already know based on your instincts what continuous and discrete values are, it’s just that we will need to label what you already know.

In the example used above, the “Date & Time” shown as a “Sales Date” is a continuous value since you can’t never say the “Exact” time that the event occurred…1/1/2008 22 hours, 15 minutes, 7 seconds, 5 milliseconds…and it goes on…it’s continuous.

But let’s say you wanted to see Number of Units Sold Vs Product Name. now that’s countable, isn’t it? You can say that we sold 150 units of Product X and 250 units of product Y. In this case, Units sold becomes discrete value.

The chart shown above was treating Sales Date as discrete values and hence causing confusion…let’s fix it since now you the difference between continuous and discrete variables:

Statistics Discrete Continuos Variable Data Visualization

Conclusion:

To develop effective data visualizations, it’s important to understand the data types of your data. In this post, you saw the difference between continuous and discrete variables and their importance in data visualization.