Data Analysis and In Memory Technologies, let’s connect the dots:

SPEED is one of the important aspect of Data Analysis. Wouldn’t it be great if you query a data source, you get your answers as soon as possible? Yes? Right! Of course, it depends on factors like the size of the data you are trying to query but wouldn’t it be great if it’s at “SPEED OF THOUGHT“?

So Here’s the Problem:

Databases are mostly disk based and so the bottleneck here is the speed at which can get access to data off the disks.

So what can you do?

Let’s put data in a RAM (memory) because data-access via memory is faster.

If it’s sounds so easy, why didn’t people do it earlier? And why are we talking about “In Memory” NOW?

1) BIGGER Data Size/sets and so today with more data, it takes more time to query data from databases. And so researchers have looked at other approaches. One of the effective approach they found is: In-memory

(And I am not ignoring the advances in Database Technologies like Parallel databases, But for the purpose of understanding “Why In-memory”, it’s important to realize the growing size of data sets and a viable alternative we have to tackle the problem: In memory. And also I am not saying that it’s the ONLY way to go. I am just trying to understand the significance of in-memory technologies. We, as data professionals, have lot’s of choices! And only after evaluating project requirements, we can talk about tools and techniques)

2)  PRICE of Memory: Was the price of RAM/memory higher than what it is today? So even though it was a great idea to put data in memory, it was cost-prohibitive.

So Let’s connect the dots: Data Analysis + In Memory Technologies:

What’s common between Microsoft’s PowerPivot, SAP HANA, Tableau and Qlikview?

1) Tools for Data-Analysis/Business-Intelligence 2) Their Back End data architecture is “In Memory”

So since Data Analysis needs SPEED and In-Memory Technologies solves this need – Data Analysis and Business Intelligence Tools adopted “In-memory” as their back-end data architecture. And next time, when you hear a vendor saying “in-memory”, you don’t have to get confused about what they’re trying to say. They’re just saying that we got you covered by giving you ability to query your data at “speed of thought” via our In-memory technologies so that you can go back to your (data) analysis.

That’s about it for this post. Here’s a related post: What’s the benefit of columnar databases?

your comments are very welcome!

The role of Sentiment Analysis in Social Media Monitoring:

I’ve posted tutorial/resources about the Technical Side of Sentiment Analysis on this Blog. Here are the Links, if you need them:

LingPipe (Java Based) | Python | R language resource | Microsoft’s Tool “Social Analytics

Apart from this, I’ve used other Tools per project requirements and It’s been fun designing and developing projects on “Sentiment Analysis” primarily using Social Media Monitoring. Having worked with clients on projects that use “Sentiment Analysis” – I reflected about the role of Sentiment Analysis in Social Media Monitoring. And in this blog post, I am sharing these reflections:

What is Social Media Monitoring?

Social Media Monitoring is a process of “monitoring” conversations happening on social media channels about your brand/company.

Is it NEW? Not really. The idea of monitoring or gathering data about what is being talked about the brand/company is not new. Earlier, it was newspapers and magazine-articles and now, it’s the social media channels including online news, forums and blogs and thus the name given to this process is “Social Media Monitoring”

brand monitoring social media

What is Sentiment Analysis?

Analyzing data to categorize it under a “sentiment” (emotion).

Example. Is this review saying positive, negative or neutral thing about our product.

sentiment analysis positive negative neutral

side-note: Sentiment analysis is often categorized under “Big Data Analytics”.

What’s the Role of Sentiment Analysis in Social Media Monitoring?

We’ve seen that in social media monitoring, we gather all online conversations about a brand/product/company. Now wouldn’t it be great to take the data that we have and bucket it under “Positive”, “Negative” or “Neutral” categories for further analysis?

So few questions that can be answered after we have results from sentiment analysis:

1) Are people happy or sad about our product?

2) What do they like about our product?

3) What do they hate about our service?

4) Is there a trend or seasonality in sentiment data?

Among other business insights that may be not be easily answerable with just plain text data.

Thus sentiment analysis is one of the step in social media monitoring that assists in analyzing sentiment of all the conversations happening on the social web about a brand/product.

That’s about this for this post. Here’s a related post: Three Data Collection Tips for Social Media Analytics

your comments are very welcome!

How conditionally formatting your data in Excel can help you save time in answering business questions?

Visual analytics is amazing – it helps “data enthusiasts” save time in answering questions using Data. Let’s see one such example. For the purpose of the blog post, I am going to show how to do it in Excel 2010:

Problem:

Here’s the Business Question: What was sales of Tea in North Region in 2012 Q1

Here’s the data:

SALES DATA(2012 Q1)  East West Central North South
Coffee  $  7,348.00  $  7,238.00  $  1,543.00  $  9,837.00  $    1,823.00
Tea  $  9,572.00  $  8,235.00  $  3,057.00  $  8,934.00  $  13,814.00
Herbal Tea  $  5,782.00  $  8,941.00  $  9,235.00  $     392.00  $    1,268.00
Espresso  $  9,012.00  $  2,590.00  $  4,289.00  $  7,848.00  $       340.00

So it’s easy to give out answer using the data: $8934

But let me CHANGE the business question:

WHICH Products in WHAT regions are doing the best?

Now this questions is not as easy as the previous one? WHY? because you’ll have to manually go through each number in a linear fashion to answer the question. Now imagine a bigger data-set. It’ll take even more time.

Solution

What can Excel Power users and Data Enthusiasts do to answer the new business question in an efficient way? Well, let’s see what conditional formatting can do it:

Excel Visual Analytics Conditional formatting

Now with the Data Bars, it’s easier to just glance at the report and see best performing products and regions. For instance, it’s very easy to spot that Tea is performing best in South among all products and region.

So how do you create data bars?

1. Select the data

2. Home > Conditional Formatting > Data Bars

Excel Visual Analytics Conditional formatting 2

3.Done! you’ll see this:

Excel Visual Analytics Conditional formatting

4. You can play with other options here to see what suits the best for your needs. But I just wanted to point out that there is a way for you to highlight the data in a way that helps you save time in answering business questions using data

Conclusion:

Visual analytics is a great way to quickly analyze data. In most cases, Human brain is much faster at interpreting the visual results as oppose to text/numbers – so why not use it to your advantage. And tools like Excel have inbuilt functionality to help you do that!

Resource: Introduction to Data Science by Prof Bill Howe, UW

Introduction to Data Science course taught by Bill Howe just started on coursera platform. Having studied the Data Intensive Computing in Cloud course at UW taught by Prof Bill Howe, I can say that this course would be great resource too!

Check it out: https://www.coursera.org/course/datasci

Introduction to Data Science

Unpivoting data using the data explorer preview for Excel 2010/2013:

Introduction:

Data Explorer add-in is amazing! It’s helps you: combine, find and re-shape your data in Excel 2010/2013. I’ve blogged about: 1) How to merge Table Data and 2) How to clean duplicate data and now in this blog post, I want to share a step-by-step on Unpivoting data using the Data Explorer add-in.

Before we begin, If you haven’t downloaded and installed the data explorer add-in for Excel 2010 & 2013, you can find Information about it here: http://office.microsoft.com/en-us/excel/download-data-explorer-for-excel-FX104018616.aspx 

Problem:

What is un-pivoting? I hear you ask. Instead of explaining it, let me share an Image:

data explorer unpivot excel

BTW, the above data is from my Facebook Page Insights.

So our problem statement is (please refer to above Image): we are given table blue and we need to output table green. In other words, we need to Unpivot the data.

Solution:

Here are the steps:

1) Open Excel, Open Data Explorer add-in. And Connect to your data. Wait when you see the Query Editor.

2) (Optional) In the Query Editor, Rename the query. I renamed it to “Unpivot Data”. And this how my query editor looks:

data explorer unpivot excel 2

3) Now, Select the columns that need to be unpivoted > Right Click > Unpivot Column

Note that I’ve selected all columns that I want to UnPivot:

data explorer unpivot excel 3

4) You’ll see the updated results in the query editor window. I renamed the columns “Attribute” to “Age and Gender” and “value” to “reach”. If you want to rename the columns, select the column > Right click > rename.

data explorer unpivot excel 4

If everything looks OK, click on Done in the bottom right corner

5) There you have it, Unpivoted data in Excel 2010/2013 using Data Explorer add-in!

And then its super easy to create charts, Here’s one I created after I had unpivoted the data:

data explorer unpivot excel 5

Insight: For my blog, my Target Audience seems to Male between the age of 18-24 and then 25-34.

FYI: The Date Range of the Data Set of 1st Jan 2013 – 25th Apr 2013.

That’s about it for this post, Here are some Related articles:

Your comments are very welcome!

 

Merging/Joining datasets in Excel using Data Explorer add-in

Problem:

Merging/Joining/Combining data-sets in Excel has not been an easy task. There are third-party add-ins that makes it easy but out of the box, excel didn’t have an easy way to merge/join table data. But now with the Data Explorer add-in, we have an add-in that let’s us merge/join data in excel w/ few clicks.

If you haven’t downloaded and installed the data explorer add-in for Excel 2010 & 2013, you can find Information about it here: http://office.microsoft.com/en-us/excel/download-data-explorer-for-excel-FX104018616.aspx 

Situation:

Input is Table 1 & Table 2. The output we need is merged Table.

Table 1:

Date Daily New number Month
1/1/2012 0 1
1/2/2012 0 1
1/3/2012 0 1
1/4/2012 0 1
1/5/2012 0 1
1/6/2012 0 1
1/7/2012 0 1
1/8/2012 0 1
1/9/2012 0 1

………………………………..

Table 2:

Month Month Name
1 January
2 February
3 March
4 April
5 May
6 June
7 July

Merged Table:

Date Daily New number Month Month Name
1/1/2012 0 1 January
1/2/2012 0 1 January
1/3/2012 0 1 January
1/4/2012 0 1 January
1/5/2012 0 1 January
1/6/2012 0 1 January
1/7/2012 0 1 January
1/8/2012 0 1 January
1/9/2012 0 1 January

………………………………

Solution:

Let’s see how data explorer can help us Join/Merge Table 1 & Table 2.

1) create query that connects to Table 1 & Table 2.

data sources explorer excel

2) Once you have queries that connect to the tables need to be merged, then click on Merge

3) Once you click on Merge, you’ll see a dialog:

Here you need to configure three things:

a) First Table

b) Second Table

c) Columns that will be used to merge/join data

In this case, this is how my merge dialog looks:

merge join excel data explorer

4) Once configured correctly, click on OK. You’ll see a dialog box where you can configure the output of the merged table. click on the new column to see the options that are available to you to configure the output of the merged table:

merge join excel data explorer 2

5) In this case, I’ve selected just one column month name that needs to be merged. You can also explore the aggregate tab in case you’ve numbers that needs merging.

merge join excel table data explorer 3

6) This is how the output looks:

merge join excel table data explorer 4

7) Rename the new column.

Select the new column > Right Click > Rename

8) Click Done if it looks OK.

9) The merged data is now available to you in Excel!

And one can analyze it!

Let’s see before and after. Note that instead of month numbers, we now have month names

merged data join table visualized excel 3

In this post, we saw how to merge/join/combine data from two different sources in Excel 2010.

Business Analytics project is like an iceberg:

Business persons may not realize that Business Analytics project may involve significant efforts for the under the hood technical tasks like Data Cleaning, Data Integration, Building-a-data-warehouse, creating ETL processes, gathering business requirements among other tasks. And that explains the title of this blog: Business Analytics project is like an iceberg. It’s because, a business person may just see the tool used to visualize data but may not realize the work that went into making it “analytics-ready”. From a project management standpoint – before a project is initiated, the discussion about this different aspects of the project need to communicated to the business stakeholders so that they are in the know of efforts involved in building an analytics solution. And with that, Here’s the summary of this discussion in form of an Image:

iceberg business analytics data cleaning business intelligence data warehouse

Related articles

PASS Business Analytics Conference Keynote Day #2

Dr. Steven Levitt’s (Indiana Jones of economics & Author of Freakonomics) work involves finding insights from data. In the keynote, he shared some of the interesting & fun insights that he found from data.

One Example: Dr. Levitt: According to the data, It was 7 times more dangerous to sell crack in Chicago than it was being in combat in Iraq. https://twitter.com/markvsql/status/322707949158006786

He also talked about other insights that he found which could also be found in his book Freakonomics. After getting audience fascinated about what analyzing data can do – he moved to his real world experiences of analyzing data for businesses. And tied all these fascinating insights back to some tips he had for the audience. Here is a brief recap of the tips he shared:

> “Ideas don’t come out of the blue. Almost always ideas come out of the data” – Dr. Steven Levitt

> “You guys are the future. What you’re doing is the key to a business’ success or failure.”

> Experiment & Test Hypothesis using DATA

> Misconceptions can cripple you. Let the data speak, even when it might be difficult

> Most important people = who understand and know what to do with data, not those who pretend they know the answer.

> Dr. Levitt: without data any biz will be left behind, must experiment and accept failure

*Above text is linked to tweets.

That’s about it for this post. What do you think about the tips that Dr Levitt shared?

 

Excel: Swapping (reversing) the Axis of a Table Data

Data preparation (or call it pre-processing) is an essential and time-consuming part of any data analytic’s project. To that end, I was working on a data set needed some changes before I could plot it on an effective data visualization. Here’s what I did:

My Challenge:

I was working on a data set that looked like this:

Date Abu Dhabi, United Arab Emirates Adalaj, Gujarat, India Addison, TX
1/1/2013 1 4 2
1/2/2013 1 4 2
1/3/2013 1 4 3
1/4/2013 3 3 3
1/5/2013 2 2 4
1/6/2013 2 3 4
1/7/2013 2 3 3
1/8/2013 2 2 4
1/9/2013 2 2 3

BUT: I wanted my data to look like

Date 1/1/2013 1/2/2013 1/3/2013 1/4/2013 1/5/2013 1/6/2013 1/7/2013 1/8/2013 1/9/2013
Abu Dhabi, United Arab Emirates 1 1 1 3 2 2 2 2 2
Adalaj, Gujarat, India 4 4 4 3 2 3 3 2 2
Addison, TX 2 2 3 3 4 4 3 4 3

What did my real data looked liked?

it has 380 columns and 500+ Rows and so MANUAL copy pasting was NOT an option!

Excel 2010 Solution:

It’s so simple!

Step 1: Select the data > COPY (Shortcut: ctrl + c)

Step 2: Switch to a new/different excel sheet

step 3: Paste Special > Transpose (T)

excel paste special transpose swap axis data

So After doing this, This is how the Input & output looks:

excel paste special reverse axis

Conclusion:

In this post, We saw how to swap or reverse the axis of a table data in Excel 2010.

Tableau: Data Cleaning for Geographic Maps

Data cleaning is a major part of any analytic’s/data-visualization undertaking. If data cleaning is ignored then it leads to inaccurate data reporting & thus suboptimal business decisions.

To that end, if you create a Tableau’s Geographic map, please check the accuracy of your data by going to:

Menu Bar > Map > Edit Locations

Let me give you some examples:

Now, I have “states/province” as my geographic role for the variable and when I created a geographic map, I created a geographic map it didn’t show any state for New York State! See Before:

data cleaning geogrphic map before

So what did I do?

I navigated to Menu bar > Map > Edit locations:

data cleaning geogrphic map State

So I fixed it!

data cleaning geogrphic map Tableau

And After:

data cleaning geogrphic map after

Note that New York State is lighted up!

In the past, I’ve also have entered Latitude & Longitude if need be.  This is when it was not able to recognize few US cities and it was saying “ambiguous” – I inputted Latitude & Longitude to clean the data:

data cleaning geogrphic map city

Conclusion:

In this post, I described how you should check the data accuracy of a Tableau Geographic Map.