News from PASS Summit’14 for Business Analytics Professionals: #sqlpass #summit14

This post is a quick summary of the Business Analytics-related updates that I saw at PASS Summit’14:

1. The theme of the keynotes and sessions seemed to be educating the community about the benefits of the newer tools. I saw demos and material for cloud-based tools like Azure SQL Database, Azure Stream Analytics, Azure DocumentDB, Azure HDInsight & Azure Machine Learning. The core message was pretty clear: a data professional does one of two things: 1) guards data, or 2) helps generate insights from data. Either way, they will need to keep up to date on the new tools to future-proof their career.

Read more about this here: http://blogs.technet.com/b/dataplatforminsider/archive/2014/11/05/microsoft-announces-major-update-to-azure-sql-database-adds-free-tier-to-azure-machine-learning.aspx

2. Coming soon: Power BI will be able to connect to on-premises SSAS data sources (multidimensional & tabular).

3. Coming soon: A better experience to create Power BI dashboards.

Read more about Power BI updates here: http://www.jenunderwood.com/2014/11/05/pass-summit-2014-bi-news/

4. Azure Machine Learning adds a free tier! You won’t need a credit card/subscription to sign up for it.

5. I also saw sessions proposing a new way of thinking about architecture for “Self Service BI” and “Big Data”. These are worth following: since the tools are newer, it’s worth considering an architecture that’s designed to make the most of your investment in them.

That’s it & I’ll leave you with a quote from James Phillips from Day 1’s keynote:

How does Internet of Things (#IoT) impact data professionals?

The Internet enabled computers to be connected with each other.

The Internet enabled mobile devices to be connected with each other.

Now, the Internet will be used to enable physical things to be connected with each other. This is what is called the “Internet of Things” (IoT).

So what happens?

Since more devices are connected to the Internet, we will be able to generate more data! This is usually good if there’s a business vision for how to make sense of that data to increase the efficiency of all these things.

Here’s a nice case study from Microsoft (focus on the business case: the “thing” in this case is an elevator, and the goal is to drive reliability):

 

This is all good news for data professionals! There will be increased demand for professionals who can help businesses make sense of data generated via IoT.

Also beware of the “hype” around this technology. It’s important to take incremental steps to achieve the vision. Instead of trying to analyze data from ALL devices in your organization, start with the one physical thing that matters the most for your organization, or start with the data that you have, and take incremental steps to spread data culture in your organization!

Now that Big Data has become a mainstream term in IT and business, we have a new buzzword to learn and talk about: IoT. But remember, it’s all about making sense of data, and your skills will be more valuable than ever!

SQL Server 2014!

SQL Server 2014 was announced in the TechEd keynote!

SQL Server 2014 TechEd keynote

So while the focus of SQL Server 2012 was in-memory OLAP, the focus of this new release seems to be in-memory OLTP (along with Cloud & Big Data).

Here’s the Blog Post: http://blogs.technet.com/b/dataplatforminsider/archive/2013/06/03/sql-server-2014-unlocking-real-time-insights.aspx  (Also Thanks for the Picture!)

 

 

Resource: Introduction to Data Science by Prof Bill Howe, UW

The Introduction to Data Science course taught by Bill Howe just started on the Coursera platform. Having taken the Data Intensive Computing in Cloud course at UW, also taught by Prof. Bill Howe, I can say that this course should be a great resource too!

Check it out: https://www.coursera.org/course/datasci

Introduction to Data Science

Resource: A great tutorial for Hadoop on local Windows and Azure.

Here’s the resource: http://gettingstarted.hadooponazure.com/gettingStarted.html > “HDInsight Jumpstart”

The tutorial will teach you how to analyze log files using Hadoop tools like MapReduce, Hive and Sqoop. Check it out! It works with both HDInsight on local Windows and Hadoop on Azure:

HDInsight hadoop on windows starting guide tutorial
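To give a feel for the kind of log analysis the tutorial covers, here is a minimal single-machine sketch of the map/reduce pattern in Python. It is not taken from the tutorial: the sample log lines, their format, and the function names are all assumptions made for illustration, and a real Hadoop job would spread the map and reduce steps across a cluster.

```python
from collections import defaultdict

# Hypothetical sample log lines (assumed format: "<date> <time> <LEVEL> <message>").
SAMPLE_LOG = [
    "2013-02-01 10:00:01 INFO service started",
    "2013-02-01 10:00:05 WARN disk usage at 85%",
    "2013-02-01 10:00:09 ERROR failed to open file",
    "2013-02-01 10:00:12 INFO request handled",
    "2013-02-01 10:00:15 ERROR connection timed out",
]


def map_line(line):
    """Map step: emit a (level, 1) pair for a log line; emit nothing if it is malformed."""
    parts = line.split()
    if len(parts) >= 3:
        yield parts[2], 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_levels(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


if __name__ == "__main__":
    print(count_levels(SAMPLE_LOG))
```

The same split into a per-line map function and a per-key reduce function is what lets a framework run the work in parallel over many files.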

Conclusion:

I hope this resource helps you get started on building an end-to-end solution with Hadoop on Windows/Azure.

Event Recap: SQL Saturday 185 Trinidad!

I was selected to be a speaker at SQL Saturday Trinidad! It was amazing because not only did I get a chance to interact with the wonderful people who are part of the SQL Server community there, but I also visited some beautiful places on this Caribbean island!

I visited Trinidad in January, just before their carnival season! And even though people were busy preparing for carnival, it was great to see them attend an entire day of SQL Server training:

SQL Saturday 185 trinidad attendees

And here’s me presenting on “Why Big Data Matters”:

(Thanks Niko for the photo!)

paras presenting on big data

And after the event, I also got a chance to experience the beauty of this Caribbean island!

view trinidad island port of spain

port of spain sql saturday

Thank you SQL Saturday 185 Team for a memorable time!

Presentation Slides: The slides were posted for the attendees prior to my presentation, and if you want, you can view them here:

http://parasdoshi.com/2013/01/25/download-ppt-why-big-data-matters/

Examples of Machine Generated Data from “Big Data” perspective:

I just researched Machine Generated Data in the context of “Big Data”. Here’s the list I compiled:

- Data sent from satellites

- Temperature sensing devices

- Flood detection/sensing devices

- Web logs

- Location data

- Data collected by toll sensors (context: road tolls)

- Phone call records

- Financial data

And a Futuristic one:

Imagine sensors on human bodies that continuously “monitor” health. What if we used them to detect diabetes, cancer or other diseases in their early phases? Possible? Maybe!

Interesting Fact:

Machines can generate data “faster” than humans. This characteristic makes it interesting to think about how to analyze machine-generated data and, in some cases, how to analyze it in real time or near real time.
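One common building block for near-real-time analysis is a sliding window: instead of storing everything and querying later, you keep a running summary of only the most recent readings. The sketch below is just an illustration in Python. The class name, the 10-second window, and the sensor values are made up, and it assumes readings arrive in timestamp order.

```python
from collections import deque


class SlidingWindowAverage:
    """Keeps the average of the readings seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window_seconds = window_seconds
        self.readings = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp, value):
        """Record a reading and return the average over the current window."""
        self.readings.append((timestamp, value))
        self.total += value
        # Drop readings that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self.readings[0][0] <= cutoff:
            _, old_value = self.readings.popleft()
            self.total -= old_value
        return self.total / len(self.readings)


if __name__ == "__main__":
    window = SlidingWindowAverage(window_seconds=10)
    # Hypothetical temperature readings as (seconds, degrees) pairs.
    for ts, temp in [(0, 20.0), (5, 22.0), (12, 24.0)]:
        print(ts, window.add(ts, temp))
```

Each new reading costs only a small amount of work, which is what makes this style of processing practical when machines produce data faster than it can be analyzed in batch.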

Ending Note:

Search for Machine Generated Data and you’ll be able to find much more. It’s worth reading about in the context of Big Data.

Thanks:

http://www.dbms2.com/2010/04/08/machine-generated-data-example/

http://en.wikipedia.org/wiki/Machine-generated_data

http://tdwi.org/articles/2012/10/23/machine-generated-big-data.aspx

Data visualization: Cost of Hard Drive storage space

Here are the visualizations:

1982 – 2009:

1982 2009 storage cost

2000 – 2008:

2000 2008 storage cost

I grabbed data from: http://www.mkomo.com/cost-per-gigabyte And http://ns1758.ca/winch/winchest.html – Thanks!

Conclusion

Storage cost has decreased drastically. Mathematically speaking, it has decreased exponentially. No wonder we can store lots of data for a few dollars, and no wonder the age of Big Data has already arrived!
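To make “decreased exponentially” concrete: if cost falls exponentially over time, then the logarithm of cost falls along a straight line, and the slope of that line tells you how quickly the cost halves. Here is a small sketch in Python of that check. The data points are made up for illustration (they are not taken from the sources above), and the function names are my own.

```python
import math

# Hypothetical (year, dollars per gigabyte) points, chosen to halve every 2 years.
# They are illustrative only and are NOT taken from the sources cited above.
POINTS = [(2000, 16.0), (2002, 8.0), (2004, 4.0), (2006, 2.0), (2008, 1.0)]


def fit_exponential(points):
    """Least-squares fit of log(cost) = intercept + slope * year; returns (intercept, slope)."""
    n = len(points)
    xs = [x for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def halving_time(slope):
    """Years for the cost to halve, given a negative fitted slope."""
    return math.log(2) / -slope


if __name__ == "__main__":
    _, slope = fit_exponential(POINTS)
    print(f"cost halves roughly every {halving_time(slope):.2f} years")
```

Running the same fit on the real cost-per-gigabyte data would show how close to a straight line the log-scale chart actually is.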

How to start Analyzing Twitter Data w/ R?

Over the past few weeks, I have posted notes about analyzing Twitter data w/ R. I’m listing them here:

1. Install R & RStudio

2. R code to download Twitter data

3. Perform Sentiment Analysis on Twitter Data (in R)
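To give a rough idea of what step 3 involves, here is a minimal sketch of one simple approach to sentiment analysis: score each tweet by counting positive words and subtracting negative words. Note that this is written in Python rather than R and is not the code from the linked notes. The word lists and example tweets are tiny made-up samples, and a real analysis would use a full opinion lexicon.

```python
import re

# Tiny hypothetical word lists; a real analysis would use a full opinion lexicon.
POSITIVE = {"good", "great", "love", "awesome", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}


def sentiment_score(text):
    """Return (number of positive words) - (number of negative words) in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


if __name__ == "__main__":
    # Made-up example tweets, not real Twitter data.
    tweets = [
        "I love this phone, the camera is great!",
        "Terrible battery life. I hate it.",
        "Just landed in Seattle.",
    ]
    for tweet in tweets:
        print(sentiment_score(tweet), tweet)
```

A score above zero suggests a positive tweet, below zero a negative one, and zero means neutral or unknown. This approach ignores negation and sarcasm, so treat it as a starting point only.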