The Internet enabled computers to connect with each other.
The Internet enabled mobile devices to connect with each other.
Now, the Internet will be used to connect physical things with each other. This is what is called the “Internet of Things” (IoT).
So what happens?
Since more devices are connected to the Internet, we will be able to generate more data! This is usually good if there’s a business vision for how to make sense of that data to increase the efficiency of all these things.
Here’s a nice case study from Microsoft (focus on the business case: the “thing” in this case is an elevator, and the goal is to drive reliability)
This is all good news for data professionals! There will be increased demand for professionals who can help businesses make sense of data generated via IoT.
Also, beware of the “hype” around this technology. It’s important to take incremental steps to achieve the vision: instead of trying to analyze data from ALL devices in your organization, start with the one physical thing that matters most to your organization, or start with the data you already have, and take incremental steps to spread a data culture in your organization!
Now that Big Data has become a mainstream term in IT and business, we have a new buzzword to learn and talk about: IoT. But remember, it’s all about making sense of data, and your skills will be more valuable than ever!
SQL Server 2014 was announced in the TechEd keynote!
So while the focus of SQL Server 2012 was in-memory OLAP, the focus of this new release seems to be in-memory OLTP (along with cloud and Big Data).
Here’s the blog post: http://blogs.technet.com/b/dataplatforminsider/archive/2013/06/03/sql-server-2014-unlocking-real-time-insights.aspx (also, thanks for the picture!)
The Introduction to Data Science course taught by Bill Howe just started on the Coursera platform. Having taken the Data Intensive Computing in the Cloud course at UW, also taught by Prof. Bill Howe, I can say that this course should be a great resource too!
Check it out: https://www.coursera.org/course/datasci
Here’s the resource: http://gettingstarted.hadooponazure.com/gettingStarted.html > “HDInsight Jumpstart”
The tutorial will teach you how to analyze log files using Hadoop tools like MapReduce, Hive, and Sqoop. Check it out! It works with both HDInsight on a local Windows machine and Hadoop on Azure:
I hope this resource helps you get started on building an end-to-end solution with Hadoop on Windows/Azure.
I was selected to be a speaker at SQL Saturday Trinidad! It was amazing because I not only got a chance to interact with the wonderful people who are part of the SQL Server community there, but also visited some beautiful places on this Caribbean island!
I visited Trinidad in January, just before their carnival season! Even though people were busy preparing for carnival, it was great to see them attend an entire day of SQL Server training:
And here’s me presenting on “Why Big Data Matters”:
(Thanks Niko for the photo!)
And after the event, I also got a chance to experience the beauty of this Caribbean island!
Thank you SQL Saturday 185 Team for a memorable time!
Presentation Slides: The slides were posted for the attendees before my presentation, and if you want, you can view them here:
I just researched machine-generated data in the context of “Big Data”. Here’s the list I compiled:
- Data sent from Satellites
- Temperature sensing devices
- Flood Detection/Sensing devices
- Web logs
- Location data
- Data collected by Toll sensors (context: Road Toll)
- Phone call records
And a Futuristic one:
Imagine sensors on human bodies that continuously “monitor” health. What if we used them to detect diabetes, cancer, or other diseases in their early phases? Possible? Maybe!
Machines can generate data “faster” than humans. This characteristic makes it interesting to think about how to analyze machine-generated data and, in some cases, how to analyze it in real time or near real time.
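To make the near-real-time idea a bit more concrete, here is a minimal sketch in Python. It assumes a stream of temperature readings arriving one at a time and flags any reading that jumps well above the average of the last few readings. The function name, the window size, the threshold, and the sample readings are all made up for illustration.

```python
from collections import deque


def find_spikes(stream, window=3, threshold=5.0):
    """Yield (index, value) for each reading that exceeds the mean of the
    previous `window` readings by more than `threshold`.

    Readings are processed one at a time, so this works on a live feed
    as well as on a list. Names and defaults are hypothetical.
    """
    recent = deque(maxlen=window)  # oldest reading drops off automatically
    for index, value in enumerate(stream):
        # Only compare once we have a full window of earlier readings.
        if len(recent) == window and value - sum(recent) / window > threshold:
            yield index, value
        recent.append(value)


# Made-up sample readings (degrees), not real sensor data.
readings = [20.0, 20.5, 21.0, 20.5, 30.0, 21.0]
for index, value in find_spikes(readings):
    print("spike at reading", index, "value", value)
```

Because only the last few readings are kept in memory, the same loop can keep up with a device that never stops sending data.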
Search for “machine-generated data” and you’ll find many more examples. It’s worth reading about in the context of Big Data.
Here are the visualizations:
1982 – 2009:
2000 – 2008
I grabbed data from http://www.mkomo.com/cost-per-gigabyte and http://ns1758.ca/winch/winchest.html. Thanks!
Storage cost has decreased drastically. Mathematically speaking, it has decreased exponentially. No wonder we can store lots of data for a few dollars, and no wonder the age of Big Data has already arrived!
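One way to check an “exponential” claim like this: if cost falls exponentially, then the logarithm of cost falls along a straight line over time. The Python sketch below fits that line with ordinary least squares and converts the slope into a “halving time”. The function names are my own, and the numbers are placeholders chosen to halve every two years; they are NOT the figures from the two sources above.

```python
import math


def fit_exponential(years, costs):
    """Least-squares fit of log(cost) = a + b * year. Returns (a, b)."""
    logs = [math.log(c) for c in costs]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    b = sxy / sxx            # slope: negative when cost is falling
    a = mean_y - b * mean_x  # intercept
    return a, b


def halving_time(b):
    """Years it takes the cost to halve, given the fitted slope b."""
    return math.log(0.5) / b


# Placeholder series (hypothetical): cost halves every 2 years.
years = [0, 2, 4, 6, 8]
costs = [1024.0, 512.0, 256.0, 128.0, 64.0]

a, b = fit_exponential(years, costs)
print("halving time in years:", round(halving_time(b), 2))
```

With real cost-per-gigabyte data, a plot of log(cost) against year that looks roughly like a straight line is what supports the word “exponentially”.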
Over the past few weeks, I have posted notes about analyzing Twitter data with R. I am listing them here:
1. Install R & RStudio
2. R code to download Twitter data
3. Perform Sentiment Analysis on Twitter Data (in R)
This blog post applies to Microsoft® HDInsight Preview on a Windows machine. In this post, we’ll see how you can browse HDFS (the Hadoop Distributed File System).
1. I am assuming Hadoop Services are working without issues on your machine.
2. Now, can you see the Hadoop Name Node Status icon on your desktop? Yes? Great! Open it (it opens in a browser).
3. Here’s what you’ll see:
4. Can you see the “Browse the filesystem” link? Click on it. You’ll see:
5. I’ve used /user/data lately, so let me browse to see what’s inside this directory:
6. You can also type the location into the text box labeled Goto
7. If you’re on the command line, you can list the root directory with the command:
hadoop fs -ls /
And if you want to browse files inside a particular directory, pass that directory’s path in place of /
For more commands, see the HDFS File System Shell Guide
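If you’d rather script the listing than type it, here is a minimal Python sketch that wraps the same hadoop fs -ls command. The helper names are hypothetical, and it assumes the hadoop command is available on your PATH. On Windows, where hadoop is usually a .cmd script, you may need to pass shell=True or the full path to the script; I have not verified that on every setup.

```python
import subprocess


def hdfs_ls_command(path="/"):
    """Build the argument list for `hadoop fs -ls <path>`."""
    return ["hadoop", "fs", "-ls", path]


def list_hdfs(path="/"):
    """Run the listing and return its raw text output.

    Assumes `hadoop` is on PATH; raises CalledProcessError if the
    command exits with an error (for example, if the path does not exist).
    """
    result = subprocess.run(
        hdfs_ls_command(path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example (uncomment on a machine where Hadoop is running):
# print(list_hdfs("/user/data"))
print(" ".join(hdfs_ls_command("/user/data")))
```

Keeping the command builder separate from the function that runs it makes it easy to check what will be executed before touching the cluster.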
In this post, we saw how to browse the Hadoop file system via the Hadoop command line and the Hadoop Name Node Status page.