OK, so you know I recently installed the Data Mining add-in for Excel (see: How to enable Data Mining in EXCEL powered by SQL Server Analysis Services?), and I couldn’t wait to go beyond the samples that ship with it. So I decided to start with forecasting. For this blog post, I downloaded my Twitter stats into Excel. Of course, I had to clean the data and add some computations, which was equally exciting, and I ended up with a data set that had my follower count and the number of tweets I had posted.
The date range in the data set is 23 July 2012 to 5 Sep 2012. Of course, to get a “better” forecast you need to feed in more historical data. In my case, the Twitter API didn’t allow me to pull ALL the historical data in one go; let’s not get into the details because that’s not the focus of this blog post. But the rule of thumb is that more historical data gives a better forecast. Here are the steps I followed:
1. Loaded the data into Excel 2010. (I am using Twitter as an example here. Another real-world scenario would be a sales forecast.) Note that I have kept it simple for the purpose of the demo.
2. Now, let’s create a forecast model.
Go to Data Mining Tab > Data Modeling > Forecast:
3. Forecast Wizard:
a. Getting Started with Forecast Wizard: press NEXT
b. Select Source Data, then press NEXT
c. Select input columns. In this case, I selected Date as Time Stamp and Total Follower Count & Total Tweet Count as Input columns.
- Notice the Parameters button? It is used to configure how the (Time Series) algorithm runs. For the purpose of this demo, I am not going to explore that.
4. The wizard forecast (using the Time Series data mining algorithm) the follower count for the next week. As you can see, it says that on 12th Sep 2012 I would have 438 followers, which is +3 compared to today’s (5th Sep) follower count.
5. A few notes:
a. I selected Total Tweet Count just to show that it can forecast more than one variable at the same time. Here the model used the Date column as the time stamp while forecasting.
b. Of course, this may not happen for REAL because your follower count can go up or down based on:
- Tweet (Quality Tweets!) Frequency
- Number-of-bots-that-decide-to-follow-you (kidding!)
- Re-tweeting interesting content and replying to your followers. Basically being social!
- If a tweet gets picked up by someone famous, your count increases
- Other real life “surprises”..
Here’s the point though: this was just a toy example to show “forecasting” with Excel Data Mining. If I explore it further, I will document my experiences!
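If you want to poke at the same idea outside Excel, here is a minimal sketch in Python. To be clear, this is NOT what the add-in does: the wizard runs the Microsoft Time Series algorithm on the Analysis Services side, while this sketch swaps in a plain least-squares straight line just to show the shape of the task (a date column as the time stamp, one or more numeric columns, and a forecast for the next few days). The function names and the sample numbers are made up for illustration; they are not my real Twitter data.

```python
from datetime import date, timedelta


def check_daily_timestamps(days):
    """Return (duplicates, gaps) for dates that should be one row per day."""
    ordered = sorted(days)
    pairs = list(zip(ordered, ordered[1:]))
    # A date that equals its predecessor appears more than once.
    duplicates = [d for prev, d in pairs if d == prev]
    # More than one day between neighbours means at least one missing day.
    gaps = [(prev, d) for prev, d in pairs if (d - prev).days > 1]
    return duplicates, gaps


def fit_line(values):
    """Least-squares line through the points (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_linear(values, steps):
    """Extend the fitted line `steps` periods past the last observation."""
    slope, intercept = fit_line(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]


if __name__ == "__main__":
    # Hypothetical sample: five consecutive days, two series forecast together.
    start = date(2012, 9, 1)
    days = [start + timedelta(days=i) for i in range(5)]
    series = {
        "Total Follower Count": [100, 102, 104, 106, 108],
        "Total Tweet Count": [50, 53, 55, 58, 60],
    }

    duplicates, gaps = check_daily_timestamps(days)
    print("duplicate dates:", duplicates, "gaps:", gaps)

    for name, values in series.items():
        predicted = forecast_linear(values, steps=7)
        for offset, value in enumerate(predicted, start=1):
            day = days[-1] + timedelta(days=offset)
            print(name, day.isoformat(), round(value))
```

The date check is there because a forecast like this quietly assumes one row per day; duplicates or missing days are worth knowing about before trusting any numbers. A straight line also can’t pick up the “surprises” listed above, which is one more reason to treat the output as a toy.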
And oh, BTW, here’s a nice video by @MarkTabNet and @SolidQ (SolidQ: I work at this amazing company!) on “Microsoft Data Mining Demo - Forecasting (SQL Server 2008 and Excel 2007)”. MarkTabNet is a great resource for data miners, check it out!