How many websites in the USA exceed the data collection limitations of Google Analytics?

A little bit of background:

- I was researching the limitations of Google Analytics.

- After reading the limitations, I wanted to know: how many websites in the USA exceed them?

So here’s the short answer:

Only 108 sites exceed this limitation

(as of today)

And here’s the long answer:

Limitations of Google Analytics. Here’s the URL: http://support.google.com/analytics/bin/answer.py?hl=en&answer=1070983

And I am quoting from the above URL:

Data Collection limit: You should not send more than 10 million hits per month. If you exceed this limit, there is no assurance that the excess hits will be processed.
Data Freshness limit: Sending more than 200,000 visits per day to Google Analytics will result in your reports being refreshed only once per day

Taking it further, I wanted to know how many websites in the USA get more than 10 million hits per month. It turns out that only 108 websites in the US get that much traffic.
Source: http://www.quantcast.com/top-sites/US?jump-to=108

So, from a data collection limit standpoint, only these 100-odd sites would exceed the limitations of Google Analytics.

To put things in perspective, MySpace.com does not exceed the Google Analytics data collection limit:

[Image: MySpace can use Google Analytics]

Conclusion

Just knowing about the Data Collection Limit was not that interesting, but once I combined it with data from another source, it seemed very interesting to me! Anyhoo, in this post, I shared:

> The limitations of Google Analytics

> An answer to: how many websites in the USA exceed the limitations of Google Analytics?

[UPDATE Feb 10th 2013] I made a mistake in correlating data from Quantcast and Google Analytics. Lesson learned: double-check the units when comparing data from two different sources.

Florin Dumitrescu pointed out that Quantcast uses people/month while Google uses hits/month. They may NOT always be the same, since one person can generate many hits. Sorry about this.
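Here’s a rough sketch of why the units matter. It converts a people/month figure into an estimated hits/month figure before comparing it with the 10 million limit quoted above. The site, the visits-per-person value, and the hits-per-visit value are all made-up assumptions for illustration, not real Quantcast or Google Analytics numbers:

```python
# Google Analytics data collection limit quoted above, in hits per month.
GA_MONTHLY_HIT_LIMIT = 10_000_000


def estimated_monthly_hits(people_per_month, visits_per_person, hits_per_visit):
    """Convert people/month into a rough hits/month estimate.

    visits_per_person and hits_per_visit are assumptions you have to
    supply; neither number comes from Quantcast.
    """
    return people_per_month * visits_per_person * hits_per_visit


def exceeds_ga_limit(monthly_hits, limit=GA_MONTHLY_HIT_LIMIT):
    """True when the hits/month figure is above the limit."""
    return monthly_hits > limit


# Hypothetical site: 2 million people per month.
people = 2_000_000
hits = estimated_monthly_hits(people, visits_per_person=3, hits_per_visit=4)

print(hits)                      # 24000000
print(exceeds_ga_limit(people))  # False: wrong comparison, people are not hits
print(exceeds_ga_limit(hits))    # True: same units as the limit
```

So a site that looks safely under the limit when you count people can be well over it once you count hits.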

Seven Interesting Google Projects that a Data Professional may not have heard about:

Here’s the list:

1. Google Refine

2. Google Prediction API

3. Google Trends

4. Google Chart Tools

5. Google BigQuery

6. Google Correlate

7. Google Fusion Tables

Note: These projects may not be ready for use in your production environment, as some of them are in Beta/Experimental stages and their support/development may be discontinued in the future.

Thanks: I thought of writing this blog post after a discussion I had with Parth Acharya about Google and its projects for Data Professionals. He pointed me to some of the most interesting samples that used Google Fusion Tables, and here’s one of his blog posts on a related topic: Google Fusion Table & Data Visualization

Mapping “Facebook Page Likes vs Country” using Power View in Office 2013

Just a quick note that you can quickly create maps with Power View in Office 2013. I just created one in 2 minutes:

[Image: Facebook Page Likes vs Country]

This seems like a great way to visualize where your fans are from. In my case, most of them are from India, so one actionable insight would be to schedule posts based on the time zone in India. And I can imagine that such reports could be very helpful to brands that have a sizable fan following on Facebook.

Here’s the screenshot:

[Screenshot: Maps in Power View, Excel 2013, social media analytics]

Thanks to the following blog posts for inspiration:

1. Google Fusion Table & Data Visualization (He used Google Fusion Tables, I used Power View!)

2. Creating Maps in Excel 2013 using Power View

Five examples of Recommendation Systems on the web:

Recommendation systems are an application of data mining technologies. I have been researching how to implement a recommendation system, and as a part of that research, I studied recommendation systems that are already out there on the Internet. Here are five examples of recommendation systems on the web:

1. Amazon

Customers Who Bought This Item Also Bought:

[Screenshot: Amazon recommendations, customers who bought this also bought]

Frequently Bought Together (an example of Market Basket Analysis, a.k.a. Association Rules):

[Screenshot: Amazon recommendations, frequently bought together]

2. LinkedIn

You should read this: How does LinkedIn’s recommendation system work? It will open your mind to the “recommendation” opportunities around you!

Jobs you may like + Groups you may like + Companies you may follow:

[Screenshot: LinkedIn recommendations for groups, jobs, and companies]

3. Netflix

Did you know about the Netflix Prize for improving their recommendation engine? If not, you should read about it!

Here’s their “Movies you’ll love” recommendation system:

[Screenshot: Netflix recommendation system]

4. Twitter

People you may want to follow:

[Screenshot: Twitter “Who to follow” recommendations]

5. Google

I do not have a screenshot, but I wanted to point out that Google personalizes search results (that is, it recommends based on past behavior) using your search history. And you can switch that off, if you want: Turn off search history personalization

Conclusion

In this blog post, we saw examples of recommendation systems. The key takeaway is that there is more than one approach to building a recommendation system. The approaches can be based on:

1. Past behavior

2. Past behavior of “friends”

3. The item that is being searched

And you can definitely mix and match!
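To make the item-based idea a bit more concrete, here’s a minimal sketch in the spirit of “Customers Who Bought This Item Also Bought”. It simply counts which items show up in the same basket. This is my own toy illustration with made-up baskets and hypothetical function names, not how Amazon or any of the sites above actually implement their recommendations:

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    co = defaultdict(Counter)
    for basket in baskets:
        # set() so a repeated item in one basket is only counted once
        for a, b in combinations(sorted(set(basket)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def also_bought(co, item, top_n=3):
    """Return up to top_n items most often bought together with item."""
    # Sort by count (highest first), then by name so ties are stable.
    ranked = sorted(co[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Made-up purchase history, one list per order.
baskets = [
    ["camera", "memory card", "tripod"],
    ["camera", "memory card"],
    ["camera", "camera bag"],
    ["tripod", "camera bag"],
]

co = build_cooccurrence(baskets)
print(also_bought(co, "camera"))  # ['memory card', 'camera bag', 'tripod']
```

Real systems add a lot on top of this (for example, adjusting for items that are popular with everyone), but plain co-occurrence counts are enough to see the idea.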

I hope this post helped you understand an application of data mining that’s all around us! And a question: where else do you see recommendation systems in action? Leave a comment!