Remove Duplicates in Excel Tables using Data Explorer Add-in:

In this blog post, we’ll see how you can remove duplicated and clean data in excel tables using Data Explorer Add-in.

Problem:

Our Excel Table has following Data:

Month Month Name
1 January
1 January
1 January
2 February
2 February
3 March

And we want to remove duplicates to make the data-set look like this:

Month Month Name
1 January
2 February
3 March

 

In real world data-sets, we wouldn’t have few rows but lot’s of rows and doing it manually wouldn’t be the wisest option. With that in mind, let’s look for a few-clicks solution that can help us remove duplicates.

Solution:

If you haven’t already, download the Data Explorer add-in preview available for Excel 2010 & 2013. It can do a lot more than removing duplicates – it’s a great add-in and it’ll save you lots of time especially if your job involves discovering, cleaning and combining data for analysis purposes. After you’re done installing the add-in, use the steps below to remove duplicates in an excel column:

1. Open Data in Excel. Switch to Data Explorer Tab

2. For the purpose of the demo, I am assuming that you already have the data in excel file. If not, you can connect to other sources via the add-in.

3. Data Explorer add-in > Excel Data> From Table

data explorer excel remove duplicates

4. After you’ve clicked on the From Table, a query editor will pop up:

excel data explorer query editor

5. Select both columns

(you can select both columns by: select first column > hold down the ctrl key and then click on second column)

6. Right click > Remove Duplicates

data explorer remove duplicates excel

7. click on done if you see that the duplicates have been removed correctly

data explorer excel remove duplicates 2

Conclusion:

In this blog post, we saw how to remove duplicates and clean data in Excel using the Data Explorer Preview add-in.

If you’ve not downloaded and installed the data explorer add-in for Excel 2010 & 2013, you can find Information about it here: http://office.microsoft.com/en-us/excel/download-data-explorer-for-excel-FX104018616.aspx 

Note:

1) URL to download the add-in may change in future

2) The steps that I described may also change because as of today the ad-in is in “preview” stage and things may change in future.

PASS Business Analytics Conference Keynote Day #1

In this post, I’ll summarize the PASS Business Analytics Conference’s Keynote Day #1:

The structure of the Keynote:

PASSt Business Analytics Conference

One of the NEW challenges that Data Pros face today is complexity involved in building a BI solution. Following slides nicely represent the challenge from the Tools standpoint:

pass business analytics conference keynote hadoop

Image Courtesy: https://twitter.com/SQLGal/status/322342662013321216

Microsoft’s Goal is to SIMPLIFY the above situation

NEW Tools:

> Data Explorer (Excel add-in)

> Power View in Excel 2013

> Geo Flow

Key Take away from the demo’s was:

Power View is a great tool that you could use to extract insights from data.

E.g. Insights about Music Charts from Germany:

Now combine the power of Power View w/ the new capabilities like Data Explorer that let’s you find, combine & refine data via Data Explorer.

In the Demo, they combined data in hadoop w/ data in relational sources. This is Powerful!

And Also

The Preview for GeoFLow in Excel was announced!

They had a great demo on a pretty big touch device:

GEO FLOW For EXcel

Sorry for the poor image – but imagine a touch device of that size w/ an interactive data visualization that has 3D geo maps!

Conclusion:

They had a nice message at the end of the keynote:

 

An auto-refreshing Word Document that fetches data numbers from Excel:

In this post, I’ll show you how you can create a word document that:

1) Gets numbers from Excel

2) auto refreshes when the data gets changed/updated in Excel

Note: I am using Excel 2010 & Word 2010 for the purpose of this blog.

Here are the steps:

1. Scenario:

Let’s say we have a line in word that says “The revenue for year _____ was $ ______”

and we want to fetch data numbers for Year and revenue from Excel.

2. Let’s go!

here’s the data in Excel:

excel data numbers

3. In excel; copy A2 that has value 2012

4. switch to WORD:

now this is IMPORTANT. do NOT just paste it.

move your cursor to the destination position > right-click > paste options > Link and Merge formatting 

link and merge formatting

There’s also an option of link and keep source formatting, this is how it looks:

link and keep source formatting

5. repeat the steps for revenue

6. Close Word.

7. Open Excel and change the value of year to 2011 and revenue to 4000

8 Now open Word 2010, say yes if a dialog box pops up

And here you go:

linked word file to excel data

Conclusion:

In this post, we saw how to create an auto-refreshing word document that fetches data numbers from Excel.

Adding a TrendLine to a Time Series Line Chart in Excel 2010:

I was playing w/ a time series data set in Excel 2010 and learned how to add a Trend-line and in this blog post, I’ll share how I added it:

First up, How is Trend-line useful? Here are few answers:
- It helps us see how data is changing over time, in other words, it helps us find “trends”
- It helps us forecast future.

With that, here is the chart without Trend-line:on time flight arrivals excel without trendline

Now let’s add the trend-line and you’ll be able to compare on your own how Trend-line makes it easier to spot “trends”. Here are the steps:

1. select the line > right-click > add trend line

add trendline time series

2. configure the trend-line options

trend line configuration options excel

3. I also changed the line style

4. And Here’s the chart w/ trend-line

american airlines on time flight arrivals excel with trendline

Conclusion:

In this post, we saw how to add trend-line in the time series chart in excel 2010

Grab Twitter search data using R and export to a tab delimited file

In this blog-post, we would see how you can grab Twitter search data using R and then export it to tab delimited file. Here are the steps:

1) First up, if we do not have R – you can install it by following the tutorial: Let’s install R Studio and R on windows machine

2) Instal Package: TwitteR if you haven’t

3) Look at the following code, modify the path in line #4 for write.table:

> require(twitteR)

> tweets <- searchTwitter(“#excel”,n=1500)

> tweetdataframe <- do.call(“rbind”,lapply(tweets,as.data.frame))

> write.table(tweetdataframe,”c:/users/paras/desktop/tweetsaboutexcel.txt”,sep=”\t”)

4) so now you have tab delimited file having about 1500 tweets!

1500 tweets R excel tab delimited RStudio code

You can also export the tweets to Excel spreadsheet, SPSS and SAS. Check this out: quick R Exporting Data

Conclusion:

In this blog-post, we saw how you can grab 1500 tweets using R and then export it to a tab delimited file.

A Social Media Analytics Sample Dashboard in Excel Powered by PowerPivot.

I found a great sample Dashboard on Social Media Analytics in Excel that is powered by PowerPivot. Here’s the screenshot of the Dashboard.

excel powerpivot twitter social media analytics dashboard 1

Here are the steps if you want to download and play with the Dashboard:

  1. Install Power Pivot add-in
  2. Download the “Analytics for Twitter” excel sample (powered by PowerPivot). Link: http://www.microsoft.com/en-us/download/details.aspx?id=26213
  3. It creates an “Analytics for Twitter” excel file on Desktop > Open it.
  4. The dashboard is powered by data it pulls in the Power Pivot:excel powerpivot twitter social media analytics dashboard
  5. You can change the search queries:
    a. Edit the default search terms:excel powerpivot twitter social media analytics dashboard
    b. Refresh Data:excel powerpivot refresh data
    c. Updated Dashboard!excel powerpivot twitter social media analytics dashboard

That’s about it. And here’s a Youtube Video showing some features in this sample:

Conclusion:
In this blog-post, I shared a great sample dashboard built on top of PowerPivot model.

 

Crunch more than 1 million rows in Excel 2010 with free addin called Power Pivot!

Lately, I have been talking to few business folks who do their own data analysis in excel (2010) and sometimes they run into the excel 2010 limit of 1 million rows. And so when I hear that, I talk about Power Pivot and I talk about what It can do and what it cannot and they are just amazed that there’s a FREE add-in that will help them crunch more than 1 million rows!

happy suprised business user excel power pivot

Image courtesy

You can explore more about this amazing add-in here: http://www.microsoft.com/en-us/bi/powerpivot.aspx

And Read more about pros/benefits of PowerPivot:

Top 5 Ways PowerPivot Helps Excel Pros

PowerPivot? But I use pivot tables in Excel

PowerPivot Model: How to TEST or EDIT existing connections?

Have you ever had someone sent you a PowerPivot model and asked you to do something with it? And if so – may be, you would have to see what data source(s) the model is using and if applicable, you test the existing connections. If you find yourselves in such a situation, this blog post is for you:

1) Open the excel file and switch to PowerPivot Window

2) Now, switch to Design Tab > Click on Existing Connections:

existing connection design tab of powerpivot model

3) Here you’ll find the list of connections under “PowerPivot Data Connections” > Select the connection you wish to TEST or EDIT > Click on EDIT button

4)  Now here you can edit the data source. And if you click on “Test Connection”, you’ll be able to test it too.

edit data source connection power pivot model

5) After you’re done, click on SAVE.

And you’ll now see the “Existing Connections” Box again:

powerpivot select a connection to a data source that contains the data you want to import

Click CLOSE and you’re done, you have successfully edited or tested the existing connection.

 

 

How to import data from Azure Datamarket to Excel

Short answer: Download Azure Datamarket Excel AddIn

And you want to know why i am writing a blog post for it? spare couple of minutes and you will realize that you were better off just knowing the short answer. yeah seriously. And still if you are adamant on reading it – please drop me an email on contact[at]parasdoshi[dot]com, I want to talk to you! seriously!

Have you ever wondered how to import data from Azure Data market to PowerPivot Excel? And you know what I did? – since I knew we could load data from datamarket into powerpivot, I did that! There is an inbuilt support btw:


Now, I copied this data (CTRL C) and tried pasting it in Excel sheet (CTRL V). And you know what – nothing happened! So tried again! And again nothing happened. Now, i again selected the data from powerpivot window via right clicked -> copy. Went to excel worksheet and right clicked -> paste special. And guess, my laptop froze for a while and in a weird way I was happy because I thought that the copy was successful! But again it did not work. If it had, well I would not have written this blog post.

Any-who, so it was time to read some whitepapers blog posts. some googling and binging. And you know what, while I was binging and googling stuff, I liked the bing wallpaper, so i had to change my wallpaper. So I did that! Look at it, don’t you like it too:

And after a little tweeting facebooking searching, i found EXCEL ADD IN!! yeah! you can download it here: https://datamarket.azure.com/addin

After installation, you will find under the DATA tab. you can sign in to datamarket directly from there. you can create a datamarket account if you do not have one. you allow access if you have not done so before. And then you can browse available data-sets! it’s that easy.

Then you could just select the data-set you want to import and click on “import data”:

And then click on “import data” that you see at the bottom of the below screenshot.

And that’s it – downloading started! optionally you could filter the data if you want.

That’s it. Moral of the story:

Download Excel addin to import data from azure datamarket to excel

 

BTW I am using Excel 2010!