How to map Open Data
One of the reasons we love maps so much is that they reveal far more than tabular data can communicate. Yet, there are thousands of public Excel documents, Wikipedia HTML tables, and government CSVs with really interesting stories to tell. All that's required to unlock these insights is a human like you to highlight, copy, and paste into BatchGeo. You can keep the resulting map to yourself, or pay it forward by sharing with friends, co-workers, or the greater community.
It's easier than you might realize to get started with open data. This tutorial walks you through an example from Wikipedia, one of thousands of lists its editors have compiled: the tallest buildings in the world. Below we'll show how to obtain a dataset, clean and reduce the data, plot the locations on a map, and share it with the world (or just the people you choose).
Copy the Wikipedia data
The tallest buildings data is listed in Wikipedia as a standard HTML table. Using your mouse, you should be able to highlight the entire contents of the table by clicking and dragging from the upper left to the lower right of the table. Sometimes the initial header may be hyperlinked, so you may need to start your highlight outside of the table.
With everything highlighted, you can copy with Ctrl+C (Cmd+C on Mac). Now this text can be pasted directly into your spreadsheet software, such as Excel or Numbers. Spreadsheet software is not necessarily required. You may be able to massage the data in a simple text file, or even paste it directly into BatchGeo.
For this example, there are a few artifacts from Wikipedia references remaining, as well as some highly formatted text, so we want to do some cleanup. Rather than use the shortcut to paste (Ctrl+V in Windows or Cmd+V on Mac), go to the edit menu and choose Paste Special. This will provide the option for Text, which will be the unformatted text.
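If you would rather script this step than copy and paste by hand, the same "unformatted text" result can be approximated in Python. The following is only a minimal sketch using the standard library: it assumes you have already saved the table's HTML into a string, and the sample table, the class name TableTextParser, and the function name table_to_tsv are made up for illustration. Reference markers such as [9] survive as plain text, just as they do with Paste Special.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the plain text of every table cell in the HTML it is fed."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Links, bold text, and other formatting tags are ignored;
        # only the text inside a cell is kept.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace so each cell is one clean string.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_tsv(html):
    """Return the table as tab-separated text, one line per table row."""
    parser = TableTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join("\t".join(row) for row in parser.rows)


# Made-up sample table, not real Wikipedia data.
sample_html = """
<table>
  <tr><th>Rank</th><th>Building</th><th>City</th></tr>
  <tr><td>1</td><td><a href="#"><b>Example Tower</b></a></td><td>Example City</td></tr>
</table>
"""
print(table_to_tsv(sample_html))
```

The tab-separated output can be pasted into a spreadsheet in the same way as the hand-copied text.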
There’s still a little cleanup left to do. In the next section we'll show how to do that using Excel.
Clean the Wikipedia data
With the tallest buildings data stored in an Excel document, we can make a few tweaks before submitting the data to BatchGeo. For example, the pasted data begins on the second row, so we can remove the first row entirely to keep the sheet tidy. This will help us sort the data if we want to explore it before unlocking the geographic elements using BatchGeo.
The first row of your data should always contain headings, which BatchGeo will use to show all the meta-data on the map (unless you choose otherwise). Make sure these headings match what you want others to see and that they're free of Wikipedia referencing artifacts. For example, you should remove the [A][9] from the Building label.
To remove all the bracketed references, use Excel’s find and replace with a wildcard search. Click Edit, then select Replace. You want to find [*] and replace it with nothing. The * is a wildcard and will match anything within the brackets. Click Find Next, then click Replace for each match you want to remove. You can try Replace All, but this can be dangerous for searches that match more than you had in mind.
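The same find-and-replace can be sketched outside Excel with a regular expression. This is an illustration rather than part of the BatchGeo workflow, and the function name strip_references is hypothetical. Like Replace All, it removes every bracketed span, so check that your data has no brackets you want to keep.

```python
import re

# One bracketed marker such as [A] or [9]: an opening bracket,
# any run of non-bracket characters, then a closing bracket.
BRACKETED = re.compile(r"\[[^\[\]]*\]")


def strip_references(text):
    """Remove bracketed markers and tidy up any leftover whitespace."""
    cleaned = BRACKETED.sub("", text)
    return " ".join(cleaned.split())


print(strip_references("Building[A][9]"))  # Building
```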
At this point, our data is looking pretty good. We could use another Find and Replace to remove the redundant “m” and “ft” units if we wanted, though BatchGeo will still interpret them as numbers for grouping. If you’d like, this is a good time to re-order columns to display differently. BatchGeo displays meta-data in the same order as the spreadsheet, with columns on the left being displayed first. When you’re happy with what you see, you’re ready to create a map the simple, easy, BatchGeo way.
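For anyone scripting the optional cleanup described above, here is a minimal Python sketch of stripping the units and re-ordering columns. The function names, column names, and sample values are assumptions made for illustration, and the number pattern only handles simple values such as "500 m" or "1,640 ft".

```python
import re

# A number with optional thousands separators and decimals, e.g. 1,640 or 500.5
NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def leading_number(cell):
    """Return the first number found in a cell like '1,640 ft', or None."""
    match = NUMBER.search(cell)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


def reorder_columns(rows, order):
    """Rearrange every row so its columns follow the heading names in order."""
    header = rows[0]
    indexes = [header.index(name) for name in order]
    return [[row[i] for i in indexes] for row in rows]


# Made-up sample rows; the first row holds the headings.
rows = [
    ["Rank", "Building", "Height"],
    ["1", "Example Tower", "500 m"],
]
print(leading_number("1,640 ft"))  # 1640.0
print(reorder_columns(rows, ["Building", "Height", "Rank"]))
```

Putting the columns you care about most on the left matches the display order described above.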
Create your map
With your data all cleaned up, it's time for the fun part. Highlight all the cells of your data by clicking and dragging from the upper left to the lower right. Alternatively, you can click the column letter above the header row and drag to the right, selecting all the columns as well as the data below them. With the data you want selected, copy it with Ctrl+C (or Cmd+C on Mac).
Load the BatchGeo home page. If you have 250 or fewer lines of data, you don't even need to create an account to try our free batch geocoding service. Right from the home page, click into the data box and paste your data using Ctrl+V (or Cmd+V on Mac). Then click Validate and Set Options, which will let you choose some additional BatchGeo features.
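If your cleaned data lives in a script rather than a spreadsheet, a short sketch like the one below can produce text to paste and check it against the 250-line figure mentioned above. It assumes tab-separated text, which is what a spreadsheet places on the clipboard when you copy cells, and it assumes the heading row does not count toward the limit. Both function names are hypothetical.

```python
import csv
import io

# The limit quoted in this tutorial for use without an account.
FREE_LINE_LIMIT = 250


def rows_to_paste_text(rows):
    """Serialize rows as tab-separated text, one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def fits_without_account(rows, limit=FREE_LINE_LIMIT):
    """Check the number of data rows, assuming the heading row is not counted."""
    return len(rows) - 1 <= limit


# Made-up sample rows; the first row holds the headings.
rows = [
    ["Rank", "City"],
    ["1", "Example City"],
]
print(rows_to_paste_text(rows))
print(fits_without_account(rows))  # True
```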
Change the Region to International, since these buildings are all over the world. City should be correctly pre-selected, and BatchGeo chooses the first column, Rank, as the default grouping. You can change the grouping and tweak advanced features if you’d like. Or, just click Make Map to start the geocoding process, which usually takes just a couple of seconds.
You’ll see a preview of your map, complete with the ability to click around and group by your meta-data. Look good? Click Save & Continue, where you’ll provide a name and enter your email address so you can make changes to the map in the future. You’ll also be able to select whether your map is public, which is required for some sharing options, such as embedding.
Share your map
BatchGeo has sharing capabilities built into our tool. For starters, you can simply copy the URL from the location bar in your web browser. Every map has its own unique address. Highlight and copy (Ctrl+C or Cmd+C) the address, which will start with batchgeo.com/map/. You can then paste (Ctrl+V or Cmd+V) the URL into an email, an instant message, a tweet, or anywhere else you want to share it.
We also have an embeddable map feature that can put your map within any web page, such as a blog post. You can embed a fully interactive Google Map, or a simple image badge, which provides a quick preview of your map and a link to the full map. BatchGeo pre-populates the code for each of these in the map editing screen and in the email you receive when you provide an email address as you save your map.
Find other open data
Now that you've seen how easy it is to take open data from Wikipedia and map it on a BatchGeo map, you likely want to find more of it. One fun place to look on Wikipedia is the List of Lists, which points to data-centric pages like the tallest buildings example. Not every list is place-related or ready for copy-pasting, but you'll find plenty of mapping opportunities within its many pages.
You can also look to open data repositories such as the Data.gov data catalog. Using the search functionality, you can filter to find only Excel documents or CSVs. You’ll have to read the descriptions to find out whether the data is location-specific. Some of what Data.gov provides may contain latitude and longitude points. BatchGeo can use these map coordinates as an alternative to an address, city name, or other location.
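One way to check a downloaded CSV for coordinates before mapping it is sketched below. The heading names it looks for (such as Latitude and Longitude) are assumptions, since every dataset labels its columns differently, and the sample data and function names are made up for illustration.

```python
import csv
import io

# Heading names treated as coordinates; real datasets may use others.
LAT_NAMES = {"lat", "latitude"}
LON_NAMES = {"lon", "lng", "long", "longitude"}


def find_coordinate_columns(header):
    """Return (lat_index, lon_index); either is None when not found."""
    lat = lon = None
    for index, name in enumerate(header):
        key = name.strip().lower()
        if key in LAT_NAMES and lat is None:
            lat = index
        elif key in LON_NAMES and lon is None:
            lon = index
    return lat, lon


def valid_coordinates(lat_text, lon_text):
    """True when both values parse as numbers within the valid ranges."""
    try:
        lat, lon = float(lat_text), float(lon_text)
    except ValueError:
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180


def rows_with_coordinates(rows):
    """Keep only the data rows that carry a usable latitude and longitude."""
    lat_col, lon_col = find_coordinate_columns(rows[0])
    if lat_col is None or lon_col is None:
        return []
    return [r for r in rows[1:] if valid_coordinates(r[lat_col], r[lon_col])]


# Made-up sample data, not taken from Data.gov.
sample_csv = "Name,Latitude,Longitude\nExample Site,38.8895,-77.0353\nBad Row,unknown,\n"
rows = list(csv.reader(io.StringIO(sample_csv)))
print(rows_with_coordinates(rows))
```

Rows that fail the check would still need an address, city name, or other location to be placed on the map.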
By mapping open data, you're creating a brand new way to view the story underlying the information. You can keep the insights for yourself, or spread them far and wide by creating a public map. Get started now for free.