Here’s a map of all the basketweavers in the world:
[Map: basket makers worldwide. Map labels: "Basket Making Mecca", "Forsaken Basketless Wasteland", "Too cold to weave"]
Or, well, that’s all of the basket makers in the world… according to data from OpenStreetMap.
Which seems off, right? It’s reasonable to assume there’s at least one Aussie out there crafting baskets. Obviously this means OpenStreetMap (OSM) is an unreliable data source and no one should ever use it.
Yet thousands of applications rely on its data! Apple Maps supplements its maps with OSM, Foursquare bases its entire platform on it, and everything from map-based government tools to prominent news orgs uses it as a base.
Why are so many organizations using a source that’s clearly flawed?
I
Well, for many places in the world, it’s surprisingly complete where it counts. For example, when it comes to road networks it has some of the best publicly available data.
You can find other sources that have global coverage, are detailed, or are free to use, but only OSM offers it all. Which is what makes it so special.
Gathering geospatial data is an expensive endeavor. Most organizations dedicated to creating it were either narrowly focused (creating a dataset of a city’s parks), compiling it to sell, or just large enough to eat the sunk cost.
Started in 2004, OSM was created as an alternative to those approaches. Instead of throwing millions of dollars at the problem, it throws millions of volunteers. With some exceptions, all OSM data has been added by volunteers mapping their world. Much like Wikipedia, anyone can edit any and all of the information, adding features to the map or adding proper tags.
Above is the building data for Madison, Wisconsin. You might notice as you zoom out that it tapers off in the upper right-hand side once you get away from the downtown.
Much like the neglected basketweavers of the world, no one’s gotten around to adding those residential buildings quite yet. Mapping the world is an insurmountable task, so most mappers have understandably prioritized the most important buildings, such as hospitals, schools, and other major structures, before endless swaths of residential blocks.
Particularly in major cities, the map is incredibly detailed.
As you may have picked up from the basketweaver map, Europe is the best-mapped continent. This is in part because OSM was founded in Britain, but that lead is maintained thanks to vibrant mapping communities scattered throughout Europe that get together in pubs and map their areas.
II
For the reasons mentioned above, every map made with OSM data should have a caveat: There are almost certainly missing bits here.
That said, because anyone can edit it, it’s constantly being improved.
For example, here’s an interactive that shows the nearest bar for any given point in DC.
I’m sure someone native to DC will notice their favorite bar missing, which could be because that bar is missing from the database or is tagged as something other than a bar, such as a restaurant. If I had this interactive hooked up to the live database, anyone adding a bar feature to DC would instantly see it pop up on the map.
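For the curious, a nearest-bar lookup can be sketched as a brute-force search over great-circle distances. This is only a minimal illustration, not necessarily how the interactive works: the function names are my own, and the bar names and coordinates are made-up sample data rather than real OSM records.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point drift before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def nearest_bar(lat, lon, bars):
    """Return the bar closest to (lat, lon) by checking every bar in turn."""
    return min(bars, key=lambda b: haversine_km(lat, lon, b["lat"], b["lon"]))

# Hypothetical sample data, NOT real OSM features.
SAMPLE_BARS = [
    {"name": "Bar A", "lat": 38.9100, "lon": -77.0300},
    {"name": "Bar B", "lat": 38.9000, "lon": -77.0400},
    {"name": "Bar C", "lat": 38.8900, "lon": -77.0000},
]

if __name__ == "__main__":
    print(nearest_bar(38.9090, -77.0310, SAMPLE_BARS)["name"])
```

Checking every bar is fine for one city. For much larger datasets a spatial index would be the usual next step.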
I made this map because it’s a fun novelty, but that principle of ‘one true source’ is another place OSM shines. Humanitarian organizations in particular make great use of mapped areas. They can mark roads as flooded, use building data to determine where to allocate resources, or assess damage to towns using before/after satellite imagery.
But if anyone can update the data, how do we make sure it’s valid? What’s to stop some intrepid kid from drawing dicks everywhere?
III
In many ways, OSM has protection through obscurity.
It’s less visible than Wikipedia, and the notion of adding features to a map sounds significantly more difficult than just editing text. Most interactions we make online are editing text fields after all, so we’re well trained there.
In mid-2016, Niantic released PokemonGO, and with it came a flood of folks adding made-up features to OSM.
PokemonGO is a game that uses the world as its base, and what you encounter in the game is determined by what natural features you’re physically around. If you’re near a lake you’ll run into water Pokemon. If you’re in a field, you’ll find bird Pokemon, and so on. It encourages folks to leave their house and go to parks, rivers, and such to find all these critters.
At some point someone discovered PokemonGO was referencing OSM data when it determined which natural features its players were around. For some folks, this was great to know because their hometown hadn’t been well mapped out:
I just signed up on OSM and made some edits to my very small hometown which had nothing ever added to it before. I added the park, the baseball field, landmark names. Also made some edits near where I work (added an area as industrial which wasn’t labeled before).
Others took advantage of this and created ‘nests’ of fake features surrounding their home.
Others would misunderstand OSM and would add things that only existed in the game to the basemap, like checkpoints or item drop locations.
Luckily, OSM is widely used enough that there are people who run checks over the database to identify bad contributions. These checks comb through the entire database and flag potential problems, and then someone either removes or reconciles each issue.
One basic example is a check that makes sure there aren’t polygons that self-intersect.
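A self-intersection check like that can be sketched in a few lines. This is a simplified illustration with names of my own choosing, not the code any OSM validator actually runs: it only detects edges that properly cross, ignores edges that merely touch or overlap along a line, and compares every pair of edges, which is slow for large polygons.

```python
def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def _segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a single interior point."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)

def ring_self_intersects(ring):
    """Check a polygon ring, given as a list of (x, y) vertices, for crossing edges."""
    # Closed rings often repeat the first vertex at the end; drop the repeat.
    if len(ring) > 1 and ring[0] == ring[-1]:
        ring = ring[:-1]
    n = len(ring)
    edges = [(ring[i], ring[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Neighboring edges share a vertex, so they cannot properly cross.
            if j == i + 1 or (i == 0 and j == n - 1):
                continue
            if _segments_cross(*edges[i], *edges[j]):
                return True
    return False

if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    bowtie = [(0, 0), (1, 1), (1, 0), (0, 1)]
    print(ring_self_intersects(square), ring_self_intersects(bowtie))
```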
For PokemonGO, folks created checks that looked for:
- Is the word ‘Pokemon’ used in the contribution anywhere?
- Is this contribution full of questionably overlapping features, such as seven overlapping parks?
- Is this contribution made by someone who just joined, with fewer than 5 edits, and mapping major features like lakes?
Broadly speaking, the same sort of checks are used to detect other vandalism across the rest of the database. Even when checks don’t catch vandalism, OSM is widely used, and people are pretty quick to notice when something like the following shows up in their hometown:
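Those three questions could be sketched as a small rule-based flagger. Everything here is a hypothetical illustration rather than the real checks: the `Contribution` and `Feature` shapes, the set of "major" feature kinds, the bounding-box overlap test, and the overlap threshold are all assumptions of mine. It also only looks at edit count, not at when the account was created.

```python
from dataclasses import dataclass, field

# Hypothetical list of feature kinds treated as "major".
MAJOR_KINDS = {"lake", "river", "forest"}

@dataclass
class Feature:
    kind: str      # e.g. "park" or "lake"
    name: str
    bbox: tuple    # (min_x, min_y, max_x, max_y), a stand-in for real geometry

@dataclass
class Contribution:
    comment: str
    author_edit_count: int
    features: list = field(default_factory=list)

def bboxes_overlap(a, b):
    """True if two bounding boxes share some interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def flag_contribution(c, max_overlapping_parks=3):
    """Return a list of reasons this contribution looks suspicious (empty if none)."""
    reasons = []
    # Check 1: the word 'pokemon' anywhere in the comment or feature names.
    # (Plain substring match; the accented spelling would need normalizing.)
    texts = [c.comment] + [f.name for f in c.features]
    if any("pokemon" in t.lower() for t in texts):
        reasons.append("mentions pokemon")
    # Check 2: a park overlapping suspiciously many other parks.
    parks = [f for f in c.features if f.kind == "park"]
    for p in parks:
        overlaps = sum(1 for q in parks if q is not p and bboxes_overlap(p.bbox, q.bbox))
        if overlaps >= max_overlapping_parks:
            reasons.append("overlapping parks")
            break
    # Check 3: a brand-new mapper adding major features.
    if c.author_edit_count < 5 and any(f.kind in MAJOR_KINDS for f in c.features):
        reasons.append("new mapper adding major features")
    return reasons
```

A flag is only a prompt for a human to take a look. Plenty of first-time mappers add real lakes.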
IV
So, it’s incomplete, but it’s awesome. And it’s in a state of constant improvement.
It’s not perfect, but no geospatial dataset can be: the only 1-to-1 translation of the actual world is the real thing. At least with OSM there exists a means of updating it that is open to anyone with the gumption to contribute.
So, if you know any local basketweavers in your hometown, you should consider adding them to the database. Once you’ve done that, consider checking whether your favorite bar is there, or whether the local parks are missing.
Even small contributions like those will help countless others have ever so slightly more accurate maps.
All the maps here were made with Mapbox GL JS. The data is all from OSM and obtained through Overpass Turbo queries.
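If you want to try something similar, here is a rough sketch of how an Overpass query for tagged features could be assembled. The helper name is my own, and I am assuming the basket makers are tagged `craft=basket_maker`, so check the OSM wiki before relying on it. The exact queries behind the maps on this page are not reproduced here.

```python
def build_overpass_query(key, value, timeout_s=60):
    """Return an Overpass QL query for nodes, ways, and relations tagged key=value."""
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        f'nwr["{key}"="{value}"];\n'
        "out center;"
    )

if __name__ == "__main__":
    # Assumed tag for basket makers; bars would be ("amenity", "bar").
    print(build_overpass_query("craft", "basket_maker"))
```

The printed text can be pasted into Overpass Turbo. With no bounding box it searches the whole planet, so expect it to be slow for common tags.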
If you want to hear more on vandalism detection and how people are checking OSM, Sanjay Bhangar of Mapbox has a great talk you can view here.
The US road data was pulled out of Geofabrik exports of North America using code written by Marc Farra. The result you see here has been simplified via a combination of TopoJSON and MBTiles to help reduce the geometry at lower zoom levels.
I chose basket weavers as a quirky way to show off OSM's weaknesses, strengths, and occasional odd specificity. If you have any comments, or more importantly corrections, I'd love to hear them. Feel free to drop me a line.