Here’s a map of all the basketweavers in the world:

[Map: basket makers of the world, with north arrow. Region labels: "Basket Making Mecca", "Forsaken Basketless Wasteland", "Too cold to weave".]

Or, well, that’s all of the basket makers in the world… according to data from OpenStreetMap.

Which seems off, right? It’s reasonable to assume there’s at least one Aussie out there crafting baskets. Obviously this means OpenStreetMap (OSM) is an unreliable data source and no one should ever use it.

Yet thousands of applications rely on data from it! Apple Maps supplements its maps with OSM, Foursquare bases its entire platform on it, and everything from government map-based tools to prominent news orgs uses it as a base.

Why are so many organizations using a source that’s clearly flawed?

I

OpenStreetWat

Well, for many places in the world, it’s surprisingly complete where it counts. For example, when it comes to road networks it has some of the best publicly available data.

These are major roads in OSM cropped to the United States, projected in a special version of Albers that has Hawaii and Alaska positioned in the bottom left corner. Had I not cropped the data, these roads would indeed extend into Canada & Mexico. I'm also excluding tertiary & service roads, as well as footpaths, trails, etc.


You can find other sources that have global coverage, are detailed, or are free to use, but only OSM offers it all. Which is what makes it so special.

Gathering geospatial data is an expensive endeavor. Most organizations dedicated to creating it have been either narrowly focused (creating a dataset of a city's parks), compiling it to sell, or just large enough to eat the sunk cost.

Started in 2004, OSM was created as an alternative to those approaches. Rather than throwing millions of dollars at the problem, it throws millions of volunteers at it. With some exceptions, all OSM data has been added by volunteers mapping their world. Much like Wikipedia, anyone can edit any and all of the information, adding features to the map or adding proper tags.

Hover over the buildings to see how people have labeled each building's type & name. You may notice a lot of Building: Yes. I'd love to say that's a Yoko Ono-inspired hurrah for masonry, but it's just the default type assigned to a building if one isn't defined or is unknown.


Above is the building data for Madison, Wisconsin. You might notice as you zoom out that it tapers off toward the upper right-hand side once you get away from the downtown.

Much like the neglected basketweavers of the world, no one's gotten around to adding in those residential buildings quite yet. Mapping the world is an insurmountable task, so most mappers have understandably prioritized the most important buildings, such as hospitals and schools, before endless swaths of residential blocks.

Particularly in major cities, the map is incredibly detailed.

Guess the city!


As you may have picked up from the basketweaver map, Europe is the best mapped continent. This is in part because OSM was founded in Britain, but it's also thanks to vibrant mapping communities scattered throughout Europe that get together in pubs and map their areas.

II

So how is this Reliable?

For the reasons mentioned above, every map made with OSM data should have a caveat: There’s almost certainly missing bits here.

That said, because anyone can edit it, it’s constantly being improved.

For example, here’s an interactive that shows the nearest bar for any given point in DC.

Hover over the map to view a voronoi highlight of which areas are closest to each bar. There's almost certainly missing bits here.
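A Voronoi highlight like this one is another way of asking, for any point, "which bar is nearest?" Here is a minimal sketch of that lookup, assuming the bars have already been exported as plain latitude/longitude pairs. The bar names and coordinates below are made up for illustration, and a brute-force scan stands in for whatever the interactive actually does.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_bar(lat, lon, bars):
    """Return the bar closest to (lat, lon) by scanning every candidate."""
    return min(bars, key=lambda b: haversine_km(lat, lon, b["lat"], b["lon"]))

# Hypothetical sample data, not real DC bars.
BARS = [
    {"name": "Bar A", "lat": 38.9100, "lon": -77.0300},
    {"name": "Bar B", "lat": 38.9000, "lon": -77.0500},
    {"name": "Bar C", "lat": 38.8900, "lon": -77.0000},
]

if __name__ == "__main__":
    print(nearest_bar(38.9050, -77.0320, BARS)["name"])
```

Every point that returns the same bar belongs to that bar's Voronoi cell, so a bar missing from the data silently hands its cell over to its neighbors.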


I’m sure someone native to DC will notice their favorite bar missing, which could be because that bar is missing in the database, or tagged something other than bar, such as a restaurant. If I had this interactive hooked up to the live database, anyone adding a bar feature to DC would instantly see it pop up on the map.

I made this map because it’s a fun novelty, but that principle of ‘one true source’ is another place OSM shines. Humanitarian organizations in particular make great use of mapped areas. They can mark roads as flooded, use building data to determine where to allocate resources, or assess damage to towns using before/after satellite imagery.

But if anyone can update the data, how do we make sure it’s valid? What’s to stop some intrepid kid from drawing dicks everywhere?

III

Pokemon & Valid Geodata

In many ways, OSM has protection through obscurity.

It’s less visible than Wikipedia, and the notion of adding features to a map sounds significantly more difficult than just editing text. Most interactions we make online are editing text fields after all, so we’re well trained there.

In mid-2016, Niantic released PokemonGO and brought with it a flood of folks adding made-up features to OSM.

PokemonGO is a game that uses the world as its base, and what you encounter in the game is determined by what natural features you’re physically around. If you’re near a lake you’ll run into water Pokemon. If you’re in a field, you’ll find bird Pokemon, and so on. It encourages folks to leave their house and go to parks, rivers, and such to find all these critters.

At some point someone discovered PokemonGO was referencing OSM data when it determined which natural features its players were around. For some folks, this was great to know because their hometown hadn’t been well mapped out:

I just signed up on OSM and made some edits to my very small hometown which had nothing ever added to it before. I added the park, the baseball field, landmark names. Also made some edits near where I work (added an area as industrial which wasn’t labeled before).

- reddit person AlphaRocker

Others took advantage of this and created ‘nests’ of fake features surrounding their home.

Can't help but love that they at least acknowledged that these weren't actual parks.

Others would misunderstand OSM and would add things that only existed in the game to the basemap, like checkpoints or item drop locations.

Luckily, OSM is widely used enough that there are people who run checks over the database to identify bad contributions. After a comb is run through the entire database, these checks flag potential bugs, and then someone either removes or reconciles the issue.

One basic example is a check that makes sure there aren’t polygons that self-intersect.

There might be buildings that are built in a bowtie shape, but it's a flag if one wall goes straight through the building. In the event of a bowtie building, it should just be an outline of that shape.
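To make that check concrete, here is a minimal sketch of a self-intersection test. It is an illustration, not the code OSM's validators actually run: it treats coordinates as flat x/y pairs, compares every pair of edges (fine for a building, slow for a coastline), and only reports edges that properly cross. Touching or overlapping edges are left alone.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def _segments_cross(p1, p2, p3, p4):
    """True if segment p1-p2 and segment p3-p4 cross at a single interior point."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
            and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)

def self_intersects(ring):
    """ring: list of (x, y) vertices in order, without repeating the first one at the end."""
    n = len(ring)
    edges = [(ring[i], ring[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Neighboring edges share a vertex by design, so skip them.
            if j == i + 1 or (i == 0 and j == n - 1):
                continue
            if _segments_cross(*edges[i], *edges[j]):
                return True
    return False

if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    bowtie = [(0, 0), (1, 1), (1, 0), (0, 1)]  # one wall cuts through the building
    print(self_intersects(square), self_intersects(bowtie))
```

A real bowtie-shaped building drawn as an outline has a vertex where the two halves meet, so its edges touch there rather than cross and the check stays quiet.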

For PokemonGO, folks created checks that looked for:

  • Is the word ‘Pokemon’ used in the contribution anywhere?
  • Is this contribution full of questionably overlapping features, such as seven overlapping parks?
  • Is this contribution made by someone who just joined, with fewer than 5 edits, and mapping major features like lakes?
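Those three questions translate fairly directly into code. The sketch below is a guess at the shape of such a check, not the actual tooling: the Contribution class is a hypothetical, simplified stand-in for an OSM changeset, the overlap count is assumed to be computed elsewhere, and the thresholds are invented.

```python
from dataclasses import dataclass

@dataclass
class Contribution:
    """Hypothetical, simplified stand-in for an OSM changeset."""
    comment: str
    author_edit_count: int
    feature_tags: list          # one dict of tags per added feature
    overlapping_parks: int = 0  # assumed to be precomputed by a geometry check

# Tag pairs treated as "major features" for this sketch.
MAJOR_FEATURES = {("natural", "water"), ("leisure", "park")}

def flag_reasons(c, max_overlapping_parks=3, new_user_edits=5):
    """Return a list of reasons a human should take a second look (empty if none)."""
    reasons = []

    # 1. Is the word 'Pokemon' used anywhere in the comment or tag values?
    words = [c.comment]
    for tags in c.feature_tags:
        words.extend(str(v) for v in tags.values())
    text = " ".join(words).lower()
    if "pokemon" in text or "pokémon" in text:
        reasons.append("mentions Pokemon")

    # 2. Is it full of questionably overlapping features?
    if c.overlapping_parks > max_overlapping_parks:
        reasons.append("many overlapping parks")

    # 3. Is a brand-new account mapping major features?
    maps_major = any(
        (k, v) in MAJOR_FEATURES for tags in c.feature_tags for k, v in tags.items()
    )
    if c.author_edit_count < new_user_edits and maps_major:
        reasons.append("new user mapping major features")

    return reasons

if __name__ == "__main__":
    suspicious = Contribution("Pokemon nest", 2, [{"leisure": "park"}], 7)
    print(flag_reasons(suspicious))
```

Note that the function only flags. As described above, a person still decides whether to remove or reconcile the contribution, which matters because plenty of honest newcomers start by mapping their local park.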

Broadly speaking, the same sort of checks are used to detect other vandalism across the rest of the database. Even when checks don’t catch vandalism, OSM is widely used and people are pretty quick to notice when something like the following shows up in their hometown:

IV

Basketweavers Unite

So, it’s incomplete, but it’s awesome. And it’s in a state of constant improvement.

It’s not perfect, but no geospatial dataset can be; the only 1-to-1 translation of the actual world is the real thing. At least with OSM there is a means of updating it that's open to anyone with the gumption to contribute.

So, if you know any local basketweavers in your hometown, you should consider adding them to the database. Once you’ve done that, consider checking whether your favorite bar is there, or whether the local parks are missing.

Even small contributions like those will help countless others have ever so slightly more accurate maps.

All the maps here were made with Mapbox GL JS. The data is all from OSM and obtained through Overpass Turbo queries.
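For anyone curious what such a query looks like, here is a rough sketch in Python rather than the Overpass Turbo web interface. These are not the queries used for this post: the craft=basket_maker tag is my assumption about how basket makers are tagged, and the endpoint is the commonly used public Overpass instance. The fetch function hits a shared public server, so use it sparingly.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass API endpoint (assumed; other instances exist).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"

def build_query(key, value, timeout=60):
    """Build an Overpass QL query for every node, way, or relation with key=value."""
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'nwr["{key}"="{value}"];\n'
        "out center;"
    )

def fetch(query):
    """POST the query to the Overpass API and return the list of matching elements."""
    body = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(OVERPASS_URL, data=body, timeout=90) as resp:
        return json.load(resp)["elements"]

if __name__ == "__main__":
    # Assumed tag for basket makers; check the OSM wiki before relying on it.
    query = build_query("craft", "basket_maker")
    print(query)
    # elements = fetch(query)  # uncomment to run against the live server
```

The "out center;" line asks for a single representative point for ways and relations, which is enough to drop a dot on a world map.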

If you want to hear more on vandalism detection and how people are checking OSM, Sanjay Bhangar of Mapbox has a great talk you can view here.

The US road data was pulled out of Geofabrik exports of North America using code written by Marc Farra. The result you see here has been simplified via a combination of topojson & mbtiles to help reduce the geometry at lower zoom levels.

I chose basket weavers as a quirky way to show off OSM's weaknesses, strengths, and occasional odd specificity. If you have any comments, or more importantly corrections, I'd love to hear them. Feel free to drop me a line.