
While browsing Taginfo, I got curious about how many elements have at least one key described on the Wiki, and how large a share of all keys the described ones make up. So I decided to check.

The analysis consisted of the following parts:

  1. fetching the OSM database dump from planet.osm.org;
  2. fetching key statistics from Taginfo via its API;
  3. extracting the “is in Wiki” info into a separate file;
  4. correcting the “is in Wiki” info for keys that were described on the Wiki only after the database was dumped, based on the Wiki's recent changes log;
  5. processing the dump with DuckDB:
    • extracting each element's type, ID, and tags into a new table: CREATE TABLE elements AS SELECT kind, id, tags FROM ST_READOSM('planet-latest.osm.pbf');
    • exploding the keys into separate records: CREATE TABLE elements_keys AS SELECT kind, id, UNNEST(map_keys(tags)) AS "key" FROM elements;
  6. querying the database.
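
Steps 3 and 4 can be sketched in Python as follows. This is a minimal sketch, not the script actually used. It assumes the Taginfo response has already been parsed into a dictionary whose `data` list holds one object per key with `key` and `in_wiki` fields, and that keys described on the Wiki only after the dump should count as undescribed. The function names and the sample data are made up for illustration; only the `keys_wiki.csv` layout (columns `key` and `in_wiki`) follows the queries used in this post.

```python
import csv
import io


def extract_in_wiki(taginfo_response):
    """Map every key in a parsed Taginfo response to its "is in Wiki" flag."""
    # Assumption: each item in "data" carries "key" and "in_wiki" fields.
    return {item["key"]: bool(item["in_wiki"]) for item in taginfo_response["data"]}


def revert_late_descriptions(in_wiki, described_after_dump):
    """Mark keys that were documented only after the dump as undescribed."""
    corrected = dict(in_wiki)
    for key in described_after_dump:
        if key in corrected:
            corrected[key] = False
    return corrected


def write_keys_wiki_csv(in_wiki, out_file):
    """Write the flags as CSV with the columns "key" and "in_wiki"."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["key", "in_wiki"])
    for key in sorted(in_wiki):
        writer.writerow([key, "true" if in_wiki[key] else "false"])


# Hypothetical sample data standing in for the real API response.
sample_response = {
    "data": [
        {"key": "building", "in_wiki": True},
        {"key": "example:new_key", "in_wiki": True},
        {"key": "undocumented_key", "in_wiki": False},
    ]
}
# Hypothetical key whose Wiki page appeared after the dump was taken.
late_keys = {"example:new_key"}

flags = revert_late_descriptions(extract_in_wiki(sample_response), late_keys)
buffer = io.StringIO()
write_keys_wiki_csv(flags, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

In a real run the output would go to a file named `keys_wiki.csv` instead of an in-memory buffer, so that DuckDB can read it directly.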

These are the queries I ran in DuckDB:

Result | Query
number of all elements | SELECT COUNT(*) FROM elements;
number of tagged elements | SELECT COUNT(*) FROM elements WHERE tags IS NOT NULL;
number of elements with key(s) described on the Wiki | SELECT COUNT(*) FROM (SELECT DISTINCT kind, id FROM elements_keys WHERE "key" IN (SELECT "key" FROM 'keys_wiki.csv' WHERE in_wiki));
number of all keys | SELECT COUNT(*) FROM (SELECT DISTINCT "key" FROM elements_keys);
number of keys described on the Wiki | SELECT COUNT(*) FROM (SELECT DISTINCT "key" FROM elements_keys WHERE "key" IN (SELECT "key" FROM 'keys_wiki.csv' WHERE in_wiki));
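
To make the counting logic of these queries easier to follow, here is a small Python illustration on toy data. It is only an analogy for what the SQL computes, not part of the actual analysis; the four elements and the set of Wiki-described keys are invented.

```python
# Hypothetical toy data: (kind, id) -> tags, or None for untagged elements.
elements = {
    ("node", 1): {"highway": "bus_stop", "name": "Main Street"},
    ("node", 2): None,
    ("way", 1): {"my_private_key": "yes"},
    ("way", 2): {"building": "yes", "my_private_key": "no"},
}
# Hypothetical set of keys flagged as described on the Wiki.
wiki_keys = {"highway", "name", "building"}

# Counterpart of the elements_keys table: one (kind, id, key) record per key.
elements_keys = [
    (kind, element_id, key)
    for (kind, element_id), tags in elements.items()
    if tags is not None
    for key in tags
]

all_elements = len(elements)
tagged_elements = sum(1 for tags in elements.values() if tags is not None)
# DISTINCT kind, id: an element counts once, however many described keys it has.
described_elements = len(
    {(kind, element_id) for kind, element_id, key in elements_keys if key in wiki_keys}
)
all_keys = len({key for _, _, key in elements_keys})
described_keys = len({key for _, _, key in elements_keys if key in wiki_keys})

print(all_elements, tagged_elements, described_elements, all_keys, described_keys)
```

Note that `("way", 1)` is tagged but does not count as described, because its only key is missing from the Wiki set.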

I got the following results:

  • all elements: 11,759,061,283
    • of which are tagged: 1,459,875,801 (12.415% of all elements)
      • of which have a key described on the Wiki: 1,459,659,709 (99.985% of all tagged elements)
  • all keys: 109,269
    • of which are described on the Wiki: 5,865 (5.367% of all keys)
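
The percentages follow directly from the counts. A short Python check, using only the figures listed above (the helper name `share` is arbitrary):

```python
def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


all_elements = 11_759_061_283
tagged_elements = 1_459_875_801
described_elements = 1_459_659_709
all_keys = 109_269
described_keys = 5_865

tagged_share = f"{share(tagged_elements, all_elements):.3f}"
described_elements_share = f"{share(described_elements, tagged_elements):.3f}"
described_keys_share = f"{share(described_keys, all_keys):.3f}"

print(tagged_share, described_elements_share, described_keys_share)
# prints: 12.415 99.985 5.367
```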

The analysis leads to the following conclusions:

  • a small number of Wiki-described keys covers nearly all tagged elements, which is consistent with the Pareto principle;
  • the “any tags you like” rule poses no significant threat to tagging consistency, since virtually every tagged element has at least one key that can be looked up on the Wiki;
  • there is always room for improvement in describing keys on the Wiki, especially keys that are more precise variants of already described ones.

The results are valid as of 20th April 2026, 12:00 AM, when the OSM database was dumped.
