Analysis of amount of elements with keys described on the Wiki
Posted by Kamil Kalata on 1 May 2026 in English.While browsing Taginfo I got curious how many elements have at least one key described on the Wiki and how big share of all keys the described ones make up. Therefore, I decided to check it out.
The analysis consisted of the following parts:
- fetching OSM database dump from planet.osm.org;
- fetching key statistics from Taginfo with the API;
- extracting “is in Wiki” info into separate file;
- altering “is in Wiki” info for keys which were described on the Wiki after the database was dumped. The alteration was based on the recent changes registry;
- processing the dump with DuckDB:
- extracting element type, its ID, and its tags to new table:
CREATE TABLE elements AS SELECT kind, id, tags FROM ST_READOSM('planet-latest.osm.pbf');; - exploding keys to separate records:
CREATE TABLE elements_keys AS SELECT kind, id, UNNEST(map_keys(tags)) FROM elements;;
- extracting element type, its ID, and its tags to new table:
- querying the database.
These are queries I provided to DuckDB:
| Result | Query |
|---|---|
| number of all elements | SELECT COUNT(*) FROM elements; |
| number of tagged elements | SELECT COUNT(*) FROM elements WHERE tags IS NOT NULL; |
| number of elements with key(s) described on the Wiki | SELECT COUNT(*) FROM (SELECT DISTINCT kind, id FROM elements_keys WHERE "key" IN (SELECT "key" FROM 'keys_wiki.csv' WHERE in_wiki)); |
| number of all keys | SELECT COUNT(*) FROM (SELECT DISTINCT "key" FROM elements_keys); |
| number of keys described on the Wiki | SELECT COUNT(*) from (SELECT DISTINCT "key" FROM elements_keys WHERE "key" IN (SELECT "key" FROM 'keys_wiki.csv' WHERE in_wiki)); |
I got the following results:
- all elements: 11,759,061,283
- of which are tagged: 1,459,875,801 (12.415% of all elements)
- of which have a key described on the Wiki: 1,459,659,709 (99.985% of all tagged elements)
- of which are tagged: 1,459,875,801 (12.415% of all elements)
- all keys: 109,269
- of which are described on the Wiki: 5,865 (5.367% of all keys)
The analysis provided the following conclusions:
- small amount of Wiki-described keys represent nearly all tagged elements, which follows Pareto principle;
- “any tags you like” rule does pose no significant threat to tagging consistency, since there is at least one known way to get info about virtually every element;
- there is always a room for improvement for keys in terms of being described on the Wiki, especially for those that are a more precise version of the described ones.
The results are valid as of 20th April 2026, 12:00 AM, when the OSM database was dumped.
Discussion