
Is editing config files very scary for you? If not, you can help with iD tagging schema

To avoid misunderstandings: I am not here to tell anyone how to manage their projects, i am just trying to offer an outside perspective that might be helpful and try to better understand the inside perspective of your project.

On a few points you mention:

  • On the preempted suggestions of technical solutions to an evidently social problem: full agreement. This does not mean that interactive tools for editing various formats for structured information are not potentially helpful.
  • On the need for creating and maintaining content in a machine readable form - in my experience this is never a problem in the OSM community even for the technically most clueless contributors because OSM is inherently all about normal people without specialized technical skills creating content in machine readable form. This is in our DNA. But machine readable does not mean structured and in a format that is chosen primarily with the convenience of the data user in mind. It should be obvious that the information should be created and maintained in a structure and form optimized for the needs of those creating it in its substance, and that it then gets technically converted into whatever structure and form is needed by the users of that information. That is how OpenStreetMap works on its main database as well.
  • On the icons - yes, approaching people with the demonstrated skills directly is a good idea. But many people with artistic skills are largely not very visible in the OSM community and artistic work in OSM is practically universally pro bono while technical work is widely paid, including iD work. And someone on a paid job asking a volunteer to help them out with their skills is - well - universally awkward. ;-)
Is editing config files very scary for you? If not, you can help with iD tagging schema

One, which you seems to care about, is COLLECTING IDEAS about extra search terms, icons, better translations, extra tags etc. Those are certainly important!

Emphatically no!

I get the appeal of delineating the inner world of a project that is essentially self sufficient intellectually and where everything can be dealt with based on the competency set of the peer group and everything else is the outside world out of scope of the project itself and is only considered at the discretion of the insiders.

But this is not a sustainable approach for this kind of work.

Frankly - if you don’t realize from looking at my monastery example that there is a substantial editorial deficit in the project that is not going to be solved by more idea collection outside of it, i am at a loss here.

The problem you have is that the vast majority of people who have the background to contribute on a non-technical level, who are knowledgeable in tagging practice, global geography etc. are not going to be motivated to contribute their skills by petitioning technical gatekeepers. And to be fair: You are not alone with this problem, there are plenty of other technical projects that suffer from the same issue.

And the mirror side to that problem is that many non-technical community members experience a lack of self-efficacy in the community and quite a few of them act this out by trying to push their views on the wiki or becoming aggressive/derailing in communication. Which is counterproductive, of course, since it reinforces the tendency of the tech projects to self-isolate.

And it is a bottleneck, because more issues get created each month than the pull requests that resolve them. Which means: good ideas are not being implemented, which is a loss.

That is exactly what did not make sense to me from the start. If - as you say - the only competency required to fill the bottleneck is technical in nature then the solution is to create the technical means to allow those having the good ideas to actually implement them without the need for a human technician to hand chisel them into JSON.

But if the work on which you have capacity issues is actually editorial work rather than technical work of dumbly chiseling the JSON and formally testing it then the attempt to solve this by adding purely technical work power is going to be completely counterproductive, because - if you are successful - you are mainly going to increase the accumulated editorial deficits.

[…] to the fact that we have at least 3 competing machine-readable “standards”: id-tagging-schema / JOSM presets / Vespucci presets, so improvements in one is wasted effort on others).

I have had this discussion plenty of times. Diversity in tools and independent approaches to address a need is an asset for the community, not a waste of resources. In practically all fields of work, technical or other, the OSM community does not have a manpower problem, it has a deficit in creating suitable and appealing conditions for people to contribute, learn and grow. We don’t need fewer tagging preset projects, we need more - not more of the same but more diverse approaches in the way this matter is practically worked on.

In non-technical work different people working on the same thing from different perspectives is almost universally considered a good thing. Five different books being available on the same subject is something most people consider a good thing. Five different painters painting the same thing most consider a cultural enrichment. Considering such things wasteful usually stems from looking at matters as a zero sum game where one party’s gain is inherently another party’s loss. Don’t make the mistake of buying into that glass half empty world view.

If you want a concrete recommendation how to stand out from the other tagging schema projects it is as follows:

  • open up the project and invite people with non-technical skills as peers and equals to the technical people into the project and its decision making.
  • make the believable promise that (a) editorial control over the content of the tagging schema is going to be handed over to people with a non-technical background and (b) that technical work in the project, in particular when paid by community money, is going to aim primarily to facilitate the editorial work rather than focus on the interests of the maintainers of software using the schema.

But i am realistic and do not expect these ideas to change things here in substance - the economic interests in favour of the status quo are strong and i don’t have the capacity to fight them. The main reason why i am commenting is to attempt to widen the perspective of people reading here - and my own.

Launching the project golfTiles.org

Thanks for correcting.

Note there is nothing wrong with starting a map specifically for golfers. OSM-Carto displays golf courses and their micro-mapping since these are widespread world wide and tagging is well established for the most part. But we don’t optimize the depiction specifically for the needs of golfers, our target group is the general public and we depict things primarily in a way we deem useful for them.

Nonetheless i think it is advisable to look at what current maps already do in terms of depicting golf infrastructure (and OSM-Carto and its French variant are the forerunners here AFAIK) and to pick the tools you use based on the cartographic goals you have.

Is editing config files very scary for you? If not, you can help with iD tagging schema

that is done at OSM Wiki, tagging forums, tagging mailing list

Does not seem like that to me. Example:

https://github.com/openstreetmap/id-tagging-schema/blob/main/data/presets/amenity/monastery.json

osm.wiki/Tag:amenity%3Dmonastery

osm.wiki/Proposal:Monastery

https://lists.openstreetmap.org/pipermail/tagging/2022-June/thread.html#64968

https://community.openstreetmap.org/search?q=monastery

Very little in that JSON file can be traced to either the Wiki or various discussion channels. There are, however, plenty of specific secondary tags mentioned on the wiki that do not turn up there. Very few of the terms listed in the JSON file can be found anywhere on the wiki or past discussions while plenty of dedicated terms for specific types of monasteries exist (and some are mentioned on the wiki) that are not listed. The symbol referenced was - from what i know - never discussed on community channels.

And this example is not an outlier, many of the files i casually looked at in the repository seem to have a rather dubious provenance of the content they contain.

Needless to say that wiki and tagging discussion channels are not generally a very reliable source of information regarding the tagging consensus among mappers.

Bottom line for me: I feel very much confirmed in my impression that the main bottleneck of the iD tagging schema is non-technical editorial work.

Is editing config files very scary for you? If not, you can help with iD tagging schema

I think it is in large part true already, some changes can be easily made with little technical skills required. This post is attempt to make people aware of it.

Huh? I thought our discussion so far clearly established that technical skills are a fundamental prerequisite for active contribution at the moment. That everyone who is not able or willing to edit JSON or to test their changes within your framework is practically limited to petitioning others to make changes for them.

in the end tagging schema is machine-readable configuration for editing software, so software testing is impossible to be entirely avoided as far as I can see

That makes it much clearer - thanks.

So the primary goal of the project is to technically provide and maintain the tagging schema in a form optimized for ingestion by the software that uses it and decidedly not to cooperatively develop and maintain the schema on the semantic, non-technical level.

The problem i see with that approach is that it is not sustainable because there is no separate project for developing and maintaining the schema on the semantic level - you essentially have this as a footnote to a project with purely technical goals, while the level where most of the value adding human work needs to go for sustainability is the non-technical one.

But this is, of course, just how i see it and you indicated you perceive the bottleneck to be somewhere else. I am still interested in learning where you exactly see the bottleneck.

Do you have a specific idea how tagging schema would be maintained in way that would reduce this problem?

I am hesitant to make specific suggestions. What i would and could recommend would be under the goals i see - which, as established above, are different from the actual ones. That is to rethink both the structure and the low level format in which the information is maintained and edited.

Under the premise that a text based format suitable for managing in a VCS is needed a natural choice of low level format optimized for intuitive editing by humans as drop-in replacement for JSON would be Markdown. But i lack the full understanding of the overall scope of structural information in the project to give a concrete recommendation on the structural level. It is likely that for a contributor centered approach substantially changing the structure is advisable.
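To illustrate only the low level format aspect (not the structural one): a minimal Python sketch of how a preset maintained as Markdown-like text could be converted mechanically into JSON for the data user. The text layout, the section names, the example terms and the function name `parse_preset` are all assumptions made up for this illustration; none of it is the actual id-tagging-schema format.

```python
import json

# Hypothetical human-edited source text (layout invented for illustration).
SOURCE = """\
# Monastery

## tags
- amenity = monastery

## terms
- abbey
- convent
- priory
"""


def parse_preset(text):
    """Convert the hypothetical Markdown-like text into a plain dict."""
    preset = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            # second level heading opens a section
            section = line[3:].strip()
            preset[section] = {} if section == "tags" else []
        elif line.startswith("# "):
            # first level heading is the human readable name
            preset["name"] = line[2:].strip()
        elif line.startswith("- ") and section is not None:
            item = line[2:].strip()
            if section == "tags":
                # "key = value" entries become a key/value mapping
                key, _, value = item.partition("=")
                preset["tags"][key.strip()] = value.strip()
            else:
                # everything else is kept as a simple list of strings
                preset[section].append(item)
    return preset


if __name__ == "__main__":
    # machine readable output for whatever software consumes the schema
    print(json.dumps(parse_preset(SOURCE), indent=2))
```

The point of the sketch is merely that the conversion step is cheap to automate once the source format is chosen with the human editor in mind; a real design would need validation and a structure that reflects the full scope of the schema.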

Is editing config files very scary for you? If not, you can help with iD tagging schema

Because if someone is not capable of doing this, they are unlikely to be able to test changes either.

What you mean by testing here is testing within the programmer centered testing framework currently in use. But having that as the only available way of testing is - again - a deliberate choice setting certain priorities.

In theory it would be possible to write some tool allowing editing it in a different way but it would take a lot of effort and I am dubious is it lowering threshold much.

I don’t think much is necessary from the editing side in terms of writing new tools. There are plenty of well established tools on that front - for humans writing and editing structured text. Where more work would be required is processing that information from the data user side because you would not have cheap human workers pre-chewing it into a convenient JSON structure. But that i already mentioned.

If someone is able to contribute by reporting issues - it is also welcome, but it is not a bottleneck at all right now.

Yes, as i said - i understand the prioritization that has led to how things are. I am just questioning that this is a wise prioritization.

But i am also curious where you think the bottleneck is. You seem to imply that you think it is technically skilled contributors in some form. But i am not sure if you think this is a scarcity in volunteering (i.e. working for free) or a scarcity in people having the necessary technical skills in general.

I find this interesting in particular because from my perspective the task of developing and maintaining a tagging schema for use in OSM editors would be a project that barely requires any technical skills at all, where the main qualifications you’d need are familiarity with mapping practice in OSM and with the world wide geography OpenStreetMap tries to document. Plus - since the concept of tagging schema here includes verbal and symbolic representation of the tags - verbal and symbol design skills in that context.

Is editing config files very scary for you? If not, you can help with iD tagging schema

But what if it is? (referring to the question in your diary title)

Or let me phrase it in a question back:

If you want people to contribute to the project and value their contributions why do you require them to provide their contributions in the form of hand editing JSON files?

I kind of already know the answer to that question: Because that happens to be a convenient form for the software developers to write a reader for. But is that a wise prioritization?

Personally, i am not scared by complex file formats. But every time i see a project maintaining information that is to be curated by humans in a format like JSON i primarily sense a statement of their priorities which implicitly states that - even if they would like my contribution - they don’t value it. Because if they did they would have made a different choice.

I know i am kind of barking up the wrong tree here since this was not your choice. But anyway…

Launching the project golfTiles.org

We more experienced mappers realize fast that OSM Carto is just meant to show off the capabilities of the database.

Huh? I have heard plenty of claims of what secret goals OSM Carto is allegedly pursuing. But that is a new one.

I don’t remember any suggestion to improve rendering of golf courses in OSM Carto having been rejected over the past few years. The last i remember was https://github.com/openstreetmap-carto/openstreetmap-carto/issues/3734 in 2019.

Contributions to improve OSM Carto’s depiction of golf courses are always welcome.

work on iD and iD tagging schema - report

My interpretation of that stated goal would consider not only current mappers. But also potential and future ones. Enabling people to contribute, where it was in some way blocked before is a good thing.

Ok, but my question still would be: Is that your personal stance or do you have the impression that this is a guiding principle for the project in total? Because from my perspective this makes a fairly huge difference, especially if you go beyond just enabling people to contribute in principle towards enabling people to accurately document their local geographic knowledge.

And affirmative action was meant simply to describe any proactive measure taken to counter a systemic bias or discrimination, it is not limited to specific criteria of discrimination. If that is good or not can be debated. I was just interested if iD development in general considers this part of their approach to making decisions about tagging presets or not.

work on iD and iD tagging schema - report

Right now I am struggling to find cases where these two metrics would conflict.

Well - if you go via numbers, especially under the stated premise of

Our primary aim is to serve the needs of iD mappers

then the utilitarian target would be to create presets that can be most widely used by the demographics de facto using iD at the moment in the geographic and cultural context where they map.

If, OTOH, the goal is to serve the OpenStreetMap project as a whole, to create a world wide collection of local geographic knowledge, the relevant demographics would be the potential world wide mappers, not the (currently very non-representative) subset of those currently mapping with iD.

Concrete example would be tower:type=minaret vs. tower:type=bell_tower. These are both in iD presets, but by most of the criteria documented by iD the arguments in support for tower:type=bell_tower are much stronger than for tower:type=minaret.

The point in your PR of how often it could be used touches that matter - but it leaves it unclear to what extent this is meant to endorse affirmative action and to what extent that is actually practically guiding decisions.

To be clear: My comment was not meant as a critique of iD’s decision making practice. Doing that in a meaningful way would require a more in depth look than what can be done in a diary discussion. I was mainly interested in learning about your impression on the topic.

work on iD and iD tagging schema - report

Thanks for the report.

I am wondering if you could share your impressions of the de facto decision making principles in iD regarding support for certain tagging ideas and regarding how tags are framed to the user.

The documentation lists a number of criteria, but there are clearly cases where these are not the primary basis of decisions. And there are also criteria and goals that are notably missing from the documentation - like the support for geographic and cultural diversity as opposed to just summarily maximizing data quality in a utilitarian sense.

Given you have now made such decisions in quite a few cases and interacted with decisions of others in many cases as well you are in a unique position to look at how iD works in that regard and i would be very much interested in how you perceive that.

OpenStreetMap Carto version v6.0.0 released

If you just want to vent hate, then please take it somewhere else.

OpenStreetMap Carto version v6.0.0 released

@Mxdanger - I am going to make sure you get your money back.

OSM-Carto is a volunteer project. Work only gets done when someone invests their time into it based on their priorities. Telling all the OSM-Carto contributors collectively that their priorities are messed up is - to put it mildly - self defeating.

wetland=tidalflat controversy

Might be of interest: I wrote more extensively on the current state of coastal mapping in OSM and how it came to be the way it is in 2023.

Purchase Historical Eswatini Topographic Maps

Are these going to be (and did you get for Namibia) full map sheet scans?

This might not seem overly relevant for OSM use but for the value of such data as a record of cartographic history the information on the map sheet rims (like data sources used, specification of map sheet edition etc.) is of immense value.

Authoritative Data is Not More Right Just Because It’s Authoritative

You have not even mentioned the more fundamental issue with comparison of different classification systems - that they always re-cast one classification into another and this way create an inherent bias. And when comparing OSM data it is always the OSM classification that gets re-cast into the other one - with the inevitable losses in semantic accuracy.

W.r.t. ground truth sampling as the gold standard for quality assessment - that is only the case if you actually use true unbiased random sampling. And i have yet to find a single such study that does this. Most studies do not even document the method of selecting sampling locations and hence qualify as pseudo-science (for global analysis for example: picking a uniformly distributed set of random sampling locations is non-trivial - not to mention that ground checking a truly random set of sampling locations is very expensive). By manipulating the sampling, even in a subtle way, you can essentially freely modify the results of such a study.
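To illustrate why uniform global sampling is non-trivial: a minimal Python sketch, assuming a spherical earth and ignoring any restriction to land areas. The function names are made up for this illustration. Drawing latitude uniformly in degrees over-samples the polar regions; drawing the sine of the latitude uniformly gives an equal-area distribution.

```python
import math
import random


def uniform_locations(n, seed=None):
    """Random (lon, lat) in degrees, uniformly distributed by area on a sphere."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        lon = rng.uniform(-180.0, 180.0)
        # sin(lat) uniform in [-1, 1] makes equal areas equally likely
        lat = math.degrees(math.asin(rng.uniform(-1.0, 1.0)))
        points.append((lon, lat))
    return points


def naive_locations(n, seed=None):
    """Biased variant: latitude drawn uniformly in degrees."""
    rng = random.Random(seed)
    return [(rng.uniform(-180.0, 180.0), rng.uniform(-90.0, 90.0))
            for _ in range(n)]


def share_poleward_of(points, lat_limit):
    """Fraction of points with an absolute latitude above lat_limit."""
    return sum(1 for _, lat in points if abs(lat) > lat_limit) / len(points)


if __name__ == "__main__":
    n = 20000
    # theoretical area share poleward of 60 degrees: 1 - sin(60 deg), about 0.134
    print("equal-area:", share_poleward_of(uniform_locations(n, seed=1), 60.0))
    # the naive variant puts about 1/3 of the samples there
    print("naive:     ", share_poleward_of(naive_locations(n, seed=1), 60.0))
```

In the naive variant the areas poleward of 60 degrees receive about two and a half times the samples their area warrants, which is the kind of subtle sampling bias that can shift the results of a study.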

The Challenge of Dynamic Watercourses and Static Admin Lines 🌊🏛

There are two persistent urban myths w.r.t. boundaries:

  • Boundaries are universally defined in some abstract coordinate space and stay there irrespective of changes in the physical world.
  • Boundary data sets published by public authorities invariably represent objective truth about boundaries.

Both of these are wrong.

Many (if not most) boundary data sets published by authorities are generalized approximations.

Most international boundaries world wide that follow a physical geography feature (either a river or a watershed divide) are legally (by agreement between the countries involved) tied to that element (Using formulations like: From A to B follows the course/centerline/left side/right side of the river X). Often a more explicit specification of the boundary is done (and demarcated) by a boundary commission, sometimes subject to regular revisions.

What counts in OSM is of course in any case only the line of de facto administration - which is usually well verifiable for international (admin_level 2) boundaries but increasingly non-verifiable for the higher admin levels.

In my experience most mappers treat higher level administrative boundaries as abstract artefacts and ignore them when editing other data and do not connect them to other geometries even if they are functionally tied to them.

Drawing Circles on Digital Maps

While it is nice to see practical instructions involving basic mathematics, this is, unfortunately, so wrong on the mathematics that i don’t want to let this stand uncommented.

Drawing an accurate circle on a map is hard - others have struggled with that before. Turf.js does it right but the code shown here does not and does not provide meter-accuracy in the general case.

Here is Python code that verifies the accuracy of the circle by calculating the distances of its outline points to the center - once for the inaccurate version in projected coordinates shown here and once for the more accurate version based on what turf.js does.

#!/usr/bin/env python3

from math import asin, atan2, cos, pi, radians, sin, sqrt

# center of the circle in decimal degrees
lon = 12
lat = 50

# radius of the sphere in meters (WGS84 equatorial radius)
earthRadius = 6378137

radiusInMeters = 100000

# number of points on the circle outline
steps = 64

# https://stackoverflow.com/questions/4913349/haversine-formula-in-python-bearing-and-distance-between-two-gps-points

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance in meters between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = earthRadius
    return c * r

print('== incorrect ==')

# offsets in meters converted to degrees independently for each axis
for i in range(steps):

    angle = (((i * 360.0) / steps) * pi) / 180

    dx = radiusInMeters * cos(angle)
    dy = radiusInMeters * sin(angle)

    newLon = lon + (dx / (earthRadius * cos((lat * pi) / 180))) * (180 / pi)
    newLat = lat + (dy / earthRadius) * (180 / pi)

    print(haversine(lon, lat, newLon, newLat))


print('== correct ==')

# destination point on the sphere for a given bearing and angular distance
for i in range(steps):

    angle = (((i * 360.0) / steps) * pi) / 180

    # angular distance and center coordinates in radians
    drad = radiusInMeters/earthRadius
    lat_rad = lat * pi / 180
    lon_rad = lon * pi / 180

    newLat = asin(sin(lat_rad) * cos(drad) + cos(lat_rad) * sin(drad) * cos(angle))
    newLon = lon_rad + atan2(sin(angle) * sin(drad) * cos(lat_rad), cos(drad) - sin(lat_rad) * sin(newLat))

    # convert back to decimal degrees
    newLat = newLat * (180 / pi)
    newLon = newLon * (180 / pi)

    print(haversine(lon, lat, newLon, newLat))

OpenStreetMap Carto version v5.9.0 released

Yes, that is one example where this change is important. Note, however, that vehicle=destination and access=destination + foot=yes are semantically very similar. The important point is that mappers should be free to choose the tagging variant that is either more convenient or more accurately describes the situation on the ground; OSM-Carto should not try to nudge them in one direction or the other. With this change we got a lot closer to accomplishing that.

Known rendering issues of the Carto layer

You should note that all of the tags you mention (except for place=farm) have been discussed on the OSM-Carto issue tracker in some form in the past. You can find the issues where this happened/happens by searching for the tags there.