MapBox and MapQuest ... the bit the tech pundits are missing

It seems to me vector tiles technology is only an example of a somewhat more general problem. There are quite a few people and smaller companies working on and practically using vector tiles technology - Andy Allen for example - but when you develop sophisticated stuff primarily for your own use, releasing it as open source is a mixed bag: you spend a lot of time bringing things into a form that works out of the box for others and documenting it so people are actually able to use it, but if you are not primarily a software developer there is little gain in doing so. I’d guess the situation is somewhat similar even for a fairly large company like Mapbox - their primary business is not developing software, despite employing several core developers of open source software projects.

If you read closely, this can also more or less be found in https://www.mapbox.com/about/open/ - both for open/closed software and data, by the way. This is somewhat at odds with marketing claims like ‘majority of their data is open’ or ‘dedication to being open’ IMO, but from a business perspective it is a consistent view. It is unlikely that you will see a company like Mapbox prefer open over closed (software or data) unless it seems favorable for them from a business perspective.

The real question is where these mechanisms leave us in terms of innovation. I have no doubt open source now has such a solid position that whenever something innovative, widely useful and popular is developed it will sooner or later also become available as open source. But if business structures prevent actual innovation from happening - for example because a few dominating companies in a field keep innovative ideas from gaining a foothold because they threaten their business model - this becomes a real problem. Most people consider this a problem w.r.t. Google, but it is perfectly possible that it also applies to Mapbox in some areas.

Globalizing the name translation debate

These are very useful insights into Far East language relations and particularities. You do not, however, address the problem of verifiability of handcrafted transliterations. This is rarely an issue with two neighboring countries with a long history of cultural relations like China and Vietnam, but the main question is how to handle this in other cases. How do you decide if and how to tag name:vi for European/African/American places?

Top OSM Rank: The Big Imports

Nice summary of the big imports.

You have, however, only looked at the formal cleanup of the nodes. There is also substantial cleanup of the tagging required - in TIGER this is obvious, but with Canvec it is a real problem too. To give an example: Canvec imports tag all waterways as stream. There are nearly 3 million waterway=stream ways with a source=NRCan* tag right now but only 774 (!) waterway=river with such a source tag, 63 of which have been touched since the beginning of the year, 285 last year.

Making a very generous estimate - assuming the same number of rivers has been retagged with the source tag removed (which is more difficult to determine, but it is unlikely there are that many - source tags usually remain untouched in later edits) and assuming only ten percent of the waterways would actually qualify for waterway=river (in reality it is more) - it would still take ~500 years to fully evaluate the waterway tagging in Canvec data, even if we stopped importing further streams right away.
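The ~500 year figure follows from simple arithmetic; a sketch of the estimate (the 10% share and the doubling of the retag rate are the generous assumptions from above):

```python
total_streams = 2_900_000      # waterway=stream ways with source=NRCan*
share_river = 0.10             # generous: assume only 10% actually qualify as rivers
to_retag = total_streams * share_river

retagged_per_year = 285        # rivers retagged last year, source tag kept
retagged_per_year *= 2         # generous: as many again lost their source tag

print(round(to_retag / retagged_per_year))  # → 509 years
```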

By the way, CanvecImports only accounts for ~15% of the ways with a source=NRCan* tag; extrapolating from that to the nodes, there would be more than 250 million Canvec nodes at the moment.

Humanitarian Map - Correction Required

The whole situation including the different claims is fairly well mapped in that area, see:

osm.org/relation/4609727

osm.org/relation/4609726

osm.org/relation/4611980

osm.org/relation/4610574

osm.org/relation/4611981

Charging stations (carto v2.30.0) *rolling tumbleweed*

See:

https://github.com/gravitystorm/openstreetmap-carto/issues/989

Note however that the practical usefulness is fairly hypothetical right now. If you have an electric vehicle and go to one of the places tagged amenity=charging_station in OSM, the chances that you can actually charge it there are probably slim - just have a look at the number of sockets shown on the wiki page - not to mention the contractual difficulties of being allowed to use it. Electric charging is still light years away from fossil fuel ‘charging’, where you can usually be sure that you can (a) technically fuel your vehicle and (b) pay with at least either cash or credit card at any fuel station.

map styles: Default OSM vs Google Maps

The biggest difference in road rendering between Google and the OSM standard style to me seems to be that at z13/14 Google uses thin plain lines for all but the largest roads, while the OSM style draws them with casing. Apart from that, the road rendering in Google is generally more subtle. This is not necessarily a good thing - I am in general a proponent of rendering things clearly visible or not at all. But the dark gray at low zooms and the bright white at high zooms for the lower importance roads in the OSM style are of course quite extreme.

Also, Google in general draws roads thinner than the OSM standard style (except at the highest zoom levels, where OSM draws them thinner than reality). This helps them avoid things like

http://tools.geofabrik.de/mc/#14/4.6211/-74.1845&num=2&mt0=mapnik&mt1=google-map

although a better approach might be making the drawing width depend on the local map scale rather than the zoom level. This might be something you could try - AFAIK this would also be a first for any web map.
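A scale-dependent width could be as simple as dividing the nominal width by the local web mercator scale factor - a hypothetical helper for illustration, not code from any existing renderer:

```python
import math

def mercator_scale(lat_deg):
    # web mercator stretches lengths by 1/cos(latitude)
    return 1.0 / math.cos(math.radians(lat_deg))

def line_width(base_width_px, lat_deg):
    # keep the drawn width proportional to real-world road width
    # instead of constant per zoom level
    return base_width_px / mercator_scale(lat_deg)

# a road drawn 4 px wide at the equator shrinks to 2 px at 60° latitude,
# where the map is stretched by a factor of 2
print(round(line_width(4.0, 60.0), 2))  # → 2.0
```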

How good (bad) is water represented in OpenStreetMap?

Nice to see waterbody mapping and the use of OSM waterbody data attracting attention outside OSM itself.

Looking at your presentation it seems, however, that your methodology has quite a few issues, both in detail and at large - some of them will likely come back to bite you when you try to use this approach in other areas.

The most basic one probably is that the techniques you compare do very different things. In OpenStreetMap we map bodies of water, either standing or flowing. When analyzing Landsat images you map surface water and when analyzing elevation data you map potential drainage lines. It should be obvious that these three methods - even if all of them work perfectly - will produce diverging results.

Your approach to Landsat/SRTM reminds me of something I did several years ago. You are lucky you do not have snow and glaciers or frozen lakes in Australia…

Germanwings Flight 9525 response

Rock glaciers are mapped with natural=glacier + glacier:type=rock - see osm.wiki/Proposed_features/Glaciers_tags

They are difficult to map though since they often look similar to scree and moraines in images.

Germanwings Flight 9525 response

Your effort to provide useful information is meritorious, but may I suggest you verify somewhat more carefully what you map. Aerial images are often difficult to interpret, especially if you don’t know the area from first-hand experience, and this is further aggravated in mountain areas by irritating shadows and the distortions resulting from orthorectification.

For example

osm.org/way/334530242

osm.org/way/334561139

are clearly wrong (the glacier has mostly vanished except for a small rock glacier and the cliff would have a stream flowing uphill across it).

SRTM 1arc v3

If by dammed you mean that smooth looking slope, yes, of course

I mean there is a barrier in the valley’s profile - the river seems to flow uphill. This is a very common side effect of simple interpolation void fill.

I would like to know which data you are using.

The shown image is based on the 1 arc second and 3 arc seconds SRTM as well as Jonathan’s data and OSM waterbody data.

Flattening of lakes is to make sure the contours do not intersect the water areas.
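For illustration, lake flattening can be sketched roughly like this - an assumed minimal approach, not my actual processing code: all DEM cells inside a lake mask are set to a single elevation, so no contour line can cross the water surface.

```python
import numpy as np

def flatten_lake(dem, lake_mask):
    # use the lowest elevation inside the lake as the water level;
    # a real DEM with several lakes needs one mask (and level) per lake
    dem = dem.astype(float).copy()
    dem[lake_mask] = dem[lake_mask].min()
    return dem

dem = np.array([[10.0, 9.0, 8.0],
                [ 9.0, 7.0, 8.0],
                [10.0, 9.0, 9.0]])
mask = np.array([[False, False, False],
                 [False, True,  True ],
                 [False, False, False]])
print(flatten_lake(dem, mask)[1])  # → [9. 7. 7.]
```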

SRTM 1arc v3

There are systematic differences between SRTM and Jonathan’s 1 arc second data, so simply replacing void pixels with that data is of course going to fail.
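At the very least the systematic offset would need to be corrected before filling - a minimal sketch of that idea (an assumed approach; a constant offset is itself a simplification, since the real differences are not uniform):

```python
import numpy as np

def fill_voids(srtm, other, void_value=-32768):
    # estimate the systematic offset on pixels valid in both datasets
    # and shift the fill data by it before filling the voids
    srtm = srtm.astype(float)
    voids = srtm == void_value
    offset = np.median(srtm[~voids] - other[~voids])
    filled = srtm.copy()
    filled[voids] = other[voids] + offset
    return filled

srtm  = np.array([[100.0, 102.0, -32768.0],
                  [101.0, -32768.0, 103.0]])
other = np.array([[ 95.0,  97.0,  99.0],
                  [ 96.0,  97.0,  98.0]])
print(fill_voids(srtm, other))
```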

Your approach essentially has two major problems:

  • The USGS 1 arc second files contain bad quality void fill in some tiles that would need to be detected and removed for good quality results.
  • The void-filled 3 arc second data from USGS/NGA varies strongly in the quality of the void fill; in Europe it is mostly just interpolated (slightly better than your gdal_fillnodata attempt but essentially similar). You can see this for example in your final image, where the valley at the bottom appears dammed near Mollières.

Here, for comparison, is a rendering of my current basic processing in that area with OSM lakes flattened. Note this still includes all the artefacts of the USGS data in the non-void areas, which are more difficult to address.

imagico.de relief rendering Maritime Alps

Sentinel-2 satellites imagery can be used for OSM

There are a few aspects that should probably be kept in mind before getting too excited about this:

  • As it stands, the cited license would not allow use of data covered by it for OSM - it requires all derivative works to include the note ‘contains Copernicus data’, which cannot be ensured in OSM.
  • So far the release under an open license is only an intention - it is not clear when this will happen (the satellites are not even launched yet), what processing level the data will have, and whether the data will be freely distributable or you will have to sign an agreement before you get access to it. ESA’s track record in this regard is - to put it nicely - not that great.
  • Experience with automated land cover classification as a data source for OSM, as suggested in the cited blog post, has been very bad in the past. The very concept of automated classification into a fixed set of classes is at odds with the basic principles of OSM.

Nonetheless, should this imagery become freely available, it will be a welcome addition to the existing sources for mapping.

The trouble with the ODbL - summarized

Just a few quick notes:

  • Of the points listed above, at least 2, 4, 5 and 6 are not specific to the ODbL (meaning they would likewise apply to many other licenses). Before you say PD would not have these issues, remember that true PD does not exist in most jurisdictions.
  • 1 and 3 are issues of EU database law and not specific to the ODbL either. Short of giving away all data without limitations, there is no solution for this. It seems to me the authors of the paper you cite are not really familiar with the legal basis of data and database rights in Europe.
  • You should probably point the National Park Service to NASA, which apparently has no such issues.
  • The case of Yale University seems to be based on the misconception that a license can force you to give away other data and to violate other obligations. The only thing the ODbL can do is forbid you to use OSM data under certain circumstances; it has no power to change your other legal or contractual obligations.
  • Generally, citing people who do not use OSM data because of certain fears, without analyzing whether those fears are well-founded, seems inappropriate and misleading. The key words in all your case examples are ‘could’, ‘might’, ‘concerned’, ‘worried’ etc., and the text makes no attempt to analyze the validity of these concerns.
  • Frankly, I am somewhat appalled that you do not acknowledge the LWG’s efforts with the community guidelines to shed light on unclear and difficult to understand aspects of the ODbL. If you really want to help data users use OSM data and resolve their concerns, you should support these efforts by helping to communicate them to potential data users.

Africa

In Mauritania many settlements are not mapped in any detail - where high resolution imagery is available, a lot can be done there, e.g.

http://mc.bbbike.org/mc/?lon=-13.91534&lat=17.05309&zoom=15&num=2&mt0=bing-satellite&mt1=mapnik

http://mc.bbbike.org/mc/?lon=-9.61456&lat=16.66121&zoom=14&num=2&mt0=bing-satellite&mt1=mapnik

http://mc.bbbike.org/mc/?lon=-12.70895&lat=22.67912&zoom=15&num=2&mt0=bing-satellite&mt1=mapnik

In the south most of the waterbodies are also missing across the board, e.g. here:

http://mc.bbbike.org/mc/?lon=-12.57525&lat=16.03655&zoom=10&num=2&mt0=bing-satellite&mt1=mapnik

though the coverage with recent imagery there is quite patchy and the Landsat images in Bing are basically too old for meaningful mapping.

What you should definitely not do is pave the desert areas with natural=desert polygons.

When mapping from afar, solid knowledge of the area is always essential - only with it can you interpret aerial and satellite imagery properly. Required reading:

osm.wiki/DE:Armchair_mapping

intermittent streams in arid environment

A more fine-grained characterization of intermittent waterways makes only limited sense, since a mapper can rarely be expected to know how frequently a stream carries water. You could add ephemeral=yes in addition to intermittent=yes if you have this information. You could also combine intermittent=yes with seasonal=no to indicate there is no seasonal pattern.

Generally, the characterization ‘ephemeral’ is often extremely vague - it can mean anything from ‘frequent but short presence of water, for example after rain’ to ‘no water for decades and possibly permanently dry due to climate change’.

dlr data help

DLR is primarily funded by the taxpayer <sigh>, but as usual in publicly financed research in Germany these days, the management has a strong incentive to get additional money or non-monetary support from the private sector - which in the case of DLR is often done by selling exclusive rights to the data to private companies.

There is no effective general obligation for public institutions in Germany to make their data publicly available.

dlr data help

The DLR is kind of the Antichrist of Open Data so it is unlikely you can get anything substantial from them.

OSM Node Density 2014

Very nice. Maybe you could publish the color scales for the different zoom levels, i.e. which color represents how many nodes per web mercator square kilometer.

Converting the data to real densities should be relatively easy by multiplying with the area scaling function of the projection. This would lighten up the polar regions quite a bit - Greenland, for example, is in fact mapped with similar node density in the north and south.
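The conversion is a one-liner per pixel (a sketch assuming the published values are nodes per web mercator square kilometer): a mercator ‘square kilometer’ at latitude φ covers only cos²(φ) real square kilometers, so the real density is the mercator density divided by cos²(φ).

```python
import math

def real_density(nodes_per_merc_km2, lat_deg):
    # web mercator inflates areas by 1/cos^2(latitude);
    # dividing by cos^2 converts to nodes per real km^2
    c = math.cos(math.radians(lat_deg))
    return nodes_per_merc_km2 / (c * c)

# at 60° latitude the correction factor is 4,
# in northern Greenland (~80°N) it is already ~33
print(round(real_density(10.0, 60.0)))  # → 40
```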

Redundancy in OSM data

@4rch - I do not see how the argument ‘the computation is complicated’ changes anything about the situation, namely that redundant recording of information inevitably leads to contradictions in the data.

That there are also maritime boundaries that are not defined by a distance from a baseline but by bilateral agreements or otherwise defined unilateral claims does not really change anything either - such explicitly defined boundaries should of course always be mapped explicitly, and there is no risk of redundancy or contradiction here - except of course where different claims conflict, which is a different topic.

Declaring redundant mapping to be without alternative (which brogo explicitly did not do) makes things too easy for oneself. One should always be aware that such a decision also brings major disadvantages - in the chosen example, ones that are very clearly visible in the map.

In short - redundant recording of information is never strictly necessary, and as explained there are quite good reasons in principle to avoid it. That one may nonetheless conclude that some things should be mapped redundantly despite the disadvantages is something I do not want to dispute.

Redundancy in OSM data

Avoiding redundancy is not just about efficiency. The biggest problem is rather that redundancy usually inevitably leads to contradictions, because the different versions of the same information would have to be permanently kept in sync by someone, which does not happen.

The best and, when looking at the map, most obvious example to me is always the maritime boundaries (generally the 12 nautical mile zone), which are mapped explicitly although they really represent a computed line (computable from the baseline, which is mostly not mapped continuously). In practice this means the mapped boundary often contradicts the rest of the map (by definition it would always have to be at least 12 nautical miles off the coastline).

The reason to map this boundary anyway is clear: it is a practical and simple way to close a country’s boundary, and at the same time it is the line one wants to show in the map. The much cleaner way, however, would be for all applications that need this boundary to compute it from the other information in the database. Demanding that such information has to be in the main OSM database for the convenience of data users (which effectively prevents automatic updating) seems problematic to me.
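To illustrate the contradiction: in projected coordinates, checking a mapped territorial boundary against the baseline is straightforward - a hypothetical toy check, not an existing QA tool; real data would first need projecting to meters:

```python
import math

NM = 1852.0  # meters per nautical mile

def dist_point_segment(p, a, b):
    # distance from point p to segment a-b, all in projected meters
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def violations(boundary_pts, baseline_segs, limit=12 * NM):
    # mapped boundary points closer to the baseline than 12 nm
    # contradict the rest of the map by definition
    return [p for p in boundary_pts
            if min(dist_point_segment(p, a, b) for a, b in baseline_segs) < limit]

baseline = [((0.0, 0.0), (50_000.0, 0.0))]
boundary = [(10_000.0, 12 * NM), (20_000.0, 20_000.0)]  # second point too close
print(violations(boundary, baseline))  # → [(20000.0, 20000.0)]
```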