Evgeny Arbatov's Diary

Recent diary entries

I often need to create a single GPX file from multiple related but different GPX files from Garmin and Wikiloc. My use case is finding route recommendations for running based on GPX files. I try to get a single GPX file so I do not have to juggle multiple files during the run. Original GPX files are also noisy, so the added advantage is having a precise path to navigate instead of dealing with GPX noise.

I start by putting all downloaded GPX files into a single directory and plotting them on the map. This step allows me to see them in relation to each other and spot any outliers early. I then simplify each GPX file to reduce the number of points and match them to OSM ways. I filter the points to make them evenly spaced. Finally, I use the OSRM trip service to create a single combined GPX file, or fall back to the minimal number of split files if they cannot be merged into a single file. I plot the resulting GPX files and simplify them to reduce the point count. Now I can sync them to Garmin and use them as a kind of basemap on a device that does not support basemaps.
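The point-filtering step can be illustrated with a minimal sketch. It is not the code from the repository: the function names and the 50 m default are assumptions made for illustration, and the filter only guarantees a minimum gap between kept points, which gives roughly (not exactly) even spacing.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def thin_points(points, min_spacing_m=50.0):
    """Keep a point only if it is at least min_spacing_m from the last kept point.

    The 50 m default is an assumed value, not one taken from the original pipeline.
    """
    if not points:
        return []
    kept = [points[0]]
    for p in points[1:]:
        if haversine_m(kept[-1], p) >= min_spacing_m:
            kept.append(p)
    # Always keep the final point so the route still ends where the track ends.
    if kept[-1] != points[-1]:
        kept.append(points[-1])
    return kept
```

Running this before the OSRM trip request keeps the number of coordinates small, which matters because every kept point becomes a stop the trip service has to visit.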

Of course, this only works when I know the area in advance, can find enough existing route recommendations, and there is a reasonably complete OSM map. The current pipeline is error-prone, and I find myself tweaking it for each specific set of GPX files. You can try it yourself and view the complete code on GitHub: https://github.com/evgeniyarbatov/gpx-courses

Location: Hong Ha Ward, Hà Nội, 11025, Vietnam

House Hunting with OSM

Posted by Evgeny Arbatov on 28 April 2026 in English.

I was looking for a house and decided to use OSM to help me find the optimal location. I had several fixed conditions: I knew my office location and I was looking for a condo close to a primary school. I used OSM to extract all schools and residential buildings. As there were many residential buildings, I used DBSCAN to find clusters and pick one representative building per cluster. To reduce the number of routes to compute, I used the OSRM table service to filter schools and houses within 5 km of each other. Then I ran the OSRM route service on triplets of (home, school, office). I saved the route polylines and locations into a CSV file, then created a KML file based on it and imported it into Google My Maps for visualizing the results. The resulting map is useful for getting a first impression of possible places to live. I am getting quite a few schools and houses to choose from, but I still find it valuable to have a view of the entire city that does not depend on what I happened to find first.
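The 5 km pre-filter might look roughly like the sketch below. It assumes a self-hosted OSRM instance and the `driving` profile, neither of which is stated above, and the helper names `table_url` and `pairs_within` are hypothetical. The request follows the OSRM table service format: coordinates as `lon,lat`, with `sources`, `destinations`, and `annotations=distance` so that the response carries a `distances` matrix in metres.

```python
def table_url(base_url, homes, schools, profile="driving"):
    """Build an OSRM table request with homes as sources and schools as destinations.

    homes and schools are lists of (lat, lon); OSRM expects lon,lat order.
    The profile name is an assumption and depends on how the server was built.
    """
    coords = ";".join(f"{lon},{lat}" for lat, lon in homes + schools)
    sources = ";".join(str(i) for i in range(len(homes)))
    destinations = ";".join(str(len(homes) + j) for j in range(len(schools)))
    return (f"{base_url}/table/v1/{profile}/{coords}"
            f"?sources={sources}&destinations={destinations}&annotations=distance")


def pairs_within(distances, max_m=5000.0):
    """Return (home_index, school_index) pairs whose network distance is <= max_m.

    distances is the `distances` matrix from the table response, one row per
    source; None marks a pair for which OSRM found no route.
    """
    return [(i, j)
            for i, row in enumerate(distances)
            for j, d in enumerate(row)
            if d is not None and d <= max_m]
```

Only the surviving pairs then need a full route request for the (home, school, office) triplet, which is where most of the computation goes.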

Location: Tay Ho Ward, Hà Nội, 11214, Vietnam

I was interested in finding walkable areas in a city I had never visited before. After using the OpenClaw bot to summarize my JSON files, I thought I could do the same for OSM-based metrics.

I started by generating an OSM extract with a 5 km radius from my hotel. I then extracted the geometry and tags for every way, park, building, and tourist attraction in this area. I normalized the raw data into a handful of generic classes like “Food & Café” and “Nature / Quiet.”

I then assigned each way and point of interest to an H3 hexagon. I calculated aggregate metrics for each hexagon, like the length of roads and the area taken up by parks and water. Then I simply fed the metrics for the hexagon to Ollama with Mistral Nemo, asking it to generate a short one-sentence vibe of a place based on the collected metrics and label it as positive, negative, or mixed.
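The per-hexagon aggregation can be sketched as follows. This assumes each feature has already been assigned an H3 cell id by a separate step that is not shown, and the field names (`cell`, `kind`, `value`) and metric names are hypothetical. Putting the unit into each metric name is one possible way to help the model interpret the numbers.

```python
from collections import defaultdict


def aggregate_by_cell(features):
    """Sum metric values per cell id.

    Each feature is a dict with 'cell' (an H3 cell id computed elsewhere),
    'kind' (a metric name that includes its unit) and 'value' (a number).
    """
    totals = defaultdict(lambda: defaultdict(float))
    for feature in features:
        totals[feature["cell"]][feature["kind"]] += feature["value"]
    return {cell: dict(kinds) for cell, kinds in totals.items()}


def describe(cell, metrics):
    """Render one cell's metrics as a single line that can be placed in a prompt."""
    parts = [f"{name}: {value:.0f}" for name, value in sorted(metrics.items())]
    return f"Hexagon {cell}: " + ", ".join(parts)
```

For example, two road segments of 120 m and 80 m in the same cell end up as one `road_length_m: 200` entry in that cell's line.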

To visualize the results, I created a KML file and imported it into Google My Maps. I had to iterate on the LLM prompts, as there are a lot of fields in the generated JSON and the LLM struggles to interpret what the numbers mean.

I also discovered a number of bugs in how I calculate features per H3 hexagon, but I eventually arrived at a reasonable overlay showing how walkable each area is. It’s not perfect — partly because OSM data is incomplete for the area I picked, and partly because I need to make my prompts more specific.

The recommendations are generic, but they add an extra dimension to the map. I think this is really exciting because you can create any perspective you like on OSM data with your own overlays for driving, finding a house, or finding a place to eat.

Location: Hoằng Tiến Commune, Thanh Hóa Province, Vietnam

I have a large set of photographs I took while running. They are geotagged, as I took them with my phone camera. The compass direction is completely unreliable, but lat/lon is more trustworthy. I thought it would be an interesting experiment to extract greenery like grass and trees from these photographs. It can be a useful addition for creating routes that are more pleasant to walk, since the eye-level point of view is not available in OSM. As this is based on my personal photographs, it has the additional benefit of recommending routes that I tend to use. The first challenge I encountered was that out of a few thousand photographs, only a handful were taken during the daytime. After deduplicating and dropping all photos that contain no greenery, this becomes a relatively small set of waypoints. I decided not to extrapolate additional points along OSM ways to keep the dataset small and avoid adding misleading info. The greenery detection works well enough with the SegFormer model, although it is somewhat slow locally. My plan is to select waypoints from this dataset before calling OSRM. This way I get routes that are more enjoyable to walk and run, but are generally longer than the default shortest route. You can find my dataset on Kaggle.

Location: Ba Dinh Ward, Hà Nội, 11120, Vietnam

For a while, I was interested in understanding what makes one pedestrian OSM way better than another. I wanted to know if there is some generic way to identify good walking routes from OSM data. I looked at Garmin and Strava heatmaps at first. Then I checked Strava segments and their proximity to points of interest such as rivers, ponds, and parks. Then I decided to look at my running pace along OSM ways to separate good and not-so-good walking routes. My idea was simple: a good walking route means a smooth running pace. There are fewer stops, less waiting at intersections, etc. Of course, my pace depends on many factors, such as how far I have to run to get to a certain place. So it cannot be a simple cutoff, but rather the distribution of paces along a given segment. This turned out to be a reasonably good approximation of how good or bad I perceive each route to be. I created this Kaggle dataset as an illustration. This relies on my personal GPX data, so it does not scale, but it captures the kind of local knowledge that I find hard to share in any other way.
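One way to turn "the distribution of paces along a segment" into a number is to summarize each way by its median pace and interquartile range, then treat a narrow range as a sign of a smooth route. The sketch below is only an illustration of that idea: the ranking rule, the `min_runs` threshold, and the function names are assumptions, not the method behind the dataset.

```python
import statistics


def pace_summary(paces_s_per_km):
    """Median and interquartile range of the paces recorded on one way."""
    q1, median, q3 = statistics.quantiles(paces_s_per_km, n=4)
    return {"median": median, "iqr": q3 - q1}


def rank_segments(paces_by_way, min_runs=5):
    """Order way ids from the most consistent pace to the least consistent.

    Ways with fewer than min_runs observations are skipped because their
    spread is not meaningful. Both the threshold and the use of the IQR as
    the score are assumptions made for this sketch.
    """
    summaries = {way: pace_summary(paces)
                 for way, paces in paces_by_way.items()
                 if len(paces) >= min_runs}
    return sorted(summaries, key=lambda way: summaries[way]["iqr"])
```

Using the spread rather than the median sidesteps the problem that absolute pace depends on how tired the runner is by the time they reach a given way.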

Location: Tay Ho Ward, Hà Nội, 11214, Vietnam

Width of OSM Ways from GPX Data

Posted by Evgeny Arbatov on 8 January 2026 in English.

I find that the width of OSM ways is a useful property for determining how good a pedestrian route is. However, it is often missing from OSM. As an experiment, I decided to use my running activities from Strava to estimate the width of a single OSM way that I use often. The specific way ID I used is in a relatively open area, meaning GNSS error is minimized. I have also collected over 100 traces of me running that single way ID over ~1.5 years. Given all this, how accurate can the estimate of the width be? I got a median width of about 11 meters. The actual width as measured with Google Maps satellite imagery is 13 meters. It’s close. I am happy with the result. I don’t have nearly as many traces for any other segment on the OSM map, so it’s a limited experiment, but the potential is promising. See the code on GitHub.
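The linked code is not reproduced here, so the following is only one possible approach, under stated assumptions: the way is treated as a single straight segment between two hypothetical endpoints, trace points are projected into local metres, and the width is taken as the spread between the 5th and 95th percentiles of their sideways offsets. The percentile choice is arbitrary, and the result still includes whatever GNSS error remains in the traces.

```python
import math
import statistics

EARTH_RADIUS_M = 6_371_000.0


def to_local_xy(point, origin):
    """Project (lat, lon) to metres east/north of origin (equirectangular)."""
    lat, lon = point
    lat0, lon0 = origin
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def cross_track_offsets(points, start, end):
    """Signed sideways distance in metres of each point from the line start -> end."""
    bx, by = to_local_xy(end, start)
    length = math.hypot(bx, by)
    offsets = []
    for p in points:
        px, py = to_local_xy(p, start)
        # The 2D cross product divided by the segment length is the
        # perpendicular distance; the sign tells which side the point is on.
        offsets.append((bx * py - by * px) / length)
    return offsets


def estimate_width(points, start, end, lower_pct=5, upper_pct=95):
    """Spread of sideways offsets between two percentiles (assumed cutoffs)."""
    offsets = cross_track_offsets(points, start, end)
    cuts = statistics.quantiles(offsets, n=100, method="inclusive")
    return cuts[upper_pct - 1] - cuts[lower_pct - 1]
```

Trimming the tails keeps a few stray fixes from inflating the estimate, at the cost of underestimating the width when runners rarely use the edges of the way.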

Location: Vinh Tuy Ward, Hà Nội, 11622, Vietnam

Local Knowledge in Maps

Posted by Evgeny Arbatov on 30 December 2025 in English.

I was visiting Sa Pa, Vietnam, and navigating with Organic Maps. I was looking for a street that would bring me back to the city center. I could not see any on OSM or Google Maps. I walked for a while and was able to see a street that led in the right direction. It turns out that it connected to another street that brought me where I wanted to go. This made me realize how much of the useful information in maps depends on people walking, running, or commuting through those streets. You cannot see these kinds of streets from satellite images. You can only know them from being there, and the people who know them may not use GPS tracking to record them. I think this leaves only runners and anyone who likes walking to discover most of the streets that are not currently on the map.

Location: Sa Pa, Lào Cai Province, 31786, Vietnam