[OSM-talk] OSM Wikidata SPARQL service updated

Sarah Hoffmann lonvia at denofr.de
Wed Aug 23 20:39:43 UTC 2017


On Mon, Aug 21, 2017 at 05:38:06PM -0400, Yuri Astrakhan wrote:
> Sarah, thanks, I created an issue at
> https://github.com/osmcode/pyosmium/issues/47
> 
> Does this mean I cannot even use the existing node cache file when
> processing ways from the minute diff files from pyosmium?

You can do the updates manually as well. Open the node cache file like this:

    mapfile = osmium.index.create_map("dense_file_array," + filename)

and pass it to your handler. In the node() callback, make sure you
update the locations in the cache file like this:

    def node(self, n):
        if n.deleted:
            # a default-constructed Location is invalid, marking the node as gone
            mapfile.set(n.id, osmium.osm.Location())
        else:
            mapfile.set(n.id, n.location)

and then change the way callback to simply use the coordinates of a
point in the middle instead of computing a representative point
from the full way geometry:

    def way(self, w):
        # w.nodes holds node references; .ref is the node id
        ndid = w.nodes[len(w.nodes) // 2].ref
        point = wkbfab.create_point(mapfile.get(ndid))

That's off the top of my head and untested, so you should add
a couple of sanity checks for empty ways and points.
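
The sanity checks could look roughly like this. It is only a minimal
sketch and deliberately does not use pyosmium, so it runs on its own:
a plain dict stands in for the node cache file, locations are
(lon, lat) tuples, and the names apply_node_change and middle_location
are made up for illustration. In a real handler the same checks would
wrap the mapfile.set() and mapfile.get() calls.

```python
def apply_node_change(cache, node_id, location, deleted=False):
    """Record one node change from a diff in the cache.

    `cache` is a plain dict used here as a stand-in for the node cache file.
    """
    if deleted:
        # forget the location of a deleted node (no error if it was unknown)
        cache.pop(node_id, None)
    else:
        cache[node_id] = location


def middle_location(cache, node_ids):
    """Return the cached location of the middle node of a way, or None."""
    if not node_ids:
        return None  # empty way, nothing to look up
    ndid = node_ids[len(node_ids) // 2]
    # None if the node is missing from the cache (e.g. it was deleted)
    return cache.get(ndid)


cache = {}
apply_node_change(cache, 1, (13.40, 52.52))
apply_node_change(cache, 2, (13.41, 52.53))
apply_node_change(cache, 3, (13.42, 52.54))
apply_node_change(cache, 2, None, deleted=True)

print(middle_location(cache, [1, 3]))     # middle node is 3
print(middle_location(cache, [1, 2, 3]))  # middle node 2 was deleted
print(middle_location(cache, []))         # empty way
```

A caller would skip writing a point whenever middle_location() returns
None instead of handing an invalid location to the WKB factory.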

Sarah

> 
> On Mon, Aug 21, 2017 at 4:44 PM, Sarah Hoffmann <lonvia at denofr.de> wrote:
> 
> > On Sun, Aug 20, 2017 at 11:08:03PM -0400, Yuri Astrakhan wrote:
> > > Sarah, how would I pass the node cache file to repserv.apply_diffs()?
> > > The idx param is passed to apply_file() for the initial PBF dump
> > > parsing, but I don't see any place to pass it for the subsequent diff
> > > processing. I assume there must be a way to run apply_diffs() that will
> > > download the minute diff file, update the node cache file with the
> > > changed nodes, and afterwards call my way handler with the updated way
> > > geometries.
> >
> > I don't think that is possible yet. For my own projects I have always
> > used an explicit instance of the node cache file and read and written
> > that manually (using the osmium.index.LocationTable() class). But that
> > is not particularly practical. I'll look into adding an idx parameter
> > to the replication mechanism when I find a minute. Feel free to open
> > a feature request on github to remind me.
> >
> > Kind regards
> >
> > Sarah
> >
> > >
> > > > Also, I assume you meant dense_file_array, not dense_file_cache. So in my
> > > > case I would use one of these idx values when calculating way centroid,
> > > > and None otherwise:
> > > sparse_mem_array
> > > dense_mmap_array
> > > sparse_file_array,my_cache_file
> > > dense_file_array,my_cache_file
> > >
> > > Thanks!
> > >
> > >
> > > > On Mon, Aug 14, 2017 at 4:31 PM, Sarah Hoffmann <lonvia at denofr.de> wrote:
> > >
> > > > On Mon, Aug 14, 2017 at 11:10:39AM -0400, Yuri Astrakhan wrote:
> > > > > > mmd, the centroids are calculated with this code. Let me know if there
> > > > > > is a better way; I wasn't aware of any issues with the minute data
> > > > > > updates.
> > > > >       wkb = wkbfab.create_linestring(obj)
> > > > >       point = loads(wkb, hex=True).representative_point()
> > > > > https://github.com/nyurik/osm2rdf/blob/master/osm2rdf.py#L250
> > > >
> > > > It doesn't look like you have any location cache included when
> > > > processing updates, so that's unlikely to work.
> > > >
> > > > Minutely updates don't have the full node location information.
> > > > If a way gets updated, you only get the new list of node ids.
> > > > If the nodes have not changed themselves, they are not available
> > > > with the update.
> > > >
> > > > If you need location information, you need to keep a persistent
> > > > node cache in a file (idx=dense_file_cache,file.nodecache)
> > > > and use that in your updates as well. It needs to be updated
> > > > with the fresh node locations from the minutely change files
> > > > and it is used to fill the coordinates for the ways.
> > > >
> > > > > Once you have the node cache, you can get the geometries for
> > > > > updated ways. This is still only half the truth. If a node in
> > > > a way is moved around, then this will naturally change the
> > > > geometry of the way, but the minutely change file will have
> > > > no indication that the way changed. Normally, these changes are
> > > > relatively small and for some applications it is good enough
> > > > to ignore them (Nominatim, the search engine, does so, for example).
> > > > If you need to catch that case, then you also need to keep a
> > > > persistent reverse index of which node is part of which way
> > > > and for each changed node, update the ways it belongs to.
> > > > There is currently no support for this in libosmium/pyosmium.
> > > > So you would need to implement this yourself somehow.
> > > >
> > > > Kind regards
> > > >
> > > > Sarah
> > > >
> > > > >
> > > > > > Your query is correct, and you are right that (in theory) there
> > > > > > shouldn't be any ways without the center point. But there have been
> > > > > > a number of ways with only 1 point, causing a parsing error "need at
> > > > > > least two points for linestring". I will need to add some special
> > > > > > handling for that (suggestions?).
> > > > >
> > > > > You can see the error by adding this line:
> > > > >    OPTIONAL { ?osmId osmm:loc:error ?err . }
> > > > > > The whole query: http://tinyurl.com/ydf4qd62 (you can create short
> > > > > > urls with a button on the left side)
> > > > >
> > > > > On Mon, Aug 14, 2017 at 5:18 AM, mmd <mmd.osm at gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > On 13.08.2017 at 19:49, Yuri Astrakhan wrote:
> > > > > >
> > > > > > > > * all ways now store "osmm:loc" with centroid coordinates, making
> > > > > > > > it possible to crudely filter ways by location
> > > > > >
> > > > > > > out of curiosity, can you say a few words about your overall approach
> > > > > > > to calculating centroids for ways? As we all know, it's an endless
> > > > > > > pain to get that information out of minutely diffs :)
> > > > > > >
> > > > > > > I have to say that I'm pretty much unfamiliar with SPARQL and just
> > > > > > > tried the following query. My expectation was that I wouldn't get any
> > > > > > > results, which makes me wonder if my query has some issue?
> > > > > >
> > > > > > SELECT * WHERE {
> > > > > >   ?osmId osmm:type 'w' .
> > > > > >   FILTER NOT EXISTS { ?osmId osmm:loc ?osmLoc }.
> > > > > > } LIMIT 100
> > > > > >
> > > > > >
> > > > > > BTW: A quick search on Github yielded the following:
> > > > > > > https://github.com/nyurik/osm2rdf. Would that be the right place to
> > > > > > > look for more details?
> > > > > >
> > > > > > Best,
> > > > > > mmd
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > talk mailing list
> > > > > > talk at openstreetmap.org
> > > > > > https://lists.openstreetmap.org/listinfo/talk
> > > > > >
> >
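
On the reverse index mentioned in the quoted message from Aug 14 (which
node is part of which way): the idea can be sketched as below. This is
an illustration only. It keeps everything in memory, whereas a real
setup would need a persistent store, and it does not use any
libosmium/pyosmium API. The class name ReverseIndex and its methods are
made up for this example.

```python
from collections import defaultdict


class ReverseIndex:
    """In-memory stand-in for a persistent node -> ways index."""

    def __init__(self):
        self.way_nodes = {}                # way id -> list of node ids
        self.node_ways = defaultdict(set)  # node id -> set of way ids

    def remove_way(self, way_id):
        """Drop a way and all its node memberships."""
        for nid in self.way_nodes.pop(way_id, []):
            ways = self.node_ways.get(nid)
            if ways is not None:
                ways.discard(way_id)
                if not ways:
                    del self.node_ways[nid]

    def update_way(self, way_id, node_ids):
        """Store the new node list of a created or modified way."""
        self.remove_way(way_id)  # forget the previous node list, if any
        self.way_nodes[way_id] = list(node_ids)
        for nid in node_ids:
            self.node_ways[nid].add(way_id)

    def ways_touched_by(self, changed_node_ids):
        """Return ids of all ways containing at least one changed node."""
        touched = set()
        for nid in changed_node_ids:
            touched |= self.node_ways.get(nid, set())
        return touched


index = ReverseIndex()
index.update_way(100, [1, 2, 3, 1])  # closed way, node 1 appears twice
index.update_way(200, [3, 4])
print(sorted(index.ways_touched_by([3])))  # both ways contain node 3

index.update_way(200, [4, 5])              # way 200 no longer uses node 3
print(sorted(index.ways_touched_by([3])))
```

For each minutely diff one would first apply the way changes to the
index, then ask ways_touched_by() for the ids of all moved nodes and
recompute the geometry of those ways from the node cache.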


