[OSM-dev] Cartagen - client-side vector based map renderer, dynamic maps

Jeffrey Warren warren at mit.edu
Fri May 8 21:09:17 BST 2009


Great, this is a good discussion. I've put up a wiki page with some of the
things we've covered, with pros/cons. I hope we can continue to talk about
our approaches and, as we optimize for different problems, post some of it
back up here:
http://code.google.com/p/cartagen/wiki/FeatureTradeoff

I put in what I could gather about Temap, but feel free to update and add
more pros and cons... this is just my thought process so far. We might also
add a "status" column so we can annotate what we learn from each approach.

Best,
Jeff

On Fri, May 8, 2009 at 3:00 PM, Tels <nospam-abuse at bloodgate.com> wrote:

> Moin,
>
> On Friday 08 May 2009 20:04:48 you wrote:
> > > * The proxy receives XML from the api or xapi server. Currently it
> > > requests the full dataset.
> > > * Then it removes unnec. tags (like note, fixme, attribution and a
> > > whole bunch of others that are not needed for rendering). Some of
> > > them are very minor, but 10000 nodes with "attribution=veryvery
> > > long string here" can make up like 40% of all the data, and just
> > > clog the line and browser :)
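(The tag-stripping step described above might look roughly like this in JavaScript; the blacklist and the `pruneTags` name are illustrative, not the actual proxy code:)

```javascript
// Tags that are irrelevant for rendering and safe to strip before
// sending data to the browser (illustrative list; the real proxy's
// blacklist is much longer).
const JUNK_TAGS = ["note", "fixme", "attribution", "created_by", "source"];

// Remove unneeded tags from a single OSM element in place.
function pruneTags(element) {
  for (const key of JUNK_TAGS) {
    delete element.tags[key];
  }
  return element;
}
```

On elements carrying long `attribution` strings, this alone can cut the payload substantially, which is the 40% effect mentioned above.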
> >
> > Yes, I'm thinking of trying to cache locally but still request
> > changesets if the ?live=true tag is set... caching locally is great
> > for more static data but for the live viewer, I'm trying to not use
> > caching, but increase efficiency in the requests.
>
> I fear loading data live from the API server is just not feasible,
> unless you:
>
> * only load diffs (minute-diffs?) and update the data already cached at
> the proxy with them. OTOH, I read that importing a one-hour diff into a
> postgres database can take 40..70 minutes, i.e. depending on load you
> might not even manage to update your DB with the diffs fast enough...
> * invent an API server that is about 1000 times faster :)
> * never zoom out past level 18, since anything below will request so
> much data that you can't get it live :)
>
> Currently I consider "live-view" not an achievable goal; I am happy if I
> can render data that is about 1 day or so old.
>
> > > * The data is then pruned into (currently 3) levels and stored in a
> > > cache:
> > >  * level 0 - full
> > >  * level 1 - no POI, no paths, streams, tracks etc., used for zoom 11
> > >  * level 2 - no tertiary roads etc., used for zoom 10 and below
> > > * The client is served the level it currently requested as JSON.gz.
> >
> > Great, this is what I'm working on too. I'm thinking a ruleset about
> > what features are relevant for what zoom levels could be something to
> > work together on? I was also thinking of correlating tags with a
> > certain zoom level. But maybe each tag should be associated with a
> > range of zoom levels, like "way: { zoom_outer: 3, zoom_inner: 1 }".
> > Thoughts?
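(A minimal sketch of that zoom-range idea, following the `zoom_outer`/`zoom_inner` naming from the example above; the rule values and the assumption that `zoom_outer` is the outermost, lowest-numbered level at which a feature appears are mine, not a spec:)

```javascript
// Hypothetical ruleset: each feature type is visible between an
// outermost and an innermost zoom level.
const zoomRules = {
  motorway: { zoom_outer: 5,  zoom_inner: 18 },
  tertiary: { zoom_outer: 11, zoom_inner: 18 },
  footway:  { zoom_outer: 14, zoom_inner: 18 },
};

// Decide whether a feature type should be rendered at the current zoom.
function visibleAt(featureType, zoom) {
  const rule = zoomRules[featureType];
  if (!rule) return false; // unknown types are simply not drawn
  return zoom >= rule.zoom_outer && zoom <= rule.zoom_inner;
}
```

A shared ruleset like this could live in one file that both the proxy (for pruning) and the client (for rendering) consume.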
>
> My rules do have a minimum zoom level, smaller than that and they are
> not rendered. The levels are inspired by the osmarenderer and mapnik
> outputs, but I moved a few of them down so you can render really high
> resolution maps.
>
> However, the pruning at the proxy is something else and not connected to
> that. For instance, somebody might not want to see tertiary roads on
> level 13, but others do. So I make sure that I only prune out data
> that can never be seen on that level, i.e. a conservative
> pruning.
>
> Also, about 90% of the data-pruning is about removing unwanted data
> (like "note=blah" :) and not about the smaller zoom levels, because
> currently it is simply not feasible to render below 10, and even for
> zoom 10 you need a really, really beefy machine and a long wait time....
>
> > > * There are three servers in the list (api.openstreetmap,
> > > xapi.informationfreeway and tagwatch) and a lot of them do not
> > > complete the request (internal error, not implemented etc. etc.).
> > > It can take a lot of retries to finally get the data.
> > > * Even when you get the data, it takes seconds (10..40 seconds
> > > is "normal") to minutes - upwards to 360 seconds just to serve one
> > > request.
> > >
> > > So currently all received data is stored in the cache for 7 days to
> > > avoid the very very long loading times.
> > >
> > > Ideas of fetching the full dataset and pre-computing the cache
> > > simply don't work because I don't have a big enough machine and no
> > > big enough online account to store the resulting JSON :(
> > >
> > >
> > >
> > > Also, somehow processing 150 Gbyte of XML into JSON will prove to
> > > be a challenge :)
> >
> > So I'm having the same problems with the APIs. The standard 0.6 API
> > has been pretty good, but of course it serves XML, not JSON. The xapi
> > is not very responsive for me, it seems.
>
> Neither for me, but the API server is very slow, too. It seems it can't
> manage to send me more than 17Kbyte/s (but maybe it is bandwidth
> limited?).
>
> > I thought parsing XML in JS
> > would be molasses,
>
> When I tried it, it used ungodly amounts of memory (because the data
> structure is not useful for rendering and it contains so much cruft),
> and I also never managed to extract the actual node data for rendering
> from it...
>
> > so if you're interested, we should put up our own
> > XAPI or custom api off the planet.osm file, and send JSON?
>
> Yeah, that was my plan for the near future :) For now I am happy with my
> proxy, as there is quite enough to do on the client side whether the
> data is current/real-time or 1 day old :)
>
> > I have a quad-core Intel Mac Pro with 1.5 TB and a bunch of RAM we
> > can dedicate to this effort, with plenty of bandwidth. And perhaps
> > when Stefan's work is published, we could run it as well, since it
> > seems to be a great solution to requesting fewer nodes for large
> > ways... but for now do you think you could use an XAPI? i think all
> > my requests fit into that api.
>
> Currently I am just requesting the full data and then prune it myself
> because I am not actually sure it would help if we do either:
>
> * request partial data (all streets, all landuse), simply because at a
> high enough zoom level you need the full data anyway
> * request ways with less nodes, because that is only good for low zooms
> and I am currently sort of ignoring them :) It is basically a
> side-problem.
>
> > Alternatively, Stefan points out that the dbslayer patch for the
> > Cherokee server allows direct JSON requests to a database. So some
> > very thin db wrapper might serve us for now? This isn't my area of
> > expertise, so if you have better ideas on how to generate JSON direct
> > from the db, like GeoServer or something, and still have tag-based
> > requests, i'm all ears.
>
> Well, I am not sure that this would be faster or better. If the db-json
> would serve the full API data, we would also get all the "junk" data
> like "note" and so on, and this will overwhelm the browser. So it might
> need a filter, too.
>
> Also, my renderer expects the format currently spewed by my proxy. If we
> use Stefan's format, it wouldn't work (multipolygons are one reason) and
> it would be a lot of work to switch the code.
>
> OTOH, I would not complain if somebody invents a server that spits out
> JSON in the right format and in real-time :)
>
> > > Yes, but reducing the polygons is also a lot of work :) I haven't
> > > started on this yet, because on zoom 12 or higher you need to
> > > render almost anything, anyways. Plus, you would then need to cache
> > > the partial data somehow (computing it is expensive in JS..)
> >
> > Seems like Stefan's work may address this, no? Or if we did cache it,
> > seems like we'd calculate it on the server side.
>
> I was kinda hoping that I'd build a client-side application, not
> something that runs on the server :) If the server has to reduce the
> polygons, it might never be able to process the whole planet.
>
> But I see the point. :)
>
> (I was f.i. pondering whether the JSON from the server should already
> contain BBOX data for each way. Decided against it, as it uses bandwidth
> and server CPU, and is very fast to compute on the client, anyway. But
> definitely a few things can be precomputed at the server and stored in
> the cache. One example is the multipolygon relationships. The
> representation in XML isn't actually very usable, so I just rewrite it
> so that the client can access it super-fast).
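(Computing a way's bounding box on the client is indeed cheap, a single pass over the node list. A minimal sketch, assuming a hypothetical `{ lat, lon }` node format:)

```javascript
// Compute the bounding box of a way from its node coordinates.
// "nodes" is an array of { lat, lon } objects (an assumed client format).
function bboxOf(nodes) {
  const box = { minLat: Infinity, minLon: Infinity,
                maxLat: -Infinity, maxLon: -Infinity };
  for (const n of nodes) {
    if (n.lat < box.minLat) box.minLat = n.lat;
    if (n.lat > box.maxLat) box.maxLat = n.lat;
    if (n.lon < box.minLon) box.minLon = n.lon;
    if (n.lon > box.maxLon) box.maxLon = n.lon;
  }
  return box;
}
```

The box can then be used for fast viewport culling before any real drawing work happens.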
>
> > > > d) oh, and localStorage. I've partially implemented that but
> > > > haven't had much testing... other work... ugh. So caching on a
> > > > few levels, basically.
> > >
> > > I fail to see what localStorage actually gains, as the delivered
> > > JSON is put into the browser cache anyway, and the rest is cached
> > > in memory. Could you maybe explain what your idea was?
> >
> > Yes, localStorage persists across sessions so you could build up a
> > permanent local cache and have more control (in JS) over requesting
> > it and timestamping when you cached it, not to mention applying only
> > changesets and not complete cache flushes. This has some advantages
> > over the browser cache, although that does of course persist across
> > sessions too.
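(A timestamped localStorage cache of the kind described above could be sketched like this; the key scheme and function names are made up for illustration. The storage object is passed in so the same code works against `window.localStorage` or any stand-in with `getItem`/`setItem`:)

```javascript
// Store a tile's JSON alongside a timestamp so stale entries can be
// detected and selectively re-requested later.
function cachePut(storage, key, data) {
  storage.setItem(key, JSON.stringify({ t: Date.now(), data: data }));
}

// Return cached data if it is younger than maxAgeMs, else null
// (null meaning: the caller should re-fetch from the server).
function cacheGet(storage, key, maxAgeMs) {
  const raw = storage.getItem(key);
  if (raw === null) return null;
  const entry = JSON.parse(raw);
  if (Date.now() - entry.t > maxAgeMs) return null;
  return entry.data;
}
```

This is what gives the JS-side control Jeff mentions: expiry policy and per-tile invalidation live in the application, not in opaque browser-cache heuristics.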
>
> But it won't help if you move to a different machine. Also, it goes
> against the "live" goal; we would need to query the server for new data
> anyway. Currently, if you reload a Temap session, most of the time is
> spent in the re-render, and almost none in loading the data over the
> net.
>
> I guess if I write a 100x faster renderer, that might change, but I'd
> like to work on one problem at a time :)
>
> So for now I'd like to keep localStorage out, as it creates more
> problems than it solves :)
>
> > > * There is a talk I proposed for State of the Map and I don't want
> > > to spoil everything before :)
> >
> > yes, me too! so if you want to discuss off-list that's fine.
>
> Heh, you have a talk scheduled, too? :) That sounds like fun :)
>
> > > Of course, semi-dynamic rules like "color them according to feature
> > > X by formula Y" are still useful and fun, and avoid the problems
> > > above. (Like: "use maxspeed as the color index ranging from red over
> > > green to yellow" :).
> >
> > Yes, this is an exciting area to me, for example the color by
> > authorship stylesheet i posted before:
> >
> > http://map.cartagen.org/find?id=paris&gss=http://unterbahn.com/cartagen/authors.gss
> >
> > or this one i threw together yesterday, based on the tags of measured
> > width instead of on a width rule:
> >
> > http://map.cartagen.org?gss=http://unterbahn.com/cartagen/width.gss
> >
> > A more fully-rendered screenshot is here:
> >
> > http://www.flickr.com/photos/jeffreywarren/3510685883/
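(One plausible realization of the "color by maxspeed" rule mentioned above; I've assumed a red-to-yellow-to-green ramp and a 130 km/h ceiling, neither of which is from the actual stylesheets:)

```javascript
// Map a maxspeed value (km/h) onto a red -> yellow -> green gradient.
function maxspeedColor(kmh, maxKmh = 130) {
  const t = Math.max(0, Math.min(1, kmh / maxKmh)); // clamp to [0, 1]
  // t = 0   -> rgb(255,0,0)   (red, slow)
  // t = 0.5 -> rgb(255,255,0) (yellow)
  // t = 1   -> rgb(0,255,0)   (green, fast)
  const r = t < 0.5 ? 255 : Math.round(255 * (1 - (t - 0.5) * 2));
  const g = t < 0.5 ? Math.round(255 * t * 2) : 255;
  return "rgb(" + r + "," + g + ",0)";
}
```

A GSS rule would then just call something like this with `parseInt(way.tags.maxspeed)` as the input.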
>
> Yeah, that is what I have in mind, too. But so many things to do, so
> little time :)
>
> > Anyways, thanks for sharing; one thought I had was that besides
> > sharing ideas and solutions online, we should try *different*
> > approaches, so that we try all the possibilities. I think multiple
> > projects working on the same problem can sometimes be redundant, but
> > more often it's beneficial for all parties since there's a diversity
> > of approaches to a problem. Let's take advantage of that by
> > specifically attempting different solutions to the problems we face,
> > and discussing the results... if you're willing. If one of us tries a
> > technique and it doesn't work, we can all learn from the attempt.
>
> Sure, I am working on my ideas anyway :) A few things you might find
> interesting:
>
> * no dashed lines on canvas; you need to roll your own
> * rendering 60000 lines/areas takes a long time (>1 minute), which means
> you need a sort of "slippy tiles" setup like I have currently. That
> allows the user to pan the map in real-time while the renderer can
> render tiles off-screen.
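(Rolling your own dashed lines, as the first point requires, amounts to splitting each segment into dash-length pieces and stroking every other one. A sketch, kept as pure geometry so it works without a canvas context; the function name is mine:)

```javascript
// Split the segment (x1,y1)-(x2,y2) into dash pieces of roughly dashLen.
// Returns [x1, y1, x2, y2] arrays for the pieces to stroke; the skipped
// pieces in between are the gaps.
function dashSegments(x1, y1, x2, y2, dashLen) {
  const dx = x2 - x1, dy = y2 - y1;
  const len = Math.sqrt(dx * dx + dy * dy);
  const steps = Math.floor(len / dashLen);
  const segs = [];
  for (let i = 0; i < steps; i += 2) { // every other piece is a gap
    segs.push([
      x1 + (dx * i) / steps,       y1 + (dy * i) / steps,
      x1 + (dx * (i + 1)) / steps, y1 + (dy * (i + 1)) / steps,
    ]);
  }
  return segs;
}
```

Drawing is then just a `moveTo`/`lineTo` pair per returned piece inside one `beginPath()`/`stroke()`.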
>
> All the best,
>
> Tels
>
> --
>  Signed on Fri May  8 20:47:12 2009 with key 0x93B84C15.
>  Get one of my photo posters: http://bloodgate.com/posters
>  PGP key on http://bloodgate.com/tels.asc or per email.
>
>  "If Duke Nukem Forever is not out in 2001, something's very wrong."
>
>  -- George Broussard, 2001 (http://tinyurl.com/6m8nh)
>

