[OSM-dev] amazon AMIs for a full blown openstreetmap.org-like server

jamesmikedupont at googlemail.com
Fri Jan 8 14:39:49 GMT 2010


On Fri, Jan 8, 2010 at 3:23 PM, Stefan de Konink <stefan at konink.de> wrote:
> On 08-01-10 15:09, jamesmikedupont at googlemail.com wrote:
>> I don't know all the details of Mapnik, but from what I have seen,
>> using the Postgres database is not needed in all cases.
>> I am thinking about
>
> It is not required to use the database, but the overhead is in the
> rendering. I do agree that PostgreSQL will not always give you the
> fastest results ;)
>
>> Well, you mean a 4k page in memory. But you don't need to have all
>> that data memory-mapped.
>> It could be just 4k of data on disk in an array. It could also just be
>> a tiny OSM file that is parsed when needed.
>
> Tiny would have to be an OSM tile in binary format, agreed?

Yes. Well, if you sort the data properly, then Mapnik could just render
the data in a SAX callback, right?
It would be a single pass over the data to render it, maybe with a
second pass to do some touch-ups.
But given a street, you could just draw it: it has a vector of nodes.

Imagine the following sequence of events in a SAX XML callback (see the
sketch after this list):

begin tile:
  process the nodes for the next street (marking intersections)
  process all the intersecting streets (this is where we need to preprocess the data)
  process the street (handling intersections as they come up)
  next street.
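
Something like this is what I have in mind, as a Python sketch; the
draw_street helper, the file name, and the pre-sorted tile are my
assumptions, not anything Mapnik actually exposes:

    import xml.sax

    class TileRenderer(xml.sax.ContentHandler):
        """Hypothetical single-pass render over a pre-sorted OSM tile:
        all <node> elements arrive before any <way> that uses them."""

        def __init__(self):
            self.nodes = {}      # node id -> (lat, lon)
            self.way_refs = []   # node refs of the way being read

        def startElement(self, name, attrs):
            if name == 'node':
                self.nodes[attrs['id']] = (float(attrs['lat']),
                                           float(attrs['lon']))
            elif name == 'way':
                self.way_refs = []
            elif name == 'nd':
                self.way_refs.append(attrs['ref'])

        def endElement(self, name):
            if name == 'way':
                # one street is complete: draw it right away
                points = [self.nodes[r] for r in self.way_refs
                          if r in self.nodes]
                draw_street(points)

    def draw_street(points):
        pass  # stand-in for the actual drawing call

    # 'tile.osm' is a hypothetical pre-sorted tile file
    xml.sax.parse('tile.osm', TileRenderer())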

So you need to look for street crossings and mark the points where they
cross as such; I can imagine that might need some work.
I am just working this out right now, but I can imagine that if you
sorted the streets, knew all the intersections, marked each one by
storing a reference to it (as in <nd ref=.../>), and also had those
segments available, then you could just draw the street.
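
As a first guess at that preprocessing pass (my assumption: any node
referenced by more than one way is a crossing):

    from collections import Counter

    def find_intersections(ways):
        """ways: list of lists of node ids. A node shared by two or
        more ways is a crossing and gets marked."""
        counts = Counter(ref for way in ways for ref in way)
        return {ref for ref, n in counts.items() if n > 1}

    # example: two streets sharing node 'n2'
    streets = [['n1', 'n2', 'n3'], ['n4', 'n2', 'n5']]
    print(find_intersections(streets))   # {'n2'}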



>
>> Well, again, I have not really gotten into Mapnik. But I can.
>
> Please do so :)

OK, well, I will have to!

>
>> But let's try to define the problem as rendering an OSM file to a
>> tile. That OSM file is updated, and re-rendered.
>> All the data needed to render is just stored in OSM XML in a nicely
>> sorted way. You never need all the data at once because you only
>> render one tile at a time.
>
> You don't want to store it in XML; it will get huge... and again
> requires the parsing overhead. And as I pointed out before, we prefer
> to render 64 tiles at a time.

So you would have 64 tiles of data in one page... How big is that? The
size of a city?

Well, I have gotten the parsing overhead down to very little using
SAX2; it is only a bit slower than just reading the file.
If you sort the XML data topologically and pack it all in, it could be
very easy to process.
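
By packing I mean something like this (a sketch that assumes the tile
is small enough to sort in memory): write all the nodes first, then the
ways, so a single forward pass sees every node before any way that
needs it:

    import xml.etree.ElementTree as ET

    def pack_tile(in_path, out_path):
        """Rewrite a small .osm file so nodes come before ways."""
        root = ET.parse(in_path).getroot()
        packed = ET.Element('osm', root.attrib)
        packed.extend(root.findall('node'))   # all coordinates first
        packed.extend(root.findall('way'))    # then the streets
        ET.ElementTree(packed).write(out_path)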

>
>
>> But you don't need full PostgreSQL database functionality; you just
>> have very basic updates of pages of data.
>
> In your idea, how would you *update* the storage?

My vision of a distributed Git repository would have the user update
the storage when they save it...
Now, if you have a web user that commits the data or uploads it, then
the page of data would be updated by that process, the rendering would
be triggered, and the whole thing checked in.
Each tile would be checked in, and older versions discarded as needed.
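
Roughly the check-in step I picture, driving plain git through
subprocess; the render_tile hook and the paths are made up:

    import subprocess

    def render_tile(tile_path):
        pass  # stand-in for the actual renderer

    def commit_tile(repo, tile_path, message):
        """Hypothetical check-in: re-render the tile, stage it, commit.
        Older versions live on in history and can be pruned later."""
        render_tile(tile_path)
        subprocess.check_call(['git', 'add', tile_path], cwd=repo)
        subprocess.check_call(['git', 'commit', '-m', message], cwd=repo)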

Now, if you just have a nice compact XML file, or an ASN.1 file for
higher-speed binary access, or something else, you need to update it.
But how often are we talking about updates here? Most people are using
daily diff files from the main database. Those would be split up into
the tiles, the XML would be replaced by a new page, and the sorting
would have to be redone. I need to work this all out, but it is
possible.
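
A sketch of how that split could go; the tile numbering is the usual
slippy-map scheme, the rest is my assumption:

    import math

    def tile_for(lat, lon, zoom):
        """Standard slippy-map tile coordinates for a point."""
        n = 2 ** zoom
        x = int((lon + 180.0) / 360.0 * n)
        lat_rad = math.radians(lat)
        y = int((1.0 - math.log(math.tan(lat_rad)
                                + 1.0 / math.cos(lat_rad)) / math.pi)
                / 2.0 * n)
        return x, y

    def split_diff(changed_nodes, zoom=12):
        """Bucket changed nodes from a daily diff by tile; every bucket
        marks a tile page that must be rewritten and re-sorted."""
        buckets = {}
        for node_id, lat, lon in changed_nodes:
            buckets.setdefault(tile_for(lat, lon, zoom), []).append(node_id)
        return buckets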

I will have to work through the entire processing pipeline to see
whether it fits my model; I risk having to put my foot in my mouth
here...

mike



