[OSM-dev] NetSquared Nomination

Fri Apr 20 22:35:12 BST 2007

I'm still working out the exact approach and learning Python as I do it.
The data is not a 100% match as, for example, TIGER specifies a zip code
to the left and a zip code to the right, as opposed to one zip code for a
line. Also, while segments have only two points (a begin lon/lat and an
end), with the rest filled in by points along the segment, there appear
to be multiple segments for the same road (I'm unsure why this is).
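
To make the mismatch concrete, here is a rough sketch of how one TIGER
line might be held in Python before conversion. The class and field names
(TigerLine, zip_left, zip_right, shape_points) are my own placeholders,
not names from the TIGER files or the OSM schema, and the coordinates are
made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

LonLat = Tuple[float, float]


@dataclass
class TigerLine:
    """One TIGER line as described above (field names are hypothetical)."""
    line_id: int
    name: Optional[str]          # some roads turn up with no name
    start: LonLat                # begin lon/lat
    end: LonLat                  # end lon/lat
    zip_left: Optional[str]      # zip code on the left side of the line
    zip_right: Optional[str]     # zip code on the right side of the line
    shape_points: List[LonLat] = field(default_factory=list)

    def path(self) -> List[LonLat]:
        """Full geometry: start, intermediate points in order, then end."""
        return [self.start, *self.shape_points, self.end]

    def zip_codes(self) -> List[str]:
        """Distinct zip codes touching the line, left side first."""
        seen: List[str] = []
        for code in (self.zip_left, self.zip_right):
            if code and code not in seen:
                seen.append(code)
        return seen


line = TigerLine(1, "Main St", (-77.0, 38.9), (-77.1, 39.0),
                 "20001", "20002", [(-77.05, 38.95)])
print(line.path())
print(line.zip_codes())
```

Keeping both zip codes on the line means nothing is thrown away in the
holding stage; how they map to OSM can be decided later.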

Also, someone said that long ways would not be a good idea. Why is this?

Finally, as to the requirements: the TIGER files, I believe, are around
4 gig zipped, and since they're text, they zip very well. For a 3.3M zip
file I get about 28M unzipped. When loaded into the initial (non-OSM
schema) database, however, I get about 2-3M (converting strings to
numbers, trimming out irrelevant stuff). So spacewise I would probably
say something like 75G free if we are going to unzip everything all at
once and then load. Alternatively, we could have the program unzip, load,
and then remove the extracted files.
This would get the data into a holding database, which will make it much
easier to deal with.
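
The unzip, load, remove alternative could look roughly like the sketch
below. It is only an outline under my own assumptions: load_archives and
the load_file callback are hypothetical names, and the real loader would
parse each extracted file and insert rows into the holding tables.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Callable, Iterable


def load_archives(zip_paths: Iterable[Path],
                  load_file: Callable[[Path], None]) -> int:
    """Extract one archive at a time, load its files, then clean up.

    load_file is a placeholder for whatever writes a single extracted
    file into the holding database. Returns the number of files loaded.
    """
    loaded = 0
    for zip_path in zip_paths:
        # The scratch directory is deleted when the with-block exits, so
        # only one archive's worth of unzipped text is on disk at a time.
        with tempfile.TemporaryDirectory() as scratch:
            with zipfile.ZipFile(zip_path) as archive:
                archive.extractall(scratch)
            for extracted in sorted(Path(scratch).rglob("*")):
                if extracted.is_file():
                    load_file(extracted)
                    loaded += 1
    return loaded
```

Done this way, the free space needed is roughly the largest single
unzipped archive plus the holding database, rather than everything
unzipped at once.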
I honestly don't see this as a big bang process, and believe that the
data will have a few surprises in it (I've already found roads with no
names, and I haven't yet figured out why they're there).
To me the most important first step is to get all road data (and only
road data, with street numbers and zip codes) into holding tables in
MySQL. This should be okay; however, TIGER has a provision that the
primary description is of the line's most prominent feature, so there's
a chance that this might miss something, but I don't think so. I believe
that doing this will allow us to tinker with the data until we're sure
it's ok, and maybe do something like load all interstates to production
first.

On Thu, 2007-04-19 at 07:52 -0700, Andrew Turner wrote:
> Agreed that the project didn't seem 'noble' enough since it wasn't
> mapping Africa or otherwise. Interesting that "Maps 2.0" got picked.
> Maybe they have money now to offer OSM ;)
> 
> I do agree the project for US kickstart is still needed. Timeframe
> wise would be to put together the plan and the beginning of the import
> code and then go around Where/WhereCamp invigorating the US geo
> community to make it happen.
> 
> With regards to boxxen, Anselm may have space - I can ping him. What
> are the expected requirements, based on existing h/w used? Otherwise,
> an option may be to do a US donation drive to build up the small?
> amount - good hosting is only like $30-50/month.
> 
> I'm still in CA at Loc Int & web2.0 and can gather more thoughts after
> I get back.
> Andrew
> 
> On 4/19/07, Don Smith <dcsmith at gmail.com> wrote:
> > So,
> > The last place this was left was that we were going to load the
> > tiger/line data into mysql, then run some stored procedures to get the
> > data into osm format.
> > There was going to be a dedicated machine for the project, but I have no
> > idea where that ended up. I switched my desktop with the idea if the
> > machine wasn't ready I'd start development, but haven't gotten around to
> > it yet. I guess my task will be to lay out the tables, and write a loader
> > in python. Once that's done, the rest should be the writing of a few
> > stored procedures to convert it to osm data (Which I haven't looked at
> > closely yet).
> >
> > Don Smith
> >
> > On Thu, 2007-04-19 at 09:25 +0100, 80n wrote:
> > > Well, we weren't one of the winning 20.  But hopefully the project
> > > will have got some exposure from this process anyway.  Anyway, I'd
> > > like to thank everybody for all the time and work they put into the
> > > proposal.
> > >
> > > I think the main problem with our proposal was that it was focussed on
> > > helping the project succeed in the US.  If we'd have made a pitch for
> > > creating a free map of some small village in Africa it would have come
> > > over better.  Anyone else have any thoughts or observations, in
> > > retrospect, about what we did wrong and what we could do better next
> > > time?
> > >
> > > The original goal of kick-starting OSM in the US still exists.  Does
> > > anyone have any other ideas or strategies that we can try?  The
> > > TIGER/Line data still needs to be dealt with and will continue to be
> > > one of the obstacles until we address it.  Don, I think you were
> > > planning to work on this, what would help you to get it done?
> > >
> > > 80n
> > >
> > >
> > >
> >
> >
> 
>