[OSM-dev] NetSquared ominationN

Mon Apr 23 17:30:07 BST 2007

Getting Tiger into OSM seems like an excellent hacking activity for WhereCamp in June

http://wherecamp.pbwiki.com/WhereCampSF

----- Original Message ----
From: Andy Robinson <Andy_J_Robinson at blueyonder.co.uk>
To: Don Smith <dcsmith at gmail.com>; Andrew Turner <ajturner at highearthorbit.com>
Cc: 80n <80n80n at gmail.com>; SteveC <steve at asklater.com>; Mikel Maron <mikel_maron at yahoo.com>; Dev mail list <dev at openstreetmap.org>
Sent: Friday, April 20, 2007 11:31:33 PM
Subject: RE: NetSquared ominationN

Don,

Neat, all sounds good.

Has anyone had any thoughts yet on what needs to be done to shift the data
to fit the OSM schema, i.e. the reuse of common nodes (segment end points)?

Cheers

Andy

Andy Robinson
Andy_J_Robinson at blueyonder.co.uk 
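
To make the question about reusing common nodes concrete, here is a minimal Python sketch. It assumes each TIGER line has already been reduced to a list of (lon, lat) points; the function name, the rounding precision, and the sequential ids are all made up for illustration and are not part of any existing importer or of the OSM schema itself.

```python
from typing import Dict, List, Tuple

Coord = Tuple[float, float]


def build_shared_nodes(
    lines: List[List[Coord]], precision: int = 6
) -> Tuple[Dict[Coord, int], List[List[int]]]:
    """Give every distinct coordinate one node id and rewrite each line as ids.

    Hypothetical helper: coordinates are rounded to `precision` decimal
    places so that end points shared by two lines compare equal.
    """
    node_ids: Dict[Coord, int] = {}
    ways: List[List[int]] = []
    for points in lines:
        way: List[int] = []
        for lon, lat in points:
            key = (round(lon, precision), round(lat, precision))
            if key not in node_ids:
                # First time this coordinate is seen: allocate a new id.
                node_ids[key] = len(node_ids) + 1
            way.append(node_ids[key])
        ways.append(way)
    return node_ids, ways


# Two lines that meet at (-77.01, 38.91) end up sharing one node id.
example_lines = [
    [(-77.00, 38.90), (-77.01, 38.91)],
    [(-77.01, 38.91), (-77.02, 38.92)],
]
example_nodes, example_ways = build_shared_nodes(example_lines)
```

Whether exact matching after rounding is good enough for the real data is an open question; it is only the simplest way to show the idea.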

>-----Original Message-----
>From: Don Smith [mailto:dcsmith at gmail.com]
>Sent: 20 April 2007 10:35 PM
>To: Andrew Turner
>Cc: 80n; Andy Robinson; SteveC; Mikel Maron; Dev mail list
>Subject: Re: NetSquared ominationN
>
>I'm still working out the exact approach and learning python as I do it.
>The data is not a 100% match as, for example, tiger specifies zip code
>to left, and zip code to right as opposed to a zip code for a line. Also
>while segments have only two points (a begin lon/lat, and an end) where
>the rest is filled in by points along the segment, there appear to be
>multiple segments for the same road (I'm unsure why this is).
>
>Also, someone said that long ways would not be a good idea, why is this?
>
>Finally, as to the requirements, the tiger files, I believe, are around
>4gig zipped, and since they're text, they zip very well. For a 3.3M zip
>file I get about 28M unzipped. When loaded into the initial (NON OSM
>SCHEMA) database, however, I get about 2-3M (Converting strings to
>numbers, trimming out irrelevant stuff). So spacewise I would probably
>say something like 75G free if we are going to unzip everything all at
>once and then load, alternatively we could have the program unzip, load,
>and then remove the extracted files.
>This would get the data into a holding database, which will make it much
>easier to deal with.
>I honestly don't see this as a big bang process, and believe that the
>data will have a few surprises in it (I've already found roads with no
>names that I haven't figured out why they're there yet).
>To me the most important first step is to get all road data (And only
>road data (with street numbers and zipcodes)) into holding tables in mysql.
>This should be okay, however tiger has a provision that the primary
>description is of the line's most prominent feature, so there's a chance
>that this might miss something, but I don't think so. I believe that
>doing this will allow us to tinker with the data until we're sure it's
>ok, and maybe do something like load all interstates to production
>first.
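
The "unzip, load, and then remove the extracted files" option above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name and the `load_file` callback are hypothetical, and the callback stands in for whatever actually writes rows into the holding database.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Callable


def load_zip_then_clean(zip_path: str, load_file: Callable[[Path], None]) -> int:
    """Extract one archive, hand each extracted file to load_file, then clean up.

    Hypothetical helper: the temporary directory and everything extracted
    into it are deleted when the `with` block exits, so only one archive's
    worth of unzipped text is on disk at any time.
    """
    loaded = 0
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        for path in sorted(Path(tmp).rglob("*")):
            if path.is_file():
                load_file(path)  # e.g. parse the records and insert them
                loaded += 1
    return loaded
```

Calling this once per zip file in a loop keeps the free-space requirement close to the size of the largest single unzipped archive plus the database, rather than the whole unzipped data set.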
>
>
>
>
>
>
>On Thu, 2007-04-19 at 07:52 -0700, Andrew Turner wrote:
>> Agreed that the project didn't seem 'noble' enough since it wasn't
>> mapping Africa or otherwise. Interesting that 'Maps 2.0' got picked.
>> Maybe they have money now to offer OSM ;)
>>
>> I do agree the project for US kickstart is still needed. Timeframe
>> wise would be to put together the plan and the beginning of the import
>> code and then go around Where/WhereCamp invigorating the US geo
>> community to make it happen.
>>
>> With regards to boxxen, Anselm may have space - I can ping him. What
>> are the expected requirements, based on existing h/w used? Otherwise,
>> an option may be to do a US donation drive to build up the small?
>> amount - good hosting is only like $30-50/month.
>>
>> I'm still in CA at Loc Int & web2.0 and can gather more thoughts after
>> I get back.
>> Andrew
>>
>> On 4/19/07, Don Smith <dcsmith at gmail.com> wrote:
>> > So,
>> > The last place this was left was that we were going to load the
>> > tiger/line data into mysql, then run some stored procedures to get the
>> > data into osm format.
>> > There was going to be a dedicated machine for the project, but I have no
>> > idea where that ended up. I switched my desktop with the idea if the
>> > machine wasn't ready I'd start development, but haven't gotten around to
>> > it yet. I guess my task will be to lay out the tables, and write a loader
>> > in python. Once that's done, the rest should be the writing of a few
>> > stored procedures to convert it to osm data (Which I haven't looked at
>> > closely yet).
>> >
>> > Don Smith
>> >
>> > On Thu, 2007-04-19 at 09:25 +0100, 80n wrote:
>> > > Well, we weren't one of the winning 20.  But hopefully the project
>> > > will have got some exposure from this process anyway.  Anyway, I'd
>> > > like to thank everybody for all the time and work they put into the
>> > > proposal.
>> > >
>> > > I think the main problem with our proposal was that it was focussed on
>> > > helping the project succeed in the US.  If we'd have made a pitch for
>> > > creating a free map of some small village in Africa it would have come
>> > > over better.  Anyone else have any thoughts or observations, in
>> > > retrospect, about what we did wrong and what we could do better next
>> > > time?
>> > >
>> > > The original goal of kick-starting OSM in the US still exists.  Does
>> > > anyone have any other ideas or strategies that we can try?  The
>> > > TIGER/Line data still needs to be dealt with and will continue to be
>> > > one of the obstacles until we address it.  Don, I think you were
>> > > planning to work on this, what would help you to get it done?
>> > >
>> > > 80n
>> > >
>> > >
>> > >
>> >
>> >
>>
>>