Thanks Jason and Serge,

To address the 5 occurrences of the word "will" on the wiki page: "will"
expresses the future tense, i.e., consensus happens prior to commencement
of work, which is the primary motivation for this discussion.  The stated
target completion date is the end of 2013, but I'd frankly like the work to
be done (well) before the October editathon.  Once work starts, I'd be glad
to update the wiki page to use present tense ;-).

The LWG has not seen the license; I was not aware of that requirement and
assumed the Import US group provided comprehensive guidance.

On address points, they are "available" via online scanned parcel maps of
the city; I am not committing to hand transcribing some 29,000 points.  I'd
suggest that exercise as a future editathon/MapRoulette project via the
Tasking Manager or similar.  If consensus is that building outlines are not
useful without addresses, please explicitly state that now to save time and
effort.  Again, I am not proposing individually adding address points; this
proposal is for building outlines (and perhaps land use updates) only.

WRT PA land use, those polys are already in the OSM DB, and many appear
rather rough.  The city has a reasonable set of shapes; for example, there
are about 56 parks, and it seems useful to improve them where possible.  If
the plan is to remove them from OSM across the board, I will drop the idea.

I will investigate outline simplification, and will handle the dups and
intersections as I process the buildings.
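
A minimal sketch of one way exact duplicate outlines could be flagged,
assuming each outline is a non-empty list of (x, y) tuples with no repeated
vertices other than the closing point.  The function names are hypothetical,
and this only catches outlines with identical vertices (regardless of start
point or winding direction); it does not detect partial overlaps or
intersections.

```python
from collections import defaultdict


def canonical_ring(ring):
    """Return a start-point- and direction-independent form of a ring."""
    # Drop the closing point if the ring is explicitly closed.
    pts = list(ring[:-1]) if ring[0] == ring[-1] else list(ring)
    candidates = []
    # Consider both winding directions, each rotated to start at the
    # smallest vertex, and keep the lexicographically smaller result.
    for seq in (pts, pts[::-1]):
        i = seq.index(min(seq))
        candidates.append(tuple(seq[i:] + seq[:i]))
    return min(candidates)


def find_duplicate_groups(rings):
    """Return lists of indices of rings that share the same vertices."""
    groups = defaultdict(list)
    for index, ring in enumerate(rings):
        groups[canonical_ring(ring)].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]
```

Each returned group would still need a manual look in JOSM before anything
is dropped.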

Good point on the rooftops; I saw no level attribute, but will revisit.  If
there is no obvious way to remove them, I'll again handle it by removing
them as updates are applied via JOSM.

On the subject of including ids and future updates, the idea that ids are
not very useful appears to be in conflict with the notion of continuing
maintenance.  The IDs are apparently persistent, and therefore are useful
for future update efforts.  Removing the source ids seems to add complexity
to future update efforts.  While I understand no method is perfect, not
including them when present seems suboptimal.  How about paloalto_ca:id for
the id tag?
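
To illustrate why persistent ids could simplify a later update, here is a
minimal sketch.  It assumes both the buildings already in OSM and a future
city release can be reduced to a mapping from source id (the value of the
proposed paloalto_ca:id tag) to some comparable geometry value; the function
name and the data shapes are hypothetical.

```python
def diff_by_source_id(osm_buildings, city_buildings):
    """Compare two {source_id: geometry} mappings.

    Returns (added, removed, changed) as sorted lists of ids:
    added   - ids only in the new city data
    removed - ids only in OSM (gone from the city data)
    changed - ids in both whose geometry values differ
    """
    osm_ids = set(osm_buildings)
    city_ids = set(city_buildings)
    added = sorted(city_ids - osm_ids)
    removed = sorted(osm_ids - city_ids)
    changed = sorted(
        i for i in osm_ids & city_ids
        if osm_buildings[i] != city_buildings[i]
    )
    return added, removed, changed
```

Without the ids, the same comparison would have to fall back on geometric
matching, and each of the three lists would still need manual review.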

I will switch to changeset level source tagging, and change the tag text to
"City of Palo Alto CA 0713", unless something else is preferred.

On processing, I loaded a CSV file of the data into PostGIS, validated the
geometry, then used ogr2ogr to extract a shapefile containing only the
useful columns, filtered for outlines that are proper polygons.  I then
generated the .osm file from the resulting shapefile via ogr2osm.
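
As a rough sketch of what "proper polygon" could mean in a filter like this,
assuming an outline is a list of (x, y) tuples: the ring is closed, has at
least three distinct corners plus the closing point, and encloses a nonzero
area.  This is a simplified stand-in with hypothetical function names, not
the validation PostGIS or ogr2ogr performs, and it does not check for
self-intersection.

```python
def ring_area(ring):
    """Signed area of a closed ring via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_proper_polygon(ring):
    """True if the ring is closed, has >= 4 points, and nonzero area."""
    if len(ring) < 4:
        return False
    if ring[0] != ring[-1]:
        return False
    return abs(ring_area(ring)) > 0.0
```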

Again, thanks for the quick feedback; it is appreciated.

On Mon, Jul 22, 2013 at 8:17 AM, Serge Wroclawski <emacsen at gmail.com> wrote:

> On Mon, Jul 22, 2013 at 10:25 AM, the Old Topo Depot
> <oldtopos at novacell.com> wrote:
> > The city of Palo Alto, CA has released a set of building outlines, and
> > I've created an .osm file from that data at
> > https://github.com/oldtopos/PaloAltoCA
> Comments on the osm file at the bottom.
> > A wiki page discussing this proposal can be reviewed at
> > http://wiki.openstreetmap.org/wiki/Palo_Alto,_California/Buildings_Import#Data_Transformation_Results
> John,
> The purpose of this list, and this group, is to work collaboratively
> and also to build consensus, so I'm glad you're bringing this import
> up.
> One question that is looming large in my mind is the question of
> timeframe, and feedback, since there is a lot of "will" language in
> the wiki.
> How much time will you give us for participatory feedback?
> Now onto some questions.
> The license for this data is "interesting"- in that it has a lot of
> terms in it which I don't understand, because I suspect they have a
> different meaning in legalese than they do in plain English.
> Has the LWG seen this license?
> > This import does NOT include comprehensive address points, which is not
> > currently available as an easily imported dataset.
> This question of the value of building outlines without their
> corresponding addresses comes up very often.
> Building traces are very high in terms of data density, but without
> something like an address, they can be very noisy with little benefit.
> Is there a plan to get addresses, or at least address interpolation
> data, to go along with the high density data?
> > The city also has a land use polygon dataset (among others) that I'll
> > use to update existing shapes once this import has been approved.
> We've discussed landuse, especially in California, and found it to be
> of very low quality. There seems to be a growing consensus in this
> group that landuse isn't that valuable in general. I'm still somewhat
> neutral on this topic, but I'm increasingly persuaded not only by the
> arguments that have been presented, but by the compelling evidence
> that landuse imports have been especially bad in the US.
> On the OSM file, I echo Jason's sentiments.
> The overlapping buildings are really confusing to me. I don't
> understand what that is- can you explain it?
> As Paul Norman and others have pointed out in the past, including a
> source_id on an object does not really help- so it's not generally
> encouraged (and I'd argue for its removal). Source tags on objects,
> too, is generally not needed, as we can get that same information from
> the changeset.
> My other question is regarding your conflation process and the future.
> Your github contains the output file, but is it possible for you to
> share how you arrived at that data?
> Also, I'm thinking about in a few years, if the city has a new
> dataset, how it would be possible to conflate the datasets between
> what is in OSM and what's in the new dataset, both adding new
> buildings, and removing buildings which have been deleted from the new
> dataset, or even buildings which were removed from OSM but still
> appear in the city data?
> Some of this will need to be handled by manual review, but the
> question is- based on working with this data, do you have any thoughts
> to share about it?
> - Serge

