[Imports] Spanish Cadastre ELEMTEX

Paul Norman penorman at mac.com
Thu Jun 7 03:32:34 UTC 2012


> From: Javier Sánchez [mailto:javiersanp at gmail.com]
> Subject: [Imports] Spanish Cadastre ELEMTEX
> 
> The ELEMTEX layer contains text labels about unpopulated places, many
> small populated places and points of interest (like police stations,
> post offices, hospitals, schools, etc), but they are not categorized.
> These data are extracted with an option of Cat2Osm. The result consists
> of nodes with the tags name=*, source=cadastre and source:date=*. The
> job basically consists of manually checking the nodes, assigning the tags
> most suitable to describe the elements based on the name and local
> knowledge, dropping those that cannot be classified, correcting other
> errors such as spelling, and conflating with existing OSM data.
> 
> As an example, two files are attached. The first is the output generated by
> the program Cat2Osm for a municipality [3]. The second contains the data
> reviewed manually [4].

I have a few comments, some on the raw data, some on the edited data. This
is much improved, and dealing with a smaller subset of the data makes it much
easier to review.

- Some nodes have only name, source and source:date tags, while others have
those plus place=locality.

- Some have absurd names (e.g. name=-I+I and name=P-1, P-2, etc.). These
could be dealt with manually, but it would be worth seeing how common they
are and dealing with them in the conversion.

- Some of the names could be translated to tagging, e.g. name=GASOLINERA
could be turned into amenity=fuel. Again, this would depend on how common
and how consistent they are.

- All I noticed with the processed file is that there were some ways with
only source, source:date and name tags.
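
As a rough sketch of how the points above could be checked before any
manual work starts, the script below reads OSM XML, counts how often each
name occurs, flags names that look like codes rather than words, suggests
tags from a small lookup table, and lists elements that carry nothing
beyond name, source and source:date. It is only an illustration and not
part of Cat2Osm: the KEYWORD_TAGS table, the "three letters in a row"
heuristic and the function names are all assumptions, and the only mapping
taken from the comments above is GASOLINERA to amenity=fuel.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical lookup table. Only GASOLINERA -> amenity=fuel comes from the
# comments above; further entries would need checking against the real data.
KEYWORD_TAGS = {
    "GASOLINERA": {"amenity": "fuel"},
}

# Assumed heuristic: a usable name contains a run of at least three letters.
# Names such as "-I+I" or "P-1" do not, so they get flagged for a human.
HAS_WORD = re.compile(r"[^\W\d_]{3,}")

# Elements carrying nothing beyond these keys were never classified.
BARE_KEYS = {"name", "source", "source:date"}


def read_elements(osm_xml):
    """Yield (type, id, tags) for every tagged node or way in OSM XML."""
    root = ET.fromstring(osm_xml)
    for elem in root:
        if elem.tag not in ("node", "way"):
            continue
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        if tags:  # skip untagged nodes that only give ways their geometry
            yield elem.tag, elem.get("id"), tags


def review(osm_xml):
    """Collect name counts, odd names, tag suggestions and bare elements."""
    report = {"names": Counter(), "suspicious": [], "suggested": [], "bare": []}
    for kind, ident, tags in read_elements(osm_xml):
        name = tags.get("name", "").strip()
        if name:
            report["names"][name] += 1
            if not HAS_WORD.search(name):
                report["suspicious"].append((kind, ident, name))
            extra = KEYWORD_TAGS.get(name.upper())
            if extra:
                report["suggested"].append((kind, ident, name, extra))
        if set(tags) <= BARE_KEYS:
            report["bare"].append((kind, ident))
    return report
```

Calling review() on the bytes of a converted file (for example
open(path, "rb").read(), with path pointing at whichever .osm file is
being checked) and then looking at report["names"].most_common() would
show which names are frequent enough to be worth handling in the
conversion rather than by hand. The heuristic only flags candidates; it
does not decide anything, so short but legitimate names would still need
a human look.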

CanVec has convinced me that taking this approach to imports requires some
sort of QA plan; otherwise at least one importer will not check any of their
work and will cause duplicates, disconnected data, bad data, etc.

The problem is not that there aren't tools for checking your own work; the
problem is that some people will not use them and will cause significant
damage. I can't suggest a solution for this, and I think it is less
significant for this particular data set, but it is definitely an issue when
you start getting into others like roads and landuse.

If you find a good solution for this problem, please document it and let
everyone know, as it is a big problem with imports done by multiple people
who may have varying standards.



