[OSM-talk-be] Conflation OpenData Brussels
Glenn Plas
glenn at byte-consult.be
Wed Apr 4 09:04:20 UTC 2018
On 03-04-18 21:22, eMerzh wrote:
> Hi all :)
>
> I'm just discovering the data sets that are in
> http://opendatastore.brussels/ (brussels region)
> and some of them might be really interesting for osm.
> Like :
> - school list
> - streets surfaces
> - 3d buildings
> - parkings
> - transports stop poles
> and much more
>
> , but I was wondering if there was an "easy" way to do the
> conflation... automatically or semi-manually ...
> for now the only way I see is to transform data, put them in a
> postgis then doing all the work manually... but...
> ** there must be a better way **
> no?
Hi,
Concerning the 3D buildings and street surfaces, I know for sure that
this data comes from Urbis, which is a high quality dataset, but even
that doesn't justify importing it into OSM without human review.
I've been considering including the Urbis dataset in my tool, but I have
to keep the focus on GRB; once that is launched I will turn my
attention to Urbis data. The tool basically does what you describe: it
imports the data into PostGIS and makes it easy to export directly into
JOSM, translated to the OSM model as well as I (or it) can. The Urbis
source data model will work quite well with my transform procedure and
tools.
Not all the data there is GIS oriented either, and a lot of it is
scattered around. On the parking subject, for example, we are dealing
with 12 different datasets. Some sets represent the access to the
parkings, which in OSM would be a simple access tag on another set.
Here you will have to combine the parking sets, process the data, do
the Q/A, handle the exceptions, etc. You cannot consider this job
easy and straightforward; it's going to be painful instead.
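To illustrate the kind of combining step described here, a minimal
sketch in Python. It assumes two already-loaded lists of records; the
field names (id, name, parking_id, access) are hypothetical and not
taken from the actual Brussels datasets. Anything that cannot be merged
unambiguously is set aside for a human instead of being guessed.

```python
from collections import defaultdict


def merge_access_into_parkings(parkings, access_points):
    """Fold a separate 'access' dataset into parking records as an OSM-style tag.

    Field names used here are assumptions for illustration only.
    Returns (merged, needs_review).
    """
    # Collect every access value reported for each parking.
    by_parking = defaultdict(set)
    for ap in access_points:
        by_parking[ap["parking_id"]].add(ap["access"])

    merged, needs_review = [], []
    for p in parkings:
        values = by_parking.get(p["id"], set())
        if len(values) == 1:
            # Exactly one access value: safe to express as a single access tag.
            merged.append({
                "amenity": "parking",
                "name": p["name"],
                "access": next(iter(values)),
            })
        else:
            # Missing or conflicting values: an exception for manual Q/A.
            needs_review.append((p["id"], sorted(values)))
    return merged, needs_review
```

The review list is the point of the sketch: the exceptions are where
the real work is, and they should end up in front of a person.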
Also, the more I look at the data, the more errors and problems I see.
I've been working with GRB data for a long time, and only last week I
contacted Marc to check whether he had also noticed an "addressing"
problem in the transformed data, where housenumbers seem to be messed
up in certain cases. He confirmed this. And that is a "new" problem,
i.e. something that escaped my attention for a good year now... I
traced it down to the .dbf data files containing the address data, so
it's a source problem we need to mitigate. (It's not impossible, it's
just more work.)
That's just to say that preparing this data is a lot of work; I've been
working on that part alone, daily, for the last 3 months. It all
starts with quality. I've been using the GRB 3D data set as an aid to
determine building types, and that is about as far as I want to go, for
several reasons. The GRB set (non 3D) is a lot better maintained than
3D GRB, which kinda looks like an experiment (that went quite well).
That is also the big issue I have with all the imports in the wild
done without data analysis, scrutiny and Q/A work. We do notice people
importing the shape files straight into OSM without understanding
the problems related to this. Antwerp, for example, is a huge mistake
in progress: it used to be quite empty on buildings, but now it looks
like it's just a flat import of GRB shapefile data [1].
Also, Yves has done his homework on VILLO and STIB data. I've seen what
he had to deal with in those datasets, and I totally agree with his
conclusion that this isn't just a simple "hey it's GIS data, that's
good enough", and that bulk imports like the one that fucked up Antwerp
(but makes it look good on the map) should be avoided.
I work at CIBG/CIRB, so I can determine the source of certain datasets
if required and talk to people here to find out where they come from
(not all datasets are from government sources; a lot are from
customers of CIBG/CIRB).
In short, I would be careful and not import in bulk. The schools you
could probably merge/verify with OSM manually in a few days, and that
is what I would choose. More time goes into automating than you (and
definitely me too) would think at first glance.
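For a small set like the schools, even the "manual" route can be
helped by a script that only prepares a review list. The sketch below
is an assumption about how one might do that, not how my tool works:
it pairs each source school with the nearest OSM school within a
distance threshold and leaves the decision to a human. The record
layout (name, lat, lon) and the 100 m default are made up for the
example.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def review_list(source_schools, osm_schools, max_dist_m=100.0):
    """Pair each source school with its nearest OSM school, for human review.

    Returns rows of (source_name, osm_name_or_None, distance_m_or_None).
    Nothing is merged automatically.
    """
    rows = []
    for s in source_schools:
        best, best_d = None, None
        for o in osm_schools:
            d = haversine_m(s["lat"], s["lon"], o["lat"], o["lon"])
            if best_d is None or d < best_d:
                best, best_d = o, d
        if best is not None and best_d <= max_dist_m:
            rows.append((s["name"], best["name"], round(best_d)))
        else:
            # No OSM school nearby: possibly missing in OSM, possibly bad source data.
            rows.append((s["name"], None, None))
    return rows
```

The brute-force nearest search is fine for a few hundred schools; for
anything larger, a spatial index or a PostGIS query would be the
obvious replacement.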
Glenn
[1] Diff tool :
http://tiles.grbosm.site/slide/app/index.html#15/51.2091/4.4226