[OSM-talk] AND preview for India/China

D Tucny d at tucny.com
Wed Dec 12 04:08:28 GMT 2007


On 04/12/2007, Martijn van Oosterhout <kleptog at gmail.com> wrote:
>
> On Dec 4, 2007 5:07 AM, D Tucny <d at tucny.com> wrote:
> > > - Many many placenames, especially in India
> >
> > All the placenames in China are in their romanised form, does the data
> > have the Chinese names?
>
> I've checked the files by visual inspection but no, it appears only
> romanised names are available.


I wonder if it would be worth trying to merge this with GNS data. I'll try
to have a look at coming up with something to do that this evening... At
least comparing GNS against this data would be useful, I think... The GNS
data has the Chinese name too...
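
A rough sketch of how such a merge could work, assuming both data sets can be
reduced to simple records with a romanised name and a coordinate. The field
names ("name", "lat", "lon", "name_zh"), the 10 km cut-off and the sample
coordinates are illustrative assumptions, not the real GNS or AND layouts.
Names are compared with accents and tone marks stripped, and the closest
same-named GNS record within the cut-off wins:

```python
import math
import unicodedata


def normalise(name):
    """Lower-case a romanised name, dropping accents, tone marks and spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(c for c in decomposed if c.isalnum()).lower()


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points in kilometres."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def match_places(and_places, gns_places, max_km=10.0):
    """Pair each AND place with the nearest same-named GNS place, or None."""
    matches = []
    for place in and_places:
        key = normalise(place["name"])
        best, best_km = None, max_km
        for cand in gns_places:
            if normalise(cand["name"]) != key:
                continue
            d = distance_km(place["lat"], place["lon"], cand["lat"], cand["lon"])
            if d <= best_km:
                best, best_km = cand, d
        matches.append((place, best))
    return matches


# Hypothetical sample records; coordinates are approximate.
and_places = [{"name": "Hangzhou", "lat": 30.25, "lon": 120.17}]
gns_places = [{"name": "Hángzhōu", "lat": 30.26, "lon": 120.16, "name_zh": "杭州"}]
for place, gns in match_places(and_places, gns_places):
    print(place["name"], "->", gns["name_zh"] if gns else "no match")
```

Anything left unmatched would still need checking by hand.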

> > > - Lots of the major regional roads connecting the towns
> >
> > For China at least, as mentioned previously, the roads are, at best,
> > very very out of date... most expressways (motorways) are missing and
> > the accuracy of the other roads is very low... Does the data have any
> > more details on roads, such as names or references? That would make it
> > easier to try and link them to real data... luckily though, the data is
> > so sparse that merging shouldn't be an issue as long as there's a way to
> > work out what should be linked where...
>
> Are they wildly off, or do you mean that there's not many points per
> kilometre? I don't see any reference, though how do they refer to
> their roads, by number, by code, by name???


As mentioned in IRC, yes, the resolution is low, but most of the roads I've
been looking at have since been replaced...
My guesstimate for the age of the data is early to mid 90s. Since then,
around 50,000 km of expressway has been built and the road network overall
has grown by approximately 1 million km... The majority of these new roads
are intercity links, which is all this data contains...

Roads are referred to by name in cities, and by name and reference on
intercity routes...
The reference prefix typically indicates the road type: 'A' for expressways
(motorways), 'G' for national highways, 'S' for provincial highways, 'X' for
county roads...
However, this isn't entirely standardised... For example, the 'A' references
seem to be used mostly around Shanghai, and not so much, if at all, in other
parts of China...
All expressways do however have names, typically built from shortened forms
of their start and end points. These names often change along the route: for
an expressway running between two cities in different provinces, the name
order may switch at the tolls on the province border. For example, Hangning
Expressway (杭宁高速(公路)) is the name used in Zhejiang province for the
expressway to Nanjing, but cross the border into Jiangsu and the same
expressway is named Ninghang Expressway (宁杭高速). Here Hang (杭) is the
common abbreviation for Hangzhou and Ning (宁) for Nanjing. Huhang Expressway
(沪杭高速), also referred to as A8, runs from Shanghai (Hu/沪) to Hangzhou,
but parts of it are also referred to as Huhangyong Expressway (沪杭甬高速),
where Yong (甬) refers to Ningbo, which the expressway extends to after
Hangzhou...
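
To make the naming scheme concrete, here is a small sketch that expands an
expressway name into the cities it abbreviates and checks whether two names
are the same pair of end points in a different order. The abbreviation table
only covers the four cities mentioned above, and treating "same set of
cities" as "same expressway" is just a heuristic of mine, not an official
rule:

```python
# Single-character city abbreviations, limited to the examples in this mail.
ABBREVIATIONS = {
    "沪": "Shanghai",
    "杭": "Hangzhou",
    "宁": "Nanjing",
    "甬": "Ningbo",
}

# Longer suffix first so that 高速公路 is not cut down to 公路-less 高速.
SUFFIXES = ("高速公路", "高速")


def cities_on_expressway(name_zh):
    """Expand e.g. 沪杭甬高速 into the list of cities it is named after."""
    for suffix in SUFFIXES:
        if name_zh.endswith(suffix):
            name_zh = name_zh[: -len(suffix)]
            break
    # Unknown characters are passed through unchanged.
    return [ABBREVIATIONS.get(ch, ch) for ch in name_zh]


def same_expressway(name_a, name_b):
    """Heuristic: two names match if they abbreviate the same set of cities."""
    return set(cities_on_expressway(name_a)) == set(cities_on_expressway(name_b))


print(cities_on_expressway("沪杭甬高速"))
print(same_expressway("杭宁高速公路", "宁杭高速"))
```

Note that 沪杭高速 and 沪杭甬高速 would not match under this heuristic, even
though they share a stretch of road, so names alone are not enough to link
everything up.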

e.g.

A8 is the Huhang(yong) expressway, though lots of expressways exist without
a numeric reference... The names, and references where they exist, should be
nationally unique...
G320 is National Highway 320, which might be signed as 320国道, 320 or G320,
they should be nationally unique...
S02 is a province level highway, numbered 02, each province has its own
numbering...
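
As an illustration of how references like these could be recognised when
trying to link data, here is a minimal sketch. The class labels are just
descriptive strings (not a proposed tagging scheme), and a bare number such
as "320" is left unclassified because it is ambiguous without a prefix:

```python
import re

# Prefix letters as described above; the labels are plain descriptions.
ROAD_CLASSES = {
    "A": "expressway",
    "G": "national highway",
    "S": "provincial highway",
    "X": "county road",
}

REF_PATTERN = re.compile(r"([AGSX])(\d+)")


def classify_ref(ref):
    """Return (road class, number) for a reference, or None if unrecognised.

    The number is kept as a string so that leading zeros (S02) survive.
    """
    text = ref.strip().upper()
    # Signs sometimes read "320国道" instead of "G320".
    if text.endswith("国道") and text[:-2].isdigit():
        text = "G" + text[:-2]
    match = REF_PATTERN.fullmatch(text)
    if match is None:
        return None
    prefix, number = match.groups()
    return ROAD_CLASSES[prefix], number


for ref in ("A8", "G320", "320国道", "S02", "320"):
    print(ref, "->", classify_ref(ref))
```

Since provincial numbering restarts in each province, an 'S' reference would
also need the province attached before it could be treated as unique.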

I'm still searching for any of the roads in that data and I'll let you know
if I find any... As you've mentioned, there are no references or names of
roads in this data...

> > > There are also lakes and some rivers.
> >
> > The lakes, rivers and included coastline info looks way off, some of it
> > might have been the way representing a long long time ago, the PGS data
> > for the areas I've looked gives a much more accurate representation of
> > those features... Also, at least one lake I've looked at looks very
> > broken, only seems to have the south west corner of its coast with the
> > open ends seemingly just joined together to make a small slice of the
> > lake..
>
> The coastlines are not from the AND data. It's using standard mapnik
> coastlines for zoom 9, but going all the way down because the normal
> coastline data file breaks horribly at high zoom. I thought I'd fixed
> it but apparently not. And I can't use the coastline out of the main
> DB 'cause it's not there yet.
>
> I'll see what I can find out about the rivers and such.


The lake mentioned above has been fixed in the latest .osm file, the small
slice seen before was a result of the lake being split along the province
border which runs through the lake.

While the coast might not be there, the province borders do follow the coast
and islands, for provinces with a coastline, this will probably need
something doing to it...

From looking through the data, there is considerable overlap with the PGS
data closer to the coast, up to about 300km inland... The PGS data is higher
resolution and looks quite a bit more recent, so some form of manual import
of the water in these areas would probably be beneficial, as this water data
does still have a few benefits: 1) data further inland, 2) some lakes and
rivers have names... This will need tidying up somewhat as it stands...
Rivers are currently tagged as waterway=river while what is typically tagged
is an area... Islands still look a bit wrong, as mentioned in my last
post...

> > There are only really accents on the romanised forms used in Xinjiang,
> > pinyin, the romanisation of Mandarin, can have accents to represent
> > tones, but these don't seem to be present... I can confirm though that
> > the accents on Ürümqi, the capital of Xinjiang look correct, pinyin
> > would be Wulumuqi, or with tones, Wūlǔmùqí, Chinese would be 乌鲁木齐.
>
> Ok, that's what I wanted to know. I basically assumed the data was in
> Latin1, so I wanted to check that it was in the right ballpark. No kanji
> anywhere to be found.
>
> > I hope this feedback on the China data is useful... Would be good to see
> > some examples of what data is actually in there... As it stands, the
> > place info in the GNS data looks more detailed, if there's no more info
> > in these files, it would probably be better for me to go back to working
> > on getting that data together, I mostly stopped when we heard about the
> > AND data... Water features seem to be better covered in the places I've
> > looked by PGS. The road data itself seems to be just VMAP0 data...
>
> Well, what I haven't imported yet is:
> - There is some coastline data for the pacific ocean, but we already have
>   that
> - Administrative boundaries: polygons for provinces I believe
>   (examples: Tianjin, Zhejiang, Heilongjiang)
> - A file with a list of GPS coordinates with postcodes and names
>   (romanised)


Province boundaries, very good. There are not so many that we can't
reasonably easily fix up their tagging, e.g. adding the Chinese name and/or
filling in the central point with the standard abbreviation, postcode prefix
etc...

As discussed in IRC, the postcodes file seems to be a file of postcode
prefixes for Asia, without any China data...

> For the boundaries I might try loading them into mapnik direct rather
> than via OSM files, see if I can make it work better...
>
> > So, unless there is more data in the files that I'm not seeing on the
> > map, we can get the same or better data from other sources with global
> > coverage and the only benefit to this data is that we already have
> > tools to convert and import it... Would that be an accurate summary?
>
> Perhaps, at least Mumbai is in there, but that's not in China. All the
> place names include a population (though I can't guess the quality of
> that either). I'd like to provide people access via JOSM so they can
> look for themselves, but there's no simple read-only OSM server that I
> can just drop in and use. I could write one of course...


Population data seem pretty out too...

As you mentioned above, you've already made the .osm files, so that problem
is solved :)

So, in summary, the data we have is (with my opinions on it below)...

Places, cities plus some towns and villages where they are in the road
network.
 - Possibly worthwhile importing, but probably best to merge in the GNS
data, or, only use the GNS data... The fact that the places are nodes in the
road network is probably an issue too...

Place areas, the borders around the urban area.
 - Looks nice, but, probably never very accurate and now, it is very out of
date... Realistically, probably not worth importing...

Airports
 - Probably not worth importing, the locations of the ones I've checked are
not overly accurate, some are way out, (10+km) and we already have airport
data. There are 'service roads' connecting the nearest road to the airport
in a straight line, these are definitely not worth importing I would say...

Water
 - Worthwhile importing, but, needs some more fixing up, and will need some
careful merging with existing, better PGS data near the coast at least...

Coastline
 - Not worth importing, the PGS data seems more recent, higher resolution
and a lot has already been imported and has had a lot of time spent on
tidying up...

Administrative Boundaries
 - Definitely worthwhile importing, will need some tidy up along the coasts,
especially where there are islands, and adding more tagging data, like the
name in Chinese would be useful... This would probably be a bit of data that
would benefit from being lockable...

Roads
 - I'm still not convinced by these... They seem obviously very out of
date, have a low resolution, lack names or references, and in many cases
just seem to join together at a city point, which makes me think that they
don't really add much... Obviously they would make it look like we had data,
but I don't think they really give us anything approaching usable data... As
it stands, I'd say they are not worth importing... Potentially, splitting the
road data out, breaking it down by province and leaving those files
somewhere accessible would be useful. Maybe at some point in the future
someone in a remote location will find that the data is useful for their
area, but even if an area hasn't had any road improvements in the past 10-20
years (pretty unlikely for major roads, even in remote locations), I'm not
convinced the data would give much value there... As I mentioned in IRC, our
data would probably be a more accurate reflection of what's there if we
imported all the GNS places and joined all the large cities together with
straight lines tagged as motorway. Drop a bit of terrain information into
the process to make some curves and it would probably be a lot more
accurate, and we could then even join up smaller cities with
non-motorways... I'm not suggesting we do this, I'm just trying to explain
my feeling about the data: if all we want is less white space, making data
up using some form of guesstimation as to where roads would probably be
built would probably give us something more useful/accurate than this road
data...

d