On 04/12/2007, <b class="gmail_sendername">Martijn van Oosterhout</b> <<a href="mailto:kleptog@gmail.com">kleptog@gmail.com</a>> wrote:<div><span class="gmail_quote"></span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Dec 4, 2007 5:07 AM, D Tucny <<a href="mailto:d@tucny.com">d@tucny.com</a>> wrote:<br>> > - Many many placenames, especially in India<br>><br>> All the placenames in China are in their romanised form, does the data have
<br>> the Chinese names?<br><br>I've checked the files by visual inspection but no, it appears only<br>romanised names are available.</blockquote><div><br>I wonder if it would be worth trying to merge this with GNS data, I'll try to have a look at coming up with something to do that this evening... At least comparing GNS vs this data would be useful I think... The GNS data has the Chinese names too...
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> > - Lots of the major regional roads connecting the towns<br>><br>> For China at least, as mentioned previously, the roads are, at best, very
<br>> very out of date... most expressways (motorways) are missing and the<br>> accuracy of the other roads is very low... Does the data have any more<br>> details on roads, such as names or references? That would make it easier to
<br>> try and link them to real data... luckily though, the data is so sparse that<br>> merging shouldn't be an issue as long as there's a way to work out what<br>> should be linked where...<br><br>Are they wildly off, or do you mean that there's not many points per
<br>kilometre? I don't see any reference, though how do they refer to<br>their roads, by number, by code, by name???</blockquote><div><br>As mentioned in IRC, yes, the resolution is low, but, most of the roads I've been looking at have been replaced...
<br>My guesstimate for the age of the data is the early to mid 90s; since then, around 50,000 km of expressway has been built and the road network overall has grown by approximately 1 million km... The majority of these new roads are intercity links, which is all this data contains...
<br><br>Roads are referred to by name in cities, name and reference on intercity routes... <br>The references refer to the road type typically, 'A' being expressways (Motorways), 'G' being national highways, 'S' being provincial highways, 'X' being county roads...
<br>However... this isn't entirely standardised... for example the 'A' references seem mostly used around Shanghai, but not so much if at all in other parts of China... <br>All expressways however do have names, typically referring to their start and end points in a shortened form, however, these names seem to often change along the route,
e.g. for an expressway running between two cities in different provinces, the name order may switch at the tolls at the province border... For instance, Hangning Expressway (杭宁高速(公路)) is the name used in Zhejiang province for the expressway to Nanjing, but cross the border into Jiangsu and the same expressway is named Ninghang Expressway (宁杭高速); in this example Hang (杭) refers to Hangzhou and Ning (宁) is the common abbreviation for Nanjing... Huhang Expressway (沪杭高速), also referred to as A8, runs from Shanghai (Hu/沪) to Hangzhou; however, it's also referred to in parts as Huhangyong Expressway (沪杭甬高速), where Yong (甬) refers to Ningbo, which the expressway extends to after Hangzhou...
<br><br>e.g.<br><br>A8 is the Huhang(yong) expressway, though lots of expressways exist without a numeric reference... The names, and references where they exist, should be nationally unique...<br>G320 is National Highway 320, which might be signed as 320国道, 320 or G320, they should be nationally unique...
<br>S02 is a province level highway, numbered 02, each province has its own numbering... <br><br>I'm still searching for any of the roads in that data and I'll let you know if I find any... As you've mentioned, there are no references or names of roads in this data...
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> > There are also lakes and some rivers.<br>><br>> The lakes, rivers and included coastline info looks way off, some of it
<br>> might have been the way representing a long long time ago, the PGS data for<br>> the areas I've looked gives a much more accurate representation of those<br>> features... Also, at least one lake I've looked at looks very broken, only
<br>> seems to have the south west corner of its coast with the open ends<br>> seemingly just joined together to make a small slice of the lake..<br><br>The coastlines are not from the AND data. It's using standard mapnik
<br>coastlines for zoom 9, but going all the way down because the normal<br>coastline data file breaks horribly at high zoom. I thought I'd fixed<br>it but apparently not. And I can't use the coastline out of the main
<br>DB 'cause it's not there yet.<br><br>I'll see what I can find out about the rivers and such.</blockquote><div><br>The lake mentioned above has been fixed in the latest .osm file, the small slice seen before was a result of the lake being split along the province border which runs through the lake.
<br><br>While the coast might not be there, the province borders do follow the coast and islands, so for provinces with a coastline, this will probably need something doing to it... <br><br>From looking through the data, there is a considerable overlap with the PGS data closer to the coast, up to about 300 km inland... The PGS data is higher resolution and looks quite a bit more recent, so some form of manual import of the water along these areas would probably be beneficial, as the water data does have a few benefits: 1) data further inland, 2) some lakes and rivers have names... This will need tidying up somewhat, as it stands... Rivers are currently tagged as waterway=river while what is typically tagged is an area... islands still look a bit wrong as mentioned in my last post...
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> There are only really accents on the romanised forms used in Xinjiang,<br>
> pinyin, the romanisation of mandarin, can have accents to represent tones,<br>> but these don't seem to be present... I can confirm though that the accents<br>> on Ürümqi, the capital of Xinjiang look correct, pinyin would be Wulumuqi,
<br>> or with tones, Wūlǔmùqí, Chinese would be 乌鲁木齐.<br><br>Ok, that's what I wanted to know. I basically assumed the data was in<br>Latin1, so I wanted to check that it was in the right ballpark. No kanji<br>anywhere to be found.
<br><br>> I hope this feedback on the China data is useful... Would be good to see<br>> some examples of what data is actually in there... As it stands, the place<br>> info in the GNS data looks more detailed, if there's no more info in these
<br>> files, it would probably be better for me to go back to working on getting<br>> that data together, I mostly stopped when we heard about the AND data...<br>> Water features seem to be better covered in the places I've looked by PGS.
<br>> The road data itself seems to be just VMAP0 data...<br><br>Well, what I haven't imported yet is:<br>- There is some coastline data for the Pacific Ocean, but we already have that<br>- Administrative boundaries: polygons for provinces I believe
<br>(examples: Tianjin, Zhejiang, Heilongjiang)<br>- A file with a list of GPS coordinates with postcodes and names (romanised)</blockquote><div><br>Province boundaries, very good, not that many that we can't reasonably easily fixup the tagging of them,
e.g. adding Chinese names and/or filling in the central point with the standard abbreviation, postcode prefix etc... <br></div><br>As discussed in IRC, the postcodes file seems to be a file of postcode prefixes for Asia, without any China data...
<br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">For the boundaries I might try loading them into mapnik direct rather<br>than via OSM files, see if I can make it work better...
<br><br>> So, unless there is more data in the files that I'm not seeing on the map,<br>> we can get the same or better data from other sources with global coverage<br>> and that the only benefit to this data is that we already have tools to
<br>> convert and import it... Would that be an accurate summary?<br><br>Perhaps, at least Mumbai is in the but that's not in China. All the<br>places names include a population (though I can't guess the quality of
<br>that either). I'd like to provide people access via JOSM so they can<br>look for themselves, but there's no simple read-only OSM server that I<br>can just drop in and use. I could write one of course...</blockquote>
<div><br>Population data seem pretty out too... <br><br>As you mentioned above, you've already made the .osm files, so that problem is solved :)<br> <br>So, in summary, the data we have is (with my opinions on it below)...
<br><br>Places, cities plus some towns and villages where they are in the road network.<br> - Possibly worthwhile importing, but probably best to merge in the GNS data, or, only use the GNS data... The fact that the places are nodes in the road network is probably an issue too...
<br><br>Place areas, the borders around the urban area.<br> - Looks nice, but probably never very accurate, and now very out of date... Realistically, probably not worth importing...<br><br>Airports<br> - Probably not worth importing, the locations of the ones I've checked are not overly accurate, some are way out (10+ km), and we already have airport data. There are 'service roads' connecting the nearest road to the airport in a straight line, these are definitely not worth importing I would say...
<br><br>Water<br> - Worthwhile importing, but, needs some more fixing up, and will need some careful merging with existing, better PGS data near the coast at least...<br><br>Coastline<br> - Not worth importing, the PGS data seems more recent, higher resolution and a lot has already been imported and has had a lot of time spent on tidying up...
<br><br>Administrative Boundaries<br> - Definitely worthwhile importing, will need some tidy up along the coasts, especially where there are islands, and adding more tagging data, like the name in Chinese would be useful... This would probably be a bit of data that would benefit from being lockable...
<br><br>Roads<br> - I'm still not convinced by these... The fact that they seem obviously very out of date, have a low resolution, lack names or references and in many cases just seem to join together at a city point makes me think that they don't really add much... Obviously they would make it look like we had data, but I don't think they really give us anything approaching usable data... As it stands, I'd say they are not worth importing... Potentially splitting the road data out, breaking it down by province and leaving those files somewhere accessible would be useful; maybe at some point in the future someone in a remote location may find that the data is useful for their area, but even if an area hasn't had any road improvements in the past 10-20 years, which is pretty unlikely for major roads even in remote locations, I'm not convinced the data would give much value there... As I mentioned in IRC, our data would probably be a more accurate reflection of what's there if we imported all the GNS places and joined all large cities together with straight lines tagged as motorway; drop a bit of terrain information into the process to make some curves and it would probably be a lot more accurate, and we could then even join up smaller cities with non-motorways... I'm not suggesting we do this, I'm just trying to explain my feeling about the data: if all we want is less white space, making data up using some form of guesstimation as to where roads would probably be built would probably give us something more useful/accurate than this road data...
<br><br>d<br></div></div>