[OSM-ja] Japan post code polygons

Tom Lee tlee @ mapbox.com
Fri Jan 30 16:05:35 UTC 2015


I have investigated the Japan Post CSV. It uses a JISX0402 code for one of
its columns, which allows it to be joined to shapefile data from MLIT/KSJ2.
Each JISX0402 code corresponds to an administrative district, and may be
composed of many polygons (islands, for instance). Here is the MLIT
shapefile data at maximum resolution:

https://api.tiles.mapbox.com/v4/sbma44.46ed6987/page.html?access_token=pk.eyJ1Ijoic2JtYTQ0IiwiYSI6Inh1cm5teEEifQ.LFnEfmyK7mtxU5O64ID4ZA#6/38.316/139.032

There are 1900 JISX0402 codes. But (if my e-Stat polygon combination is
correct) there are 106450 postcodes. This means that joining the Japan Post
CSV to the MLIT polygons leaves an average of 56 postcodes sharing a single
polygon. This might be acceptable for some uses, but in my testing it is
insufficient for high-quality geocoding. (I have completed this work,
however, and if such a product is useful to others I will be happy to share
it, thanks to MLIT's open license and the CC0 status of the CSV files.)
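
To make the arithmetic concrete, here is a minimal Python sketch of grouping
postcodes by district code and averaging the group sizes. The column
positions (district code first, 7-digit postcode third) and the sample rows
are assumptions made for illustration; check them against the real CSV
before relying on this.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the ASSUMED layout: district code, old postcode, 7-digit postcode.
SAMPLE_CSV = """\
01101,"060  ","0600000"
01101,"064  ","0640941"
01101,"060  ","0600041"
13101,"100  ","1000000"
13101,"102  ","1020072"
"""


def postcodes_by_district(csv_text, code_col=0, postcode_col=2):
    """Group the distinct postcodes found under each district code."""
    groups = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        groups[row[code_col]].add(row[postcode_col])
    return groups


def average_per_district(groups):
    """Mean number of distinct postcodes per district code."""
    return sum(len(codes) for codes in groups.values()) / len(groups)


groups = postcodes_by_district(SAMPLE_CSV)
print(average_per_district(groups))  # 2.5 for the sample rows only
print(round(106450 / 1900))          # 56, from the counts quoted above
```

Any polygon keyed only by the district code inherits every postcode in its
group, which is why the join cannot separate them.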

One possibility: the Japan Post post office/ATM locator
<http://map.japanpost.jp/pc/map.php?el=141.12.49.655&nl=38.25.23.603&scl=70000>
provides many points with full postcodes and lat/lon coordinates. My
initial experiments show that there may be as many as 25,000 such
point/postcode pairs in this tool. Because I do not speak Japanese and am
unfamiliar with Japanese law, I am unsure of whether we can collect and
reuse these points. Satoshi, if you are able, perhaps you could investigate
this question? If it appears that we cannot use them, perhaps you could
also email the Japan Post contact page? I have had success in the past when
I asked them about the license status of the postal code CSV files.
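
For reference, this is roughly how point/postcode pairs could be used once
a usable source exists: test which polygon contains each point and give the
polygon the most common postcode among its points. It is a pure-Python
sketch with made-up squares and postcodes, it ignores holes and multi-part
polygons, and the function names are hypothetical. With real data a spatial
database test such as PostGIS ST_Contains would be the more practical
choice.

```python
from collections import Counter


def point_in_ring(x, y, ring):
    """Ray-casting test: is (x, y) inside the ring of (x, y) vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_postcodes(polygons, points):
    """Give each polygon the most common postcode among the points inside it.

    polygons: dict of polygon id -> list of (lon, lat) vertices, no holes.
    points: iterable of (lon, lat, postcode) tuples.
    Polygons that contain no points are left out of the result.
    """
    votes = {pid: Counter() for pid in polygons}
    for lon, lat, postcode in points:
        for pid, ring in polygons.items():
            if point_in_ring(lon, lat, ring):
                votes[pid][postcode] += 1
                break  # count each point for at most one polygon
    return {pid: c.most_common(1)[0][0] for pid, c in votes.items() if c}


# Made-up unit squares and points, for illustration only.
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(2, 0), (3, 0), (3, 1), (2, 1)],
}
points = [
    (0.5, 0.5, "1000001"),
    (0.2, 0.8, "1000001"),
    (0.9, 0.1, "1000002"),
    (1.5, 0.5, "1000003"),
]
result = assign_postcodes(polygons, points)
print(result)
```

Polygon "C" receives no postcode because no point falls inside it, which is
the coverage problem that a larger set of points would reduce.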

Are there other businesses in Japan that have many locations, ATMs or
vending machines? For instance: what are the most popular banks? Knowing
this could help identify additional data sources, if Japanese law allows it
(under US law I believe such work is permissible, because the data will be
substantially transformed).

As always, thank you for your help and advice.

Tom



On Thu, Jan 29, 2015 at 9:03 PM, Satoshi IIDA <nyampire at gmail.com> wrote:

>
> Although I have not checked it carefully,
> this dataset (address of post office) might be helpful.
>
> http://www.post.japanpost.jp/zipcode/dl/jigyosyo/zip/jigyosyo.zip
> http://www.post.japanpost.jp/zipcode/dl/jigyosyo/index-zip.html
>
> The owner, "Japan Post Co.", does not claim copyright on this dataset,
> just as with their post code dataset.
> (I guess it is almost the same as CC0.)
>
> http://www.post.japanpost.jp/zipcode/dl/jigyosyo/readme.html
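> > Japan Post Co., Ltd. does not claim copyright on the large-business
> > individual postcode data. You may distribute it freely. No permission
> > from Japan Post Co., Ltd. is required.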
>
> But this dataset does not contain coordinates.
> So it needs to be joined using the neighbourhood name column.
>
> 4th column "都道府県名" is the prefecture name.
> 5th column "市区町村名" is the city/town/village name.
> 6th column "町域名" is the neighbourhood name.
> 8th column "大口事業所個別番号" is the postcode (7 digits).
>
> I'll look for other datasets. :)
>
>
>
> 2015-01-30 9:06 GMT+09:00 Tom Lee <tlee at mapbox.com>:
>
>> Unfortunately the school address column does not include the postal code
>> in either of these files. Here's sample output from ogrinfo. At this point
>> I'm afraid I don't have many good ideas about how to finish this work,
>> unless someone else from the Japanese mapping community has a suggestion.
>>
>> schools:
>> OGRFeature(P29-13):7438
>>   P29_001 (String) = 35201
>>   P29_002 (String) = 16
>>   P29_003 (String) = 16002
>>   P29_004 (String) = 16002
>>   P29_005 (String) = 玄洋中学校
>>   P29_006 (String) = 彦島本村町2-8-1
>>   P29_007 (Real) = 3.000000
>>   X (Real) = 130.906788
>>   Y (Real) = 33.942762
>>   No (String) = 58
>>   FLG (String) = 1
>>   Memo (String) = H18KSJ
>>   Memo2 (String) = (null)
>>   F14 (String) = (null)
>>   F15 (String) = (null)
>>   F16 (String) = (null)
>>   F17 (String) = (null)
>>   memo3 (String) = (null)
>>   F18 (String) = (null)
>>
>> post offices:
>> OGRFeature(P30-13):10320
>>   P30_001 (String) = 26344
>>   P30_002 (String) = 18
>>   P30_003 (String) = 18002
>>   P30_004 (String) = 18006
>>   P30_005 (String) = 郷ノ口郵便局
>>   P30_006 (String) = 郷之口本町16-2
>>   P30_007 (Integer) = 0
>>   X (Real) = 135.85202000000
>>   Y (Real) = 34.85229400000
>>   No (Integer) = 446
>>   FLG (Integer) = 1
>>   Memo (String) = H18KSJ
>>   Memo2 (String) = (null)
>>   memo3 (String) = (null)
>>   行政コード (String) = (null)
>>
>> On Thu, Jan 29, 2015 at 5:52 AM, Satoshi IIDA <nyampire at gmail.com> wrote:
>>
>>>
>>> > e-Stat has confirmed
>>> <https://gist.github.com/sbma44/2805ee5c0e8dc2825631> via email that
>>> their data, when transformed, may be used and redistributed with
>>> attribution.
>>> Great!
>>> In fact, "丁目 (neighbourhood)" polygon data under an open license is
>>> very rare,
>>> and I'm super happy that we can use it under the "when transformed" terms.
>>>
>>> > shapefile
>>> What are the digits in the attribute table?
>>> They do not seem to be post codes.
>>>
>>> > address data with geometry
>>> If I understand your motivation correctly,
>>> the best-known data is Kokudo_Suuchi_Joho (KSJ2), which can even be used
>>> in OSM.
>>> http://wiki.openstreetmap.org/wiki/Import/Catalogue/Japan_KSJ2_Import
>>>
>>> * list - http://nlftp.mlit.go.jp/ksj/
>>>
>>> e.g.
>>> * post office -
>>> http://nlftp.mlit.go.jp/ksj/gml/datalist/KsjTmplt-P30.html
>>>   code "P30_006" is address column.
>>> * school - http://nlftp.mlit.go.jp/ksj/gml/datalist/KsjTmplt-P29.html
>>>   code "P29_006" is address column.
>>>
>>> But the address notation is very messy.
>>>
>>> Does anyone know of an open dataset that pairs post codes with addresses
>>> (and latitude/longitude)?
>>>
>>>
>>>
>>>
>>> 2015-01-29 8:51 GMT+09:00 Tom Lee <tlee at mapbox.com>:
>>>
>>>> I've made a bit more progress. First, e-Stat has confirmed
>>>> <https://gist.github.com/sbma44/2805ee5c0e8dc2825631> via email that
>>>> their data, when transformed, may be used and redistributed with
>>>> attribution.
>>>>
>>>> Second, I have successfully joined the data together using the methods
>>>> I described above. Below is a link to the compressed shapefile (~240MB).
>>>>
>>>> https://www.dropbox.com/s/j9sfaogofg5tkkr/japan_estat_joined.zip?dl=0
>>>>
>>>> I would be grateful for any feedback you can offer on the correctness
>>>> of this geometry, suggestions for means of evaluating it, or how a postal
>>>> code might be assigned to each polygon.
>>>>
>>>> At the moment I believe I need a source of point geometry that can be
>>>> used to assign postcodes to these polygons. I have working code written
>>>> using some restaurant locations pulled from the web, but this only covers
>>>> about 1% of the polygons. If anyone has appropriate data available under an
>>>> acceptably open license, please let me know if you'd be willing to share
>>>> it! I have not done much research, but I can imagine that voting locations,
>>>> school locations or other public data might be appropriate to this use.
>>>>
>>>> Thanks very much for any advice or thoughts you might have.
>>>>
>>>> Tom Lee
>>>>
>>>>
>>>>
>>>> On Tue, Jan 27, 2015 at 10:25 AM, Tom Lee <tlee at mapbox.com> wrote:
>>>>
>>>>> Thank you! This is quite encouraging. I am unable to read Japanese,
>>>>> but Google Translate makes your interpretation -- that distributing
>>>>> modified data is okay -- seem reasonable to me. I will email e-Stat for
>>>>> clarification, and would welcome any thoughts that others on this list
>>>>> might have about this.
>>>>>
>>>>> Thank you also for the jamfunk.jp links. This is detail about the
>>>>> Japan Post CSV that I did not know, and which will certainly be useful. I
>>>>> do not believe that it contains a mapping that would allow postcodes to be
>>>>> connected to the geometry derived from e-Stat. However, I do have a
>>>>> database of zip code centroids for Japan which could be used. I will have
>>>>> to check the licensing and see if it can be used to create a
>>>>> redistributable product.
>>>>>
>>>>>
>>>>> On Mon, Jan 26, 2015 at 11:14 PM, Satoshi IIDA <nyampire at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Amazing work!
>>>>>>
>>>>>> 1. source of geo-data
>>>>>> At first glance, e-Stat's data is not "Open" in the sense we use.
>>>>>> Redistribution of the data is forbidden by the Terms of Use.
>>>>>>
>>>>>> http://e-stat.go.jp/SG2/eStatFlex/help/content/_73.html#B007
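>>>>>> > B-7. Can downloaded data be used for the purpose of providing it to
>>>>>> > third parties?
>>>>>> > Copying data downloaded from this system (including copying with a
>>>>>> > converted file format) and transferring it as-is to a third party is
>>>>>> > prohibited.
>>>>>> > For details, please see the "Notes on Use" for download data.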
>>>>>>
>>>>>> http://e-stat.go.jp/SG2/eStatFlex/help/content/_72.html
>>>>>> > 2. Restrictions on use
>>>>>> >
>>>>>> > Users are prohibited from copying data and image data downloaded
>>>>>> > from this system as-is (including copying with a converted file
>>>>>> > format) and transferring it to a third party.
>>>>>>
>>>>>> Maybe "そのまま複製 (copied as-is)" in this sentence means that
>>>>>> modified data, for example with data added, may be distributed.
>>>>>> Is my understanding the same as yours? :)
>>>>>>
>>>>>> 2. Combination of "丁目" polygons and ZIP-code digits
>>>>>> Perfect correspondence would be difficult, but it is worth tackling!
>>>>>>
>>>>>> Well-known errors in the ZIP CSV are summarized on this site.
>>>>>> http://jamfunk.jp/wp/?page_id=356
>>>>>> http://jamfunk.jp/wp/?p=390
>>>>>> http://jamfunk.jp/wp/?p=417
>>>>>>
>>>>>> I guess the most annoying are the "○○の一部 (part of XXX chome)"
>>>>>> descriptions. They are well known around Iwate Prefecture.
>>>>>> http://www.city.morioka.iwate.jp/sumai/jukyohyoji/tsushida/008020.html
>>>>>>
>>>>>> In other words, I guess we could build 99% of the data (everything
>>>>>> except those errors).
>>>>>>
>>>>>> Best!
>>>>>>
>>>>>>
>>>>>> 2015-01-27 7:10 GMT+09:00 Tom Lee <tlee at mapbox.com>:
>>>>>>
>>>>>>> Update: I have spent some time experimenting with the Census
>>>>>>> shapefiles, and it seems as though one of their ID fields might be usable
>>>>>>> for joining census polygons into postal code polygons. Specifically:
>>>>>>>
>>>>>>> shp2pgsql -W SJIS h22ka13115.shp tokyo1 | psql japan
>>>>>>>
>>>>>>> echo "create table tokyozip as select left(KEY_CODE, 10) as
>>>>>>> KEY_CODE, st_setsrid(st_union(st_buffer(geom,0)),4326) as geom from tokyo1
>>>>>>> group by left(KEY_CODE, 10);" | psql japan
>>>>>>>
>>>>>>> Was used to generate the following shapefile:
>>>>>>>
>>>>>>> http://cl.ly/3p2V1p400h3b/possible_tokyo_postcode.zip
>>>>>>>
>>>>>>> Assigning the correct post code is still a problem to be solved. I
>>>>>>> also don't have as much data (or familiarity with Japanese post codes) as I
>>>>>>> would like to test this hypothesis. Any advice will be much appreciated.
>>>>>>>
>>>>>>> http://i.imgur.com/JMYR09w.jpg
>>>>>>>
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>> On Mon, Jan 26, 2015 at 3:37 PM, Tom Lee <tlee at mapbox.com> wrote:
>>>>>>>
>>>>>>>> I have been trying to find geometry that corresponds to Japanese
>>>>>>>> postal codes (sometimes also called zip codes). I initially joined Japan
>>>>>>>> Post's CSV download to MLIT's administrative boundary shapefile, but this
>>>>>>>> has proven to be too low-resolution.
>>>>>>>>
>>>>>>>> I have found the PAREA Zip product
>>>>>>>> <http://www.parea.jp/datebase/area_map/index.html>, but of course
>>>>>>>> an open source of data would be preferable.
>>>>>>>>
>>>>>>>> I am particularly curious to know whether E-Stat/Census data can be
>>>>>>>> used to create postal code polygons. If you visit this URL:
>>>>>>>>
>>>>>>>> http://e-stat.go.jp/SG2/eStatGIS/page/download.html
>>>>>>>>
>>>>>>>> and select "平成22年国勢調査(小地域) 2010/10/01" (2010 Population
>>>>>>>> Census, small areas)
>>>>>>>>
>>>>>>>> You can then choose a smaller area and download a high-resolution
>>>>>>>> mesh as a shapefile. That file's field definitions can be found here:
>>>>>>>>
>>>>>>>>
>>>>>>>> http://e-stat.go.jp/SG2/eStatFlex/help/content/downloaddata/A002005212010.pdf
>>>>>>>>
>>>>>>>> Here is one such shapefile in QGIS, overlaid on Bing aerial
>>>>>>>> imagery: http://i.imgur.com/7z1dhn4.jpg
>>>>>>>>
>>>>>>>> Although the polygons are well-indexed, they do not seem to
>>>>>>>> correspond to postal codes.
>>>>>>>>
>>>>>>>> Is anyone aware of a means of mapping the data included in this
>>>>>>>> shapefile to postal codes? I would be very glad to share the results of my
>>>>>>>> efforts under an open license, should I prove able to solve this problem
>>>>>>>> (E-Stat's license seems to make this possible).
>>>>>>>>
>>>>>>>> Thanks very much!
>>>>>>>>
>>>>>>>> Tom Lee
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Talk-ja mailing list
>>>>>>> Talk-ja at openstreetmap.org
>>>>>>> https://lists.openstreetmap.org/listinfo/talk-ja
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Satoshi IIDA
>>>>>> mail: nyampire at gmail.com
>>>>>> twitter: @nyampire
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
>