[Talk-GB] Import UK postcode data?

Thu Oct 3 08:26:46 UTC 2019

On 03/10/2019 01:40, ndrw6 at redhazel.co.uk wrote:
> 
> - Code-Point Open is a legal and open source of postcode data. In fact 
> it is the _only_ legal source of such data in bulk. All other sources 
> are either derived from CPO or are based on local knowledge.

That's not true. The ONS Postcode Directory (ONSPD) products are also 
OGL, at least as far as mainland GB postcodes are concerned (the 
position for NI postcodes is somewhat different). And ONSPD is more 
useful than Code-Point Open, partly because it's more amenable to an 
automated update (you can script a regular download of the latest file, 
unlike OS products which need to be manually ordered each time), and 
partly because it includes more metadata that can also be valuable (for 
example, it includes lookups to GSS codes for a wide range of 
administrative authorities).
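
As a rough illustration of the GSS lookup, here is a minimal Python sketch. 
The column names (pcds for the postcode, oslaua for the local authority 
code) are my assumptions about the CSV layout and should be checked against 
the user guide shipped with the release; the sample rows and codes are 
made up.

```python
import csv
import io

# Made-up rows in the assumed shape of an ONSPD CSV extract.
# The column names "pcds" and "oslaua" are assumptions, not confirmed here.
SAMPLE = """pcds,oslaua
AB1 2CD,E07000001
EF3 4GH,S12000002
"""


def gss_lookup(csv_file, postcode_col="pcds", gss_col="oslaua"):
    """Map each postcode to the GSS code held in the chosen column."""
    reader = csv.DictReader(csv_file)
    return {row[postcode_col]: row[gss_col] for row in reader}


lookup = gss_lookup(io.StringIO(SAMPLE))
```

Passing a different gss_col would give the same lookup for another 
administrative geography, assuming the file carries a column for it.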

> - The key (and deliberate) limitation of Code-Point Open is that it 
> doesn't distinguish between residential postcodes and postcodes 
> assigned to "large users". This is not ideal but still useful - we know 
> the postcode exists at a given location, we just can't be sure if it is 
> the only postcode there.

ONSPD solves this problem, because it includes the "large user" flag.
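
A minimal sketch of using the flag, assuming it is held in a column called 
usertype with "1" meaning large user and "0" meaning small user (both the 
column name and the coding are assumptions to verify against the ONSPD 
user guide; the sample rows are made up):

```python
import csv
import io

# Made-up rows; "usertype" and its 0/1 coding are assumptions.
SAMPLE = """pcds,usertype
AB1 2CD,0
AB1 9ZZ,1
"""


def split_by_user_type(csv_file, postcode_col="pcds", flag_col="usertype"):
    """Return (small_user, large_user) lists of postcodes."""
    small, large = [], []
    for row in csv.DictReader(csv_file):
        if row[flag_col].strip() == "1":
            large.append(row[postcode_col])
        else:
            small.append(row[postcode_col])
    return small, large


small_user, large_user = split_by_user_type(io.StringIO(SAMPLE))
```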

(Slight tangent here: residential postcodes can be "large user" too; for 
example a university hall of residence with a single address point. 
Postcodes themselves don't distinguish between residential and 
commercial use, and that distinction isn't reliably recorded anywhere, 
even in the full PAF, as it is generally irrelevant to Royal Mail's 
purposes. But it is true that most large user postcodes are 
commercial.)

> - Quality of building in OSM database. Large buildings, especially in 
> town centres, are often not partitioned correctly. Different parts may 
> have different street names and postcodes. Code-Point Open may in fact 
> be helpful in finding and correcting such issues.
> 
> - Some postcodes are for PO boxes (usually collocated with post offices) 
> and are best left out.

You can generally identify Post Office based PO Box postcodes simply by 
looking for postcodes that share identical coordinates. But, of course, 
to do that you need to have all of them; you can't do it reliably on a 
postcode-by-postcode basis.
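
The shared-coordinates check could be sketched like this in Python. The 
record layout (postcode, easting, northing) and the sample values are 
hypothetical, and a group of postcodes at one point is only a candidate 
for a PO Box cluster, not proof of one.

```python
from collections import defaultdict


def shared_coordinate_groups(records):
    """Group postcodes by exact coordinates; keep points with 2+ postcodes.

    records: iterable of (postcode, easting, northing) tuples.
    """
    by_point = defaultdict(list)
    for postcode, easting, northing in records:
        by_point[(easting, northing)].append(postcode)
    return {pt: pcs for pt, pcs in by_point.items() if len(pcs) > 1}


# Made-up data: three postcodes sit on exactly the same point.
records = [
    ("AB1 2CD", 100000, 200000),
    ("AB1 9ZX", 100500, 200750),
    ("AB1 9ZY", 100500, 200750),
    ("AB1 9ZZ", 100500, 200750),
]
groups = shared_coordinate_groups(records)
candidates = sorted(pc for pcs in groups.values() for pc in pcs)
```

This needs the whole dataset in one pass, which is the point made above: a 
single postcode on its own says nothing about whether others share its 
coordinates.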

> My recommendation: import missing postcodes "as is" (as points) with 
> extra tags denoting the import, import date and an accuracy metric from 
> CPO. Keep it searchable and easy to remove or update, if necessary. 
> Code-Point Open is updated quarterly and sometimes centroids move to 
> another building. Filter out PO boxes and postcodes which are already in 
> OSM (I usually check if there is an OSM object with a matching 
> addr:postcode within a 10m radius of the code point). Do not attempt to 
> merge them with buildings as it is not guaranteed to work in all cases. 
> This is best done manually and in some cases it may require a survey.

I agree with all of that, with the exception that I'd suggest using 
ONSPD as the source (for the reasons given above). An advantage of using 
ONSPD is that the large user flag tells you which postcodes belong to a 
single large user. Provided they are not also PO Box postcodes, those 
postcodes do accurately and correctly identify a specific building, so 
they can be merged with the building data where possible.
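
For what it's worth, the filter-and-tag step in the quoted recommendation 
might look something like the sketch below. It assumes both datasets are 
already in a metre-based grid (easting/northing), so a plain Euclidean 
distance works for the 10m check. The dict layout, the function names and 
the source:accuracy key are hypothetical; addr:postcode is the key 
mentioned above, and the sample data is made up.

```python
import math


def already_mapped(point, osm_objects, radius_m=10.0):
    """True if an OSM object with the same addr:postcode is within radius_m."""
    for obj in osm_objects:
        if obj["addr:postcode"] != point["postcode"]:
            continue
        distance = math.hypot(
            obj["easting"] - point["easting"],
            obj["northing"] - point["northing"],
        )
        if distance <= radius_m:
            return True
    return False


def import_tags(point, source, import_date):
    """Build the tags for one imported postcode point."""
    return {
        "addr:postcode": point["postcode"],
        "source": source,
        "source:date": import_date,
        "source:accuracy": str(point["accuracy"]),  # hypothetical key
    }


# Made-up data: the first postcode is already in OSM 10m away.
points = [
    {"postcode": "AB1 2CD", "easting": 100000, "northing": 200000, "accuracy": 10},
    {"postcode": "AB1 2CE", "easting": 100300, "northing": 200000, "accuracy": 10},
]
osm_objects = [
    {"addr:postcode": "AB1 2CD", "easting": 100006, "northing": 200008},
]
to_import = [
    import_tags(p, "ONSPD", "2019-10-03")
    for p in points
    if not already_mapped(p, osm_objects)
]
```

Keeping the source and date on every point is what makes a later bulk 
removal or update straightforward.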

Mark