[Tile-serving] [osm2pgsql-dev/osm2pgsql] Proposal for new data types (Issue #2274)

Jochen Topf notifications at github.com
Mon Dec 9 16:48:29 UTC 2024


Several issues have come up related to data types that can be used in an `define_table()` command and how data is converted from Lua to osm2pgsql. In all these cases it is possible to use the generic `sql_type` mechanism of osm2pgsql to reach the intended goal, but it is a bit awkward, always needs explaining and exposes the user to possible errors which are hard to understand (cryptic error messages from `COPY`). It would be nice, if we can say: "Yes, osm2pgsql supports these common constructs out of the box".

## Geography

Osm2pgsql has always supported the *geometry* datatype with its subtypes like *Point*, *Linestring*, etc. and the setting of the SRID. But it doesn't natively support *geography* data types with their variants. It is easy enough to work around this using the `sql_type` setting, but still a bit awkward. This has been [discussed here](https://github.com/osm2pgsql-dev/osm2pgsql/discussions/2152).

Proposal: Create new datatypes `geography`, `geography-point`, `geography-linestring`, and so on. Default projection would be 4326. Generate an error if the projection is not valid for a geography data type.

Variant A: Also add `geometry-point` as alias for `point` geometry and so on, giving us a consistent naming scheme.

Implementation: Is trivial, just the new types need to be recognized and the different default for the projection. There is no need to write special WKB or so.

## Identity

Unique IDs on tables are often useful. Sometimes they need to be generated. How to do this is [documented in the manual](https://osm2pgsql.org/doc/manual.html#using-an-additional-id-column). But this is a bit awkward and we always get questions on this and the use of the `serial` type.

Proposal: Create new types `id2`, `id4`, and `id8` (with aliases `smallid`, `id`, and `bigid`) that create integer [identity columns](https://www.postgresql.org/docs/current/ddl-identity-columns.html) of the specified size with `GENERATED ALWAYS AS IDENTITY`. It will also set `create_only` to `true` and `not_null`, i.e. osm2pgsql will not try to fill this column.

Variant A: Use `identity` instead of `id` in the type names. Not so easy to confuse with `int`.

Variant B: Also automatically generate a unique index for all ID columns. Might be too "magic", and not every use case needs one.

Variant C: Add a `sequence` option, which allows setting the sequence name instead of creating a default one. For special use cases (say using the same ID space for several tables), users can do a [`CREATE SEQUENCE`](https://www.postgresql.org/docs/current/sql-createsequence.html) before running osm2pgsql and then refer to that.

Implementation: Only the new types need to be recognized and a few SQL templates extended. A bit more to do if a sequence can be set, but still easily done.

## Arrays

PostgreSQL can store arrays of any type. We use this in osm2pgsql to store the list of member node IDs of a way, for instance. Currently you need to define them in Lua as `sql_type = 'int8[]'` etc. and then build the context as text yourself: `nodes = '{' .. table.concat(object.nodes, ',') .. '}'`. [See also the question in discussions forum on this](https://github.com/osm2pgsql-dev/osm2pgsql/discussions/2216).

Proposal: Add new data types `int-array` and its variants for the different integer types and add an automatic conversion from Lua tables with integers. The user doesn't have to write the conversion any more and we can generate better error message if the data is invalid for the type.

Variant A: Allow scalar values in conversion, will result in a single-element array.

Variant B: Also add this for some other scalar types. The only ones that make sense are probably `real` and `text`.

Variant C: Use the PostgreSQL syntax `int[]` and its variants instead.

Variant D: Make this a general feature that works for all data types, i.e. adding `[]` makes any type into an array.

Implementation: Add the new data types and new conversion functions. Depends on the variant how much work this will be.


-- 
Reply to this email directly or view it on GitHub:
https://github.com/osm2pgsql-dev/osm2pgsql/issues/2274
You are receiving this because you are subscribed to this thread.

Message ID: <osm2pgsql-dev/osm2pgsql/issues/2274 at github.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/tile-serving/attachments/20241209/bf1ede59/attachment-0001.htm>


More information about the Tile-serving mailing list