[OSM-dev] Database Schema (was Re: [OSM-talk] Oh dear - planet has duplicate id's)
SteveC
steve at asklater.com
Sat Jun 23 13:57:00 BST 2007
On 23 Jun 2007, at 12:02, Tom Hughes wrote:
> In message <233e28ed4e.tom at loxley.compton.nu>
> Tom Hughes <tom at compton.nu> wrote:
>
>> I now have a migration script which produces a schema that more or
>> less matches the create_database.sql script in the repository.
>>
>> The only place where it doesn't match is two columns which have been
>> declared with an unsigned qualifier in the SQL script, and I propose
>> that we drop the qualifier from those - in at last one case other
>> columns that are logically the same type don't have the qualifier
>> anyway.
>>
>> I would therefore like to propose a plan of action:
>>
>> 1. Somebody dumps the actual schema from the live database so I can
>> make sure it matches create_database.sql and correct anything in
>> my migration script that doesn't quite match the live schema.
>>
>> 2. Commit my fixed migration script and run an SQL script provided
>> by me to drop those unsigned qualifiers from the live database
>> so that the two are fully in sync.
>
> These first two steps are now complete, so the migration script in
> the repository now produces a databases which matches the live one.
>
>> 3. I produce a new migration script to migrate the database to a
>> new schema version that fixes the missing unique indexes and
>> any other oddidities in the current schema.
>
> I've had a look through the schema and have a series of proposals to
> clean things up, which (on a table by table basis) look like:
>
> current_nodes
>
> - Make the index on id a primary key, thus guaranteeing
> uniqueness.
>
> - Split the current index on lat+lon into two separate indexes
> for each axis - as it stands the lon component of the index can
> only be used when exactly one latitude has been selected (more
> or less impossible as it is floating point) so is never used.
> With
> separate indexes the database will be able to decide which is
> most
> efficient for each query and the keys will be smaller so there
> will be less I/O for whichever index it uses.
>
> current_segments
>
> - Make the index on id a primary key, thus guaranteeing
> uniqueness.
>
> - Drop the visible component from the id index as it is pointless
> if the id component is unique anyway. If an index to help find
> only visible components is needed then visible should come
> first.
>
> current_way_segments
>
> - Make segment_id bigint(64) to match other tables.
>
> - Make the index on id a primary key by adding sequence_id.
>
> current_way_tags
>
> - No changes proposed.
>
> current_ways
>
> - No changes proposed.
>
> friends
>
> - Add index on user_id so that a user's friends can be found
> efficiently.
>
> gps_points
>
> - Make gpx_id bigint(64) to match other tables.
>
> - Drop user_id column and index as this information can be
> found by joining to gpx_files and recording the user_id for
> every point seems pointless.
>
> - Split lat/lon index into two separate indexes, dropping the
> user_id component - reasons for this are the same as for nodes.
>
> gpx_gile_tags
>
> - Drop the id field and primary key as they don't seem to serve
> any purpose.
>
> gpx_files
>
> - What is the visible field for here? There are a few records
> with it set to zero but the code ignores it - if it isn't
> used then let's remove it.
isnt that whats used to delete them?
>
> gpx_pending_files
>
> - This seems to be unused so let's remove it.
>
> messages
>
> - Drop user_id column as it serves no purpose (from_user_id
> and to_user_id record who the message is from and to) and
> it is breaking the message functionality at the moment.
>
> - Add index on to_user_id so we can find a user's messages
> efficiently.
>
> - Drop from_display_name column and index as although it is
> filled in it is never used, and the information is available
> by joining to the user table anyway.
>
> meta_areas
>
> - Seems to be unused (but has a few records) and it presumbly
> part of the old area code so should be dropped.
>
> nodes
>
> - Split lat/lon index in to as for current_nodes.
>
> segments
>
> - No changes proposed.
>
> users
>
> - Make email and display name indexes unique.
>
> way_segments
>
> - Make segment_id bigint(64) to match other tables.
>
> way_tags
>
> - No changes proposed.
>
> Please let me know what you think folks...
>
> Tom
>
> --
> Tom Hughes (tom at compton.nu)
> http://www.compton.nu/
>
> _______________________________________________
> dev mailing list
> dev at openstreetmap.org
> http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev
>
have fun,
SteveC | steve at asklater.com | http://www.asklater.com/steve/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/dev/attachments/20070623/f4b00d7c/attachment.html>
More information about the dev
mailing list