[Tile-serving] [openstreetmap/osm2pgsql] Memory usage during phase 2 flex processing (#1535)
mboeringa
notifications at github.com
Sat Jul 10 12:23:38 UTC 2021
> You can't compare the amount of memory used by some Lua structure with how much this will take in the database. Memory usage in Lua will be much larger.
Yes, I realize that the type of data structure used, and the way the data is stored, can make a huge difference. I recently had to handle a somewhat similar issue, where I needed to store unique IDs and the vertex counts of polygons for a multi-threaded Python application. My first, naive approach was to store this information as a nested list-of-lists, with a separate sub-list for each polygon record. With a few hundred million records, memory use soared to over 130 GB. After reading up on Python objects and memory consumption, I finally settled on re-implementing this as one big NumPy array, which probably reduced memory consumption by a factor of about 20.
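To illustrate the effect, here is a minimal sketch comparing a list of `[id, vertex_count]` sub-lists with packed typed arrays. It is not the original application code: the record values are made up, and the standard-library `array` module stands in for the NumPy array so the example has no dependencies. The sizes come from `sys.getsizeof` on CPython and are approximate.

```python
import sys
from array import array


def nested_list_bytes(records):
    """Approximate bytes used by a list of [id, vertex_count] sub-lists.

    Counts the outer list, each sub-list, and each int object. CPython
    shares small ints, so this slightly overestimates the true footprint.
    """
    total = sys.getsizeof(records)
    for rec in records:
        total += sys.getsizeof(rec)
        total += sum(sys.getsizeof(value) for value in rec)
    return total


def packed_bytes(*columns):
    """Bytes used by typed arrays that hold raw values, not Python objects."""
    return sum(sys.getsizeof(col) for col in columns)


n = 100_000  # hypothetical record count, far smaller than the real data set

# Naive layout: one small list object per polygon record.
records = [[10**9 + i, 4 + i % 50] for i in range(n)]

# Packed layout: one typed column per field.
ids = array("q", (rec[0] for rec in records))     # signed 64-bit integers
counts = array("I", (rec[1] for rec in records))  # unsigned ints

nested = nested_list_bytes(records)
packed = packed_bytes(ids, counts)
print(f"nested lists : {nested / n:.1f} bytes per record")
print(f"typed arrays : {packed / n:.1f} bytes per record")
print(f"ratio        : {nested / packed:.1f}x")
```

The exact ratio depends on the Python version and the chosen element types, but the per-object overhead of the nested layout dominates in either case.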
> To figure out where the problem is, I suggest running the exact same config, but with the one line removed where you are actually storing anything in the global variable.
Yes, thanks for the suggestion. I will attempt that. It will take some time before I can report the results though, as I have another process running that I would like to finish first.
For reference, this is the code involved:
```lua
phase2_admin_ways = {}
...
function osm2pgsql.process_way(object)
    if osm2pgsql.stage == 1 then
        if clean_tags(object.tags) then
            return
        end
        local area_tags = isarea(object.tags)
        if object.is_closed and area_tags then
            add_polygon(object.tags)
            if z_order(object.tags) ~= nil then
                add_transport_polygon(object.tags)
            end
        else
            add_line(object.tags)
            if z_order(object.tags) ~= nil then
                add_transport_line(object.tags)
            end
            if roads(object.tags) then
                add_roads(object.tags)
            end
        end
    elseif osm2pgsql.stage == 2 then
        -- Stage two processing is called on ways that are part of admin boundary relations
        local props = phase2_admin_ways[object.id]
        if props ~= nil then
            tables.admin:add_row({
                admin_level = props.level,
                multiple_relations = (props.parents > 1),
                geom = { create = 'line' }
            })
        end
    end
end

function osm2pgsql.process_relation(object)
    -- grab the type tag before filtering tags
    local type = object.tags.type
    object.tags.type = nil
    if clean_tags(object.tags) then
        return
    end
    if type == "boundary" or (type == "multipolygon" and object.tags["boundary"]) then
        add_line(object.tags)
        if roads(object.tags) then
            add_roads(object.tags)
        end
        add_polygon(object.tags)
    elseif type == "multipolygon" then
        add_polygon(object.tags)
        if z_order(object.tags) ~= nil then
            add_transport_polygon(object.tags)
        end
    elseif type == "route" then
        add_line(object.tags)
        add_route(object)
        -- TODO: Remove this, roads tags don't belong on route relations
        if roads(object.tags) then
            add_roads(object.tags)
        end
    end
end

function osm2pgsql.select_relation_members(relation)
    if relation.tags.type == 'boundary'
        and relation.tags.boundary == 'administrative' then
        local admin = tonumber(admin_level(relation.tags.admin_level))
        if admin ~= nil then
            for _, ref in ipairs(osm2pgsql.way_member_ids(relation)) do
                -- Store the lowest admin_level, and how many relations it is used in
                if phase2_admin_ways[ref] == nil then
                    phase2_admin_ways[ref] = {level = admin, parents = 1}
                else
                    if phase2_admin_ways[ref].level == admin then
                        phase2_admin_ways[ref].parents = phase2_admin_ways[ref].parents + 1
                    elseif admin < phase2_admin_ways[ref].level then
                        phase2_admin_ways[ref] = {level = admin, parents = 1}
                    end
                end
            end
            return { ways = osm2pgsql.way_member_ids(relation) }
        end
    end
end
```
Reply to this email directly or view it on GitHub:
https://github.com/openstreetmap/osm2pgsql/issues/1535#issuecomment-877629626