[Tile-serving] [openstreetmap/osm2pgsql] Gazetteer: move main tag filtering into processing function (#1018)
Jochen Topf
notifications at github.com
Fri Dec 6 10:14:00 UTC 2019
joto requested changes on this pull request.
This is a more superficial review as I don't understand the intricacies of what you are doing here.
>
- for (auto const &item : tags) {
- char const *k = item.key();
- if (prefix.compare(0, plen, k) == 0 &&
- (k[plen] == '\0' || k[plen] == ':')) {
- ret.push_back(&item);
+ char const *operator ()(osmium::Tag const &t) const noexcept
+ {
+ if (strncmp(t.key(), m_domain, m_len) == 0
We should get into the habit of properly using `std::strncmp` etc. Strictly speaking this gives you the C++ standard version of that function whil `strncmp` is the old C function. This might not matter for `strncmp`, but it does matter in some cases, for instance `std::abs()` vs. `abs()`, because `std::abs()` has overloads for different types, but `abs()` hasn't.
Same thing above with `strlen`, other cases of `strncmp` etc.
> - if (!postcode) {
- postcode = v;
- }
- } else if (strcmp(addr_key, "country") == 0) {
- if (!country && strlen(v) == 2) {
- country = v;
- }
- } else {
- bool first = std::none_of(
- m_address.begin(), m_address.end(), [&](ptag_t const &t) {
- return strcmp(t.first, addr_key) == 0;
- });
- if (first) {
- m_address.emplace_back(addr_key, v);
- }
+ bool first = std::none_of(
`bool const`?
> }
+
+ return nullptr;
The `DomainMatcher` class is sufficiently complex that it should have some tests.
> {
- std::vector<osmium::Tag const *> domain_name;
- if (std::get<2>(tag) & SF_MAIN_NAMED_KEY) {
- domain_name = domain_names(std::get<0>(tag), o.tags());
- if (domain_name.empty()) {
- return false;
- }
- }
+ // first throw away unnamed mains
+ auto mend = std::remove_if(m_main.begin(), m_main.end(), [&](pmaintag_t const &t) {
+ auto flags = std::get<2>(t);
`auto const`? More like these below, I will not mention it again.
> return false;
- }
+ });
+
+ // any non-fallback mains left?
+ bool has_primary =
+ std::any_of(m_main.begin(), mend, [](pmaintag_t const &t) {
+ return !(std::get<2>(t) & SF_MAIN_FALLBACK);
+ });
+
+ if (has_primary) {
+ // remove all fallbacks
+ mend = std::remove_if(m_main.begin(), mend, [&](pmaintag_t const &t) {
Should probably not reuse `mend` here.
> +{
+ for (auto const &tag : m_main) {
+ buffer.new_line();
+ // osm_id
+ buffer.add_column(o.id());
+ // osm_type
+ char const osm_type[2] = {
+ (char)toupper(osmium::item_type_to_char(o.type())), '\0'};
+ buffer.add_column(osm_type);
+ // class
+ buffer.add_column(std::get<0>(tag));
+ // type
+ buffer.add_column(std::get<1>(tag));
+ // names
+ if (std::get<2>(tag) & SF_MAIN_NAMED_KEY) {
+ DomainMatcher m(std::get<0>(tag));
`m{...}`
>
- if (first) {
+ if (first) {
+ buffer.add_null_column();
+ } else {
+ buffer.finish_hash();
Rather complex code to get the `new_hash/finish_hash` pair right. Can we either
1. just use an empty hash instead of a null (which would greatly simplify this code) or
2. wrap this somehow in a class?
> - // admin_level
- buffer.add_column(m_admin_level);
- // address
- if (m_address.empty()) {
- buffer.add_null_column();
- } else {
- buffer.new_hash();
- for (auto const &a : m_address) {
- if (strcmp(a.first, "tiger:county") == 0) {
- std::string term;
- auto *end = strchr(a.second, ',');
- if (end) {
- auto len = (std::string::size_type)(end - a.second);
- term = std::string(a.second, len);
+ buffer.new_hash();
+ for (auto const &a : m_address) {
One-letter variable names are so 80s :-)
> - buffer.add_column(m_admin_level);
- // address
- if (m_address.empty()) {
- buffer.add_null_column();
- } else {
- buffer.new_hash();
- for (auto const &a : m_address) {
- if (strcmp(a.first, "tiger:county") == 0) {
- std::string term;
- auto *end = strchr(a.second, ',');
- if (end) {
- auto len = (std::string::size_type)(end - a.second);
- term = std::string(a.second, len);
+ buffer.new_hash();
+ for (auto const &a : m_address) {
+ if (strcmp(a.first, "tiger:county") == 0) {
This code is complex enough for an extra function with tests.
> @@ -53,6 +53,22 @@ class db_deleter_place_t
std::vector<item_t> m_deletables;
};
+class gazetteer_copy_mgr_t : public db_copy_mgr_t<db_deleter_place_t>
+{
+public:
+ gazetteer_copy_mgr_t(std::shared_ptr<db_copy_thread_t> const &processor)
+ : db_copy_mgr_t<db_deleter_place_t>(processor),
+ m_table(std::make_shared<db_target_descr_t>("place", "place_id"))
+ {}
+
+ using db_copy_mgr_t<db_deleter_place_t>::new_line;
+
+ void new_line() { new_line(m_table); }
Maybe this is a bit too "clever" with the `using` and the overload?
> + CHECK(1 == node_count(conn, 201, "building"));
+ CHECK(0 == node_count(conn, 202, "building"));
+}
+
+TEST_CASE("Main tag deleted")
+{
+ import("n1 Tamenity=restaurant x12.3 y3\n"
+ "n2 Thighway=bus_stop,railway=stop,name=X x56.4 y-4\n");
+
+ auto conn = db.connect();
+
+ CHECK(1 == node_count(conn, 1, "amenity"));
+ CHECK(1 == node_count(conn, 2, "highway"));
+ CHECK(1 == node_count(conn, 2, "railway"));
+
+ update("n1 Tatiy=restaurant x12.3 y3\n"
`atiy` ? Maybe something like `not_a=restaurant` or so to make clearer what's happening here?
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/openstreetmap/osm2pgsql/pull/1018#pullrequestreview-328079461
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/tile-serving/attachments/20191206/a2f9676b/attachment-0001.html>
More information about the Tile-serving
mailing list