[josm-dev] [PATCH 1/2] QuadBuckets (using quad tiling) for node storage in JOSM

Dave Hansen dave at sr71.net
Sat Sep 12 17:09:07 BST 2009


I've been hacking on the JOSM validator plugin for a while.  One of the
repeating "hard problems" that comes up are doing the UnconnectedWays tests.
You need to do searches for every segment in a way to see if there are any
nearby nodes.  This generally means that you do a number of searches on the
same order as the number of nodes that you have.

If you have 100k way nodes, you end up doing ~90k searches (one for each
segment) each looking at a sizable number of those 100k nodes.  Any way you
slice it, you end up doing billions of searches since it's an O(n^2) algorithm.
A single validation run with 50k nodes takes me 183 seconds.

I indexed 500,000 nodes from Oregon in the US.  I then took another 3,000,000
nodes and searched to find which of those 500,000 were near them.  Each
iteration over 500,000 nodes for me takes ~0.28 seconds.  If I were to do
3,000,000 searches, it would take 1.63 days!  So, I attempted to optimize this.

The messages is lengthy one, but feel free to stop and look at the code now.
If you want to know how I got that 183-second run down to 2 seconds, read
down. :)

--

My first attempt to optimize was to do binary searches.  I indexed all of the
nodes in two arrays, one by x coordinate and one by y.  I did a binary search
for each axis, iterated out from the place that the binary search found
(representing the radius of the search).  I then combined the two searches.
Algorithmically, it's still an O(n^2) algorithm since there's an iteration, but
it was significantly faster since the number of nodes that were touched was
smaller.

But, I still ran into problems with large data sets because I could still get
thousands of nodes returned from each axis search.  Optimizing, I instead kept
just a single index x-axis.  I then sorted the result of the x-axis binary
search by the y-coordinate and binary searched on that.  That's as fast as I
ever got that algorithm to go.  It ended up way better than the linear search
(~600x!) but still takes ~200s for the 3,000,000 node search.

Can we make the binary search better, or is there was still a fundamental
limitation in that algorithm?  I think I was up against a pretty fundamental
limitation if I chose to sort by a single coordinate at a time: during the
search, it touched a *LOT* of virtually random data.  Since I indexed all of
the coordinates by longitude, a point in Greenwich could be in the binary
x-axis index next to a point in Algeria and next to another in Antarctica (all
at 0 longitude).  There's a lot of jumping around to pseudo-random points all
over the data set.  That thrashes the CPU caches and TLB.  

I was intrigued by the quadtiling algorithm that we currently use on the
database servers.

	http://wiki.openstreetmap.org/wiki/QuadTiles

If we were to make a data structure that represents the tiles, we could chop up
the world into much smaller pieces.  Searches would also potentially be faster
since we could operate on subsets of the data much more easily.  We would not
be dealing with the whole world of random data.

So, I went and implemented it.  I call it QuadBuckets, and it's basically an
unbalanced 4-way radix tree structure.  We radix on the bits in the quad index.
It stores individual data in quad bucket levels (QBLevel)s that can either
contain pointers to 4 more child QBLevels or may act as a leaf node and store
arbitrarily long lists of OSM Nodes.

When a leaf node gets too large (my testing shows that 16 or 32 elements is the
best maximum size for performance.. see todo list), the leaf is split into its
4 constituent child tiles and its contents are redistributed among them.
Several splits (and new tree levels) may be required if the stored coordinates
were very close together.

The quad tiling in the database uses 16 bits of precision.  This means that the
least-significant quadtile bit represents an area about 610m (40,000km / 2^16)
across (and half that tall) at the equator.  On real data sets, this means the
buckets are just too large.  So, I added another 8 bits of precision so that we
have buckets that are now ~2.4m across.  Making sure buckets don't get too big
is one of the keys to this algorithm, and these buckets are small enough.

I also implemented a simple "search cache".  It records the last QBLevel node
from which the last search found results.  When the next search request is
made, we check to see if this tree node can satisfy the search, and avoid
walking too far down the tree each time.  If the tree node can not satisfy the
search, we walk back up the tree until the search can be satisfied.  This
optimization speeds things up by ~25%.

For testing, I first used random data spanning the globe, basically creating
nodes with random latitudes and longitudes.  This is a near worst-case for
QuadBuckets.  Despite this, I got the QuadBuckets approach to be about 30%
faster than the binary searches.  But, using real OSM data, it really shines.

I threw the 3,000,000 search workload at it.  Remember, using the binary search
method, it took on average ~200s.  But, using QuadBuckets, the same search was
9.84s on average.  So, we've gone from 140,000 seconds to 200 to 9.84.  :)

Although it is hard to profile, QuadBuckets seem to outperform the binary
search even in quite small data sets.  Their worst-case performance happens if
you try to store an extremely large number of nodes that are within the
bottom-level tiles (those 2.4m boxes) since it degrades to array storage at
that point.  It will also degrade if you are working on an area that spans
several high-level tiles.I don't think this will happen in the real world, much.
It will probably still be faster than anything else out there.

QuadBuckets do have a higher insertion cost than the other methods. Computing
the quad tile index means a bunch of Math.floor() function calls, and
those seem to be at the top of the profiles.  But, we could probably mitigate
this insertion cost by deferring computing the quad tile coordinates until the
first search is performed.

TODO:
* make content buckets larger and do binary searches inside them
* Do something smarter for Nodes with null getCoor() other than
  sticking in tile 0000000000000000
* Clean up the code, remove debugging statements

---

 core-dave/src/org/openstreetmap/josm/data/coor/QuadTiling.java |  111 +
 core-dave/src/org/openstreetmap/josm/data/osm/QuadBuckets.java |  881 ++++++++++
 2 files changed, 992 insertions(+)

diff -puN /dev/null src/org/openstreetmap/josm/data/osm/QuadBuckets.java
--- /dev/null	2009-07-21 09:53:48.000000000 -0700
+++ core-dave/src/org/openstreetmap/josm/data/osm/QuadBuckets.java	2009-09-12 09:05:39.000000000 -0700
@@ -0,0 +1,881 @@
+package org.openstreetmap.josm.data.osm;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.Set;
+import java.util.SortedMap;
+import java.util.TreeMap;
+
+import org.openstreetmap.josm.data.coor.EastNorth;
+import org.openstreetmap.josm.data.osm.Node;
+import org.openstreetmap.josm.tools.Pair;
+
+import java.nio.MappedByteBuffer;
+import java.nio.DoubleBuffer;
+import java.nio.channels.FileChannel;
+import java.io.FileOutputStream;
+import java.io.RandomAccessFile;
+
+import java.io.IOException;
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+
+import java.util.Map.Entry;
+import java.awt.geom.Point2D;
+
+//import java.util.List;
+import java.util.*;
+import java.util.Collection;
+
+import org.openstreetmap.josm.data.coor.QuadTiling;
+import org.openstreetmap.josm.data.coor.CachedLatLon;
+import org.openstreetmap.josm.data.coor.EastNorth;
+import org.openstreetmap.josm.data.coor.LatLon;
+import org.openstreetmap.josm.data.osm.visitor.Visitor;
+
+
+public class QuadBuckets implements Collection<Node>
+{
+    public static boolean debug = false;
+    static boolean consistency_testing = false;
+
+    static void abort(String s)
+    {
+        out(s);
+        Object o = null;
+        o.hashCode();
+    }
+    static void out(String s)
+    {
+        System.out.println(s);
+    }
+    // periodic output
+    long last_out = -1;
+    void pout(String s)
+    {
+        long now = System.currentTimeMillis();
+        if (now - last_out < 300)
+            return;
+        last_out = now;
+        System.out.print(s + "\r");
+    }
+    void pout(String s, int i, int total)
+    {
+        long now = System.currentTimeMillis();
+        if ((now - last_out < 300) &&
+            ((i+1) < total))
+            return;
+        last_out = now;
+        // cast to float to keep the output size down
+        System.out.print(s + " " + (float)((i+1)*100.0/total) + "% done    \r");
+    }
+    // 24 levels of buckets gives us bottom-level
+    // buckets that are about 2 meters on a side.
+    // That should do. :)
+    public static int NR_LEVELS = 24;
+    public static double WORLD_PARTS = (1 << NR_LEVELS);
+
+    public static int MAX_OBJECTS_PER_LEVEL = 16;
+    // has to be a power of 2
+    public static int TILES_PER_LEVEL_SHIFT = 2;
+    public static int TILES_PER_LEVEL = 1<<TILES_PER_LEVEL_SHIFT;
+    class BBox
+    {
+        private double xmin;
+        private double xmax;
+        private double ymin;
+        private double ymax;
+        void sanity()
+        {
+            if (xmin < -180.0)
+                xmin = -180.0;
+            if (xmax >  180.0)
+                xmax =  180.0;
+            if (ymin <  -90.0)
+                ymin =  -90.0;
+            if (ymax >   90.0)
+                ymax =   90.0;
+            if ((xmin < -180.0) ||
+                (xmax >  180.0) ||
+                (ymin <  -90.0) ||
+                (ymax >   90.0)) {
+                out("bad BBox: " + this);
+                Object o = null;
+                o.hashCode();
+            }
+        }
+        public String toString()
+        {
+            return "[ " + xmin + " -> " + xmax + ", " +
+                         ymin + " -> " + ymax + " ]";
+        }
+        double min(double a, double b)
+        {
+            if (a < b)
+                return a;
+            return b;
+        }
+        double max(double a, double b)
+        {
+            if (a > b)
+                return a;
+            return b;
+        }
+        public BBox(LatLon a, LatLon b)
+        {
+            xmin = min(a.lon(), b.lon());
+            xmax = max(a.lon(), b.lon());
+            ymin = min(a.lat(), b.lat());
+            ymax = max(a.lat(), b.lat());
+            sanity();
+        }
+        public BBox(double a_x, double a_y, double b_x, double b_y)
+        {
+            xmin = min(a_x, b_x);
+            xmax = max(a_x, b_x);
+            ymin = min(a_y, b_y);
+            ymax = max(a_y, b_y);
+            sanity();
+        }
+        public double height()
+        {
+            return ymax-ymin;
+        }
+        public double width()
+        {
+            return xmax-xmin;
+        }
+        boolean bounds(BBox b)
+        {
+            if (!(xmin <= b.xmin) ||
+                !(xmax >= b.xmax) ||
+                !(ymin <= b.ymin) ||
+                !(ymax >= b.ymax))
+                return false;
+            return true;
+        }
+        boolean bounds(LatLon c)
+        {
+            if ((xmin <= c.lon()) &&
+                (xmax >= c.lon()) &&
+                (ymin <= c.lat()) &&
+                (ymax >= c.lat()))
+                return true;
+            return false;
+        }
+        boolean inside(BBox b)
+        {
+            if (xmin >= b.xmax)
+                return false;
+            if (xmax <= b.xmin)
+                return false;
+            if (ymin >= b.ymax)
+                return false;
+            if (ymax <= b.ymin)
+                return false;
+            return true;
+        }
+        boolean intersects(BBox b)
+        {
+            return this.inside(b) || b.inside(this);
+        }
+    }
+    class QBLevel
+    {
+        int level;
+        long quad;
+        QBLevel parent;
+
+        public List<Node> content;
+        public QBLevel children[];
+        public QBLevel(QBLevel parent)
+        {
+            init(parent);
+        }
+        String quads(Node n)
+        {
+            return Long.toHexString(QuadTiling.quadTile(n.getCoor()));
+        }
+        void split()
+        {
+            if (debug)
+                out("splitting "+this.bbox()+" level "+level+" with "
+                        + content.size() + " entries (my dimensions: "
+                        + this.bbox.width()+", "+this.bbox.height()+")");
+            if (children != null) {
+                abort("overwrote children");
+            }
+            children = new QBLevel[QuadTiling.TILES_PER_LEVEL];
+            // deferring allocation of children until use
+            // seems a bit faster
+            //for (int i = 0; i < TILES_PER_LEVEL; i++)
+            //    children[i] = new QBLevel(this, i);
+            List<Node> tmpcontent = content;
+            content = null;
+            for (Node n : tmpcontent) {
+                int new_index = QuadTiling.index(n, level);
+                if (children[new_index] == null)
+                    children[new_index] = new QBLevel(this, new_index);
+                QBLevel child = children[new_index];
+                if (debug)
+                    out("putting "+n+"(q:"+quads(n)+") into ["+new_index+"] " + child.bbox());
+                child.add(n);
+            }
+        }
+        void add_leaf(Node n)
+        {
+            LatLon c = n.getCoor();
+            QBLevel ret = this;
+            if (content == null) {
+                if (debug)
+                    out("   new content array");
+                // I chose a LinkedList because we do not do
+                // any real random access to this.  We generally
+                // just run through the whole thing all at once
+                content = new LinkedList<Node>();
+            }
+            content.add(n);
+            if (content.size() > MAX_OBJECTS_PER_LEVEL) {
+                if (debug)
+                    out("["+level+"] deciding to split");
+                if (level >= NR_LEVELS-1) {
+                    out("can not split, but too deep: " + level + " size: " + content.size());
+                    return;
+                }
+                int before_size = -1;
+                if (consistency_testing)
+                    before_size = this.size();
+                this.split();
+                if (consistency_testing) {
+                    int after_size = this.size();
+                    if (before_size != after_size) {
+                        abort("["+level+"] split done before: " + before_size + " after: " + after_size);
+                    }
+                }
+                return;
+            }
+            if (debug) {
+                out("   plain content put now have " + content.size());
+                if (content.contains(c))
+                    out("   and I already have that one");
+            }
+        }
+
+        List<Node> search(BBox bbox, LatLon point, double radius)
+        {
+            if (isLeaf())
+                return search_leaf(bbox, point, radius);
+            return search_branch(bbox, point, radius);
+        }
+        private List<Node> search_leaf(BBox bbox, LatLon point, double radius)
+        {
+            if (debug)
+                out("searching contents in tile " + this.bbox() + " for " + bbox);
+            /*
+             * It is possible that this was created in a split
+             * but never got any content populated.
+             */
+            if (content == null)
+                return null;
+            // We could delay allocating this.  But, the way it is currently
+            // used, we almost always have a node in the area that we're
+            // searching since we're searching around other nodes.
+            //
+            // the iterator's calls to ArrayList.size() were showing up in
+            // some profiles, so I'm trying a LinkedList instead
+            List<Node> ret = new LinkedList<Node>();
+            for (Node n : content) {
+                // This supports two modes: one where the bbox is just
+                // an outline of the point/radius combo, and one where
+                // it is *just* the bbox.  If it is *just* the bbox,
+                // don't consider the points below
+                if (point == null) {
+                    ret.add(n);
+                    continue;
+                }
+                LatLon c = n.getCoor();
+                // is greatCircleDistance really necessary?
+                double d = c.greatCircleDistance(point);
+                if (debug)
+                    out("[" + level + "] Checking coor: " + c + " dist: " + d + " vs. " + radius + " " + quads(n));
+                if (d > radius)
+                    continue;
+                if (debug)
+                    out("hit in quad: "+Long.toHexString(this.quad)+"\n");
+                //if (ret == null)
+                //    ret = new LinkedList<Node>();
+                ret.add(n);
+            }
+            if (debug)
+                out("done searching quad " + Long.toHexString(this.quad));
+            return ret;
+        }
+        /*
+         * This is stupid.  I tried to have a QBLeaf and QBBranch
+         * class decending from a QBLevel.  It's more than twice
+         * as slow.  So, this throws OO out the window, but it
+         * is fast.  Runtime type determination must be slow.
+         */
+        boolean isLeaf()
+        {
+            if (children == null)
+                return true;
+            return false;
+        }
+        QBLevel next_sibling()
+        {
+            boolean found_me = false;
+            if (parent == null)
+                return null;
+            int __nr = 0;
+            for (QBLevel sibling : parent.children) {
+                __nr++;
+                int nr = __nr-1;
+                if (sibling == null) {
+                    if (debug)
+                        out("[" + this.level + "] null child nr: " + nr);
+                    continue;
+                }
+                // We're looking for the *next* child
+                // after us.
+                if (sibling == this) {
+                    if (debug)
+                        out("[" + this.level + "] I was child nr: " + nr);
+                    found_me = true;
+                    continue;
+                }
+                if (found_me) {
+                    if (debug)
+                        out("[" + this.level + "] next sibling was child nr: " + nr);
+                    return sibling;
+                }
+                if (debug)
+                    out("[" + this.level + "] nr: " + nr + " is before me, ignoring...");
+            }
+            return null;
+        }
+        QBLevel nextLeaf()
+        {
+            QBLevel next = this;
+            if (this.isLeaf()) {
+                QBLevel sibling = next.next_sibling();
+                // Walk back up the tree to find the
+                // next sibling node.  It may be either
+                // a leaf or branch.
+                while (sibling == null) {
+                    if (debug)
+                        out("no siblings at level["+next.level+"], moving to parent");
+                    next = next.parent;
+                    if (next == null)
+                        break;
+                    sibling = next.next_sibling();
+                }
+                next = sibling;
+            }
+            if (next == null)
+                return null;
+            // all branches are guaranteed to have
+            // at least one leaf.  It may not have
+            // any contents, but there *will* be a
+            // leaf.  So, walk back down the tree
+            // and find the first leaf
+            while (!next.isLeaf()) {
+                if (debug)
+                    out("["+next.level+"] next node is a branch, moving down...");
+                for (QBLevel child : next.children) {
+                    if (child == null)
+                        continue;
+                    next = child;
+                    break;
+                }
+            }
+            return next;
+        }
+        int size()
+        {
+            if (isLeaf())
+                return size_leaf();
+            return size_branch();
+        }
+        private int size_leaf()
+        {
+            if (content == null) {
+                if (debug)
+                    out("["+level+"] leaf size: null content, children? " + children);
+                return 0;
+            }
+            if (debug)
+                out("["+level+"] leaf size: " + content.size());
+            return content.size();
+        }
+        private int size_branch()
+        {
+            int ret = 0;
+            for (QBLevel l : children) {
+                if (l == null)
+                    continue;
+                ret += l.size();
+            }
+            if (debug)
+                out("["+level+"] branch size: " + ret);
+            return ret;
+        }
+        boolean contains(Node n)
+        {
+            QBLevel res = find_exact(n);
+            if (res == null)
+                return false;
+            return true;
+        }
+        QBLevel find_exact(Node n)
+        {
+            if (isLeaf())
+                return find_exact_leaf(n);
+            return find_exact_branch(n);
+        }
+        QBLevel find_exact_leaf(Node n)
+        {
+            QBLevel ret = null;
+            if (content != null && content.contains(n))
+                ret = this;
+            return ret;
+        }
+        QBLevel find_exact_branch(Node n)
+        {
+            QBLevel ret = null;
+            for (QBLevel l : children) {
+                if (l == null)
+                    continue;
+                ret = l.find_exact(n);
+                if (ret != null)
+                    break;
+            }
+            return ret;
+        }
+        boolean add(Node n)
+        {
+            if (isLeaf())
+                add_leaf(n);
+            else
+                add_branch(n);
+            return true;
+        }
+        QBLevel add_branch(Node n)
+        {
+            LatLon c = n.getCoor();
+            int index = QuadTiling.index(n, level);
+            if (debug)
+                out("[" + level + "]: " + n +
+                    " index " + index + " levelquad:" + this.quads() + " level bbox:" + this.bbox());
+            if (debug)
+                out("   put in child["+index+"]");
+            if (children[index] == null)
+                children[index] = new QBLevel(this, index);
+            children[index].add(n);
+            /* this is broken at the moment because we need to handle the null n.getCoor()
+            if (consistency_testing && !children[index].bbox().bounds(n.getCoor())) {
+                out("target child["+index+"] does not bound " + children[index].bbox() + " " + c);
+                for (int i = 0; i < children.length; i++) {
+                    QBLevel ch = children[i];
+                    if (ch == null)
+                        continue;
+                    out("child["+i+"] quad: " + ch.quads() + " bounds: " + ch.bbox.bounds(n.getCoor()));
+                }
+                out(" coor quad: " + quads(n));
+                abort("");
+            }*/
+
+            if (consistency_testing) {
+                for (int i = 0; i < children.length; i++) {
+                    QBLevel ch = children[i];
+                    if (ch == null)
+                        continue;
+                    if (ch.bbox().bounds(c) && i != index) {
+                        out("multible bounding?? expected: " + index + " but saw at " + i + " " + ch.quads());
+                        out("child["+i+"] bbox: " + ch.bbox());
+                    }
+                }
+            }
+            return this;
+        }
+        private List<Node> search_branch(BBox bbox, LatLon point, double radius)
+        {
+            List<Node> ret = null;
+            if (debug)
+                System.out.print("[" + level + "] qb bbox: " + this.bbox() + " ");
+            if (!this.bbox().intersects(bbox)) {
+                if (debug) {
+                    out("miss " + Long.toHexString(this.quad));
+                    //QuadTiling.tile2xy(this.quad);
+                }
+                return ret;
+            }
+            if (debug)
+                out("hit " + this.quads());
+
+            if (debug)
+                out("[" + level + "] not leaf, moving down");
+            int child_nr = 0;
+            for (QBLevel q : children) {
+                if (q == null)
+                    continue;
+                child_nr++;
+                if (debug)
+                    System.out.print(child_nr+": ");
+                List<Node> coors = q.search(bbox, point, radius);
+                if (coors == null)
+                    continue;
+                if (ret == null)
+                    ret = coors;
+                else
+                    ret.addAll(coors);
+                if (q.bbox().bounds(bbox)) {
+                    search_cache = q;
+                    // optimization: do not continue searching
+                    // other tiles if this one wholly contained
+                    // what we were looking for.
+                    if (coors.size() > 0 ) {
+                        if (debug)
+                            out("break early");
+                        break;
+                    }
+                }
+            }
+            return ret;
+        }
+        public String quads()
+        {
+            return Long.toHexString(quad);
+        }
+        public void init(QBLevel parent)
+        {
+                this.parent = parent;
+                if (parent != null)
+                    this.level = parent.level + 1;
+                this.quad = 0;
+        }
+        int index_of(QBLevel find_this)
+        {
+            if (this.isLeaf())
+                return -1;
+
+            for (int i = 0; i < QuadTiling.TILES_PER_LEVEL; i++) {
+                if (children[i] == find_this)
+                    return i;
+            }
+            return -1;
+        }
+        public QBLevel(QBLevel parent, int parent_index)
+        {
+            this.init(parent);
+            this.level = parent.level+1;
+            this.parent = parent;
+            int shift = (QuadBuckets.NR_LEVELS - level) * 2;
+            long mult = 1;
+            // Java blows the big one.  It seems to wrap when
+            // you shift by > 31
+            if (shift >= 30) {
+                shift -= 30;
+                mult = 1<<30;
+            }
+            long this_quadpart = mult * (parent_index << shift);
+            this.quad = parent.quad | this_quadpart;
+            if (debug)
+                out("new level["+this.level+"] bbox["+parent_index+"]: " + this.bbox()
+                        + " coor: " + this.coor()
+                        + " quadpart: " + Long.toHexString(this_quadpart)
+                        + " quad: " + Long.toHexString(this.quad));
+        }
+        /*
+         * Surely we can calculate these efficiently instead of storing
+         */
+        double width = Double.NEGATIVE_INFINITY;
+        double width()
+        {
+            if (width != Double.NEGATIVE_INFINITY)
+                return this.width;
+            if (level == 0) {
+                width = this.bbox().width();
+            } else {
+                width = parent.width()/2;
+            }
+            return width;
+        }
+        double height()
+        {
+            return width()/2;
+        }
+        BBox bbox = null;
+        public BBox bbox()
+        {
+            if (bbox != null) {
+                bbox.sanity();
+                return bbox;
+            }
+            if (level == 0) {
+                bbox = new BBox(-180, 90, 180, -90);
+            } else {
+                LatLon bottom_left = this.coor();
+                double lat = bottom_left.lat() + this.height();
+                double lon = bottom_left.lon() + this.width();
+                LatLon top_right = new LatLon(lat, lon);
+                bbox = new BBox(bottom_left, top_right);
+            }
+            bbox.sanity();
+            return bbox;
+        }
+        /*
+         * This gives the coordinate of the bottom-left
+         * corner of the box
+         */
+        LatLon coor()
+        {
+            return QuadTiling.tile2LatLon(this.quad);
+        }
+    }
+
+    private QBLevel root;
+    private QBLevel search_cache;
+    public QuadBuckets()
+    {
+        clear();
+    }
+    public void clear()
+    {
+        root = new QBLevel(null);
+        search_cache = null;
+        if (debug)
+            out("QuadBuckets() cleared: " + this);
+    }
+    public boolean add(Node n)
+    {
+        if (debug)
+            out(this + " QuadBuckets() add: " + n + " size now: " + this.size());
+        int before_size = -1;
+        if (consistency_testing)
+            before_size = root.size();
+        boolean ret = root.add(n);
+        if (consistency_testing) {
+            int end_size = root.size();
+            if (ret)
+                end_size--;
+            if (before_size != end_size)
+                abort("size inconsistency before add: " + before_size + " after: " + end_size);
+        }
+        return ret;
+    }
+    public void unsupported()
+    {
+        out("unsupported operation");
+        Object o = null;
+        o.hashCode();
+        throw new UnsupportedOperationException();
+    }
+    public boolean retainAll(Collection nodes)
+    {
+        unsupported();
+        return false;
+    }
+    public boolean removeAll(Collection nodes)
+    {
+        unsupported();
+        return false;
+    }
+    public boolean addAll(Collection nodes)
+    {
+        unsupported();
+        return false;
+    }
+    public boolean containsAll(Collection nodes)
+    {
+        unsupported();
+        return false;
+    }
+    private void check_type(Object o)
+    {
+        if (o instanceof Node)
+            return;
+        unsupported();
+    }
+    public boolean remove(Object o)
+    {
+        check_type(o);
+        return this.remove((Node)o);
+    }
+    public boolean remove(Node n)
+    {
+        QBLevel bucket = root.find_exact(n);
+        if (!bucket.isLeaf())
+            abort("found branch where leaf expected");
+        return bucket.content.remove(n);
+    }
+    public boolean contains(Object o)
+    {
+        check_type(o);
+        QBLevel bucket = root.find_exact((Node)o);
+        if (bucket == null)
+            return false;
+        return true;
+    }
+    public ArrayList<Node> toArrayList()
+    {
+        ArrayList<Node> a = new ArrayList<Node>();
+        for (Node n : this)
+            a.add(n);
+        out("returning array list with size: " + a.size());
+        return a;
+    }
+    public Object[] toArray()
+    {
+        return this.toArrayList().toArray();
+    }
+    public <T> T[] toArray(T[] template)
+    {
+        return this.toArrayList().toArray(template);
+    }
+    class QuadBucketIterator implements Iterator<Node>
+    {
+        QBLevel current_leaf;
+        int index_in_leaf;
+        int iterated_over;
+        QBLevel next_leaf(QBLevel q)
+        {
+            if (q == null)
+                return null;
+            QBLevel orig = q;
+            QBLevel next = q.nextLeaf();
+            if (consistency_testing && (orig == next))
+                abort("got same leaf back leaf: " + q.isLeaf());
+            return next;
+        }
+        public QuadBucketIterator(QuadBuckets qb)
+        {
+            if (debug)
+                out(this + " is a new iterator qb.root: " + qb.root + " size: " + qb.size());
+            if (qb.root.isLeaf())
+                current_leaf = qb.root;
+            else
+                current_leaf = next_leaf(qb.root);
+            if (debug)
+                out("\titerator first leaf: " + current_leaf);
+            iterated_over = 0;
+        }
+        public boolean hasNext()
+        {
+            if (this.peek() == null) {
+                if (debug)
+                   out(this + " no hasNext(), but iterated over so far: " + iterated_over);
+                return false;
+            }
+            return true;
+        }
+        Node peek()
+        {
+            if (current_leaf == null) {
+                if (debug)
+                    out("null current leaf, nowhere to go");
+                return null;
+            }
+            while((current_leaf.content == null) ||
+                  (index_in_leaf >= current_leaf.content.size())) {
+                if (debug)
+                    out("moving to next leaf");
+                index_in_leaf = 0;
+                current_leaf = next_leaf(current_leaf);
+                if (current_leaf == null)
+                    break;
+            }
+            if (current_leaf == null || current_leaf.content == null) {
+                if (debug)
+                    out("late nowhere to go " + current_leaf);
+                return null;
+            }
+            return current_leaf.content.get(index_in_leaf);
+        }
+        public Node next()
+        {
+            Node ret = peek();
+            index_in_leaf++;
+            if (debug)
+                out("iteration["+iterated_over+"] " + index_in_leaf + " leaf: " + current_leaf);
+            iterated_over++;
+            if (ret == null) {
+                if (debug)
+                    out(this + " no next node, but iterated over so far: " + iterated_over);
+            }
+            return ret;
+        }
+        public void remove()
+        {
+            Node object = peek();
+            current_leaf.content.remove(object);
+        }
+    }
+    public Iterator<Node> iterator()
+    {
+        return new QuadBucketIterator(this);
+    }
+    public int size()
+    {
+        // This can certainly by optimized
+        int ret = root.size();
+        if (debug)
+            out(this + " QuadBuckets size: " + ret);
+        return ret;
+    }
+    public boolean isEmpty()
+    {
+        if (this.size() == 0)
+            return true;
+        return false;
+    }
+    public BBox search_to_bbox(LatLon point, double radius)
+    {
+        BBox bbox = new BBox(point.lon() - radius, point.lat() - radius,
+                             point.lon() + radius, point.lat() + radius);
+        if (debug)
+            out("search bbox before sanity: " +  bbox);
+        bbox.sanity();
+        if (debug)
+            out("search bbox after sanity: " +  bbox);
+        return bbox;
+    }
+    List<Node> search(LatLon point, double radius)
+    {
+        return this.search(search_to_bbox(point, radius), point, radius);
+    }
+    List<Node> search(BBox bbox)
+    {
+        return this.search(bbox, null, -1);
+    }
+    public List<Node> search(LatLon b1, LatLon b2)
+    {
+        BBox bbox = new BBox(b1.lon(), b1.lat(), b2.lon(), b2.lat());
+        bbox.sanity();
+        return this.search(bbox);
+    }
+    List<Node> search(BBox bbox, LatLon point, double radius)
+    {
+        if (debug) {
+            out("qb root search at " + point + " around: " + radius);
+            out("root bbox: " + root.bbox());
+        }
+        List<Node> ret = null;
+        // Doing this cuts down search cost on a real-life data
+        // set by about 25%
+        boolean cache_searches = true;
+        if (cache_searches) {
+            if (search_cache == null)
+                search_cache = root;
+            // Walk back up the tree when the last
+            // search spot can not cover the current
+            // search
+            while (!search_cache.bbox().bounds(bbox)) {
+                search_cache = search_cache.parent;
+            }
+        } else {
+            search_cache = root;
+        }
+        return search_cache.search(bbox, point, radius);
+    }
+}
diff -puN /dev/null src/org/openstreetmap/josm/data/coor/QuadTiling.java
--- /dev/null	2009-07-21 09:53:48.000000000 -0700
+++ core-dave/src/org/openstreetmap/josm/data/coor/QuadTiling.java	2009-09-12 09:05:50.000000000 -0700
@@ -0,0 +1,111 @@
+// License: GPL. Copyright 2009 by Dave Hansen, others
+package org.openstreetmap.josm.data.coor;
+
+import org.openstreetmap.josm.data.osm.Node;
+
+import org.openstreetmap.josm.Main;
+import org.openstreetmap.josm.data.projection.Projection;
+
+public class QuadTiling
+{
+    public static int NR_LEVELS = 24;
+    public static double WORLD_PARTS = (1 << NR_LEVELS);
+
+    public static int MAX_OBJECTS_PER_LEVEL = 16;
+    // has to be a power of 2
+    public static int TILES_PER_LEVEL_SHIFT = 2;
+    public static int TILES_PER_LEVEL = 1<<TILES_PER_LEVEL_SHIFT;
+    static public int X_PARTS = 360;
+    static public int X_BIAS = -180;
+
+    static public int Y_PARTS = 180;
+    static public int Y_BIAS = -90;
+
+    public static LatLon tile2LatLon(long quad)
+    {
+        // The world is divided up into X_PARTS,Y_PARTS.
+        // The question is how far we move for each bit
+        // being set.  In the case of the top level, we
+        // move half of the world.
+        double x_unit = X_PARTS/2;
+        double y_unit = Y_PARTS/2;
+        long shift = (NR_LEVELS*2)-2;
+
+        //if (debug)
+        //    out("tile2xy(0x"+Long.toHexString(quad)+")");
+        double x = 0;
+        double y = 0;
+        for (int i = 0; i < NR_LEVELS; i++) {
+            long bits = (quad >> shift) & 0x3;
+            //if (debug)
+            //    out("shift: " + shift + " bits: " + bits);
+            // remember x is the MSB
+            if ((bits & 0x2) != 0)
+                x += x_unit;
+            if ((bits & 0x1) != 0)
+                y += y_unit;
+            x_unit /= 2;
+            y_unit /= 2;
+            shift -= 2;
+        }
+        x += X_BIAS;
+        y += Y_BIAS;
+        return new LatLon(y, x);
+    }
+    static long xy2tile(long x, long y)
+    {
+       long tile = 0;
+       int i;
+       for (i = NR_LEVELS-1; i >= 0; i--)
+       {
+            long xbit = ((x >> i) & 1);
+            long ybit = ((y >> i) & 1);
+            tile <<= 2;
+            // Note that x is the MSB
+            tile |= (xbit<<1) | ybit;
+       }
+       return tile;
+    }
+    static long coorToTile(LatLon coor)
+    {
+        return quadTile(coor);
+    }
+    static long lon2x(double lon)
+    {
+       //return Math.round((lon + 180.0) * QuadBuckets.WORLD_PARTS / 360.0)-1;
+       long ret = (long)Math.floor((lon + 180.0) * WORLD_PARTS / 360.0);
+       if (ret == WORLD_PARTS)
+           ret--;
+       return ret;
+    }
+    static long lat2y(double lat)
+    {
+       //return Math.round((lat + 90.0) * QuadBuckets.WORLD_PARTS / 180.0)-1;
+       long ret = (long)Math.floor((lat + 90.0) * WORLD_PARTS / 180.0);
+       if (ret == WORLD_PARTS)
+           ret--;
+       return ret;
+    }
+    static public long quadTile(LatLon coor)
+    {
+        return xy2tile(lon2x(coor.lon()),
+                       lat2y(coor.lat()));
+    }
+    static public int index(int level, long quad)
+    {
+        long mask = 0x00000003;
+        int total_shift = TILES_PER_LEVEL_SHIFT*(NR_LEVELS-level-1);
+        return (int)(mask & (quad >> total_shift));
+    }
+    static public int index(Node n, int level)
+    {
+        LatLon coor = n.getCoor();
+        // The nodes that don't return coordinates will all get
+        // stuck in a single tile.  Hopefully there are not too
+        // many of them
+        if (coor == null)
+            return 0;
+        long quad = coorToTile(n.getCoor());
+        return index(level, quad);
+    }
+}
_




More information about the josm-dev mailing list