[josm-dev] [PATCH 1/2] QuadBuckets (using quad tiling) for node storage in JOSM

Dave Hansen dave at sr71.net
Tue Sep 15 21:45:41 BST 2009


On Sat, 2009-09-12 at 23:52 +0200, Petr Nejedlý wrote:
> Dave Hansen napsal(a):
> > On Sat, 2009-09-12 at 21:04 +0200, Petr Nejedlý wrote:
> >> Dave Hansen napsal(a):
> >>  > Ooh, I forgot about josm-ng.  That one looks very usable.  If mine
> >>  > doesn't pan out, I'll certainly look at that one.
> >>
> >> Well, the QTree in josm-ng is similar to yours in the way it distributes 
> >> the content, though I did pay quite some attention to memory usage 
> >> (hint: e.g. LinkedList is not the collection you'd like to use unless 
> >> you have specific reason to)
> > 
> > For the leaf nodes you mean?  I actually got better performance out of
> > it than I did ArrayList.  Honestly, I've been looking at performance a
> Really?

Yep, and I made a number of runs.  It was on the order of 5% or so.
But, I'm sure it depends heavily on how you populate the tree and how
close all of the data structures are in memory.  It also depends on how
sparse your data is in the tree on average and the size of the leaf
buckets.

> > lot more than pure memory usage, so I bet you're right.  But, when you
> > have an absolute ton of ArrayLists around that you're iterating over a
> > lot ArrayList.size() actually shows up in the profiles pretty high.
> Well, guess what:
> ArrayList.size() {
>      return size;
> }
> 
> LinkedList.size() {
>      return size;
> }
> The size() call is probably the only thing where LinkedList matches 
> ArrayList speed wise. It has faster add/remove in the middle, but 
> everything else is slower. Even iteration (which has to follow two 
> pointers per entry in LinkedList) is slower than in ArrayList (which 
> follows just one and accesses an array linearly, being nice to caches).

Yes, it should be nicer to caches.  But, java is weird.  It gets laid
out in all kinds of wacky-ass ways and it's backwards from what I expect
sometimes.

Notice that I didn't say that calling .size() was slower on ArrayList
than LinkedList.  I said it was a part of the *iterator*.  LinkedList
iterators don't have to consult .size, they just know when they're at
the end because they see either the list head or null.  

> So if ArrayList.size() shows up pretty high in the profiler, blame the 
> profiler (or the selected profiling method, or the interpretation of the 
> data, which is where most of the confusion usually came from ;-))

Granted, what I did may just have been to switch to a list that doesn't
*call* size during iteration, but the overall performance did improve
for me.  It could be for a large number of factors, though.  I'm sure
that there's relatively poor locality of adjacent LinkedList elements,
but I just haven't seen it show up.

> >> and support also 2d entities (node is 
> >> zero-d). The problem with importing it directly into josm is that it 
> >> uses (in -ng) the projected (and integer) coordinates. While it doesn't 
> >> care whether the coordinates are projected or not, it hugely benefit 
> >> from their signed integer nature.
> > 
> > You just mean that they're cheaper to deal with than doubles/floats?
> Partially, and also because I don't need to do all the math converting
> coordinates to given bit pattern. Your quadTile (being long) has some 
> similarity to this, though you seem to be wasting precious cycles by 
> recomputing the whole quadTile at every level just to get 2 bits out of 
> it. (I would generally diverge from the quadTile by skipping the 
> expensive bit interleaving anyway).

Amen to that.  I'm wasting a ton of cycles.  It certainly has a *lot* of
room for extra optimization.  There's a ton of coordinate comparing that
could probably just be done with the quad indexes themselves, and
there's plenty of calculation to be saved.

But, my approach so far has been to do what I found easiest to code.  If
things start to make it slow, I optimize it.  No need to optimize (and
complicate) things that aren't causing an appreciable impact.

I consider simple and fast to be way better than really really fast and
complicated. ;)

-- Dave





More information about the josm-dev mailing list