[OSM-dev] Re: osmeditor2 to Java, and a common Java OSM client library

Ben Gimpert ben at somethingmodern.com
Fri May 26 10:04:24 BST 2006


Hi Nick,

(This is a refreshing take on the "my favourite language" holy war.
Thanks for the thoughtful discussion!)

On Thu, 25 May 06 @08:38pm, Nick Hill wrote:
> Ben Gimpert wrote:
> >You're absolutely correct -- the *relative* performance of Ruby will
> >never improve.  But the same could be said about writing assembly and C.
> >C will always be relatively slower than assembly, because of the design
> >of the two languages. The important point is that *we don't care*!
> >Assembly is only rarely used because it is so difficult to read and
> >write, compared to higher level languages like C (2GL's).  And with
> >contemporary computers, the performance difference is not enough to
> >justify "assembly's" development hassle.
> 
> I understand that optimising C compilers produce faster binary than can 
> generally be written directly in assembly. The optimising C compiler 
> re-arranges order of evaluation to minimise pipeline stalls. I guess the 
> same algorithm could be applied directly to assembly at RTL level.

Okay, so maybe C versus assembly was not the best example, since C is
only barely a higher abstraction than assembly.  I think if people
writing assemblers cared to tweak at the RTL level, "optimized assembly"
would be even faster than optimized C.

> Assuming a higher level language is slower, often, but not always, the 
> abstraction and language difference can lead the programmer to a 
> better/faster algorithm to offset the slower execution.

Yeah.  Higher level languages *do* encourage more clever, more
interesting algorithms -- and thus faster, or at least "not impossible,"
solutions.

> This is certainly the case in particular problem domains when
> comparing declarative languages such as lisp with imperative such as
> C, Java, Ruby. I haven't considered whether our problem domain falls
> into this category.

Hold on there --  Lisp is not declarative.  (I don't care what Wikipedia
says.)  It's as imperative as any other language, just leaning more on
recursion and *offering* enough to do functional (side-effect-free)
coding.  I wouldn't confuse what a language *supports* with how people
actually use it.  The dude I sit next to at Day Job writes plenty of
Common Lisp, and his code is riddled with side effects and iterative
loops.
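Ruby itself makes the point, for what it's worth.  Here's the same little
sum-of-squares written both ways (a toy example of my own, obviously) --
the language *supports* the side-effect-free style, but nothing forces
anyone to write it:

```ruby
# The same computation, imperative then functional.  Ruby supports
# both; which one gets written is a question of habit, not language.

# Imperative: mutable state updated in a loop.
total = 0
(1..10).each { |i| total += i * i }

# Functional: no mutation, just map and reduce.
functional_total = (1..10).map { |i| i * i }.reduce(:+)

puts total            # 385
puts functional_total # 385
```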

I have yet to experience the "declarative epiphany" when writing XSLT,
for example.  Though it's very speculative (heh), I believe declarative
coding is so utterly foreign to human linguistic structures that
non-algorithmic languages are doomed.  Even if we end up just
simulating them, people *think* with side-effects, and people think
with state.  (Oh and as much as I love Lisp, prefix notation is
probably dooming also.)

Oh, and lest we leave the domain of OSM for too long -- most declarative
systems have such crap library support that we don't want to consider
them seriously.

> >If there's one almost-absolute I try to stand by, it's to develop
> >with the platform providing:  a) the highest level of abstraction,
> >and b) the largest set of libraries.  (In that order.)  I'm just too
> >busy to give a shit about absolute performance.  (Note, I did not say
> >"algorithm performance.")
> 
> I don't understand the distinction between algorithm performance and
> absolute performance. If what you are performing is an algorithm,
> which is surely the case for any computer program, that is the same?

No they're not the same at all.  Here I grudgingly put on the ivory
tower CS hat:  There's an entire maths notation (big O) for discussing
algorithmic performance abstracted from the particulars of hardware
(clock cycles, CPU's, blah blah).  This is because dealing in terms of
specific hardware fails to isolate the core issue -- some algorithms are
O(n^2), some are O(n log n), some are just O(n).  Think of Quick Sort
versus your basic (stupid) quadratic array sort.  An O(n^2) algorithm is
slower than an O(n log n) algorithm on any big input, regardless of
language or hardware.
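To make that concrete, here's a throwaway Ruby sketch (mine, nothing from
the OSM code) counting the comparisons a naive O(n^2) selection sort makes
versus an O(n log n) merge sort on the same shuffled input:

```ruby
# Count comparisons made by a quadratic sort and an n-log-n sort.

def selection_sort(a)
  a = a.dup
  comparisons = 0
  (0...a.length).each do |i|
    min = i
    ((i + 1)...a.length).each do |j|
      comparisons += 1
      min = j if a[j] < a[min]
    end
    a[i], a[min] = a[min], a[i]
  end
  [a, comparisons]
end

def merge_sort(a)
  return [a.dup, 0] if a.length <= 1
  mid = a.length / 2
  left,  cl = merge_sort(a[0...mid])
  right, cr = merge_sort(a[mid..-1])
  merged, comparisons = [], cl + cr
  until left.empty? || right.empty?
    comparisons += 1
    merged << (left.first <= right.first ? left.shift : right.shift)
  end
  [merged + left + right, comparisons]
end

data = (1..512).to_a.shuffle
_, quadratic    = selection_sort(data)
_, linearithmic = merge_sort(data)
puts "selection sort: #{quadratic} comparisons"    # always n(n-1)/2 = 130816
puts "merge sort:     #{linearithmic} comparisons" # roughly n*log2(n), under 4600
```

Quadruple n and the selection sort's count goes up sixteen-fold while the
merge sort's barely quadruples.  No language or hardware choice changes
that shape.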

> I can think of an example where the selection of Ruby wouldn't be
> suitable. Say there is a need to serve data based on mathematical
> transformations. Say that transformation is similar to a mandelbrot
> function. Let's say we need to serve 250 such services per second.
> 
> Let's say the average data served took 11.2M CPU cycles to process
> when written in C and 9Bn CPU cycles when written in Ruby (tests bear
> this relationship out as being realistic for a mandelbrot type
> function).
> 
> We could service all requests with one 3Ghz CPU if written in C. How
> many CPUs would we need if written in Ruby?

I don't care.  And more importantly, I don't think *we* should care.
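(For the record, the arithmetic is trivial to run with Nick's own figures
-- 250 requests per second, 11.2M cycles per request in C, 9bn in Ruby,
3GHz CPUs:)

```ruby
# Nick's numbers, straight from his message.
requests_per_sec = 250
cycles_c         = 11.2e6  # CPU cycles per request, C version
cycles_ruby      = 9.0e9   # CPU cycles per request, Ruby version
cpu_hz           = 3.0e9   # one 3GHz CPU

cpus_c    = (requests_per_sec * cycles_c    / cpu_hz).ceil
cpus_ruby = (requests_per_sec * cycles_ruby / cpu_hz).ceil

puts "C:    #{cpus_c} CPU(s)"    # 1
puts "Ruby: #{cpus_ruby} CPU(s)" # 750
```

So yes, 750 CPUs versus one.  My point is that for our kind of workload
it's the wrong question to be asking.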

There's an important distinction to be made here between real-time
systems and what might be called "normal" software.  The real-time
software flying a passenger jet has firm requirements on performance,
and thus an entire software development culture supporting the writing
of such software.  (Eight zillion testers per developer, intense
idiomatic code review, everything in C, etc.)

On the other hand, we at OSM are writing software that can pragmatically
expect to have plenty of hardware to cope with the "hit" of any high
level language -- as long as the algorithms are good (see above).
Something might take the equivalent of 3 million quid's worth of CPU to
calculate today, but it'll take 10 pence tomorrow.  Hooray for Moore's
Law.

> >>Would your suggestion to use Ruby require a complete re-write of the
> >>client side?
> >
> >Not really.  I'm suggesting Nick use Ruby only because *he's* talking
> >about doing a greenfield (clean slate) rewrite.  I do not suggest Imi
> >(for example) rewrite the lovely JOSM as ROSM.  Though I'd be happy
> >if he did... ;)
> 
> Do you think there is a good argument to build a common code base and
> data representation for client side programs in Java?

Absolutely -- I think there's a good argument to build a common code
base and data representation in any one language.  Now we get back to
one of my earlier thoughts:  The server is *already* written in Ruby!
And Ruby is a proper Free language.  So use it by default, unless you've
got a damn good reason.

> >Free Java's performance is a joke.  I genuinely wish this were not
> >the case -- because I love what the GNU CLASSPATH people are doing --
> >but the native output of gcj doesn't hold a candle to Sun's JIT-ed
> >JVM bytecode.  (If you don't believe me, try a natively compiled
> >Eclipse...  It's almost funny.)
> 
> I guess that is an issue with the implementation you used. If you
> install ubuntu dapper, enable universe, then install Eclipse, it will
> install bytecode eclipse with eclipse's AWT compiled against GTK, run
> on the GIJ virtual machine. GIJ performs just-in-time compilation of
> the same code as GCC.

You're right, I did not do my Eclipse test with the latest and greatest
GNU tools (as of now, May 2006).  I'm glad the performance has improved.

> You will find that on a P3-500 with 500M ram, performance is good. Not
> blistering, but certainly no slouch. Certainly usable. Everything else
> being equal, if the code was pre-compiled, it *should* be faster.

That's not always true, since having the high level source around during
execution provides more optimization opportunities than you have at the
machine code level.  (This is one of Sun's dirty little secrets:  The
JVM is fast because JVM byte code *is* basically Java.  I mean really,
what CPU in the world has a "lookup and execute static class method"
*instruction*?!)

A good Common Lisp interpreter can be extremely performant precisely
because it has the high level source around to make crazy execution path
optimizations, smarter garbage collection, etc.  You lose some of this
capability when translating to a (real) native binary.  Ruby's
interpreter is gradually adding more of this as well, especially since
Ruby works against the AST and doesn't bother "compiling" in the
traditional sense.

> >Oh, and why do we want statically linked binaries anyway?  Ruby's
> >interpreter works on the AST so it doesn't really get you more than
> >code obfuscation, which is stupid for Free software anyway.  Or maybe
> >we want the convenience of not requiring a hundred supporting
> >downloads -- library JAR's or interpreters / VM's. 
> 
> If we made a statically linked binary, the end user would not need a
> VM.  They would not need JARs. They simply have a piece of code to
> run. Just click on the EXE. No external dependencies. The code is
> genuine native code.

But we can achieve the same thing in a higher level language, as long as
we are legally allowed to "bundle everything" (Free).  It's worth noting
how few typical users even notice that a Java program like LimeWire or
Azureus launches a VM and loads a pile of JARs when it runs.

		Ben




