[self-interest] Performance?

Thu Jul 1 16:51:10 UTC 2004

On Wednesday 30 June 2004 23:05, Michael Latta wrote:
> The Squeak list has postings for bytecodes/sec and sends/sec.  Are
> similar figures available for self?  Is there a benchmark that can be
> run to get similar figures?

I did a quick and dirty translation of the two Squeak benchmarks to Self 
and have attached it. One way to load it is to execute

   'tbench.self' _RunScript

and then get the result of

   0 tinyBenchmarks

> I would be interested in knowing the
> relative performance of the two systems.

Self for Linux 2.2 (which includes only the very poor non-inlining 
compiler) gets

   4,761,904 bytecodes/sec; 428,523 sends/sec

(I added the commas because I find these numbers very hard to read 
otherwise.) Running it a few more times doesn't change the results, 
since there is no recompilation. This was on a 600MHz Pentium III, 
where Squeak 3.2 shows these results

   51,405,622 bytecodes/sec; 1,561,049 sends/sec

For a 277MHz UltraSparc II, Self 4.2.1 yields

   95,238,095 bytecodes/sec; 16,202,747 sends/sec

while Squeak 3.6 says

   14,121,800 bytecodes/sec; 678,849 sends/sec

So unless there is something wrong in my translation, or the timing 
functions in either the Linux Self or the Sparc one, the lesson is 
clear: don't leave home without the optimizing compiler ;-)
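
To put the relative performance in one place, here is a small Python 
sketch (not part of the attached benchmark) that divides the Self 
figures by the Squeak figures for each machine. The numbers are the 
ones quoted above; the dictionary layout and names are just my own 
choice for illustration. Keep in mind that the Self and Squeak versions 
differ between the two machines, so this is only a rough comparison.

```python
# Figures quoted in this message: (bytecodes/sec, sends/sec).
results = {
    "600MHz Pentium III": {
        "self": (4_761_904, 428_523),        # Self for Linux, non-inlining compiler
        "squeak": (51_405_622, 1_561_049),   # Squeak 3.2
    },
    "277MHz UltraSparc II": {
        "self": (95_238_095, 16_202_747),    # Self 4.2.1
        "squeak": (14_121_800, 678_849),     # Squeak 3.6
    },
}

# Ratio above 1 means Self is faster, below 1 means Squeak is faster.
ratios = {}
for machine, r in results.items():
    bytecode_ratio = r["self"][0] / r["squeak"][0]
    send_ratio = r["self"][1] / r["squeak"][1]
    ratios[machine] = (bytecode_ratio, send_ratio)
    print(f"{machine}: Self/Squeak = {bytecode_ratio:.2f}x bytecodes, "
          f"{send_ratio:.2f}x sends")
```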

> Squeak seems to be running more than 1000 cycles / bytecode.  This
> seems quite high compared to the techniques used in the Self
> compiler, even with GC and compiler overhead.

People have been getting results of 200M bytecodes/sec or more with 
Squeak on 3GHz machines, which would be 15 clocks per bytecode. The 
Self numbers on the Ultra II (about 3 clocks per bytecode) show this 
could be improved around 5 times.
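
For anyone who wants to check the arithmetic, clocks per bytecode is 
just the clock rate divided by the bytecodes/sec figure. A minimal 
Python sketch, using only numbers quoted in this message (the function 
and variable names are made up for illustration):

```python
def clocks_per_bytecode(clock_hz, bytecodes_per_sec):
    """Average CPU clocks spent per bytecode executed."""
    return clock_hz / bytecodes_per_sec

# Squeak at 200M bytecodes/sec on a 3GHz machine.
squeak_3ghz = clocks_per_bytecode(3_000_000_000, 200_000_000)

# Self 4.2.1 on the 277MHz UltraSparc II.
self_ultra2 = clocks_per_bytecode(277_000_000, 95_238_095)

# How much the Squeak figure would have to improve to match.
improvement = squeak_3ghz / self_ultra2

print(f"Squeak, 3GHz:   {squeak_3ghz:.1f} clocks/bytecode")
print(f"Self, Ultra II: {self_ultra2:.1f} clocks/bytecode")
print(f"Ratio:          {improvement:.1f}x")
```

This ignores differences in what one clock accomplishes on each 
processor, so the ratio is only a rough guide.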

-- Jecel
-------------- next part --------------
traits integer _AddSlots: ( |
    tinyBenchmarks = (
        "Report the results of running the two tiny Squeak
         benchmarks in Self.
         ar 9/10/1999: Adjusted to run at least 1 sec to get
         more stable results.
         jaj jun-1-2004: ported to Self"
         "0 tinyBenchmarks"
         | t1. t2. r. n1 <- 1. n2 <- 28 |
         [t1: [n1 benchmark] time.
          t1 < 1000 ] whileTrue: [n1: n1 * 2].
         "Note: benchmark's runtime is about O(n)"
         [t2: [r: n2 benchFib] time.
          t2 < 1000 ] whileTrue: [n2: n2 + 1].
         "Note: benchFib's runtime is about O(k**n)
            where k is the golden number (1 + 5 sqrt) / 2 = 1.618...."

         ((n1 * 500000 * 1000) / t1) printString, ' bytecodes/sec; ',
         ((r * 1000) / t2) printString, ' sends/sec'
    ).
    benchFib = (
        "Handy send-heavy benchmark"
        "(result / seconds to run) = approx calls per second"
        < 2 ifTrue: [1]
            False: [(- 1) benchFib + (- 2) benchFib + 1]
    ).
    benchmark = (
        "Handy bytecode-heavy benchmark"
        "(500000 / time to run) = approx bytecodes per second"
        | size <- 8190. flags. prime. k. count |
        1 to: self Do:
            [|:iter |
            count: 0.
            flags: vector copySize: size+1 FillingWith: true.
                        "0 based array vs 1 based for Squeak"
            1 to: size Do:
                [|:i | (flags at: i) ifTrue:
                    [prime: i + 1.
                     k: i + prime.
                     [k <= size] whileTrue:
                        [flags at: k Put: false.
                         k: k + prime].
                     count: count + 1]]].
        count
    ).
| )