Re: [math-fun] 32-bit reversal speed tests

9 Apr 2014

      * Warren D Smith <warren.wds@gmail.com> [Apr 09. 2014 18:51]:
...
On 4/9/14, Warren D Smith <warren.wds@gmail.com> wrote:
...
I tested several 32-bit reversors  for speed.
--and, the winning 32-bit reversor is actually slower than Arndt's
64-bit reversor that
uses assembler "bswap" inside.
Total futility.
If, in their infinite wisdom, the CPU makers would have
added a "reverse bits in every byte", we'd be free from this
business for all practical purposes.  Exactly _zero_
register pressure.  The additional wiring/gates cannot
possibly be nontrivial.

Now intel is looking at 512 bit regs and still their
instruction set's anti-orthogonality is a poo fest that
makes even the bad old x86 instruction set look structured.
Humanity weeps.

Let's write down (yes, yes, futility; all below ignoring
bswap) for some specific n. Butterfly paradigm from now on.

We can do 63 bits in 4 steps, as we have seen.

126 goes by [ 63, 63 ] in just one more step (swap those two
halves in one more step, total now 5 steps).

127 goes by [ 63, 1, 63 ] in just one more step (swap those
two halves in one more step, total now 5 steps).

128 goes by [ 1, 63, 63, 1 ] in just two steps (swap those
two 63's in one more step, and those two 1's in one more,
total now 6 steps).  That is likely the first power of 2
that one can do in less then log_2(n) steps.  But then,
these butterfly steps cost more than "cut out by masks and
shift-or together" as used in the straight forward routines.

129 (thanks for asking) goes by [ 1, 63, 1, 63, 1 ],
again 6 steps.

Yours futile,    jj
...
_______________________________________________
math-fun mailing list
math-fun@mailman.xmission.com
http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun

Re: [math-fun] 32-bit reversal speed tests

Joerg Arndt