[math-fun] reversing the order of bits in a word

older
[math-fun] errors by WDS corrected

Warren D Smith

28 Mar 2014 28 Mar '14

10:55 a.m.

Let w be a 64-bit word ("unsigned long long," in C language). The following C instruction sequence w ^= w>>32; w ^= w<<32; w ^= w>>32; w ^= w>>16; w ^= w<<16; w ^= w>>16; w ^= w>>8; w ^= w<<8; w ^= w>>8; w ^= w>>4; w ^= w<<4; w ^= w>>4; w ^= w>>2; w ^= w<<2; w ^= w>>2; w ^= w>>1; w ^= w<<1; w ^= w>>1; will overwrite w with its bit-order-reversal. In general for a (2^k)-bit long word this will use 3*k shifts and 3*k XOR operations and with zero(?) or one(?) extra registers consumed for temp storage and no funny masking constants needed. -- Warren D. Smith http://RangeVoting.org

Show replies by date

Tom Rokicki

28 Mar 28 Mar

11:01 a.m.

Pretty long dependency chain there. On Mar 28, 2014 9:56 AM, "Warren D Smith" <warren.wds@gmail.com> wrote:

...

Let w be a 64-bit word ("unsigned long long," in C language). The following C instruction sequence

w ^= w>>32; w ^= w<<32; w ^= w>>32; w ^= w>>16; w ^= w<<16; w ^= w>>16; w ^= w>>8; w ^= w<<8; w ^= w>>8; w ^= w>>4; w ^= w<<4; w ^= w>>4; w ^= w>>2; w ^= w<<2; w ^= w>>2; w ^= w>>1; w ^= w<<1; w ^= w>>1;

will overwrite w with its bit-order-reversal. In general for a (2^k)-bit long word this will use 3*k shifts and 3*k XOR operations and with zero(?) or one(?) extra registers consumed for temp storage and no funny masking constants needed.

-- Warren D. Smith http://RangeVoting.org

_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun

Marc LeBrun

11:16 a.m.

IIRC hackersdelight.org has some discussion of this. --MLB

Tom Karzes

11:35 a.m.

New subject: [math-fun] reversing the order of bits in a word

This doesn't work. Try it will all 1's for instance. The final result ends up with lots of 0's in it. The first stage works fine, but after that some masking is needed to prevent crossover between the groups. In fact, you can easily see this by running all 1's through the final stage. With just 4 bits, you have: 1111 (initial pattern) 1000 (should be 1010) 1000 (should be 1010) 1100 (should be 1111) Tom Warren D Smith writes:

...

Let w be a 64-bit word ("unsigned long long," in C language). The following C instruction sequence

w ^= w>>32; w ^= w<<32; w ^= w>>32; w ^= w>>16; w ^= w<<16; w ^= w>>16; w ^= w>>8; w ^= w<<8; w ^= w>>8; w ^= w>>4; w ^= w<<4; w ^= w>>4; w ^= w>>2; w ^= w<<2; w ^= w>>2; w ^= w>>1; w ^= w<<1; w ^= w>>1;

will overwrite w with its bit-order-reversal. In general for a (2^k)-bit long word this will use 3*k shifts and 3*k XOR operations and with zero(?) or one(?) extra registers consumed for temp storage and no funny masking constants needed.

-- Warren D. Smith http://RangeVoting.org

_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun

Tom Duff

11:49 a.m.

Nice. I've never seen it before. (In particular, it's not in Hacker's Delight.) On Fri, 28 Mar 2014, Warren D Smith wrote:

...

Let w be a 64-bit word ("unsigned long long," in C language). The following C instruction sequence

w ^= w>>32; w ^= w<<32; w ^= w>>32; w ^= w>>16; w ^= w<<16; w ^= w>>16; w ^= w>>8; w ^= w<<8; w ^= w>>8; w ^= w>>4; w ^= w<<4; w ^= w>>4; w ^= w>>2; w ^= w<<2; w ^= w>>2; w ^= w>>1; w ^= w<<1; w ^= w>>1;

will overwrite w with its bit-order-reversal. In general for a (2^k)-bit long word this will use 3*k shifts and 3*k XOR operations and with zero(?) or one(?) extra registers consumed for temp storage and no funny masking constants needed.

-- Tom Duff. We can give no comment on this as it is too scientific.

Joerg Arndt

11:50 p.m.

* Warren D Smith <warren.wds@gmail.com> [Mar 29. 2014 06:08]:

...

Let w be a 64-bit word ("unsigned long long," in C language).

unsigned long, unless you are using a platform whose makers are known imbeciles.

...

The following C instruction sequence

w ^= w>>32; w ^= w<<32; w ^= w>>32; w ^= w>>16; w ^= w<<16; w ^= w>>16; w ^= w>>8; w ^= w<<8; w ^= w>>8; w ^= w>>4; w ^= w<<4; w ^= w>>4; w ^= w>>2; w ^= w<<2; w ^= w>>2; w ^= w>>1; w ^= w<<1; w ^= w>>1;

will overwrite w with its bit-order-reversal.

Nope.

...

In general for a (2^k)-bit long word this will use 3*k shifts and 3*k XOR operations and with zero(?) or one(?) extra registers consumed for temp storage and no funny masking constants needed.

You can generate the masks needed dynamically. static inline ulong revbin(ulong x) { ulong s = 64 >> 1; ulong m = ~0UL >> s; while ( s ) { x = ( (x & m) << s ) ^ ( (x & (~m)) >> s ); s >>= 1; m ^= (m<<s); } return x; } The masks (m) are nothing mysterious. In binary (16 bits, dots for zeros): ........11111111 ....1111....1111 ..11..11..11..11 .1.1.1.1.1.1.1.1 Looks like binary counting to me.

...

-- Warren D. Smith [...]

As for machine instructions, assume we have a byte-reversing instruction (as in host<-->network ordering). On AMD64 (and intel clones) that would be (GCC inline asm) static inline ulong asm_bswap(ulong x) { asm ("bswap %0" : "=r" (x) : "0" (x)); return x; } Then revbin can be done quite cheaply as static inline ulong revbin(ulong x) { x = bit_swap_1(x); x = bit_swap_2(x); x = bit_swap_4(x); x = bswap(x); return x; } where static inline ulong bit_swap_1(ulong x) // Return x with neighbor bits swapped. { ulong m = 0x5555555555555555UL; return ((x & m) << 1) | ((x & (~m)) >> 1); } static inline ulong bit_swap_2(ulong x) // Return x with groups of 2 bits swapped. { ulong m = 0x3333333333333333UL; return ((x & m) << 2) | ((x & (~m)) >> 2); } static inline ulong bit_swap_4(ulong x) // Return x with groups of 4 bits swapped. { ulong m = 0x0f0f0f0f0f0f0f0fUL; return ((x & m) << 4) | ((x & (~m)) >> 4); } This is very fast in practice. Beware of table-lookup (byte-wise), it excels in benchmarks but will be a dog as soon as the table fails to be in first level cache. Best regards, jj (everything above is from the fxtbook) P.S.: something for the weekend: RLL word (lex) Fib word (Gray) 0: .1.1.1.1. ......... 1: .1.1.1.11 ........1 2: .1.1.11.. ......1.1 3: .1.1.11.1 ......1.. 4: .1.11..1. ....1.1.. 5: .1.11..11 ....1.1.1 6: .1.11.1.. ....1...1 7: .1.11.1.1 ....1.... 8: .1.11.11. ....1..1. 9: .11..1..1 ..1.1..1. 10: .11..1.1. ..1.1.... 11: .11..1.11 ..1.1...1 12: .11..11.. ..1.1.1.1 13: .11..11.1 ..1.1.1.. 14: .11.1..1. ..1...1.. 15: .11.1..11 ..1...1.1 16: .11.1.1.. ..1.....1 17: .11.1.1.1 ..1...... 18: .11.1.11. ..1....1. 19: .11.11..1 ..1..1.1. 20: .11.11.1. ..1..1... 21: .11.11.11 ..1..1..1 22: 1..1..1.. 1.1..1..1 23: 1..1..1.1 1.1..1... 24: 1..1..11. 1.1..1.1. 25: 1..1.1..1 1.1....1. 26: 1..1.1.1. 1.1...... 27: 1..1.1.11 1.1.....1 28: 1..1.11.. 1.1...1.1 29: 1..1.11.1 1.1...1.. 30: 1..11..1. 1.1.1.1.. 31: 1..11..11 1.1.1.1.1 http://jjj.de/fxt/demo/bits/#bit-rll2

Tom Karzes

29 Mar 29 Mar

12:28 a.m.

No, in gcc a "long" is 32-bits. That's why they introduced "long long", to support a 64-bit integer type without introducing compatibility problems or new keywords. The C standard requires only that "long" be at least 32 bits. At this point I would say gcc is the de facto reference implementation of C, and I would hardly call its creators imbeciles. Tom Joerg Arndt writes:

...

* Warren D Smith <warren.wds@gmail.com> [Mar 29. 2014 06:08]:

...
Let w be a 64-bit word ("unsigned long long," in C language).

unsigned long, unless you are using a platform whose makers are known imbeciles.

Joerg Arndt

1:52 a.m.

* Tom Karzes <karzes@sonic.net> [Mar 29. 2014 08:31]:

...

No, in gcc a "long" is 32-bits.

Not on a 64 bit system.

...

That's why they introduced "long long", to support a 64-bit integer type without introducing compatibility problems or new keywords.

... on 32 bit systems. (Sadly "long long" was _not_ promoted to 128 bits on any 64 bit system, bad!).

...

The C standard requires only that "long" be at least 32 bits. At this point I would say gcc is the de facto reference implementation of C, and I would hardly call its creators imbeciles.

I was talking about a certain Redmond based platform (requiring "unsigned long long" to get a full CPU register). The C standard mandates that char <= short <= int <= long <= long long and minimum ranges (yes, long >= 32 bit by that), see http://stackoverflow.com/questions/589575/size-of-int-long-etc

...

Tom

[...]

Best regards, jj

Tom Karzes

2:28 a.m.

On 64-bit Linux systems I think it can depend on the installation. I have used gcc on a 64-bit Linux PC where "long" was still just 32 bits (which I hadn't expected). I agree that it makes more sense for it to be 64 bits on a 64-bit platform, but I don't think you can depend on that. I think it's generally preferred to include <stdint.h>, then use uint64_t for unsigned 64-bit integers. There's also uint_least64_t (small, >= 64 bits) and uint_fast64_t (fast, >= 64 bits). Tom Joerg Arndt writes:

...

* Tom Karzes <karzes@sonic.net> [Mar 29. 2014 08:31]:

...
No, in gcc a "long" is 32-bits.

Not on a 64 bit system.

...
That's why they introduced "long long", to support a 64-bit integer type without introducing compatibility problems or new keywords.

... on 32 bit systems.

(Sadly "long long" was _not_ promoted to 128 bits on any 64 bit system, bad!).

...
The C standard requires only that "long" be at least 32 bits. At this point I would say gcc is the de facto reference implementation of C, and I would hardly call its creators imbeciles.

I was talking about a certain Redmond based platform (requiring "unsigned long long" to get a full CPU register).

The C standard mandates that char <= short <= int <= long <= long long and minimum ranges (yes, long >= 32 bit by that), see http://stackoverflow.com/questions/589575/size-of-int-long-etc

...
Tom

[...]

Best regards, jj

Joerg Arndt

5:20 a.m.

* Tom Karzes <karzes@sonic.net> [Mar 29. 2014 12:05]:

...

On 64-bit Linux systems I think it can depend on the installation. I have used gcc on a 64-bit Linux PC where "long" was still just 32 bits (which I hadn't expected).

You grabbed a 32 bit version (the CPU then runs in 32 bit mode). There should really be a warning like "You are about to install a 32 bit version of Linux. A 64 bit version is available as well. (...)" But apparently no distro does this.

...

I agree that it makes more sense for it to be 64 bits on a 64-bit platform, but I don't think you can depend on that.

For general purpose CPUs I am only aware of Windoze doing the silly thing (apparently in order to be able to compile their software without cleaning the source code).

...

I think it's generally preferred to include <stdint.h>, then use uint64_t for unsigned 64-bit integers. There's also uint_least64_t (small, >= 64 bits) and uint_fast64_t (fast, >= 64 bits).

Yes.

...

Tom

Best regards, jj

...

[...]

Tom Karzes

6:12 a.m.

It was a work computer, but I know it ran native 64-bit apps. It may have been a 32-bit gcc install though. Tom Joerg Arndt writes:

...

* Tom Karzes <karzes@sonic.net> [Mar 29. 2014 12:05]:

...
On 64-bit Linux systems I think it can depend on the installation. I have used gcc on a 64-bit Linux PC where "long" was still just 32 bits (which I hadn't expected).

You grabbed a 32 bit version (the CPU then runs in 32 bit mode).

There should really be a warning like "You are about to install a 32 bit version of Linux. A 64 bit version is available as well. (...)" But apparently no distro does this.

...
I agree that it makes more sense for it to be 64 bits on a 64-bit platform, but I don't think you can depend on that.

For general purpose CPUs I am only aware of Windoze doing the silly thing (apparently in order to be able to compile their software without cleaning the source code).

...
I think it's generally preferred to include <stdint.h>, then use uint64_t for unsigned 64-bit integers. There's also uint_least64_t (small, >= 64 bits) and uint_fast64_t (fast, >= 64 bits).

Yes.

...
Tom

Best regards, jj

...
[...]

John Aspinall

10:18 a.m.

New subject: [math-fun] how long is long [was: reversing the order of bits in a word]

Speaking as someone who does this as his "day job", Joerg is right about longs, on 64 bit platforms, being 64 bits, except on Windows. The fairly standard compiler flag for this is "LP64", or some nearby variant (it's "_LP64" on gcc). LP64 is just shorthand for "longs and pointers are 64 bits". It also makes a good search string for more than you ever wanted to know on this subject. - John Aspinall

Tom Karzes

1:41 p.m.

New subject: [math-fun] how long is long [was: reversing the order of bits in a word]

It sounds like the admins had done a 32-bit gcc install on the 64-bit server I was using, which would explain the 32-bit long size I was seeing. I guess they wanted it to run 64-bit apps, but wanted anything built on it to run on 32-bit platforms. Or perhaps they just messed up the gcc install. Tom John Aspinall writes:

...

Speaking as someone who does this as his "day job", Joerg is right about longs, on 64 bit platforms, being 64 bits, except on Windows. The fairly standard compiler flag for this is "LP64", or some nearby variant (it's "_LP64" on gcc). LP64 is just shorthand for "longs and pointers are 64 bits". It also makes a good search string for more than you ever wanted to know on this subject. - John Aspinall

4255

Age (days ago)

4256

Last active (days ago)

List overview

Download

12 comments

7 participants

participants (7)

Joerg Arndt
John Aspinall
Marc LeBrun
Tom Duff
Tom Karzes
Tom Rokicki
Warren D Smith