* rcs@xmission.com <rcs@xmission.com> [May 04. 2009 11:50]:
[...]
Xor Multiply
Intel is introducing the PCLMULQDQ instruction, which does a 64x64->128 bit Xor-Multiplication. (Their naming conventions remind me of the Tom Lehrer song about the Red Line.) This instruction is helpful in several crypto applications, and also in working with error correcting codes.
Puzzle: a) Use the new instruction in a routine to reverse 64 bits. This swaps the sign bit with the low order bit, etc.
Uhm, dunno. And I'd like to know. I use a method similar to the code given for (b), it needs less than 10 cycles if a byte-swap instruction is available.
b) The new instruction can obviously be used to Xor-Square a 64-bit quantity. Use it to do "square root": collect the odd numbered bits 63-1 in the left half of the word, and the even numbered bits 62-0 in the right half.
Using 8 bit words here: squaring: abcdABCD --> a0b0c0d0,A0B0C0D0 compute aAbBcCdD == a0b0c0d0 XOR ( A0B0C0D0 >> 1 ) [I call this operation bit-zip] repeat to get acACbdBD (which is what we want) I doubt this wil be much better than the well known methods, unless immediate operands are expensive (or unavailable, as in yucky SSE). Here is essentially what I use: static inline ulong bit_unzip(ulong x) // Return word with even indexed bits in lower half // and odd indexed bits in higher half. { ulong y = (x >> 1) & 0x5555555555555555UL; x &= 0x5555555555555555UL; x = (x | (x>>1)) & 0x3333333333333333UL; y = (y | (y>>1)) & 0x3333333333333333UL; x = (x | (x>>2)) & 0x0f0f0f0f0f0f0f0fUL; y = (y | (y>>2)) & 0x0f0f0f0f0f0f0f0fUL; x = (x | (x>>4)) & 0x00ff00ff00ff00ffUL; y = (y | (y>>4)) & 0x00ff00ff00ff00ffUL; x = (x | (x>>8)) & 0x0000ffff0000ffffUL; y = (y | (y>>8)) & 0x0000ffff0000ffffUL; x = (x | (x>>16)) & 0x00000000ffffffffUL; y = (y | (y>>16)) & 0x00000000ffffffffUL; x |= (y<<32); return x; }
c) Extra credit: describe the relationship with card shuffling.
one step as above (my 'bit-zip') is an outer shuffle(?)
d) Is PCLMULQDQ useful in implementing an arbitrary pre-specified bit permutation?
That would be nice 8-) A few important permutations (like zip/unzip or rev-bin, which have clean decompositions as Kronecker products) can be obtained with methods similar to the above.
Rich
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun