I sketch a method (perhaps what FxtBook had in mind?) which should be able to perform an arbitrary fixed permutation of the W bits in a W-bit-long word in O(logW) steps. No future or weird machine instructions are needed -- the present non-weird ones will suffice. We break the problem down into SUBPROBLEMS of the following form: Divide the W-long word into chunks of lengths P,P,P,...,P where P is a power of 2. The even- and odd-numbered chunks are mentally red and blue. Then within 2 adjacent such chunks rrrrrrrrbbbbbbbb (here shown with P=8) the kth "r" is swapped (or not) with the kth "b" for all chunk-pairs and all k simultaneously, with independent swap-or-not decisions made for each bit-pair according to specification by certain W-bit long constants. It is known from the theory of telephone switching networks (e.g. the "Benes network" see http://en.wikipedia.org/wiki/Clos_network) that 2*lgW subproblem solves with P=W/2, W/4, ..., 4,2,1, 1,2,4, ..., W/4, W/2 successively will suffice to route any permutation, and it is fairly easy to figure out how using binary encodings. So: how can we solve that subproblem? Recall the well known (to those who know it) way to swap two words A, B without any temporary storage: A = XOR(A,B); B = XOR(A,B); A = XOR(A,B). Recall also the even-better-known fact that XOR(A,0)=A. So if we do A = XOR( A, B AND m); B = XOR(A AND m, B); A = XOR(A, B AND m) where m is a bitmask, then the mask=1 bits in A and B are swapped, and the rest left unswapped. Now in our problem "A" and "B" actually are different (red & blue) parts of the *same* machine word, but that is why God invented the arbitrary-hops bit-shift instruction and other bit masks specifying the coloring. QED[algorithm construction]. Now I also remark that this solution is OPTIMAL (up to a constant factor) in runtime because lg(W!) bits are needed to specify the permutation, i.e. at least lg(W) different W-bit long words are needed to specify it, and any machine which can only deal with a constant number of machines words at a time, necessarily will consume order log(W) timesteps just to see the specification. QED[optimality]. -- Warren D. Smith http://RangeVoting.org