More exactly, the 3-shears procedure is:

1. Shear the bitmap horizontally 45 degrees to the right; the X-axis stays the same, the Y-axis tilts 45 degrees to the right.
2. Shear the bitmap vertically 45 degrees down; the Y-axis drops down 45 degrees to horizontal, the X-axis drops down 45 degrees below horizontal.
3. Shear the bitmap horizontally 45 degrees to the left; the Y-axis remains horizontal, the X-axis shifts 45 degrees to become vertical.

At 01:51 PM 7/2/2017, Henry Baker wrote:
If you have high speed raster graphics hardware, you may already have the capability for 2D transpose.
There's also the famous (at least in computer graphics circles) "three shears" algorithm for image rotation by 90 degrees without any loss of information:
1. Shear the image 45 degrees to the right.
2. Shear the image 90 degrees down.
3. Shear the image 45 degrees to the left.
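[Not from the thread -- a toy Python sketch of the three shears, treating the bitmap as a set of integer lattice points. Each shear is a bijection on the integer grid, which is why no pixels are ever lost or duplicated; the function names are mine.]

```python
def shear_x(pts, k):
    # horizontal shear: x -> x + k*y; the X-axis (y = 0) stays fixed
    return {(x + k * y, y) for (x, y) in pts}

def shear_y(pts, k):
    # vertical shear: y -> y + k*x; the Y-axis (x = 0) stays fixed
    return {(x, y + k * x) for (x, y) in pts}

def rotate90cw(pts):
    # right, down, left -- composing to (x, y) -> (y, -x),
    # a lossless 90-degree clockwise rotation
    return shear_x(shear_y(shear_x(pts, 1), -1), 1)

bitmap = {(0, 0), (1, 0), (2, 0), (0, 1)}   # an L-shaped blob
assert rotate90cw(bitmap) == {(0, 0), (0, -1), (0, -2), (1, 0)}
```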
The problem is that step #2 is usually just as hard in most memory systems as transposition, unless the hardware is set up to pick up word slices -- e.g., give me the 13th bit each of 32 words arranged as a 32-bit word.
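[Lacking that hardware, a bit-slice gather is just a scalar loop -- here's a toy Python version of "give me the 13th bit of each of 32 words", for concreteness. What hardware support looks like, for instance, is x86's PMOVMSKB, which gathers the sign bit of each byte into a mask.]

```python
def bit_slice(words, k):
    # gather bit k of each word: bit i of the result is bit k of words[i]
    out = 0
    for i, w in enumerate(words):
        out |= ((w >> k) & 1) << i
    return out

# bit 13 set in the even-indexed words only
words = [1 << 13 if i % 2 == 0 else 0 for i in range(32)]
assert bit_slice(words, 13) == 0x55555555
```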
At 01:10 PM 7/2/2017, rcs@xmission.com wrote:
Transposition can be done in log2(N) passes over the matrix. Imagine cutting your matrix into quarters, and exchanging the upper-right quarter with the lower-left quarter. Next, do the same thing within each quarter. Etc. If N isn't a power of 2, you may have to round it up; I'm not sure.
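[Rich's quarter-exchange, with the swaps at each scale done by masks across whole words, is the classic bit-matrix transpose -- it appears in Hacker's Delight as transpose32. A Python translation, taking bit (31 - j) of A[i] as entry (i, j):]

```python
def transpose32(A):
    # A: 32 ints, each a 32-bit row; bit (31 - j) of A[i] is entry (i, j).
    # Pass 1 exchanges the off-diagonal 16x16 quarters, pass 2 the 8x8
    # quarters within each half, and so on: log2(32) = 5 passes in all.
    m = 0x0000FFFF
    j = 16
    while j:
        k = 0
        while k < 32:
            # swap the masked bits of A[k] with the bits j places up in A[k+j]
            t = (A[k] ^ (A[k + j] >> j)) & m
            A[k] ^= t
            A[k + j] ^= t << j
            k = (k + j + 1) & ~j        # skip rows already paired
        j >>= 1
        m ^= m << j
    return A

A = [0] * 32
A[2] = 1 << (31 - 5)                       # single 1 at (row 2, col 5)
assert transpose32(A)[5] == 1 << (31 - 2)  # it lands at (row 5, col 2)
```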
Rich
----- Quoting Henry Baker <hbaker1@pipeline.com>:
Thanks, Adam.
I'm already doing some of what you suggest.
Is there any cleverness for *transposing* such bit matrices ?
(As an aside, do *least squares* have any relevance over GF(2)?)
Note that Gaussian elimination on extremely long vectors (much longer than the rank of the system) is faster if you invert the small square (rank x rank) matrix by itself, and then propagate that info to the rest of each row.
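[A sketch of that trick in Python -- the helper and its name are mine, with rows as big ints and the square block in the low n bits. Gauss-Jordan runs on the small block alone while recording the row transform T (so T*A = I); the expensive full-width XORs happen only once at the end, when T is applied to the long rows.]

```python
from functools import reduce
from operator import xor

def gf2_block_eliminate(rows, n):
    # rows: Python ints; bits 0..n-1 of each row form the square block A
    # (bit j = column j), the higher bits are the long tail of that row.
    # Assumes the n x n block is invertible over GF(2).
    A = [r & ((1 << n) - 1) for r in rows]
    T = [1 << i for i in range(n)]            # row transform, starts as I
    for col in range(n):                      # Gauss-Jordan on the block only
        piv = next(i for i in range(col, n) if (A[i] >> col) & 1)
        A[col], A[piv] = A[piv], A[col]
        T[col], T[piv] = T[piv], T[col]
        for i in range(n):
            if i != col and (A[i] >> col) & 1:
                A[i] ^= A[col]
                T[i] ^= T[col]
    # Now T @ A = I; propagate T to the full-length rows in one pass --
    # each output row is the XOR of the input rows selected by a row of T.
    return [reduce(xor, (rows[j] for j in range(n) if (T[i] >> j) & 1), 0)
            for i in range(n)]

rows = [0b1010_11, 0b0110_01]     # 2x2 block (low 2 bits) + 4-bit tails
out = gf2_block_eliminate(rows, 2)
assert [r & 0b11 for r in out] == [0b01, 0b10]   # block reduced to identity
```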
Is there any cleverness that might be achieved if one had access to a *polynomial* multiply (and/or divide) operation over GF(2)[x] ? I seem to recall that this sort of HW is finally starting to show up as a "carryless multiply" operation.
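[Yes: x86's PCLMULQDQ (the CLMUL extension, in mainstream CPUs since around 2010) is exactly that carryless multiply. The operation itself is tiny -- a toy Python version, with XOR playing the role of addition:]

```python
def clmul(a, b):
    # carryless multiply: the polynomial product over GF(2)[x],
    # i.e. schoolbook multiplication with XOR in place of ADD
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

# (x + 1)^2 = x^2 + 1 over GF(2): the cross term vanishes, since 2x = 0
assert clmul(0b11, 0b11) == 0b101
```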
At 11:06 AM 7/2/2017, Adam P. Goucher wrote:
Yes, if the matrix has <= 64 rows, then you can store the columns as 64-bit integers and perform column operations using bitwise XOR. Also, you can determine the uppermost and lowermost non-zero elements in a column using the __builtin_clzll() intrinsic and variants thereof.
This generalises, of course; with N rows, you can store each column as a vector of ceil(N/64) 64-bit integers and perform column operations with as many instructions. With SIMD, you can actually get away with as few as ceil(N/256) instructions per column operation.
The fact that there's only one non-zero element of GF(2) is also helpful since there's no need to multiply the columns -- add/subtract (i.e. XOR) is the only operation you ever need to perform!
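[For concreteness, here is that scheme in miniature -- a sketch, not Adam's actual code: Python ints stand in for the 64-bit column words, and int.bit_length() stands in for the count-leading-zeros intrinsic.]

```python
def gf2_rank(cols):
    # cols: list of columns, each an int with bit i = entry in row i
    pivots = {}                          # pivot row -> the column owning it
    rank = 0
    for c in cols:
        while c:
            top = c.bit_length() - 1     # uppermost non-zero entry (cf. clz)
            if top not in pivots:
                pivots[top] = c          # fresh pivot row: rank goes up
                rank += 1
                break
            c ^= pivots[top]             # one XOR = one whole column operation
    return rank

# columns (1,1) and (0,1) are independent; the third is their sum
assert gf2_rank([0b11, 0b10, 0b01]) == 2
```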
I'm using exactly this procedure in my PhD research on topological data analysis. In particular, computing the persistent homology barcode from the boundary map is (a slightly modified version of) Gaussian elimination over GF(2).
Best wishes,
Adam P. Goucher
Sent: Sunday, July 02, 2017 at 6:29 PM
From: "Henry Baker" <hbaker1@pipeline.com>
To: math-fun@mailman.xmission.com
Subject: [math-fun] Any cool hacks for Gaussian elimination over GF(2) ?
Other than Strassen?
I.e., are there any bit-twiddling hacks that are particular to GF(2) for Gaussian elimination?