From what I understand, NAND flash memory consists of rather large "eraseblocks" (a.k.a. "gcblocks") of perhaps 64KBytes, which can be read and written ("programmed") in smaller pages of perhaps 512 bytes. The key point is that programming can only change 1 bits to 0 bits, so the "write" operation here is really an AND with the block's existing contents; only an erase operation, which works on a whole eraseblock, resets the block to all 1's.
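To make these semantics concrete, here's a toy Python sketch of the model I have in mind (the 64KB/512-byte figures and all of the names here are just my assumptions, not any real controller's interface):

# Toy model of NAND flash semantics: programming can only clear bits,
# so a "write" is an AND with the existing contents; only an erase
# resets the whole block to 1's (and causes wear).

PAGE_SIZE  = 512          # bytes per programmable page (assumed)
BLOCK_SIZE = 64 * 1024    # bytes per eraseblock (assumed)
PAGES_PER_BLOCK = BLOCK_SIZE // PAGE_SIZE

class EraseBlock:
    def __init__(self):
        self.data = bytearray(b'\xff' * BLOCK_SIZE)   # erased = all 1's
        self.erase_count = 0                          # "wear" counter

    def program(self, page, payload):
        """Program one page: new contents = old contents AND payload."""
        assert len(payload) == PAGE_SIZE
        off = page * PAGE_SIZE
        for i, b in enumerate(payload):
            self.data[off + i] &= b

    def read(self, page):
        off = page * PAGE_SIZE
        return bytes(self.data[off:off + PAGE_SIZE])

    def erase(self):
        """Erasing is the only operation that causes wear."""
        self.data = bytearray(b'\xff' * BLOCK_SIZE)
        self.erase_count += 1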
Obviously, the usual "FTL" and file system layers cover all of this up -- for good reasons, including bad-block handling and "wear leveling", i.e., avoiding erasing any one block too many times. The question here is whether the ability to perform a simple logic operation -- AND, in this case -- can be put to good use. (Let's ignore for the moment that these devices are slower than RAM memories.)

The most obvious use which comes to mind is a bitvector for keeping track of which eraseblocks are free; the most obvious convention would be that a "1" bit indicates that the block is free! Thus, a single 64KB eraseblock (2^19 bits) could keep track of the free/allocated status of 2^19 x 64KB = 32GB of flash memory.

Note that simply _reading_ or _AND'ing into memory_ doesn't cause any "wear"; the wear is caused by _erasing_ a complete eraseblock. So in order to make good use of these blocks, we have to arrange our data structures so that things which all get erased at the same time live in the same eraseblock, and things which need to be erased independently live in separate eraseblocks.

Next, we notice that algorithms which live in a lattice and which compute LUBs or GLBs can be mapped onto such a memory -- e.g., lattice elements that are long bitvectors. If each bitvector occupied a separate eraseblock, then an optimizing algorithm that computed a GLB (or, with a complemented encoding, a LUB) could be mapped onto such a flash memory, and would only cause "wear" at the end of each computation, when the bitvectors are reset.

We could also do Connection Machine-style bit-serial processing with 2^19 one-bit processors, but for wear leveling we would treat the memory as WORM (write-once, read-many), and recycle the eraseblocks only after all of them had been used at least once. (Note that in CM-style processing, bitvector i holds the state of the i'th memory bit of all 2^19 processors in a SIMD array. Also, any ALU ops are "free", because they can be performed by the processor accessing the NAND flash memory.)

Can we do faster bit-matrix multiplication? Vaughan Pratt wrote a paper about long-bitvector machines, but I'm not sure how much of that would carry (!) over to this model. Can lattices other than Boolean lattices be emulated with these long AND operations? Any other ideas?
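Here's a rough sketch of the free-block bitmap (again Python, names hypothetical). Allocating a block is just an AND-write that clears one bit, so it causes no wear; the 2^19 bits of one 64KB block cover 2^19 x 64KB = 32GB of eraseblocks. Only marking blocks free again requires erasing the bitmap block:

# Free-block bitmap kept in a single 64KB eraseblock: bit i == 1 means
# eraseblock i is free.  Clearing a bit is an ordinary program (AND)
# operation; setting bits back to 1 would require an erase.

BITMAP_BYTES   = 64 * 1024                    # one eraseblock
BLOCKS_TRACKED = BITMAP_BYTES * 8             # 2**19 = 524,288 blocks
TRACKED_BYTES  = BLOCKS_TRACKED * 64 * 1024   # 2**35 bytes = 32GB

bitmap = bytearray(b'\xff' * BITMAP_BYTES)    # freshly erased: all free

def allocate(i):
    """Claim eraseblock i by clearing its bit (a wear-free AND-write)."""
    bitmap[i // 8] &= ~(1 << (i % 8)) & 0xff

def is_free(i):
    return bool(bitmap[i // 8] & (1 << (i % 8)))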
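And a sketch of the lattice idea, reusing the hypothetical EraseBlock class above. The accumulator block starts erased (all 1's = the top element), and each element is simply programmed over it; since programming is an AND, the block ends up holding the GLB of everything written into it, and wear occurs only once per computation, when the accumulator is (re-)erased:

# Meet (GLB) of long bitvectors, accumulated directly in flash.
# With a complemented encoding the same trick computes a LUB.

def glb_in_flash(block, elements):
    """elements: iterable of BLOCK_SIZE-byte bitvectors."""
    block.erase()                       # accumulator := top (all 1's)
    for vec in elements:
        for page in range(PAGES_PER_BLOCK):
            block.program(page, vec[page * PAGE_SIZE:(page + 1) * PAGE_SIZE])
    return bytes(block.data)            # reading back is wear-free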
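Finally, a sketch of the CM-style bit-serial arithmetic, with each bit-plane modeled as a 2^19-bit Python integer. The host processor does the Boolean ops "for free"; in the WORM scheme each result plane would be written once into a fresh, already-erased block, and blocks would be recycled only after all of them had been used:

# Bit-serial SIMD addition: plane j of an n-bit variable is one
# 2**19-bit vector holding bit j for all 2**19 virtual processors.
# Both operands are assumed to have the same number of planes.

NPROC = 1 << 19                     # one virtual processor per bit
MASK  = (1 << NPROC) - 1

def bit_serial_add(a_planes, b_planes):
    """Ripple-carry add, one bit-plane at a time (LSB plane first)."""
    carry = 0
    out = []
    for a, b in zip(a_planes, b_planes):
        out.append((a ^ b ^ carry) & MASK)        # sum plane
        carry = (a & b) | (carry & (a ^ b))       # carry plane
    out.append(carry & MASK)                      # final carry-out plane
    return out                                    # each plane -> one WORM write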
participants (2): Henry Baker, Whitfield Diffie