Here's Intel's latest & greatest:

http://www.theregister.co.uk/2016/03/31/intel_broadwell_ep_xeon_e5_2600_v4/

Intel's Broadwell Xeon E5-2600 v4 chips: So what's in it for you, smartie-pants coders
New instructions, transactions, virtualization features and more

---

Knuth also proposed some cool bit vector/matrix instructions, but I can't find the link just now.

Vaughan Pratt showed that very long bit vector ("Boolean vector") instructions can be surprisingly powerful.

---

Intel, AMD, nVidia may very well have undocumented and/or firmware-additional instructions specially designed by NSA, but unavailable to the unwashed. If NOBUS (Google it) is going to mean anything for brute force attacks, then NOBUS brute force attacks might as well have a 10x advantage over everyone else's.

nVidia's new ARM chips have a 'soft' architecture, in which machine instructions are merely 'suggestions' about which operations to really perform. Although this 'advance' is touted as a way to increase performance -- e.g., in automotive video applications for driverless cars -- it is likely the world's least secure architecture. God only knows what's really going on inside these chips.

At 10:57 AM 4/1/2016, Tom Knight wrote:
This "binary matrix multiply" is one of the key instructions requested by NSA in the supercomputers designed by Cray etc. It's somewhat remarkable that it has not been implemented in more mainstream architectures. I'd be surprised if it is missing from the NVIDIA instruction set, for example, since they are successors of the supercomputer crowd.
On Apr 1, 2016, at 1:46 PM, Warren D Smith <warren.wds@gmail.com> wrote:
On 4/1/16, Warren D Smith <warren.wds@gmail.com> wrote:
Be nice if one could create an NxN bit-matrix (for example, N = the machine's word size) and then multiply it (mod 2) by an N-element bit vector (e.g. a machine word) in 1 instruction.
This would help with cryptography, error correcting codes, and probably a lot of other stuff.
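For concreteness, here is a minimal software sketch of what such an instruction would compute. The word size N = 8, the row-per-integer layout, and the name matvec_gf2 are assumptions chosen for illustration, not any real instruction's interface. The example uses binary-to-Gray-code conversion, which is one linear map over GF(2).

```python
N = 8  # hypothetical word size; a real machine word would be 32 or 64 bits

def matvec_gf2(rows, v):
    """Multiply an N x N bit matrix by an N-bit vector over GF(2).

    rows[i] holds row i as an integer; bit j of rows[i] is entry (i, j).
    Bit i of the result is the parity of (rows[i] AND v).
    """
    out = 0
    for i, row in enumerate(rows):
        parity = bin(row & v).count("1") & 1  # XOR of the selected bits
        out |= parity << i
    return out

# Row i selects bits i and i+1, so output bit i = v_i XOR v_(i+1),
# which is the same map as v ^ (v >> 1).
gray_rows = [(1 << i) | ((1 << (i + 1)) if i + 1 < N else 0) for i in range(N)]

print(format(matvec_gf2(gray_rows, 0b00000110), "08b"))  # prints 00000101
```

The loop does N AND-plus-parity steps; the proposed instruction would do all of them at once.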
--permuting the bits in a word would be another application.
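A bit permutation is the special case where the matrix has exactly one 1 in each row and each column. A rough sketch, assuming an 8-bit word and a hypothetical helper permute_bits, where perm[i] names the source bit for output bit i:

```python
N = 8  # hypothetical word size

def permute_bits(perm, v):
    """Permute the bits of v using a permutation matrix over GF(2).

    perm[i] is the index of the input bit that lands in output bit i,
    so row i of the matrix is the single-bit mask 1 << perm[i].
    """
    rows = [1 << p for p in perm]
    out = 0
    for i, row in enumerate(rows):
        out |= (bin(row & v).count("1") & 1) << i  # parity of one bit is that bit
    return out

# Bit reversal: output bit i comes from input bit N-1-i.
reverse = [N - 1 - i for i in range(N)]

print(format(permute_bits(reverse, 0b00001011), "08b"))  # prints 11010000
```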
Also, if the XORs used to add up the sums mod 2 were replaced by plain ORs, then that other kind of matrix-vector "multiply" would be excellent for graph connectivity calculations.
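To illustrate the OR variant: if row i of the matrix is the adjacency bitmask of vertex i, then one OR-AND "multiply" by a vertex set gives every vertex adjacent to that set. The sketch below assumes a small undirected 5-vertex graph, and the names or_matvec and component are made up for the example. Repeating the multiply until nothing changes yields a connected component.

```python
def or_matvec(rows, v):
    """Boolean (OR-AND) matrix-vector product.

    Bit i of the result is 1 if rows[i] AND v has any bit set,
    i.e. if vertex i is adjacent to some vertex in the set v.
    """
    out = 0
    for i, row in enumerate(rows):
        if row & v:
            out |= 1 << i
    return out

def component(adj, start):
    """Bitmask of the vertices reachable from start in an undirected graph."""
    reached = 1 << start
    while True:
        grown = reached | or_matvec(adj, reached)
        if grown == reached:  # fixed point: no new vertices were added
            return reached
        reached = grown

# Undirected edges 0-1, 1-2, 3-4; adj[i] is the neighbor bitmask of vertex i.
adj = [0b00010, 0b00101, 0b00010, 0b10000, 0b01000]

print(format(component(adj, 0), "05b"))  # prints 00111
print(format(component(adj, 3), "05b"))  # prints 11000
```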