GPUs have suffered from a small amount of memory per processor; a few KB was typical in 2008. [Has this been fixed? I haven't looked at the latest offerings.] Data-hungry programs starve from inadequate bandwidth to the main computer memory. On the other hand, if your problem involves a lot of computing on a small amount of data (or a smallish internal state), GPUs win.
Examples include a lot of our counting puzzles (anyone for counting 6x6 magic squares?), and searches, such as for various combinatorics problems, Ramsey numbers, or perhaps game trees or Life patterns. Crypto keys seem an obvious target, but the 2^128 key space of AES is still way out of reach. DES, with only 2^56 keys, is a possibility, but roughly 72 quadrillion is still a pretty big number, even spread over 1000 processors.
ECM, using elliptic curves to search for modest factors of biggish numbers, looks promising. Several bignum state variables need to fit in that O(KB)/processor, so hugenums are out, but targets of a few hundred digits should work. ECM has found factors up to 10^70, although mostly it finds 40-50 digit factors.
NFS factoring (the Number Field Sieve) needs huge arrays: GB of randomly accessed memory. There are ways to turn the random access into sequential access -- batching the sieve updates and sorting them -- but the memory bandwidth problem remains.
It's possible that the Mersenne prime search could work with GPUs: the FFTs for the squaring could be subdivided, with most of the work being local to a single core. For the Pi work, the situation looks murky: most of the work in a big multiplication can be subdivided -- imagine Karatsuba farming out subproblems to GPU processors, or splitting up an FFT into pieces. But there's a fair amount of cross-communication required, which might clobber the GPU advantage.
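To make the Karatsuba remark concrete, here is a minimal sketch in plain Python (not GPU code). The function name `karatsuba` and the `cutoff_bits` parameter are illustrative assumptions, not anything from an existing library. The three recursive products are independent of one another, so in principle each could be handed to a different processor; the recombination step at the end is where the cross-communication comes in.

```python
def karatsuba(x, y, cutoff_bits=64):
    """Multiply non-negative integers x and y with Karatsuba's three-product split.

    cutoff_bits is a hypothetical tuning knob: below it we fall back to the
    built-in multiply.
    """
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y

    # Split both operands at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three independent subproblems: these are the pieces one could farm out.
    low = karatsuba(x_lo, y_lo, cutoff_bits)
    high = karatsuba(x_hi, y_hi, cutoff_bits)
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, cutoff_bits)

    # Recombination needs all three results in one place.
    middle = cross - low - high
    return (high << (2 * half)) + (middle << half) + low
```

Each level trades one multiplication of the full size for three of about half the size, plus additions and shifts whose operands are as long as the original numbers. That last part is the traffic that would have to move between processors.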
Rich
________________________________________
From: math-fun-bounces@mailman.xmission.com [math-fun-bounces@mailman.xmission.com] on behalf of Simon Plouffe [simon.plouffe@gmail.com]
Sent: Tuesday, October 11, 2011 9:16 PM
To: math-fun@mailman.xmission.com
Subject: Re: [math-fun] fastest computer ? (single operation).
Yes, thank you for those answers. In other words, the fastest computers (single CPU) are still the ones that an ordinary citizen can afford, and apparently no faster machines are being developed at NASA or similar places, nor any kind of experimental DNA computer? I don't see those nitrogen-cooled machines as being something that one person could keep in normal operation for, let's say, 1 month for 1 big calculation like Kondo or Bellard did on Pi. It is somewhat reassuring that for a couple of thousand dollars one can achieve record calculations with something you can buy at the local computer store. That is, unless someone finds a way to GPU-parallelize big multiplications or high-precision arithmetic; those monsters could then go 100 times faster than those crazy nitrogen-cooled gizmos. I am thinking of something like having the factoring problem done on one big GPU?
Simon Plouffe
On 2011-10-12 04:07, Robert Munafo wrote:
All the best records are set by liquid-cooled systems. Nitrogen is most common but some of them use liquid helium. The CPU doesn't need to run at such a low temperature, it just gives them a steeper thermal gradient, which allows them to get more wattage (heat) out of the CPU so they can run at a higher wattage (electrical) at a given temperature.
If clock speed were the criterion, then indeed that AMD record from last month is the winner. However, most of the recent clock speed benchmarks have been set by Intel Netburst (Pentium 4 and Celeron) processors in the LGA-775 socket, which by now is pretty much obsolete. These processors were designed back in the early 2000s, when high clock speed was the primary design goal. [1]
Overclockers are typically more interested in overall instructions per second, so instructions per clock, memory bandwidth, etc. are factors as well. A popular program is SuperPi, which calculates the value of Pi to very high precision. The record-holders there are all of the latest (overclockable) Intel models (like the Core i7 Extreme 980X, 2600K, etc.) running at speeds around 6 to 7 GHz. [2]
Based on the original question (referencing Mersenne prime testing as the operative benchmark) you should probably look at the GIMPS Prime95 benchmark results. Here again the records are held by the latest Intel processors like the Core i7-2600K, overclocked into the 4-5 GHz range. [3] Note however that the serious liquid-cooling folks do not use the GIMPS benchmark, so we don't really know how fast one of their systems would go.
With the right motherboard a Core i7 2600K can be run at 4.5 GHz with just normal air cooling. See [4], whose conclusion states: "[...] Practically, though, you should be able to reach anywhere between ~4.5 and roughly 5 GHz on air cooling with all Core K-series processors based on the 32 nm Sandy Bridge architecture."
- Robert
[1] See http://www.hwbot.org/benchmark/cpu_frequency/
[2] See http://www.hwbot.org/benchmark/superpi_32m/
[3] See http://www.mersenne.org/report_benchmarks/
[4] See http://www.tomshardware.com/reviews/sandy-bridge-overclocking-efficiency,285...
On Tue, Oct 11, 2011 at 18:45, Tom Rokicki <rokicki@gmail.com> wrote:
The answer is probably one of the AMD overclocked processors; they have a bunch of people that go crazy with cooling solutions and crank the clock up:
http://hothardware.com/News/AMD-Breaks-Frequency-Record-with-Upcoming-FX-Pro...
Not that this is ready for your home office or anything like that.
On Tue, Oct 11, 2011 at 3:40 PM, Simon Plouffe <simon.plouffe@gmail.com> wrote:
Hello,
As you may know, the fastest computer is currently a Japanese machine called K, which can do 10 petaflops (?) or something like that. These machines are built from ordinary components at the base, like Intel or Opteron processors plus many GPUs such as Nvidia's. This is all very impressive, but I am wondering: what is the fastest single-operation computer on this planet? A souped-up Intel machine running overclocked at 4 GHz? Does anybody know the answer to that question? In other words, if someone wants to run, let's say, the Mersenne test for a prime, which is not yet implemented in parallel, on which machine would that algorithm run the fastest? Best regards and have a nice evening.
Simon Plouffe
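As an illustration of why this question is about single-thread speed, here is a minimal sketch, assuming "the Mersenne test" means the standard Lucas-Lehmer test. The function name `lucas_lehmer` is illustrative. Each squaring depends on the result of the previous one, so the p - 2 iterations cannot run side by side; only the work inside a single big squaring can be split up.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for an odd prime exponent p."""
    m = (1 << p) - 1          # the Mersenne number being tested
    s = 4
    for _ in range(p - 2):
        # Strictly sequential: step k needs the output of step k - 1.
        s = (s * s - 2) % m
    return s == 0


if __name__ == "__main__":
    exponents = [3, 5, 7, 11, 13, 17, 19, 23]
    print([p for p in exponents if lucas_lehmer(p)])
```

For record-size exponents, s has millions of digits, and the squaring is where essentially all the time goes.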
_______________________________________________
math-fun mailing list
math-fun@mailman.xmission.com
http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun