[math-fun] ENIAC and maple : a simple test
Hello,

I was reading an article on the ENIAC: that monster could add 100,000 10-digit numbers in 1 second. In 1946 that was impressive, of course.

I made this simple, naive test. I took Maple 11 on a 3.0 GHz Pentium PC.

#####################################
Digits:=10:
som:=0:
for i from 1 to 1e99 do:
  som:=som+1.0/i:
  if i mod 100000 = 0 then lprint(i,som) fi:
od:
#####################################

It displays the partial sum of the harmonic series every 100000 terms. This is not optimized; it is a lazy program.

Well, this is the bad news: that loop, run on a modern computer with one of the top programs, goes at the SAME speed. It takes 1 second to display each step.

Are we missing something? Of course:
1) 2 operations are made at each step.
2) the program is NOT optimized.
3) programmed in C, this would be a zillion times better.

But still, we are 60 years later and we hardly do better!? Has Maple become bloatware?

PS: the speed of the ENIAC was something like 0.001 MHz...

Simon Plouffe
My equally naive test in PLT DrScheme took less than 15 sec to do 10 million terms of the harmonic series. So 0.15 sec versus your 1 sec per 100,000 terms. That was on OS X, on a 1.66 GHz Intel Core Duo.

Also for comparison: summing 1s instead of floating-point numbers took 9 sec (to give some idea of how much of the time in that calculation might be loop overhead ...)

And calculating triangular numbers (adding n, instead of 1 or 1/n as in the previous trials) took 12 sec (so that gives some idea of the timing of big-integer arithmetic vs counting by 1s).

So yeah, Maple taking 60 sec is awfully slow.

--Joshua Zucker
On 7/17/07, Simon Plouffe <simon.plouffe@gmail.com> wrote:
Hello,
I was reading an article on the ENIAC, that monster could add 100,000 10 digits numbers in 1 second. At the time it was impressive in 1946 of course.
ps : the speed of the ENIAC was something like 0.001 Mhz...
Don't you mean .000001 GHz? Or .000000001 THz? The whole point of having the prefixes is to avoid the need for huge numbers of zeros before or after the decimal point.

Second, the statements "The ENIAC was something like 1 kHz" and "The ENIAC could add 100,000 10-digit numbers in one second" seem unlikely to me to both be true. Did the ENIAC really have a sufficiently parallel architecture that it could add 100 10-digit numbers in a single clock cycle?

Andy.Latto@pobox.com
On 7/17/07, Andy Latto <andy.latto@gmail.com> wrote:
Second, the statements "The ENIAC was something like 1 kHz" and "The ENIAC could add 100,000 10-digit numbers in one second" seem unlikely to me to both be true. Did the ENIAC really have a sufficiently parallel architecture that it could add 100 10-digit numbers in a single clock cycle?
According to http://en.wikipedia.org/wiki/ENIAC it seems to have clocked at 5 kHz (five times faster than Simon told us), and it also seems to have had 20 parallel accumulators, so when adding in parallel it could potentially have done up to 100,000 10-digit (base 10!) additions in a second.

--Joshua Zucker
Simon Plouffe wrote:
Hello,
I was reading an article on the ENIAC, that monster could add 100,000 10 digits numbers in 1 second. At the time it was impressive in 1946 of course.
I made this simple test, a naive test.
I took maple 11, with a 3.0 Ghz PC pentium. ##################################### Digits:=10:
som:=0: for i from 1 to 1e99 do: som:=som+1.0/i:
1) 2 operations are made at each step. 2) the program is NOT optimized. 3) programmed in C, this would be a zillion times better.
OK, I don't know anything about Maple, but I am quite sure that Digits:=10 does not mean it should use an integer unit with 10 decimal digits, right? So then (i) you are comparing floats with integers, and (ii) you have divisions, which are normally more expensive than additions.

How fast is Maple at adding the first n integers (32-bit) in a loop? (According to Gauss, you can go up to n of approximately 2^16 before an overflow occurs.)

regards
Christoph
Hello,

My test was meant to be a rough comparison of speed. If you try Maple 11 on Mac OS X it is even worse: I installed Maple 11 and Mac OS X on a dual-core Pentium at 2.13 GHz, and Maple ran about 3 times slower than the same Maple 11 on Windows XP.

The ENIAC was a dinosaur, and even an interpreter today should run a million times faster. It does not: the speed is comparable.

I used to run scripts on a Mac with HyperCard 10 years ago. It could compute at 16 digits. I bet that my old Mac with HyperCard could run at the same speed. This is bad, really bad. My machine now runs 129 times faster, and the speed of the Maple interpreter is about the same as that of a <color Mac for the home and the family> of 10 years ago running a HyperCard stack, a toy application... Houston, we have a problem!

We are comparing orders of magnitude: there should be a considerable difference, and there is not.

Simon Plouffe
Hello,

About that test again (Maple vs. ENIAC): I compiled that simple loop in C++ and ran the program on the same machine, a 3.0 GHz Pentium (HT) with 1 GB of RAM.

Result: 7 million additions per second.

The loop is hardly 5-6 lines and has some additional code, which is understandable in the context:
a) There is the counter and the test for the modulo 100000.
b) I had to add 100000 different numbers; this is why I used the inverse of n.
c) The announced speeds of 7 and 10 million iterations per second are rough estimates, judged by eye.

Funny note: the same program, when run under Virtual PC, does 10 million additions per second.

Simon Plouffe (and my colleague at work, David Grenier, who helped with this test).
* Simon Plouffe <simon.plouffe@gmail.com> [Jul 20. 2007 17:47]:
hello, about that test again (maple VS ENIAC).
I compiled that simple loop in C++ and ran the program on the same machine.
Result : 7 million additions per second.
There should be about one addition per cycle:

/* --------------------------- */
/*
  gcc -W -Wall -O2 add.c -o add
  time ./add
  ./add  2.55s user 0.00s system 99% cpu 2.552 total
  1000000000/2.55 == 392,156,862 iterations/sec
  one iteration is 2 float adds, one int add (and the branch)
  ==> about 2 cycles per float add
*/
int main()
{
    double i, s;
    unsigned long k;
    i = 0.0;  s = 0.0;
    for (k=0; k<1000000000; ++k)  { i+=1.0;  s+=i; }
    if ( s < 239 )  return 1;  // avoid loop optimized away
    return 0;
}
/* --------------------------- */

Only the branch slows things down. Somewhat optimized (unrolled) code should give about 2 adds per cycle. With inversion or mod, the timing is dominated by those.

If one needs breakpoints, then use code like

/* want N adds, printout at every K-th step */
i=0;
loop N/K times:
{
    loop K times: { s += i;  i += 1; }
    print s
}

This avoids the modulo computation and the additional branch.
Yes, but your program does not add full-precision numbers, does it? I modified your loop in the following way.

#include <stdio.h>

int main()
{
    double i, s;
    unsigned long k;
    i = 0.0;  s = 0.0;
    for (k=0; k<50000000; ++k)  { i+=1.0;  s+=1.0/i; }
    printf("%f %f\n", i, s);
}

with the same compile options, i.e.:

gcc -W -Wall -O2 boucle.c -o boucle

(boucle.c being the program name). When run, it gives this result:

50000000.000000 18.304749

one second later.

Simon Plouffe
* Simon Plouffe <simon.plouffe@gmail.com> [Jul 21. 2007 09:17]:
Yes, but your program does not add full-precision numbers, does it?
For which definition of full precision? I use 64-bit floats; anything further needs an arbitrary-precision library.
I modified your loop the following way.
int main()
{
    double i, s;
    unsigned long k;
    i = 0.0;  s = 0.0;
    for (k=0; k<50000000; ++k)  { i+=1.0;  s+=1.0/i; }
    printf("%f %f\n", i, s);
}
with the same compile : i.e : gcc -W -Wall -O2 boucle.c -o boucle
boucle.c being the program name
When ran it gives this result : 50000000.000000 18.304749
one second later
This timing measures the speed of the FPU division instruction.
simon plouffe
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
Yes, you are right: the computation of the inverse takes most of the time. OK then, if I do:

/*--------------------------------------*/
#include <stdio.h>

int main()
{
    double i, s;
    unsigned long k;
    i = .0000817181878711;  s = 0.0;
    for (k=0; k<999999999; ++k)  { i+=1.0;  s+=i; }
    printf("%f %f\n", i, s);
    if ( s < 100 )  return 1;
    return 0;
}
/*--------------------------------------*/

with the same compilation options, each run takes approx. 1.5 seconds on a 2.13 GHz Pentium Core 2 Duo with 2 GB of RAM. That makes 666,666,666 additions per second, which makes that CPU 6667 times faster than the ENIAC. Sounds fast enough for me! That's a decent speed.

Simon Plouffe
participants (5): Andy Latto, Christoph Pacher, Joerg Arndt, Joshua Zucker, Simon Plouffe