* Joerg Arndt <arndt@jjj.de> [Apr 12. 2006 13:06]:
[...]
Another one I though about: For all small primes P compute the square roots of minus one mod P, mod P^2, mod P^3, etc. Do sieving as follows: For all small primes P. add log(P) at positions X where (X^2+1) divides P; again add log(P) at positions X where (X^2+1) divides P^2; etc.
Let L be the log-sum at position X.
Check whether log(X^2+1) equals L within precision according to prime limit. Use a fast hand-woven approx-log routine aplog() for the computation of the log(X^2+1).
I just haven't found a good alog(), else this would be my next attempt. But C's float log() might just do together with: after X_j was valid (i.e. smooth) the next valid entry L_i must be strictly greater than L_j. Hmmm... this might eliminate almost all log() computation.
I'll try that.
Done. And its fast! For X<=10^10 the old code needed 37h, the new one needs 23min30sec. Updating the log as decribed reduces the number of log-computations to (exactly) the number of smooth X found. Using three priority queues, according to size of the prime power sizes. This gives fine cache locality. Now computing up to 10^12, should take 40h. Thanks a lot for the very useful ideas!