[math-fun] Prime-encoding constants/functions
Apropos https://www.youtube.com/watch?v=_gCKX6VMvmU

There are many ways to produce the sequence of primes by iterating a function, e.g. starting with f_1 = ContinuedFraction(2, 3, 5, 7, 11, 13, ...) = 2.3130367364335829... (http://oeis.org/A064442) and

f_{n+1} = 1/(f_n - Floor(f_n))

the sequence of Floor(f_n) will be the sequence of primes. Or, as in the clip, starting with the average of http://oeis.org/A053669 = 2.920050977316134712... and f_{n+1} = Floor(f_n) * (f_n - Floor(f_n) + 1) will do the same. Or the Blazys continued fraction, see http://oeis.org/A233588

The question is, which constant/function pair encodes primes in the most compact way. More precisely, given a denominator for a rational approximation, which function will correctly produce the most primes starting from the approximation of the corresponding constant? For example, with the IEEE double (64-bit floating point) approximations of the constants, the continued fraction starts diverging at 29, the Blazys continued fraction at 47, and the 2.92... constant at 59.

I was able to come up with the function f_{n+1} = Floor(f_n) + 1/(f_n - Floor(f_n)), which is the delta-coded continued fraction of the sequence of primes; it gives rise to f_1 = 2.7101020234300968374... and, given its IEEE double approximation, starts diverging at 67.

Improvements are welcome.

Thanks, Leo
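(Not from the thread: a quick way to check such claims is to iterate each (constant, update) pair in 64-bit floating point and count how many leading floor values match the primes. A minimal Python sketch, assuming sympy is available for the reference primes; the constants and update rules are the ones quoted above, and the Blazys variant is omitted since its update rule is not spelled out in the post. The reported divergence points should come out at or very near the ones Leo gives, up to the last bit of the decimal approximations.)

    from math import floor
    from sympy import prime

    PRIMES = [prime(i) for i in range(1, 60)]   # reference primes 2, 3, 5, ...

    def first_miss(c, step):
        """Iterate the update and return the first prime whose floor value is not produced."""
        n = 0
        while n < len(PRIMES) and floor(c) == PRIMES[n]:
            n += 1
            c = step(c)
        return PRIMES[n] if n < len(PRIMES) else None

    candidates = {
        "continued fraction (A064442)":
            (2.3130367364335829, lambda x: 1 / (x - floor(x))),
        "2.92... constant (from the clip)":
            (2.920050977316134712, lambda x: floor(x) * (x - floor(x) + 1)),
        "delta-coded continued fraction":
            (2.7101020234300968374, lambda x: floor(x) + 1 / (x - floor(x))),
    }

    for name, (c, step) in candidates.items():
        print(f"{name}: starts diverging at {first_miss(c, step)}")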
The question, of course, depends on the programming language. Fix one and then it'll make sense to play code golf. Otherwise, I can just define a language with a built-in operation that takes in an arbitrary pair and outputs the primes, so (0, op) would be the smallest.
-- Mike Stay - metaweta@gmail.com http://math.ucr.edu/~mike https://reperiendi.wordpress.com
Mike Stay wrote:
The question, of course, depends on the programming language. Fix one and then it'll make sense to play code golf. Otherwise, I can just define a language with a built-in operation that takes in an arbitrary pair and outputs the primes, so (0, op) would be the smallest.

I disagree. First, that's not a question of programming language, but a question of mathematical operations. Second, the idea is to have a "universal" function, which is not specific to the encoded sequence (here: the primes): all that is specific to that sequence should be encoded in the initial value. The very same function should work as well (or at least reasonably well) for other strictly increasing positive integer sequences, of course with a different initial value that must be computable (in principle, to any desired precision) for that function and a given sequence.

Of course it can be expected that the success of one function vs. another might vary depending on growth and maybe other properties of the encoded integer sequence. But a function that always produces the primes independently of the initial value would be disqualified here.

Maybe that main idea is already enough to eliminate the relevance of complicated, artificially hand-crafted functions. But just in case, we can also easily fix clearly defined classes of allowed functions:
- through the allowed operations (in all examples, the iterated function is constructed using only +, -, *, / and floor(.)),
- and by fixing a maximum overall complexity of the function, obtained by adding up a complexity value assigned to each allowed operation, e.g., complexity 1 for each floor(.); complexity 2 for each +, -, * or /; and maybe larger values for other possibly allowed functions (exp, log, Gamma, ...).

It would be interesting to know whether more complex functions allow a more efficient encoding of (arbitrary or given types of) integer sequences. If so, we can consider "records" depending on the maximum complexity of the function.

Finally, it's also possible that various functions have their best performance in different ranges: e.g. one might be "ahead" (requiring less precision on the initial value) to produce the first 100 terms, another one would be "better" for producing terms 200 to 500, and yet another one beyond that. But even in that case, the question is interesting.

I think we can measure the "performance" of a function by looking at (minus the log of) the difference between the largest and smallest initial value that would produce (a(1), ..., a(n)), even though that might not always be a monotonic function of n. That might be a curve of roughly the same shape, independently of the functions and/or encoded sequences (maybe, as long as these are roughly similar). We could check this already for the four examples at hand. If so, then we can compare the performance by simply comparing the respective slopes. (That should be consistent with, if not equivalent to, the values from Leo, i.e., the number of correct terms for float64, i.e., epsilon = 2^-53.)

- Maximilian
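(A minimal sketch of this performance measure, not code from the thread, assuming Python with sympy: for Leo's delta-coded continued-fraction map f(x) = floor(x) + 1/(x - floor(x)), the set of initial values reproducing the first n primes is an interval that can be computed exactly by propagating the constraints backwards in exact rational arithmetic; minus log2 of its width is the number of bits of precision needed for the constant.)

    from fractions import Fraction
    from math import log2
    from sympy import prime

    def prefix_interval(ps):
        """Exact interval of initial values c with floor(f^k(c)) = ps[k] for all k,
        for f(x) = floor(x) + 1/(x - floor(x)); open/closed endpoint details glossed."""
        lo, hi = Fraction(ps[-1]), Fraction(ps[-1] + 1)   # last iterate: correct floor only
        for p in reversed(ps[:-1]):
            # On [p, p+1), f(x) = p + 1/(x - p) is decreasing, so the preimage
            # of [lo, hi) is (p + 1/(hi - p), p + 1/(lo - p)].
            lo, hi = p + 1 / (hi - p), p + 1 / (lo - p)
        return lo, hi

    for n in (10, 20, 30, 40, 50):
        ps = [prime(i) for i in range(1, n + 1)]
        lo, hi = prefix_interval(ps)
        width = hi - lo
        bits = log2(width.denominator) - log2(width.numerator)   # = -log2(width)
        print(f"first {n} primes: ~{bits:.1f} bits of c needed, {bits / n:.2f} bits per prime")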
On Wed, Dec 2, 2020 at 8:54 PM M F Hasler <mhasler@dsi972.fr> wrote:
Mike Stay wrote:
The question, of course, depends on the programming language. Fix one and then it'll make sense to play code golf.
Otherwise, I can just define a language with a built-in operation that takes in an arbitrary pair and outputs the primes, so (0, op) would be the smallest.
I disagree. First, that's not a question of programming language, but a question of mathematical operations.
What's the difference? You're defining an algorithm that takes a real number as input and produces a stream of natural numbers. The choice of mathematical operations affects the size of the algorithm. Are you allowed to use more than standard arithmetic operations, like floor? Does floor count as five letters, two characters ⌊⌋, or one operation? Are trig functions allowed? Erf? LLL? OEIS lookup? Ah, I see you make a choice below. It also matters in which format the real number is encoded. Is it a decimal expansion? A continued fraction? An Egyptian fraction? A math-mode LaTeX expression?
Second, the idea is to have a "universal" function, which is not specific to the encoded sequence (here: the primes):
Suppose we say that the input is a stream of base-256 digits ("bytes"), thought of as a real number between 0 and 1, and the output is a stream of decimal digits and commas. In a language like JavaScript that has a built-in interpreter ('eval'), we can simply evaluate the number as a program in UTF-8. We can play code golf on the size of the number. It will even be rational, since the primes are a computable sequence.
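(Concretely, here is my own sketch of that idea, transposed to Python's exec in place of JavaScript's eval: the UTF-8 bytes of a small prime-printing program are read as the base-256 digits of a rational number, and the "decoder" recovers the bytes from that number and evaluates them.)

    from fractions import Fraction

    # A short program that prints primes by trial division.
    src = "n=2\nwhile n<50:\n if all(n%d for d in range(2,n)): print(n)\n n+=1\n"
    data = src.encode("utf-8")

    # Interpret the bytes as base-256 digits of a rational number in (0, 1).
    q = Fraction(int.from_bytes(data, "big"), 256 ** len(data))

    # Decode: peel off base-256 digits until the fraction is exhausted.
    out = bytearray()
    while q:
        q *= 256
        digit = int(q)          # the next byte
        out.append(digit)
        q -= digit

    exec(out.decode("utf-8"))   # prints 2, 3, 5, 7, ...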
All that is specific to that sequence should be encoded in the initial value. The very same function should work as well (or at least reasonably well) for other strictly increasing positive integer sequences, of course with a different initial value that must be computable (in principle, to any desired precision) for that function and a given sequence.
Of course it can be expected that the success of one function vs the other might vary depending on growth and maybe other properties of the encoded integer sequence. But a function that always produces the primes independently of the initial value would disqualify here.
Maybe that main idea is already enough to eliminate the relevance of complicated and artificially hand-crafted functions.
Unfortunately not. Various authors have tried to come up with a class of "natural" universal Turing machines, but there's no agreement on what such a machine is. I even vaguely remember a paper that supposed that "natural" machines would be more easily simulated by others and therefore would be "more probable" in some sense, but it turned out that that's not the case. I can dig around for the paper if you're interested.
But just in case, we can also easily fix clearly defined classes of allowed functions - through the allowed operations (in all examples, the iterated function is constructed using only +, -, *, / and floor(.)), - and by fixing a maximum overall complexity of the function (obtained by adding up the complexity value assigned to each allowed function (or operation), e.g., complexity 1 for each floor(.) ; complexity 2 for each +, -, * or / ; and maybe larger values for other possibly allowed functions (exp, log, Gamma, ...)
And a variable capable of holding a real number! What you've described is a reasonable choice, but since you've got a loop, memory, and an arithmetic update function, I'm pretty sure it's Turing complete. See, e.g. https://en.wikipedia.org/wiki/One-instruction_set_computer.
It would be interesting to know whether more complex functions allow a more efficient encoding of (arbitrary or given type of) integer sequences.
Certainly! Given exp(), one can produce the digits of e from 1.
If so, we can consider "records" depending on the maximum complexity of the function.
Finally, it's also possible that various functions have their best performance in different ranges: e.g. one might be "ahead" (requiring less precision on the initial value) to produce the first 100 terms, another one would be "better" for producing 200 to 500 terms, and yet another one beyond that. But even in that case, the question is interesting.
I think we can measure the "performance" of a function by looking at ( - log of) the difference between the largest and smallest initial value that would produce (a(1), ..., a(n)), even though that might not always be a monotonous function. That might be a curve of roughly the same shape, independently of the functions and/or encoded sequences (maybe, as long as these are roughly similar). We could check this already for the four examples at hand. If so, then we can compare the performance by simply comparing the respective slopes. (That should be consistent with, if not equivalent to the values from Leo, i.e., number of correct terms for float64, i.e., epsilon = 2^-53.)
- Maximilian
In the end, this all boils down to Kolmogorov complexity, which depends on the choice of language.

-- Mike Stay - metaweta@gmail.com http://math.ucr.edu/~mike https://reperiendi.wordpress.com
On Thu, Dec 3, 2020 at 01:37, Mike Stay <metaweta@gmail.com> wrote:
In the end, this all boils down to Kolmogorov complexity, which depends on the choice of language.
Yes and no; Kolmogorov complexity is universal in the sense that it depends on the language or machine only up to a constant.

But I feel you didn't reply to what I wrote, specifically what I think is the core idea: the function f : R -> R, which yields a map A(f) from reals to integer sequences,

A(f) : R -> Z^N ; c -> a = (floor(f^n(c)))_{n>=0} = (floor(c), floor(f(c)), floor(f(f(c))), ...)

which should be continuous (i.e., constants c, c' which are sufficiently close should produce integer sequences with as many common terms as possible) but still "efficiently" code these sequences, which might translate to a differentiable function with a large derivative.

But yes, thinking about it in these clarified mathematical terms shows that we have a kind of dilemma: if we get many primes with not so many decimals of c = c(primes), then it probably will / must(?) do less well for a different sequence...

So IMHO, rather than a single sequence, we must fix the test cases on which the function is to be evaluated. Here "evaluation" is as specified in my earlier mail: the (asymptotic?) ratio of the number of correct terms of the sequence per number of correct digits in c. Maybe you could "reverse engineer" that scoring system by knowing the test suite if it is finite (i.e., your function f would guess which sequence is "wanted" for a given constant c and artificially produce the required terms), but for a sufficiently well conceived infinite test suite, I doubt that would be possible.

- Maximilian
On Mon, Dec 7, 2020 at 6:03 PM M F Hasler <mhasler@dsi972.fr> wrote:
On Thu, Dec 3, 2020 at 01:37, Mike Stay <metaweta@gmail.com> wrote:
In the end, this all boils down to Kolmogorov complexity, which depends on the choice of language.
Yes and no, Kolmogorov complexity is universal in the sense that it depends on the language or machine only up to a constant.
But I feel you didn't reply to what I wrote, specifically what I think is the the core idea: the function f : R -> R which yields a map A(f) from reals to integer sequences, A(f) : R -> Z^N ; c -> a = (floor(f^n(c))) = (floor(c), floor(f(c)), floor(f(f(c))), ...) which should be continuous (i.e., constants c, c' which are sufficiently close should produce integer sequences with as many common terms as possible) but still "efficiently" code these sequences, which might translate to a differential function with a large derivative.
If c and c' straddle an integer, all bets are off.

But yes, thinking about it in these clarified mathematical terms shows that we have a kind of dilemma, if we get many primes with not so many decimals of c = c(primes), then probably it will / must(?) do less well for a different sequence....
Naturally. Given that the function is pure, i.e. it does not have any internal state, to encode primes efficiently it would have to be "hardwired" specifically for the primes.

Plotting the functions, it is possible to see, by the range and the slope of the graph, how the value of the next prime is "predicted". The continued fraction is clearly wasteful because it does not use the monotonicity. The function from the clip is bounded, as NextPrime(p) < 2p, but it is linear, that is, it assumes equal probability over the whole possible range, which is unlikely to match reality. My function and Blazys' function look similar, but mine is not as steep over the "sweet" range near the current prime, and thus performs better than Blazys'.

I guess the trick would be to find out the asymptotic shape of the probability distribution of the next prime over [p+2, 2p-1] (is that distribution known?) and to construct the corresponding iterative function.
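(A small empirical look at that question, my own sketch assuming sympy: for primes p below 10^6, compare the gap to the next prime q both against the Bertrand-style budget p, which is the range over which the clip's linear function spreads its precision, and against log(p)^2, a Cramér-style normalization closer to where the gaps actually live.)

    from math import log
    from sympy import primerange

    ps = list(primerange(2, 10**6))
    ratios_bertrand = []   # (q - p) / p : fraction of the Bertrand interval actually used
    ratios_log2 = []       # (q - p) / log(p)^2 : Cramér-style normalization
    for p, q in zip(ps, ps[1:]):
        gap = q - p
        ratios_bertrand.append(gap / p)
        ratios_log2.append(gap / log(p) ** 2)

    print("max  (q-p)/p        :", max(ratios_bertrand))
    print("mean (q-p)/p        :", sum(ratios_bertrand) / len(ratios_bertrand))
    print("max  (q-p)/log(p)^2 :", max(ratios_log2))
    print("mean (q-p)/log(p)^2 :", sum(ratios_log2) / len(ratios_log2))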
So IMHO, rather than a sequence, we must fix the test cases on which the function is to be evaluated. Where "evaluation" is as specified in my earlier mail, the (asymptotic?) ratio of number of correct terms of the sequence, per number of correct digits in c. Maybe you can "reverse engineer" that scoring system by knowing the test suite if it is finite, (i.e., your function f would guess which sequence is "wanted" for a given constant c and artificially produce the required terms) but for a sufficiently well conceived infinite test suite, I doubt that would be possible.
That I didn't get, sorry. Thanks, Leo
Using Firoozbakht's conjecture (adapted to make it start working from the first prime) works much better than Bertrand's postulate. Starting with 2.63585974145479126553562569 and using

f(x) = (x-[x])*[log([x]+6)*(log([x]+6)-1)] + x

produces primes up to 97.

Leo
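(A quick check of this formula, not from the thread: reading the outer square brackets as the floor function, consistent with the [x] notation, the iteration can be reproduced with extra working precision, since the 26 given decimals carry more information than an IEEE double. A sketch assuming Python with mpmath and sympy:)

    from mpmath import mp, mpf, floor, log
    from sympy import isprime

    mp.dps = 40   # more working precision than the 26 decimals of the constant
    x = mpf("2.63585974145479126553562569")

    produced = []
    while len(produced) < 40 and isprime(int(floor(x))):
        produced.append(int(floor(x)))
        # Floored Firoozbakht-style gap budget, shifted by 6 so it covers the gap
        # even for the smallest primes; the outer [.] of the formula read as floor.
        g = floor(log(floor(x) + 6) * (log(floor(x) + 6) - 1))
        x = (x - floor(x)) * g + x

    print(produced)   # should track the primes as far as the given decimals allow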
This is not a question of programming language, and I don't think my question boils down to Kolmogorov complexity. Steve Whitman is right, it is a question akin to entropy-encoding the sequence of prime numbers, and my idea of delta-coding them worked.

Obviously, and facetiously, the function f = NextPrime(Floor(f)) will work up to infinity regardless of the fractional part of the starting value, and may even be expressed quite efficiently (for integers, FRACTRAN did it quite well), but it is easy to notice that its computational complexity would not be O(1).

If one limits oneself to functions expressible as a combination of arithmetic operations and elementary functions (following the examples), regardless of length, since such a function is still computable in idealized O(1) time, is there a way to improve my result? The answer is yes: we can prepend any finite prefix to any of the examples in my original post, in the form f(x) = if x < 3 then 3 + frac(x) elsif x < 5 then 5 + frac(x) else (any of the original formulas), but as soon as the evaluation reaches the general case, the asymptotic efficiency is that of the original formula.

If we disallow conditionals and limit ourselves to the four arithmetic operations and Floor(), it will be easier to compare the efficiency of encoding, that's all.

Thanks, Leo
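(As a literal illustration of that conditional-prefix trick, a tiny sketch of my own; the "general case" branch here is the delta-coded continued-fraction update from earlier in the thread, and the fractional part is passed through unchanged by the prefix branches, so the tail of the sequence must still be encoded in the starting constant.)

    from math import floor

    def frac(x):
        return x - floor(x)

    def f(x):
        # Hard-coded prefix: the first few outputs are pinned by conditionals...
        if x < 3:
            return 3 + frac(x)
        elif x < 5:
            return 5 + frac(x)
        # ...after which a general-purpose update takes over.
        return floor(x) + 1 / (x - floor(x))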
participants (3)
- Leo Broukhis
- M F Hasler
- Mike Stay