On Wed, Dec 2, 2020 at 8:54 PM M F Hasler <mhasler@dsi972.fr> wrote:
Mike Stay wrote:
The question, of course, depends on the programming language. Fix one and then it'll make sense to play code golf.
Otherwise, I can just define a language with a built-in operation that takes in an arbitrary pair and outputs the primes, so (0, op) would be the smallest.
I disagree. First, that's not a question of programming language, but a question of mathematical operations.
What's the difference? You're defining an algorithm that takes a real number as input and produces a stream of natural numbers, and the choice of mathematical operations affects the size of that algorithm. Are you allowed to use more than the standard arithmetic operations, like floor? Does floor count as five letters, as the two characters ⌊⌋, or as one operation? Are trig functions allowed? Erf? LLL? OEIS lookup? Ah, I see you make a choice below. It also matters in which format the real number is encoded. Is it a decimal expansion? A continued fraction? An Egyptian fraction? A math-mode LaTeX expression?
Second, the idea is to have a "universal" function, which is not specific to the encoded sequence (here: the primes):
Suppose we say that the input is a stream of base-256 digits ("bytes"), thought of as a real number between 0 and 1, and the output is a stream of decimal digits and commas. In a language like JavaScript that has a built-in interpreter ('eval'), we can simply evaluate the number as a program in UTF-8 and play code golf on the size of that number. It will even be rational, since the primes are a computable sequence: a finite program suffices, so the byte expansion terminates.
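Something like this rough Python sketch (exec standing in for JavaScript's eval; the helper bytes_of and the toy byte string are only illustrations of the idea, not a proposal):

from fractions import Fraction

def bytes_of(x, n):
    """Read the first n base-256 digits of a rational x in [0, 1)."""
    out = bytearray()
    for _ in range(n):
        x *= 256
        d = int(x)        # next base-256 digit
        out.append(d)
        x -= d
    return bytes(out)

# A toy "program" that prints decimal digits and commas, packed into a rational:
src = b'print(2,3,5,7,11,13,sep=",")'
x = Fraction(int.from_bytes(src, "big"), 256 ** len(src))   # 0.<src bytes> in base 256

exec(bytes_of(x, len(src)).decode("utf-8"))   # prints 2,3,5,7,11,13

Golfing then just means finding the shortest source whose bytes, read as a base-256 fraction, still print the sequence.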
Everything that is specific to that sequence should be encoded in the initial value. The very same function should work as well (or at least reasonably well) for other strictly increasing sequences of positive integers, of course with a different initial value, which must be computable (in principle, to any desired precision) for that function and the given sequence.
Of course it can be expected that the success of one function versus another might vary with the growth rate and perhaps other properties of the encoded integer sequence. But a function that always produces the primes, independently of the initial value, would be disqualified here.
Maybe that main idea alone is enough to rule out complicated, artificially hand-crafted functions.
Unfortunately not. Various authors have tried to come up with a class of "natural" universal Turing machines, but there's no agreement on what such a machine is. I even vaguely remember a paper that supposed that "natural" machines would be more easily simulated by others and therefore would be "more probable" in some sense, but it turned out that that's not the case. I can dig around for the paper if you're interested.
But just in case, we can also easily fix clearly defined classes of allowed functions:
- through the allowed operations (in all examples, the iterated function is constructed using only +, -, *, / and floor(.)),
- and by fixing a maximum overall complexity of the function, obtained by adding up the complexity value assigned to each allowed function or operation: e.g., complexity 1 for each floor(.); complexity 2 for each +, -, * or /; and maybe larger values for other possibly allowed functions (exp, log, Gamma, ...).
And a variable capable of holding a real number! What you've described is a reasonable choice, but since you've got a loop, memory, and an arithmetic update function, I'm pretty sure it's Turing complete. See, e.g., https://en.wikipedia.org/wiki/One-instruction_set_computer.
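For concreteness, here is a small Python sketch of one published function of exactly that shape, the prime-representing ("Buenos Aires") constant 2.9200509773... of Fridman et al., with f(x) = floor(x)*(x - floor(x) + 1). I don't know whether it is one of your four examples, but under your cost assignment it would have complexity 1 + 3*2 = 7:

from fractions import Fraction

def primes(n):
    """First n primes by trial division (plenty for a demo)."""
    ps = []
    k = 2
    while len(ps) < n:
        if all(k % p for p in ps):
            ps.append(k)
        k += 1
    return ps

def initial_value(ps):
    """Exact rational partial sum of c = sum_k (p_k - 1)/(p_1*...*p_{k-1}) = 2.9200509773..."""
    c, prod = Fraction(0), 1
    for p in ps:
        c += Fraction(p - 1, prod)
        prod *= p
    return c

def decode(x, n):
    """Iterate f(x) = floor(x)*(x - floor(x) + 1), emitting floor(x) at each step;
    only +, -, * and floor are used."""
    out = []
    for _ in range(n):
        k = int(x)                 # floor, since x > 0
        out.append(k)
        x = k * (x - k + 1)
    return out

x0 = initial_value(primes(20))     # rational approximation of the constant
print(decode(x0, 19))              # [2, 3, 5, ..., 67]: the first 19 primes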
It would be interesting to know whether more complex functions allow a more efficient encoding of integer sequences (arbitrary ones, or sequences of a given type).
Certainly! Given exp(), one can produce the digits of e from the initial value 1.
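For example (a quick Python sketch; the precision and digit count are arbitrary choices, and exp is applied once to the initial value rather than inside the iterated step): start from x = exp(1), output floor(x), and iterate x -> 10*(x - floor(x)).

from decimal import Decimal, getcontext

getcontext().prec = 60        # 60 significant digits of working precision

x = Decimal(1).exp()          # e, computed from the initial value 1
digits = []
for _ in range(40):           # stay well inside the working precision
    d = int(x)                # floor, since x >= 0
    digits.append(d)
    x = 10 * (x - d)          # shift the next decimal digit into the integer part
print(digits)                 # [2, 7, 1, 8, 2, 8, 1, 8, 2, 8, 4, 5, ...]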
If so, we can consider "records" depending on the maximum complexity of the function.
Finally, it's also possible that different functions perform best in different ranges: e.g., one might be "ahead" (requiring less precision in the initial value) for producing the first 100 terms, another might be "better" for terms 200 through 500, and yet another beyond that. But even in that case, the question is interesting.
I think we can measure the "performance" of a function by looking at (minus the log of) the difference between the largest and smallest initial values that produce (a(1), ..., a(n)), even though that might not always be a monotonic function of n. That might give a curve of roughly the same shape independently of the function and/or the encoded sequence (at least as long as these are roughly similar). We could check this already for the four examples at hand. If so, then we can compare performance simply by comparing the respective slopes. (That should be consistent with, if not equivalent to, the values from Leo, i.e., the number of correct terms obtainable with float64, i.e., epsilon = 2^-53.)
- Maximilian
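That width is easy to compute exactly for a decoder built from floor and arithmetic, at least when f is increasing on each interval [a, a+1), as it is for the illustrative f(x) = floor(x)*(x - floor(x) + 1) above: propagate the admissible interval backwards from the last required term. A rough Python sketch (again only an illustration, not necessarily matching your four examples):

from fractions import Fraction
from math import log2

def precision_needed(terms):
    """Interval of initial values x for which iterating
    f(x) = floor(x)*(x - floor(x) + 1) outputs exactly `terms`,
    found by propagating intervals backwards (f is linear and
    increasing on each piece [a, a+1)), and -log2 of its width."""
    a = terms[-1]
    lo, hi = Fraction(a), Fraction(a + 1)       # only need the floor of the last iterate
    for a in reversed(terms[:-1]):
        # preimage of [lo, hi) under y = a*(x - a + 1), intersected with [a, a+1)
        lo = max(Fraction(a), a - 1 + lo / a)
        hi = min(Fraction(a + 1), a - 1 + hi / a)
        if lo >= hi:
            return None                         # no initial value produces `terms`
    return lo, hi, -log2(float(hi - lo))

target = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # first 10 primes
lo, hi, bits = precision_needed(target)
print(float(lo), float(hi), bits)               # ~2.9200509..., and the bits required

The -log2 of the width is then the number of bits of precision the initial value must carry, which should indeed line up with the float64 counts at around 53 bits.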
In the end, this all boils down to Kolmogorov complexity, which depends on the choice of language.

--
Mike Stay - metaweta@gmail.com
http://math.ucr.edu/~mike
https://reperiendi.wordpress.com