On Wed, Dec 2, 2020 at 8:54 PM M F Hasler <mhasler@dsi972.fr> wrote:
Mike Stay wrote:
The question, of course, depends on the programming language. Fix one and then it'll make sense to play code golf.
Otherwise, I can just define a language with a built-in operation that takes in an arbitrary pair and outputs the primes, so (0, op) would be the smallest.
I disagree. First, that's not a question of programming language, but a question of mathematical operations.
What's the difference? You're defining an algorithm that takes a real number as input and produces a stream of natural numbers, and the choice of mathematical operations affects the size of that algorithm. Are you allowed to use more than the standard arithmetic operations, like floor? Does floor count as five letters, as the two characters ⌊⌋, or as one operation? Are trig functions allowed? Erf? LLL? OEIS lookup? Ah, I see you make a choice below. It also matters in which format the real number is encoded. Is it a decimal expansion? A continued fraction? An Egyptian fraction? A math-mode LaTeX expression?
Second, the idea is to have a "universal" function, which is not specific to the encoded sequence (here: the primes):
Suppose we say that the input is a stream of base-256 digits ("bytes"), thought of as a real number between 0 and 1, and the output is a stream of decimal digits and commas. In a language like JavaScript that has a built-in interpreter ('eval'), we can simply evaluate the number as a program in UTF-8 and play code golf on the size of that number. It will even be rational, since the primes are a computable sequence: a finite program suffices, so the byte expansion terminates.
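Something like this rough Python sketch (exec standing in for JavaScript's eval; the helper bytes_of and the toy byte string are only illustrations of the idea, not a proposal):

from fractions import Fraction

def bytes_of(x, n):
    """Read the first n base-256 digits of a rational x in [0, 1)."""
    out = bytearray()
    for _ in range(n):
        x *= 256
        d = int(x)        # next base-256 digit
        out.append(d)
        x -= d
    return bytes(out)

# A toy "program" that prints decimal digits and commas, packed into a rational:
src = b'print(2,3,5,7,11,13,sep=",")'
x = Fraction(int.from_bytes(src, "big"), 256 ** len(src))   # 0.<src bytes> in base 256

exec(bytes_of(x, len(src)).decode("utf-8"))   # prints 2,3,5,7,11,13

Golfing then just means finding the shortest source whose bytes, read as a base-256 fraction, still print the sequence.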
Everything that is specific to that sequence should be encoded in the initial value. The very same function should work as well (or at least reasonably well) for other strictly increasing sequences of positive integers, of course with a different initial value, which must be computable (in principle, to any desired precision) for that function and the given sequence.
Of course it can be expected that the success of one function versus another might vary with the growth rate and perhaps other properties of the encoded integer sequence. But a function that always produces the primes, independently of the initial value, would be disqualified here.
Maybe that main idea alone is enough to rule out complicated, artificially hand-crafted functions.
Unfortunately not. Various authors have tried to come up with a class of "natural" universal Turing machines, but there's no agreement on what such a machine is. I even vaguely remember a paper that supposed that "natural" machines would be more easily simulated by others and therefore would be "more probable" in some sense, but it turned out that that's not the case. I can dig around for the paper if you're interested.
But just in case, we can also easily fix clearly defined classes of allowed functions:
- through the allowed operations (in all examples, the iterated function is constructed using only +, -, *, / and floor(.)),
- and by fixing a maximum overall complexity of the function, obtained by adding up the complexity value assigned to each allowed function or operation: e.g., complexity 1 for each floor(.); complexity 2 for each +, -, * or /; and maybe larger values for other possibly allowed functions (exp, log, Gamma, ...).
And a variable capable of holding a real number! What you've described is a reasonable choice, but since you've got a loop, memory, and an arithmetic update function, I'm pretty sure it's Turing complete. See, e.g., https://en.wikipedia.org/wiki/One-instruction_set_computer.
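For concreteness, here is a small Python sketch of one published function of exactly that shape, the prime-representing ("Buenos Aires") constant 2.9200509773... of Fridman et al., with f(x) = floor(x)*(x - floor(x) + 1). I don't know whether it is one of your four examples, but under your cost assignment it would have complexity 1 + 3*2 = 7:

from fractions import Fraction

def primes(n):
    """First n primes by trial division (plenty for a demo)."""
    ps = []
    k = 2
    while len(ps) < n:
        if all(k % p for p in ps):
            ps.append(k)
        k += 1
    return ps

def initial_value(ps):
    """Exact rational partial sum of c = sum_k (p_k - 1)/(p_1*...*p_{k-1}) = 2.9200509773..."""
    c, prod = Fraction(0), 1
    for p in ps:
        c += Fraction(p - 1, prod)
        prod *= p
    return c

def decode(x, n):
    """Iterate f(x) = floor(x)*(x - floor(x) + 1), emitting floor(x) at each step;
    only +, -, * and floor are used."""
    out = []
    for _ in range(n):
        k = int(x)                 # floor, since x > 0
        out.append(k)
        x = k * (x - k + 1)
    return out

x0 = initial_value(primes(20))     # rational approximation of the constant
print(decode(x0, 19))              # [2, 3, 5, ..., 67]: the first 19 primes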
It would be interesting to know whether more complex functions allow a more efficient encoding of integer sequences (arbitrary ones, or sequences of a given type).
Certainly! Given exp(), one can produce the digits of e from the initial value 1.
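For example (a quick Python sketch; the precision and digit count are arbitrary choices, and exp is applied once to the initial value rather than inside the iterated step): start from x = exp(1), output floor(x), and iterate x -> 10*(x - floor(x)).

from decimal import Decimal, getcontext

getcontext().prec = 60        # 60 significant digits of working precision

x = Decimal(1).exp()          # e, computed from the initial value 1
digits = []
for _ in range(40):           # stay well inside the working precision
    d = int(x)                # floor, since x >= 0
    digits.append(d)
    x = 10 * (x - d)          # shift the next decimal digit into the integer part
print(digits)                 # [2, 7, 1, 8, 2, 8, 1, 8, 2, 8, 4, 5, ...]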
If so, we can consider "records" depending on the maximum complexity of the function.
Finally, it's also possible that different functions perform best in different ranges: e.g., one might be "ahead" (requiring less precision in the initial value) for producing the first 100 terms, another might be "better" for terms 200 through 500, and yet another beyond that. But even in that case, the question is interesting.
I think we can measure the "performance" of a function by looking at (minus the log of) the difference between the largest and smallest initial values that produce (a(1), ..., a(n)), even though that might not always be a monotonic function of n. That might give a curve of roughly the same shape independently of the function and/or the encoded sequence (at least as long as these are roughly similar). We could check this already for the four examples at hand. If so, then we can compare performance simply by comparing the respective slopes. (That should be consistent with, if not equivalent to, the values from Leo, i.e., the number of correct terms obtainable with float64, i.e., epsilon = 2^-53.)
- Maximilian
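That width is easy to compute exactly for a decoder built from floor and arithmetic, at least when f is increasing on each interval [a, a+1), as it is for the illustrative f(x) = floor(x)*(x - floor(x) + 1) above: propagate the admissible interval backwards from the last required term. A rough Python sketch (again only an illustration, not necessarily matching your four examples):

from fractions import Fraction
from math import log2

def precision_needed(terms):
    """Interval of initial values x for which iterating
    f(x) = floor(x)*(x - floor(x) + 1) outputs exactly `terms`,
    found by propagating intervals backwards (f is linear and
    increasing on each piece [a, a+1)), and -log2 of its width."""
    a = terms[-1]
    lo, hi = Fraction(a), Fraction(a + 1)       # only need the floor of the last iterate
    for a in reversed(terms[:-1]):
        # preimage of [lo, hi) under y = a*(x - a + 1), intersected with [a, a+1)
        lo = max(Fraction(a), a - 1 + lo / a)
        hi = min(Fraction(a + 1), a - 1 + hi / a)
        if lo >= hi:
            return None                         # no initial value produces `terms`
    return lo, hi, -log2(float(hi - lo))

target = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # first 10 primes
lo, hi, bits = precision_needed(target)
print(float(lo), float(hi), bits)               # ~2.9200509..., and the bits required

The -log2 of the width is then the number of bits of precision the initial value must carry, which should indeed line up with the float64 counts at around 53 bits.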
In the end, this all boils down to Kolmogorov complexity, which depends on the choice of language.

--
Mike Stay - metaweta@gmail.com
http://math.ucr.edu/~mike
https://reperiendi.wordpress.com