Re: [math-fun] How miraculous was the origin of life? Experiment suggested to answer this.

7 May 2015

If M is the minimal genome length of a "lifeform capable of independent
existence", then it strikes me as unlikely that many genomes of length M
would 'work'. (At least if you count properly -- two different ways of
coding the same amino acid shouldn't count as different in this context.
Otherwise you introduce a fairly predictable exponential factor, something
like (64/20)^(n/3) on average for a genome of n base pairs. In any case the
remaining entropy is quite high.) I'd expect that almost any change would
render it useless -- maybe some transpositions would be possible, but
probably not many. (At that level of compression, many sections are
probably being reused in strange ways.) Of course, even if you could break
it up into 30 sections (average length just 100bp!) and reorder them
freely, you'd only gain log2(30!) ~ 107 bits toward the 1000*log2(20) ~
4322 bits of a 3000bp genome, leaving you with much too much entropy.
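
The bit counts in the paragraph above can be checked with a few lines of
Python. This is only a sketch of the arithmetic, assuming a 3000bp genome
read as 1000 codons over 20 amino acids and 30 freely reorderable
sections; the variable names are made up for illustration.

```python
import math

GENOME_BP = 3000                     # genome length used in this thread
codons = GENOME_BP // 3              # 1000 codons

raw_bits = GENOME_BP * math.log2(4)  # 2 bits per base pair
aa_bits = codons * math.log2(20)     # synonymous codons counted once
perm_bits = math.log2(math.factorial(30))  # orderings of 30 sections

print(f"raw sequence entropy:      {raw_bits:.0f} bits")
print(f"after codon degeneracy:    {aa_bits:.0f} bits")
print(f"gain from 30! reorderings: {perm_bits:.1f} bits")
print(f"entropy still unexplained: {aa_bits - perm_bits:.0f} bits")
```

The last figure is the "much too much entropy" left over after allowing
for both effects.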

For genomes of length M + 100, say, you can get lots of viable lifeforms by
adding noncoding sections, but you don't have that option with length M.

In any case I don't think that a single 3000bp strand of DNA could
reasonably form by chance, let alone N/4^3000 of them. I suspect
abiogenesis was much more subtle.

Charles Greathouse
Analyst/Programmer
Case Western Reserve University

On Thu, May 7, 2015 at 2:46 PM, Warren D Smith <warren.wds@gmail.com> wrote:
...
If the simplest lifeform capable of independent existence had, say, 3000
base pair long DNA --actually I think the least the "minimal genome
project" has been able to come up with is more like 100 times that --
then you might say "the chance of that is 4^(-3000) which is about
10^(-1806), which is so small that even if every atom in the
observable universe were trying a new 3000-long DNA sequence every
femtosecond, life almost certainly would still never have come into
existence anywhere ever...  therefore, life is a miracle and Earth is
likely the only place in the universe that has any."
However, that calculation was wrong because more than one of those 4^3000
sequences probably works to produce a viable lifeform.  In fact the
number that work is probably also enormous. If the number were N then
the viability probability is more like P=N/4^3000, and it is that
chance P that really needs to be used to assess the miraculousness.
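
As a rough check of the numbers above, the following sketch works in
log10 to avoid overflow. The 10^80 atom count and the 13.8 billion year
age of the universe (about 4.35e17 seconds) are assumptions brought in
for illustration, not figures from this message.

```python
import math

L = 3000                                 # genome length in base pairs
log10_space = L * math.log10(4)          # log10 of 4^3000

# Assumptions (not from the message): ~10^80 atoms, ~4.35e17 s elapsed.
log10_atoms = 80
log10_trials_per_atom = math.log10(4.35e17 / 1e-15)  # one try per femtosecond
log10_trials = log10_atoms + log10_trials_per_atom

# Number N of viable sequences needed for about one expected success.
log10_N_needed = log10_space - log10_trials

print(f"4^3000 ~ 10^{log10_space:.1f}")
print(f"total trials ~ 10^{log10_trials:.1f}")
print(f"N needed for one expected hit ~ 10^{log10_N_needed:.1f}")
```

Under those assumptions, N would have to be within roughly 113 orders of
magnitude of 4^3000 for a random hit to be expected.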
OK, that brings me to my point.  We can do an experiment to
approximately measure P.
Start with some near-minimal bacterial genome which has, say, G base
pairs.  Randomly mutate K of its base pairs.
There are binomial(G,K)*3^K possible mutated genomes obtainable in this
way, and we are taking a uniform random sample among them.  When you do
this, count how many of the resulting bacteria remain viable versus how
many are rendered unviable.
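
The sampling step can be sketched in Python as follows. Choosing K
distinct positions uniformly and replacing each base with one of the 3
other bases gives a uniform draw from the binomial(G,K)*3^K mutants. The
function names are hypothetical, and `is_viable` is a placeholder for the
laboratory viability test, which cannot be simulated here.

```python
import math
import random

BASES = "ACGT"

def count_k_mutants(G, K):
    """Number of genomes differing from the original in exactly K positions."""
    return math.comb(G, K) * 3**K

def random_k_mutant(genome, K, rng=random):
    """Uniform random K-mutant: K distinct positions, each set to another base."""
    seq = list(genome)
    for p in rng.sample(range(len(seq)), K):
        seq[p] = rng.choice([b for b in BASES if b != seq[p]])
    return "".join(seq)

def estimate_F(genome, K, trials, is_viable, rng=random):
    """Fraction of sampled K-mutants that the (hypothetical) assay calls viable."""
    hits = sum(bool(is_viable(random_k_mutant(genome, K, rng)))
               for _ in range(trials))
    return hits / trials
```

For example, `count_k_mutants(10, 2)` is 45 * 9 = 405.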
The result of such an experiment is a function F(K) estimating the
chance that mutating K of the base pairs still yields a viable
lifeform.
We will know, to good accuracy, the values of F(1), F(2), F(3), etc.
for some set of K's.
We then want to EXTRAPOLATE this function to determine the value of F(G-1),
which is, essentially, equal to P, the life-viability chance we were
seeking.
To perform this extrapolation we need to obtain an empirical formula
that fits the data F(1), F(2), F(3) etc that we have.  I doubt this
extrapolation will be very difficult.
In fact I might a priori suspect that F(K) = K^Q * C^K,
for suitable fitting constants Q and C, will work decently.
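
One possible way to fit that form is ordinary least squares on
log F(K) = Q*log(K) + K*log(C), which is linear in Q and log(C). The
sketch below is an assumption about how the fit might be done, not a
method given in this message; it needs every F value to be positive and
at least two distinct K values. The data in the example are synthetic,
not measurements.

```python
import math

def fit_power_exp(ks, fs):
    """Least-squares fit of log F = Q*log(K) + K*log(C); returns (Q, C)."""
    x1 = [math.log(k) for k in ks]
    x2 = [float(k) for k in ks]
    y = [math.log(f) for f in fs]
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * t for a, t in zip(x1, y))
    s2y = sum(b * t for b, t in zip(x2, y))
    det = s11 * s22 - s12 * s12          # 2x2 normal equations, Cramer's rule
    q = (s1y * s22 - s2y * s12) / det
    ln_c = (s11 * s2y - s12 * s1y) / det
    return q, math.exp(ln_c)

def log10_F(k, q, c):
    """log10 of the fitted F(K), safe for K far outside the sampled range."""
    return (q * math.log(k) + k * math.log(c)) / math.log(10)

# Synthetic check with made-up constants Q = 0.5, C = 0.9.
ks = [1, 2, 3, 4, 5, 6]
fs = [k**0.5 * 0.9**k for k in ks]
q, c = fit_power_exp(ks, fs)
print(q, c, log10_F(3000, q, c))
```

Working in log10 matters for the extrapolation, since F at K near G would
underflow an ordinary floating-point number.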
Thinking some more, it might be possible to do an even better job than
that.
We can do a different kind of experiment to attempt to estimate
F(K+J)/F(K), by starting with a viable K-mutant, and generating
J-mutants of it.
We might be able to reach very large K values this way and thus build
a long chain of such ratio-estimates.
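
Chaining the ratio estimates amounts to multiplying them together, or
adding their logarithms. A minimal sketch, with hypothetical names and
made-up numbers in the example:

```python
import math

def chain_log10_F(k_start, log10_F_start, steps):
    """Accumulate ratio estimates.

    steps is a list of (J, ratio) pairs, where ratio estimates
    F(K+J)/F(K) at the K reached so far. Returns (K, log10 F(K)).
    """
    k, log10_f = k_start, log10_F_start
    for j, ratio in steps:
        k += j
        log10_f += math.log10(ratio)
    return k, log10_f

# Made-up illustration: F(10) = 0.5, then three steps of J = 10.
k, lf = chain_log10_F(10, math.log10(0.5), [(10, 0.4), (10, 0.5), (10, 0.5)])
print(k, lf)   # K = 40, log10 F(40) = log10(0.05)
```

Each ratio measured from a single viable K-mutant is conditional on that
particular starting genome, so it would presumably need averaging over
several viable starting mutants before being used in the chain.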