Re: virus: Sexuality

zaimoni@ksu.edu
Wed, 18 Sep 1996 00:09:09 -0500 (CDT)


On Tue, 17 Sep 1996, ken sartor wrote:

> At 09:14 AM 9/17/96 -0500, zaimoni@ksu.edu wrote:

> >Let me add one of my own:
> > IF random-mutation-based biological evolution is to be mathematically
> >coherent, [as far as I know], one must have either the many-worlds
> >interpretation of quantum mechanics, or a steady-state universe. [Roger
> >Penrose's ideas ARE a steady-state universe.] The currently-inferrable
> >space-time is woefully inadequate, and cannot even account for the
> >Von-Neumann minimal life-form. [1500 bits, one-shot 10^(-450) against]
> >[For reference: the smallest biological virus is larger than a
> >Von-Neumann minimal life-form. The smallest computer virus I have data on
> >DOES go under--by 4 BITS! This isn't really enough to make an impact on
> >the calculations. There is not much difference between 10^(-250) against
> >and 10^(-248) against. Besides, whether a computer virus is
> >self-contained is an interesting question--biological life doesn't have
> >the 'copy codon'!]
> > All of which can be escaped by simply not requiring the mutations to
> >be random. Saying that the mutations are random imposes a mathematical
> >structure which allows numerical calculation. [Note that natural
> >selection doesn't really alter these results. First, because whether
> >pruning occurs immediately or later has no effect here. Second, because
> >I'm talking about getting to the point where one has a sufficiently alive
> >life-form to evolve!]
>
> I don't know how one can reasonably compute the probabilities you
> have stated here. However, seeing that life is here changes
> everything. The probability of life arising in this universe is
> 1. Interesting data would be on seeing how prevalent it
> is in the universe (and how diverse).

Of course the probability of life arising in this universe is 1. It
then requires more faith [Virian definition] than I have to insist
that originating methods with amazingly low a priori chances of
success must be preferred to originating methods whose a priori
chances are NOT measurable. [The a priori chance IS an experiment!]

The Von Neumann limit is 1950's material--mathematics, so it's not
going to change fast. In particular, it doesn't really care about HOW
the DNA/RNA/etc. is implemented. 1500 bits means just what it sounds
like in CIS: it selects an event of probability 1 in
2^1500=(2^10)^150. The latter is bounded below by (10^3)^150 i.e.
10^450. [Thus the 10^(-450) for one-shot.]
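
[If anyone wants to check that bound mechanically, here is a minimal
sketch in Python; the choice of language is incidental to the
argument:]

    # Python integers are arbitrary-precision, so this check is exact.
    assert 2**1500 >= 10**450        # (2^10)^150 >= (10^3)^150
    print(len(str(2**1500)))         # 452 digits: 2^1500 is about 10^451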

Note that redundancy reduces the information content. The technical
definition is to take a ground set of states S_i, compute their
probabilities P_i of occurring, and then take -log_2(P_i) to be the
information content of state S_i. [The i must match, and there should
only be a finite number of ground states.] Thus [using ACGT as our ground
set], it will take at least 750 bases i.e. 250 codons to specify a Von
Neumann minimal virus, but it can take more. [Does anyone have recent
enough data on biological virus sizes to contradict this? Exclude
prions--no DNA or RNA, and it's controversial whether they exist. The
smallest instance I recall finding a reference to is ~260 codons.]
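
[A minimal sketch of the definition above, again in Python; the
uniform distribution over ACGT is an assumption of the illustration,
not part of the definition:]

    from math import log2

    def information_content(p):
        # Information content, in bits, of a state with probability p.
        return -log2(p)

    # Uniform ground set {A, C, G, T}: each base carries 2 bits.
    bits_per_base = information_content(1/4)   # 2.0
    print(1500 / bits_per_base)                # 750 bases
    print(1500 / bits_per_base / 3)            # 250 codons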

Information, as defined above, is very close to the negative of
entropy [negentropy], in quantum-mechanical terms.

Besides the domain question [to be dealt with below], I
am presuming that:
1) We are interested in life based on conventional matter.
2) We are restricted to the apparently-observable universe
[many-worlds violates this trivially, as does infinite-time i.e.
steady-state. That's why I excluded them.]
3) Such life must contain at least one electron per organism.
Thus, I cannot possibly have more organisms around than electrons [the
latter number IS fuzzy--I use 10^130 below.] [The point is to
deliberately favor success.]
4) For each candidate, I [Nature] am allowed to replace it at most
once per second. Note that until a functional candidate occurs
[self-reproducing in a suitable environment], natural selection will
reject all candidates equally--we need something that replicates before
any normal evolution kicks in. For conventional-matter life, this rate
seems extremely fast. [Again, the point is to deliberately favor success.]
5) We need an estimate for observable space-time, of course. The
big-bang limit is a first guess. Pick your favorite estimate, and
convert to seconds. [The most liberal estimate I've seen is around 20
trillion [AMERICAN; read as 20*10^12] years, back in the 1950's--
roughly 6*10^20 seconds. I will inflate this all the way to 10^70
seconds, once more to deliberately favor success; shorter estimates
only shrink the sample.]
The sample size then is (# of electrons in estimate)*(# of seconds in
estimate). For the numbers I used, this is 10^130 * 10^70 = 10^200.
You may recompute with your own numbers; the sketch below makes that
easy.
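
[A minimal sketch for that recomputation; the exponents are
parameters, and the defaults are the deliberately generous ones I used
above:]

    # Sample size = (# of electrons) * (# of seconds), in powers of 10.
    LOG10_ELECTRONS = 130   # deliberately generous estimate
    LOG10_SECONDS   = 70    # deliberately generous estimate
    LOG10_SPACE     = 450   # 1500-bit target: 2^1500 > 10^450

    log10_samples = LOG10_ELECTRONS + LOG10_SECONDS   # 200
    log10_odds    = log10_samples - LOG10_SPACE       # -250
    print("sample size: 10^%d" % log10_samples)
    print("one-shot odds of a hit: about 10^%d" % log10_odds)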
We now must decide on a method of generating our sample. [Note that
partial-match detection would speed this up a LOT--but natural
selection is assumed to be unable to distinguish between various total
failures, so it won't notice partial matches. The Von Neumann limit
doesn't presume 100% accurate copying.]
We could, of course, use an exponential model--sampling with
duplication. This gives slightly worse numbers than the one I quoted.
[Even if our sample size were the full 10^450, sampling with
duplication would still leave about a 1/e chance of missing the
target, since (1-1/n)^n converges to 1/e [e=2.718281828....] This is
basic after one has been through a graduate-level analysis course.
Since we are short by a factor of 10^(-250), the chance of a hit is
1-(1-10^(-450))^(10^200), which is approximately
10^200 * 10^(-450) = 10^(-250)--in fact slightly BELOW that, since
duplicated samples are wasted.]
Sampling without duplication is easier to compute: simply divide
one's sample size by the total sample space. This gives the 10^(-250) I
cited. It is clearly a gross overestimate.
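
[A minimal sketch of both sampling models. Floats cannot hold
10^(-450), so the tiny probabilities are handled in powers of 10; the
1/e limit is demonstrated with a modest n that a float CAN handle:]

    from math import e

    # The 1/e limit: (1 - 1/n)^n -> 1/e as n grows.
    n = 10**6
    print((1 - 1/n)**n, 1/e)          # both ~0.367879

    # With duplication:    P(hit) = 1-(1-p)^N ~= N*p  (since N*p << 1)
    # Without duplication: P(hit) = N / space  =  N*p  exactly
    log10_N, log10_space = 200, 450
    print("log10 of P(hit): about", log10_N - log10_space)   # -250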
In other words, the numbers I mentioned were highly biased in favor
of the hypothesis I was computing the a priori chances of.

> Note as a side issue that _random_ mutation may be considered a
> reserved phrase. I.e., it is restricted by the laws of chemistry
> and physics and hence not totally random...

Certainly, one must always state the environment from which one picks
a 'random' variable. I actually want more restriction than just 'the
laws of chemistry and physics'--I need something that will let the
candidates survive long enough to do something interesting. [The
surface of the sun is restricted that way, and it's not going to
work.] I am assuming that we are not allowed to change the interpreter
to match whatever happened to come up. [For a full-blown cellular
organism, we must bootstrap an interpreter within the coding. For a
virus-type organism, the interpreter is given to us.]

I'm not assuming a particular representation--I don't really care
about the exact nucleic acid setup, or whether I'm looking at
carbon-based life. I am presuming that each attempt has some sort of
defined ground set for the DNA-analog. I am also presuming that, on a
macroscopic scale, there are no special choices of sequence that show
up unusually often [or unusually rarely] according to various
statistical tests, compared to the self-replicating one we are trying
to construct. [Clearly, once we have our life-form, this assumption is
violated.]
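
[The uniformity presumption can also be phrased as a sketch; the
sequence length and trial count here are tiny, purely for
illustration:]

    from collections import Counter
    from random import choice

    GROUND_SET = "ACGT"
    LENGTH = 3
    TRIALS = 400000

    # Under the presumption, every length-3 sequence appears with
    # frequency ~ 1/4^3, up to sampling noise.
    counts = Counter("".join(choice(GROUND_SET) for _ in range(LENGTH))
                     for _ in range(TRIALS))
    expected = TRIALS / len(GROUND_SET)**LENGTH
    worst = max(abs(c - expected) / expected for c in counts.values())
    print("worst relative deviation from uniform: %.3f" % worst)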

//////////////////////////////////////////////////////////////////////////
/ Kenneth Boyd
//////////////////////////////////////////////////////////////////////////