Bits & Bio Log: Week 1

6 min read · Apr 23, 2021

*Chemical evolution in the primordial Earth (credit: Quanta Magazine)*

Hello! Welcome to the first log, which I am admittedly posting quite late and retroactively. It covers the resources I used, the questions I had, and the tentative answers I derived or found during this week (March 29 to April 2).

This week I mainly got started learning basic Biochemistry and understanding the main components of typical BioEngineering majors. This led me to some interesting reading on:

Origins of life
Biology seen from a computer scientist’s eye
Minimal organisms/genomes

Trailhead (links/resources)

Building up the basics:

Biochemistry by Voet and Voet (I'm reading the 4th edition virtually and ordered the 2nd edition in the meantime from the awesome resource AbeBooks: 2nd edition, 4th edition. The 2nd edition was less than 12 dollars with shipping when I ordered!)

MIT BioE Major Concentration Plans (was trying to learn about topics in BioE):

Synthetic Biology, Tissue Engineering, Microbiome and Infectious Disease, Immunoengineering, Computational Systems Biology, Biomolecular and Biomaterial Design

Interesting Articles and Videos:

Reverse Engineering the source code of the BioNTech/Pfizer SARS-CoV-2 Vaccine
Bert Hubert’s description of DNA and other interesting cellular mechanisms from the perspective of a computer scientist
A similar resource, but this time written out and with a bibliography
That “junk” DNA … is full of information
Marbled Lungfish! which supposedly has the largest genome (at least at the time of Hubert’s talk, linked above)
Cool, more accurate picture of tree of life, compared to a lot I’ve seen in textbooks

Building minimal genomes (JCVI-synX.0*):

Nat Geo Overview
Wikipedia page on the JCVI results
Recent related work regarding genes for cell division, Cell Article
Wikipedia page to learn more about the transposon mutagenesis mentioned in the Cell paper. McClintock’s paper: I might glance through it at some point, since I still don’t get how this thing works.
Science Article
Two related papers I still need to read:
https://www.nature.com/articles/s41467-020-14545-0
https://www.nature.com/articles/s41467-020-14545-0
Gen-chem Review on Khan Academy

Origins of life:

Article on origins of life probably not being due to any pure type of macromolecule
Older article on chemical evolution
Another one on a thermodynamic explanation for why life makes sense

Interesting quotes from these:

“‘A system that uses information the way organisms use genetic information — to synthesize their own components — must contain reflexive information,’ [Peter] Wills said. He defined reflexive information as information that, ‘when decoded by the system, makes the components that perform exactly that particular decoding.’ The RNA of the RNA world hypothesis, he added, is just chemistry because it has no means of controlling its chemistry. ‘The RNA world doesn’t tell you anything about genetics,’ he said.”

“Since then, he, Wills and others have collaborated on a theory that circles back to that research. Their main goal was to figure out the very simple genetic code that preceded today’s more specific and complicated one. And so they turned not just to computation but also to genetics.

At the center of their theory are 20 ‘loading’ molecules called aminoacyl-tRNA synthetases. These catalytic enzymes allow RNA to bond with specific amino acids in keeping with the rules of the genetic code. “In a sense, the genetic code is ‘written’ in the specificity of the active sites” of those enzymes, said Jannie Hofmeyr, a biochemist at Stellenbosch University in South Africa, who was not involved in the study.”

“Their argument centers on the relative simplicity of the ribosomal core, more formally known as the peptidyl transferase center (PTC). The PTC’s job is to bring together amino acids, the building blocks of proteins.”

Mike Snyder’s lab

Scenic Points (questions/thoughts)

How much DNA do you need to read from a gene to uniquely identify it in the human genome?
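A rough lower bound for this one can be sketched with a counting argument: a read of length k can take at most 4^k distinct values, so k has to be large enough that 4^k is at least the number of positions in the genome. The sketch below assumes a haploid human genome of about 3.1 billion base pairs (my figure, not from the sources above) and treats the sequence as random, which real genomes are not. Repeats mean the practical answer is longer, so this is only a floor. The function name is hypothetical.

```python
def min_unique_length(genome_bp: int, alphabet_size: int = 4) -> int:
    """Smallest k such that alphabet_size**k >= genome_bp.

    This is a counting lower bound only: it assumes a random sequence
    and ignores repeats, so real reads usually need to be longer.
    """
    k = 0
    while alphabet_size ** k < genome_bp:
        k += 1
    return k


# Assumed genome sizes in base pairs (the human figure is an approximation).
HUMAN_GENOME_BP = 3_100_000_000
JCVI_SYN3_BP = 531_560  # figure quoted later in this log

print("human lower bound:", min_unique_length(HUMAN_GENOME_BP))
print("JCVI-syn3.0 lower bound:", min_unique_length(JCVI_SYN3_BP))
```

Under those assumptions the floor is 16 bases for the human genome and 10 for JCVI-syn3.0.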

What would be the ideal DNA format/organization to make it easier to read?

What would it take to direct DNA changes to single cells?

Can we build a disassembler for DNA?

Well, at the very least, the ribosome is an ALU, machine code is the mRNA, and the protein is the value in $eax (I feel like that analogy broke down fast…)
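To make the "disassembler" idea a bit more concrete, here is a toy sketch of the decoding step: it reads an mRNA string three bases at a time and looks each codon up in the standard genetic code, stopping at the first stop codon. It is only an illustration of the lookup. It assumes the reading frame starts at the first base, ignores start-codon search, splicing, and everything else a real cell does, and the function and variable names are mine.

```python
# Standard genetic code, laid out in the conventional U, C, A, G order:
# index = 16 * first_base + 4 * second_base + third_base. "*" marks a stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string from its first base until a stop codon.

    Returns one-letter amino acid codes. Trailing bases that do not
    form a full codon are ignored.
    """
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon: the ribosome releases the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCCUGGUAA"))  # AUG GCC UGG UAA
```

The example string decodes as Met, Ala, Trp and then stops, so it prints "MAW".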

Better question: What state does the information encoded in DNA deal with? What are the levels of abstraction of the operating cell?

How do the transport mechanisms of a cell work (inside organelles, across the cell, across the membrane)?

What is the right order of complexity to imagine a prokaryotic cell as? Is it right to think of it as a factory, city, nation in the complexity of its infrastructure?

Just as an IC needs manufacturing precision at the resolution of a house-granular map of the US, what’s the resolution needed to entirely describe a cell, i.e. accurately propagate its state forward (and is there a significant quantum mechanical barrier here)? What is the precision needed to predict if a chemical reaction will occur one way or another? What can then be averaged out, and what level of abstraction does each averaging out correspond to?

What is the optimum state of a cell? And what deviations express decline in health? How does the health of a cell compare to a tissue, organ, or the whole human body?

Could you evolve a smaller genome by limiting the time to reproduce somehow, while restricting the number of “parallel workers” doing the copying (and what are these workers called)?

What are the control mechanisms that coordinate the myriad biochemical reactions that take place in cells and in organisms?

What does a ribosome look like? How does it work? (Answers here, need to read this tho)

How does a cell know to only take one of each chromosome type during division, and how does it “know” X and Y chromosomes are corresponding chromosomes?

Can crossover occur in the middle of a gene? Yes, see how the rII gene was analyzed

How is a “gene” defined in terms of nucleic acid sequences, in the one-gene:one-polypeptide sense? That is, how can you distinguish it without reading out the proteins defined by the sequence and seeing if they’re found in the cell?

Can ribosomes make more ribosomes? Apparently not, but why not? Is it because they don’t assemble nucleic acids?

Are there theoretical proto-ribosomes that could replicate themselves?

What, in terms of chemical reactions, allows us to keep existing, especially in the early evolution of life? There must have been something to imbalance the environment so that the reactants of early Earth didn’t all just go to equilibrium: lightning?

How is DNA actually sequenced?

What is the current state of knowledge regarding junk DNA (and what’s non-coding RNA)?

What’s a selfish genetic element? Wiki Article

From secondary readings of what was done for T4 rII, I’m unsure what exactly Benzer was confident about and what we have come to confirm with experimentation since then. These genetics procedures seem so indirect to me that I’m not sure how these researchers used their experiments to determine properties of how DNA interacts with itself and other molecules. Corresponding Wikipedia page

Making Camp (tentative answers)

What does hydrogen look like:

Bond length of hydrogen gas is 74 pm, which is the shortest of all: https://courses.lumenlearning.com/introchem/chapter/bond-lengths/#:~:text=Bonds%20involving%20hydrogen%20can%20be,distance%20between%20two%20identical%20atoms.
Van der Waals radius of hydrogen is 120 pm: half the closest distance of two equal, non-covalently bound atoms
For atomic hydrogen the Bohr radius is about 53 pm
https://chemistry.stackexchange.com/questions/115363/is-a-hydrogen-molecule-smaller-than-hydrogen-atom

What’s the simplest model organism we can work with to understand the building blocks of life:

JCVI-syn3.0 is one of the simplest organisms we might hope to understand, based on number of genes:
It is roughly 1 micrometer across; divide it into voxels with atomic-hydrogen-sized or hydrogen-bond-sized edges:
So on the order of 10¹² voxels… which is a… lot, and this is a TINY cell
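The voxel estimate above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch, assuming a cube 1 micrometer on a side around the cell and using the two hydrogen length scales from the previous section (53 pm and the 74 pm H–H bond length) as candidate voxel edges. The function name is just for illustration.

```python
def voxel_count(cell_width_pm: float, voxel_edge_pm: float) -> float:
    """Number of cubic voxels needed to fill a cube of the given width."""
    voxels_per_edge = cell_width_pm / voxel_edge_pm
    return voxels_per_edge ** 3


CELL_WIDTH_PM = 1e6  # roughly 1 micrometer, expressed in picometers

# Candidate voxel edge lengths, taken from the hydrogen numbers above.
for label, edge_pm in [("53 pm (atomic hydrogen)", 53.0),
                       ("74 pm (H-H bond length)", 74.0)]:
    print(f"{label}: about {voxel_count(CELL_WIDTH_PM, edge_pm):.1e} voxels")
```

Both choices land between 10¹² and 10¹³ voxels, consistent with the order-of-magnitude figure above.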

If we were to try to create a visual model for this, how complex would it need to be:

Our eyes can resolve 1 mm at up to 11 ft away
JCVI-syn3.0 is a sphere about 10⁶ pm across and atomic hydrogen is about 53 pm across, so if we wanted to model all the molecules at even the fuzziest of resolutions we’d need a simulated sphere with 10⁶/(53/2) ≈ 4×10⁴ elements (pixels) across its diameter. If these are each 1 mm wide, we have a sphere about 40 m across to simulate the entire state of the simplest living cell
JCVI-syn3.0 has a 531,560 base pair genome, or 531,560 × 2 bits = 531,560 × 2 / 8 bytes = 132,890 bytes ≈ 130 KiB of code
Then how much is the memory needed for a cellular VM?
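Redoing the arithmetic in this list as a quick script, using only the numbers quoted above (10⁶ pm cell diameter, 53/2 pm elements, 1 mm pixels, a 531,560 bp genome) plus the assumption that each base is stored in 2 bits because there are four possible bases. The constant names are mine.

```python
# Visual model: how big is the sphere if every element is a 1 mm pixel?
CELL_DIAMETER_PM = 1e6   # cell diameter in picometers
ELEMENT_PM = 53 / 2      # element size used in the estimate above
PIXEL_MM = 1.0           # smallest feature the eye resolves at ~11 ft

elements_across = CELL_DIAMETER_PM / ELEMENT_PM
model_diameter_m = elements_across * PIXEL_MM / 1000  # mm -> m

# Genome size: 4 possible bases means 2 bits per base.
GENOME_BP = 531_560
BITS_PER_BASE = 2

genome_bits = GENOME_BP * BITS_PER_BASE
genome_bytes = genome_bits // 8
genome_kib = genome_bytes / 1024

print(f"elements across diameter: {elements_across:.2e}")
print(f"model sphere diameter: {model_diameter_m:.1f} m")
print(f"genome: {genome_bits} bits = {genome_bytes} bytes = {genome_kib:.1f} KiB")
```

This gives roughly 3.8×10⁴ elements across, a model sphere a bit under 40 m wide, and 132,890 bytes (about 130 KiB) for the genome.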

Can crossover occur in the middle of a gene? Yes, see how the T4 rII gene was analyzed; it’s pretty interesting.

Written by Ishan Gaur