Proposed triplet nature of genetic code:
George Gamow (physicist, 1954): argued mathematically that 3-nucleotide codons give 4^3 = 64 combinations, sufficient for 20 amino acids.
Crick (1961): experimentally proved code is triplet using T4 phage frameshift mutations.
Nirenberg+Matthaei (1961): cracked first codon (UUU = Phe).
Khorana: synthesised polynucleotides to decipher full code.
Answer: George Gamow
George Gamow (physicist, 1954): first proposed genetic code is a triplet code. Mathematical argument: singlet (4^1 = 4) and doublet (4^2 = 16) insufficient for 20 amino acids. Triplet (4^3 = 64) gives 64 combinations, more than enough for 20 amino acids. Francis Crick, Sydney Brenner et al. (1961): proved code is triplet using frameshift mutations in T4 bacteriophage rII gene. Adding or deleting 1 or 2 nucleotides = reading frame shifted = mutant. Adding or deleting 3 nucleotides = reading frame restored = near-normal function. Marshall Nirenberg and Johann Matthaei (1961): broke first codon. Cell-free protein synthesis system. Poly-U RNA produced poly-phenylalanine = UUU codes for Phe. Nobel Prize 1968: Nirenberg, Khorana, Holley. Khorana: synthesised poly-AC, poly-AG etc. to decipher code. Holley: determined tRNA structure.
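Gamow's counting argument and the frameshift logic of the Crick-Brenner experiment can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the repeating CAU toy message are assumptions made for the example, not part of the original experiments.

```python
from itertools import product

BASES = "ACGU"

def codon_count(length):
    """Count distinct words of the given length over the four RNA bases."""
    return sum(1 for _ in product(BASES, repeat=length))

def min_codon_length(n_amino_acids=20):
    """Smallest codon length whose word count covers all amino acids."""
    length = 1
    while codon_count(length) < n_amino_acids:
        length += 1
    return length

def read_codons(seq):
    """Split a sequence into consecutive triplets from position 0,
    dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, codon_count(n))           # 4, 16 and 64 combinations
    print(min_codon_length())              # shortest codon covering 20 amino acids

    message = "CAU" * 5                    # toy message (hypothetical)
    plus_one = message[:3] + "G" + message[3:]      # one inserted base
    plus_three = message[:3] + "GGG" + message[3:]  # three inserted bases
    print(read_codons(message))
    print(read_codons(plus_one))    # every codon after the insertion changes
    print(read_codons(plus_three))  # one extra codon, frame restored afterwards
```

With one inserted base every downstream triplet is misread; with three inserted bases only one extra codon appears and the original frame resumes, which is the pattern the rII experiments relied on.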
Universal: nearly the same in all organisms. Exceptions: some mitochondria (UGA = Trp not stop), Mycoplasma (UGA = Trp). Triplet: 3 nucleotides per codon. Degenerate: multiple codons for same amino acid. Serine has 6 codons (UCU, UCC, UCA, UCG, AGU, AGC). Leucine has 6. Arginine has 6. Methionine has 1 (AUG). Tryptophan has 1 (UGG). Degeneracy mostly at 3rd codon position (wobble position). Unambiguous: each codon specifies only one amino acid. Non-overlapping: each nucleotide belongs to only one codon. Commaless (non-punctuated): no spacers between codons - read continuously. Start codon: AUG (Met in eukaryotes, fMet in prokaryotes). Stop codons: UAA, UAG, UGA. Sense (coding) strand: same sequence as mRNA, with T in place of U. Template strand: antisense, used as template by RNA polymerase.
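The properties listed above (degenerate, unambiguous, non-overlapping, commaless, with start and stop codons) can be illustrated with a small Python sketch of the standard codon table. The compact 64-letter string is the usual one-letter encoding of the standard code in U, C, A, G order; the names `CODON_TABLE`, `degeneracy` and `translate` are made up for this example, and the translation routine is deliberately simplified (first AUG, no organism-specific exceptions).

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acids for the 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Unambiguous: a dict maps each codon to exactly one amino acid (or stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def degeneracy():
    """Number of codons per amino acid (and per stop signal '*')."""
    return Counter(CODON_TABLE.values())

def translate(mrna):
    """Read non-overlapping triplets from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":          # UAA, UAG or UGA
            break
        protein.append(aa)
    return "".join(protein)

if __name__ == "__main__":
    counts = degeneracy()
    print(counts["S"], counts["L"], counts["R"])  # six codons each
    print(counts["M"], counts["W"])               # one codon each
    print(translate("GGAUGUUUUGGUAAUUU"))         # Met-Phe-Trp, then stop
```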
Nirenberg-Matthaei (1961): cell-free protein synthesis. poly-U = poly-Phe (UUU = Phe). poly-A = poly-Lys (AAA = Lys). poly-C = poly-Pro (CCC = Pro). Khorana (1960s): synthesised defined polynucleotides. Alternating AC (poly-ACACAC): read as ACA-CAC-ACA-CAC in any frame, produced alternating Thr-His-Thr-His. This alone cannot tell which codon is which: either ACA = Thr and CAC = His, or ACA = His and CAC = Thr. Other data resolved the ambiguity: ACA = Thr, CAC = His. Trinucleotide binding technique (Nirenberg and Leder, 1964): specific trinucleotides cause specific aminoacyl-tRNA to bind ribosomes. Tested all 64 triplets. By 1966: all 64 codons assigned. Wobble hypothesis (Crick, 1966): explains degeneracy at 3rd codon position.
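The logic of the homopolymer and repeating-copolymer experiments can be sketched in Python: a homopolymer contains a single codon, while poly-AC contains only ACA and CAC whichever frame the ribosome starts in. The lookup table holds only the assignments quoted in these notes; the function names are hypothetical.

```python
# Codon assignments quoted in the notes above (three-letter amino acid codes).
KNOWN = {"UUU": "Phe", "AAA": "Lys", "CCC": "Pro", "ACA": "Thr", "CAC": "His"}

def codons_in_frame(rna, frame):
    """Triplets read from the given frame offset (0, 1 or 2)."""
    body = rna[frame:]
    usable = len(body) - len(body) % 3
    return [body[i:i + 3] for i in range(0, usable, 3)]

def decode(rna, frame=0):
    """Translate with the small table; unknown codons show as '???'."""
    return "-".join(KNOWN.get(c, "???") for c in codons_in_frame(rna, frame))

if __name__ == "__main__":
    print(decode("U" * 9))              # homopolymer: one codon, one amino acid
    poly_ac = "AC" * 6
    for frame in range(3):
        print(frame, codons_in_frame(poly_ac, frame), decode(poly_ac, frame))
```

Every frame yields the same alternating pair of codons, which is why the product is an alternating copolymer and why the experiment by itself cannot say which of the two codons encodes which amino acid.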
Francis Crick (1958): proposed Central Dogma. Information flow: DNA to RNA to Protein. DNA replication: DNA to DNA. Transcription: DNA to RNA. Translation: RNA to Protein. Reverse transcription: RNA to DNA (retroviruses, discovered by Temin and Baltimore, 1970 - Nobel 1975). Reverse translation does NOT occur (protein cannot be back-translated to RNA). Central Dogma exceptions: RNA replication in RNA viruses (RNA to RNA by RNA-dependent RNA polymerase). Prions: protein-based inheritance (no nucleic acid template). The central dogma is the fundamental principle of molecular biology. "The sequence information cannot be transferred back from protein to either nucleic acid." - Crick.
Watson-Crick double helix (1953). Complementary base pairs: A=T (2 H-bonds), G≡C (3 H-bonds). Antiparallel strands. Right-handed B-form (most common). A-form (RNA:DNA hybrid, dsRNA). Z-form (left-handed, GC-rich sequences). DNA replication: semi-conservative (Meselson-Stahl experiment, 1958 - definitive proof using 15N/14N isotopes). Enzymes: Helicase (unwinds), Primase (makes RNA primer), DNA polymerase III (prokaryotes, extends 5' to 3'), DNA polymerase I (removes RNA primer, fills gap), DNA ligase (seals nicks). Leading strand: continuous synthesis. Lagging strand: discontinuous (Okazaki fragments). Telomerase: adds TTAGGG repeats (vertebrates, including humans) to chromosome ends. Absent in most somatic cells (cellular aging). Active in cancer cells (immortalisation), germ cells, stem cells.
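Complementary pairing, antiparallel strands and semi-conservative replication can be illustrated with a short Python sketch. It assumes both strands are written 5' to 3', so the partner of a strand is its reverse complement; the function names and the toy duplex are hypothetical, and enzymes, primers and Okazaki fragments are not modelled.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}   # bonds in the pair each base forms

def reverse_complement(strand):
    """Partner strand, written 5' to 3' (antiparallel to the input)."""
    return "".join(PAIR[base] for base in reversed(strand))

def hydrogen_bonds(strand):
    """Total H-bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand)

def replicate(duplex):
    """Semi-conservative copy: each daughter duplex keeps one parental
    strand and gains one newly synthesised complementary strand."""
    top, bottom = duplex
    daughter_1 = (top, reverse_complement(top))        # old top + new bottom
    daughter_2 = (reverse_complement(bottom), bottom)  # new top + old bottom
    return daughter_1, daughter_2

if __name__ == "__main__":
    top = "ATGGCA"
    parent = (top, reverse_complement(top))
    print(parent)
    print(hydrogen_bonds(top))   # 2 per A-T pair, 3 per G-C pair
    print(replicate(parent))     # both daughters match the parent sequence
```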
Transcription: DNA to RNA by RNA polymerase. Template strand used. Transcription bubble: RNA pol unwinds ~17 bp. RNA synthesised 5' to 3'. Prokaryotic RNA polymerase: core enzyme (alpha2, beta, beta prime, omega) + sigma factor (recognises promoter). Promoters: -10 box (Pribnow box, TATAAT) and -35 box (TTGACA). Sigma factor: recognises and binds promoter, dissociates after initiation. Rho factor: involved in some prokaryotic termination. Eukaryotic transcription: 3 RNA polymerases: Pol I (rRNA: 28S, 18S, 5.8S, in nucleolus), Pol II (mRNA, snRNA), Pol III (tRNA, 5S rRNA). Many transcription factors. TATA box (Hogness box) at -25. CAAT box, GC box upstream elements. General transcription factors (GTFs) assemble preinitiation complex. Enhancers: distant regulatory elements (can be thousands of bp away).
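The relationship between template strand, coding strand and mRNA, plus a naive search for the two prokaryotic consensus boxes, can be sketched in Python. This is a simplification under stated assumptions: strands are written 5' to 3', the promoter search looks only for exact matches to TTGACA and TATAAT (real promoters usually deviate from consensus), and all names and sequences are hypothetical.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_5to3):
    """RNA made from a template strand. The polymerase reads the template
    3' to 5', so the RNA (5' to 3') is its reverse complement with U for T."""
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3))

def find_consensus_boxes(coding_strand):
    """Positions of an exact -35 box followed by an exact -10 box on the
    coding strand, or None if either is missing (exact match only)."""
    i35 = coding_strand.find("TTGACA")
    if i35 == -1:
        return None
    i10 = coding_strand.find("TATAAT", i35 + 6)
    if i10 == -1:
        return None
    return i35, i10

if __name__ == "__main__":
    coding = "ATGTTTTGGTAA"
    template = "TTACCAAAACAT"            # reverse complement of coding
    print(transcribe(template))          # same as coding with U in place of T
    promoter = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGGGGA"
    print(find_consensus_boxes(promoter))
```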
Eukaryotic mRNA processing: 5' capping: 7-methylguanosine cap added to 5' end. Functions: protects from exonucleases, promotes ribosome recognition (cap-dependent translation). 3' polyadenylation: poly-A tail (100-250 A residues) added to 3' end after cleavage. Functions: mRNA stability, nuclear export, translation initiation. RNA splicing: introns removed, exons joined. Carried out by spliceosome (complex of snRNPs: U1, U2, U4, U5, U6). GU-AG rule (GT-AG in the DNA): introns start with GU and end with AG. Alternative splicing: same pre-mRNA can produce different mRNAs by using different exons. Allows one gene to encode multiple proteins. Human genome: ~20,000 genes but ~100,000+ different proteins (mostly via alternative splicing). RNA editing: ADAR enzymes change A to I (inosine read as G) in some mRNAs.
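Splicing and alternative splicing can be illustrated with a toy Python example in which exon coordinates are given up front. The pre-mRNA sequence, the coordinates and the function names are all invented for illustration; a real spliceosome recognises splice sites, branch points and other signals that this sketch ignores.

```python
def splice(pre_mrna, exons):
    """Join the chosen exons, given as (start, end) slices, in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)

def removed_segments(pre_mrna, exons):
    """Stretches cut out between consecutive retained exons."""
    return [pre_mrna[end:next_start]
            for (_, end), (next_start, _) in zip(exons, exons[1:])]

def follows_gu_ag(segment):
    """Check that a removed segment starts with GU and ends with AG."""
    return segment.startswith("GU") and segment.endswith("AG")

# Toy transcript (hypothetical): exon1, intron1, exon2, intron2, exon3.
PRE_MRNA = "AUGGCC" + "GUAAGUCCAG" + "UUUGGG" + "GUACGUUUAG" + "UGGUAA"
EXONS = [(0, 6), (16, 22), (32, 38)]

if __name__ == "__main__":
    print(splice(PRE_MRNA, EXONS))                    # all three exons joined
    print(removed_segments(PRE_MRNA, EXONS))          # the two introns
    skipping = [EXONS[0], EXONS[2]]                   # exon 2 skipped
    print(splice(PRE_MRNA, skipping))                 # a shorter alternative mRNA
    print(all(follows_gu_ag(s) for s in removed_segments(PRE_MRNA, EXONS)))
```

Choosing a different subset of exons from the same pre-mRNA gives a different mature mRNA, which is the core idea behind one gene encoding several proteins.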
Operon concept (Jacob and Monod, 1961): groups of genes controlled together. Lac operon (E. coli): genes for lactose metabolism (lacZ, lacY, lacA) regulated together. Negative regulation: lac repressor (encoded by lacI) binds operator when no lactose present - blocks transcription. Inducer (allolactose) binds repressor - repressor released - transcription occurs. Catabolite repression (positive regulation): CAP (CRP) protein + cAMP activates transcription. Low glucose = high cAMP = CAP active = lac operon strongly on (only if lactose is also present). High glucose = low cAMP = CAP inactive = lac operon off or only weakly expressed (even if lactose present). Trp operon: biosynthetic (repressible) operon. Trp repressor + tryptophan (corepressor) = active repressor blocks transcription. Attenuation: when Trp is scarce, ribosome stalls at Trp codons in the leader, RNA forms anti-terminator structure, transcription continues; when Trp is abundant, terminator hairpin forms and transcription stops early. Eukaryotic gene regulation: chromatin remodelling, histone modification (acetylation, methylation), DNA methylation, transcription factors, enhancers, silencers, microRNAs, long non-coding RNAs.
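The two-switch logic of the lac operon and the repressor logic of the trp operon can be written as simple Boolean functions in Python. This is a deliberately coarse sketch: it treats expression as on or off exactly as stated in these notes, ignores basal transcription and attenuation, and uses made-up function names.

```python
def lac_repressor_bound(lactose):
    """Repressor sits on the operator unless the inducer (from lactose) is present."""
    return not lactose

def cap_active(glucose):
    """Low glucose means high cAMP, so the CAP-cAMP complex can activate."""
    return not glucose

def lac_operon_on(lactose, glucose):
    """On only when the repressor is released AND CAP is active."""
    return (not lac_repressor_bound(lactose)) and cap_active(glucose)

def trp_operon_on(tryptophan):
    """Tryptophan acts as corepressor: with it, the repressor blocks transcription."""
    return not tryptophan

if __name__ == "__main__":
    print("lactose glucose lac_on")
    for lactose in (False, True):
        for glucose in (False, True):
            print(lactose, glucose, lac_operon_on(lactose, glucose))
    print("trp present -> operon on?", trp_operon_on(True))
    print("trp absent  -> operon on?", trp_operon_on(False))
```

The truth table shows the lac operon on in only one of four conditions (lactose present, glucose absent), while the trp operon behaves in the opposite sense to an inducible system: its end product switches it off.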