Table of Contents
- Introduction & The Central Dogma
- Functions of Nucleotides
- Structure of Nucleotides
- 3.1 Nitrogenous Bases
- 3.2 The Pentose Sugar
- 3.3 Nucleosides and Nucleotides
- 3.4 Nomenclature
- Phosphodiester Bonds and Polynucleotides
- Primary Structure of Nucleic Acids
- DNA Secondary Structure: The Double Helix
- Structural Variations of DNA
- Unusual DNA/RNA Structures
- DNA Denaturation and Renaturation
- Nonenzymatic Transformations of Nucleotides
- DNA Methylation
- RNA: Structure and Types
- Nucleases: Degradation of Nucleic Acids
- Other Functions of Nucleotides
- Genome, Genes, and Chromosomes
- Chromatin and Nucleosomes
- DNA Supercoiling and Topoisomerases
- SMC Proteins: Cohesins and Condensins
- Mitochondrial DNA (mtDNA)
- Polymerase Chain Reaction (PCR)
- Conclusion
1. Introduction & The Central Dogma
The study of nucleic acids is foundational to understanding life at the molecular level. This module covers the structure and function of DNA and RNA, from their basic chemical building blocks (nucleotides) up to the complex three-dimensional organization of chromosomes.
The Central Dogma of Molecular Biology
The central dogma, first articulated by Francis Crick, describes the flow of genetic information within a biological system:
DNA → (transcription) → RNA → (translation) → Protein
- DNA replication: DNA is copied to produce identical DNA molecules.
- Transcription: The information in a DNA sequence is copied into messenger RNA (mRNA).
- Translation: The mRNA sequence is decoded by ribosomes to synthesize a protein.
An important addition to the dogma:
- Reverse transcription: In some viruses (retroviruses, e.g., HIV), RNA is reverse-transcribed back into DNA using the enzyme reverse transcriptase. This is an exception to the classical unidirectional flow.
Context: In a living cell, all three processes often occur simultaneously. In prokaryotes, transcription and translation can occur at the same time (coupled), while in eukaryotes they are spatially separated (transcription in the nucleus, translation in the cytoplasm).
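The information flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is taken to be the DNA coding strand written 5'→3', the codon table is deliberately partial (it holds only the four codons used in the example; the full standard genetic code has 64), and the function names are hypothetical.

```python
# Partial codon table: only the codons used in the example below.
# The full standard genetic code has 64 entries.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": "Stop"}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (5'->3'): T is replaced by U."""
    return coding_strand.upper().replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """Return the DNA copy (coding-strand sense) of an RNA: U is replaced by T."""
    return rna.upper().replace("U", "T")


def translate(mrna: str) -> list[str]:
    """Read codons 5'->3' from position 0 until a stop codon or the end."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


mrna = transcribe("ATGTTTGGCTAA")
print(mrna)             # AUGUUUGGCUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly']
```

A real implementation would use the complete codon table and handle reading frames; the sketch only shows the direction of each step.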
2. Functions of Nucleotides
Nucleotides are not only the building blocks of nucleic acids. They serve multiple essential roles in the cell:
- Energy currency: ATP (adenosine triphosphate) is the primary energy carrier in metabolic reactions. The hydrolysis of its phosphoanhydride bonds releases free energy (~30.5 kJ/mol per anhydride bond).
- Signal transduction: Cyclic nucleotides such as cAMP (cyclic adenosine monophosphate) and cGMP act as second messengers, relaying signals from extracellular hormones and neurotransmitters (acting via G protein-coupled receptors, GPCRs) into intracellular responses.
- Enzyme cofactors and metabolic intermediates: Many coenzymes contain nucleotide components:
  - NAD⁺/NADH (nicotinamide adenine dinucleotide)
  - FAD/FADH₂ (flavin adenine dinucleotide)
  - Coenzyme A (contains a 3’-phosphoadenosine diphosphate moiety)
- Building blocks of nucleic acids: DNA and RNA are both linear polymers of nucleotides.
3. Structure of Nucleotides
Every nucleotide is composed of three characteristic components:
| Component | Description |
|---|---|
| Nitrogenous base | A purine or pyrimidine ring system |
| Pentose sugar | A five-carbon sugar (ribose in RNA; deoxyribose in DNA) |
| Phosphate group | One or more phosphate groups linked to the 5’ carbon of the sugar |
Terminology clarification: A nucleotide without the phosphate group is called a nucleoside (base + sugar only).
3.1 Nitrogenous Bases
Nitrogenous bases are heterocyclic aromatic compounds. They are:
- Planar (aromatic ring system)
- Hydrophobic (contributing to base-stacking interactions in double-stranded nucleic acids)
- Basic (contain nitrogen atoms that can accept protons)
There are two structural families:
Purines (bicyclic: pyrimidine ring fused to an imidazole ring)
- Adenine (A)
- Guanine (G)
Pyrimidines (monocyclic: single six-membered ring)
- Cytosine (C) — found in both DNA and RNA
- Thymine (T) — found only in DNA
- Uracil (U) — found only in RNA (replaces thymine)
Why does DNA use thymine instead of uracil? Cytosine spontaneously deaminates to uracil at a measurable rate. If DNA contained uracil, this deamination would be mutagenic and difficult to correct (since uracil would be indistinguishable from a “normal” base). Instead, DNA uses thymine (5-methyluracil), so any uracil found in DNA is immediately recognized as a deaminated cytosine and repaired by specialized enzymes.
Minor (modified) bases
In addition to the five major bases, nucleic acids—especially tRNA and rRNA—contain modified bases such as:
- 5-Methylcytidine and N⁶-Methyladenosine (important in epigenetics and RNA modification)
- Inosine, Pseudouridine (Ψ), 7-Methylguanosine, 4-Thiouridine
- 5-Hydroxymethylcytidine (found in certain bacteriophage DNA)
3.2 The Pentose Sugar
The sugar in nucleic acids is always present in its β-D-furanose (closed five-membered ring) form.
- RNA contains β-D-ribose (has a hydroxyl group –OH at the 2’ carbon)
- DNA contains β-D-2’-deoxyribose (has only –H at the 2’ carbon; lacks the 2’-OH)
Functional importance of the 2’-OH: The 2’-OH in RNA makes it susceptible to alkaline hydrolysis (because it can attack the adjacent phosphodiester bond to form a 2’,3’-cyclic monophosphate intermediate). DNA, lacking this group, is more chemically stable — an advantage for a molecule storing genetic information long-term.
Sugar Puckering (Conformation)
The furanose ring is not planar; four out of five atoms are approximately coplanar, and one carbon is displaced above or below this plane. Two major conformations:
- C-2’ endo: C-2’ is on the same side as C-5’ (common in B-form DNA)
- C-3’ endo: C-3’ is on the same side as C-5’ (common in A-form DNA and RNA)
3.3 Nucleosides and Nucleotides
- Nucleoside: Base + Sugar, connected by an N-β-glycosidic bond between the anomeric carbon (C-1’) of the sugar and nitrogen N-9 of purines or N-1 of pyrimidines.
  - This bond does not form spontaneously due to thermodynamic constraints; it requires enzymatic catalysis in vivo.
- Nucleotide: Nucleoside + Phosphate group, connected by a phosphoester bond to the 5’-OH of the sugar.
  - A nucleotide with one phosphate = nucleoside monophosphate (e.g., AMP)
  - Two phosphates = nucleoside diphosphate (e.g., ADP)
  - Three phosphates = nucleoside triphosphate (e.g., ATP)
Positions of Phosphate Groups
The phosphate can be attached at the 2’, 3’, or 5’ carbon of the sugar:
- Adenosine 5’-monophosphate (AMP): phosphate at C-5’
- Adenosine 2’-monophosphate: phosphate at C-2’
- Adenosine 3’-monophosphate: phosphate at C-3’
- Adenosine 2’,3’-cyclic monophosphate: phosphate bridging both 2’ and 3’ oxygens
3.4 Nomenclature
Ribonucleotides (RNA building blocks)
| Base | Nucleoside | Nucleotide | Symbol |
|---|---|---|---|
| Adenine | Adenosine | Adenylate (AMP) | A |
| Guanine | Guanosine | Guanylate (GMP) | G |
| Uracil | Uridine | Uridylate (UMP) | U |
| Cytosine | Cytidine | Cytidylate (CMP) | C |
Deoxyribonucleotides (DNA building blocks)
| Base | Nucleoside | Nucleotide | Symbol |
|---|---|---|---|
| Adenine | Deoxyadenosine | Deoxyadenylate (dAMP) | dA |
| Guanine | Deoxyguanosine | Deoxyguanylate (dGMP) | dG |
| Thymine | Deoxythymidine | Deoxythymidylate (dTMP) | dT |
| Cytosine | Deoxycytidine | Deoxycytidylate (dCMP) | dC |
4. Phosphodiester Bonds and Polynucleotides
Successive nucleotides in a nucleic acid chain are joined by 3’,5’-phosphodiester bonds: a phosphate group bridges the 3’-OH of one nucleotide to the 5’ carbon of the next.
5' end
|
Phosphate
|
Sugar (C3'–OH)
|
Phosphodiester bond
|
Phosphate
|
Sugar (C3'–OH)
|
...
|
3' end (free –OH)
Key properties:
- The backbone of each strand (alternating sugar–phosphate units) runs in a specific direction, giving the strand polarity: 5’ → 3’.
- Chains are always written from 5’ end (free phosphate) to 3’ end (free hydroxyl).
- The phosphate groups are negatively charged at physiological pH (pKa ~1), making DNA and RNA polyanions. This is why they bind positively charged proteins (like histones) so readily.
RNA hydrolysis in alkaline conditions: Because RNA has a 2’-OH, in basic solution the 2’-oxygen attacks the adjacent phosphorus, forming a 2’,3’-cyclic monophosphate intermediate, which then opens to give a mixture of 2’- and 3’-monophosphates. This is why RNA is labile in alkali while DNA is stable — DNA lacks the 2’-OH needed to initiate this reaction.
5. Primary Structure of Nucleic Acids
The primary structure of a nucleic acid is defined as:
- The covalent backbone (sugar–phosphate chain)
- The sequence of nitrogenous bases along this backbone
Higher levels of structure:
- Secondary structure: Regular, stable structures formed by base-pairing (e.g., the DNA double helix, RNA hairpins)
- Tertiary structure: Complex three-dimensional folding of large molecules (e.g., chromosomal looping, tRNA L-shaped structure)
6. DNA Secondary Structure: The Double Helix
6.1 Historical Background
| Year | Scientist(s) | Contribution |
|---|---|---|
| 1869 | Friedrich Miescher | Isolated DNA (“nuclein”) from white blood cells |
| 1944 | Avery, MacLeod, McCarty | Demonstrated DNA is the genetic material (transformation experiment with S. pneumoniae) |
| Late 1940s | Erwin Chargaff | Established Chargaff’s rules of base composition |
| 1952 | Hershey & Chase | Confirmed DNA (not protein) carries genetic information using radiolabeled bacteriophages |
| 1950–53 | Rosalind Franklin & Maurice Wilkins | Produced X-ray diffraction patterns of DNA fibers, revealing its helical structure |
| 1953 | James Watson & Francis Crick | Proposed the double-helix model of DNA structure |
Franklin’s X-ray data were crucial: the cross-shaped pattern of spots indicated a helical structure, and the heavy bands at the periphery were due to the regularly spaced, stacked bases. The pattern revealed two periodicities: 3.4 Å (distance between adjacent base pairs) and 34 Å (distance for one complete helical turn).
6.2 Chargaff’s Rules
Analysis of DNA base composition from multiple organisms led to four rules:
- A = T and G = C (molar equivalence of complementary bases)
- Therefore, Purines (A+G) = Pyrimidines (T+C)
- The base composition varies between species (species-specific)
- The base composition is constant within a species, regardless of tissue type, age, nutritional state, or environment
These rules implied that A pairs specifically with T and G pairs specifically with C — the basis for the antiparallel complementary strands of the double helix.
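The converse is easy to check numerically: if every base on one strand is paired with its complement on the other, the duplex as a whole must satisfy A = T and G = C. The sketch below is a minimal illustration; the example sequence is arbitrary and the function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the antiparallel complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(strand.upper()))


def duplex_composition(strand: str) -> Counter:
    """Count bases over both strands of the duplex formed by `strand`."""
    return Counter(strand.upper() + reverse_complement(strand))


counts = duplex_composition("ATGCGGATTACA")  # arbitrary example sequence
purines = counts["A"] + counts["G"]
pyrimidines = counts["T"] + counts["C"]
print(counts["A"] == counts["T"], counts["G"] == counts["C"], purines == pyrimidines)
```

Note that the equalities hold for the duplex, not necessarily for either single strand on its own.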
6.3 The Watson & Crick Model
The double helix model satisfies:
- Thermodynamic requirements: Hydrophilic sugar-phosphate backbone faces outward (toward water); hydrophobic bases are inside, protected from water and stabilized by base-stacking interactions (van der Waals forces between aromatic rings) and hydrogen bonds between complementary bases.
- Chargaff’s rules: A pairs with T (2 hydrogen bonds); G pairs with C (3 hydrogen bonds)
- X-ray data: Correct dimensions and periodicity
Key structural features of B-form DNA (the physiologically relevant form):
- Right-handed double helix
- Two antiparallel strands: one runs 5’→3’, the other 3’→5’
- Base pairs are nearly perpendicular to the helix axis (tilted ~6°)
- 3.4 Å rise per base pair; 10.5 base pairs per helical turn (36 Å or 3.6 nm per turn) in solution
- Helix diameter: ~20 Å (2 nm)
- Features a major groove (wider, where most DNA-binding proteins interact) and a minor groove (narrower)
- The two grooves arise from the asymmetric positioning of the base pairs relative to the backbone
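The B-form parameters above translate directly into the physical size of a DNA molecule. The following sketch assumes ideal B-form geometry throughout (3.4 Å rise, 10.5 bp per turn); the 2,100-bp fragment is a hypothetical example.

```python
RISE_PER_BP_ANGSTROM = 3.4   # B-form rise per base pair
BP_PER_TURN = 10.5           # B-form DNA in solution


def b_form_dimensions(n_bp: int) -> tuple[float, float]:
    """Return (contour length in nm, number of helical turns) for n_bp of B-DNA."""
    length_nm = n_bp * RISE_PER_BP_ANGSTROM / 10.0  # 10 Å = 1 nm
    turns = n_bp / BP_PER_TURN
    return length_nm, turns


length_nm, turns = b_form_dimensions(2100)  # hypothetical 2,100-bp fragment
print(f"{length_nm:.0f} nm, {turns:.0f} turns")  # 714 nm, 200 turns
```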
6.4 Base Complementarity & Geometry
Watson-Crick base pairing:
- A–T: 2 hydrogen bonds
- G–C: 3 hydrogen bonds (therefore, G–C pairs are stronger)
A crucial geometric feature: A=T and G≡C base pairs have the same overall geometry (same C-1’–C-1’ distance of ~10.85 Å). This means any sequence of base pairs can be accommodated in the helix without distorting the backbone — the double helix can encode unlimited amounts of information.
Anti vs. Syn conformation: The base can rotate about the N-glycosidic bond relative to the sugar. In the anti conformation (most common in B-DNA), the base projects away from the sugar. In the syn conformation, the base projects over the sugar. Z-DNA has alternating syn/anti conformations in its purine/pyrimidine residues.
7. Structural Variations of DNA
Under different conditions, DNA can adopt three distinct helical forms:
| Feature | A-form | B-form | Z-form |
|---|---|---|---|
| Helical sense | Right-handed | Right-handed | Left-handed |
| Diameter | ~26 Å | ~20 Å | ~18 Å |
| Base pairs/turn | 11 | 10.5 | 12 |
| Rise/base pair | 2.6 Å | 3.4 Å | 3.7 Å |
| Base tilt | 20° | 6° | 7° |
| Sugar pucker | C-3’ endo | C-2’ endo | C-2’ endo (pyr); C-3’ endo (pur) |
| Glycosyl bond | Anti | Anti | Anti (pyr); Syn (pur) |
| Major groove | Narrow, deep | Wide, accessible | Barely apparent |
| Minor groove | Wide, shallow | Narrow | Narrow and deep |
A-form DNA
- Favored in dehydrated (low water) conditions and in RNA–DNA hybrids
- Also the most common form observed in DNA crystals
- Base pairs are tilted ~20° from perpendicular to the helix axis
- Whether A-form occurs significantly in living cells is uncertain
B-form DNA
- The physiologically dominant form under normal cellular conditions
- The reference structure for all DNA studies
- Bases are nearly perpendicular to the helix axis
- Most DNA-binding proteins recognize and interact with B-form DNA
Z-form DNA
- Left-handed double helix — the backbone follows a zigzag path (hence “Z”)
- Favored by sequences with alternating purines and pyrimidines (e.g., 5’-CGCGCG-3’) and by 5-methylcytosine
- The major groove is nearly absent; minor groove is narrow and deep
- Short stretches of Z-DNA have been detected in both bacteria and eukaryotes
- May play a role in regulation of gene expression and genetic recombination (precise role still under investigation)
8. Unusual DNA/RNA Structures
Beyond the standard double helix, nucleic acids can form a variety of unusual structures:
Palindromic Sequences and Inverted Repeats
A palindrome in molecular biology refers to a sequence that reads the same on both strands (in the 5’→3’ direction). Example:
5'–GAATTC–3'
3'–CTTAAG–5'
These are sites recognized by restriction enzymes.
Mirror repeats are sequences that read the same on a single strand in both directions.
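The two definitions can be told apart computationally: a palindrome equals its own reverse complement, whereas a mirror repeat equals its own reverse on a single strand. A minimal sketch follows (function names are hypothetical; `TTAGGATT` is a made-up mirror repeat).

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def is_palindrome(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def is_mirror_repeat(seq: str) -> bool:
    """True if one strand reads the same forward and backward."""
    s = seq.upper()
    return s == s[::-1]


print(is_palindrome("GAATTC"), is_mirror_repeat("GAATTC"))      # True False
print(is_palindrome("TTAGGATT"), is_mirror_repeat("TTAGGATT"))  # False True
```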
Hairpins and Cruciforms
- A hairpin forms when a single-stranded DNA or RNA folds back on itself, with complementary sequences pairing to form a stem, and unpaired bases forming a loop.
- In double-stranded DNA, palindromic sequences can form cruciform structures (cross-shaped), where each strand forms its own hairpin. These structures are thermodynamically unstable but can form transiently during replication or transcription.
Hoogsteen Base Pairs and Triple Helices
- In standard Watson-Crick pairing, hydrogen bonds involve N-1 and the amino group of purines.
- In Hoogsteen pairing, the purine is in the syn conformation, using N-7 and the amino group, allowing a third strand to bind in the major groove.
- This enables the formation of triple-stranded DNA (triplex DNA): one purine-rich strand base-pairs with the pyrimidine strand via Watson-Crick bonds AND with the third strand via Hoogsteen bonds.
- Triplex DNA has roles in gene regulation and may be exploited for therapeutic purposes.
G-Quadruplexes
- Guanosine-rich sequences (especially in telomeres) can form quadruplex structures: four guanines associate via Hoogsteen hydrogen bonds to form a G-quartet plane; multiple G-quartets stack to form a G-quadruplex.
- These are stabilized by central metal cations (K⁺ or Na⁺).
- Adjacent strands can be parallel or antiparallel relative to each other.
- G-quadruplexes are thought to play roles in telomere maintenance, transcriptional regulation, and genome stability.
9. DNA Denaturation and Renaturation
Denaturation
DNA denaturation (also called “melting”) is the reversible disruption of:
- Hydrogen bonds between complementary base pairs
- Base-stacking interactions
…causing the double helix to unwind into two separate single strands. No covalent bonds are broken in this process.
Promoting factors:
- High temperature
- Extreme pH (very high or very low)
- Denaturing chemicals (e.g., urea, formamide)
Hyperchromic Effect and UV Absorption
- In the intact double helix, base-stacking interactions decrease UV absorption at 260 nm — this is the hypochromic effect.
- Upon denaturation, base stacking is lost, and UV absorption increases (~40% increase) — this is the hyperchromic effect.
- Denaturation can be monitored conveniently by measuring A₂₆₀ as a function of temperature.
Melting Temperature (Tₘ)
The melting temperature (Tₘ) is defined as the temperature at which 50% of the DNA is in single-stranded form.
- Each DNA species has a characteristic Tₘ.
- Tₘ increases with increasing G+C content because:
- G–C pairs have 3 hydrogen bonds (vs. 2 for A–T), requiring more energy to break
- G–C pairs also contribute more to base-stacking interactions
- Tₘ also depends on salt concentration (higher [Na⁺] stabilizes DNA by shielding the negatively charged backbone)
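Given a melting curve (A₂₆₀ versus temperature), the Tₘ can be read off as the midpoint of the absorbance rise. The sketch below is one simple way to do this, assuming that absorbance increases linearly with the fraction of denatured DNA and that the first and last readings lie on the fully duplex and fully single-stranded plateaus. The readings are made up for illustration and are not experimental data.

```python
def fraction_denatured(a260: float, a_duplex: float, a_single: float) -> float:
    """Fraction of DNA melted, assuming A260 rises linearly with strand separation."""
    return (a260 - a_duplex) / (a_single - a_duplex)


def estimate_tm(temps: list[float], a260: list[float]) -> float:
    """Interpolate the temperature at which half of the total A260 rise has occurred."""
    a_duplex, a_single = a260[0], a260[-1]  # assumes the curve spans both plateaus
    fractions = [fraction_denatured(a, a_duplex, a_single) for a in a260]
    for i in range(1, len(temps)):
        f0, f1 = fractions[i - 1], fractions[i]
        if f0 < 0.5 <= f1:
            # Linear interpolation between the two bracketing readings
            return temps[i - 1] + (0.5 - f0) * (temps[i] - temps[i - 1]) / (f1 - f0)
    raise ValueError("curve does not cross the half-denatured point")


# Made-up illustrative readings showing a 40% hyperchromic shift
temps = [60.0, 70.0, 80.0, 90.0, 100.0]
a260 = [1.00, 1.02, 1.10, 1.38, 1.40]
print(round(estimate_tm(temps, a260), 1))  # about 83.6
```

Real melting curves are sigmoidal and are usually fitted rather than linearly interpolated; the sketch only illustrates the definition of Tₘ.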
Practical application: Knowing the Tₘ is critical for designing PCR experiments: the annealing temperature is usually set a few degrees below the primers’ Tₘ, which is estimated from their length and G+C content.
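One widely used rule of thumb for short oligonucleotides, not taken from this module, is the Wallace rule: Tₘ ≈ 2 °C × (A+T) + 4 °C × (G+C). The sketch below applies it to a hypothetical 16-nucleotide primer. It ignores salt and primer concentration, so treat the result as a rough starting point only.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(primer: str) -> int:
    """Rough Tm in deg C: 2 per A/T plus 4 per G/C (short oligonucleotides only)."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


primer = "ATGCGTACCGATTGCA"  # hypothetical 16-nt primer
print(gc_fraction(primer), wallace_tm(primer))  # 0.5 48
```

The heavier weighting of G and C reflects the greater stability of G–C pairs described above.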
Renaturation (Annealing)
When denatured DNA is slowly cooled below its Tₘ, complementary strands can re-anneal (renature) through base-pair formation. This specificity is the basis for:
- PCR (Polymerase Chain Reaction)
- Southern and Northern blotting
- Nucleic acid hybridization techniques
10. Nonenzymatic Transformations of Nucleotides
DNA damage is unavoidable in living cells. While repair mechanisms exist, some damage escapes correction, leading to mutations. The main types of spontaneous and chemical DNA damage are:
1. Deamination
The spontaneous loss of an exocyclic amino group from a base:
- Cytosine → Uracil (~100 events/cell/day in mammals)
- If not repaired, U pairs with A instead of G, causing a C→T transition mutation
- 5-Methylcytosine → Thymine (particularly problematic because the product thymine is a normal DNA base and harder to recognize as a mutation)
- Adenine → Hypoxanthine (pairs with C instead of T)
- Guanine → Xanthine
Evolutionary implication: The high rate of 5-methylcytosine deamination to thymine explains why CpG dinucleotides are underrepresented in mammalian genomes; over evolutionary time, methylated CpGs have been converted to TpG dinucleotides.
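CpG depletion is commonly quantified as an observed/expected ratio: the number of CpG dinucleotides found, divided by the number expected if C and G were placed independently, i.e. (#CpG × N) / (#C × #G) for a sequence of length N. The sketch below computes this measure; the formula is a standard convention rather than something defined in this module, and both test sequences are made up.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    s = seq.upper()
    c, g = s.count("C"), s.count("G")
    if c == 0 or g == 0:
        return 0.0  # ratio is undefined without both bases; report 0 by convention here
    return s.count("CG") * len(s) / (c * g)


print(cpg_obs_exp("CGCGCGCG"))          # 2.0 (CpG-rich)
print(cpg_obs_exp("CATGCATGCTGACAGT"))  # 0.0 (C and G present, but no CpG)
```

Ratios well below 1 indicate that CpG occurs less often than base composition alone would predict.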
2. Depurination
Spontaneous hydrolysis of the N-β-glycosidic bond between a purine base and the deoxyribose:
- Leaves an abasic (AP = apurinic/apyrimidinic) site in the DNA
- ~10,000 purines lost per mammalian cell per day
- AP sites block replication and transcription unless repaired
3. Thymine Dimers (UV Damage)
UV light induces covalent bonds between adjacent pyrimidines on the same strand:
- Cyclobutane thymine dimers: A four-membered ring forms between C-5 and C-6 of adjacent thymines (most common)
- 6-4 Photoproducts: A bond forms between C-6 of one thymine and C-4 of the next
Both lesions:
- Create kinks or bends in the DNA double helix
- Block DNA replication and transcription
- Are repaired by nucleotide excision repair (NER)
UV and ionizing radiation account for ~10% of all DNA damage from environmental agents.
4. Chemical Mutagens
| Agent | Effect |
|---|---|
| Nitrous acid (HNO₂) | Promotes deamination of C→U, A→hypoxanthine |
| Bisulfite | Also promotes deamination; used as a food preservative |
| Dimethylsulfate (alkylating agent) | Methylates G at O6 position → O6-methylguanine cannot pair with C |
| Nitrosamines | Precursors of nitrous acid; present in processed meats |
5. Oxidative Damage
Reactive Oxygen Species (ROS) — including superoxide anion (O₂•⁻), hydrogen peroxide (H₂O₂), and hydroxyl radical (OH•) — are generated by:
- Normal aerobic metabolism (especially mitochondria)
- Ionizing radiation
- Environmental toxins
ROS cause:
- Oxidation of bases (e.g., 8-oxoguanine, which mispairs with adenine)
- Sugar oxidation leading to strand breaks
Cellular defenses:
- Catalase: converts H₂O₂ → H₂O + O₂
- Superoxide dismutase (SOD): converts O₂•⁻ → H₂O₂
- Glutathione system: eliminates peroxides
Despite these defenses, a fraction of ROS escapes and causes cumulative DNA damage — contributing to aging and cancer.
11. DNA Methylation
Certain bases in DNA are enzymatically methylated after replication:
- Adenine and cytosine are methylated more often than guanine and thymine
- ~5% of all cytosine residues are methylated to 5-methylcytosine in eukaryotic cells
- In eukaryotes, methylation occurs mainly at CpG dinucleotides; clusters of CpGs (CpG islands) are common in gene promoter regions, where their methylation state influences transcription
Key facts:
- All known DNA methyltransferases use S-adenosylmethionine (SAM) as the methyl group donor
- Different cell types show different methylation patterns (tissue-specific methylation profiles = methylome)
- Methylation of promoter regions generally represses transcription (gene silencing)
- Methylation patterns can be heritable through cell division without altering the DNA sequence — this is epigenetic regulation
Clinical relevance: Aberrant DNA methylation is a hallmark of cancer. Hypermethylation of tumor suppressor gene promoters silences them; global hypomethylation leads to genomic instability. Analysis of DNA methylation (methylome analysis) is a powerful tool in cancer diagnostics and epigenetics research.
12. RNA: Structure and Types
Types of RNA
| Type | Full Name | Primary Function |
|---|---|---|
| mRNA | Messenger RNA | Template for protein synthesis (carries the code from DNA to ribosome) |
| tRNA | Transfer RNA | Adapter molecules that bring amino acids to the ribosome; decode mRNA codons |
| rRNA | Ribosomal RNA | Structural and catalytic component of ribosomes |
| Ribozymes | Catalytic RNA | RNA molecules with enzymatic activity (e.g., self-splicing introns, RNase P) |
RNA vs. DNA: Key Structural Differences
| Feature | DNA | RNA |
|---|---|---|
| Sugar | 2’-Deoxyribose | Ribose |
| Bases | A, T, G, C | A, U, G, C |
| Strands | Usually double-stranded | Usually single-stranded |
| Stability | More stable (no 2’-OH) | Less stable (susceptible to alkaline hydrolysis) |
| Helix form | B-form (usually) | A-form (in double-stranded regions) |
Secondary Structure of RNA
Although RNA is single-stranded, it extensively folds back on itself to form complex secondary structures through intramolecular base-pairing. This folding is driven by the tendency to maximize hydrogen bonding and hydrophobic base-stacking interactions.
Common RNA secondary structure elements:
- Hairpin loops: A stem (base-paired region) and a loop (unpaired region)
- Internal loops: Bulges within a double-stranded region where bases are unpaired on one or both sides
- Pseudoknots and other complex tertiary structures
Double-stranded RNA regions adopt an A-form right-handed helix (never B-form; Z-form RNA has been produced in the laboratory under extreme conditions but is not physiologically relevant).
Unconventional Base Pairs in RNA
RNA can form base pairs beyond the standard Watson-Crick pairs:
- G–U wobble pairs: Particularly important in tRNA anticodon–codon interactions
- Hoogsteen pairs and reverse Hoogsteen pairs
- Interactions involving three bases simultaneously (base triples)
- Interactions involving modified bases (e.g., 7-methylguanosine, inosine)
These non-canonical interactions greatly expand RNA’s ability to fold into complex three-dimensional structures and perform catalytic functions.
tRNA Structure
tRNA is a particularly well-characterized RNA with an elaborate secondary structure:
- Approximately 73–93 nucleotides in length
- Cloverleaf secondary structure with four main arms plus a variable arm:
- Acceptor arm (AA stem): 5’ and 3’ ends of the tRNA; amino acid is attached to the 3’ end (sequence …CCA-3’)
- D arm (D loop): Contains the unusual base dihydrouridine (D); involved in ribosome interaction
- Anticodon arm: Contains the anticodon triplet (recognizes the mRNA codon); the wobble position is at the 5’ end of the anticodon
- TΨC arm: Contains the sequence ribothymidine–pseudouridine–cytidine (T, Ψ, C); involved in ribosome interaction
- Variable arm: Present between TΨC and anticodon arms; variable in size (absent in some tRNAs)
The three-dimensional structure of tRNA is an L-shaped fold (tertiary structure), where the acceptor end and anticodon are at opposite ends of the L.
Pseudouridine (Ψ): This is an unusual isomer of uridine in which uracil is attached to ribose through C-5 (instead of the normal N-1). This modified nucleoside is important for tRNA stability and function.
13. Nucleases: Degradation of Nucleic Acids
Nucleases are enzymes that catalyze the hydrolysis of phosphodiester bonds in nucleic acids.
Classification
By substrate:
- Deoxyribonucleases (DNases): degrade DNA
- Ribonucleases (RNases): degrade RNA
- Pancreatic RNase A specifically cleaves RNA at phosphodiester bonds 3’ to pyrimidine residues, producing 2’,3’-cyclic monophosphate intermediates
By position of cleavage:
- Endonucleases: cleave internal phosphodiester bonds, producing fragments
- Exonucleases: cleave from one end of the molecule (either 5’→3’ or 3’→5’ direction)
Restriction Endonucleases (Restriction Enzymes)
These are bacterial enzymes (part of the restriction-modification system) that:
- Recognize specific palindromic sequences (typically 4–8 bp recognition sites)
- Cut both strands of the double-stranded DNA at or near the recognition site
Two types of cuts:
- Sticky ends (cohesive ends): Staggered cuts leave short single-stranded overhangs (e.g., EcoRI cuts 5’-G↓AATTC-3’, leaving 5’-AATT overhangs)
- Blunt ends: Even cuts leave no overhangs (e.g., SmaI cuts 5’-CCC↓GGG-3’)
Molecular biology applications: Restriction enzymes are indispensable tools for molecular cloning. Sticky ends from compatible restriction enzymes can be ligated together by DNA ligase to create recombinant DNA molecules — the basis of genetic engineering.
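A restriction digest of a linear DNA can be sketched as a search for the recognition site followed by a cut at a fixed offset within it. The sketch below models only the top strand and takes the cut positions from the examples above (EcoRI: G↓AATTC, SmaI: CCC↓GGG). The function name, parameters, and test sequences are hypothetical.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut the top strand of a linear DNA at every occurrence of `site`.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for EcoRI: G^AATTC). Only the top strand is modelled.
    """
    s = seq.upper()
    fragments, start = [], 0
    pos = s.find(site)
    while pos != -1:
        fragments.append(s[start:pos + cut_offset])
        start = pos + cut_offset
        pos = s.find(site, pos + 1)
    fragments.append(s[start:])
    return fragments


print(digest("AAGAATTCTTGGGAATTCCC"))     # ['AAG', 'AATTCTTGGG', 'AATTCCC']
print(digest("TTCCCGGGAA", "CCCGGG", 3))  # ['TTCCC', 'GGGAA']
```

Each EcoRI fragment after the first begins with AATT, which corresponds to the 5’ overhang left on the real double-stranded product.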
14. Other Functions of Nucleotides
Energy Storage and Transfer
The nucleoside triphosphates, especially ATP, carry chemical energy:
- Hydrolysis of a phosphoester bond (from nucleoside monophosphate): releases ~14 kJ/mol
- Hydrolysis of a phosphoanhydride bond (the bonds between phosphate groups in ADP/ATP): releases ~30.5 kJ/mol per bond
The two bonds between the three phosphate groups in ATP are phosphoanhydride bonds (high-energy bonds). This energy is used to drive thermodynamically unfavorable reactions.
Coenzymes Containing Nucleotide Components
Many metabolic coenzymes contain an adenosine unit:
- NAD⁺/NADH (contains adenosine + nicotinamide; involved in redox reactions in catabolism)
- FAD/FADH₂ (contains riboflavin + adenosine; used in the citric acid cycle and electron transport chain)
- Coenzyme A (CoA-SH) (contains adenosine + pantothenic acid + mercaptoethylamine; carries acyl groups)
Second Messengers
- cAMP (cyclic adenosine 3’,5’-monophosphate): Synthesized from ATP by adenylyl cyclase (activated by GPCRs). Acts as a second messenger in countless hormonal signaling pathways. Activates protein kinase A (PKA).
- cGMP (cyclic guanosine 3’,5’-monophosphate): Similar role; synthesized from GTP by guanylyl cyclases, which are activated by nitric oxide and natriuretic peptides.
Neurotransmitters and Platelet Signaling
Nucleotides also act extracellularly:
- ATP binds to P2X receptors (ligand-gated ion channels) in postsynaptic membranes → involved in taste sensation, inflammation, and smooth muscle contraction
- ADP binds to P2Y receptors on platelets → promotes platelet aggregation and blood clotting
- Clinically relevant: Clopidogrel (Plavix) is an antiplatelet drug that irreversibly blocks P2Y₁₂ ADP receptors
Alarmone: ppGpp
Guanosine 5’-diphosphate, 3’-diphosphate (ppGpp), also called guanosine tetraphosphate, is an alarmone produced in bacteria under conditions of nutritional stress (e.g., amino acid starvation). It triggers the stringent response, globally reprogramming bacterial gene expression to shut down growth-related processes.
15. Genome, Genes, and Chromosomes
The Human Genome
- The haploid human genome contains approximately 3 billion (3 × 10⁹) base pairs of DNA
- Distributed across 23 chromosomes (3,054,815,472 bp with X; 2,963,015,935 bp with Y)
- Individual chromosome size: ~50 million to ~250 million bp
- Humans are diploid (46 chromosomes in somatic cells): 22 pairs of autosomes + 1 pair of sex chromosomes (XX or XY)
- Total length of DNA in a human cell: ~2 meters
- Total DNA in the human body (~10¹⁴ cells): ~2 × 10¹¹ km (compare: Earth-Sun distance = 1.5 × 10⁸ km)
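The length figures above follow from the B-form rise of 0.34 nm per base pair. The short calculation below reproduces them, using only the approximate values quoted in this module (3 × 10⁹ bp haploid genome, ~10¹⁴ cells); it is an order-of-magnitude check, not a precise measurement.

```python
RISE_PER_BP_M = 0.34e-9        # B-form rise per base pair, in metres
HAPLOID_GENOME_BP = 3.0e9      # approximate haploid genome size
CELLS_IN_BODY = 1e14           # rough estimate of cell number
EARTH_SUN_KM = 1.5e8

dna_per_cell_m = 2 * HAPLOID_GENOME_BP * RISE_PER_BP_M   # diploid cell
total_km = dna_per_cell_m * CELLS_IN_BODY / 1000.0

print(f"{dna_per_cell_m:.2f} m per cell")                    # 2.04 m per cell
print(f"{total_km:.1e} km in the body")                      # 2.0e+11 km in the body
print(f"{total_km / EARTH_SUN_KM:.0f} Earth-Sun distances")  # 1360 Earth-Sun distances
```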
Visualization: Riccardo Sabatini printed Craig Venter’s genome in 175 volumes totaling 262,000 pages.
The Term “Genome”
The genome refers to the complete nucleotide sequence of an organism, including both coding and non-coding sequences.
Prokaryotic vs. Eukaryotic Genomes
| Feature | Prokaryotic Genome | Eukaryotic Genome |
|---|---|---|
| Size | Small | Large |
| Organization | Compact, circular | Linear chromosomes |
| Nucleus | Absent (cytoplasmic) | Present (membrane-bound) |
| Extra elements | Plasmids | Telomeres, centromeres |
| Gene structure | Mostly uninterrupted | Many interrupted (introns/exons) |
| Repetitive sequences | Few | Many |
| Transcription & Translation | Coupled (simultaneous) | Separated in space and time |
Composition of the Human Genome
| Component | Approximate % |
|---|---|
| Protein-coding genes (exons) | ~1.5% |
| Introns | ~26% |
| LINEs (Long Interspersed Nuclear Elements) | ~20% |
| SINEs (Short Interspersed Nuclear Elements) | ~13% |
| LTR retrotransposons | ~8% |
| Miscellaneous heterochromatin | ~8% |
| Segmental duplications | ~5% |
| Simple sequence repeats | ~3% |
| DNA transposons | ~3% |
| Miscellaneous unique sequences | ~12% |
Non-coding regions include:
- Introns (non-coding segments within genes)
- Regulatory elements (promoters, enhancers, silencers)
- Non-coding RNAs (rRNA, tRNA, miRNA, lncRNA, etc.)
Gene Definition
A gene is defined as a portion of DNA that encodes the primary sequence of a final gene product, which may be:
- A polypeptide (protein-coding gene)
- An RNA with structural or catalytic function (rRNA, tRNA, ribozyme genes)
Introns and Exons (Eukaryotic Gene Structure)
Many eukaryotic genes (but few prokaryotic genes) are interrupted by non-coding sequences called introns (intervening sequences):
- Exons: The coding segments, which are retained in the mature mRNA
- Introns: Non-coding sequences that are removed from the primary transcript (pre-mRNA) during RNA splicing
Example: The human hemoglobin β-subunit gene contains two introns, the larger of which is ~850 bp long, whereas the coding sequence for the 146-residue polypeptide totals only ~440 bp.
RNA Splicing is the process of precisely removing introns and ligating exons to generate a contiguous mRNA. This is carried out by a large ribonucleoprotein complex called the spliceosome.
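At the sequence level, splicing amounts to keeping the exon segments of the primary transcript and joining them in order. The sketch below illustrates this on a toy pre-mRNA; the sequence, the exon coordinates, and the `splice` function are all hypothetical, and the real spliceosome locates intron boundaries from sequence signals rather than from supplied coordinates.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments given as 0-based, end-exclusive (start, end) pairs."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy transcript: exons in upper case, introns in lower case for readability.
pre_mrna = "AUGGCC" + "guaaguuuuucag" + "UUUGGA" + "guacgcuaauag" + "UAA"
exons = [(0, 6), (19, 25), (37, 40)]
print(splice(pre_mrna, exons))  # AUGGCCUUUGGAUAA
```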
Chromosomal Features
A eukaryotic chromosome contains:
- Unique sequences (genes) and dispersed repetitive sequences
- Multiple replication origins (unlike prokaryotes with a single origin)
- Centromere: Site of kinetochore assembly; where spindle fibers attach during cell division
- Telomeres: Repetitive sequences (e.g., TTAGGG in humans) at chromosome ends; protect against degradation and maintain chromosomal integrity
Karyotyping
Karyotyping is the visualization and analysis of an organism’s complete set of chromosomes. Clinical applications include detection of:
- Down syndrome (Trisomy 21): extra chromosome 21
- Klinefelter syndrome (XXY): extra X chromosome in males
- Turner syndrome (XO): only one X chromosome in females
16. Chromatin and Nucleosomes
Chromatin
In the eukaryotic nucleus, DNA is not naked; it is tightly associated with proteins to form chromatin:
- ~90% of chromatin proteins are histones
- Also contains significant amounts of non-histone proteins and RNA
- During interphase: chromatin is partially decondensed to allow transcription and replication
- Entering mitosis (after replication is complete): chromatin condenses into visible chromosomes
Two functional states:
- Euchromatin: Lightly packed, transcriptionally active regions
- Heterochromatin: Tightly packed, transcriptionally inactive regions (includes constitutive heterochromatin at centromeres and telomeres, and facultative heterochromatin in silenced gene regions)
Electron microscopy reveals chromatin as “beads on a string” — the beads are nucleosomes.
Histones
Histones are small, highly conserved basic proteins rich in the positively charged amino acids lysine (Lys) and arginine (Arg), which interact with the negatively charged phosphate backbone of DNA.
Five histone classes:
| Histone | MW (Da) | Lys (%) | Arg (%) | Role |
|---|---|---|---|---|
| H1 | 21,130 | 29.5 | 11.3 | Linker histone; binds connecting DNA |
| H2A | 13,960 | 10.9 | 19.3 | Core histone |
| H2B | 13,774 | 16.0 | 16.4 | Core histone |
| H3 | 15,273 | 19.6 | 13.3 | Core histone |
| H4 | 11,236 | 10.8 | 13.7 | Core histone |
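To make the "basic protein" point concrete, the sketch below simply adds the Lys and Arg percentages from the table for each histone. The dictionary and function names are hypothetical; the numbers are copied from the table.

```python
# Lys and Arg content (% of residues) copied from the table above.
HISTONE_LYS_ARG = {
    "H1": (29.5, 11.3),
    "H2A": (10.9, 19.3),
    "H2B": (16.0, 16.4),
    "H3": (19.6, 13.3),
    "H4": (10.8, 13.7),
}


def basic_residue_percent(lys: float, arg: float) -> float:
    """Combined Lys + Arg content, rounded to one decimal place."""
    return round(lys + arg, 1)


basic = {name: basic_residue_percent(*pair) for name, pair in HISTONE_LYS_ARG.items()}
for name, percent in basic.items():
    print(f"{name}: {percent}% Lys + Arg")
```

By these figures, roughly a quarter to two-fifths of the residues in every histone class carry a positive charge, compared with the 10% expected if two of the twenty standard amino acids were used at equal frequency.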
Nucleosome Structure
The nucleosome is the fundamental repeating unit of chromatin:
- Histone octamer core: 2 copies each of H2A, H2B, H3, and H4
- DNA wrapping: ~146 bp of DNA wrapped 1.65 times around the histone octamer
- Linker DNA: ~54 bp of DNA connecting adjacent nucleosomes (bound by histone H1)
- Repeat unit: ~200 bp total per nucleosome
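The repeat-unit arithmetic can be checked with a short sketch. It assumes the approximate values given above (146 bp core, 54 bp linker) and a haploid human genome of ~3 billion bp; the function name is hypothetical.

```python
CORE_BP = 146                      # DNA wrapped around the histone octamer
LINKER_BP = 54                     # DNA between adjacent nucleosomes
REPEAT_BP = CORE_BP + LINKER_BP    # ~200 bp per nucleosome repeat


def nucleosome_count(genome_bp: int, repeat_bp: int = REPEAT_BP) -> int:
    """Approximate number of nucleosomes needed to package `genome_bp` of DNA."""
    return genome_bp // repeat_bp


haploid_human_genome_bp = 3_000_000_000
print(REPEAT_BP)                                  # 200
print(nucleosome_count(haploid_human_genome_bp))  # 15000000
print(CORE_BP / REPEAT_BP)                        # 0.73
```

On these assumptions, a haploid genome requires on the order of 15 million nucleosomes, with about 73% of the DNA in direct contact with the histone core.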
Geometry of DNA binding to the nucleosome: The histone core does not bind randomly. The sequence of the bound DNA matters:
- Regions with two or more A-T base pairs favor DNA curvature
- Regions with two or more G-C base pairs resist curvature
- Alternating A-T-rich regions at ~10 bp intervals (one helical turn) in the correct phase help the DNA to wrap tightly around the nucleosome
Histone Modifications (Epigenetic Marks)
The N-terminal “tails” of core histones (which protrude outside the nucleosome core) can be reversibly modified:
| Modification | Residue | Effect |
|---|---|---|
| Acetylation | Lys | Neutralizes positive charge → loosens DNA–histone interaction → activates transcription |
| Methylation | Lys, Arg | Can activate or repress transcription depending on position and degree |
| Phosphorylation | Ser, Thr | Involved in chromosome condensation (mitosis) and DNA repair signaling |
| Ubiquitination | Lys | Various effects on transcription and DNA repair |
These modifications constitute the “histone code”, which is read by regulatory proteins to modulate:
- Chromatin structure
- Gene transcription
- DNA repair
- Cell cycle progression
Higher-Order Chromatin Folding
Successive levels of organization compact DNA progressively:
- Naked DNA: 2 nm double helix
- Nucleosome array (“beads on a string”): 11 nm fiber
- 30-nm chromatin fiber (solenoid): ~6 nucleosomes per helical turn; requires H1
- 300-nm loops: Chromatin loops anchored to a protein scaffold
- 250-nm fiber: Further coiling of the looped domains
- Metaphase chromosome: 700–1400 nm; maximally compacted
This hierarchical compaction reduces the effective length of DNA by ~10,000-fold in a metaphase chromosome.
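A rough worked example of this compaction follows. It assumes the standard B-form rise of 0.34 nm per base pair and a hypothetical chromosome of 150 million bp; both the chromosome size and the function names are illustrative choices, not values from the text.

```python
RISE_NM_PER_BP = 0.34  # axial rise per base pair in B-form DNA


def contour_length_nm(n_bp: int) -> float:
    """Length of fully extended B-form DNA, in nanometres."""
    return n_bp * RISE_NM_PER_BP


def compacted_length_nm(n_bp: int, fold: float = 10_000) -> float:
    """Effective length after compaction by the given factor."""
    return contour_length_nm(n_bp) / fold


chromosome_bp = 150_000_000  # hypothetical chromosome size
extended_mm = contour_length_nm(chromosome_bp) / 1e6       # nm -> mm
compacted_um = compacted_length_nm(chromosome_bp) / 1e3    # nm -> micrometres

print(f"extended:  {extended_mm:.1f} mm")
print(f"compacted: {compacted_um:.1f} micrometres")
```

Under these assumptions, about 5 cm of DNA ends up in a structure a few micrometres long, which is the scale of a metaphase chromosome.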
Role of lncRNA in Chromatin Structure
Long non-coding RNAs (lncRNAs) contribute to chromosome organization:
- (a) lncRNAs interact with DNA-binding proteins to tether distant DNA segments together
- (b) lncRNAs interact with specific DNA sequences and recruit gene-regulatory proteins to regions, suppressing or activating transcription nearby
The most famous example is XIST RNA, which coats one entire X chromosome in female mammals, triggering its inactivation (X-chromosome inactivation).
17. DNA Supercoiling and Topoisomerases
DNA Supercoiling
Supercoiling is the coiling of a coil: the axis of the DNA double helix is itself coiled in space, forming a superhelix.
- The most common source of supercoiling in cells is underwinding of the double helix in a closed circular DNA (fewer helical turns than expected for relaxed B-form DNA)
- This creates negative supercoils (the DNA is wound in the opposite direction to the right-handed helix)
- Negative supercoiling is favorable for processes requiring strand separation (replication, transcription, recombination) because it partially pre-unwinds the helix
Supercoiling is also relevant in linear eukaryotic chromosomes because topological domains are maintained by protein attachment points.
States of circular DNA:
| State | Description |
|---|---|
| Relaxed | Normal B-form, no supercoiling |
| Strained (underwound) | Fewer turns than expected |
| Supercoiled | Strain accommodated by coiling of the helix axis |
| Strand separated | At very high levels of underwinding |
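The degree of underwinding can be quantified with the linking number (Lk). The sketch below assumes ~10.5 bp per helical turn for relaxed B-form DNA and uses a hypothetical 2,100-bp plasmid; the superhelical density σ = (Lk − Lk₀)/Lk₀ is the standard normalized measure, and the function names are illustrative.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of relaxed B-form DNA


def relaxed_linking_number(n_bp: int) -> float:
    """Lk0: the linking number of a relaxed closed-circular DNA of n_bp base pairs."""
    return n_bp / BP_PER_TURN


def superhelical_density(n_bp: int, lk: float) -> float:
    """sigma = (Lk - Lk0) / Lk0; negative values mean the DNA is underwound."""
    lk0 = relaxed_linking_number(n_bp)
    return (lk - lk0) / lk0


plasmid_bp = 2100           # hypothetical plasmid
lk0 = relaxed_linking_number(plasmid_bp)
sigma = superhelical_density(plasmid_bp, lk=190)

print(lk0)              # 200.0
print(round(sigma, 3))  # -0.05
```

Removing 10 of the 200 helical turns gives σ = −0.05, meaning the plasmid is underwound by 5%.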
Topoisomerases
Topoisomerases are enzymes that change the topology of DNA (i.e., alter the number of supercoils) by transiently breaking and rejoining phosphodiester bonds.
Type I Topoisomerases
- Mechanism: Transiently break ONE strand of the double helix
- The active-site tyrosine cleaves one strand and forms a covalent phosphotyrosyl protein-DNA linkage (to the 5’ phosphate in type IA enzymes; to the 3’ phosphate in type IB enzymes)
- Type IA: the unbroken strand passes through the break; type IB: the broken strand rotates around the intact strand
- The break is religated (the free hydroxyl end attacks the phosphotyrosyl linkage)
- Effect: Changes the linking number in steps of 1
- ATP not required
- Type IA enzymes relax only negative supercoils; type IB enzymes relax both positive and negative supercoils
Type II Topoisomerases
- Mechanism: Transiently break BOTH strands of the double helix
- An intact segment of duplex DNA is passed through the double-strand break
- Both strands are then religated
- Effect: Changes the linking number by 2 per catalytic cycle
- Requires ATP hydrolysis
- Can introduce negative supercoils (DNA gyrase in bacteria), relax supercoils, or decatenate (separate interlinked circular DNA molecules after replication)
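The different step sizes of the two enzyme classes can be illustrated with a minimal sketch that tracks ΔLk (the difference from the relaxed linking number) as supercoils are removed. The `relax` function is a hypothetical bookkeeping helper, not a model of enzyme kinetics or ATP use.

```python
def relax(delta_lk: int, step: int) -> list[int]:
    """Return successive delta-Lk values as supercoils are removed `step` at a time.

    step=1 mimics a type I topoisomerase, step=2 a type II topoisomerase.
    """
    trajectory = [delta_lk]
    while abs(trajectory[-1]) >= step:
        current = trajectory[-1]
        # Move toward zero by one step, whatever the sign of the supercoiling.
        trajectory.append(current + step if current < 0 else current - step)
    return trajectory


print(relax(-4, step=1))  # [-4, -3, -2, -1, 0]
print(relax(-4, step=2))  # [-4, -2, 0]
print(relax(-3, step=2))  # [-3, -1]
```

Because a type II enzyme changes Lk in steps of 2, the last trajectory ends at −1: in this simple bookkeeping, an odd ΔLk cannot be brought to exactly zero by steps of 2 alone.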
Summary of Topoisomerase Types
| Family | Activity | Mechanism | Found in |
|---|---|---|---|
| IA | Relaxes (−) | Strand passage | Bacteria, eukaryotes |
| IB | Swivelase | Strand rotation | Bacteria, eukaryotes |
| IIA (DNA gyrase) | Introduces (−) supercoils | Strand passage | Bacteria |
| IIA (Topo IIα, IIβ) | Relaxes (+ or −) | Strand passage | Eukaryotes |
| IIA (Topo IV) | Decatenase | Strand passage | Bacteria |
Topoisomerases as Drug Targets
Topoisomerases are excellent therapeutic targets because they are essential for cell survival and proliferation:
Topoisomerase I Inhibitors (Camptothecins):
- Topotecan: Used for ovarian and lung cancer
- Irinotecan: Used for colorectal cancer
Topoisomerase II Inhibitors:
- Etoposide (VP-16): Used for lung cancer
- Doxorubicin (adriamycin): Used for breast cancer (and other cancers)
Mechanism of action: These drugs stabilize the enzyme-DNA cleavage complex (the “cleavable complex”), trapping broken DNA ends. When a replication fork or transcription machinery encounters a drug-stabilized cleavage complex, it triggers double-strand breaks, leading to cell death. Cancer cells, which divide more rapidly, are more susceptible.
Antibiotic targets: Bacterial DNA gyrase and Topo IV (both type IIA) are targets of fluoroquinolone antibiotics (e.g., ciprofloxacin), which are selectively toxic to bacteria because these enzymes differ significantly from their eukaryotic counterparts.
18. SMC Proteins: Cohesins and Condensins
SMC proteins (Structural Maintenance of Chromosomes) are a family of large ATPases essential for maintaining chromosome structure and integrity.
Structure of SMC Proteins
Each SMC protein has a characteristic architecture:
- Globular N-terminal and C-terminal domains: Each contributes to an ATPase (ABC-type) active site
- Two α-helical coiled-coil regions: Connect the terminal domains to a central hinge domain
- SMC proteins function as dimers (forming a V-shaped structure with the hinge at the apex and the ATPase head domains at the tips)
Types of SMC Complexes
Cohesins
- Function: Hold sister chromatids together after DNA replication until anaphase
- Loaded onto chromosomes during S phase (replication)
- Essential for proper chromosome segregation in mitosis and meiosis
- Cohesin ring encircles the two sister chromatid DNA molecules
Condensins
- Function: Drive chromosome condensation as cells enter mitosis
- Essential for compacting the chromatin from its interphase state into short, thick mitotic chromosomes
Cell Cycle Dynamics
During the cell cycle:
- S phase: DNA replication; cohesin is deposited, linking sister chromatids
- G2 phase: Condensin begins chromosome condensation in preparation for mitosis
- Prophase → Metaphase: Maximum condensation; chromosomes align at the metaphase plate
- Anaphase: Cohesin is cleaved (by separase), allowing sister chromatids to separate to opposite poles
19. Mitochondrial DNA (mtDNA)
Structure and Properties
Mitochondria contain their own circular double-stranded DNA:
- Circular (not linear like nuclear chromosomes)
- Multiple copies per cell (each mitochondrion carries several): from ~100 copies in leukocytes to ~10,000 in neurons
- Replication is independent of the cell cycle (continuous, not just in S phase)
Genetic Content
The human mitochondrial genome encodes:
- 2 ribosomal RNA (rRNA) genes
- 22 tRNA genes
- 13 protein-coding genes — all encoding subunits of the oxidative phosphorylation (OXPHOS) complexes (Complex I, III, IV, and ATP synthase)
The majority of mitochondrial proteins (~1500) are actually encoded by the nuclear genome, synthesized in the cytoplasm, and imported into mitochondria.
Maternal Inheritance
Mitochondria (and their DNA) are inherited exclusively through the mother, because sperm mitochondria are eliminated after fertilization.
Homoplasmy vs. Heteroplasmy
- Homoplasmy: All mtDNA copies in a cell have the same sequence
- Heteroplasmy: A cell contains two or more distinct populations of mtDNA (e.g., wild-type and mutant)
Clinical significance: Pathogenic mtDNA mutations often need to exceed a threshold level of heteroplasmy (typically ~60–90%) before causing disease. Below this threshold, sufficient wild-type mitochondria compensate.
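The threshold effect can be expressed as a simple calculation. In this sketch the copy numbers and the 80% threshold are hypothetical values chosen from within the ~60–90% range quoted above; actual thresholds depend on the mutation and the tissue.

```python
def heteroplasmy(mutant_copies: int, wild_type_copies: int) -> float:
    """Fraction of mtDNA copies in a cell that carry the mutation."""
    total = mutant_copies + wild_type_copies
    if total == 0:
        raise ValueError("cell must contain at least one mtDNA copy")
    return mutant_copies / total


def exceeds_threshold(fraction: float, threshold: float = 0.8) -> bool:
    """True if the mutant load is at or above the (hypothetical) disease threshold."""
    return fraction >= threshold


high_load = heteroplasmy(mutant_copies=850, wild_type_copies=150)
low_load = heteroplasmy(mutant_copies=300, wild_type_copies=700)

print(high_load, exceeds_threshold(high_load))  # 0.85 True
print(low_load, exceeds_threshold(low_load))    # 0.3 False
```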
mtDNA Mutations and Diseases
Due to:
- High copy number (many replications per cell division)
- Less efficient DNA repair compared to the nucleus
- Proximity to ROS generated by the electron transport chain
mtDNA accumulates mutations at a higher rate than nuclear DNA.
Common mitochondrial diseases and associated mutations:
| Mutation | Disease |
|---|---|
| m.3243A>G | MELAS (mitochondrial encephalomyopathy, lactic acidosis and stroke-like episodes); MIDD |
| m.8344A>G | MERRF (myoclonic epilepsy with ragged red fibres) |
| m.14459G>A | MELAS, MILS, cardiomyopathy |
| m.13513G>A | MELAS, MILS |
| m.8993T>G | NARP (neurogenic muscle weakness, ataxia, retinitis pigmentosa); MILS |
| m.8483_13459del | PMPS (Pearson marrow-pancreas syndrome); KSS (Kearns–Sayre syndrome) |
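The deletion in the last row can be sized directly from its name, since the two positions in this notation mark the first and last deleted nucleotides (inclusive). The sketch below assumes that convention and a 16,569-bp human mtDNA reference length, which is not stated in the text; the parser is a hypothetical helper that handles only this simple `m.START_ENDdel` form.

```python
import re

MTDNA_LENGTH_BP = 16_569  # assumed length of the human mtDNA reference sequence


def deletion_length(variant: str) -> int:
    """Number of base pairs removed by a variant written as 'm.START_ENDdel'."""
    match = re.fullmatch(r"m\.(\d+)_(\d+)del", variant)
    if match is None:
        raise ValueError(f"unsupported variant format: {variant}")
    start, end = map(int, match.groups())
    return end - start + 1  # both endpoints are deleted


size = deletion_length("m.8483_13459del")
print(size)                              # 4977
print(round(size / MTDNA_LENGTH_BP, 2))  # 0.3
```

On these assumptions the deletion removes 4,977 bp, about 30% of the mitochondrial genome.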
mtDNA Release and Inflammation
Damaged or stressed mitochondria can release mtDNA into the cytosol or into the bloodstream:
- Cytoplasmic mtDNA: Activates innate immune signaling pathways (e.g., NLRP3 inflammasome, cGAS-STING pathway), triggering inflammation
- Circulating cell-free mtDNA (CCF-mtDNA): Released from cells passively (cell damage) or actively (via extracellular vesicles); can be detected in blood as a biomarker of mitochondrial stress, trauma, or systemic inflammation
20. Polymerase Chain Reaction (PCR)
Developed by Kary Mullis in 1983 (Nobel Prize in Chemistry, 1993), PCR is a technique to amplify a specific DNA sequence exponentially in vitro.
Requirements
- Target DNA (template)
- Two oligonucleotide primers flanking the region to be amplified (one complementary to each strand)
- Thermostable DNA polymerase (e.g., Taq polymerase from Thermus aquaticus, which survives high temperatures)
- dNTPs (all four deoxyribonucleoside triphosphates)
- Buffer with Mg²⁺
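The requirement that the two primers flank the target, one complementary to each strand, can be sketched as follows. The template sequence is invented, the 6-nt primer length is far shorter than real primers (typically ~18–25 nt), and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flanking_primers(top_strand: str, primer_len: int = 6) -> tuple[str, str]:
    """Pick toy forward and reverse primers (both 5'->3') spanning the whole template.

    The forward primer matches the 5' end of the top strand, so it anneals to the
    bottom strand; the reverse primer is the reverse complement of the 3' end of
    the top strand, so it anneals to the top strand.
    """
    forward = top_strand[:primer_len].upper()
    reverse = reverse_complement(top_strand[-primer_len:])
    return forward, reverse


toy_template = "ATGCGTACCGGATTACAGCTTGA"  # invented top strand, 5'->3'
forward_primer, reverse_primer = flanking_primers(toy_template)
print(forward_primer)  # ATGCGT
print(reverse_primer)  # TCAAGC
```

Both primers point toward each other across the target, which is why extension from each one copies the region between them.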
PCR Cycle (Three Steps)
- Denaturation (~95°C): Heat separates the two DNA strands
- Annealing (~50–65°C, depends on primer Tₘ): Primers bind (hybridize) to their complementary sequences on each strand
- Extension (~72°C): Taq polymerase extends the primers in the 5’→3’ direction, synthesizing new DNA
Exponential Amplification
Each cycle ideally doubles the amount of target DNA, so n cycles give up to 2ⁿ-fold amplification. After 20 cycles: ~10⁶-fold (2²⁰ ≈ 1.05 × 10⁶). After 30 cycles: ~10⁹-fold (2³⁰ ≈ 1.07 × 10⁹).
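These figures follow directly from repeated doubling, as the sketch below shows. The `efficiency` parameter is an added assumption (1.0 means perfect doubling every cycle); real reactions fall somewhat short of it and eventually plateau.

```python
def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase in target copies after `cycles` rounds of PCR.

    efficiency=1.0 means every template is copied in every cycle (perfect doubling).
    """
    return (1.0 + efficiency) ** cycles


print(f"{fold_amplification(20):.3g}")       # 1.05e+06
print(f"{fold_amplification(30):.3g}")       # 1.07e+09
print(f"{fold_amplification(30, 0.9):.3g}")  # lower yield at 90% efficiency
```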
How does PCR exploit DNA denaturation/renaturation? Base-pair complementarity and the reversibility of denaturation are what make PCR possible. The melting temperature (Tₘ) of the primers, estimated from their length and base composition, guides the choice of annealing temperature, which is typically set a few degrees below Tₘ.
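One simple way to estimate a primer's Tₘ from its base composition is the Wallace rule, Tₘ ≈ 2 °C × (A + T) + 4 °C × (G + C). This is a rough approximation intended for short oligonucleotides, not necessarily the method used in the course. The primer sequence is invented, and the 5 °C offset for the annealing temperature is a common rule of thumb rather than a fixed value.

```python
def wallace_tm(primer: str) -> int:
    """Estimate primer melting temperature (deg C) with the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def suggested_annealing_temp(primer: str, offset: int = 5) -> int:
    """Rule-of-thumb annealing temperature: a few degrees below the estimated Tm."""
    return wallace_tm(primer) - offset


toy_primer = "ATGCGTACCGGATTACAG"  # invented 18-nt primer: 9 A/T and 9 G/C
print(wallace_tm(toy_primer))                # 54
print(suggested_annealing_temp(toy_primer))  # 49
```

G-C pairs are weighted more heavily because they contribute more to duplex stability than A-T pairs, so GC-rich primers call for higher annealing temperatures.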
Applications of PCR
- Diagnosis of infectious diseases (e.g., PCR for SARS-CoV-2, HIV, tuberculosis)
- Genetic testing and prenatal diagnosis
- Forensic science (DNA fingerprinting)
- Sequencing (PCR amplification precedes sequencing)
- Cloning of genes
- Detection of mutations and quantification of nucleic acids, including gene expression (PCR variants such as RT-PCR and qPCR)
21. Conclusion
This module has provided a comprehensive survey of nucleic acid biochemistry, from the atomic level to the chromosomal level:
- Nucleotides, the monomeric building blocks, consist of a nitrogenous base, a pentose sugar, and a phosphate group. Their precise chemistry dictates all higher-order properties of DNA and RNA.
- Polynucleotide chains are linked by 3’,5’-phosphodiester bonds, giving strands directionality (polarity).
- DNA’s double-helical structure, governed by Watson-Crick base pairing and base-stacking, is the molecular basis of genetic inheritance. The B-form is physiologically dominant, but A- and Z-forms have important roles.
- Unusual structures (hairpins, cruciforms, triple helices, G-quadruplexes) expand the functional repertoire of DNA and RNA.
- DNA can be denatured and re-annealed, a property exploited in PCR, hybridization, and countless molecular biology techniques.
- DNA is subject to damage (deamination, depurination, oxidation, UV damage, chemical mutagenesis). Repair pathways counteract these, but failures lead to mutations.
- DNA methylation provides an epigenetic layer of gene regulation, influencing transcription without altering the base sequence.
- RNA adopts complex secondary structures enabling its diverse roles as mRNA, tRNA, rRNA, ribozyme, and regulatory RNA.
- Nucleases degrade nucleic acids; restriction enzymes are powerful tools in molecular biology.
- Nucleotides serve multiple cellular functions beyond being nucleic acid building blocks: energy currency (ATP), signaling (cAMP, cGMP), coenzymes (NAD⁺, FAD, CoA).
- The human genome (~3 billion bp) is organized into linear chromosomes, and most of its sequence is non-coding. Genes are often interrupted by introns.
- Chromatin and nucleosomes package DNA into the nucleus. Histone modifications and DNA methylation together constitute the epigenome, regulating gene expression across cell types.
- Supercoiling and topoisomerases control DNA topology, which is essential for replication, transcription, and chromosome segregation. Topoisomerases are key therapeutic targets.
- SMC proteins (cohesins and condensins) maintain chromosome architecture through the cell cycle.
- Mitochondrial DNA is a separate, maternally inherited circular genome; its mutations cause a spectrum of metabolic diseases.
- PCR harnesses the principles of DNA denaturation and base-pair complementarity to amplify specific sequences, making it one of the most transformative tools in modern biology and medicine.
Study tip: Focus on understanding the mechanistic reasons for each structural feature — for example, why B-form DNA predominates at physiological conditions, why RNA is less stable than DNA, and why histones are basic proteins. These mechanistic insights will help you apply knowledge to novel questions in your exam.
Reference: David L. Nelson & Michael M. Cox, Lehninger Principles of Biochemistry, 7th or 8th Edition, W.H. Freeman, New York.