Protein Reverse Translation Peplica

The following conversation refers to this patent: US20030100000A1 Methods and compositions for reverse translation

The AI is Claude-Sonnet 3.5

U

rewrite this patent as a tech brief. the reader knows about molecular biology and chemistry, so just explain the mechanism of the central idea (reverse translation) that is used by this particular implementation. don't repeat a bunch of claims or history, just explain how it works.

Here is a tech brief summarizing the key mechanism of reverse translation described in this patent:

Reverse Translation Mechanism

This patent describes a method for "reverse translating" a peptide or protein directly into a DNA or RNA sequence that encodes it, without sequencing the protein first. The key steps are:

  1. Stepwise degradation of the peptide/protein from one end, releasing individual amino acid derivatives one at a time (e.g. using Edman degradation).
  2. For each released amino acid derivative, allowing it to interact with a set of "Reverse Translation Elements" (RTEs). Each RTE can specifically bind both an amino acid derivative and its corresponding codon.
  3. Through competitive binding or displacement, the amino acid derivative causes release of the matching codon from the RTE.
  4. The released codon is then incorporated into a growing oligonucleotide chain using standard oligonucleotide synthesis chemistry.
  5. Repeating steps 1-4 builds up the full nucleic acid sequence encoding the original peptide/protein.

The RTEs are the key novel component, acting as the physical link between amino acids and codons. They can be made using molecularly imprinted polymers or antibodies with dual specificity.

This allows direct conversion of protein sequence information into DNA/RNA sequence without the need for protein sequencing or degenerate oligonucleotide synthesis. It enables amplification and cloning of proteins from very small initial quantities.
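The five steps listed above can be sketched as a loop. This is a toy model rather than the patent's procedure: the function names are hypothetical, the peptide is a string of one-letter codes, each RTE is reduced to a dictionary lookup, and one codon per amino acid is chosen arbitrarily from the synonymous set.

```python
# Toy model of the reverse-translation loop. All names are hypothetical.
# One representative codon per amino acid (arbitrary pick among synonyms).
PREFERRED_CODON = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def degrade_one_residue(peptide):
    """Step 1: split off the N-terminal residue, as one Edman cycle would."""
    return peptide[0], peptide[1:]


def rte_release_codon(residue):
    """Steps 2-3: the matching RTE releases the codon for this residue."""
    return PREFERRED_CODON[residue]


def reverse_translate(peptide):
    """Step 5: repeat degradation, codon release, and chain extension."""
    chain = ""  # recorded in coding-sense order; synthesis direction is not modeled
    while peptide:
        residue, peptide = degrade_one_residue(peptide)
        chain += rte_release_codon(residue)  # step 4: extend by one codon
    return chain


print(reverse_translate("MKW"))  # prints ATGAAATGG
```

The model leaves out binding chemistry entirely. It only shows how one degradation cycle maps to one codon addition.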

U

are there a large number of different RTE species? do they bind preferentially to one amino acid? how does step 3 work, in excruciating detail please

Certainly. Let me provide more detail on the RTEs and step 3:

Number and Specificity of RTEs:

  1. There are multiple RTE species, ideally at least one for each of the 20 standard amino acids.
  2. Each RTE is designed to bind preferentially to one specific amino acid or its derivative (e.g., from Edman degradation).
  3. Although the genetic code has 61 amino acid-encoding codons, the number of unique RTEs is closer to 20 than 61, because each RTE can potentially accommodate multiple codons for the same amino acid.
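The 20-versus-61 count can be checked by enumerating the standard genetic code. The short sketch below builds the table from the usual compact encoding (bases in T, C, A, G order, `*` marking stop codons); the variable names are illustrative only.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codons_for = defaultdict(list)
all_codons = ("".join(triplet) for triplet in product(BASES, repeat=3))
for codon, aa in zip(all_codons, AMINO_ACIDS):
    if aa != "*":  # skip the three stop codons
        codons_for[aa].append(codon)

n_sense = sum(len(codons) for codons in codons_for.values())
print(len(codons_for), n_sense)  # prints 20 61
```

Leucine, serine, and arginine each have six codons, while methionine and tryptophan have one. An RTE set keyed on side chains therefore needs about 20 members regardless of which synonymous codon each one carries.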

Step 3 in Detail:

The patent describes three main variations for step 3: "Knob", "Displacement", and "Competitive" RT. Let's focus on the Displacement method as an example:

  1. RTEs are prepared as a solid phase (e.g., beads or a column) with binding sites specific for amino acid sidechains.
  2. Before reverse translation begins, the RTEs are saturated with conjugates of amino acid derivatives and their encoding codons (called C-AAs). These bind to the RTEs via the amino acid portion.
  3. When an amino acid derivative from protein degradation is introduced, it competes for binding to its specific RTE.
  4. Due to the competitive nature of binding and similar affinities, the incoming amino acid derivative displaces the C-AA with the matching amino acid sidechain structure.
  5. This displacement releases the C-AA (amino acid-codon conjugate) into solution.
  6. The released C-AA contains the codon corresponding to the amino acid that was just degraded from the protein.
  7. The solution containing the displaced C-AA is collected.
  8. The codon portion of the C-AA is then used in oligonucleotide synthesis to extend the growing DNA/RNA chain.

This process relies on the specific recognition of amino acid sidechains by the RTEs and the principle of competitive binding. The incoming amino acid derivative effectively "kicks out" the C-AA conjugate carrying the matching codon, allowing that codon to be captured and used in DNA synthesis.
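A minimal sketch of the displacement bookkeeping follows, assuming idealized behavior: perfect side-chain specificity and exactly one conjugate released per incoming derivative. The class and function names are hypothetical and do not come from the patent.

```python
class DisplacementRTE:
    """Hypothetical solid-phase RTE specific for one amino acid side chain."""

    def __init__(self, amino_acid, codon, sites=3):
        self.amino_acid = amino_acid
        # Every site starts loaded with a C-AA conjugate: (amino acid, codon).
        self.bound = [(amino_acid, codon)] * sites

    def expose(self, derivative):
        """Return the displaced C-AA, or None if nothing is displaced."""
        if derivative != self.amino_acid or not self.bound:
            return None
        # Idealized: the derivative takes one site and frees one conjugate.
        return self.bound.pop()


def displace(rtes, derivative):
    """Pass one amino acid derivative over all RTEs and collect the eluate."""
    released = (rte.expose(derivative) for rte in rtes)
    return [conjugate for conjugate in released if conjugate is not None]


rtes = [DisplacementRTE("M", "ATG"), DisplacementRTE("K", "AAA")]
eluate = displace(rtes, "K")
print(eluate)  # prints [('K', 'AAA')]
```

Each displacement consumes one loaded site, so in this model an RTE bed has a finite capacity per amino acid.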

The efficiency and specificity of this process depend on factors like:

  • The binding affinity and specificity of the RTEs
  • The relative concentrations of bound C-AAs and incoming amino acid derivatives
  • The kinetics of the displacement reaction
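The first two factors can be made concrete with the textbook expression for two ligands competing for a single class of binding sites at equilibrium. This is a generic model, not one taken from the patent. It assumes free concentrations and dissociation constants (in the same units) are known, and it ignores kinetics.

```python
def occupancy(free_derivative, kd_derivative, free_caa, kd_caa):
    """Fraction of RTE sites held by each ligand at equilibrium.

    Single-site competitive binding: each ligand contributes the term
    concentration / Kd, and empty sites contribute 1.
    """
    d = free_derivative / kd_derivative
    c = free_caa / kd_caa
    denominator = 1.0 + d + c
    return d / denominator, c / denominator


# Equal affinities, ninefold excess of incoming derivative over free C-AA.
derivative_fraction, caa_fraction = occupancy(9.0, 1.0, 1.0, 1.0)
print(derivative_fraction, caa_fraction)
```

With equal affinities the split follows the concentration ratio, so a larger excess of derivative or a weaker C-AA affinity shifts more conjugate into solution.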

The patent suggests using fluorescently labeled amino acids to optimize and verify the displacement process during development.

U
  • does the patent describe how they discovered the ~20 molecules that bind specifically to one and only one amino acid? (is this actually what's happening in the scheme?)
  • an amino acid derivative is just a free amino acid, right?
  • how does the growing DNA/RNA chain incorporate only one copy of each codon? is there a protecting group or something?

Excellent questions. Let me address each one:

  1. Specific binding molecules: The patent doesn't describe discovering ~20 molecules that bind specifically to one and only one amino acid. Instead, it proposes creating these specific binding sites using two main approaches:

a) Molecularly Imprinted Polymers (MIPs): These are synthetic materials created by polymerizing monomers around a template molecule (in this case, an amino acid or its derivative). When the template is removed, it leaves behind a binding site specific to that molecule.

b) Antibody-derived molecules: Using techniques like phage display to generate antibodies or antibody fragments that specifically bind to individual amino acids or their derivatives.

The scheme doesn't rely on naturally occurring molecules with perfect specificity, but rather on engineered binding sites.

  2. Amino acid derivatives: An amino acid derivative in this context is not just a free amino acid. It's typically the product of the degradation step, such as the phenylthiohydantoin (PTH) derivative produced by Edman degradation. These derivatives retain the side chain structure of the original amino acid but have been modified by the degradation chemistry.
  3. Incorporating one copy of each codon: The patent describes using standard oligonucleotide synthesis chemistry to incorporate the codons into the growing chain. This process typically involves:

a) Protecting groups: The 5' end of the growing chain is protected between addition steps.

b) Controlled addition: Only one codon (as a phosphoramidite or similar activated form) is added in each cycle.

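c) Capping and oxidation steps: Capping blocks chains that failed to couple so they cannot extend in later cycles, and oxidation converts the newly formed phosphite linkage into a stable phosphate.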

The released codon from the RTE step would be in a form ready for this standard synthesis process, likely with appropriate protecting groups already in place.

The key is that the oligonucleotide synthesis is performed one codon at a time, controlled by the sequential release of codons from the RTE process, rather than adding all codons simultaneously.
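The one-codon-per-cycle logic can be illustrated with a small simulation of a generic solid-phase synthesis cycle. It assumes the codon arrives as a single protected block, so it can couple only once per cycle. The data layout and the `coupling_succeeds` flags are hypothetical devices for the example, not details from the patent.

```python
def synthesis_cycle(chains, codon, coupling_succeeds):
    """Run one deblock, couple, cap, oxidize cycle over all chains."""
    for chain, coupled in zip(chains, coupling_succeeds):
        if chain["capped"]:
            continue  # capped failure sequences never extend again
        # Deblock: the terminal protecting group is removed (implicit here).
        if coupled:
            # Couple: exactly one protected codon block adds per cycle.
            chain["codons"].append(codon)
        else:
            # Cap: permanently block the end that failed to react.
            chain["capped"] = True
        # Oxidize: stabilizes the new linkage; no state change in this model.


chains = [{"codons": [], "capped": False} for _ in range(3)]
synthesis_cycle(chains, "ATG", [True, True, False])
synthesis_cycle(chains, "AAA", [True, False, True])

full_length = ["".join(c["codons"]) for c in chains if not c["capped"]]
print(full_length)  # prints ['ATGAAA']
```

Only the chain that coupled in both cycles remains uncapped, and it carries exactly one copy of each codon.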