Written by Billal Ahmed (PEACE Advisor)
I work in an MCB lab, and I thought I'd explain the gist of my research to you all! The nature of MCB research is quite different from the kind of learning done in a typical biology class, so I think shedding some light on it may be helpful to those of you trying to decide between career choices or figuring out whether research is for you.
The specific details of my research are probably something I shouldn't disclose over a blog, so I'm not going to mention any specific names or sequences, but I will go over the conceptual logic in detail.
Before I get started, I'd like to note that biology research is very, very nuanced. Though general models are helpful to contextualize research and will save you in the vast majority of undergraduate biology classes, there are specific details and exceptions that are important to understand. These details are usually learned and dealt with as they come, because if you're trained with a general skill set they should be easy to acquire.
My project studies protein cleavage, which is the process by which a protein (which is just a long string of Amino Acids) is cleaved into two separate strings of amino acids at one of the peptide bonds.
Consider a sample protein. To avoid having to write out every amino acid, let's zoom into a particular region:
Amino Acid 1 <> Amino Acid 2 <> Amino Acid 3 <> Amino Acid 4 <> Amino Acid 5 <> Amino Acid 6
Let's say some arbitrary enzyme in the cell where this protein is produced cleaves between Amino Acids 4 and 5. This will yield two new peptide chains:
(N-terminus, or “start” of protein) Rest of Protein <> Amino Acid 1 <> Amino Acid 2 <> Amino Acid 3 <> Amino Acid 4
Amino Acid 5 <> Amino Acid 6 <> Rest of Protein (C-terminus, or “end” of protein)
Together with the original protein, there are now 3 distinct species. However, one of these will often be degraded very quickly, depending on its N-terminus amino acid. Recall that the N-terminus Amino Acid is the “start” of the protein, the amino acid with the free Amine Group. The start codon for nearly all genes is ATG, which corresponds to the Amino Acid Methionine. Though not emphasized in most introductory biology/biochemistry courses, peptide half-lives are largely determined by the N-terminus Amino Acid, and peptides that start with Methionine have some of the longest half-lives (evolutionarily, it shouldn't be surprising that the start codon codes for Methionine, but that's a story for another day). Looking back at the two peptides created by this cleavage event, if Amino Acid 5 is not Methionine, there is a high chance that this two-amino acid peptide will be degraded by the cell and “functionally” not exist. This leaves us with just two peptides: the full-length 6 amino acid chain and the shortened 4 amino acid chain.
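As a rough illustration, the cleavage event and the stability rule of thumb described above can be sketched in Python. The residue names and the `cleave` and `starts_with_met` helpers are made up for this example (the six amino acids are left unnamed in this post), and the Methionine check is only the simplified rule used here, not a real half-life model.

```python
# Hypothetical residues standing in for Amino Acids 1-6 of the zoomed-in region.
region = ["Gly", "Ala", "Leu", "Lys", "Ser", "Val"]

def cleave(chain, after_position):
    """Split a chain after a 1-based position (a cut between 4 and 5 is after_position=4)."""
    return chain[:after_position], chain[after_position:]

def starts_with_met(chain):
    """Simplified rule of thumb: Met at the N-terminus suggests a longer half-life."""
    return bool(chain) and chain[0] == "Met"

n_side, c_side = cleave(region, 4)
print(n_side)                   # the shortened 4 amino acid chain
print(c_side)                   # the new peptide that starts at Amino Acid 5
print(starts_with_met(c_side))  # False here, so this is the piece likely to be degraded
```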
Though there is a lot of diversity in protein cleavage models, this is the basic model I'm wrestling with in my research. There is a cleavage event, and I'm trying to discern where this cleavage event occurs.
But how exactly would a biologist attempt to figure out the location of this cleavage event? Working with the assumption that the cleavage occurs at a specific site in the peptide sequence, if you remove this specific “cleavage site,” the cleavage should stop. This is exactly how I'm approaching this question.
To make mutant versions of proteins, biologists usually make mutant versions of the DNA gene that encodes the protein. Biologists will then add this DNA to cells and have the cells transcribe and translate the gene into protein. Though there are several ways to generate these sorts of deletions, I use a molecular cloning technique called Overlap-Extension PCR. Using this method, I'm able to replicate an entire gene but knock out any area in the middle that I want to remove. So if my protein is 1000 amino acids long, I can replicate base pairs 1 to 3003, but I can knock out any region in the middle; if I want to remove Amino Acids 301 to 400, I can make a new DNA sequence that goes from base pair 1 to 900 and then from 1201 to 3003, with the entire region of 901 to 1200 being removed. You can also use this technique to replace existing amino acids instead of deleting them. Let me show you examples of both amino acid removal and replacement, using the sample peptide I used earlier. I will arbitrarily define these Amino Acids and write a corresponding DNA sequence that would be transcribed and translated into those Amino Acids.
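The base pair arithmetic is easy to get off by one codon, so here is a minimal Python sketch of it. It assumes the coding sequence starts at base pair 1 and that each amino acid takes exactly 3 base pairs; the function names `codon_span` and `pieces_after_deletion` are made up for this example.

```python
def codon_span(first_aa, last_aa):
    """Return the 1-based, inclusive base pair range encoding amino acids first_aa..last_aa."""
    start_bp = 3 * (first_aa - 1) + 1
    end_bp = 3 * last_aa
    return start_bp, end_bp

def pieces_after_deletion(gene_length_bp, first_aa, last_aa):
    """Return the two base pair ranges that remain once the chosen amino acids are removed."""
    start_bp, end_bp = codon_span(first_aa, last_aa)
    return (1, start_bp - 1), (end_bp + 1, gene_length_bp)

print(codon_span(301, 400))                   # (901, 1200)
print(pieces_after_deletion(3003, 301, 400))  # ((1, 900), (1201, 3003))
```

Note that amino acid 300 itself is encoded by base pairs 898 to 900, so a deletion that starts at base pair 901 begins with amino acid 301.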
Amino Acid 1 <> Amino Acid 2 <> Amino Acid 3 <> Amino Acid 4 <> Amino Acid 5 <> Amino Acid 6 <> Amino Acid 7 <> Amino Acid 8
DNA Sequence: CTA-GGG-GCA-AAG-AGC-GTA-CGC-ATA
Translation: Leucine-Glycine-Alanine-Lysine-Serine-Valine-Arginine-Isoleucine
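If you want to check the translation of this sample sequence yourself, a few lines of Python will do it. This is only a sketch: `CODON_TABLE` is a deliberately partial lookup containing just the eight codons used here (taken from the standard genetic code), and `translate` is a made-up helper name.

```python
# Partial codon table: only the codons that appear in the sample sequence.
CODON_TABLE = {
    "CTA": "Leu", "GGG": "Gly", "GCA": "Ala", "AAG": "Lys",
    "AGC": "Ser", "GTA": "Val", "CGC": "Arg", "ATA": "Ile",
}

def translate(dna_with_dashes):
    """Translate a dash-separated coding sequence, one codon at a time."""
    codons = dna_with_dashes.split("-")
    return [CODON_TABLE[codon] for codon in codons]

peptide = translate("CTA-GGG-GCA-AAG-AGC-GTA-CGC-ATA")
print("-".join(peptide))  # Leu-Gly-Ala-Lys-Ser-Val-Arg-Ile
```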
Disclaimer: The following PCR primers will NOT work because the Tm of the binding regions is way too small. This is a thought exercise meant to get you to think about primer design in the context of studying protein cleavage.
Let's design PCR primers first to eliminate the region and then to replace the region with other amino acids. Any PCR requires 2 primers: a 5' primer and a 3' primer. We have a 5' primer at the start of the gene (way up at the N-terminus end) and a 3' primer at the end of the gene (way down at the C-terminus end), but if we just did a standard PCR with these two, we'd get the normal gene without any modifications. To create modifications, we actually need two separate PCR reactions, and each one still has to obey the rule of one 5' primer and one 3' primer. We essentially amplify the first half (that is, the portion of the gene before the region we want to remove/replace) and amplify the second half separately. Then, in a second round of PCR, we combine the two fragments and PCR them together from the ends of the gene, thus reconstructing the entire mutated gene. We'll call the primers at the ends of the gene “external primers” and the primers around the mutation area “internal primers.”
5' external primer (at start of gene)
3' internal primer (near front end of mutation area)
3' external primer (at end of gene)
5' internal primer (near the back end of the mutation area)
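To make the second round more concrete, here is a toy Python sketch of how two first-round fragments that share an overlap get joined into one sequence. It only models the top strand as text (real overlap extension relies on the complementary strands annealing and a polymerase extending them), and the `stitch` helper and all sequences are made up for illustration.

```python
def stitch(fragment_a, fragment_b, min_overlap=6):
    """Join two top-strand fragments where the end of A matches the start of B."""
    longest = min(len(fragment_a), len(fragment_b))
    for size in range(longest, min_overlap - 1, -1):
        if fragment_a.endswith(fragment_b[:size]):
            return fragment_a + fragment_b[size:]
    raise ValueError("fragments share no overlap of the required length")

# Hypothetical sequences: both fragments carry the same 12 base pair overlap.
overlap = "ACGTTGCAAGGC"
first_half = "ATGAAACCC" + overlap    # from the 5' external and 3' internal primers
second_half = overlap + "GGGTTTTAA"   # from the 5' internal and 3' external primers

print(stitch(first_half, second_half))  # ATGAAACCCACGTTGCAAGGCGGGTTTTAA
```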
Let's say we suspect the cleavage site is located between the Lysine and the Serine. We can cut out those Amino Acids or change them and see if that knocks out the cleavage event. Generally speaking, changing an amino acid is a less deleterious mutation than a deletion, so in a practical context, you'd probably cut the residues out first and test for cleavage; if the deletion stops the cleavage, you'd then change the amino acids instead and see whether the protein still cleaves.
Let's say we want to delete the Lysine and the Serine. The key to this technique is that you want your primers to have overhangs that bind to each other: this way, when you combine your fragments from the first round of PCR, the overhangs will bind to each other and turn your two fragments into one long fragment. When deleting those amino acids, you want to “skip them” when designing your primers. Based on the DNA sequence above, for the internal 3' primer you'll want a primer that'll amplify from the Alanine towards the front of the gene, with an overhang that encompasses the Valine, Arginine, and Isoleucine (DNA sequence GTACGCATA). This 3' primer will work in tandem with the 5' primer at the start of the gene to amplify the first half of the gene, before your mutation area. For the internal 5' primer you'll want a primer that'll amplify from the Valine towards the end of the gene, with an overhang that encompasses the Leucine, Glycine, and Alanine (CTAGGGGCA). In each primer below, the overhang is written first and the binding part second, separated by a space.
3' Primer: TATGCGTAC TGCCCCTAG
5' Primer: CTAGGGGCA GTACGCATA
Remember that for the 3' primer, the actual sequence of the primer will be the reverse complement of the way you'd write it based on the gene sequence, because the primer must be written in the 5' to 3' direction. Notice that the entire 3' and 5' primers are reverse complements of each other. When making an internal deletion with Overlap-Extension PCR, the internal 5' and 3' primers will be reverse complements of each other.
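The deletion primers can be reproduced with a short Python sketch. The helper names `reverse_complement` and `deletion_primers` are made up for this example, and, as with the primers above, the result is a thought exercise rather than something you could order and use.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement every base, then reverse, so the result reads 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def deletion_primers(upstream, downstream):
    """Build internal primers that join upstream directly to downstream."""
    # Internal 5' primer: overhang (upstream) followed by binding part (downstream).
    five_prime = upstream + downstream
    # Internal 3' primer: overhang (downstream) then binding part (upstream),
    # both written as reverse complements.
    three_prime = reverse_complement(downstream) + reverse_complement(upstream)
    return five_prime, three_prime

# Leu-Gly-Ala sits before the deleted Lys-Ser; Val-Arg-Ile sits after it.
five_prime, three_prime = deletion_primers("CTAGGGGCA", "GTACGCATA")
print(five_prime)   # CTAGGGGCAGTACGCATA
print(three_prime)  # TATGCGTACTGCCCCTAG
print(reverse_complement(five_prime) == three_prime)  # True
```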
Now let's design some primers for a substitution mutation. Let's say we want to replace the suspected Lysine and Serine with two Alanines (Alanine tends to be the most vanilla Amino Acid, so suspect residues are often replaced with Alanines to see if there is still functionality, in this case cleavage). When making a substitution mutation with Overlap-Extension PCR, you want your overlap to be the new sequence you're replacing the original region with. In this case, your overhang should be any combination of base pairs that will lead to two Alanines (reusing the Alanine codon from above, GCAGCA). The binding parts of the primers should be the same as before, because the regions you want to bind haven't changed at all.
3' Primer: TGCTGC TGCCCCTAG
5' Primer: GCAGCA GTACGCATA
Note that in this case, the two primers are NOT perfect reverse complements of each other; only the overhang regions are.
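The same idea can be sketched in Python for the substitution case. The helper names are again made up for this example; the point is only to show that the 3' overhang is the reverse complement of the new Alanine codons, and that the two full primers no longer mirror each other.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement every base, then reverse, so the result reads 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def substitution_primers(upstream, replacement, downstream):
    """Build internal primers whose shared overhang is the replacement sequence."""
    # Internal 5' primer: new codons as overhang, then the downstream binding part.
    five_prime = replacement + downstream
    # Internal 3' primer: reverse complement of the new codons, then of the upstream part.
    three_prime = reverse_complement(replacement) + reverse_complement(upstream)
    return five_prime, three_prime

new_codons = "GCAGCA"  # two Alanines
five_prime, three_prime = substitution_primers("CTAGGGGCA", new_codons, "GTACGCATA")
n = len(new_codons)

print(five_prime)                                                # GCAGCAGTACGCATA
print(reverse_complement(three_prime[:n]) == five_prime[:n])     # True: overhangs pair up
print(reverse_complement(three_prime) == five_prime)             # False: whole primers do not
```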
Using this kind of PCR, you can generate mutant versions of the gene that will produce proteins lacking certain amino acids. This is merely one way of analyzing the complex question of protein cleavage, but hopefully this gave you some insight into the techniques and thought processes of a scientist trying to unravel the mysteries of the biological universe.