With the development of the life sciences, many proteins have been studied or are under study. For proteins that have already been characterized, NCBI and UniProt provide the protein structure, post-translational modifications and other relevant information. For proteins that have not been studied, prediction websites or software can be used to analyze the signal peptide, hydrophobicity, topology, isoelectric point and other properties.
Most amino acids are encoded by more than one codon. Analysis of codon usage in E. coli shows that some codons are rarely used, especially certain codons for Arg, Ile, Leu, Gly and Pro. When one or more tRNAs are scarce or missing, translation may stall or terminate prematurely. A small number of rare codons has little effect on expression of the target protein, but expression will be very low if the gene contains many rare codons or a consecutive run of them. If the gene is the wild-type sequence, it is best to run a rare codon analysis first. If the rare codon content is not high, a Rosetta strain can be used to express the gene. If the rare codon content is too high, the best approach is to codon-optimize the sequence and synthesize the gene directly.
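The rare codon analysis described above can be sketched in a few lines of Python. This is only a minimal illustration: the set of rare codons below is a commonly cited one for E. coli and is an assumption here, not a list taken from this article, and the function names are hypothetical. For real work, use a codon usage table for the actual host strain.

```python
# Codons commonly cited as rare in E. coli (assumed set; replace with
# values from a codon usage table for your own host strain).
RARE_CODONS = {
    "AGG": "Arg", "AGA": "Arg", "CGA": "Arg",
    "ATA": "Ile", "CTA": "Leu", "CCC": "Pro", "GGA": "Gly",
}


def find_rare_codons(cds):
    """Return (codon_number, codon, amino_acid) for each rare codon.

    `cds` must be an in-frame coding sequence starting at the first codon.
    Codon numbers are 1-based.
    """
    seq = cds.upper().replace("U", "T")
    hits = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon in RARE_CODONS:
            hits.append((i // 3 + 1, codon, RARE_CODONS[codon]))
    return hits


def longest_rare_run(hits):
    """Length of the longest stretch of consecutive rare codons."""
    longest = run = 0
    previous = None
    for position, _, _ in hits:
        run = run + 1 if previous is not None and position == previous + 1 else 1
        longest = max(longest, run)
        previous = position
    return longest


def rare_codon_fraction(cds):
    """Fraction of codons in the sequence that are in the rare set."""
    total = len(cds) // 3
    return len(find_rare_codons(cds)) / total if total else 0.0


if __name__ == "__main__":
    example = "ATGAGGAGACTAGGCTAA"  # toy sequence, not a real gene
    hits = find_rare_codons(example)
    print(hits, longest_rare_run(hits), rare_codon_fraction(example))
```

A high overall fraction and a long consecutive run are the two situations the text warns about. Where to draw the line between using a Rosetta strain and resynthesizing the gene is a judgment call that this sketch does not make.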
There are two main reasons. One is that E. coli stops growing or dividing after IPTG or another inducer is added. In this case, consider that the protein itself may be toxic to the host; a tightly regulated host strain, such as one carrying pLysS, can be used. The other reason is more troublesome: when the cell density in a large-scale shaker culture gradually falls, or the medium even turns clear again, phage contamination should be suspected.
When expression of a protein is low, or most of it is expressed as inclusion bodies, gene optimization, vector and strain selection, and optimization of the expression and induction conditions can usually increase both the soluble fraction and the yield.
1. Codon optimization
Codon optimization is based on the codon preference of E. coli. Generally, codons with a usage frequency greater than 20% are selected. The optimized gene sequence can improve the stability of the mRNA secondary structure and avoid translational pausing, premature termination, frameshifting and amino acid misincorporation caused by an insufficient tRNA pool.
2. Reduce the protein expression rate
Lowering the IPTG concentration used for induction reduces the rate of protein expression. Shifting the balance between the rate at which the polypeptide aggregates and the rate at which it folds can improve soluble expression.
3. Temperature
At 37 °C some proteins tend to form inclusion bodies, whereas at 30 °C soluble or active protein may be produced. At low temperature (12-20 °C), extending the induction time (overnight) often gives the highest yield of soluble protein.
4. Periplasmic expression
A vector can be chosen that exports the protein to the periplasm. The periplasmic space is more favorable for protein folding and disulfide bond formation. The pET-22 series are commonly used vectors. Note, however, that some proteins are not suited to export to the periplasm; for example, periplasmic fusions with β-gal have been shown to be toxic.
To reduce the formation of inclusion bodies, we suggest obtaining the gene by gene synthesis. More soluble protein can also be obtained by lowering the induction temperature (for example to 12-20 °C), reducing the IPTG concentration (0.01-0.1 mM), extending the induction time, and using special media. In addition, check whether the protein itself is hydrophobic. If it is strongly hydrophobic, fuse it to a solubility-enhancing tag such as Trx, GST or NusA to obtain soluble expression. Inclusion bodies are not always a bad outcome, however: expression in inclusion bodies is often very high, and high-purity protein samples are easy to obtain by purification. Inclusion bodies are also a good choice for projects in which the protein is used to immunize animals to raise antibodies for Western blotting (WB).
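The hydrophobicity check mentioned above can be approximated with a hydropathy score. The sketch below uses the Kyte-Doolittle scale, which is a substitution on our part since the article does not name a method. The 19-residue window and the 1.6 cutoff follow the usual convention for flagging membrane-spanning stretches and are assumptions here. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    try:
        return sum(KD[aa] for aa in seq) / len(seq)
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]}") from None


def hydrophobic_windows(protein, window=19, threshold=1.6):
    """Return (start_position, score) for each window at or above threshold.

    Positions are 1-based. The window size and threshold are assumed
    defaults, not values from the article.
    """
    seq = protein.upper()
    hits = []
    for i in range(len(seq) - window + 1):
        score = gravy(seq[i:i + window])
        if score >= threshold:
            hits.append((i + 1, round(score, 2)))
    return hits


if __name__ == "__main__":
    toy = "KKKK" + "L" * 19 + "KKKK"  # toy sequence with a hydrophobic core
    print(round(gravy(toy), 2), hydrophobic_windows(toy))
```

A clearly positive overall score, or several flagged windows, suggests the protein may need a solubility tag. This is a rough screen and does not replace a dedicated topology prediction.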
Once a sufficient amount of inclusion bodies has been obtained, on-column refolding or dialysis/dilution refolding can be attempted:
On-column refolding: after the protein is bound to the column in the denatured state, the column is washed with buffer containing a defined concentration of urea or guanidine hydrochloride plus 1 mM reduced and 0.2 mM oxidized glutathione, and the protein is then eluted with imidazole.
Dialysis/dilution refolding: adding reduced and oxidized glutathione, or surfactants such as CHAPS, while the urea is removed by dialysis can help the protein fold properly.
The main reasons for degraded or truncated protein are as follows:
1. The protein itself is being degraded; proteases that compromise the stability of the protein may be present.
2. Internal translation initiation: if a sequence resembling the ribosome binding site (AAGGAGG), followed by an appropriate spacer, appears upstream of an internal ATG codon, translation can start there and produce a truncated product that looks like degradation. One solution is to use a vector with fusion tags at both ends; for example, some pET-series vectors carry His tags at both the N-terminus and the C-terminus. The full-length protein can then be separated from the truncated protein by increasing the imidazole concentration. Alternatively, the full-length protein can be obtained by sequential affinity purification using a different tag at each end of the protein.
3. Another factor affecting protein stability is the amino acid adjacent to the N-terminal Met, that is, the N-end rule. When Arg, Lys, Phe, Leu, Trp or Tyr appears at the N-terminus, the protein has a short half-life and is prone to degradation. Leu is the worst case: when it occupies the second position, the protein is extremely unstable. When choosing cloning sites, NcoI and NdeI are both good N-terminal cloning sites. When using NdeI, take particular care that Leu does not end up in the second position.
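The two sequence-level checks in the list above (an internal ribosome-binding-site-like motif upstream of an ATG, and a destabilizing residue next to the N-terminal Met) can be screened for before cloning. The sketch below is illustrative only. The exact motif string, the 5-13 nt spacer range and the restriction to in-frame ATG codons are all assumptions, since the article only says "an appropriate spacer". The function names are hypothetical.

```python
# Residues the text lists as destabilizing when they follow the N-terminal Met.
DESTABILIZING = set("RKFLWY")


def second_residue_warning(protein):
    """Return the second residue if it is on the destabilizing list, else None."""
    if len(protein) < 2:
        return None
    residue = protein[1].upper()
    return residue if residue in DESTABILIZING else None


def internal_start_candidates(cds, motif="AAGGAG", min_gap=5, max_gap=13):
    """Find RBS-like motifs followed, after a spacer, by an in-frame internal ATG.

    Returns (motif_index, atg_index) pairs using 0-based nucleotide indices.
    The motif, the spacer range and the in-frame filter are assumptions.
    """
    seq = cds.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        motif_end = start + len(motif)
        for gap in range(min_gap, max_gap + 1):
            atg = motif_end + gap
            # Only in-frame ATGs give a truncated version of the same protein.
            if atg % 3 == 0 and seq[atg:atg + 3] == "ATG":
                hits.append((start, atg))
        start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    toy_cds = "ATGGCTGCTAAGGAGGCTGCTATGGCTTAA"  # toy sequence, not a real gene
    print(internal_start_candidates(toy_cds))
    print(second_residue_warning("MLKT"))
```

A hit from either function is only a prompt to look more closely. Whether an internal site is actually used, or a given N-terminus is actually unstable, has to be confirmed experimentally.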