Overview of Bacterial and Yeast Systems for Protein Expression

Over the past decade the variety of hosts and vector systems for recombinant protein expression has increased dramatically. Researchers now select from among mammalian, insect, yeast, and prokaryotic hosts, and the number of vectors available for use in these organisms continues to grow. With the increased availability of cDNAs and protein-coding sequence information, it is certain that these and other, yet-to-be-developed systems will be important in the future. Despite the development of eukaryotic systems, E. coli remains the most widely used host for recombinant protein expression. Optimization of recombinant protein expression in prokaryotic and eukaryotic host systems has been carried out by varying simple parameters such as expression vectors, host strains, media composition, and growth temperature. Recombinant gene expression in eukaryotic systems is often the only viable route to the large-scale production of authentic, post-translationally modified proteins. It is becoming increasingly easy to find a suitable system to overexpress virtually any gene product, provided that it is properly engineered into an appropriate expression vector.


INTRODUCTION
Protein quantity is an important consideration, since substantial time and effort are required to achieve higher quantities. Modest quantities, as typically needed for reagent proteins, are often easily obtained from prokaryotic or eukaryotic expression systems. Protein targets represent the majority of expressed proteins used in classical pharmaceutical drug discovery, which involves the configuration of a high-throughput screen of a chemical or natural product library in order to find selective antagonists or agonists of the protein's biological activity. In contrast to reagent proteins, therapeutic protein agents are the most demanding in terms of resources. Like conventional drugs, therapeutic proteins have intrinsic biological properties. The ultimate objective for expression of a therapeutic protein is the production of clinical-grade protein at levels approaching or exceeding gram-per-liter quantities. For most expression systems this is not readily achievable. Other than bacterial and yeast expression, the most robust system for producing these levels is the Chinese hamster ovary (CHO) system. Because bacteria and yeast lack the proper post-translational modifications (e.g., glycosylation), CHO cell expression is often the only choice for achieving sufficient expression.

Escherichia coli
E. coli combines a rapid growth rate, ease of high-cell-density fermentation, affordability, and the availability of excellent genetic tools; because of these characteristics it has been used extensively for protein overexpression [1]. Its widespread use in laboratories has led to technologies that target overexpressed proteins to different intracellular compartments (Fig. 1). This is beneficial because these compartments provide different environments, which may enable folding of particular proteins of interest [2]. Optimization of recombinant protein expression in E. coli has been carried out largely by trial and error, by altering simple parameters such as expression vectors, host strains, media composition, and growth temperature [3]. In recent years, extensive studies have shown that replacing codons in a heterologous gene with synonymous codons preferred by the expression host (codon optimization), and manipulating the nucleotide sequence of the translational initiation region, can have a profound effect on recombinant protein yields [4]. To perform molecular genetic techniques successfully, it is important to understand the properties of the various Escherichia coli host strains that are commonly used for the propagation and manipulation of recombinant DNA [5]. E. coli is an enteric, rod-shaped, Gram-negative bacterium with a circular genome of 4.6 Mb [6]. It was initially chosen as a model system because of its ability to grow on chemically defined media and its rapid growth rate. In rich media, during the log phase of growth, E. coli doubles every 20-30 min; hence, during an overnight incubation, a single cell will double enough times to yield a colony on an agar plate or 1-2 billion cells per millilitre of liquid medium.
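The growth figures above can be checked with a short calculation. The following Python sketch counts how many doublings fit into an overnight incubation at the quoted doubling time of 20-30 min, and how many doublings a single cell needs to reach 2 billion cells. The 16-hour incubation length and the function names are assumptions made for illustration; they do not come from the text.

```python
import math

def generations(hours: float, doubling_min: float) -> float:
    """Doublings that fit into `hours` of uninterrupted log-phase growth."""
    return hours * 60.0 / doubling_min

def doublings_to_reach(start_cells: float, target_cells: float) -> float:
    """Doublings needed to grow from start_cells to target_cells."""
    return math.log2(target_cells / start_cells)

OVERNIGHT_HOURS = 16.0  # assumed incubation length, not stated in the text

for doubling_min in (20.0, 30.0):
    n = generations(OVERNIGHT_HOURS, doubling_min)
    print(f"{doubling_min:.0f} min doubling time: {n:.0f} possible doublings")

# Doublings a single cell needs to reach 2 billion cells
needed = doublings_to_reach(1, 2e9)
print(f"doublings needed for 2e9 cells: {needed:.1f}")
```

About 31 doublings are enough to reach 2 billion cells, whereas up to 48 would fit into the assumed 16 hours at a 20 min doubling time. Under these assumptions the culture therefore leaves log phase and saturates before the incubation ends, so the final density is set by the medium rather than by the time available.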
The ease of its transformation and genetic manipulation has subsequently solidified the role of E. coli as the host of choice for the propagation, manipulation, and characterization of recombinant DNA [7]. Over the past 60 years, E. coli has been the subject of intensive research, and as a result more is now known about this bacillus than about any other organism on earth. A wide variety of E. coli mutants have been isolated and characterized. Almost all strains currently used in recombinant DNA experiments are derived from a single strain, E. coli K-12, which was isolated from the feces of a patient with diphtheria in 1922 [8]. The genotype indicates the genetic state of the DNA in an organism; it is associated with an observed behaviour known as the phenotype [9]. Genotypes of E. coli strains are described in accordance with a standard nomenclature, and genes are given three-letter, lowercase, italicized names that are often mnemonics.
The methylotrophic yeast Pichia pastoris is widely used as a host system for the production of recombinant proteins [10]. It is also commonly used as a model organism for basic research on the biogenesis of peroxisomes and secretory organelles. Additionally, it has come into focus for the production of glycoproteins with human-like N-glycan structures, as well as of several metabolites and recombinant proteins [11]. Recently, P. pastoris has been reclassified into a new genus, Komagataella, which is divided into three species: K. pastoris, K. phaffii, and K. pseudopastoris [12]. The strains GS115 and X-33, made available by Invitrogen, belong to the species K. phaffii. Other strains belonging to either K. pastoris or K. phaffii are also freely used by researchers. In accordance with the published literature, all strains are still referred to here as P. pastoris, which stands for the entire genus Komagataella. At present, the genomes of two P. pastoris strains (DSMZ 70382 and GS115) have been fully sequenced [13].
From these sequences, two genome browsers were set up. Until then, most data on the genetic and physiological background of strain X-33 and on process design relied on analogies to other, well-studied yeasts such as Saccharomyces cerevisiae. Accordingly, P. pastoris gene names mainly follow the format established for S. cerevisiae [14].

Pichia pastoris
Pichia pastoris is similar to Saccharomyces cerevisiae in terms of general growth conditions and handling [15]. Knowledge of basic microbiological and sterile techniques is essential before attempting to grow and manipulate any microorganism, and familiarity with basic molecular biology and protein chemistry is also required. The genome size and number of annotated open reading frames (9.4 Mb; 5,450 ORFs) are comparable to those of other yeasts; the majority of metabolic enzymes are present in single copies, and the number of proteins actually secreted is low, making secretory production of heterologous proteins attractive. For selection on neomycin and large-scale growth, the wild-type Pichia strain X-33 is used [16]. It grows in YPD and in minimal media. For liquid cultures, plates, and slants, the growth temperature of Pichia pastoris is 28-30°C [17]. Growth above 32°C during induction can impair protein expression and may even cause cell death [18-20].
Other key facts are as follows. The doubling time of log-phase Mut+ or MutS Pichia in YPD is ~2 hours; Mut+ and MutS strains do not differ in growth rate unless grown on methanol. The doubling time of log-phase Mut+ Pichia in methanol medium (MM) is 4-6 hours [21], whereas that of log-phase MutS Pichia in MM is ~18 hours. One OD600 unit corresponds to ~5 × 10^7 cells/ml. Cells can be stored for weeks to months in YPD medium or on YPD agar slants. Each strain is streaked for single colonies on YPD; one colony is then transferred to a YPD stab and grown for 2 days at 30°C. At 4°C, cells can be stored on YPD for several weeks. To store cells for months to years, they should be kept at -80°C. A single colony of each strain is cultured overnight in YPD. The cells are then harvested and suspended in YPD containing 15% glycerol at a final OD600 of 50-100 (approximately 2.5 × 10^9 to 5.0 × 10^9 cells/ml). The cells are frozen in liquid nitrogen or a dry ice/ethanol bath and then stored at -80°C.
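The OD600 figures above lend themselves to a small worked calculation. The Python sketch below converts an OD600 reading into an approximate cell density using the stated factor of ~5 × 10^7 cells/ml per OD600 unit, and estimates the volume of YPD with 15% glycerol needed to resuspend a harvested culture at the target OD600 of 50-100. The function names and the example culture (25 ml at OD600 10) are hypothetical, and the calculation assumes that density scales linearly with OD600 and that all cells are recovered at harvest.

```python
CELLS_PER_ML_PER_OD = 5e7  # from the text: OD600 of 1 is ~5 x 10^7 cells/ml

def cells_per_ml(od600: float) -> float:
    """Approximate cell density, assuming a linear OD600-to-density relation."""
    return od600 * CELLS_PER_ML_PER_OD

def resuspension_volume_ml(culture_ml: float, culture_od600: float,
                           target_od600: float) -> float:
    """Volume of YPD + 15% glycerol that brings harvested cells to target_od600.

    Assumes complete recovery of cells during harvest (an idealization).
    """
    if target_od600 <= 0:
        raise ValueError("target_od600 must be positive")
    return culture_ml * culture_od600 / target_od600

# Hypothetical overnight culture: 25 ml at OD600 = 10
for target in (50.0, 100.0):
    volume = resuspension_volume_ml(25.0, 10.0, target)
    density = cells_per_ml(target)
    print(f"target OD600 {target:.0f}: resuspend in {volume:.1f} ml "
          f"(~{density:.1e} cells/ml)")
```

In practice, dense cultures have to be diluted before measurement so that the reading stays within the linear range of the spectrophotometer; the sketch ignores this step.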

Recombinant protein production in Pichia pastoris
The use of P. pastoris as a cellular host for recombinant protein production is steadily increasing [22]. It is easily manipulated genetically, is easily cultured, and can reach high cell densities (> 130 g/l dry cell weight) on methanol and glucose [23]. Equally important, as a eukaryote it offers the potential to produce soluble, correctly folded proteins that have undergone post-translational modifications such as glycosylation (O- and N-linked; with less overglycosylation than in S. cerevisiae), disulfide bridge formation, and processing of signal sequences [24]. With intracellular expression, the amino-terminal methionine residue is cleaved off, unlike in proteins expressed in E. coli; the protein can also be acetylated, and specific amino acid residues are likely to be phosphorylated, generating phosphoproteins without the limitations and bottlenecks imposed by the secretory pathway.
Transformation of the haploid, homothallic P. pastoris host with recombinant DNA is mediated either by integrative plasmids or by autonomously replicating plasmids [25]. Directed integration or gene replacement requires homology between the introduced DNA and the chromosomal locus. Multiple integrations are commonly sought on purpose. Integration requires only restriction of the plasmid at a unique site within a region homologous to the P. pastoris genome [26]. Transformation by electroporation yields genetically stable transformants with high transcription rates. Most vectors are hybrids of bacterial and yeast sequences, carrying an origin of replication for E. coli and selection markers. In addition, the plasmids contain a multiple cloning site (MCS). Both auxotrophic markers, such as a functional histidinol dehydrogenase gene (HIS4), which can be used in the histidinol dehydrogenase-defective strain GS115, and dominant markers are available [27].
Additional benefits of the P. pastoris system are its strong constitutive and inducible promoter systems. Transcription efficiency is a major issue in recombinant protein production; therefore, the choice of promoter is vital. The number of available promoters is limited, however, and mainly comprises the methanol-inducible alcohol oxidase 1 (AOX1) promoter and the constitutive glyceraldehyde-3-phosphate dehydrogenase (GAP) promoter [28]. Ellis and co-workers were the first to isolate the P. pastoris alcohol oxidase gene and two other genes regulated by methanol [29]. Alcohol oxidase is the key enzyme of the methanol utilization pathway, which is specific to methylotrophic yeasts. It is encoded by two genes, AOX1 and AOX2, which have been structurally and functionally characterized and reviewed several times. The AOX promoters are tightly regulated by a carbon source-dependent repression/induction mechanism, showing full repression during growth under glucose or glycerol excess and maximal induction during growth on methanol. In contrast, the glyceraldehyde-3-phosphate dehydrogenase promoter (PGAP) is constitutively active, although its strength varies with the carbon source used for cell growth. PGAP on glucose thus offers an attractive alternative to PAOX1, especially where induction by methanol is inappropriate or inconvenient, and may at the same time increase cell viability. Furthermore, the activity of PAOX1 in methanol-grown shake-flask cultures is slightly lower than that of PGAP in glucose-grown shake-flask cultures [30].
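The regulatory behaviour of the two promoters described above can be summarized as a simple lookup. The Python sketch below encodes only what the text states: PAOX1 is repressed under glucose or glycerol excess and maximally induced on methanol, while PGAP is constitutively active with a strength that depends on the carbon source. The function name and the returned labels are hypothetical, and the model is qualitative; it does not predict expression levels.

```python
def promoter_state(promoter: str, carbon_source: str) -> str:
    """Qualitative promoter state for a given carbon source, per the text above."""
    name = promoter.upper()
    carbon = carbon_source.lower()
    if name == "GAP":
        # Constitutive; only the strength varies with the carbon source
        return "constitutively active"
    if name == "AOX1":
        if carbon == "methanol":
            return "maximally induced"
        if carbon in ("glucose", "glycerol"):
            # Applies to excess conditions, as stated in the text
            return "repressed"
        return "not described"  # other carbon sources are not covered here
    raise ValueError(f"unknown promoter: {promoter}")

for promoter in ("AOX1", "GAP"):
    for carbon in ("glucose", "glycerol", "methanol"):
        print(f"{promoter} on {carbon}: {promoter_state(promoter, carbon)}")
```

A table of this kind makes the practical trade-off explicit: a PAOX1 construct needs a switch of carbon source to methanol for induction, whereas a PGAP construct is expressed throughout growth.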

CONCLUSION
All genes must be expressed to exhibit their biological activities. How genes are expressed and regulated is a central question in molecular biology, and our knowledge in this area has expanded enormously in recent years. The complexity of gene regulation is compounded by the fact that gene activities reach every corner of biology. Transcription is universally the first step toward expressing a gene, and it is a highly regulated process.
Understanding the molecular mechanisms of transcription regulation is of fundamental importance. For protein-coding genes, post-transcriptional steps, including pre-mRNA processing, mRNA transport and translation, can also play important roles in regulating gene expression.