Screening of Human Epidermal Growth Factor Receptor 2 (HER2) Extracellular Domain for Potential Epitopes by Using Immuno-informatics Tools.

The human epidermal growth factor receptor 2 (HER2) is a well-studied oncoprotein that is overexpressed in a considerable proportion of breast cancer patients. The increased expression of this tyrosine kinase receptor is usually associated with poor clinical prognosis in female patients with breast cancer. In these patients, a specific response of the immune system against HER2 has been observed. This suggests that immunotherapy approaches can be employed to enhance the response of tumor infiltrating lymphocytes against HER2 in a susceptible tumor microenvironment. In this regard, peptide vaccines are considered one of the most affordable immunotherapy modalities due to their low production cost and long-term effect. For this purpose, we have screened the extracellular domain of HER2 crystal for potential B-cells and T-cells epitopes by using different immuno-informatics tools. The output peptides were then refined and filtered according to their antigenicity, allergenicity and vulnerability to selected proteases. Here, we present multiple B-cells and T-cells epitope candidates against HER2 extracellular domain with high antigenicity, low allergenicity and good resistance to selected proteolytic enzymes. These filtered epitopes can be used for the design and construction of an anti-HER2 peptide vaccine for potential use in HER2 positive breast cancer patients. Additionally, the sequences of the linear B-cells epitopes can be used for the design of a monoclonal antibody variable region against HER2 extracellular domain.


Background
Breast cancer is considered the second leading cause of death in female cancer patients; the probability of death in women with breast cancer is about 2.6% [1]. Therapeutic options available for the management of breast cancer include surgery, chemotherapy, radiotherapy, hormonal therapy and immunotherapy. A significant advancement in cancer immunotherapy has been accomplished due to a better understanding of the regulatory roles of immune cells in the tumor microenvironment (TME) [2]. Cancer immunotherapy includes different modalities such as vaccination, adoptive T-cells therapy and chimeric antigen receptor (CAR) T-cells therapy [3]. These forms of immunotherapy are designed to enhance the capacity of tumor infiltrating lymphocytes (TILs) to recognize tumor associated antigen (TAA) and hence halt tumor progression [4].
The human epidermal growth factor receptor 2 (HER2), also known as ErbB2 (Erythroblastosis homolog B2), is a well-known oncoprotein. HER2 is a receptor tyrosine kinase that is overexpressed in 20% to 30% of invasive breast cancer cases [5]. The higher expression of HER2 on the surface of tumor cells is usually associated with poor clinical outcome and more tumor invasiveness. Specific immune system activities against HER2 have been observed in HER2 positive breast cancer patients. Thus, stimulating immune cells to target HER2 can be considered a potential therapeutic tool in HER2 positive breast cancer patients [4].
Unlike other cancer immunotherapy modalities, vaccination represents a cost effective method with the ability to induce a long term memory effect [2]. Peptides derived from different parts of the HER2 molecule have been used to generate several anti-HER2 vaccine candidates. In this regard, one of the most effective and promising vaccine candidates is E75, which was able to induce a cytotoxic T-lymphocytes response against HER2 molecules in clinical trials. This experimental peptide vaccine was derived from the extracellular domain of the HER2 molecule; it has the sequence "KIFGSLAFL" and is located at positions 369-377 [6].
In the current study, we have screened the extracellular domain of HER2 crystal with several immunoinformatics tools to identify potential linear epitopes for B-cells and T-cells. The predicted epitopes were then filtered according to their antigenicity, allergenicity and susceptibility to selected proteases. The final filtered epitope candidates can be used to design novel anti-HER2 peptide vaccine or even monoclonal antibody variable region when considering B-cells epitopes.

Methods
Setting up screening study plan: The general framework for this study is similar to our previously published works [7,8]. A flowchart summary of the screening and filtration steps used to identify potential epitopes can be seen in Figure 1.
Prediction of physicochemical characteristics for the extracellular domain of HER2 crystal: ProtParam online tool was used to predict various physical and chemical features for the extracellular domain of HER2 [9]. For this purpose, FASTA sequence of HER2 extracellular domain crystal with PDB code 6OGE was submitted to the prediction tool. We have reported different physicochemical characteristics for the submitted sequence like molecular weight, isoelectric point and instability index.
Additionally, both allergenicity score and antigenicity potential were predicted for extracellular domain of HER2 sequence by using AllerTOP v. 2.0 and VaxiJen v. 2.0 respectively [10,11]. For prediction of antigenicity, a threshold value of 0.5 was used.
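Among the quantities that ProtParam reports are the total numbers of negatively charged (Asp + Glu) and positively charged (Arg + Lys) residues. A minimal sketch of that count is given below; the function name and the toy sequence are assumptions made for illustration only, and the toy sequence is not the HER2 extracellular domain sequence.

```python
from collections import Counter


def charged_residue_counts(sequence: str) -> tuple[int, int]:
    """Return (negative, positive) residue counts for a one-letter sequence.

    Negative = Asp (D) + Glu (E); positive = Arg (R) + Lys (K),
    which is how ProtParam groups charged residues.
    """
    counts = Counter(sequence.upper())
    negative = counts["D"] + counts["E"]
    positive = counts["R"] + counts["K"]
    return negative, positive


# Toy sequence, purely illustrative (NOT the HER2 sequence).
toy_sequence = "MKDEELRDAE"
negative, positive = charged_residue_counts(toy_sequence)
print(f"negative={negative}, positive={positive}")
```

A sequence with more negatively than positively charged residues is expected to carry a net negative charge near neutral pH, which is consistent with a predicted isoelectric point below 7.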

Prediction of linear B-cells epitopes:
Antigen sequence properties online tool was employed to screen HER2 extracellular domain sequence for continuous B-cells epitopes. This virtual screening tool was accessed through The Immune Epitope Database (IEDB) [12]. Three prediction methods were used to screen the FASTA sequence of the submitted crystal and these are: Emini surface accessibility scale [13], BepiPred-2.0 [14] and Kolaskar and Tongaonkar antigenicity scale [15]. A default threshold was used for screening by these three prediction methods. Then, antigenicity score was calculated for each generated epitope by using VaxiJen v. 2.0 [11]. We have reported only those epitopes with antigenicity score greater than the threshold value of 0.5.
Prediction of T-cells epitopes presented by major histocompatibility complex class I (MHC-I): The sequence of HER2 extracellular domain was submitted in FASTA format to a combined predictor tool accessible through IEDB [12]. This online tool predicts the potential of a peptide to become a T-cells epitope by calculating the peptide's ability to be processed by proteasomes, the transporter associated with antigen processing (TAP) and also MHC-I molecules. For this tool, we have used the NetMHCpan version 3.0 method [16] and a selected panel of 51 human leukocyte antigen (HLA) alleles as seen in Table 1. The length of the generated T-cells epitopes was specified as 9-mer. Finally, we have presented only those epitopes with VaxiJen score greater than 0.5.
Prediction of T-cells epitopes presented by major histocompatibility complex class II (MHC-II): We have used Tepitool, accessible through the IEDB website, to predict T-cells peptides that can be presented through the MHC-II pathway [12,17]. This online tool provides a flexible interface of six steps that facilitates the screening of a submitted FASTA sequence for prediction of peptides that can bind either MHC-I or MHC-II molecules. Here, the extracellular domain of HER2 crystal (PDB: 6OGE) was submitted in FASTA format. Then, a panel of pre-selected MHC-II restricted alleles was used for screening the sequence as seen in Table 1. A default setting was applied to generate a moderate number of potential epitopes with a length of 15-mer. We have used the NetMHCIIpan-3.0 method to predict peptides with potential capacity of MHC-II binding [18]. The generated 15-mer peptides were sorted according to their binding affinity percentile rank with a cutoff value of 2.5. Again, we have only reported those peptides with VaxiJen score greater than 0.5.
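The fixed-length peptides considered in the MHC-I (9-mer) and MHC-II (15-mer) screens, and the percentile rank cutoff, can be illustrated with a short sketch. The function names, the toy sequence and the example ranks are hypothetical; the actual binding predictions come from the IEDB tools, and treating the cutoff as inclusive (rank ≤ 2.5) is an assumption.

```python
def sliding_peptides(sequence: str, length: int):
    """Yield (start, end, peptide) for every window of the given length.

    Positions are 1-based and inclusive, as is usual for epitope tables.
    """
    for i in range(len(sequence) - length + 1):
        yield i + 1, i + length, sequence[i:i + length]


def keep_by_percentile(predictions, cutoff: float = 2.5):
    """Keep (peptide, percentile_rank) pairs with rank <= cutoff.

    The result is sorted so that the best (lowest) rank comes first.
    Whether the cutoff is inclusive is an assumption of this sketch.
    """
    kept = [p for p in predictions if p[1] <= cutoff]
    return sorted(kept, key=lambda p: p[1])


# Toy 20-residue sequence, purely illustrative (NOT the HER2 sequence).
toy = "ACDEFGHIKLMNPQRSTVWY"
nine_mers = list(sliding_peptides(toy, 9))
fifteen_mers = list(sliding_peptides(toy, 15))
print(len(nine_mers), len(fifteen_mers))

# Hypothetical percentile ranks attached to three of the 15-mers.
ranks = [(fifteen_mers[0][2], 3.1), (fifteen_mers[1][2], 0.8), (fifteen_mers[2][2], 2.5)]
print(keep_by_percentile(ranks))
```

A sequence of N residues yields N - k + 1 overlapping peptides of length k, so the number of candidates grows roughly linearly with the length of the submitted domain.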

Prediction of allergenicity and proteolysis susceptibility for the generated B-cells and T-cells epitopes:
AllerTOP v. 2.0 web-based tool was employed to filter and refine the generated epitopes according to their predicted potential to induce allergic reaction [10]. Only those epitopes that are probably non-allergenic were then submitted for proteolysis susceptibility prediction by the PeptideCutter tool [19]. The submitted one letter sequence for each epitope was evaluated for degradation vulnerability by Arg-C proteinase, Asp-N endopeptidase, Caspase-1, Neutrophil elastase and Trypsin. Only those epitopes that are resistant to degradation by ≥ 3 enzymes were then subjected to further consideration.
Evaluating surface accessibility of final filtered B-cells epitopes: Efficient B-cells epitopes must be located in a solvent accessible region of the antigen under evaluation. Surface accessibility of the epitope is essential for successful recognition by B-cells receptors, which are membrane bound immunoglobulins [20,21]. We have used PyMOL version 2.3 to visualize the position of the filtered linear B-cells epitopes within HER2 extracellular domain crystal [22].
Molecular docking of filtered T-cells epitope candidates against MHC-I molecule: The tertiary structure for each sequence of T-cells peptides with MHC-I binding capacity was modelled by using PEP-FOLD 2.0 server [23]. The generated PDB file for each epitope was then docked against HLA-A*02:01 crystal (PDB: 5SWQ) by using PatchDock server [24]. For the docking process, the receptor binding site was defined by the following residue numbers in chain A of HLA-A*02:01 crystal: 63, 66, 77, 99, 146, 147 and 171. Docking results were then further refined by using FireDock server [25]. The interaction between each 9-mer T-cells epitope and HLA-A*02:01 molecule was then visualized by using LigPlot+ v.1.4.5 [26], for the first ranked complex only.
Population coverage of final filtered T-cells epitopes: The sequence for each T-cells epitope presented by MHC-I or MHC-II molecules was submitted to the population coverage prediction tool via IEDB server [12]. This tool can calculate the population response to a specific T-cells epitope in various locations of the world by using HLA genotypic frequencies and also collected data about MHC binding and/or T-cells restriction [27]. Here, the class I and II combined calculation option was employed to predict population coverage for T-cells epitopes presented by MHC-I or MHC-II pathways. We have used a large panel of MHC restricted alleles, as can be seen in Table 1, in order to minimize challenges such as MHC polymorphism and differences in MHC expression frequency among various populations.
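The "resistant to ≥ 3 enzymes" criterion can be sketched in code as follows. This is only an approximation under stated assumptions: the cleavage rules below are simplified versions of the published specificities (for instance, trypsin is taken to cut after K or R unless the next residue is P, and further exceptions are ignored), and all function and variable names are hypothetical. The PeptideCutter tool itself should be used for actual predictions. The E75 sequence quoted in the Background is used purely as an example input.

```python
import re


def _after(residues: str):
    """Rule: cleave after any of the given residues (P1 position)."""
    def rule(p):
        return [i + 1 for i, aa in enumerate(p[:-1]) if aa in residues]
    return rule


def _asp_n(p):
    """Rule: cleave before Asp (D at the P1' position)."""
    return [i for i, aa in enumerate(p) if aa == "D" and i > 0]


def _caspase1(p):
    """Rule: P4 in FWYL, P3 any, P2 in HAT, P1 = D, P1' not in PEDQKR."""
    pattern = re.compile(r"(?=[FWYL].[HAT]D[^PEDQKR])")
    return [m.start() + 4 for m in pattern.finditer(p)]


def _trypsin(p):
    """Simplified rule: cleave after K or R unless followed by P."""
    return [i + 1 for i, aa in enumerate(p[:-1]) if aa in "KR" and p[i + 1] != "P"]


# Simplified rule set (an assumption of this sketch, not PeptideCutter's code).
RULES = {
    "Arg-C proteinase": _after("R"),
    "Asp-N endopeptidase": _asp_n,
    "Caspase-1": _caspase1,
    "Neutrophil elastase": _after("AV"),
    "Trypsin": _trypsin,
}


def resistant_enzymes(peptide: str):
    """Return the enzymes with no predicted cleavage site in the peptide."""
    peptide = peptide.upper()
    return [name for name, rule in RULES.items() if not rule(peptide)]


def passes_protease_filter(peptide: str, min_resistant: int = 3) -> bool:
    """Apply the 'resistant to at least three enzymes' criterion."""
    return len(resistant_enzymes(peptide)) >= min_resistant


print(resistant_enzymes("KIFGSLAFL"), passes_protease_filter("KIFGSLAFL"))
```

Each rule returns the 1-based positions after which a bond would be cut, so an empty list means the peptide is treated as resistant to that enzyme.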
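The idea behind a combined class I and II coverage estimate can be illustrated with a simplified calculation. The sketch below is not the IEDB implementation: it assumes Hardy-Weinberg proportions at each locus and independence between loci, and the allele frequencies are hypothetical placeholders rather than values from Table 1. It also does not reproduce the epitope hit statistics that the IEDB tool reports.

```python
from math import prod


def population_coverage(covered_freqs_by_locus: dict) -> float:
    """Estimate the fraction of individuals carrying >= 1 covered allele.

    covered_freqs_by_locus maps a locus name to the population frequencies
    of the alleles at that locus that are predicted to present at least
    one epitope. For a diploid individual under Hardy-Weinberg proportions,
    the chance of carrying no covered allele at a locus is (1 - p) ** 2,
    where p is the summed frequency of the covered alleles.
    """
    uncovered = prod(
        (1.0 - min(sum(freqs), 1.0)) ** 2
        for freqs in covered_freqs_by_locus.values()
    )
    return 1.0 - uncovered


# Hypothetical allele frequencies, for illustration only.
example = {
    "HLA-A": [0.25, 0.25],   # two covered class I alleles
    "HLA-DRB1": [0.5],       # one covered class II allele
}
print(f"coverage = {population_coverage(example):.4f}")
```

Combining class I and class II loci multiplies the per-locus probabilities of being uncovered, which is why a broad allele panel tends to push the projected coverage towards 100%.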

Results And Discussion
The prediction of physicochemical properties for HER2 extracellular domain crystal, as summarized in Table 2, indicates that the whole crystal can't be used as anti-HER2 vaccine candidate. This is because the extracellular domain of HER2 seems to be unstable as the instability index is greater than 40, also the crystal is probably a non-antigenic protein with antigenicity potential less than 0.5 [28]. Therefore, we have screened HER2 extracellular domain for potential B-cells and T-cells epitopes as alternative strategy.
It is worth mentioning that the extracellular domain of HER2 appears to have a net negative charge, as the number of negatively charged residues is greater than the number of positively charged ones and the predicted isoelectric point is less than 7 [29]. Twenty-two linear B-cells epitopes were identified in HER2 extracellular domain crystal by using three prediction methods, as seen in Table 3. These continuous epitopes have variable length and position, and all have an antigenicity score greater than the threshold value of 0.5. Regarding the prediction of T-cells epitopes that are presented by the MHC-I pathway, 18 peptides are reported in Table 4. All these T-cells epitopes have 9-mer length with an antigenicity score greater than 0.5. These epitopes were ranked according to their total score, which represents a cumulative measure of peptide processing by proteasome, TAP and MHC-I. In general, a higher total score reflects more efficient presentation of a peptide by the MHC-I pathway.
For T-cells epitopes with potential capacity for MHC-II binding, 27 peptide candidates were predicted, as listed in Table 5. All these epitopes have 15-mer length and VaxiJen score greater than 0.5. These T-cells epitopes were sorted based on their predicted percentile rank; a lower percentile rank value is usually associated with better peptide binding to MHC-II molecules [12]. Then, the antigenic B-cells and T-cells epitopes were further filtered and refined based on their potential to induce allergic reaction, as reported in Table 6. Only those peptides that are probably non-allergenic were then assessed for their vulnerability to proteolytic degradation by five selected enzymes, as seen in Table 7. B-cells and T-cells epitopes that are probably non-allergenic and resistant to degradation by ≥ 3 enzymes were then considered for further analysis. The sequence, length and position of these final filtered epitopes are presented in Table 8 as potential candidates.
The position of each potential B-cells epitope, as listed in Table 8, was then visually assessed by PyMOL for surface accessibility. According to Figure 2, the locations of these four linear B-cells epitopes are accessible to solvent. This may facilitate the interaction between these surface peptides in HER2 extracellular domain and membrane bound immunoglobulins in B-cells.
Docking results for the interaction between the filtered T-cells epitopes and HLA-A*02:01 molecule are summarized in Table 9. For these six T-cells epitopes, we have reported the global energy of binding to MHC-I molecules. A lower global binding energy reflects better interaction between the T-cells epitope and the MHC-I binding groove. Table 9 also reports the contribution of attractive Van der Waals (VdW) forces energy, atomic contact energy (ACE) and energy of hydrogen bonds towards the global energy. Finally, the table also shows residues in the MHC-I molecule that may be involved in hydrogen bond interaction with each T-cells epitope. Figure 3 represents a three-dimensional illustration of the interaction between each T-cells epitope and MHC-I molecule. Modelling of the interaction between T-cells epitopes and MHC-I molecule indicates that these six peptides are potential binders. Finally, the world population coverage analysis of T-cells epitopes presented by either MHC-I or MHC-II pathways shows that these nine peptides have excellent coverage against the MHC restricted alleles employed. According to Figure 4, the combination of these epitopes resulted in a projected worldwide coverage of 100% and 47.81 as average number of epitope hits, while the minimum number of epitope hits was 38.08 as recognized by 90% of the population.
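The filtration cascade described above (antigenicity score above 0.5, probable non-allergen, resistance to at least three of the five enzymes) can be summarized as a simple filter. The sketch below is illustrative only: the class, the field names and the candidate records are hypothetical placeholders and do not correspond to entries in Tables 3 to 8.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    sequence: str
    antigenicity: float      # VaxiJen-style score
    allergen: bool           # True if predicted to be a probable allergen
    resistant_enzymes: int   # enzymes (out of five) with no predicted cleavage


def final_filter(candidates, antigenicity_threshold=0.5, min_resistant=3):
    """Keep candidates that pass all three filtration steps in turn."""
    return [
        c for c in candidates
        if c.antigenicity > antigenicity_threshold
        and not c.allergen
        and c.resistant_enzymes >= min_resistant
    ]


# Hypothetical records with placeholder sequences (NOT results of this study).
candidates = [
    Candidate("AAAAAAAAA", 0.72, False, 4),  # passes every step
    Candidate("CCCCCCCCC", 0.41, False, 5),  # fails antigenicity
    Candidate("GGGGGGGGG", 0.90, True, 3),   # fails allergenicity
    Candidate("LLLLLLLLL", 0.65, False, 2),  # fails protease resistance
]
print([c.sequence for c in final_filter(candidates)])
```

Because the steps are applied as a logical conjunction, the order of filtration affects only how many peptides need to be submitted to each tool, not the final list.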

Conclusion
Here, we report multiple B-cells and T-cells epitopes by screening HER2 extracellular domain crystal with various immuno-informatics tools. The final refined epitopes are predicted to be antigenic and non-allergenic, with good resistance against selected proteolytic enzymes. The location of linear B-cells epitopes seems to be solvent accessible; these peptides can be used for the design of antibody variable regions against HER2. On the other hand, T-cells epitopes are believed to be good binders to MHC-I or MHC-II molecules with excellent population coverage. These filtered B-cells and T-cells epitopes can be used for the construction of anti-HER2 peptide vaccine candidate for potential use against HER2 positive breast cancer.

Declarations
Potential competing interests: The authors declare no competing interests.

Figure 1
A concise illustration for study plan steps.

Figure 2
Locations of final filtered B-cells epitopes are highlighted within HER2 extracellular domain crystal.

Figure 4
Worldwide population coverage analysis for filtered T-cells epitopes presented by MHC-I or MHC-II molecules.