Deciphering the Genetic Alterations in SPARC Gene Family and Its Association with HNSCC

Original Research Article. Vidyashri et al.; JPRI, 32(27): 87-97, 2020; Article no.JPRI.59696

The aim of this study is to identify genetic alterations in the SPARC gene family and their association with head and neck squamous cell carcinoma (HNSCC). Head and neck cancer is a set of cancerous lesions arising from the squamous cells of the mucous membranes of the oral cavity, nose, throat, larynx and pharynx. The SPARC gene encodes osteonectin, a cysteine-rich acidic matrix-associated protein, whose expression was found to be higher in metastatic oral squamous cell carcinoma (OSCC). This expression pattern also correlated with the worst pattern of invasion and differentiation of OSCC tumors. In line with these facts, the present study was carried out to ascertain the gene alterations and their consequences. The putative association of gene alterations with HNSCC was also analyzed using computational tools. The Cancer Genome Atlas (TCGA, Firehose Legacy) dataset hosted by the cBioPortal server was used in the present study. The nonsynonymous variants identified were further assessed for protein stability and pathogenicity using the I-Mutant and PROVEAN tools. Gene amplification was observed in the FSTL1 gene, which also showed the highest frequency of gene alterations (5%) among the eight genes. Furthermore, the expression of the FSTL1 gene was found to differ significantly among different grades of HNSCC. In conclusion, the study sheds light on the possible association of the FSTL1 gene of the SPARC family with HNSCC.


INTRODUCTION
Head and neck squamous cell carcinoma (HNSCC) is one of the most common types of cancer, accounting for more than 650,000 cases and 330,000 deaths annually. HNSCC represents the cancerous lesions arising from the squamous cells of the mucous membranes of the oral cavity, nose, throat, larynx and pharynx [1]. According to the GLOBOCAN 2018 survey, the incidence of HNSCC is clustered in specific regions of the world, with a high incidence rate recorded in the South Asian countries [2]. The total incidence of HNSCC has declined since 1975. Symptoms can include unusual bleeding, facial swelling, or trouble breathing. The most common etiological factors of head and neck cancer are tobacco smoking and alcohol consumption. It has been shown that the disease occurs more frequently in men than in women [3]. The human papilloma virus (HPV) has also emerged as a major cause of HNSCC among non-smokers and light drinkers [4].
The SPARC gene encodes osteonectin, also known as basement membrane protein 40 or secreted protein acidic and rich in cysteine. It is a glycoprotein with a molecular weight of 32,000 that has a high affinity for hydroxyapatite and type 1 collagen and regulates bone remineralisation [5]. It binds to structural matrix proteins such as vitronectin and collagen and, as a consequence, regulates cellular interaction with extracellular proteins [6]. Its expression is generally transient and occurs during embryogenesis, tissue repair and remodelling. Each member of the SPARC family possesses a characteristic conserved EC (E-F hand calcium binding) domain with an E-F hand motif [7]. Based on sequence homologies of the EC domains, the gene family can be divided into 4 groups: SPARC and Hevin; SMOC1 and 2; Testican 1, 2 and 3; and FSTL-1 (follistatin-like protein) [8,9]. The most prominent proteases associated with tumourigenesis are the matrix metalloproteinases (MMPs), which mediate proteolysis of the extracellular matrix (ECM). Because of their reported association with multiple forms of cancer, they are considered drug targets in anticancer therapies [8]. SPARC has also been shown to regulate the activity of matrix metalloproteinases; its NH2-terminal region regulates the activation of MMP2 at the cell surface and aids tumour progression [10]. MMPs are a family of enzymes considered to be the primary mediators of ECM proteolysis and turnover. SPARC is also involved in extracellular matrix formation and remodelling and is associated with the initiation of changes in cell shape.
Osteonectin (ON) is generally produced by fibroblasts [11]. ON has also been shown to play a major role in wound healing and is secreted by macrophages in such cases. Proliferation of specific cells, most commonly endothelial cells, is mediated by osteonectin [12]. Overproduction of osteonectin is seen in scleroderma fibroblasts, which indicates a role for osteonectin in tissue remodelling [13]. The human SPARC gene is located on chromosome 5q31-q33. Low expression of SPARC leads to osteogenesis imperfecta and osteoporosis, while upregulation is seen in pulmonary and cardiac fibrosis, cardiac disease, and cancer [14].

Data Source
The Cancer Genome Atlas (TCGA) dataset consists of a total of 528 cases of head and neck squamous cell carcinoma, of which 512 tumor samples had sequencing and copy number alteration data. A full profile of mutated, amplified and deleted genes is available for each sample. Table 1 contains the demographic data of the patients analysed in the study. Oncoprint data were obtained by submitting user-defined queries for the eight genes of the SPARC family, viz., FSTL1, SMOC1, SMOC2, SPARC, SPARCL1, SPOCK1, SPOCK2 and SPOCK3, which were then analysed further [15,16].

Oncoprint Data Analysis
An Oncoprint is a concise graphical summary of genetic alterations. It provides data on multiple genes across a set of tumor samples. Details on the frequency distribution of variations in each of the genes, the variant allele frequency, gene deletions, amplifications, insertions, frameshifts, etc., were recorded [15,16].
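The per-gene alteration frequency reported alongside an Oncoprint can be thought of as a simple tally across samples. The following is a minimal sketch of that tally, not the cBioPortal implementation: the function name, the input layout (a mapping from sample ID to the set of altered genes) and the toy sample IDs are all assumptions made for illustration, and the toy values are not TCGA data.

```python
from collections import Counter

# The eight SPARC family genes queried in the study.
SPARC_FAMILY = ["FSTL1", "SMOC1", "SMOC2", "SPARC",
                "SPARCL1", "SPOCK1", "SPOCK2", "SPOCK3"]


def alteration_frequency(altered_by_sample, genes):
    """Return the percentage of samples with any alteration in each gene.

    altered_by_sample maps a sample ID to the set of genes altered in it
    (hypothetical layout, chosen for this sketch).
    """
    n_samples = len(altered_by_sample)
    if n_samples == 0:
        raise ValueError("no samples supplied")
    gene_set = set(genes)
    counts = Counter()
    for altered in altered_by_sample.values():
        # A gene is counted at most once per sample.
        counts.update(set(altered) & gene_set)
    return {gene: 100.0 * counts[gene] / n_samples for gene in genes}


# Toy input: hypothetical sample IDs and alterations, for illustration only.
toy_samples = {
    "S1": {"FSTL1"},
    "S2": {"FSTL1", "SMOC1"},
    "S3": set(),
    "S4": {"SPOCK3"},
}
toy_freq = alteration_frequency(toy_samples, SPARC_FAMILY)
```

Samples without any alteration still count in the denominator, which is why a gene altered in a small number of tumors yields a low percentage even in a large cohort.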

Protein Stability Analysis
I-Mutant v3.0 is a support vector machine (SVM)-based method for the automated prediction of changes in protein stability caused by single point mutations. The free energy change value (DDG) was used to interpret protein stability. A DDG value > 0 was taken to indicate increased stability and a value < 0 decreased stability of the protein [17].
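The sign convention for DDG described above can be written as a small helper. This is a sketch of the interpretation rule only, not of I-Mutant itself; the function name is hypothetical, and returning "no change" for a DDG of exactly 0 is an assumption, since the text does not define that case.

```python
def classify_ddg(ddg):
    """Interpret a DDG value using the sign rule stated in the text.

    DDG > 0 -> stability increases; DDG < 0 -> stability decreases.
    The label for DDG == 0 is an assumption of this sketch.
    """
    if ddg > 0:
        return "increase"
    if ddg < 0:
        return "decrease"
    return "no change"


# Illustrative DDG values (not taken from the study's Table 3).
example_labels = [classify_ddg(v) for v in (-1.2, 0.4, 0.0)]
```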

PROVEAN (Protein Variation Effect Analyzer)
PROVEAN predicts the impact of an amino acid substitution on the biological function of a protein (Table 3). The present analysis used a user-defined query of missense variants, entered along with the reference sequence obtained from the NCBI database, with the default cut-off value of -2.5. A score less than -2.5 was considered deleterious and a score greater than -2.5 neutral (Table 3) [18].
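The cut-off rule used to read PROVEAN scores can be sketched as follows. This is an illustration of the thresholding step only and does not compute PROVEAN scores; the function name is hypothetical, and treating a score exactly equal to the cut-off as deleterious is an assumption of this sketch, since the text covers only scores below and above -2.5.

```python
PROVEAN_CUTOFF = -2.5  # default cut-off stated in the text


def classify_provean(score, cutoff=PROVEAN_CUTOFF):
    """Label a PROVEAN score as 'deleterious' or 'neutral'.

    Scores below the cut-off are deleterious and scores above it are neutral.
    A score equal to the cut-off is treated as deleterious (assumption).
    """
    return "deleterious" if score <= cutoff else "neutral"


# Illustrative scores (not taken from the study's Table 3).
example_calls = [classify_provean(s) for s in (-4.1, -1.0, 0.3)]
```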

UALCAN Analysis
The expression of the gene in HNSCC was analysed using the UALCAN (http://ualcan.path.uab.edu/cgi-bin/TCGA-survival1.pl?genenam=SPARC&ctype=HNSC) database. Survival curve analysis based on tumor grade and expression profile was performed to examine the putative role of the SPARC gene in HNSC. Gene expression data are expressed as transcripts per million (TPM). The effect of the gene expression pattern on survival was assessed using Kaplan-Meier survival analysis [20].
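For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate behind such survival curves can be sketched in a few lines. This is a generic textbook estimator written for illustration, not the code used by UALCAN; the function name and the toy follow-up data are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if the patient was censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Toy follow-up data (hypothetical, for illustration only).
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored patients contribute to the risk set up to their last follow-up time but never lower the curve themselves, which is what distinguishes this estimate from a simple proportion of survivors.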

RESULTS AND DISCUSSION
Head and neck cancers have affected more than 5.5 million people globally and have caused about 379,000 deaths [21]. The majority of squamous cell carcinomas occur in the head and neck region [22]. Genetic factors have also been recognised as an important predisposing factor for squamous cell carcinoma [23]. Head and neck cancer includes oral, pharyngeal, laryngeal and throat cancer [24]. The primary database used was cBioPortal, which hosts several datasets, of which the TCGA, Firehose Legacy dataset was selected for the present study. The TCGA dataset had information on 528 HNSCC patients (530 samples). The male:female ratio was 2.7:1, with the age at diagnosis ranging from 19 to 90 years. Roughly 98% (515 individuals) had a history of smoking and 67% (352 individuals) a history of alcohol use. The dataset included samples from patients of American (85.6%), African (9.1%) and Asian (2.1%) descent, among others (Table 1); 59% of patients had grade 2 tumors.
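The reported smoking and alcohol percentages follow directly from the stated counts and the cohort size of 528 patients. A short check, with a hypothetical helper name, is:

```python
TOTAL_PATIENTS = 528  # patients in the TCGA HNSCC dataset, as stated in the text


def percent_of_cohort(count, total=TOTAL_PATIENTS):
    """Return count as a percentage of the cohort, rounded to a whole number."""
    return round(100.0 * count / total)


smoking_pct = percent_of_cohort(515)  # individuals with a history of smoking
alcohol_pct = percent_of_cohort(352)  # individuals with a history of alcohol use
```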
Oncoprint data analysis was performed to examine the genetic alterations seen in the gene family. The FSTL1 gene showed a high level of variation (Fig. 1). The analysis revealed gene amplification in 8 genes, of which FSTL1 (5%) harboured the highest frequency of gene amplification. The genes SMOC1, SMOC2, SPARC, SPOCK1, SPOCK2 and SPOCK3 demonstrated deep deletions. The SMOC1 and SPOCK3 genes harboured the highest number of variations/mutations among all the genes identified with alterations (Table 2). Several truncating and missense variants of unknown significance were documented (Fig. 1). The alterations in each gene of the SPARC family were recorded and checked to determine whether they were novel or had already been reported in previous studies. Only missense mutations were taken into consideration here. In the FSTL1 gene, which shows the most variation, the changes noted were amplification, replacement of glutamic acid by lysine at position 280, replacement of aspartic acid by asparagine at position 242 and replacement of asparagine by lysine at position 69 of the amino acid chain.
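The three FSTL1 missense changes described above are often written in compact one-letter notation. The sketch below converts the descriptions in the text to that notation; the helper name and the lookup table (limited to the four residues mentioned) are assumptions made for illustration.

```python
# Standard one-letter codes for the residues named in the text.
AA_ONE_LETTER = {
    "glutamic acid": "E",
    "lysine": "K",
    "aspartic acid": "D",
    "asparagine": "N",
}


def short_notation(ref, position, alt):
    """Format a substitution as <ref><position><alt>, e.g. E280K."""
    return f"{AA_ONE_LETTER[ref.lower()]}{position}{AA_ONE_LETTER[alt.lower()]}"


# FSTL1 missense changes as described in the text.
FSTL1_CHANGES = [
    ("glutamic acid", 280, "lysine"),
    ("aspartic acid", 242, "asparagine"),
    ("asparagine", 69, "lysine"),
]
fstl1_labels = [short_notation(*change) for change in FSTL1_CHANGES]
# fstl1_labels == ["E280K", "D242N", "N69K"]
```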
A total of 5 previously reported variants were identified using gnomAD analysis, viz., SMOC1 (rs899963298, rs766160933), SPARCL1 (4-88415749-GCCTT-G), SPOCK2 (rs1291754865) and SPOCK3 (rs1400335404). All the variants identified in the present study had a minor allele frequency < 0.01, implying that these are rare variants that may be associated with disease risk.
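The rare-variant criterion (minor allele frequency below 0.01) amounts to a simple filter. The sketch below uses placeholder variant labels and made-up frequencies purely for illustration; it does not reproduce gnomAD values for the variants listed above, and the function name is hypothetical.

```python
RARE_MAF_THRESHOLD = 0.01  # threshold stated in the text


def rare_variants(maf_by_variant, threshold=RARE_MAF_THRESHOLD):
    """Return the variant IDs whose minor allele frequency is below threshold."""
    return [vid for vid, maf in maf_by_variant.items() if maf < threshold]


# Placeholder labels and frequencies (hypothetical, not gnomAD data).
toy_mafs = {
    "variant_A": 0.00002,
    "variant_B": 0.15,
    "variant_C": 0.004,
}
toy_rare = rare_variants(toy_mafs)
```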
The stability of a protein largely affects its biological function. Hence, protein stability was assessed for all the nonsynonymous variants identified in the study. The I-Mutant analysis produced a score which was used to interpret the results. The protein stability for each of the nonsynonymous variants was assessed and tabulated (Table 3). The majority of the missense variants observed were found to decrease the stability of the protein product, which could in turn influence the catalytic process. Despite the decreased stability, not all variants were predicted to lead to a deleterious phenotype. Interestingly, the majority of the variants produced a neutral effect, with the exception of a few variants in FSTL1, SMOC1, SMOC2 and SPOCK1 that exhibited deleterious outcomes.
Because the FSTL1 gene exhibits the highest frequency of alterations among the eight genes, its expression was examined across the various grades of HNSC tumour. The comparison of gene expression patterns between different grades of HNSC returned a significant difference between grade 1 and grade 2 (p=6.2 X 10 ). A p value less than 0.05 was considered significant (Fig. 2). The effect of the differential expression pattern of the FSTL1 gene on the survival probability of HNSC patients was also recorded. A significant difference (p value = 0.011) was found between low/medium level expression in African American patients and low/medium level expression in Caucasian patients. The comparison between high-level expression in Caucasian patients and low/medium level expression of the FSTL1 gene in African American patients also returned a significant result (p value = 0.04). In both comparisons, low/medium level expression was associated with a low survival rate in African American patients (Figs 3a and 3b).
A previous study on oral cancer observed that OSCC cells expressed more SPARC than normal cells. A higher level of SPARC expression also correlated with poorer differentiation of the cells and thus a higher grade of cancer [25,26]. An increased number of SPARC-positive cells is seen in leukoplakia, carcinoma in situ and early SCC. It was also found that neither the malignancy nor the migration of the tumor cells was brought about or regulated by the SPARC gene. In the early stage of neoplasia, dysplastic cells tend to induce SPARC, which may improve survival characteristics, but it is not involved in metastasis, which occurs in the final stages of cancer [27]. SPARC expression varies between tumours. SPARC is expressed at high levels in breast cancer [28], neuroblastoma, rectal and brain cancers [29][30][31][32] and at low levels in pancreatic cancer [33,34], bladder cancer [35] and acute leukemia [36]. It was also seen experimentally that SPARC is localised in the stroma adjacent to the tumor in OSCC patients. Also [38]. This can be attributed to the fact that the latter types of cancers have a lesser metastatic property when compared to gliomas and melanomas [39]. Yiu et al. noted that downregulation of SPARC is essential for ovarian carcinogenesis, as SPARC tends to induce apoptosis in ovarian cancer cells [40]. In a search for novel markers of poor prognosis in HNSCC, Chin et al. found SPARC to be highly expressed in the transition from normal mucosa to tumor tissues [41].
However, a study by Neil et al. found that even though SPARC was upregulated in head and neck cancer patients, it tended to increase the accumulation of albumin in the tumor. This increased the effectiveness of albumin-bound paclitaxel, a chemotherapeutic drug, and thus aided better clinical outcomes in SPARC-positive patients with poor prognosis [42]. Another study on the association between SPARC and anticancer drugs by Trieu et al. yielded similar results [43]. Computational approaches have long been used to screen for alterations in candidate genes involved in the crucial biochemical pathways leading to the disease phenotype [44]. Hence, the same methodology was used in this study to examine the alterations and expression of SPARC family genes in patients with HNSC.

CONCLUSION
The markers identified in the present study have to be screened in specific populations to derive an association between the genetic markers and HNSCC. This will open new avenues towards the identification of potential diagnostic and therapeutic leads. The results of this computational analysis therefore require further experimental validation to confirm the association.