Identification of Key Genes Between Lung Adenocarcinoma and Lung Squamous Cell Carcinoma by Bioinformatics Analysis

2576-2508/© AUDT 2020 · http://www.AUDT.org. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.

Non-small cell lung cancer (NSCLC) accounts for approximately 85% of lung cancer (LC) cases [1], and LC ranks first among malignant tumors in both morbidity and mortality [2]. Lung adenocarcinoma (LUAD, approximately 50% of cases) and lung squamous cell carcinoma (LUSC, approximately 30% of cases) are the predominant histological subtypes of NSCLC and together comprise 70% of NSCLC [2][3][4]. To date, surgery remains the recommended treatment for stage Ⅰ-Ⅱ NSCLC patients; the 5-year survival of clinical stage Ⅰ NSCLC and pathological stage Ⅰ NSCLC after surgical resection can reach 68-92% and 73-90%, respectively [5,6]. However, owing to the lack of effective early diagnostic methods, 70% of LC patients are diagnosed at an advanced stage, when the tumor can no longer be surgically resected [7]. Experimental chemotherapy is the main treatment for these patients, although the clinical outcomes are poor.
With the development of next-generation sequencing, treatment strategies for advanced NSCLC patients have evolved toward effective regimens that target specific molecular subtypes and specific genomic abnormalities of tumors [8,9]. Up to 69% of advanced NSCLC patients have a potentially actionable molecular target [10]. Targeted treatment selected according to cell origin and growth pattern produces modest improvements in survival for NSCLC patients [11][12][13][14][15][16]. However, targeted treatment is limited to certain subsets of NSCLC patients, and patients with different histological subtypes and molecular abnormalities respond differently to the same targeted therapy. For example, epidermal growth factor receptor (EGFR)-tyrosine kinase inhibitors (TKIs) have a strong curative effect in patients with EGFR-mutated LUAD but are almost ineffective in EGFR-mutated LUSC [15][16][17]. Further progress is therefore needed to discover new and promising genotype-directed therapeutic targets for more subgroups of patients with NSCLC, which could have a notable impact on outcomes for advanced NSCLC patients.
Because gene microarrays and bioinformatics methods are now broadly used in cancer research, it has become possible to further investigate pathogenesis and to search for potential specific diagnostic markers, prognostic markers and therapeutic targets. It is therefore important to clarify the differentially expressed genes (DEGs), biological processes and molecular mechanisms of LUAD and LUSC from the perspective of biomedicine. In this study, we identified DEGs by comparing the gene expression profiles of tumor tissues from LUSC and LUAD patients in the Gene Expression Omnibus (GEO) database. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were conducted on the DEGs. Furthermore, protein-protein interaction (PPI) network analysis, module analysis, and survival analysis were performed.

Microarray data
The GSE10245 dataset was downloaded from GEO DataSets. It was generated on the GPL570 platform (Human Genome U133 Plus 2.0; Affymetrix, USA) and comprises 58 tissue samples: 40 LUAD and 18 LUSC.

Data analysis and identification of DEGs
R x64 (version 3.4.3) was used to analyze the GSE10245 expression profile dataset, to identify the DEGs in LUAD compared with LUSC, and to plot the heatmap and the volcano plot. The DEG screening criteria in this study were an adjusted P value < 0.05 and |logFC| > 1.
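The screening rule can be illustrated with a minimal sketch. It assumes that per-gene log fold changes and adjusted P values have already been computed. The `GeneStat` record, the `screen_degs` function, and the gene labels are hypothetical and chosen only for illustration; the analysis in this study was run in R, not with this code.

```python
from typing import Iterable, List, NamedTuple, Tuple


class GeneStat(NamedTuple):
    """Hypothetical per-gene summary: log fold change and adjusted P value."""
    gene: str
    log_fc: float
    adj_p: float


def screen_degs(
    stats: Iterable[GeneStat],
    lfc_cutoff: float = 1.0,
    p_cutoff: float = 0.05,
) -> Tuple[List[str], List[str]]:
    """Split genes passing adj. P < p_cutoff and |logFC| > lfc_cutoff
    into upregulated and downregulated lists."""
    up: List[str] = []
    down: List[str] = []
    for s in stats:
        if s.adj_p < p_cutoff and abs(s.log_fc) > lfc_cutoff:
            (up if s.log_fc > 0 else down).append(s.gene)
    return up, down


# Illustrative values only; these are not results from GSE10245.
example = [
    GeneStat("GENE_A", 2.3, 0.001),   # passes, upregulated
    GeneStat("GENE_B", -1.5, 0.010),  # passes, downregulated
    GeneStat("GENE_C", 0.4, 0.001),   # fold change too small
    GeneStat("GENE_D", 3.0, 0.200),   # not significant
]
up_genes, down_genes = screen_degs(example)
```

Both thresholds are strict inequalities, matching the criteria as stated, so a gene with |logFC| exactly equal to 1 is not counted as a DEG.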

GO and KEGG analysis of DEGs
GO and KEGG enrichment analyses were performed with the Database for Annotation, Visualization and Integrated Discovery (DAVID, https://david.ncifcrf.gov/). P < 0.05 was regarded as statistically significant.

PPI network establishment and modules selection
The STRING database (https://string-db.org/) was used to infer the PPI information. All DEG names were submitted to STRING, and interactions with a combined score > 0.9 were considered significant. Molecular Complex Detection (MCODE), a plug-in for Cytoscape (version 3.6.1), was used to screen modules from the PPI network, with the following cut-off criteria: Degree Cutoff = 3, Node Score Cutoff = 0.2, K-Core = 2, and Max. Depth = 100. Modules with an MCODE score > 8 were presented. Eligible hub genes were selected as potential key genes.
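As a rough illustration of the confidence filter, the sketch below keeps only interactions with a combined score above 0.9 and counts the remaining partners of each gene. It assumes the interactions are available as (gene, gene, score) tuples with scores on a 0 to 1 scale. The function name and gene labels are hypothetical, and node degree is shown only as a simple connectivity measure; it does not reproduce the MCODE clustering used in this study.

```python
from collections import Counter
from typing import Iterable, Tuple

Edge = Tuple[str, str, float]  # (gene_a, gene_b, combined_score in [0, 1])


def high_confidence_degree(edges: Iterable[Edge], min_score: float = 0.9) -> Counter:
    """Count, for each gene, its interactions with combined score > min_score."""
    degree: Counter = Counter()
    for gene_a, gene_b, score in edges:
        if score > min_score:
            degree[gene_a] += 1
            degree[gene_b] += 1
    return degree


# Illustrative edges only; these are not STRING results.
example_edges = [
    ("GENE_A", "GENE_B", 0.95),
    ("GENE_A", "GENE_C", 0.99),
    ("GENE_B", "GENE_C", 0.50),  # below threshold, dropped
    ("GENE_C", "GENE_D", 0.90),  # equal to threshold, dropped (strict >)
]
degree = high_confidence_degree(example_edges)
```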

Survival analysis
The Kaplan-Meier plotter online tool (http://kmplot.com/) was applied to assess the effect of the eligible hub genes on survival [18]. A log-rank test P < 0.05 was considered statistically significant.
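For readers unfamiliar with the estimator behind such survival curves, the following is a minimal sketch of the Kaplan-Meier product-limit calculation for a single patient group. The function name and the input values are hypothetical; the online tool's own implementation, its patient grouping, and the log-rank test are not reproduced here.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time with at least one event.

    events[i] is 1 if the event (death) was observed at times[i],
    and 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= leaving
    return curve


# Illustrative follow-up times (months) and event indicators only.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In this toy input the patient censored at month 2 lowers the number at risk without lowering the survival estimate, which is the feature that distinguishes the estimator from a simple proportion of survivors.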

Identification of DEGs
A total of 784 DEGs were identified, of which 517 were upregulated and the remaining 267 were downregulated. A heatmap of the top 100 DEGs (the top 50 upregulated and top 50 downregulated genes) is shown in Figure 1. All 20460 genes are represented on the volcano plot in Figure 2, where red dots represent the 784 DEGs and black dots represent the remaining genes without differential expression.

GO and KEGG analysis of DEGs
All DEG names were submitted to DAVID (https://david.ncifcrf.gov/) for GO and KEGG analysis, which identified 201 statistically significant GO terms and 17 statistically significant KEGG pathways. For biological process (BP), DEGs were significantly enriched in epidermis development, keratinocyte differentiation, negative regulation of endopeptidase activity, establishment of skin barrier, keratinization and peptide cross-linking. For cellular component (CC), DEGs were significantly enriched in the extracellular space, cornified envelope, extracellular exosome, extracellular region, apical plasma membrane and apicolateral plasma membrane. For molecular function (MF), DEGs were markedly enriched in serine-type endopeptidase inhibitor activity and structural molecule activity. KEGG analysis showed that DEGs were significantly enriched in the cell cycle pathway. These results are shown in Table 1.

NSCLC has particularly high intratumoral heterogeneity due to distinct molecular features.
Understanding the molecular mechanisms of NSCLC is critical for diagnosis, treatment and prognosis. In this study, we identified 784 DEGs that were either upregulated or downregulated in LUAD compared with LUSC. Functional annotation revealed 201 statistically significant GO terms and 17 statistically significant KEGG pathways. Through construction of the PPI network, ten hub genes were considered to be the key DEGs. Three downregulated genes (P2RY1, CHRM3, and LPAR3) and two upregulated genes (NMU and S1PR5) were significantly related to worse overall survival in LUAD patients.
For cellular component (CC), DEGs were markedly located in extracellular exosomes, the extracellular space, the cornified envelope, and the extracellular region, as shown in Table 1. Exosomes, which contain proteins, nucleic acids, lipids and other encapsulated contents, play a vital role in cell communication, tumor progression and drug resistance, and in recent years have also come to play an important role in liquid biopsy of NSCLC [19][20][21]. They provide a new basis for early diagnosis, precision medicine and prognosis of NSCLC. However, the study of exosomes is still in its infancy, and whether they can be applied in the clinic remains to be confirmed by further research.
For molecular function (MF), DEGs were mainly enriched in serine-type endopeptidase inhibitor activity and structural molecule activity. Serine-type endopeptidase inhibitors are a subtype of serine protease inhibitors, which are termed serpins. Serpins sit at the intersection of normal physiology and pathology, both ensuring normal biological function and, when dysfunctional, leading to disease [22][23][24][25]. They are involved in a variety of biological processes, such as inflammation, immune response, cell invasion, tissue remodeling and cancer development [22][26][27][28]. In cancers, altered levels of serpins and related proteases are correlated with invasion and metastasis, and can even serve as markers of aggressiveness [29].
For biological process (BP) enrichment, DEGs were mainly involved in epidermis development, keratinocyte differentiation, keratinization and peptide cross-linking. These terms mainly refer to the process by which epidermal cells progress from formation to maturity over time. In vivo, epithelial cells and mesenchymal cells can be interconverted under certain conditions during epithelial cell development. The epithelial-mesenchymal transition (EMT) is critical for cancer progression [30][31][32][33]. EMT is driven by transcription factors, results in enhanced migration and invasive potential of epithelial cells, and is critical for the metastatic spread of epithelial tumors [30,34,35].
P2RY1 encodes a G-protein-coupled receptor for which adenosine diphosphate (ADP) is a physiological agonist [36]. P2RY1 couples to phospholipase C, which triggers Ca2+ release from the intracellular reservoir, leading to reversible platelet aggregation [37]. P2RY1 polymorphisms might participate in the control of various physiological functions, and negative regulation of P2RY1 in the multidrug chemoresistance of bladder cancer cells has been reported [38,39]. An analysis of copy number variation (CNV) frequencies for pharmacogenes using The Cancer Genome Atlas dataset showed that P2RY1 may play a role in determining individual variations in drug responses in NSCLC patients [40]. CHRM3 is one of the muscarinic receptors and plays an important role in many kinds of cancer, such as breast cancer, colon cancer, and lung cancer [41]. Experimental studies revealed that CHRM3 is highly expressed in NSCLC and enhances the expression and activity of matrix metalloproteinase 9 (MMP9) through PI3K/Akt, which regulates the transcription of MMPs and EMT-related genes in NSCLC [42,43]. Overexpression of CHRM3 has also been reported as a prognostic factor in endometrial carcinoma, in which patients with CHRM3 overexpression had a shorter overall survival time [44]. NMU encodes a member of the neuromedin family of neuropeptides and is associated with cell motility, invasion, and resistance to anoikis [45]. In NSCLC pathogenesis, NMU might be a target of the lung metastasis suppressor effect of RhoGDI2, and reconstitution of RhoGDI2 in NSCLC cells lacking its expression resulted in suppression of lung metastasis [46]. You et al. showed that NMU was a candidate gene promoting tumorigenesis and alectinib resistance in NSCLC [47]. Rani et al. reported that NMU was a candidate prognostic indicator and therapeutic target for HER2-overexpressing tumors, in which it was associated with poor outcomes and affected sensitivity to lapatinib, trastuzumab, neratinib and afatinib [45].
LPAR3 mediates several biological functions, such as cell proliferation, migration, and tumor progression [48]. LPAR3 is abnormally expressed in many cancers, such as ovarian cancer, colorectal cancer, and LC [49][50][51]. One study found that aberrant expression of LPAR3 due to aberrant DNA methylation inhibited lung cancer cell migration in rats, which suggested that LPAR3 may be a promising target gene for chemotherapy and chemoprevention in LC [51]. S1PR5 is the most recently deorphanized gene, so its role in NSCLC is less studied. S1PR5 is a constitutively active G protein-coupled receptor with the ability to inhibit cell proliferation and induce cell migration in cancer through Gα12/13-evoked stimulation of Rho, suggesting an inhibitory role of S1PR5 [52][53][54]. S1PR5 has also been shown to have a pro-survival effect by inducing autophagy, and the involvement of S1PR5 activation was demonstrated in cancer by applying small interfering RNA and dihydro-S1P [55,56].

Conclusion
In conclusion, through a comprehensive bioinformatics analysis, our study identified a series of key targets for further research on the molecular mechanisms of LUAD, which provides new therapeutic and prognostic clues for LUAD patients. Beyond bioinformatics exploration, functional studies that further delineate the effects of the key DEGs are needed to validate their specific roles in the pathogenesis of LUAD, as well as in its diagnosis, individualized treatment, and prognosis.