Design of Novel Selective Estrogen Receptor Inhibitors using Molecular Docking and Protein- Ligand Interaction Fingerprint Studies

Aims: The genomic and non-genomic actions of human estrogen receptor α (hERα) play a crucial role in breast epithelial cell proliferation and survival, as well as mammary tumorigenesis. hERα has been proved as a potential target for breast cancer therapy over the last 3 decades. Hence designing novel inhibitors targeting hERα can be a valuable approach in breast cancer therapy. Study Design: In the present study, the goal is to identify novel hERα inhibitors through molecular docking, AI based virtual screening and interaction fingerprint analysis. Place and Duration of Study: Department of Bioinformatics, Sri Venkateswara Institute of Medical Sciences, Tirupati, Andhra Pradesh, India in between July 2021-sep 2021. Methodology: Molecular docking studies were performed using the human estrogen receptor alpha ligand-binding domain in complex with 4-hydroxytamoxifen (PDB: 3ERT) against existing compounds from literature. The best docked existing compound and co-crystal ligand were subjected to shape screening against 28 million compounds and resulted compounds constituted Original Research Article Rajitha et al.; JPRI, 33(46A): 470-483, 2021; Article no.JPRI.75474 471 the hERα inhibitor dataset which was subjected to rigid receptor docking. Further, interaction fingerprint analysis was applied as complimentary method to docking for comparing the similarity of predicted binding poses of proposed leads with that of reference binding pose. Results: Co-crystal ligand (4-OHT) and E99 exhibited better binding affinity among existing ligand set. Rigid receptor docking studies resulted in four lead compounds possessing better docking scores than 4-OHT and E99. Moreover, leads showed favorable absorption, distribution, metabolism, excretion and toxicity properties within the range of 95% FDA approved drugs. Proposed leads showed interactions with binding site residues of hERα similar to that of 4-OHT with better binding affinity. The ability of obtained leads to retrieve actives was validated using receiver operative characteristic (ROC) curve. Conclusion: From above results, it has been observed that the proposed leads have potential to act as novel hERα inhibitors.


INTRODUCTION
Breast cancer refers to the erratic growth and proliferation of cells which undergo changes in their molecular characteristics that originate in the breast tissue [1]. Breast cancer accounts for 14.8% of cancers in Indian women, it is reported that an Indian woman is being diagnosed with breast cancer for every four minutes, and is one of the leading causes of cancer death in women [2][3][4]. Frequently occurring breast cancers are infiltrating ductal carcinoma, infiltrating lobular carcinoma; lobular carcinoma in situ; ductal carcinoma in situ [5].
Breast cancers were categorized into 4 types using genetic information about breast cancer cells. These breast cancer groups include: a) Group-1 (luminal A) tumors that are estrogen receptor (ER) positive and progesterone receptor (PR) positive, but human epidermal growth factor receptor 2 (HER2) negative. b) Group-2 (luminal B) tumors are ER positive, PR negative and HER2 positive. C) Group-3 (HER2 positive) tumors are ER negative and PR negative, but HER2 positive. D) Group-4 (basal-like) is also known as triple-negative breast cancer, that are ER negative, PR negative and HER2 negative [6]. As stated by American Cancer Society, about 2 out of 3 cases of breast cancer are hormone receptor-positive [7]. Among them ERα-positive breast cancer is the most common type (70%). ER-α, a major subtype of ER, plays crucial role in breast cancer cell proliferation, invasion and apoptosis [8]. Most ERα ligands target the ligand binding domain. Two distinct activation functions (AFs), AF-1 of N-terminus, AF-2 in the ligand binding domain promotes transcriptional activation by ERα. AF-1 activity is regulated by growth factors involved in the Mitogen-activated protein kinase pathway, while AF-2 function is responsive to the ligand binding i.e. agonists stimulate AF-2 activity while antagonists does not stimulate AF-2 activity [9].
Selective estrogen receptor modulators (SERMs) are used for preventing the effect of estrogens in breast tissues. These compounds bind to the ER similar to natural estrogens so that estrogen cannot attach to a breast cell, thus preventing estrogen's signals to cell to grow and multiply [10]. Currently available SERMs include raloxifene, lasofoxifene and bazedoxifene [11]. Known SERM tamoxifen binds with ER and prevents growth of breast tissue [12]. Unfortunately, it increases the risk of blood clots in legs and lungs and also increases the risk of developing endometrial cancer [13]. Aromatase inhibitors like exemestane, anastrozole and letrozole inhibit aromatase thus minimizes estrogen production. But, these drugs associated with risk of fractures and osteoporosis. Chemotherapy may also cause various side effects and suppress the production of blood cells causing anemia/bleeding or weakens immune system [14]. Hence, there is a need for development of novel safe and effective anti-estrogenic agents.
Machine learning and artificial intelligence-based methods for ligand discovery have received a significant attention in both academic research and industry due to high cost of clinical trials and low success rate (6.2%), to minimize the cost and also to identify potential therapeutic agents [15]. In modern drug discovery, computational techniques such as molecular docking and virtual screening approaches minimizes the initial level of research to discover potential drug candidates for a disease [16]. Deep learning algorithms can understand the patterns within existing dataset and play vital role with virtual screening. The drug discovery process applies algorithms to understand the pattern of different chemical compounds with diseases [17]. Interaction fingerprints are binary representations of proteinligand complexes and these are valuable tools in virtual screening to identify novel hits that can be eventually optimized to drug candidates [18].
As there are no ideal antiestrogenic agent available for breast cancer therapy, the present study focused on designing novel estrogen receptor inhibitors using molecular docking, artificial intelligence based virtual screening, interaction fingerprint and receiver operative characteristic (ROC) metrics.

Ligand dataset collection and preparation
Selective estrogen receptor inhibitors were retrieved from literature. Structure of co-crystal ligand 4-hydroxytamoxifen (4-OHT) was retrieved from PDB database. Ligand preparation was done using LigPrep application with Epik module of Schrödinger, including enhancement of ligand stereochemical nature, protonation states, developing tautomers and finally energy minimization of the ligand using the force field OPLS_3 at pH7.0 ± 2.0 [19,20].

Protein preparation
Among the available crystal structures, the best resolute X-ray structure of human estrogen receptor alpha (hERα) ligand-binding domain cocrystallized with 4-hydroxytamoxifen (PDB: 3ERT) was considered in this study to propose antagonists through virtual screening, molecular docking and protein-ligand interaction fingerprint analysis. Estrogen receptor crystal structure was imported to Maestro and then prepared using protein preparation wizard of the Schrödinger by addition of hydrogen atoms, bond order and formal charge corrections, removal of atomic clashes, tautomeric alterations and ionization states of protein. The hydrogen bonding network was optimized by reorienting the hydroxyl and thiol groups in the protein and performed other operations that are not part of the X-ray crystal structure refinement process. Optimization of protein was done at neutral pH and then the energy minimization was done by applying optimized potentials for liquid simulations (OPLS-3) force field for all atoms. A receptor grid was generated around inhibitor binding site residues of the estrogen receptor using Glide v7.1. Using the grid region, the undesirable water molecules were removed from the inhibitor binding site of target protein using Protein preparation wizard [21].

Glide XP Docking
Docking is a procedure to predict the preferable binding orientation between the two molecules to form a stable complex. There are many steps in which artificial intelligence methods come into play in the docking, such as feature selection & extraction, classification & regression for the design of scoring functions and recognition of binding sites. Few examples include applications of probabilistic Naïve Bayes methods to improve docking scoring functions, applications of neural networks to virtual screening in combination with docking etc. [22]. GLIDE (grid-based ligand docking with energetics) XP (extra precision) docking procedure was adopted for analyzing the binding affinity between the protein and ligand. The prepared and optimized ligands were flexibly docked into the grid box generated around inhibitor binding site residues of the protein using Monte Carlo-based simulated algorithm minimization method [23]. Glide Score (Gscore) was used for representing binding affinity, binding orientation and ranking. Docking was implemented to retain the best molecules with better binding affinity and good binding orientation without steric clashes. 10 poses were generated during XP docking for each ligand and the best pose was retained after post-docking minimization.

Exploring in-House Library
Based on the XP Gscore, the best docked existing compound and co-crystal ligand were screened against inhouse library molecules containing more than 28 million compounds of eMolecules, ChemBank, ChemPDB, KEGG ligand, Unannotated NCI, Anti-HIV NCI, Drug likeness NCI, AkosGmbh, Asinex, and TimTec databases [24]. The best docked existing compound and co-crystal ligand were imported as hERα inhibitor dataset, to Maestro v11.1 for virtual screening workflow to dock with hERα using Glide with defined pH range 7.0±2.0 by applying Qik Prop v5.1, Lipinski's filter and reactive filter. The best docked existing compound, co-crystal ligand and resulted compounds from shape screening against 28 million compounds constituted the hERα inhibitor dataset.

Virtual Screening
Human ERα inhibitor dataset was docked with inhibitor binding site residues of estrogen receptor. Schrödinger virtual screening workflow uses three flexible docking methods, namely Glide high throughput virtual screening (HTVS), standard precision (SP) and extra precision (XP) docking [25]. The top ligand molecules obtained through XP docking were compared with the cocrystal ligand to propose novel hERα inhibitors which constitute proposed leads. The obtained leads were analyzed for pharmacological descriptors and ADME/T properties.

Generation of Interaction Fingerprints
Docking interactions of proposed leads were further evaluated by interaction fingerprint analysis to observe whether leads exhibited similar interactions when compared with that of co-crystal ligand. Interaction fingerprint was generated for the proposed leads and co-crystal ligand docked complexes that translates 3D structural binding information from a proteinligand complex into a one-dimensional binary string [26]. Each fingerprint represents the "structural interaction profile" of the complex that can be used to organize, analyze, and visualize the rich amount of information encoded in ligand receptor complexes. Value 1 represents the given interaction is established and 0 denotes absence of specific interaction [18].

Evaluation of Virtual Screening
Based on docking scores and interaction fingerprint analysis, the proposed leads were further validated by enrichment factor (EF) metrics. An internal library containing 1005 compounds was created with 1000 decoys from Schrödinger along with proposed leads, and the best co-crystal ligands among existing ligands of hERα as actives. The discriminative ability of the generated leads was evaluated by distinguishing the active compounds from the internal library consisting both actives and decoys. The leads and co-crystal ligand were screened against the internal library. The ligand dataset was docked to the hERα and the results were analyzed using EF at N% of sample size and BEDROC (α 20) metrics for validating whether the best docked compounds are reliable in picking the leads as actives [25].

Ligand sets
The structures of 129 existing estrogen receptor inhibitors were retrieved from literature, and co-crystal ligand 4-OHT structure was retrieved from PDB database for ligand set preparation. All the ligands were prepared using Lig Prep application with Epik module of Schrödinger.

Protein preparation and binding site analysis
The ideal protein 3ERT was selected to define the antagonist behavior of leads because 3ERT has ligand binding domain to show antagonist behavior of 4-OHT. 4-OHT is the active metabolite of tamoxifen which blocks the AF-2 activity by disrupting the topography of the AF-2 surface in hERα thus preventing transcriptional activation. The energies of prepared crystal structure of hERα were minimized by protein preparation wizard. Inhibitor binding site residues were defined around grid generated within the hERα protein for further study as these residues contribute to the structural and functional properties of hERα. The hERα -4-OHT complex inhibitor binding site comprises residues such as Met 343, Leu 346, Thr 347, Ala 350, Glu 353, Trp 383, Leu 384, Leu 387, Arg 394 and Leu 525 within the 4 Å region surrounding 4-OHT. The binding site residues of hERα were defined 4 Å around co-crystallized inhibitor using PD Bsum [27].

Identification of the Best Existing Ligand
About 129 existing ligands from literature and one co-crystal ligand 4-hydroxytamoxifen (4-OHT) were docked with hERα binding site residues. Among all the docked complexes, 4-OHT possesses least XP Gscore of -11.897 Kcal/mol with good bonded and non-bonded interactions with inhibitor binding site residues followed by existing ligands E99 with XP Gscores -9.608 Kcal/mol (Table 1).

Interactions of 4-OHT, Proposed Leads and E99 with hERα
4-OHT formed two hydrogen bonds with side chain residues of Glu 353 and Arg 394 (Fig. 1). RRD strategy revealed that lead 1 conformation is having better binding affinity than the cocrystal ligand and obtained leads. The lead1 showed good binding affinity due to various interactions with inhibitor binding site such as hydrogen bonding, polar, hydrophobic, electrostatic and stearic interactions with key interacting residues with XPGscore of -12.668 Kcal/Mol [25].
formed four hydrogen bonds, in which one hydrogen bond observed with side chain residue of Glu 353 and three hydrogen bonds were observed with backbone residues of Ala 350, Met 528, and Leu 387. Lead 1 also involved in two pipi stacking interactions with Trp 383 (Fig. 2). Literature study revealed that ligand-protein complexes stabilized by multiple aromatic interactions involving tryptophan residue and it was found that pi-pi stacking is essential for the favorable electron correlation, whereas cation-π contacts produce further electrostatic contributions.

Interaction Fingerprint Analysis
A 9 bit interaction fingerprint was generated to describe 3D protein-ligand interaction of proposed leads and 4-OHT in the inhibitor binding site of hERα. Each bit of the fingerprint represents a pharmacophore feature and has 0 or 1, which means presence or absence of the features in the conformation (Table 4)

Lead Validation
Four proposed leads and co-crystal ligand 4-OHT were taken as a query to screen against the internal library of 1000 decoys and actives resulted from XP docking with hERα, yielded 60% of known actives which were within EF1% of internal library that comprises both actives and decoys (EF1% = 60). ROC metric corresponds to the position of actives to the orderly ranked compounds that are linearly arranged among the internal library defined. Truchon and Bayly considered ROC with ≥ 0.7 as a suitable execution measuring value (where, ROC was limited to 0-1) [25]. In the present study all the known actives were retrieved with ROC of 0.98 relative to the hERα inhibitor ligands in the virtual screening. BEDROC (α=20) metrics measures the early recognition enrichment of actives among the ranked compounds from the internal database. BEDROC value of 0.94 is a beneficial value that embodies the magnitude of early recognition of actives from the ranked compounds in the internal library. Deduced EF, ROC and BEDROC values were 100 (EF1%), 0.98, 1.0 (α =20.0) respectively, explains that four leads were efficacious and sufficient in retrieving the active compounds. The enrichment curve graphically represents quality of retrieved actives which were ranked after comparing to decoys in the internal library (Fig. 7).

CONCLUSION
Binding of estrogen such as estradiol to ERα induces tumor growth in most ERαpositive breast cancer cell lines, selective estrogen receptor modulators prevent estrogenresponsive breast cancers by targeting the ERα. In present study, 3ERT crystal structure was selected for docking and interaction fingerprint analysis to design novel estrogen receptor inhibitors. The analysis of hERα-4-OHT complexes revealed key amino acids present in the binding site of hERα that are important for ligand binding. The best docked existing ligand and co-crystal ligand were used for machine learning based virtual screening against in-house library to efficiently find potential lead molecules among millions of compounds and rigid receptor docking was performed with the generated library of hERα inhibitors. Four leads were finally obtained as the outcomes of the study with better binding affinity in terms of XPG scores, good structural properties with molecular contacts, pharmacological properties than the existing compounds and co-crystal ligand. Interaction fingerprint analysis further confirmed that proposed leads exhibited interaction pattern similar to that of co-crystal ligand with increased binding affinity and favorable orientation towards hERα, so that critical binding sites were blocked in turn to reduce the activity of hERα in estrogendependent tumor growth. The obtained leads from rigid receptor docking studies were analyzed for ADME/T properties to propose the leads. The proposed leads were also validated and ranked better than the existing co-crystal ligand and decoys in ROC metrics. As the proposed ligands are novel and can be synthesized in the lab and further they will be subjected for evaluation of antiestrogenic activity by in vitro. Hence these leads were proposed as novel inhibitors against hERα.

DISCLAIMER
The products used for this research are commonly and predominantly use products in our area of research and country. There is absolutely no conflict of interest between the authors and producers of the products because we do not intend to use these products as an avenue for any litigation but for the advancement of knowledge. Also, the research was not funded by the producing company rather it was funded by personal efforts of the authors.

CONSENT
It is not applicable.

ETHICAL APPROVAL
It is not applicable.