Department of Pharmaceutical Chemistry, Ashokrao Mane College of pharmacy, Peth-Vadgaon / Shivaji University 416112, Maharashtra, India
Background: Mutations in the BRCA1 gene significantly contribute to hereditary breast cancer, primarily by disrupting its role in DNA damage repair. The BRCT domain (PDB ID: 1T15) of BRCA1 is particularly crucial for maintaining genomic integrity, making it a compelling target for therapeutic intervention. Objective: This study aimed to identify FDA-approved drugs capable of binding effectively to the mutated BRCA1 (1T15) receptor using structure-based virtual screening and molecular dynamics simulations. Methods: The BRCA1 BRCT domain structure was refined using PDB-REDO, which improved its stereochemical quality and validation scores. Virtual screening was conducted using DrugRep against a library of 91 FDA-approved compounds. The top hits were evaluated for protein-ligand interactions and subjected to molecular dynamics analysis via iMODS, including deformability, B-factor, covariance matrix, and eigenvalue-based assessments. Results: Refinement led to substantial improvements in model quality metrics, including rotamer normality (from the 52nd to the 91st percentile) and bump severity (from the 16th to the 76th percentile). The top-scoring compound, Ponatinib, showed a strong binding affinity of -8.9 kcal/mol and formed stable interactions within the largest cavity (834 ų). MD simulations confirmed the stability and flexibility of the complex, highlighting functionally dynamic regions and potential long-range interactions essential for ligand-induced conformational changes. Conclusion: The study underscores the effectiveness of integrating protein structure refinement, virtual screening, and molecular dynamics for rapid drug repurposing. Compounds like Ponatinib show strong potential for targeting mutated BRCA1 in breast cancer, warranting further experimental validation.
Breast cancer is the most frequently diagnosed malignancy in women and a major cause of cancer-related death in women worldwide [1]. Despite advances in diagnostics and treatment, therapy resistance, disease recurrence, and genetic heterogeneity remain obstacles to effective treatment [2]. Mutations in the Breast Cancer Type 1 Susceptibility Protein (BRCA1) are among the most important genetic causes of breast cancer [3]. BRCA1 is a tumor suppressor protein [4] that is essential for maintaining genomic integrity [5]. It participates in several cellular processes, including DNA damage repair (particularly through homologous recombination), transcription regulation, chromatin remodeling, and cell cycle control [6]. Mutations in this gene impair these protective mechanisms, leading to accumulated DNA damage, genomic instability, and increased cancer susceptibility [7]. The BRCT (BRCA1 C-Terminal) domain is one of the most critical functional regions of BRCA1; it recognizes and binds phosphoprotein partners, such as the BACH1 helicase, in the DNA damage response pathway [8]. The 1T15 crystal structure shows the BRCT domain of BRCA1 bound to a phosphorylated peptide derived from BACH1 [9]. This structural complex is essential for understanding how BRCA1 interacts with other DNA repair proteins and how mutations can disrupt these interactions, causing inefficient DNA repair and cancer susceptibility [10]. Because the BRCT domain is associated with tumor suppression, it is a promising candidate for therapeutic intervention [11]. Conventional drug discovery, however, is slow and expensive [12].
Drug repurposing, that is, finding new therapeutic uses for existing drugs, has become particularly useful in oncology because approved drugs already have established safety profiles, which reduces toxicity-related risk at the early stages of development and saves considerable time and money [13]. The present study used computational drug repurposing to identify FDA-approved drug candidates that can bind and inhibit the BRCT domain of BRCA1 [14]. The 1T15 structure was first subjected to rigorous refinement and validation to ensure its structural accuracy [15]. The refined structure then served as the receptor for structure-based virtual screening against a database of FDA-approved compounds [16]. The resulting hits were analyzed for their binding interactions and dynamic stability using molecular dynamics simulation tools [17]. The aim is to generate new information on the therapeutic targeting of BRCA1-deficient breast cancer using safe, pre-approved drugs, thereby shortening the path to effective treatment [18].
The global burden of the disease calls for new therapies focused on genetically driven cancers such as BRCA1-mutated breast cancer [19]. The rationale for therapeutic exploitation lies in the role of BRCA1 in DNA repair: mutations create specific vulnerabilities that can be targeted [20]. The BRCT domain (1T15) stands out as a target because it mediates most of the protein's interactions, and the available structural data support the rational design of inhibitors [21]. In principle, computational screening of FDA-approved drugs offers a reasonable trade-off between speed, safety, and efficiency [22]. By focusing on repurposing, the study contributes to the broader trend of drug repositioning in oncology, where established drugs such as metformin or aspirin have shown unexpected anticancer potential [23]. Targeting mutated BRCA1 has two sides: restoring function in heterozygous carriers and exploiting synthetic lethality in homozygous mutant tumors [24]. Challenges that could limit the benefit of targeting mutated BRCA1 include mutation heterogeneity, off-target effects, and drug resistance [25]. These challenges can be partly addressed with computational tools that predict the most promising drug-BRCT interactions, although such tools have pitfalls: not all possible mutations are taken into account, and the binding simulations are simplified. Future studies may incorporate machine learning to further refine drug selection or test combination therapies to offset resistance [26]. In summary, this study combines molecular insight into the BRCT domain of BRCA1 with the efficiency of computational screening to find FDA-approved drugs for repurposing [27].
In this way, it aims to address the urgent therapeutic gap for BRCA1-mutated breast cancers, advancing personalized medicine and potentially improving outcomes for patients [28].
2. LITERATURE REVIEW
1. BREAST CANCER
Author(s) & Year | Main Findings | Research Gaps | References
Katsura et al. (2022) | Breast cancer remains the most common malignancy worldwide; medical education lacks sufficient exposure to breast cancer diagnosis and referral pathways. | Need for improved medical training and curriculum updates for early detection and management. | [29]
Kashyap et al. (2022) [Retracted] | Discussed rising global breast cancer incidence, risk factors, and advanced diagnostic tools like AI and liquid biopsy. | Retraction raises concerns over data integrity; further validated research is required on AI and liquid biopsy in diverse populations. | [30]
Wilkinson & Gathani (2022) | Breast cancer incidence correlates with economic development; WHO's Global Breast Cancer Initiative focuses on prevention, early diagnosis, and treatment. | Barriers in implementing global initiatives in low- and middle-income countries due to infrastructure and healthcare access limitations. | [31]
Roy et al. (2023) | Classified breast cancer into four molecular subtypes (Luminal A, Luminal B, HER2-enriched, Basal-like); linked molecular subtypes with imaging features. | More research is needed to refine imaging markers for each molecular subtype and validate their prognostic significance. | [32]
Li et al. (2022) | Progesterone receptors (PR) play a key role in breast cancer prognosis, especially in hormone-positive subtypes like Luminal A. | Further studies are needed to explore PR's role in endocrine therapy resistance and its predictive value. | [33]
2. IMODE AND MOLECULAR DISCOVERY
No. | Authors (Year) | Main Findings | Research Gaps | References
1 | Crampon et al. (2022) | ML and DL are enhancing molecular docking accuracy, improving pose prediction and scoring functions. | Lack of standardized protocols and benchmark datasets for validating ML models across various protein families. | [34]
2 | Ye et al. (2021) | Identified quercetin and related compounds from Yinchen Wuling powder showing stable docking with PTGS2, implying lipid metabolism involvement. | No experimental validation; results are based solely on computational predictions. | [35]
3 | Stanzione et al. (2021) | Docking is now widely integrated with other computational tools to improve hit identification and lead optimization. | Reproducibility issues due to variability in docking protocols; lack of consensus on best practices. | [36]
4 | Luo et al. (2023) | Fenugreek compounds like diosgenin and quercetin target key diabetes-related genes; validated via in vitro glucose uptake assays. | Incomplete pharmacokinetic and toxicity assessment; limited biological validation beyond glucose uptake. | [37]
5 | Dong et al. (2021) | Epimedium components interact with depression-related targets (IL6, AKT1); quercetin and kaempferol show strong docking interactions. | Absence of molecular dynamics simulations and experimental studies to support binding stability and biological relevance. | [38]
3. Drug Repurposing
Drug repurposing has emerged as a promising strategy to accelerate drug development, especially for diseases with limited treatment options. Roessler et al. (2021) emphasized that although over 7000 rare diseases have been identified globally, less than 6% have approved therapies. They highlighted drug repurposing as a cost-effective and time-saving alternative to de novo drug development, offering higher success rates. However, the authors noted significant challenges, including limited commercial incentives, insufficient patient registries, and lack of systematic methodologies for rare disease indications [39].
In oncology, Xia et al. (2024) reviewed the therapeutic potential of repurposed drugs for cancer treatment. They discussed how repurposed agents target both tumor cells and the tumor microenvironment, and how integration with nanotechnology can enhance drug delivery. The authors also stressed the value of repurposed drugs in combination regimens to improve efficacy. Despite these advantages, they identified several gaps such as drug resistance, limited clinical translation, and a lack of robust validation models that slow the transition from preclinical to clinical use [40].
Issa et al. (2021) focused on computational approaches, specifically machine and deep learning, in accelerating drug repurposing for cancer. They described how omics data, combined with artificial intelligence, help uncover novel drug-target relationships. However, they also pointed out that despite computational progress, successful clinical translation remains rare. Challenges include algorithmic bias, overfitting, and the difficulty of experimentally validating in silico predictions [41].
Ghosh et al. (2022) explored drug repurposing in stroke therapy, pointing out the limitations of current FDA-approved options like t-PA and mechanical thrombectomy. They proposed repurposing marketed drugs with known safety profiles as a practical solution to address the narrow therapeutic window of conventional treatments. Nevertheless, they emphasized the lack of comprehensive clinical trials and mechanistic studies as a key barrier to implementation in stroke care [42].
Tan et al. (2023) evaluated the role of real-world data (RWD) in drug repurposing, identifying its growing use in hypothesis generation, validation, and safety assessments. They found that RWD can bridge gaps in traditional clinical trials and help uncover new indications. However, challenges such as fragmented data sources, poor clinical granularity, and susceptibility to bias and confounding were cited as significant limitations. The authors also stressed the need for clear regulatory guidance to support the use of RWD in repurposing efforts [43].
4. Molecular dynamics simulation study
Molecular dynamics (MD) simulation has become an indispensable tool in biomedical research, particularly for exploring the dynamic behavior of biomolecules and understanding drug interactions at the atomic level. Wu et al. (2022) provided a comprehensive review of MD applications in biomedicine, emphasizing its ability to capture conformational changes due to mutations or ligand binding, which are often elusive in traditional experimental methods. They discussed critical steps like protein structure preparation and the selection of appropriate force fields, while also introducing enhanced sampling methods. The review concluded that integrating MD with experimental approaches is essential for unraveling the structure-function relationship in proteins, yet highlighted a gap in the routine use of these techniques in clinical translational studies [44].
Similarly, Guo and Liu (2024) underscored the value of MD simulations in studying protein structures and dynamics. Although their study provided a succinct overview, it lacked in-depth analysis or case-specific demonstrations of MD application, signaling a need for more focused and application-driven MD studies in complex biological systems [45].
In a more specific context, Li et al. (2023) employed MD simulations alongside spectroscopic techniques to investigate the interaction mechanisms between lysozyme and hyaluronan. Their findings revealed that electrostatic interactions predominantly drive complex formation, with significant structural changes observed in lysozyme's α-helix and β-sheet regions. Importantly, residues such as ARG114 were identified as critical interaction points. While the study provided valuable insights into drug encapsulation and delivery, it pointed to a broader research gap in exploring similar complexes for pharmaceutical applications [46].
Expanding the scope, Maleki et al. (2021) discussed the role of molecular simulation in the evolving field of nanomedicine. They proposed that MD simulations will play a crucial role in nanomaterial design and therapeutic delivery strategies. However, the commentary nature of their article highlighted a gap in empirical data, calling for more experimental validations and in silico-nanomedicine integration studies [47].
Lastly, Dong et al. (2024) utilized MD simulations to investigate the atomic-scale synthesis process of graphene via rotating arc plasma. Their findings outlined distinct stages of graphene formation and the influence of precursor concentrations and temperatures. While this study advanced understanding in material sciences, it illustrated the broader utility of MD beyond biomedicine, yet also revealed the gap in cross-disciplinary application of MD principles and tools across fields like nanotechnology and structural biology [48].
3. OBJECTIVES OF THE WORK:
4. GRAPHICAL ABSTRACT
5. MATERIALS AND METHODS
5.1 Selection of compound
After virtual screening, compounds were ranked by their binding energy scores [49]. The drugs with the most favorable binding affinities were selected for subsequent analysis [50]. The top three candidates were Ponatinib (DrugBank ID: DB08901; -8.9 kcal/mol, compared with a literature-reported binding energy of -7.86 kcal/mol), Pimozide (DrugBank ID: DB01100; -8.5 kcal/mol), and Umeclidinium (DrugBank ID: DB09076; -8.4 kcal/mol). These results suggest potentially strong interactions between these molecules and the BRCT domain of BRCA1.
5.2 Protein Structure Retrieval and Preparation
The crystal structure of the BRCT domain of the BRCA1 protein in complex with the phosphorylated peptide of the BACH1 helicase (PDB ID: 1T15) was obtained from the RCSB Protein Data Bank (https://www.rcsb.org/). This structure represents an important region of BRCA1 involved in protein-protein interactions for DNA repair [51].
To ensure structural accuracy and remove probable errors present in the original PDB file, the structure was refined on the PDB-REDO server (https://pdb-redo.eu/) [52]. This procedure optimizes the atomic coordinates and minimizes stereochemical inconsistencies [53].
Further validation was performed on the SAVES v6.0 server, which provides several validation tools: the Ramachandran plot (PROCHECK) for protein backbone dihedral angles, VERIFY3D for 3D-to-1D profile evaluation, and ERRAT for analysis of non-bonded atom-atom interactions. These steps were taken to ensure that the refined protein model was of high quality for the subsequent docking and simulation studies.
5.3 Virtual Screening
To identify possible inhibitors of the BRCT domain, receptor-based virtual screening was carried out with the DrugRep web platform (https://drugrep.org), a tool for structure-based screening against a library of FDA-approved drugs.
The refined 1T15 protein structure was uploaded as the receptor [54]. Screening was performed against the curated database of FDA-approved drugs, which suits drug repurposing because such compounds require less extensive safety validation [55]. Docking was performed using AutoDock Vina or an equivalent molecular docking algorithm within DrugRep, which calculates the binding affinity of each ligand for the receptor in kcal/mol [56]. The more negative the binding energy, the stronger and more favorable the predicted interaction [57].
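As a rough illustration of what these scores imply, a docking score can be read as an approximate binding free energy and converted to an apparent dissociation constant through ΔG = RT ln Kd. The sketch below is not part of the DrugRep workflow; it assumes a temperature of 298.15 K, treats the docking scores as if they were true free energies (which they only approximate), and uses a hypothetical function name.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def estimated_kd_molar(delta_g_kcal_per_mol, temperature_k=298.15):
    """Apparent Kd in mol/L from dG = RT ln(Kd); an order-of-magnitude guide only."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL * temperature_k))


# Scores of the top three hits as reported in Section 5.1 (kcal/mol).
top_hits = [("Ponatinib", -8.9), ("Pimozide", -8.5), ("Umeclidinium", -8.4)]

for name, score in top_hits:
    kd_micromolar = estimated_kd_molar(score) * 1e6
    print(f"{name}: {score} kcal/mol -> apparent Kd of about {kd_micromolar:.2f} uM")
```

Because the relation is exponential, a difference of a few tenths of a kcal/mol between two compounds corresponds to a less than twofold change in apparent Kd, which is within the usual uncertainty of docking scores.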
5.4 Protein–Ligand Complex Analysis
The highest-scoring complexes were visualized and analyzed to understand their molecular interactions [58]. Software tools such as Discovery Studio Visualizer, PyMOL, or BIOVIA were used to probe the binding pocket and visualize the interactions [59].
The interaction types examined were hydrogen bonding, hydrophobic interactions, pi-pi stacking, and salt bridges (if any).
These analyses showed how well each drug fit into the binding site and which interactions it formed with the surrounding amino acid residues.
5.5 Molecular Dynamics Simulation (iMODS)
To study the structural flexibility and stability of the drug-bound BRCT complexes, the iMODS (internal coordinates normal mode analysis) server (http://imods.chaconlab.org/) was used for molecular dynamics simulation. iMODS is a lightweight yet powerful tool for normal mode analysis (NMA) of protein-ligand complexes [60]. The following dynamic attributes were evaluated. Molecular deformability: the expected flexibility of each residue in the structure. B-factor: a measure of atomic fluctuation and thermal motion. Eigenvalues: a measure of motion stiffness; the lower the eigenvalue, the more easily the structure deforms. Variability analysis: the relative motion of atoms during normal modes. Covariance map: correlated and anti-correlated atomic motions. Elastic network model: the interactions and connections among atoms in the protein complex.
The simulations described here provided insight into the conformational dynamics of the ligand-bound BRCT domain and gave an indication of the stability and likely biological efficacy of the chosen drug candidates [61].
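For readers less familiar with normal mode analysis, the attributes listed in this section can be related through the standard elastic-network formulation sketched below. This is a generic textbook form given for illustration, not a transcription of the iMODS implementation. The notation is introduced here: $k_{ij}$ are spring constants, $d_{ij}$ and $d_{ij}^{0}$ are current and reference inter-atomic distances, $\mathbf{H}$ and $\mathbf{T}$ are the Hessian and kinetic energy matrices, $\Theta$ is the absolute temperature, and the modes are assumed to be normalized so that $\mathbf{u}_k^{\top}\mathbf{T}\,\mathbf{u}_k = 1$.

```latex
% Elastic network potential: atom pairs (i, j) joined by harmonic springs
V = \frac{1}{2} \sum_{i<j} k_{ij} \left( d_{ij} - d_{ij}^{0} \right)^{2}

% Normal modes u_k and eigenvalues \lambda_k from the generalized eigenproblem
\mathbf{H}\,\mathbf{u}_k = \lambda_k\,\mathbf{T}\,\mathbf{u}_k

% Energy cost of a displacement of amplitude q_k along mode k
E_k = \frac{1}{2}\,\lambda_k\,q_k^{2}

% Thermal fluctuation along mode k: variance is inversely proportional to the eigenvalue
\left\langle q_k^{2} \right\rangle = \frac{k_{B}\,\Theta}{\lambda_k}
```

Read this way, a low eigenvalue corresponds to a soft, easily excited motion that contributes strongly to the variance, which matches the interpretation of eigenvalues as motion stiffness.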
6. RESULTS
6.1 Structure Refinement and Validation
6.1.1. PDB-REDO Results
Table 1: Comparison of Crystallographic Refinement and Model Quality between Original and PDB-REDO Structures
Metric | Original | PDB-REDO
Crystallographic refinement | |
R | 0.2058 | 0.1724
R-free | 0.2210 | 0.2106
Bond length RMS Z-score | 0.752 | 0.302
Bond angle RMS Z-score | 0.871 | 0.554
Model quality raw scores (percentiles) | |
Ramachandran plot normality | 47 | 56
Rotamer normality | 52 | 91
Coarse packing | 53 | 58
Fine packing | 33 | 37
Bump severity | 16 | 76
Hydrogen bond satisfaction | 32 | 47
Table 1 compares validation metrics for the original and PDB-REDO models and shows that PDB-REDO refinement improves crystallographic structure quality across the reported parameters. R and R-free drop from 0.2058 to 0.1724 and from 0.2210 to 0.2106, respectively, indicating an improved fit to the experimental data, while the bond length and bond angle RMS Z-scores fall from 0.752 to 0.302 and from 0.871 to 0.554, respectively, reflecting improved stereochemistry. The model quality percentiles also improve: rotamer normality rises from the 52nd to the 91st percentile, bump severity from the 16th to the 76th, and Ramachandran plot normality from the 47th to the 56th, with more modest gains in coarse packing (53rd to 58th), fine packing (33rd to 37th), and hydrogen bond satisfaction (32nd to 47th). In summary, PDB-REDO refinement yields a more accurate and biologically realistic model with fewer steric clashes and better side-chain conformations, making it suitable for drug design and molecular simulations.
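The size of each change is easier to compare when computed directly from the Table 1 values. The short sketch below simply subtracts the original value from the PDB-REDO value for every metric; the dictionary layout and the helper name `changes` are illustrative choices, not part of the PDB-REDO output.

```python
# Values transcribed from Table 1 as (original, PDB-REDO).
refinement = {
    "R": (0.2058, 0.1724),
    "R-free": (0.2210, 0.2106),
    "Bond length RMS Z-score": (0.752, 0.302),
    "Bond angle RMS Z-score": (0.871, 0.554),
}
percentiles = {
    "Ramachandran plot normality": (47, 56),
    "Rotamer normality": (52, 91),
    "Coarse packing": (53, 58),
    "Fine packing": (33, 37),
    "Bump severity": (16, 76),
    "Hydrogen bond satisfaction": (32, 47),
}


def changes(table):
    """Return PDB-REDO minus original for each metric, rounded to 4 decimals."""
    return {name: round(after - before, 4) for name, (before, after) in table.items()}


refinement_change = changes(refinement)   # negative values mean a lower (better) score
percentile_gain = changes(percentiles)    # positive values mean a higher percentile
largest_gain = max(percentile_gain, key=percentile_gain.get)

for name, delta in {**refinement_change, **percentile_gain}.items():
    print(f"{name}: {delta:+}")
print("Largest percentile gain:", largest_gain)
```

Sorting the percentile gains in this way makes it clear which validation categories benefit most from refinement and which change only marginally.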
Figure 1: Comparison of Model Quality Metrics between Original and PDB-REDO Structures
The box plots show the model quality metrics R-free, Ramachandran plot Z-score, and rotamer quality Z-score for the Original and PDB-REDO datasets (N=2474 resolution neighbors). For R-free, the PDB-REDO structures have a slightly lower median of approximately 22%, against 23% for the Original, and both distributions span approximately 14%-34%, indicating a small improvement in the fit of the model to the experimental data after refinement. The Ramachandran plot Z-score improves substantially with PDB-REDO, with the median shifting from about -1 to 0; this indicates more favorable backbone conformations, although both datasets vary widely (from -7 to +3). Rotamer quality Z-scores also improve markedly, with the median shifting from about -2 to 0 after PDB-REDO, indicating more favorable side-chain conformations; the ranges are similar (-7 to +3). Overall, PDB-REDO refinement improves model quality on all three measures, with the greatest effect on stereochemical measures such as the Ramachandran and rotamer scores. The refined structures are therefore better suited to structural biology work, although the remaining variability suggests that some structures could be optimized further.
Figure 2: Kleywegt-like Ramachandran Plot Showing Dihedral Angle Distribution of Amino Acid Residues in Protein Structures
Figure 2 is a Ramachandran plot, a standard diagram in structural biology for inspecting the backbone dihedral angles φ (phi) and ψ (psi) of the amino acid residues in a protein structure. The colored contours mark the allowed and favored regions of these angles; red and dark orange areas indicate the most frequently observed angles, which correspond to energetically favorable conformations such as α-helices and β-sheets. The blue-green dots represent individual amino acid residues of the protein under study. Residues that fall in the red and orange areas adopt favorable conformations, which indicates that the protein is well folded and likely stable. Residues outside these regions occupy disallowed areas and may indicate unusual conformations, local strain, or errors in the protein model. This kind of plot is routinely generated during protein structure validation, for example after homology modeling or crystallographic refinement, to assess stereochemical quality and overall stability.
Figure 3: Distribution of Residue Numbers with Error Thresholds at Window Center 1760
Figure 3 is a plot of error values of the kind generated by protein validation programs such as VERIFY3D or ERRAT to assess the quality of a 3D protein model. The x-axis shows the residue number (as the center of a sliding window), and the y-axis shows the error value or confidence score for each segment of the protein. The horizontal lines at the 95% and 99% confidence levels act as benchmarks: regions above these lines may contain structural errors or be less reliable in the model. Here most residues fall below the 95% line, suggesting acceptable quality, although some distinct spikes (marked in yellow) exceed the 95% threshold, particularly around residues ~1740 and ~1770. These peaks point to nearby regions with possibly problematic geometry, misfolding, or modeling imprecision. Such regions warrant further scrutiny and possibly refinement to improve the overall structural reliability of the model.
Figure 4: Scatter Plot of Raw and Average Scores across Categories A1 to A21
Figure 4 is a 3D profile verification plot produced by the VERIFY3D server, which assesses how well the atomic model (3D) agrees with its own amino acid sequence (1D). The x-axis lists the amino acid residues or regions, and the y-axis gives the score, both raw (green dots) and averaged (blue line), indicating how well each residue fits its expected 3D environment; a score above 0.2 is generally accepted as good. The blue line generally stays above the threshold, suggesting good consistency between the protein model and its sequence environment; however, a few dips below 0.2 indicate regions that may be misfolded or structurally unusual and may need refinement. The scattered green dots (raw scores) indicate the quality of individual residues, some of which deviate strongly from the averaged scores, which is not unusual but worth investigating. Overall, the plot indicates a reliable protein model with only minor local deviations, which should be confirmed by further structural validation.
6.2 Virtual Screening Outcomes
Table 2: FDA-Approved Compounds Identified Through Receptor-Based Virtual Screening and Their Binding Scores
Sr. No. | DrugBank ID | Name | Score (kcal/mol)
1 | DB08901 | Ponatinib | -8.9
2 | DB01100 | Pimozide | -8.5
3 | DB09076 | Umeclidinium | -8.4
4 | DB08950 | Indoramin | -8.4
5 | DB00222 | Glimepiride | -8.4
6 | DB12877 | Oxatomide | -8.2
7 | DB00496 | Darifenacin | -8.1
8 | DB06626 | Axitinib | -8.1
9 | DB01070 | Dihydrotachysterol | -8.1
10 | DB04038 | Ergosterol | -8.1
11 | DB00246 | Ziprasidone | -8.1
12 | DB00878 | Chlorhexidine | -8.1
13 | DB01238 | Aripiprazole | -8.0
14 | DB12867 | Benperidol | -7.9
15 | DB00450 | Droperidol | -7.9
16 | DB09195 | Lorpiprazole | -7.9
17 | DB09128 | Brexpiprazole | -7.8
18 | DB00390 | Digoxin | -7.8
19 | DB01012 | Cinacalcet | -7.8
20 | DB01261 | Sitagliptin | -7.7
21 | DB11652 | Tucatinib | -7.6
22 | DB01267 | Paliperidone | -7.6
23 | DB00734 | Risperidone | -7.5
24 | DB05990 | Obeticholic acid | -7.5
25 | DB01184 | Domperidone | -7.5
26 | DB04841 | Flunarizine | -7.4
27 | DB00656 | Trazodone | -7.4
28 | DB01053 | Benzylpenicillin | -7.3
29 | DB08909 | Glycerol phenylbutyrate | -7.3
30 | DB13520 | Metergoline | -7.3
31 | DB08896 | Regorafenib | -7.2
32 | DB08954 | Ifenprodil | -7.2
33 | DB00490 | Buspirone | -7.2
34 | DB08865 | Crizotinib | -7.2
35 | DB04794 | Bifonazole | -7.2
36 | DB00966 | Telmisartan | -7.2
37 | DB11637 | Delamanid | -7.2
38 | DB00813 | Fentanyl | -7.1
39 | DB00843 | Donepezil | -7.1
40 | DB00502 | Haloperidol | -7.1
41 | DB04946 | Iloperidone | -7.1
42 | DB00398 | Sorafenib | -7.1
43 | DB06603 | Panobinostat | -7.1
44 | DB08976 | Floctafenine | -7.1
45 | DB12401 | Bromperidol | -7.1
46 | DB04540 | Cholesterol | -7.1
47 | DB09477 | Enalaprilat | -7.0
48 | DB06152 | Nylidrin | -7.0
49 | DB08930 | Dolutegravir | -7.0
50 | DB00757 | Dolasetron | -7.0
51 | DB05039 | Indacaterol | -7.0
52 | DB12371 | Siponimod | -7.0
53 | DB12978 | Pexidartinib | -7.0
54 | DB06077 | Lumateperone | -7.0
55 | DB00821 | Carprofen | -6.9
56 | DB00823 | Ethynodiol diacetate | -6.9
57 | DB11742 | Ebastine | -6.9
58 | DB01102 | Arbutamine | -6.9
59 | DB12523 | Mizolastine | -6.9
60 | DB04209 | Dequalinium | -6.9
61 | DB01195 | Flecainide | -6.9
62 | DB11963 | Dacomitinib | -6.9
63 | DB13874 | Enasidenib | -6.9
64 | DB09495 | Avobenzone | -6.8
65 | DB00601 | Linezolid | -6.8
66 | DB01132 | Pioglitazone | -6.8
67 | DB01288 | Fenoterol | -6.8
68 | DB08916 | Afatinib | -6.8
69 | DB00913 | Anileridine | -6.8
70 | DB00448 | Lansoprazole | -6.8
71 | DB14914 | Flortaucipir F-18 | -6.7
72 | DB08828 | Vismodegib | -6.7
73 | DB00805 | Minaprine | -6.7
74 | DB06654 | Safinamide | -6.7
75 | DB01079 | Tegaserod | -6.7
76 | DB00414 | Acetohexamide | -6.7
77 | DB13766 | Lidoflazine | -6.7
78 | DB01219 | Dantrolene | -6.7
79 | DB11699 | Tropisetron | -6.6
80 | DB14196 | N-Cyclohexyl-N'-phenyl-1,4-phenylenediamine | -6.6
81 | DB04835 | Maraviroc | -6.6
82 | DB11155 | Triclocarban | -6.6
83 | DB01268 | Sunitinib | -6.6
84 | DB00735 | Naftifine | -6.6
85 | DB04812 | Benoxaprofen | -6.6
86 | DB05016 | Ataluren | -6.6
87 | DB00833 | Cefaclor | -6.5
88 | DB13944 | Testosterone enanthate | -6.5
89 | DB11581 | Venetoclax | -6.4
90 | DB00563 | Methotrexate | -6.4
91 | DB01604 | Pivampicillin | -6.3
6.3 Cross Validation of FDA Approved Compounds
Table 3: Cross-Validation of FDA-Approved Compounds Identified Through Receptor-Based Virtual Screening, with Their Binding Scores and Docking Scores
ID | Name | Score | Docking Score
DB08901 | Ponatinib | -8.9 | -8.9
DB01100 | Pimozide | -8.5 | -8.3
DB09076 | Umeclidinium | -8.4 | -7.6
DB08950 | Indoramin | -8.4 | -8.1
DB00222 | Glimepiride | -8.4 | -8.7
DB12877 | Oxatomide | -8.2 | -8.4
DB00496 | Darifenacin | -8.1 | -8.2
DB06626 | Axitinib | -8.1 | -8.1
DB01070 | Dihydrotachysterol | -8.1 | -7.8
DB04038 | Ergosterol | -8.1 | -7.8
Table 3 lists the IDs, names, screening scores, and docking scores of the ten top-ranked compounds; both values represent predicted binding affinities for the target protein. Ponatinib (DB08901) has the best score and docking score, -8.9 in both cases, which reflects the strongest predicted binding. For several other compounds the two values differ slightly: Pimozide (-8.5 vs -8.3) and Umeclidinium (-8.4 vs -7.6) have a weaker docking score than screening score, whereas Glimepiride (-8.4 vs -8.7) has a stronger one. All remain strong binders, and the differences indicate modest variation between the two predictions. The scores range from -8.1 to -8.9 and the docking scores from -7.6 to -8.9, a narrow spread in which most of the tested compounds performed well. Umeclidinium shows the largest discrepancy between score and docking score (0.8), and Ergosterol a smaller one (0.3), which could reflect difficulty in predicting their binding poses or interactions. These data could be useful for further optimization or for lead selection in drug development, with Ponatinib being the best candidate.
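The agreement between the two columns of Table 3 can be checked with a few lines of arithmetic. The sketch below computes the docking score minus the screening score for each compound and reports the largest absolute gap; the variable names are illustrative, and the values are copied from Table 3.

```python
# (name, screening score, docking score) transcribed from Table 3.
table3 = [
    ("Ponatinib", -8.9, -8.9),
    ("Pimozide", -8.5, -8.3),
    ("Umeclidinium", -8.4, -7.6),
    ("Indoramin", -8.4, -8.1),
    ("Glimepiride", -8.4, -8.7),
    ("Oxatomide", -8.2, -8.4),
    ("Darifenacin", -8.1, -8.2),
    ("Axitinib", -8.1, -8.1),
    ("Dihydrotachysterol", -8.1, -7.8),
    ("Ergosterol", -8.1, -7.8),
]

# Positive gap: docking score is weaker (less negative) than the screening score.
gaps = {name: round(docking - screening, 1) for name, screening, docking in table3}
largest_gap = max(gaps, key=lambda name: abs(gaps[name]))

for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f}")
print("Largest absolute gap:", largest_gap)
```

A signed gap keeps the direction of the disagreement visible, so compounds that dock more strongly on cross-validation are not confused with those that dock more weakly.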
Figure 5: Structural Analysis of Binding Cavities with CurPocket IDs and Their Respective Volumes
Figure 5 shows the five predicted binding pockets of the protein, labeled with CurPocket IDs C1 to C5, with cavity volumes of 834, 229, 124, 108, and 96 ų, respectively. The protein surface is rendered in translucent gray, the internal binding cavities are highlighted in pink, and the yellow ribbons indicate the secondary structure of the protein, probably alpha helices and loops. The cavity volume ranges from 834 ų in C1 to 96 ų in C5, indicating varied ligand-binding potential across the protein. Larger cavities such as C1 could accommodate bulkier ligands or substrates and may play a role in catalytic activity or allosteric regulation, whereas smaller cavities such as C5 are likely to be more selective, accommodating small molecules or ions. This variability suggests that the protein can engage ligands of different sizes and chemical properties, which is relevant to structure-based drug design and protein engineering.
Figure 6: Molecular Docking Analysis of Ligand Binding within the Active Site of a Protein Target
Figure 6 illustrates the docking of a small-molecule ligand into the active site of the protein. On the left, a 3D model shows the binding pocket, with the ligand (yellow) fitted into a cavity lined with key amino acid residues (atoms colored red, blue, and white by element, such as oxygen, nitrogen, and carbon). On the right, a zoomed-in 2D schematic of the binding site details interactions such as hydrogen bonds (dashed lines) and hydrophobic contacts around the ligand, involving residues such as Tyr155, Lys157, and Asn1774. The analysis suggests that binding is likely to be strong owing to the shape and chemical complementarity between the ligand and the active site, which is relevant both to drug design and to understanding the underlying biochemical pathway.
Figure 7: Molecular Interaction Diagram of Ligand Binding within Protein Active Site
Figure 7 shows a 2D protein-ligand interaction diagram of the binding pocket and the key amino acid residues in contact with the ligand. The central hexagonal ring is likely part of the ligand, an aromatic compound, and it is surrounded by residues such as VAL A:1703, ILE A:1860, LEU A:1701, PRO B:9, SER B:6, ASN A:1874, and PRO A:1659. Colored lines distinguish the interaction types: green dashed lines indicate hydrogen bonds (for example, with SER B:6), orange lines possibly indicate hydrophobic interactions (for example, with VAL A:1703), and pink lines may represent π-alkyl or other interactions involving the ligand's aromatic ring (for example, with PRO A:1659). The spatial arrangement of residues around the ligand suggests a well-defined binding pocket in which a combination of polar and non-polar interactions stabilizes the ligand, information that is useful for understanding the protein's function and for designing inhibitors.
Table 4: Potential Mechanism of Action and Relevance of High-Ranking Compounds
Sr. No. | Compound | DB ID | Score | Potential Mechanism of Action | Relevance
1 | Ponatinib | DB08901 | -8.9 | Third-generation tyrosine kinase inhibitor; inhibits BCR-ABL (including T315I), VEGFR, PDGFR, FGFR, SRC, KIT, RET, FLT3, blocking phosphorylation. | FDA-approved for CML, Ph+ ALL (resistant cases); potential in other cancers (e.g., breast, lung). Severe cardiovascular/hepatotoxicity risks require careful monitoring.
2 | Pimozide | DB01100 | -8.5 | Dopamine D2, D3, D4 antagonist; blocks 5-HT7, hERG channels, and voltage-gated calcium channels, modulating neurotransmission. | FDA-approved for Tourette syndrome; used in schizophrenia, delusional disorders. Potential anticancer effects (preclinical). QT prolongation, extrapyramidal symptoms limit use.
3 | Umeclidinium | DB09076 | -8.4 | Long-acting muscarinic antagonist (LAMA); binds M1–M5 receptors (high M3 affinity), inhibiting acetylcholine-mediated bronchoconstriction. | FDA-approved for COPD maintenance; improves lung function, reduces exacerbations. Minimal systemic side effects; contraindicated in acute bronchospasm. Limited to respiratory use.
4 | Indoramin | DB08950 | -8.4 | Selective alpha-1 adrenergic antagonist; inhibits norepinephrine-induced vasoconstriction, causing vasodilation; mild antihistamine effects. | Used for hypertension, BPH (improves urinary flow). Less common due to better alternatives (e.g., tamsulosin). Sedation, dizziness limit use; caution in heart failure/hypotension.
5 | Glimepiride | DB00222 | -8.4 | Second-generation sulfonylurea; binds SUR1 on ATP-sensitive K+ channels, triggering insulin release; enhances insulin sensitivity. | FDA-approved for type 2 diabetes; effective glycemic control. Risk of hypoglycemia, especially in elderly/renal impairment. Potential cardiovascular benefits (unvalidated).
6 | Oxatomide | DB12877 | -8.2 | Antihistamine; blocks histamine H1 receptors; also inhibits mast cell mediator release, reducing allergic responses. | Used for allergic rhinitis, urticaria, asthma (in some countries). Limited global availability; side effects include sedation, weight gain. Potential in allergic disorders but less common than newer antihistamines.
7 | Darifenacin | DB00496 | -8.1 | Selective M3 muscarinic receptor antagonist; reduces bladder smooth muscle contraction, decreasing urinary urgency and frequency. | FDA-approved for overactive bladder; improves urinary symptoms. Minimal CNS effects due to low brain penetration. Side effects: dry mouth, constipation. Limited to urological use.
8 | Axitinib | DB06626 | -8.1 | Tyrosine kinase inhibitor; selectively inhibits VEGFR-1, -2, -3, PDGFR, KIT, reducing angiogenesis and tumor growth. | FDA-approved for advanced renal cell carcinoma. Potential in other VEGF-driven cancers. Side effects: hypertension, fatigue, diarrhea. Requires monitoring for cardiovascular risks.
9 | Dihydrotachysterol | DB01070 | -8.1 | Vitamin D analog; activates vitamin D receptor, increasing calcium absorption and bone mineralization. | Used for hypocalcemia, hypoparathyroidism, renal osteodystrophy. Less common today due to newer vitamin D analogs. Risk of hypercalcemia requires monitoring. Limited to metabolic bone/calcium disorders.
10 | Ergosterol | DB04038 | -8.1 | Sterol precursor to vitamin D2; component of fungal cell membranes, targeted by antifungals (e.g., amphotericin B). | Not a therapeutic agent; studied in fungal biology, antifungal drug development. Relevance in research for targeting fungal infections. No direct clinical use in humans.
11 | Ziprasidone | DB00246 | -8.1 | Atypical antipsychotic; antagonizes dopamine D2, serotonin 5-HT2A receptors; partial 5-HT1A agonist, modulating mood and psychosis. | FDA-approved for schizophrenia, bipolar mania. Balances efficacy with lower metabolic side effects. QT prolongation risk requires ECG monitoring. Used in psychiatry but not first-line.
6.4. Molecular Dynamics Simulation Analysis
Figure 8: Deformability Profile Across Protein Atom Indices
Figure 8 is a deformability plot, most likely produced by normal mode analysis, showing the deformability of the protein as a function of atom index (0 to 200). The x-axis gives the atom index and the y-axis the deformability score (0 to 1). The jagged green trace shows varying levels of deformability, with peaks near atom indices 50, 100, and 150. These peaks mark regions of higher deformability, suggesting that the corresponding atoms or residues are flexible and probably take part in functional movements or conformational changes. Such a profile helps identify dynamic parts of the protein, such as loops or hinges, and their possible role in binding or catalysis becomes clearer when the profile is combined with the protein-ligand interaction data from the docking studies.
Figure 9: Comparison of B-Factor Profiles from Normal Mode Analysis (NMA) and PDB Data across Protein Atom Indices
Figure 9 compares B-factor profiles, a measure of atomic displacement or flexibility, along the protein atom indices (0 to 200), with values from Normal Mode Analysis (NMA) superimposed on those from the Protein Data Bank (PDB, in gray). The x-axis gives the atom index and the y-axis the B-factor (0 to 1). The two profiles follow broadly similar trends, with peaks near indices 50, 100, and 150 where the protein is more flexible, although NMA predicts slightly higher B-factors in some regions, indicating more flexibility in theory than is seen in the experimental PDB data. The comparison shows how NMA complements experimental data by highlighting dynamic regions that may correspond to functional sites, such as the ligand-binding site identified in the docking study and the regions of conformational change indicated by the deformability analysis.
Figure 10: Eigenvalue Distribution Across Normal Modes in Normal Mode Analysis (NMA)
The histogram in Figure 10 plots the eigenvalue of each normal mode against its mode index; the first eigenvalue is 2.9417979e-04. The x-axis covers mode indices 0 to 20 and the y-axis the eigenvalue magnitude, which peaks above 13. The eigenvalues start small and increase gradually, with the largest values around mode indices 15 to 17. In NMA, the eigenvalue of a mode reflects its stiffness, that is, the energy required to deform the structure along that mode. The low-index modes with the smallest eigenvalues therefore correspond to the most easily excited, collective motions, while the higher-eigenvalue modes describe stiffer, more localized motions.
Figure 11: Variance Contribution of Normal Modes in Normal Mode Analysis (NMA)
The histogram in Figure 11 shows the variance explained by each mode, with mode indices 0 to 20 on the x-axis and variance (in percent, up to 100%) on the y-axis. The purple bars show a rapid rise in explained variance over the first few modes (0 to 5), which account for a large part of the total variance. As the mode index increases, the gain levels off, and the cumulative variance plateaus near 100% around mode 15. The initial modes therefore capture most of the variation in the system, a pattern also seen in dimensionality-reduction techniques such as PCA, where the early components represent the most significant patterns in the data and later ones yield diminishing returns.
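The link between the eigenvalue and variance plots can be illustrated numerically. The sketch below assumes, as is common in NMA, that the variance attributed to a mode is inversely proportional to its eigenvalue. The eigenvalues used here are hypothetical placeholders, not the values from this study, and `variance_profile` is an illustrative helper rather than an iMODS function.

```python
def variance_profile(eigenvalues):
    """Return per-mode and cumulative variance (%), assuming variance ~ 1/eigenvalue."""
    inverse = [1.0 / ev for ev in eigenvalues]
    total = sum(inverse)
    individual = [100.0 * value / total for value in inverse]

    cumulative = []
    running = 0.0
    for share in individual:
        running += share
        cumulative.append(running)
    return individual, cumulative


# Hypothetical eigenvalues, increasing with mode index (not taken from the study)
eigenvalues = [1.0, 2.0, 4.0, 8.0]
individual, cumulative = variance_profile(eigenvalues)

for mode, (ind, cum) in enumerate(zip(individual, cumulative), start=1):
    print(f"mode {mode}: individual {ind:5.1f}%  cumulative {cum:5.1f}%")
```

Under this assumption the softest modes dominate the variance, which is why the cumulative curve rises steeply at first and then flattens.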
Figure 12: Cross-Correlation Matrix of Residue Fluctuations in Protein Dynamics
The heatmap in Figure 12 visualizes correlations between residue fluctuations. Both axes are labeled "Residue Index" (0 to 200), and the color scale runs from blue (negative values, anticorrelated motion) to red (positive values, correlated motion). A prominent red diagonal running from the bottom left to the top right reflects the perfect correlation of each residue with itself and the strong local correlation between neighboring residues. Off the diagonal, patches of mixed red and blue indicate varying degrees of correlated and anticorrelated motion between residues that are distant in sequence. This pattern is typical of protein dynamics analyses: the diagonal captures local coupling, while the off-diagonal regions reveal long-range couplings that help explain structure-function relationships in the system.
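A correlation map of this kind is usually obtained by normalizing a covariance matrix of residue fluctuations, so that each entry lies between -1 and 1 and the diagonal equals 1. The following is a minimal sketch of that normalization; the 3×3 covariance matrix is a hypothetical example, and the function name is illustrative only.

```python
import math


def correlation_from_covariance(cov):
    """Normalize a covariance matrix: corr[i][j] = cov[i][j] / sqrt(cov[i][i] * cov[j][j])."""
    n = len(cov)
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)] for i in range(n)]


# Hypothetical covariance of fluctuations for three residues (not from the study)
cov = [
    [4.0, 2.0, -1.0],
    [2.0, 9.0, 0.0],
    [-1.0, 0.0, 1.0],
]
corr = correlation_from_covariance(cov)

for row in corr:
    print("  ".join(f"{value:+.2f}" for value in row))
```

In this example residues 1 and 2 move in a correlated way (positive entry, red in such a heatmap), residues 1 and 3 are anticorrelated (negative entry, blue), and residues 2 and 3 are uncorrelated.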
Figure 13: Atomic Contact Map for Molecular Interaction Analysis
Figure 13 is a pairwise map in which both the x-axis and the y-axis are labeled with the atom index (0 to 200). Each point marks a pair of atoms, and the color bar on the right, running from 0.0 (light gray) to 60.0 (black), encodes the intensity assigned to that pair, with darker shades indicating stronger values. A dense band of dark points follows the diagonal from the origin (0, 0) to the opposite corner (200, 200). Rather than a statistical correlation between two variables, this band reflects the fact that atoms close together in the sequence are also close together in space and therefore interact most strongly. The tight clustering along the diagonal, with little off-diagonal scatter, suggests that contacts in this structure are dominated by local, near-neighbor interactions, while any off-diagonal points would correspond to contacts between regions that are distant in sequence but adjacent in the folded structure. Such a map complements the cross-correlation analysis by showing which atom pairs are physically connected and hence which regions are likely to move together.
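The idea behind a contact map can be sketched with a simple distance cutoff: two atoms are marked as in contact when they lie within a chosen distance of each other. The example below is only an illustration. The 8.0 Å cutoff and the straight-line coordinates are assumptions, and the weighting actually used to produce Figure 13 may differ.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Return a symmetric 0/1 matrix marking atom pairs within `cutoff` (in Å)."""
    n = len(coords)
    contacts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts[i][j] = contacts[j][i] = 1
    return contacts


# Hypothetical Cα positions spaced 3.8 Å apart on a straight line (not real data)
coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
cmap = contact_map(coords)

for row in cmap:
    print(" ".join(str(value) for value in row))
```

For this extended chain only atoms one or two positions apart fall within the cutoff, so the contacts form a narrow band next to the diagonal. In a folded protein, additional off-diagonal contacts appear where distant parts of the chain pack against each other.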
CONCLUSION
The combination of protein structure refinement, virtual screening, and molecular dynamics (MD) analysis presented in this study illustrates how computational methods can support structural biology and drug discovery. Refinement with PDB-REDO improved the quality of the crystallographic structure, as shown by reduced R and R-free values, better stereochemistry (bond length and angle RMS Z-scores), and gains in model quality metrics such as rotamer normality (52nd to 91st percentile) and bump severity (16th to 76th percentile). These improvements yield a more accurate and biologically relevant protein model, which is essential for drug design and molecular simulation. Virtual screening of FDA-approved compounds identified high-affinity binders such as Ponatinib (score: -8.9), and cross-validation supported the binding predictions. The analysis of binding cavities (CurPocket IDs C1 to C5) revealed varied ligand-binding capacity, with larger cavities such as C1 (834 ų) suited to bulkier ligands and smaller cavities such as C5 (96 ų) suited to more selective interactions, indicating that the protein can accommodate a range of ligands. Taken together, these findings demonstrate the value of integrating computational refinement and screening when prioritizing lead compounds for therapeutic development. Protein dynamics and flexibility were further assessed by simulation. Normal mode analysis and B-factor profiling identified flexible regions around indices 50, 100, and 150, which are likely to be important for functional motions, as supported by the deformability and cross-correlation analyses. The eigenvalue distribution and the variance contributions from NMA highlighted the dominant motions that underlie conformational changes during ligand binding.
In addition, atomic contact maps and residue fluctuation analyses provided insight into the local and long-range interactions that contribute to structural stability and functional diversity. However, some model quality metrics (such as Ramachandran and rotamer Z-scores) remain variable, and residual structural errors, for example around residues ~1740 and ~1770, indicate regions that still need optimization. Future studies should therefore focus on improving these problem regions through MD-based and experimentally validated refinement, and on cross-disciplinary efforts, for example in nanomedicine or materials science, to make full use of MD in addressing complex biological and pharmaceutical problems.
REFERENCES
Sudarshan Bachate*, Dyaneshwar Mane, R. B. Ghotane, Tejaswini Biraje, Computational Drug Repurposing for Targeting the BRCA1 BRCT Domain in Breast Cancer: Insights from Virtual Screening and Molecular Dynamics Simulations, Int. J. of Pharm. Sci., 2025, Vol 3, Issue 6, 1032-1054. https://doi.org/10.5281/zenodo.15603375