Structure-guided drug design depends on the correct identification of ligands in crystal structures of protein complexes. However, the interpretation of electron density maps is challenging and often burdened with confirmation bias. Ligand identification can be aided by automatic methods such as CheckMyBlob, a machine learning algorithm that learns to generalize ligand descriptions from sets of moieties deposited in the Protein Data Bank. Here, we present the CheckMyBlob web server, a platform that can identify ligands in unmodeled fragments of electron density maps or validate ligands in existing models. The server processes PDB/mmCIF and MTZ files and returns a ranking of the 10 most likely ligands for each detected electron density blob, along with interactive 3D visualizations. Additionally, for each prediction or validation, a plugin script is generated that enables users to conduct a detailed analysis of the server results in Coot. The CheckMyBlob web server is available at https://checkmyblob.bioreproducibility.org.
As part of the global mobilization to combat the present pandemic, almost 100 000 COVID-19-related papers have been published and nearly a thousand models of macromolecules encoded by SARS-CoV-2 have been deposited in the Protein Data Bank within less than a year. The avalanche of new structural data has given rise to multiple resources dedicated to assessing the correctness and quality of structural data and models. Here, an approach to evaluate the massive amounts of such data using the resource https://covid19.bioreproducibility.org is described, which offers a template that could be used in large-scale initiatives undertaken in response to future biomedical crises. Broader use of the described methodology could considerably curtail information noise and significantly improve the reproducibility of biomedical research.
Cytochrome P450 monooxygenase CYP51 (sterol 14α-demethylase) is a well-known target of the azole drug fluconazole for treating cryptococcosis, a life-threatening fungal infection in immune-compromised patients in poor countries. Studies indicate that mutations in CYP51 confer fluconazole resistance on cryptococcal species. Despite the importance of CYP51 in these species, few studies on the structural analysis of CYP51 and its interactions with different azole drugs have been reported. We therefore performed in silico structural analysis of 11 CYP51s from cryptococcal species and other Tremellomycetes. Interactions of the 11 CYP51s with nine ligands (three substrates and six azoles) were modeled by Rosetta docking, using 10,000 combinations for each CYP51-ligand complex (11 CYP51s × 9 ligands = 99 complexes), and hierarchical agglomerative clustering was used to select the complexes. A web application for visualization of the CYP51s' interactions with ligands was developed (http://bioshell.pl/azoledocking/). The study results indicated that Tremellomycetes CYP51s have a high preference for itraconazole, corroborating the in vitro effectiveness of itraconazole compared to fluconazole. Amino acids interacting with different ligands were found to be conserved across CYP51s, indicating that the procedure employed in this study is accurate and can be automated for studying P450-ligand interactions to cater for the growing number of P450s.
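To illustrate the clustering step mentioned in this abstract, here is a minimal sketch of hierarchical agglomerative clustering with average linkage, written in plain Python. It is not the authors' pipeline: the function name `agglomerative_clusters`, the distance cutoff, and the toy 2D "pose" coordinates are all assumptions made for illustration. A real docking workflow would more likely compare decoys by ligand RMSD and would use an optimized library.

```python
from itertools import combinations
from math import dist


def agglomerative_clusters(points, cutoff):
    """Average-linkage agglomerative clustering (illustrative sketch).

    Starts with one cluster per point and repeatedly merges the two
    closest clusters. Merging stops once the closest pair is farther
    apart than `cutoff`. Returns clusters as lists of point indices,
    largest cluster first.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise distance between members of two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        ia, ib = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[ia], clusters[ib]) > cutoff:
            break
        merged = clusters[ia] + clusters[ib]
        clusters = [c for k, c in enumerate(clusters) if k not in (ia, ib)]
        clusters.append(merged)

    return sorted((sorted(c) for c in clusters), key=len, reverse=True)


# Hypothetical toy data: each tuple stands in for one docked pose.
poses = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
clusters = agglomerative_clusters(poses, cutoff=1.0)
print(clusters)
```

In a selection scheme of this kind, the largest cluster (the first entry of the returned list) would typically be taken as the representative binding mode, although the abstract does not state which criterion the authors applied.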
BioShell is an open-source package for processing biological data, particularly focused on structural applications. The package provides parsers, data structures and algorithms for handling and analyzing macromolecular sequences, structures and sequence profiles. The most frequently used routines are accessible through a set of easy-to-use command line utilities for a Linux environment. Using the full functionality of the package requires knowledge of C++ or Python to assemble an application from this software library. Since the last publication, which announced version 2.0, the package has been greatly expanded and rewritten in C++11 to improve its modularity and efficiency. A new testing platform has been implemented to continuously test the correctness and integrity of the package. More than two hundred test programs have been published to provide simple examples that can be used as templates. This makes BioShell an easy-to-use library that greatly speeds up development of bioinformatics applications and web services without compromising computational efficiency.
X‐ray crystallography is the main experimental method behind ligand–macromolecule complexes found in the Protein Data Bank (PDB). Applying bioinformatics methods to such structural data can fuel drug discovery, albeit under the condition that the information is correct. Regrettably, a small number of structures in the PDB are of suboptimal quality due to incorrectly identified and modeled ligands in protein–ligand complexes. In this paper, we combine a graph‐theoretical approach, nuclear density estimates, bioinformatics methods, and prior chemical knowledge to analyze two non‐physiological ligands, HEPES and MES, that are frequent components of crystallization and purification buffers. Our analysis includes quantum mechanics calculations and Cambridge Structural Database (CSD) queries to define the ideal conformation of these ligands, geometry analysis of PDB deposits regarding several quality factors, and a search for homologous structures to identify other small molecules that could bind in place of the parasitic ligand. Our results highlight the need for careful refinement of macromolecule–ligand complexes and better validation tools that integrate results from all relevant resources.
Although dynamical refinement improves the fit significantly compared to kinematical refinement, its current implementation still does not fully describe the electron diffraction data. To improve the fit further, it is necessary to properly account for additional effects such as crystal imperfections, inelastic scattering, and bonding effects in the electrostatic potential.