Repository logo
 
Loading...
Project Logo
Research Project

Morse Theory-Based Pockets Detection over Proteins and other Biomolecules

Authors

Publications

Geometric algorithms for cavity detection on protein surfaces
Publication . Simões, Tiago Miguel Carrola; Gomes, Abel João Padrão
Macromolecular structures such as proteins heavily empower cellular processes or functions. These biological functions result from interactions between proteins and peptides, catalytic substrates, nucleotides or even human-made chemicals. Thus, several interactions can be distinguished: protein-ligand, protein-protein, protein-DNA, and so on. Furthermore, those interactions only happen under chemical- and shapecomplementarity conditions, and usually take place in regions known as binding sites. Typically, a protein consists of four structural levels. The primary structure of a protein is made up of its amino acid sequences (or chains). Its secondary structure essentially comprises -helices and -sheets, which are sub-sequences (or sub-domains) of amino acids of the primary structure. Its tertiary structure results from the composition of sub-domains into domains, which represent the geometric shape of the protein. Finally, the quaternary structure of a protein results from the aggregate of two or more tertiary structures, usually known as a protein complex. This thesis fits in the scope of structure-based drug design and protein docking. Specifically, one addresses the fundamental problem of detecting and identifying protein cavities, which are often seen as tentative binding sites for ligands in protein-ligand interactions. In general, cavity prediction algorithms split into three main categories: energy-based, geometry-based, and evolution-based. Evolutionary methods build upon evolutionary sequence conservation estimates; that is, these methods allow us to detect functional sites through the computation of the evolutionary conservation of the positions of amino acids in proteins. Energy-based methods build upon the computation of interaction energies between protein and ligand atoms. In turn, geometry-based algorithms build upon the analysis of the geometric shape of the protein (i.e., its tertiary structure) to identify cavities. This thesis focuses on geometric methods. We introduce here three new geometric-based algorithms for protein cavity detection. The main contribution of this thesis lies in the use of computer graphics techniques in the analysis and recognition of cavities in proteins, much in the spirit of molecular graphics and modeling. As seen further ahead, these techniques include field-of-view (FoV), voxel ray casting, back-face culling, shape diameter functions, Morse theory, and critical points. The leading idea is to come up with protein shape segmentation, much like we commonly do in mesh segmentation in computer graphics. In practice, protein cavity algorithms are nothing more than segmentation algorithms designed for proteins.
CavBench: a benchmark for protein cavity detection methods
Publication . Dias, Sérgio; Simões, Tiago M. C.; Fernandes, Francisco; Martins, Ana Mafalda; Ferreira, Alfredo; Jorge, Joaquim A; Gomes, Abel
Extensive research has been applied to discover new techniques and methods to model protein-ligand interactions. In particular, considerable efforts focused on identifying candidate binding sites, which quite often are active sites that correspond to protein pockets or cavities. Thus, these cavities play an important role in molecular docking. However, there is no established benchmark to assess the accuracy of new cavity detection methods. In practice, each new technique is evaluated using a small set of proteins with known binding sites as ground-truth. However, studies supported by large datasets of known cavities and/or binding sites and statistical classification (i.e., false positives, false negatives, true positives, and true negatives) would yield much stronger and reliable assessments. To this end, we propose CavBench, a generic and extensible benchmark to compare different cavity detection methods relative to diverse ground truth datasets (e.g., PDBsum) using statistical classification methods.

Organizational Units

Description

Keywords

Contributors

Funders

Funding agency

Fundação para a Ciência e a Tecnologia

Funding programme

Funding Award Number

SFRH/BD/99813/2014

ID