👷 Seminar on Machine Learning, Information Retrieval, and Scientific Visualization

[Denisa Šrámková]: Interpretability of Binary Protein Knot Classification 19. 10. 2023

Abstract


Proteins with knotted backbones are exceedingly rare, and both the mechanisms that govern knot formation and its functional implications remain poorly understood. We fine-tuned the ProtBert-BFD Transformer to classify proteins as knotted or unknotted from their primary structure (amino acid sequence) alone. The training set consisted of proteins from selected protein families whose 3D structures were predicted by AlphaFold2. Each protein's knot status was assigned using Topoly, a polymer topology analysis tool.
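As a rough illustration of what "solely from primary structure" means in practice, the sketch below turns a raw amino acid sequence and its knot label into a training example. The documentation of the public ProtBert-BFD checkpoint describes inputs as uppercase residues separated by spaces, with the rare residues U, Z, O, and B mapped to X; whether the talk's pipeline used exactly this preprocessing is an assumption. The function names and the label encoding (1 = knotted, 0 = unknotted) are hypothetical and chosen only for this example.

```python
import re

# Hypothetical label encoding for this sketch: 1 = knotted, 0 = unknotted.
KNOTTED, UNKNOTTED = 1, 0


def to_protbert_input(sequence: str) -> str:
    """Format a raw sequence the way the public ProtBert checkpoints expect.

    Assumption: uppercase residues separated by single spaces, with the
    rare residues U, Z, O, B replaced by the unknown residue X.
    """
    cleaned = re.sub(r"\s+", "", sequence).upper()
    cleaned = re.sub(r"[UZOB]", "X", cleaned)
    return " ".join(cleaned)


def make_example(sequence: str, is_knotted: bool) -> dict:
    """Pair the formatted sequence with its binary knot label."""
    return {
        "text": to_protbert_input(sequence),
        "label": KNOTTED if is_knotted else UNKNOTTED,
    }


example = make_example("mkuz a", is_knotted=True)
print(example)
```

The resulting `text` field is what a tokenizer for the model would receive; the label would come from a topology tool such as Topoly applied to the predicted 3D structure.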

Although the model predicts a protein's knot status with high accuracy (98%), it does not by itself provide a biological explanation or pinpoint which regions of the protein contribute to knot formation. To address this, we propose a patching technique: a sliding window (patch) replaces one part of the sequence at a time, and the resulting change in the model's prediction indicates how important that part is for knot formation. We tested this method on proteins from the SPOUT family and found that the most influential patches lie within the C-terminal portion of the knot core, which is also responsible for substrate binding.
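The patching idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the window length, the stride, the use of `X` as the replacement residue, and the scoring rule (drop in predicted knot probability) are all assumptions. `predict_knotted` stands in for the fine-tuned classifier, and `toy_predict` is a hypothetical stand-in so that the example runs without a model.

```python
from typing import Callable, List, Tuple


def patch_scores(
    sequence: str,
    predict_knotted: Callable[[str], float],
    patch_len: int = 40,   # assumed window length
    stride: int = 10,      # assumed step between windows
    filler: str = "X",     # assumed replacement residue
) -> List[Tuple[int, int, float]]:
    """Slide a patch over the sequence and score each position.

    Returns (start, end, drop) triples, where drop is the baseline
    knot probability minus the probability after patching [start, end).
    A large drop suggests the patched region matters for the prediction.
    """
    baseline = predict_knotted(sequence)
    last_start = max(len(sequence) - patch_len, 0)
    results = []
    for start in range(0, last_start + 1, stride):
        end = min(start + patch_len, len(sequence))
        patched = sequence[:start] + filler * (end - start) + sequence[end:]
        results.append((start, end, baseline - predict_knotted(patched)))
    return results


def toy_predict(seq: str) -> float:
    """Hypothetical classifier: 'knotted' only if the motif KNOT is present."""
    return 1.0 if "KNOT" in seq else 0.0


toy_sequence = "A" * 20 + "KNOT" + "A" * 20
scores = patch_scores(toy_sequence, toy_predict, patch_len=8, stride=4)
influential = [(start, end) for start, end, drop in scores if drop > 0.5]
print(influential)
```

With a real model, `predict_knotted` would return the classifier's probability for the knotted class, and the per-window drops could be plotted along the sequence to locate regions such as the knot core.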


Slides

PDB 101 Course Notes

Lecture recording

Interpretability of Binary Protein Knot Classification

Readings

Denisa Šrámková, Maciej Sikora, Dawid Uchal, Eva Klimentová, Agata P. Perlinska, Mai Lan Nguyen, Marta Korpacz, Roksana Malinowska, Pawel Rubach, Petr Šimeček, Joanna I. Sulkowska: Knot or Not? Sequence-Based Identification of Knotted Proteins With Machine Learning https://www.biorxiv.org/content/10.1101/2023.09.06.556468v1


👷 Seminar on Machine Learning, Information Retrieval, and Scientific Visualization
- [Michal Štefánik]: Can In-context Learners Learn a new Reasoning Concept from Demonstrations? 5. 10. 2023
- [Denisa Šrámková]: Interpretability of Binary Protein Knot Classification 19. 10. 2023
- [Adam Hájek]: De-Novo Identification of Small Molecules from their GC-EI-MS Spectra 19. 10. 2023
- [Vlastimil Martinek]: Predicting RNA Halflife 2. 11. 2023
- [Dávid Meluš, Šárka Ščavnická]: Intelligent Back Office Work in Progress Thesis Reports 9. 11. 2023
- [David Valecký]: Transformers in Computer Vision 16. 11. 2023
- [Marek Kadlčík]: Teaching Models to Use a Calculator for Solving Math Word Problems 23. 11. 2023
- [Jan Rodák]: Uses Machine Learning for Security Compliance 30. 11. 2023
- [Andrej Kubanda]: Forecasting of glycemia 7. 12. 2023
- [David Čechák]: Understanding miRNA Binding Behavior Through Deep Learning Models 14. 12. 2023
- [Michal Štefánik]: EMNLP presentation breaking news and report 21. 12. 2023
