Secondary structure diagrams of proteins, protein families and ligands
Radka Svobodová
NCBR, CEITEC
MASARYK UNIVERSITY

Current trends: Number of available structures grows

Current trends: Size of deposited structures also grows

Current trends: Protein families are getting bigger

Protein family structures and their analysis
▪ Analysis of an individual structure
▪ Analysis of a whole family

Protein family structures and their analysis
▪ Comparison of protein family members
▪ Different species
▪ Different substituents
▪ Mutations
▪ Active and inactive forms
▪ Firm (conserved) and flexible regions
▪ Binding of ligands

Protein family structures and their analysis: How to do it?
▪ Cytochrome P450 (protein family 1.10.630.10)
▪ Aldolase class I (protein family 3.20.20.70)
Insight into a protein family: Secondary structure 2D diagrams

Protein family structures and their analysis
Secondary structure utilization: necessary steps
▪ Detection
▪ Annotation
▪ Visualization

Visualization of secondary structure in 2D: Solved in the past? Not for protein families!
ISSUE 1: Similar proteins have different 2D diagrams
1TQN, 1OG2 (RMSD: 2.295 Å); HERA, PDBe

Visualization of secondary structure in 2D: Solved in the past? Not for protein families!
ISSUE 2: Secondary structure elements that are close in 2D diagrams are far apart in reality
1TQN; HERA, PDBe

Visualization of secondary structure in 2D: Solved in the past?
ISSUE 3: 2D diagrams do not reflect the shape of a protein
1ORW; HERA

Protein family based 2D diagrams: How to get them?
Input:
Step 1: Detection & annotation
▪ Find secondary structure elements (SSE)
▪ Annotate them
Step 2: Statistics
▪ Average length of SSE
▪ Average occurrence of SSE
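The Step 2 statistics can be illustrated with a minimal sketch. It assumes that each domain in a family has already been annotated (Step 1) and is represented as a mapping from an SSE label to its length in residues. The labels (`H1`, `E1`, ...), the data layout, and the function name `sse_statistics` are hypothetical and are not taken from 2DProts itself.

```python
from collections import defaultdict


def sse_statistics(domains):
    """Compute per-SSE occurrence and average length over a family.

    domains: list of dicts, one per domain, mapping an SSE label
             (hypothetical, e.g. "H1") to its length in residues.
    Returns a dict: label -> {"occurrence": fraction of domains that
    contain the SSE, "avg_length": mean length where it is present}.
    """
    if not domains:
        raise ValueError("at least one domain is required")

    # Collect the observed lengths of each SSE across all domains.
    lengths = defaultdict(list)
    for sses in domains:
        for label, length in sses.items():
            lengths[label].append(length)

    n_domains = len(domains)
    return {
        label: {
            "occurrence": len(values) / n_domains,
            "avg_length": sum(values) / len(values),
        }
        for label, values in lengths.items()
    }


# Toy family of four annotated domains (made-up numbers).
family = [
    {"H1": 12, "H2": 8, "E1": 5},
    {"H1": 14, "E1": 7},
    {"H1": 10, "H2": 10, "E1": 6},
    {"H1": 12, "E1": 6},
]
stats = sse_statistics(family)
for label, values in sorted(stats.items()):
    print(label, values)
```

In this toy family, `H1` and `E1` occur in every domain, while `H2` occurs in only half of them. The average length here is taken only over the domains in which the SSE is present; that is one possible convention, chosen for the sketch.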
Step 3: Construct the 2D diagram
▪ Group all β-strands into sheets.
▪ Divide the helices and sheets into primary (common to most of the domains) and secondary (the remaining ones).
▪ Place all primary helices and sheets into the 2D diagram.
▪ Adjust the angles of the primary helices and sheets.
▪ Add all secondary helices and sheets to the 2D diagram.
▪ Adjust the angles of the secondary helices and sheets.
Step 4: Draw the 2D diagrams

Protein family 2D diagrams: 2DProts database
https://2dprots.ncbr.muni.cz

2DProts outputs: 2D diagram of a protein domain

2DProts outputs: Multiple 2D diagrams of protein domains in a family
▪ With opacity
▪ No opacity

Superfamily: Dipeptidylpeptidase IV (2.140.10.30)
Panel labels: PROTEIN FAMILY, PROTEIN; 2DProts, HERA, CATH; Current solution

Superfamily: Rhodopsin 7-helix transmembrane proteins (1.20.1070.10)
Panel labels: PROTEIN FAMILY, PROTEIN; 2DProts, HERA, CATH; Current solution

Superfamily: Aldolase class I (3.20.20.70)
Panel labels: PROTEIN FAMILY, PROTEIN; 2DProts, HERA, CATH; Current solution

2DProts integration into CATH

2DProts integration into OverProt
https://overprot.ncbr.muni.cz

Publications
▪ Sillitoe, I., ..., Berka, K., Hutařová Vařeková, I., Svobodová, R., et al. (2021). CATH: increased structural coverage of functional space. Nucleic Acids Research, 49(D1), D266-D273.
▪ Hutařová Vařeková, I., Hutař, J., Midlik, A., Horský, V., Hladká, E., Svobodová, R., & Berka, K. (2021). 2DProts: database of family-wide protein secondary structure diagrams. Bioinformatics, 37(23), 4599-4601.

2DProts: Coloring by structure properties
Example: Occurrence of secondary structures
Porin, family 2.40.160.10

2DProts: Integration of ligands
▪ Cytochrome reductase, family 2.140.10.30
▪ PDB ID 2bgn, domain A00 (cytochrome reductase, family 2.140.10.30)
▪ OmpF porin, PDB ID 2zfg, domain A00 (porin, family 2.40.160.10)

2DProts: 2D diagrams for proteins
▪ Hemoglobin, PDB ID 1v4w
▪ Pseudomonas aeruginosa lectin II, PDB ID 1gzt

2DProts: Integration of AlphaFoldDB
E. coli PapC protein, C-terminal domain, family 2.60.40.2070
▪ Structures from PDB
▪ Structures from AlphaFoldDB