Methods in Genomics and Proteomics
Mass Spectrometry in Proteomics
CG980

Laboratory of Functional Genomics and Proteomics
National Centre for Biomolecular Research
Faculty of Science, Masaryk University

Zbyněk Zdráhal
RG Proteomics, CEITEC-MU
Proteomics CF, CEITEC-MU
NCBR, FS MU
zdrahal@sci.muni.cz

Introduction

GIGO: garbage in, garbage out
Good sample preparation is the basis of good results.

Main applications of mass spectrometry in proteomics
* Intact mass measurements
* Protein identification (incl. protein complexes, de novo sequencing)
* Analysis of protein modifications
* Protein quantification
* MS imaging
* Protein structure elucidation (complementary to NMR)

J.J. Thomson, father of mass spectrometry, Nobel Prize in Physics, 1906 (image: http://i-mass.com/jj2.gif)

Mass Spectrometry Basics

Mass Spectrometry (MS)

Method principle:
* measurement of the mass-to-charge ratio (m/z) of analyte ions
  m – ion mass
  z – charge number

Note: Apart from selected types of ionization, all steps of MS analysis take place in vacuum, so that ions do not undergo unwanted collisions on their way from the ion source to the detector (mean free path of molecules).

Analysis outcome:
* mass spectrum – dependence of ion intensity on m/z, allowing determination of the ion mass; in the case of a molecular ion, the mass of the whole molecule

[Figure: example mass spectrum, intensity (a.i.) vs. m/z, peak labelled 2955]

Basic steps of MS analysis:
* ionization of analyte molecules (fragments)
* ion separation according to m/z
* ion detection

[Diagram: ion source, mass analyzer, detector, with the vacuum regions marked]

Landmark in MS of biomolecules
New "soft" ionization techniques (mid-1980s) were the basic prerequisite for the wide use of MS in the analysis of biomolecules, mainly proteins (Nobel Prize 2002).

MALDI – Matrix-Assisted Laser Desorption/Ionization
  Koichi Tanaka, Shimadzu Corp., Kyoto, Japan
ESI – ElectroSpray Ionization
  John B. Fenn, Virginia Commonwealth University, Richmond, USA

KARAS M., HILLENKAMP F.
Laser Desorption Ionization of Proteins with Molecular Masses Exceeding 10000 Daltons. Anal. Chem., 60 (20): 2299-2301 (1988)

Mass spectrometry of proteins
Most widely used techniques in proteomics

MALDI
Most often in combination with a time-of-flight (TOF) analyzer (matrix-assisted laser desorption/ionization time-of-flight mass spectrometry)
* MALDI-TOF MS
* MALDI-TOF/TOF MS

ESI
In combination with several different analyzer types
* ion trap – IT
* triple quadrupole and its variants – QQQ, Q-TOF, Q-LIT
* ion cyclotron resonance – ICR
* Orbitrap and its combinations – e.g. IT-Orbitrap

MALDI-TOF MS analysis

Sample preparation:
* the sample is mixed with an excess of matrix (in solution)
* the mixture is deposited on a sample target and allowed to dry
* the sample co-crystallizes with the matrix during drying
* MS analysis

The matrix is a low-mass compound capable of absorbing laser radiation, e.g. dihydroxybenzoic acid (for a UV laser).

[Figure: laser pulse ("shot"), H+]

Result:
Ø Soft ionization without unwanted fragmentation
Ø Simple spectra
Ø The sample is preserved on the sample target (for additional analysis)

Target for sample deposition prior to MALDI-MS
[Figure: micro and macro views of a target with 384 positions; detail of a sample co-crystallized with matrix (DHB) prepared for analysis]

Desorption-ionization process
Pictures by courtesy of Dr.
Sauerland (Bruker)

Time-of-Flight Analyzer (TOF)
Ions are separated according to their time of flight through the analyzer; the time is converted to mass.

E = ½ m v²
v = s / t
hence t = s · √(m / (2E)), so at equal energy heavier ions fly longer

E – ion kinetic energy
m – ion mass
v – ion velocity
s – flight path
t – ion flight time

Prerequisite: all ions have to receive the same kinetic energy before entering the drift zone of the analyzer, and they have to enter it simultaneously; their flight time is measured at the detector.

[Diagram: MALDI ion source (start), drift zone, detector (finish); ions 1, 2, 3 with m1 < m2 < m3]

MALDI-MS/MS instrument
MALDI-TOF/TOF mass spectrometer Ultraflex III (Bruker)

Task: find out more about the basics of MALDI-TOF MS on the web; search for the expression "time-of-flight mass spectrometry".

MALDI-TOF MS animation

MALDI-MS spectrum of a protein (IgG)
[Figure: peaks labelled 1+, 2+ and dimer 1+]

MALDI-MS spectrum of a peptide (~ fmol)
[Figure: peaks [M+H]+ and [M+Na]+; R ~ 18 000]

MALDI-MS profiling: microorganism identification
* in clinical practice, for clinical pathogens
* taxonomic studies

[Diagram: sample, peptide/protein extraction, MALDI MS (3 – 20 kDa), PCA analysis, database of MALDI MS profiles, microorganism identification]

The identification is based on comparison of the measured profile with the profiles in the database.

[Figure: two MALDI-MS profiles, intensity (a.i., 0–2500) vs. m/z (5000–11000)]
[Figure: MALDI-MS spectra (profiles) of selected bacterial species: Alcaligenes faecalis, Sphingomonas paucimobilis, Aeromonas hydrophila, Pseudomonas aeruginosa]

MALDI-MS profiling: method limitation
[Figure: dendrogram of MALDI-MS profiles (distance level 0–1000) for the following strains]
* Campylobacter fetus subsp. fetus CCM 5683
* Campylobacter fetus subsp. fetus CCM 6210
* Campylobacter fetus subsp. fetus CCM 5682
* Campylobacter coli CCM 6211
* Campylobacter jejuni ssp. jejuni CCM 6189
* Campylobacter jejuni ssp. jejuni CCM 6191
* Campylobacter jejuni CCM 6214
* Campylobacter jejuni ssp. jejuni CCM 7212
* Corynebacterium pilosum CCM 6140
* Corynebacterium urealyticum CCM 3975
* Corynebacterium urealyticum CCM 3976
* Corynebacterium urealyticum CCM 4186
* Finegoldia magna CCM 3785
* Streptococcus mitis CCM 7411
* Serratia rubidaea CCM 3412
* Peptostreptococcus anaerobius CCM 3790
* Staphylococcus epidermidis CCM 2124
* Staphylococcus epidermidis CCM 2446
* Staphylococcus epidermidis CCM 7221
* Staphylococcus saprophyticus CCM 3317

In general, the method cannot discriminate bacteria at the strain level (the dendrogram marks both distinguished and indistinguishable strains).

ESI ionization

Sample preparation:
* the sample has to be in solution
* the sample solution is introduced into the ion source through a spray needle

Sample ionization:
* the sample solution is sprayed from the spray needle into the ion source chamber at atmospheric pressure
* ionization proceeds within the spray of liquid droplets under a strong electric field
* charged liquid droplets are formed; as the solvent evaporates, they are transformed into multiply charged ions
* ions are transported into the vacuum part of the instrument via a transfer line and subjected to MS analysis

Result:
Ø Soft ionization without unwanted fragmentation
Ø Multiply charged ions
Ø Easy on-line connection with separation techniques (LC, CE)
LC – liquid chromatography; CE – capillary electrophoresis

MS scan: measurement of the m/z ratio of analyzed compounds
*
ion capture in an ion trap
* sequential ejection of ions from the trap according to m/z
* ion detection

Ion trap operation

MS/MS scan: targeted fragmentation of selected ions (precursors)
* ion capture in an ion trap
* ejection of all ions except those with the selected m/z
* excitation and fragmentation of the selected ions
* detection of the fragment ions formed (product ions)

Linear trap, sensitivity

Orbitrap Fusion™ Lumos Tribrid: example of a hybrid mass spectrometer
(schematic: http://planetorbitrap.com/data/fe/image/Lumos_Schematic%281%29.jpg)

Orbitrap resolution: 15,000–500,000 (FWHM) at m/z 200

Precursor fragmentation techniques:
* CID – ion trap
* ETD – ion trap
* HCD – ion-routing multipole
* EThcD – ion trap/ion-routing multipole

Ion separation/detection:
* ion trap – low resolution
* Orbitrap – high resolution

ETD HD – high dynamic range ETD providing significantly increased fragment ion coverage

ESI-MS spectrum of a protein
[Figure: spectrum of myoglobin (MW 16 951) with peaks labelled +21 and +12, and the transformed spectrum]
A series of multiply charged ions is formed, differing in the number of charges.

MS/MS fragmentation of peptides
* peptides consist of individual amino acids connected by peptide bonds
* during fragmentation (CID), a peptide is fragmented preferentially at peptide bonds, and thus:
  - any peptide bond may be cleaved (a different one in each precursor molecule), forming a set of fragments with various numbers of amino acids
  - the difference in m/z (or mass) between "neighbouring fragments" determines the type of terminal amino acid in the longer fragment
* series of fragment ions are formed (b – y, a – x, c – z), which can be used for de novo elucidation of the primary structure; moreover, they are predictable and can be used for database-search-based protein identification even if they are incomplete

Outline of tripeptide fragmentation
Roepstorff P. and Fohlman, J., Biomed.
Mass Spectrom., 11 (11), 601 (1984)

[Diagram: b-ion series (b1, b2, b3, b4) and y-ion series (y1, y2, y3, y4)]

MS fragmentation maps for individual peptides
MS vs MS/MS of peptides (CID)
[Figure: MS spectrum with [M+H]+; MS/MS spectrum with C-terminus and N-terminus marked, where the mass difference ΔM between neighbouring fragments corresponds to an amino acid (AA)]
CID – collision-induced dissociation

ESI-MS and MS/MS spectra of a peptide (MW 1148.5)
* MS analysis: mass determination
* MS/MS analysis: data for protein identification or primary structure elucidation
The doubly charged ion of the peptide was selected for MS/MS by intensity.
[Figure: MS/MS spectrum with the residues N, E, V, T, E read out]

LC-MS systems

On-line LC-MS system
ESI-IT mass spectrometer HCT Ultra (Bruker) connected on-line to a capillary liquid chromatograph Ultimate (LC Packings)
[Diagram: HPLC system, fused-silica transfer line, ESI, MS]

Basic LC-MS/MS separation schemes (peptide separation)
[Diagram, two schemes:
 1-D LC: 2-D GE (optional), protein or mixture digestion, 1-D LC separation (1D: RP), MS/MS
 2-D LC (MudPIT): protein mixture digestion, 1D: SCX with step-by-step fractionation, 2D: RP, MS/MS]

Blood plasma (3500 – 9000 proteins ??): example of multidimensional separation
* IEF (liq), 20 fractions: 1st dimension
* LC (RP), 1600 fractions: 2nd dimension
* GE (1D/2D), ∞ fractions: 3rd/4th dimension
(from H. Wang, Molecular & Cellular Proteomics, 2005, 4, 618–625)
0.
dimension: depletion, 2 fractions

Protein Identification using Mass Spectrometry

Protein identification using mass spectrometric data
[Diagram:
 bottom up: protein (mix), separation, digestion (specific protease), separation, MS/MS analysis
 top down: protein (mix), MS and MS/MS analysis
 both routes end in identification (DB search, de novo)]

Protein identification using MS: bottom up
For proteins with known primary sequence. Common approaches:
* specific digestion, MS of peptides: comparison of the obtained peptide map with a sequence database (peptide mapping)
* specific digestion, MS/MS of peptides: comparison of the fragmentation maps of individual peptides with a sequence database (MS/MS ion search)

[Diagram:
 protein separation (gel electrophoresis, liquid chromatography, isoelectric focusing), MALDI-TOF MS, peptide mapping
 separation of digested peptides (liquid chromatography, capillary electrophoresis), ESI-MS, MS/MS ion search]

Protein identification by peptide mapping (peptide mass fingerprinting): "blue approach"

Protein separation
Separation of the protein mixture by two-dimensional gel electrophoresis (1st dimension: isoelectric point)

Digestion
Enzymatic digestion of a protein results in a set of peptides; a specific protease is preferred.
Trypsin cleaves after lysine (K) and arginine (R), unless proline follows.

QNGVQMLSPSEIPQRDWFPSDFTFGAATSAYQIEGAWNEDGKGESNWDHFCHNHPERILD
GSNSDIGANSYHMYKTDVRLLKEMGMDAYRFSISWPRILPKGTKEGGINPDGIKYYRNLI
NLLLENGIEP

Peptide                        Residues   Mass
QNGVQMLSPSEIPQR                1-15       1683.848 Da
DWFPSDFTFGAATSAYQIEGAWNEDGK    16-42      3010.317 Da
GESNWDHFCHNHPER                43-57      1864.757 Da
ILDGSNSDIGANSYHMYK             58-75      1984.907 Da
TDVR                           76-79      490.262 Da
...

The set of masses of the peptides formed (i.e. the peptide map) is characteristic of a given protein, much as a fingerprint is of a human individual.
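The cleavage rule and the peptide table above can be reproduced with a short script. This is a minimal sketch, not the procedure used to prepare the slide: the function names `tryptic_digest` and `mh_plus` are hypothetical, the residue masses are standard monoisotopic values rounded to five decimals, missed cleavages are ignored, and the listed masses are assumed to be singly protonated monoisotopic values ([M+H]+), which is consistent with the entries for QNGVQMLSPSEIPQR and TDVR.

```python
# Monoisotopic residue masses in Da (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added to the residue sum of a free peptide
PROTON = 1.00728   # charge carrier for [M+H]+


def tryptic_digest(sequence):
    """Cut after K or R unless P follows; no missed cleavages.

    Returns (first_residue, last_residue, peptide) tuples, 1-based.
    """
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if is_last or (aa in "KR" and sequence[i + 1] != "P"):
            peptides.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


# Sequence fragment shown on the slide.
SEQ = (
    "QNGVQMLSPSEIPQRDWFPSDFTFGAATSAYQIEGAWNEDGKGESNWDHFCHNHPERILD"
    "GSNSDIGANSYHMYKTDVRLLKEMGMDAYRFSISWPRILPKGTKEGGINPDGIKYYRNLI"
    "NLLLENGIEP"
)

for first, last, pep in tryptic_digest(SEQ)[:5]:
    print(f"{pep:<28} {first}-{last}  {mh_plus(pep):.3f}")
```

A search engine performs the same in-silico digestion for every database sequence, usually also allowing a small number of missed cleavages, which this sketch leaves out.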
MALDI-TOF MS spectrum of peptides after protein digestion
The MS spectrum contains the masses of the peptides formed by digestion of the selected protein.

Protein identification by peptide mapping: database searching
The measured peptide map (the set of masses, or m/z values, of the peptides formed by digestion of the analysed protein) is searched against a database of protein sequences using a database search engine. The search engine calculates a theoretical peptide map for each protein sequence in the database (applying the cleavage rules of the selected protease) and compares these in-silico peptide maps, one by one, with the experimentally obtained peptide map of the analysed protein. The search results in a list of proteins with the most similar peptide maps. The extent of similarity is expressed by a score; all protein candidates whose score exceeds the significance threshold (calculated by the software) are considered identified by the search engine.

Comparison of the obtained peptide map with a sequence database

Mascot Search Results
1. S18600  Mass: 47780  Total score: 165  Peptides matched: 12
   glutamate-ammonia ligase (EC 6.3.1.2) precursor, chloroplast (clone lambdaAtgsl1) - Arabidopsis thaliana
2. S32228  Mass: 47714  Total score: 76  Peptides matched: 7
   glutamate-ammonia ligase (EC 6.3.1.2) precursor - rape - Brassica napus

Database  : MSDB 20021127 (1019653 sequences)
Timestamp : 26 Jan 2003 at 10:36:50 GMT
Top Score : 165 for S18600, glutamate-ammonia ligase ....
Sequence Coverage: 44%

  1 MAQILAASPT CQMRVPKHSS VIASSSKLWS SVVLKQKKQS NNKVRGFRVL
 51 ALQSDNSTVN RVETLLNLDT KPYSDRIIAE YIWIGGSGID LRSKSRTIEK
101 PVEDPSELPK WNYDGSSTGQ APGEDSEVIL YPQAIFRDPF RGGNNILVIC
151 DTWTPAGEPI PTNKRAKAAE IFSNKKVSGE VPWFGIEQEY TLLQQNVKWP
201 LGWPVGAFPG PQGPYYCGVG ADKIWGRDIS DAHYKACLYA GINISGTNGE
251 VMPGQWEFQV GPSVGIDAGD HVWCARYLLE RITEQAGVVL TLDPKPIEGD
301 WNGAGCHTNY STKSMREEGG FEVIKKAILN LSLRHKEHIS AYGEGNERRL
351 TGKHETASID QFSWGVANRG CSIRVGRDTE AKGKGYLEDR RPASNMDPYI
401 VTSLLAETTL LWEPTLEAEA LAAQKLSLNV

www.matrixscience.com

Result of database searching: peptide mapping
Score threshold = 58
Sequence regions shown in red on the slide correspond to peptides assigned from the measured peptide map.

Protein identification based on MS data vs MS/MS data
* MALDI MS – peptide map
  [Figure: spectrum, m/z 1000–3000, intensity 0.0–1.0 x10^4; labelled peaks: 1012.50, 1117.50, 1287.59, 1458.85, 1502.76, 1576.71, 1675.87, 1799.96, 1877.89, 2161.02, 2539.17, 2638.21, 3156.32]
* MALDI MS/MS – fragmentation map of the peptide [M+H]+ 1675.9

Protein identification by LC-MS/MS: "green approach"

Protein digestion
In contrast to the "blue approach", here the whole complex protein mixture is digested together, again with a specific protease (usually trypsin). The resulting peptide mixture is separated (often multidimensionally, depending on sample complexity) and subjected to MS/MS analysis.

Separation of digested peptides
Separation of tryptic peptides formed by digestion of a protein mixture
[Figure: digest of a human blood plasma sample separated by liquid chromatography (1D separation) connected to a mass spectrometer (LC-MS/MS)]

MS/MS spectrum of a tryptic peptide (m/z 608.3, 2+)
The MS/MS spectrum contains the peptide fragments formed by collision-induced dissociation in the ion trap. These fragments carry specific information about the peptide sequence and allow identification.
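To make the link between a precursor m/z, its charge, and the fragment series concrete, the sketch below converts the doubly charged precursor (m/z 608.3) to a neutral mass and lists theoretical singly charged b- and y-ions. It is an illustration under stated assumptions, not the search engine's algorithm: the function names are hypothetical, the residue masses are standard monoisotopic values, and the sequence ITLLSALVETR is taken from the Mascot report shown further on in this lecture, which assigns it to this spectrum.

```python
# Monoisotopic residue masses in Da (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def neutral_mass(mz, z):
    """Neutral mass M of an [M+zH]z+ ion observed at the given m/z."""
    return z * (mz - PROTON)


def peptide_mass(peptide):
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def fragment_ions(peptide):
    """Singly charged b- and y-ion m/z values (b1..b(n-1), y1..y(n-1))."""
    masses = [MONO[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)           # N-terminal part
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)  # C-terminal part
    return b_ions, y_ions


peptide = "ITLLSALVETR"
print(f"measured M   = {neutral_mass(608.3, 2):.4f}")
print(f"calculated M = {peptide_mass(peptide):.4f}")

b_ions, y_ions = fragment_ions(peptide)
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i:<2} {b:9.4f}    y{i:<2} {y:9.4f}")
```

The spacing between consecutive ions of one series equals a residue mass, which is what allows the sequence to be read from the spectrum. Leucine and isoleucine have the same mass, so this kind of read-out cannot tell them apart.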
Protein identification based on MS/MS data: database searching
Measured fragmentation maps (i.e. the sets of masses, or m/z values, of the fragments formed during MS/MS of individual peptides) are searched against a database of protein sequences by a search engine. First, the search engine prepares a theoretical peptide map for a protein sequence in the database; next, it calculates a theoretical fragmentation map for each peptide of that peptide map (according to the given fragmentation rules); then these in-silico fragmentation maps are compared with the experimentally obtained fragmentation maps of the analysed peptides. The engine performs this operation for each protein sequence in the database. The software calculates an individual score for each peptide; a score above the threshold indicates significant similarity between the theoretical and the measured fragmentation map, i.e. a significant peptide identification. Finally, the search engine assigns the peptides to the corresponding protein sequences (the more peptides with a significant score per protein, the more reliable the protein identification). The software also calculates a protein score, derived from the individual peptide scores, as a tool for sorting the results.

Comparison of fragmentation maps of individual peptides with a sequence database

Database  : SwissProt 51.2 (243975 sequences; 89639744 residues)
Taxonomy  : Homo sapiens (human) (15175 sequences)
Timestamp : 16 Dec 2006 at 16:05:59 GMT
Significant hits: AACT_HUMAN Alpha-1-antichymotrypsin precursor (ACT) – Homo sapiens

Score Distribution
1.
AACT_HUMAN  Mass: 47621  Score: 99  Queries matched: 1
Alpha-1-antichymotrypsin precursor (ACT) Homo sapiens (Human)

Query  Observed  Mr(expt)   Mr(calc)   Delta    Miss  Score  Expect   Rank  Peptide
1      608.3000  1214.5854  1214.7234  -0.1380  0     99     2.2e-08  1     K.ITLLSALVETR.T

Mascot Search Results, Peptide Summary Report: result of database searching with MS/MS data
Score threshold = 35 (the slide marks the protein score and the individual peptide score)

Thanks to the variability of primary protein structure, it is possible to determine the identity of a protein from the fragments (MS/MS spectrum) of a single peptide.

Processed MS/MS spectrum
Differences in m/z (or mass) between neighbouring fragment ions of the same series (b, y) make it possible to determine the individual amino acids and their positions in the sequence.
[Figure: amino acid order derived from fragments of the y-series]

Protein identification: database searching
[Diagram: MS data (m/z), database searching, two outcomes:
 + protein is identified (protein with its sequence in the database)
 - unknown protein (sequence is not available; de novo sequencing)]
Other reasons for unsuccessful identification: low protein concentration, unspecific digestion, unknown modification, low quality of MS data, ...

Characterization of Posttranslational Modifications

* mutations (protein isoforms)
* chemical (oxidation, deamidation, etc.)
* posttranslational (e.g.
phosphorylations, glycosylations)

List of modifications and tools:
* DeltaMass - https://abrf.org/delta-mass
* ExPASy - http://www.expasy.org/proteomics/post-translational_modification

Characterization of protein modifications
MS in the analysis of protein modifications determines:
* modification type
* localization
* site occupancy

Difficulties in PTM analysis
* low abundance of modified proteins
* a protein frequently occurs in several modification forms
* protein modification status can change during sample preparation
* signal suppression of modified peptides in MS (preferential ionization of unmodified peptides)

To improve the success of PTM analysis
* specific sample preparation procedures (enrichment techniques, enzyme treatment, etc.)
* specific MS/MS operation modes

Phosphorylations: sample treatment
* phosphatase inhibitors, denaturation (as soon as possible)
* enrichment of phosphopeptides (proteins)
  - TiO2 (MOAC, "metal oxide affinity chromatography")
  - IMAC ("immobilized metal affinity chromatography")
  - SCX or SAX, or HILIC ("ion exchange or hydrophilic interaction chromatography")
  - immunoprecipitation using a specific antibody
I.L. Batalha, Trends in Biotechnology 30 (2), 100-110 (2012)
[Figure: MALDI-MS of phosphopeptides without enrichment and with TiO2 enrichment]

Phosphorylations: MS analysis
Dedicated MS/MS fragmentation techniques preserving the phospho group on the amino acid residue:
* CID (limited)
* ETD (ECD) – electron transfer (capture) dissociation
* HCD – higher-energy collision dissociation
* EThcD – electron-transfer/higher-energy collision dissociation
Frese et al., J. Proteome Res., 12, 1520−1525 (2013)

[Figure: ESI-MS spectrum, intensity (0.0–2.5 x10^7) vs. m/z (200–1200)]
[Figure: ESI-MS spectrum, intensity (0.00–1.25 x10^7) vs. m/z (200–1200), with a shift of 40 Da marked; MALDI-MS spectrum, intensity (a.i., 500–2500) vs. m/z (1070–1170), with a shift of 80 Da marked]

Phosphorylations: how MS sees modifications
A shift in peptide mass in the MS spectrum corresponding to the modification mass indicates the presence of the given type of PTM.

ESI-MS/MS (ETD)
[Figure: fragment spacing Δ m/z = 87 for S and Δ m/z = 167 for phosphorylated S (Sphos)]
A shift in fragment mass in the MS/MS spectrum corresponding to the modification mass identifies and localizes the given type of PTM.

Histone acetylations
2-D gel electrophoresis (AUT-AU) of histone extracts
[Figure: gels without deacetylase inhibitors and with a deacetylase inhibitor; histones H1, H2A, H2B, H3, H4; spots a–e]

Histone acetylations: LC-MS/MS analysis results
[Figure: samples without deacetylase inhibitors and with a deacetylase inhibitor; peaks A–E; acetylation states 0, 1x Ac, 2x Ac, 3x Ac, 4x Ac]

  1 MSGRGKGGKG LGKGGAKRHR KVLRDNIQGI TKPAIRRLAR RGGVKRISGL
 51 IYEETRGVLK VFLENVIRDA VTYTEHAKRK TVTAMDVVYA LKRQGRTLYG
101 FGG

Histone acetylations: distinguishing the modification site in a peptide
Detail of the MS/MS (ETD) spectrum of GKGGKG LGKGGAKR (3x Ac):
* 128 – unmodified K
* 128+42 – acetylated K

Protein quantification

Protein quantification by MS: general approaches

Methods based on the application of isotopic labels
* absolute quantification (determination of the amount/concentration of a given protein by adding an internal standard of known concentration)
* relative quantification (comparison of changes in protein levels between two or more samples)
Isotopic labels are introduced into proteins at different stages of the experiment: during cell cultivation, or by chemical reaction after protein isolation or after digestion (peptides).
Label-free methods
based on advanced processing of MS (MS/MS) data

Protein quantification by MS: absolute quantification (AQUA)
Ø AQUA peptide selection: select an optimal tryptic peptide and stable-isotope amino acid from the sequence of your protein of interest
Ø Order the selected peptide isotopically labeled (15N, 13C)
Ø Add the labeled peptide to the protein mix
Ø Digest
Ø Analyze by LC-MS/MS to quantitate the protein of interest; optimize the LC-MS/MS separation protocol for quantitation
AQUA works only for protein(s) selected in advance.

Protein quantification by MS: relative quantification (SILAC, in vivo labeling)
[Diagram: protein sample A and protein sample B, individual protein isolation, pooling, common purification/fractionation, common digestion of the pooled samples, LC-MS]
The same peptide from the two samples is distinguished by its isotopic label (different label mass).

Protein quantification by MS: relative quantification (iTRAQ, TMT)
[Diagram: protein sample A and protein sample B, individual protein isolation, individual purification/fractionation, individual sample digestion, individual labeling, common LC-MS/MS]
The same peptides are not distinguished in MS mode; in MS/MS, reporter ions are released that carry the quantitation information.

Protein quantification by MS: relative quantification – TMT labeling
[Figure: procedure summary for parallel isobaric labeling for tandem mass spectrometry; analysis protocol summary for tandem mass spectrometry with 6-plex isobaric tags (Thermo Fisher Scientific)]

Relative quantitation by tandem mass spectrometry with TMT10plex Reagents
The reagents contain different numbers and combinations of 13C and 15N isotopes in the mass reporter. The different isotopes result in a 10-plex set of tags whose reporter mass differences can be detected using high-resolution Orbitrap MS instruments. (Thermo Fisher Scientific)

… and this is the end