Measures of Location and Variability (Míry polohy a variability)
Dominik Heger
Masaryk University
hegerd@chemi.muni.cz
STDT 03

Statistics
Why summarise data? Isn't it best to have all the data?
[Slide graphic: a page filled with a grid of digits]

Graphical description, numerical description
1. of a single variable (level of education, income, favourite colour)
2. of the relationship between two variables (How does the level of education affect income?)

Relationship between the Particle Sizes, Sea Salt Concentration, and Sublimation Temperature
[Figure: panels at -16 °C, -30 °C, and -40 °C]
ZÁVACKÁ, K., V. NEDĚLA, M. OLBERT, E. TIHLAŘÍKOVÁ, Ľ. VETRÁKOVÁ, X. YANG and D. HEGER. Temperature and Concentration Affect Particle Size Upon Sublimation of Saline Ice: Implications for Sea Salt Aerosol Production in Polar Regions. Geophysical Research Letters, 2022, 49(8), 10.1029/2021gl097098.

At -30 °C, Brine Is Still Fluid; It Becomes Mostly Solid at -40 °C (0.85 PSU)
[Figure: ESEM pictures] (Závacká et al., 2022)

SSAs: the Size Decreases with the Temperature and Concentration: Histogram and Box-Plot
[Figure: histograms of particle diameter (nm) at 0.85 psu and 3.5 psu, for -40 °C and -30 °C; box plots for the groups PA0_85PSU_40_C, PA0_85PSU_30_C, PA3_5PSU_40_C, and PA3_5PSU_30_C] (Závacká et al., 2022)

Empirical distribution function (empirická distribuční funkce)

A right-skewed distribution (doprava zešikmené rozložení)
[Figure: Income in USA 2010; horizontal axis: income (thousands of $), ticks at 0, 10, 25, 50, 100, 150]
Income in USA 1973: Figure 4. Distribution of families by income in the U.S. in 1973.
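
The empirical distribution function and the right skew named above can be illustrated with a few lines of Python. This is only a sketch: the `incomes` list is made up for illustration (it is not the data behind the income figures), and `ecdf` is a hypothetical helper name. It shows that the empirical distribution function at x is the fraction of observations less than or equal to x, and that in a right-skewed sample the mean typically sits above the median.

```python
from bisect import bisect_right
from statistics import mean, median

# Hypothetical incomes in thousands of dollars (made-up values, not from the slides).
incomes = [12, 18, 22, 25, 30, 35, 41, 50, 75, 150]


def ecdf(sample):
    """Return a function F where F(x) is the fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F


F = ecdf(incomes)
for x in (10, 25, 50, 150):
    print(f"F({x}) = {F(x):.1f}")

# The long right tail (the value 150) pulls the mean above the median.
print("median:", median(incomes))
print("mean:", mean(incomes))
```

The function is a step function: it jumps by 1/n at every observation and reaches 1 at the largest value.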
[Figure: horizontal axis is income (thousands of dollars), from 10 to 50]

In a histogram
1. the area of each rectangle represents a probability (a proportion of the population);
2. the height of each rectangle represents a density: the percentage of the total population per unit of the horizontal axis.

Average
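
As a small illustration of the two histogram rules and of the average introduced above, the sketch below bins a made-up sample. The `ages` values and the `edges` are assumptions chosen for illustration only. Each bar's area is the proportion of the sample in the bin, its height is that proportion divided by the bin width, and the areas add up to 1.

```python
# Hypothetical ages in years (made-up values, not taken from the slides).
ages = [21, 23, 24, 27, 31, 33, 38, 42, 47, 55, 61, 68]
# Hypothetical bin edges of unequal width; each bin includes its left edge only.
edges = [20, 30, 40, 50, 70]

n = len(ages)
areas = []    # proportion of the sample in each bin
heights = []  # density: proportion per year

for left, right in zip(edges, edges[1:]):
    count = sum(1 for a in ages if left <= a < right)
    area = count / n
    height = area / (right - left)
    areas.append(area)
    heights.append(height)
    print(f"[{left}, {right}): area = {area:.3f}, height = {height:.4f} per year")

print("total area:", round(sum(areas), 6))
print("average age:", round(sum(ages) / n, 2))
```

In this sample the widest bin holds a larger share of the data than its neighbour yet is drawn lower, because height is share per unit of the axis rather than share itself.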

Comparing averages

age (years)              20-30  30-40  40-50  50-60  60-75  75+
average height (inches)  69.3   69.5   69.4   69.2   68.3   67.2

Intervals include the left endpoint but not the right.
[National Health and Nutrition Examination Survey, 1999-2002]

Is this table telling us that as men get older, on average they get a bit taller and then get shorter?
Data: longitudinal vs. cross-sectional. These data are cross-sectional: each age group was measured once, so the table does not follow the same individuals as they age.
The men in the various categories are not the same men. The older men were shorter, on average.
When comparing averages, first think:
1. How are the groups related to each other?
2. Take a look at the numerical averages.

Measures of Spread
How can we quantify your distance from the median and/or mean?
How far from average am I?
How much am I deviating?
The amount your score is off (from average) is the deviation.

Range and interquartile range
How far from the median?
[Figure: Maternal ages, with a box plot; horizontal axis: maternal age (years)]

Deviation from average
What is the typical (standard) deviation from average?
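
To make the spread measures above concrete, here is a minimal sketch on a made-up list of maternal ages (the values are hypothetical, not read off the figure). The quartiles are taken as the medians of the lower and upper halves, which is only one of several conventions in use. The last step shows why the question about a typical deviation needs care: the deviations from the average always cancel out, so their plain mean is zero.

```python
from statistics import mean, median

# Hypothetical maternal ages in years (made-up values, not from the slides).
ages = [22, 25, 26, 28, 29, 31, 33, 36, 40]

ordered = sorted(ages)

# Range: distance between the largest and the smallest value.
data_range = ordered[-1] - ordered[0]

# Quartiles as medians of the lower and upper halves (one convention of several);
# with an odd number of values the middle value is left out of both halves.
half = len(ordered) // 2
q1 = median(ordered[:half])
q3 = median(ordered[-half:])
iqr = q3 - q1

# Deviations from the average: positive and negative ones cancel.
avg = mean(ordered)
deviations = [x - avg for x in ordered]

print("range:", data_range)
print("Q1, Q3, IQR:", q1, q3, iqr)
print("deviations:", deviations)
print("sum of deviations:", sum(deviations))
```

Because the signed deviations sum to zero, a useful "typical deviation" has to ignore the sign, for example by squaring the deviations before averaging them.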
[Figure: number line from 0 to 8]
SD = standard deviation (směrodatná odchylka) = square root of the mean square of deviations from average
variance (rozptyl) = mean square of deviations from average

Properties of SD
Why is SD such a commonly used measure of spread?
The SD of a given distribution measures the typical distance from the average.
1. It is non-negative.
2. It has the same units as the average and the list.
3. It measures the average distance of the data from their mean (the rms of the deviations of the data from their mean).
4. Chebychev's inequality (Pafnuty Lvovich Chebychev, 1821-1894): in any list, the proportion of entries that are k or more SDs away from the average is at most 1/k^2.

Information sources
https://courses.edx.org/courses/BerkeleyX/Stat_2.1x/
http://www.stat.berkeley.edu/~stark/SticiGui/Text/location.htm
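
The definitions of variance and SD and Chebychev's inequality given above can be checked numerically. The sketch below is an illustration under stated assumptions: the `data` list is made up, the function names are hypothetical, and the variance divides by n, following the "mean square of deviations" wording on the slide (many libraries divide by n - 1 by default).

```python
from math import sqrt


def variance(data):
    """Mean square of deviations from the average (divides by n)."""
    avg = sum(data) / len(data)
    return sum((x - avg) ** 2 for x in data) / len(data)


def sd(data):
    """Standard deviation: square root of the variance."""
    return sqrt(variance(data))


def proportion_far(data, k):
    """Fraction of entries that are k or more SDs away from the average.

    Assumes the list is not constant, i.e. SD > 0.
    """
    avg = sum(data) / len(data)
    s = sd(data)
    return sum(1 for x in data if abs(x - avg) >= k * s) / len(data)


# Hypothetical list (made-up values): average 5, variance 4, SD 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("variance:", variance(data))
print("SD:", sd(data))

for k in (1.5, 2, 3):
    observed = proportion_far(data, k)
    bound = 1 / k**2
    print(f"k = {k}: observed {observed:.3f}, Chebychev bound {bound:.3f}")
```

The bound holds for any list, which is why it is usually far from tight: for this sample only one entry in eight lies 2 or more SDs from the average, while the inequality allows up to one in four.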