Data processing

News articles
Data in academic articles and books

Real Data

Data linked to research articles
•Processed data
•Edited data
•Original variables not used in the article are deleted
•Indication of the source of data
•data.mendeley.com
•https://dataverse.harvard.edu/

Information webpage using some data
•https://osp.stat.gov.lt/en/lietuvos-ekonomikos-raida/gyventojai-ir-socialine-statistika
•https://osp.stat.gov.lt/statistiniu-rodikliu-analize?hash=b6dcd0f9-dd87-428e-a0d6-13d429b532a4#/

Real Data

Open data repositories
https://datahub.brno.cz/
•Copy out Vietnam

You can use AI!
•Google search is obviously AI too
•E.g. ChatGPT
•https://chat.openai.com/
•Example prompt: "Tell me about some sources of social and political data in Nigeria"
•It is necessary to check the results

Typical scenarios
•Excel (xls, xlsx)
•SPSS (sav), Stata (dta), R (rdata)
•CSV, TXT
•XML
•JSON

Examples

SPSS
SPSS is available at the MU website (see next page)
SPSS to Excel online conversion: https://secure.ncounter.de/spssconverter
•https://inet.muni.cz/?idr=ict.soft&pk=u&x=0&y=500

CSV
Our example of CSV data: https://data.europarl.europa.eu/en/datasets/texts-adopted-by-the-european-parliament-year2023/26
This is important for tables containing text in languages other than English

Split a column into several columns

Cleaning (Ctrl+H)

Sorting
This box has to be ticked to see variable names in "Sort by"
Here you can add one more box for choosing a sorting variable

Computation
Grouping variable: the categories for which we want to make the computation
Way of computation: what kind of number we want (sum, count, mean, min, max, SD)
Sum: the amount you have to pay in a shop is the sum of the prices of the items in your shopping cart
Count: the number of items in your shopping cart
Mean: the average price of the items (e.g. for 20, 30, 50, 100 the mean is (20+30+50+100)/4 = 50)
SD: the mean deviation of the values around the mean: ((50-20) + (50-30) + (50-50) + (100-50))/4 = 25 (the real calculation squares the deviations and takes the square root of their mean)
What you want to count

Result
•It is not over yet
•Copy the results to a new sheet and insert them as values
•Click with the right mouse button
•Select only the rows with results
•Add a new empty column
•Use Text to Columns with "space" as the delimiter
•Sort the data by the new column
•Delete everything without the word "count" (in the new column)

Compute percentages
•=C2/$C$70*100
•$ allows the formula to be calculated for the whole column, while the value in the denominator remains the same
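
The shopping-cart numbers from the Computation slide can be checked outside Excel. The sketch below is only an illustration in Python, which the slides do not use, and all variable names are assumptions. It computes the sum, count and mean, the simplified "mean deviation" from the slide, the standard deviation as it is usually defined, and percentages with a fixed denominator, which is the role of $C$70 in the formula above.

```python
import statistics

# The four prices from the shopping-cart example on the slide.
prices = [20, 30, 50, 100]

total = sum(prices)      # Sum: what you pay in the shop
count = len(prices)      # Count: number of items in the cart
mean = total / count     # Mean: average price of an item

# Simplified "SD" from the slide: average absolute distance from the mean.
mean_abs_dev = sum(abs(p - mean) for p in prices) / count

# Standard deviation proper: square the deviations, average them,
# then take the square root.
sd_population = statistics.pstdev(prices)  # divides by n
sd_sample = statistics.stdev(prices)       # divides by n - 1

# Percentages: every value is divided by the same fixed total,
# like the $C$70 denominator that stays put when the formula is filled down.
shares = [p * 100 / total for p in prices]

print("sum:", total, "count:", count, "mean:", mean)
print("mean absolute deviation:", mean_abs_dev)
print("SD (population):", round(sd_population, 2))
print("SD (sample):", round(sd_sample, 2))
print("percent of total:", shares)
```

The simplified deviation is 25, as on the slide, while the squared version gives about 30.82 (dividing by n) or 35.59 (dividing by n - 1). Statistical packages usually report the n - 1 version, so their SD will not match the simplified hand calculation.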