Intended solely for teaching purposes at FSS MU Brno. Distribution and use for other purposes are prohibited.

Material for the course Statistical Data Analysis (How to Work with Data and Specify Computations in SPSS)
Department of Sociology, FSS MU, Brno

Private print of Masaryk University in Brno for internal use. Not intended for publication.

© Petr Mareš, Ladislav Rabušic

Lesson 0 (for review)
BASIC STRATEGY OF ANALYSIS: RESEARCH PROBLEM, RESEARCH QUESTIONS, AND VARIABLES

A variable which cannot be given, yet is a major focus of the study, is called an attribute independent variable (Kerlinger, 1986). In other words, the values of the independent variable are attributes of the persons or the environment that are not manipulated during the study. For example, gender, age, ethnic group, or disability are attributes of a person.

Other labels for the independent variable. SPSS uses a variety of terms such as factor (chapters 5, 15, 16, 17, and 18), covariate (chapter 13), and grouping variable (chapters 14, 15). In other cases (chapters 5, 9) SPSS does not make a distinction between the independent and dependent variable, just labeling them variables. Another common label for an attribute independent variable is a measured variable. However, we prefer attribute so it is not easily confused with the dependent variable, which is also measured.
Sometimes variables such as gender or ethnic group are called moderator or mediating variables because they serve these functions; however, SPSS does not use these terms, so we will not either in this book.

Type of independent variable and inferences about cause and effect. When we analyze data from a research study, the statistical analysis does not differentiate whether the independent variable is an active independent variable or an attribute independent variable. However, even though SPSS and most statistics books use the label independent variable for both active and attribute variables, there is a crucial difference in interpretation. A significant change or difference following manipulation of the active independent variable may reasonably lead the investigator to infer that the independent variable caused the change in the dependent variable. However, a significant change or difference between or among values of an attribute independent variable should not lead one to the interpretation that the attribute independent variable caused the dependent variable to change.

A major goal of scientific research is to be able to identify a causal relationship between two variables. For those in applied disciplines, the need to demonstrate that a given intervention or treatment causes change in behavior or performance is extremely important. Only the approaches that have an active independent variable (the randomized experimental and, to a lesser extent, the quasi-experimental) can be successful in providing data that allow one to infer that the independent variable caused the dependent variable.

Although studies with attribute independent variables are limited in what can be said about causation, they can lead to solid conclusions about the differences between groups and about associations between variables. Furthermore, they are the only available approach if the focus of your research is on attribute independent variables.
The descriptive approach, as we define it, does not attempt to identify relationships. It focuses on describing variables.

As implied above, this distinction between active and attribute independent variables is important because terms such as main effect and effect size used by SPSS and most statistics books might lead one to believe that if you find a significant difference, the independent variable caused the difference. These terms are misleading when the independent variable is an attribute.

Values of the independent variable. In defining a variable, we said that it must have more than one value. When describing the different categories of an independent variable, SPSS uses the word values. This does not necessarily imply that the values are ordered.¹ Suppose that an investigator is performing a study to investigate the effect of a treatment. One group of participants is assigned to the treatment group. A second group does not receive the treatment. The study could be conceptualized as having one independent variable (treatment type), with two values or levels (treatment and no treatment). The independent variable in this example would be classified as an active independent variable. Instead, suppose the investigator was interested primarily in comparing two different treatments but decided to include a third no-treatment group as a control group in the study. The study still would be conceptualized as having one active independent variable (treatment type), but with three values (the two treatment conditions and the control condition). This variable could be diagrammed as follows:

Variable Label      Values and Value Labels
Treatment type      1 = Treatment 1
                    2 = Treatment 2
                    3 = No treatment (control)

As an additional example, consider gender, which is an attribute independent variable with two values, male and female.
It could be diagrammed as follows:

Variable Label      Values and Value Labels
Gender              1 = Male
                    2 = Female

Note that in SPSS each variable is given a label; the values, which are numbers, may also have labels. It is especially important to know the value labels when the variable is nominal; i.e., when the values of the variable are just names and, thus, are not ordered.

Dependent Variables

The dependent variable is the presumed outcome or criterion. It is assumed to measure or assess the effect of the independent variable. Dependent variables are often test scores, ratings on questionnaires, readings from instruments (electrocardiogram, galvanic skin response, etc.), or measures of physical performance. When we discuss measurement in chapter 3, we are usually referring to the dependent variable.

SPSS also uses a number of other terms for the dependent variable. The most common is dependent list, used in cases where you can do the same statistic several times, for a list of dependent variables. In discriminant analysis (chapter 13), the dependent variable is called the grouping variable. The term test variable is used in several of the chapters on t tests and analysis of variance.

¹ The terms categories, levels, groups, or samples are sometimes used interchangeably with the term values, especially in statistics books. Likewise, the term factor is often used instead of independent variable.

Basic comparative approach. The comparative research approach differs from the experimental and quasi-experimental approaches because the investigator cannot randomly assign participants to groups and because there is not an active independent variable.
Table 1.1 shows that, like experiments and quasi-experiments, comparative designs usually have a few levels or categories for the independent variable and make comparisons between groups. Studies that use the comparative approach examine the presumed effect of an attribute independent variable.

An example of the comparative approach is a study that compared two groups of children on a series of motor performance tests. The investigators attempted to determine whether the differences between the two groups were due to perceptual or motor processing problems. One group of children, who had motor handicaps, was compared to a second group of children who did not have motor problems. Notice that the independent variable in this study was an attribute independent variable with two levels, motor handicapped and not handicapped. Thus, it is not possible for the investigator to randomly assign participants to groups, or "give" the independent variable; the independent variable was not active. The independent variable had only two values or categories, so a statistical comparison between the groups would be performed. It is, of course, possible for comparisons to be made between three or more groups.²

Chapter 1 - Research Problems, Approaches, and Questions

Basic associational approach. Now, we would like to consider an approach to research where the independent variable is usually continuous or has several ordered categories, usually five or more. Suppose that the investigator is interested in the relationship between giftedness and self-perceived confidence in children. Assume that the dependent variable is a self-confidence scale for children. The independent variable is giftedness. If giftedness had been divided into high, average, and low groups (a few values or levels), we would have called the research approach comparative because the logical thing to do would be to compare the groups.
However, in the typical associational approach, the independent variable is continuous or has at least five ordered levels or values.³ All participants would be in a single group with two continuous variables, giftedness and self-concept. A correlation coefficient could be computed to determine the strength of the relationship between the two variables.

As implied above, it is somewhat arbitrary whether a study is considered to be comparative or associational. For example, a continuous variable such as age can always be divided into a small number of levels such as young and old. However, we make this distinction for two reasons. First, we think it is usually unwise to divide a variable with many ordered levels into a few because information is lost. For example, if the cut point for "old age" was 65, persons 66 and 96 would be lumped together, as would persons 21 and 64. Second, different types of statistics are usually used with the two approaches (see Fig. 1.1). We think this distinction and the similar one made in the section on research questions will help you decide on an appropriate statistic, which we have found is one of the hardest parts of the research process for students.

Basic descriptive approach. This approach is different from the other four in that only one variable is considered at a time so that no relationships are made. Table 1.1 shows that this lack of comparisons or associations is what distinguishes this approach from the other four. Of course, the descriptive approach does not meet any of the other criteria such as random assignment of participants to groups. Most research studies include some descriptive questions (at least to describe the sample), but do not stop there. It is rare these days for published quantitative research to be purely descriptive; we almost always study several variables and their relationships.
However, political polls and consumer surveys are sometimes only interested in describing how voters as a whole react to issues or what products a group of consumers will buy. Exploratory studies of a new topic may just describe what people say or feel about that topic.

Most research books use a considerably broader definition for descriptive research. Some use the phrase "descriptive research" to include all research that is not randomized experimental or quasi-experimental. Others do not seem to have a clear definition, using descriptive almost as a synonym for exploratory or sometimes "correlational" research. We think it is clearer and less confusing to students to restrict the term descriptive research to questions and studies that use only descriptive statistics, such as averages, percentages, histograms, and frequency distributions, and do not test null hypotheses with inferential statistics.

² It is also possible to compare relatively large numbers of groups (e.g., 5 or 10) if one has enough participants that the group sizes are adequate, but this is atypical.

³ It is possible, as we will see in chapters 7 and 8, to use the associational approach and statistics when one has fewer than five ordered values of the variables and even with unordered nominal variables, but this is not typical.

Complex Research Approaches

It is important to note that most studies are more complex than implied by the above examples. In fact, almost all studies have more than one hypothesis or research question and may utilize more than one of the above approaches. It is common to find a study with one active independent variable (e.g., type of treatment) and one or more attribute independent variables (e.g., gender).
This type of study combines the randomized experimental approach (if the participants were randomly assigned to groups) and the comparative approach. Most "survey" studies include both the associational and comparative approaches. As mentioned above, most studies also have some descriptive questions, so it is common for published studies to use three or even more of the approaches.

Research Questions/Hypotheses

Next, we divide research questions into three broad types: difference, associational, and descriptive. For the difference type of question, we compare groups or values of the independent variable on their scores on the dependent variable. This type of question typically is used with the randomized experimental, quasi-experimental, and comparative approaches. For an associational question, we associate or relate the independent and dependent variables. Descriptive questions are not answered with inferential statistics; they merely describe or summarize data.

Basic Difference Versus Associational Research Questions or Hypotheses

Hypotheses are defined as predictive statements about the relationship between variables. Fig. 1.1 shows that both difference and associational questions/hypotheses have as a general purpose the exploration of relationships between variables. This similarity is in agreement with the statement by statisticians that all parametric inferential statistics are relational, and it is consistent with the notion that the distinction between the comparative and associational approach is somewhat arbitrary.⁴ However, we believe that the distinction is educationally useful. Note that difference and associational questions differ in specific purpose and the kinds of statistics they use to answer the question.
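The contrast between the two kinds of question can be made concrete with a small sketch. It is only an illustration under stated assumptions: the scores are invented, the helper names `group_mean_difference` and `pearson_r` are hypothetical, and it computes only the quantities each question focuses on (a difference between group means, and a correlation coefficient) without any significance test.

```python
from math import sqrt
from statistics import mean

# Hypothetical scores, invented for illustration only.
# gender: 1 = male, 2 = female (an attribute independent variable)
gender = [1, 1, 1, 1, 2, 2, 2, 2]
self_esteem = [3, 5, 4, 6, 5, 7, 6, 8]
empathy = [10, 12, 11, 13, 14, 16, 15, 17]


def group_mean_difference(groups, scores, a, b):
    """Difference question: mean score of group b minus mean score of group a."""
    mean_a = mean(s for g, s in zip(groups, scores) if g == a)
    mean_b = mean(s for g, s in zip(groups, scores) if g == b)
    return mean_b - mean_a


def pearson_r(x, y):
    """Associational question: Pearson correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cross = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ss_x = sum((xi - mx) ** 2 for xi in x)
    ss_y = sum((yi - my) ** 2 for yi in y)
    return cross / sqrt(ss_x * ss_y)


# Difference question: do the two gender groups differ in average empathy?
diff = group_mean_difference(gender, empathy, a=1, b=2)

# Associational question: are self-esteem and empathy scores related?
r = pearson_r(self_esteem, empathy)

print(f"mean difference (female - male): {diff}")
print(f"Pearson r (self-esteem, empathy): {r:.3f}")
```

In the difference question the independent variable only sorts participants into groups; in the associational question every participant contributes a pair of scores to a single group. Testing either result for significance (a t test, or a test of the correlation) would be a further step, which SPSS or a statistics library would handle.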
⁴ We use the term associational for this type of research question, approach, and statistics rather than relational or correlational to distinguish them from the general purpose of both difference and associational questions/hypotheses described above. Also, we wanted to distinguish between correlation, as a specific statistical technique, and the broader types of approach, questions, and group of statistics.

Difference branch
  General purpose: Explore relationships between variables
  Specific approach: Randomized experimental, quasi-experimental, and comparative
  Specific purpose: Compare groups
  Type of question/hypothesis: Difference
  General type of statistic: Difference inferential statistics (e.g., t test, ANOVA)

Associational branch
  General purpose: Explore relationships between variables
  Specific approach: Associational
  Specific purpose: Find associations, relate variables, make predictions
  Type of question/hypothesis: Associational
  General type of statistic: Associational inferential statistics (e.g., correlation, multiple regression)

Descriptive branch
  General purpose: Description (only)
  Specific approach: Descriptive
  Specific purpose: Summarize data
  Type of question/hypothesis: Descriptive
  General type of statistic: Descriptive statistics (e.g., histograms, means, percentages, box plots)

Fig. 1.1. Schematic diagram showing how the purpose, approach, and type of research question correspond to the general type of statistic used in a study.

Table 1.2 provides the general format and one example of a basic difference hypothesis and of a basic associational hypothesis. Research questions are similar to hypotheses, but they are stated in question format. We think it is advisable to use the question format when one does not have a clear directional prediction and for the descriptive approach. More details and examples are given in Appendix A.

Table 1.2.
Examples of Basic Difference and Associational Hypotheses

Difference (group comparison) Hypothesis
For this type of hypothesis, the levels or values of the independent variable (e.g., gender) are used to divide the participants into groups (male and female), which are then compared to see if they differ in respect to the average scores on the dependent variable (e.g., empathy). An example of a directional research hypothesis is: Women will score higher than men on empathy scores. In other words, the average empathy scores of the women will be significantly higher than the average empathy scores for men.

Associational (relational) Hypothesis
For this type of hypothesis, the scores on the independent variable (e.g., self-esteem) are associated with or related to the dependent variable (e.g., empathy). It is often arbitrary which variable is considered the independent variable, but most researchers have an idea about what they think is the predictor (independent) and what is the outcome (dependent) variable. An example of a directional research hypothesis is: There will be a positive association (relation) between self-esteem scores and empathy scores. In other words, those persons who are high on self-esteem will tend to have high empathy, those with low self-esteem will tend also to have low empathy, and those in the middle on the independent variable will tend to be in the middle on the dependent variable.

Six Types of Research Questions

Table 1.3 expands our overview of research questions to include both basic and complex questions of each of the three types: descriptive, difference, and associational. The table also includes references to the tables in chapters 3 and 7, designed to help you select an appropriate statistic, and examples of the types of statistics that we include under each of the six types of questions.
Appendix A and the last section in this chapter provide examples of research questions for each of the six types. We use the terms basic and complex because the more common names, univariate and multivariate, are not used consistently in the literature. Note that some complex descriptive statistics (e.g., a cross-tabulation table) could be tested for significance with inferential statistics; if they were so tested they would no longer be considered descriptive. We think that most qualitative/constructivist researchers ask complex descriptive questions because they consider more than one variable/concept at a time but do not use inferential/hypothesis testing statistics. Furthermore, complex descriptive statistics are used to check reliability (e.g., Cronbach's alpha) and to reduce the number of variables (e.g., factor analysis).

Table 1.3. Summary of Types of Research Questions

Each entry gives the type of research question (number of variables), followed by the statistics (example).

1) Basic Descriptive Questions - 1 variable.
   Statistics: See Table 3.2 (mean, standard deviation, frequency distribution)

2) Complex Descriptive Questions - 2 or more variables, but no use of inferential statistics.
   Statistics: (box plots, cross-tabulation tables, factor analysis, measures of reliability)

3) Basic Difference Questions - 1 independent and 1 dependent variable. Independent variable usually has a few values (ordered or not).
   Statistics: Table 7.1 (t test, one-way ANOVA)

4) Complex Difference Questions - 3 or more variables. Usually 2 or a few independent variables and 1 or more dependent variables considered together.
   Statistics: Table 7.3 (factorial ANOVA, MANOVA)

5) Basic Associational Questions - 1 independent variable and 1 dependent variable. Usually at least 5 ordered values for both variables. Often they are continuous.
   Statistics: Table 7.2 (correlation tested for significance)

6) Complex Associational Questions - 2 or more independent variables and 1 or more dependent variables. Usually 5+ ordered values for all variables, but some or all can be dichotomous variables.
   Statistics: Table 7.4 (multiple regression)

Difference versus associational inferential statistics. We think it is educationally useful, although not common in statistics books, to divide inferential statistics into two types corresponding to difference and associational hypotheses/questions. Difference inferential statistics are used for the experimental, quasi-experimental, and comparative approaches, which test for differences between groups (e.g., using analysis of variance). Associational inferential statistics test for associations or relationships between variables and use correlation or multiple regression analysis.⁵ We will utilize this contrast between difference and associational inferential statistics in chapter 7 and later in this book.

⁵ We realize that all parametric inferential statistics are relational, so this dichotomy of using one type of data analysis procedure to test for differences (when there are a few values or levels of the independent variables) and another type of data analysis procedure to test for associations (when there are continuous independent variables) is somewhat artificial. Both continuous and categorical independent variables can be used in a general linear model.

CHAPTER 3
Measurement and Descriptive Statistics

According to S.S. Stevens (1951), "In its broadest sense measurement is the assignment of numerals to objects or events according to rules" (p. 1). As we have seen in chapter 1, the process of research begins with a problem that is made up of a question about the relationship between two, or usually more, variables.
Measurement is introduced when these variables are operationally defined by certain rules which determine how the participants' responses will be translated into numerals. These numbers can represent nonordered categories in which the numerals do not indicate a greater or lesser degree of the characteristic of the variable. Stevens went on to describe four scales or levels of measurement that he labeled: nominal, ordinal, interval, and ratio. Stevens and most writers since then have argued that the level or scale of measurement used to collect data is one of the most important determinants of the types of statistics that can be done appropriately with that data. As implied by the phrase "levels of measurement," these types of measurements vary from the most basic (nominal) to the highest level (ratio). However, since none of the statistics that are commonly used in social sciences or education require the use of ratio scales, we will not discuss them to any extent.

Nominal Scales/Variables

These are the most basic or primitive forms of scales in which the numerals assigned to each category stand for the name of the category, but have no implied order or value. Males may be assigned the numeral 1 and females may be coded as 2. This does not imply that females are higher than males or that two males equal a female or any of the other typical mathematical uses of the numerals. The same reasoning applies to many other true nominal categories such as ethnic groups, type of disability, section number in a class schedule, or marital status (e.g., never married, married, divorced, or widowed). In each of these cases the categories are distinct and nonoverlapping, but not ordered; thus each category in the variable marital status is different from each other, but there is no necessary order to the categories.
Thus, the four categories could be numbered 1 for never married, 2 for married, 3 for divorced, and 4 for widowed, or the reverse, or any combination of assigning a number to each category. What this obviously implies is that you must not treat the numbers used for identifying the categories in a nominal scale as if they were numbers that could be used in a formula, added together, subtracted from one another, or used to compute an average. Average marital status makes no sense. However, if one asks a computer to do average marital status, it will blindly do so and give you meaningless information. The important thing about nominal scales is to have clearly defined, nonoverlapping or mutually exclusive categories which can be coded reliably by observers or by self-report.

Qualitative or naturalistic researchers rely heavily, if not exclusively, on nominal scales and on the process of developing appropriate codes or categories for behaviors, words, etc. Although using qualitative/nominal scales does dramatically reduce the types of statistics that can be used with your data, it does not altogether eliminate the use of statistics to summarize your data and make inferences. Therefore, even when the data are nominal or qualitative categories, one's research may benefit from the use of appropriate statistics. We will return shortly to discuss the types of statistics, both descriptive and inferential, that are appropriate for nominal data.

Dichotomous Variables

It is often hard to tell whether a dichotomous variable, one with two values or categories (e.g., Yes or No, Pass or Fail), is nominal or ordered, and researchers disagree. We argue that, although some such dichotomous variables are clearly nominal (e.g., gender) and others are clearly ordered (e.g., math grades: high and low), all dichotomous variables form a special case.
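A short sketch may help show why averaging nominal codes is meaningless and why two-category variables behave differently. Everything in it is an assumption made for illustration: the responses are invented, and `coding_a`, `coding_b`, and `passed` are hypothetical names, not variables from any data file used in this course.

```python
from collections import Counter
from statistics import mean

# Hypothetical marital-status responses, invented for illustration.
responses = ["never married", "married", "married",
             "divorced", "widowed", "married"]

# Two equally legitimate numeric codings of the same nominal categories.
coding_a = {"never married": 1, "married": 2, "divorced": 3, "widowed": 4}
coding_b = {"never married": 4, "married": 3, "divorced": 2, "widowed": 1}

# The "average marital status" depends entirely on the arbitrary coding.
mean_a = mean(coding_a[r] for r in responses)
mean_b = mean(coding_b[r] for r in responses)

# Counts and the mode do not depend on the coding at all.
counts = Counter(responses)
mode_label = counts.most_common(1)[0][0]

# A dichotomous variable coded 1/2 (here 1 = fail, 2 = pass, invented data):
# its mean minus 1 equals the proportion of cases coded 2.
passed = [1, 2, 2, 1, 2]
proportion_coded_2 = mean(passed) - 1

print(f"mean under coding A: {mean_a:.3f}, under coding B: {mean_b:.3f}")
print(f"most frequent category: {mode_label}")
print(f"proportion coded 2: {proportion_coded_2:.2f}")
```

The two codings give different means for identical responses, while the frequency counts and the mode stay the same. With only two categories coded 1 and 2, the mean carries usable information because it is simply 1 plus the proportion of cases in the category coded 2.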
Statistics such as the mean or variance would be meaningless for a three or more category nominal variable (e.g., ethnic group or marital status, as described above). However, such statistics do have meaning when there are only two categories. For example, in the HSB data the average gender is 1.55 (with males = 1 and females = 2). This means that 55% of the participants were females. Furthermore, we will see in Chapter 12, multiple regression, that dichotomous variables, called dummy variables, can be used as independent variables along with other variables that are interval scale. Thus, it is not necessary to decide whether a dichotomous variable is nominal, and it can be treated as if it were interval scale.

Table 3.1. Descriptions of Scales of Measurement With Dichotomous Variables Added

Scale          Description
Nominal        = 3 or more unordered or nominal categories
Dichotomous    = 2 categories, either nominal or ordered (special case)
Ordinal        = 3 or more ordered categories, but clearly unequal intervals between categories or ranks
Interval       = 3 or more ordered categories, and approximately equal intervals between categories
Ratio          = 3 or more ordered categories, with equal intervals between categories and a true zero

Ordinal Scales/Variables (i.e., Unequal Interval Scales)

In ordinal scales there are not only mutually exclusive categories as in nominal scales, but the categories are ordered from low to high in much the same way that one would rank the order in which horses finished a race (i.e., first, second, third, ... last). Thus, in an ordinal scale one knows which participant is highest or most preferred on a dimension, but the intervals between the various ranks are not equal. For example, the second place horse may finish far behind the winner but only a fraction of a second in front of the third place finisher.
Thus, in this case there are unequal intervals between first, second, and third place, with a very small interval between second and third and a much larger one between first and second.

Interval and Ratio Scales/Variables (i.e., Equal Interval Scales)

Interval scales have not only mutually exclusive categories that are ordered from low to high, but also the categories are equally spaced (i.e., have equal intervals between them). Most physical measurements (length, weight, money, etc.) are ratio scales because they not only have equal intervals between the values/categories, but also have a true zero, which means, in the above examples, no length, no weight, or no money. Few psychological scales have this property of a true zero, and thus even if they are very well constructed equal interval scales, it is not possible to say that one has no intelligence or no extroversion or no attitude of a certain type. While there are differences between interval and ratio scales, the differences are not important for us because we can do all of the types of statistics that we have available with interval data. As long as the scale has equal intervals, it is not necessary to have a true zero.

Distinguishing Between Ordinal and Interval Scales

It is usually fairly easy to tell whether three categories are ordered or not, so students and researchers can distinguish between nominal and ordinal data, except perhaps when there are only two categories, and then it does not matter. The distinction between nominal and ordinal makes a lot of difference in what statistics are appropriate. However, it is considerably harder to distinguish between ordinal and interval data.
While almost all physical measurements provide either ratio or interval data, the situation is less clear with regard to psychological measurements. When we come to the measurement of psychological characteristics such as attitudes, often we cannot be certain about whether the intervals between the ordered categories are equal, as required for an interval level scale. Suppose we have a five-point scale on which we are to rate our attitude about a certain statement from strongly agree as 5 to strongly disagree as 1. The issue is whether the intervals between a rating of 1 and 2, 2 and 3, 3 and 4, and 4 and 5 are all equal or not. One could argue that because the numbers are equally spaced on the page, and because they are equally spaced in terms of their numerical values, the subjects will view them as equal intervals. However, especially if the in-between points are identified (e.g., strongly agree, agree, neutral, disagree, and strongly disagree), it could be argued that the difference between strongly agree and agree is not the same as between agree and neutral; this contention would be hard to disprove.

Some questionnaire or survey items have response categories that are not exactly equal intervals. For example, let's take the case where the subjects are asked to identify their age as one of five categories: 21 to 30, 31 to 40, 41 to 50, 51 to 60, and 61 and above. It should be clear that the last category is larger in terms of number of years covered than the other four categories. Thus, the age intervals are not exactly equal. However, we would consider this scale and the ones above to be at least approximately interval.

On the other hand, an example of an ordered scale that is clearly not interval would be one that asked how frequently subjects do something. The answers go something like this: every day, once a week, once a month, once a year, once every 5 years. You can see that the categories become wider and wider and, therefore, are not equal intervals.
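The widening can be checked with quick arithmetic. The sketch below is only an illustration; it assumes, for simplicity, that a month is 30 days and a year is 365 days, and the variable names are hypothetical.

```python
# Approximate number of days between occurrences for each response category.
# Assumption for illustration: a month is 30 days and a year is 365 days.
days_between_occurrences = {
    "every day": 1,
    "once a week": 7,
    "once a month": 30,
    "once a year": 365,
    "once every 5 years": 5 * 365,
}

values = list(days_between_occurrences.values())

# Gap (in days) between each category and the next one on the scale.
gaps = [later - earlier for earlier, later in zip(values, values[1:])]

# An interval scale would need all of these gaps to be the same size.
intervals_equal = len(set(gaps)) == 1

print(gaps)
print(intervals_equal)
```

Under these assumptions the gaps grow at every step, so the codes 1 to 5 assigned to such answers preserve only their order, not their spacing.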
There is clearly much more difference between 1 year and 5 years than there is between 1 day and 1 week.

Most of the above information is summarized in the top of Table 3.2.

Table 3.2. Selection of Appropriate Descriptive Statistics for One Dependent Variable

                              Level/Scale of Measurement of Variable
                              Nominal                 Ordinal                 Interval or Ratio
Characteristics of            Qualitative data        Quantitative data       Quantitative data
the Variable                  Not ordered             Ordered data            Ordered data
                              True categories:        Rank order only         Equal intervals
                              only names, labels                              between values
Examples                      Gender, school,         1st, 2nd, 3rd place,    Age, height, good
                              curriculum type,        ranked preferences      test scores, good
                              hair color                                      rating scales
Frequency Distribution        Redhead  - III          Best   - II             5 - I
                              Blond    - IIII         Better - III            4 - II
                              Brunette - II           Good   - III            3 - III
                                                                              2 - III
                                                                              1 - II
Frequency Polygon/Histogram   No                      Yes                     Yes
Bar Graph or Chart            Yes                     Yes                     Yes
Central Tendency
  Mean                        No                      Mean Rank               Yes
  Median                      No                      Yes                     Yes
  Mode                        Yes                     Yes                     Yes
Variability
  Standard Deviation          No                      of Ranks                Yes
  Range                       No                      Yes, but*               Yes
  How many categories         Yes                     Yes                     Yes
  Percent in each             Yes                     Yes                     Yes
Shape
  Skewness                    No                      No                      Yes
  Kurtosis                    No                      No                      Yes

* The range of ordinal data may well be misleading.

Lesson […]
[…] WITH BULK DATA BEFORE THEIR ANALYSIS (FILES module: procedures), WORKING WITH THE ENVIRONMENT (Edit, View, Utilities modules) AND OUTPUT FROM THE ANALYSIS (Output module)

Starting SPSS for Windows

The easiest way to run SPSS for Windows is by using the Start button. During installation, the setup procedure adds SPSS to the menu that appears when you click the Start button, as shown in Figure 2.1. Always use the left mouse button unless the right one is specifically indicated.

[Figure 2.1: SPSS on the Start menu]
► Click Start to display the Start menu, then click SPSS.

The SPSS Data Editor window is displayed, as shown in Figure 2.2. You can move it, like any other window, by clicking and dragging its title bar, or resize it by clicking and dragging its sides or corners.

[Figure 2.2: SPSS Data Editor window, with the status bar labeled]

Opening a Data File

The SPSS Data Editor window displays your working data file. You don't have one yet, which is why the Data Editor is empty. If you have data of your own that are not in the computer yet, you can type the numbers right into the Data Editor. If the data are already in a spreadsheet or database file, you can probably read that file into SPSS.

The data used in this book are already in the form of SPSS data files. To use them for the exercises, or just to follow along in the analysis, simply open the appropriate data file.

To open a data file:

► Click the left mouse button on the word File on the SPSS Data Editor menu bar, as shown in Figure 2.3. The File menu is displayed.
► On the File menu, click Open.

[Figure 2.3: Opening a data file]

When you click Open on the File menu, the Open File dialog box appears, as shown in Figure 2.4.

[Figure 2.4: Open File dialog box. Callout: the up-folder icon; click it to search for the file on a different drive or in a different folder.]

► Click the gss.sav data file where it appears in the list.
► Click Open.

What if the gss.sav file doesn't appear? Only files in the current drive and directory are listed. The file you want may either be in another directory or saved on a different drive. To look in a parent folder (one that contains the current folder), click the up-folder icon, as shown in Figure 2.4.
To look in a subfolder (one contained in the current folder), double-click it in the list. To look on a different drive, click the up-folder icon repeatedly until you reach My Computer, then double-click the desired drive icon and continue down through the folder hierarchy on that drive.

When SPSS has finished reading the data file, it displays the data in the Data Editor, as shown in Figure 2.5. This particular data file contains selected information for 1500 people who were interviewed in the 1993 General Social Survey, which annually asks a broad range of questions to a sample of adults in the United States population.

[Figure 2.5: Data Editor window with GSS data]

To view the data in the Data Editor, from the menus choose:

    Window
      gss - SPSS Data Editor

If your screen displays numbers rather than value labels such as Male and Female in the cells, from the menus choose:

    View
      Value Labels

[Figure 2.7: Frequencies dialog box with default variable labels]

To make this book easier to read, we'll use variable names instead of labels in dialog boxes, as shown in Figure 2.7. To display variable names rather than labels in your dialog boxes (so you can follow along with the text), you need to change one of SPSS's default options.

► From the menus select:

    Edit
      Options...

► In the Options dialog box, click the General tab.
► In the Variable Lists group box, click Display names.
► Click OK.

This change doesn't take effect until the next time you open a data file.
The effect of the changed option is shown in Figure 2.8.

[Figure 2.8: Frequencies dialog box, showing variable names such as opera, pacolleg, and paeduc. Callouts: a variable selected in the source list; the button you click to move variables between lists.]

To use this dialog box:

► Click happy in the scroll list and then click the arrow button. This moves happy into the Variable(s) list.
► Scroll down the source list until you see postlife and move it into the Variable(s) list as well. As a shortcut to scroll the source list, click in the list and type the letter p. This scrolls to the first variable beginning with p.
► Click Charts.

This opens the Frequencies Charts dialog box, as shown in Figure 2.9. Here you can request charts along with your frequency tables.

[Figure 2.9: Frequencies Charts dialog box. Callout: select Bar charts.]

► Select Bar charts, as shown in Figure 2.9.

The Viewer Window

The Viewer window is where you see the statistics and graphics (the output) from your work in SPSS. As shown in Figure 2.10, the Viewer window is split into two parts, or panes. (A piece of a window is often called a pane in computer software, just as it is at your local hardware store.)
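For readers who want to see what a frequency table boils down to, here is a minimal Python sketch that counts each category and computes its percentage of all responses. It is not SPSS output and does not reproduce the Frequencies procedure; the function name `frequency_table` is hypothetical, and the response values for happy are invented for illustration rather than taken from gss.sav.

```python
from collections import Counter


def frequency_table(values):
    """Return (category, count, percent) rows, most frequent category first."""
    counts = Counter(values)
    total = len(values)
    return [
        (category, n, round(100 * n / total, 1))
        for category, n in counts.most_common()
    ]


# Invented responses standing in for the variable happy (not real GSS data).
happy = [
    "Very happy", "Pretty happy", "Pretty happy", "Not too happy",
    "Pretty happy", "Very happy", "Pretty happy", "Not too happy",
    "Pretty happy", "Very happy",
]

for category, count, percent in frequency_table(happy):
    print(f"{category:<15}{count:>5}{percent:>8.1f}%")
```

A bar chart of the same variable would simply draw one bar per row, with the bar height equal to the count or the percent.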