Presenting your Results

In this chapter, we will take you through some techniques for presenting your results to a scientific or policy audience. This part of research is given far less attention than it should be. It is one thing to produce a bunch of statistical findings - it is quite another to be able to communicate them clearly to your peers and to a non-technical audience. It takes quite a bit of practice to be able to do this effectively. We will use a worked example throughout this chapter to illustrate how we go about starting with a research question, developing hypotheses around the research question, presenting statistical results, and creating discussion about the results of our statistical tests. Throughout the worked example, we are using a different data set than in previous chapters but still from the first (1991) wave of the British Household Panel Survey, which can be obtained from the UK Data Archive (www.data-archive.ac.uk).

DECIDING ON A RESEARCH QUESTION

The first thing that you need to do when you are undertaking research is to decide on a feasible research question. The feasibility of a research question depends on many things, including your interests, the time and funding that you have, and your skill set. You may have to narrow a very wide topic to something more specific if you have a limited amount of time in which to produce results. Or you may have to modify a research question if you don't have the analytic skills to answer your original question (provided you don't have the time or desire to learn the new skills whilst answering your research question!).

Suppose you are interested in attitudes towards gender roles. A very broad research question would be, 'What determines attitudes towards gender roles?'
But in reality, it is likely that the answer to this question would require very detailed information about individuals' environments when they were growing up, as well as detailed information about their parents' beliefs and behaviours. It may be the case that you don't have such detailed information. You may want to narrow your research question to something more specific, such as, 'How do adults' characteristics influence their attitudes about gender roles?'

REVIEWING THE LITERATURE

Before undertaking any study, it is important to first review the existing literature on your general topic. Conducting a literature review is beyond the scope of this book, but one of the authors has written about this task elsewhere (see Neuman and Robson 2008). A quick search would show you that Burt and Scott (2002), Fortin (2005) and McDaniel (2008) have all published studies that examine gender role attitudes. Careful review of these articles and others would help you become familiar with theories in this area of study, and the findings of these studies would assist you in developing testable hypotheses.

DEVELOPING HYPOTHESES

Hypotheses come from three general places: theory, previous research, and exploration. Theory and previous research can obviously guide your expectations about what you might find in your data. In many cases, however, researchers working in new areas may not have previous research or a suitable theory to draw from and therefore might undertake exploratory analysis to uncover patterns in the data. Sometimes scientists use hypotheses that are derived from a combination of theory and research, and also have additional hypotheses that are exploratory.

From the literature, we would be able to make the following hypotheses:

H1: Women will have more liberal gender role attitudes than men.
H2: Younger people will have more liberal gender role attitudes than older people.
H3: Married people will have less liberal gender role attitudes than other marital statuses.
H4: Religious people will have less liberal gender role attitudes than non-religious people.
H5: Education will be positively associated with liberal gender role attitudes.
H6: Ethnic minorities, particularly Asians, will be less liberal than White respondents.
H7: There will be an interaction between sex and income such that there is a stronger positive association between income and liberal gender role attitudes for women.

We will also include an exploratory hypothesis to test in our analyses:

H8: There will be an interaction between sex and marital status on gender role attitudes.

EXPLORING THE DATA AND SELECTING MEASURES

Our analyses here are going to be based on the same survey that we have been using for the earlier chapters of the book. In 'real life', you may have a choice of data sets from which you could select the data on which you want to test your hypotheses. Or you may have collected your own data for the express purpose of answering a set of research questions.

From the above research questions, we will need measures of: gender role attitudes, sex, age, ethnicity, marital status, religiosity, education, and income. Recall from Chapter 3 that we constructed a scale that assessed attitudes towards gender roles. After some analysis, it was determined that the following items would be kept in the scale:

opfama: A pre-school child is likely to suffer if his or her mother works.
opfamb: All in all, family life suffers when the woman has a full-time job.
opfamc: A woman and her family would all be happier if she goes out to work.
opfamd: Both the husband and wife should both contribute to the household income.
opfame: Having a full-time job is the best way for a woman to be an independent person.
opfamf: A husband's job is to earn money; a wife's job is to look after the home and family.
opfamh: Employers should make special arrangements to help mothers combine jobs and childcare.
opfami: A single parent can bring up children as well as a couple.

The response categories for all items were: 1, strongly agree; 2, agree; 3, neither agree nor disagree; 4, disagree; and 5, strongly disagree. We reverse coded items opfamc, opfamd, opfame, opfamh and opfami and then added all the items together to give a scale with a minimum of 8 and a maximum of 40. People who score 8 express very conservative attitudes, while those around the 40 mark would be very liberal. We could have used the alpha command, but, as shown in Chapter 3, it rescales the variables and their values become less intuitive (although it does produce mathematically the same scale). In this chapter, we will call this variable genderroles.

We know from previous chapters that we have a dummy variable that measures sex called female, a variable measuring age called age, a marital status variable called mastat, and a variable that measures monthly income called fimn. We also have a seven-category variable that measures education, called qfachi, as well as a dummy variable that indicates if a respondent was active in a religious group, which is called activerel. As this example involves adults' characteristics we restrict the sample to those 18 and over.

    keep if age>17

UNIVARIATE ANALYSIS

Before undertaking a detailed analysis, it is important that we get our hands dirty with the data and really get familiar with the variables of interest. All analyses should begin at the univariate (i.e. one-variable) level. We cannot overemphasize that it is important to get to know your variables before you throw them into more complex analyses. In real life, you should also be aware of any sampling issues that are present in your data (i.e. do you need to include any weights to adjust for sampling?).
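The construction of genderroles described earlier in this section (reverse code five items, then sum all eight) can be sketched as below. This is a minimal illustration rather than the exact code from Chapter 3: the `_r` suffix for the reverse-coded copies is a hypothetical naming choice, and the sketch assumes that survey missing-value codes have already been converted to Stata missing values.

```stata
* Sketch only. Assumes the eight items are named opfama ... opfami and
* that missing-value codes have already been set to Stata missing (.).
foreach v of varlist opfamc opfamd opfame opfamh opfami {
    * hypothetical name `v'_r: on a 1-5 item, 6 - x maps 1<->5, 2<->4, 3->3
    gen `v'_r = 6 - `v'
}

* Adding with + returns missing if any one item is missing, so only
* respondents who answered all eight items receive a score (range 8-40).
gen genderroles = opfama + opfamb + opfamf + ///
    opfamc_r + opfamd_r + opfame_r + opfamh_r + opfami_r

* Stata treats missing as larger than any number, so exclude it explicitly.
keep if age > 17 & !missing(age)
```

Because higher values on every item now point in the liberal direction, the summed score can be read directly: 8 is the most conservative possible response pattern and 40 the most liberal.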
Don't forget to specify your missing values! We can check our variables by running a summarize command on our dichotomous and interval variables:

    su genderroles female age fimn activerel

        Variable |   Obs       Mean   Std. Dev.   Min     Max
     genderroles |  9188   25.50424   4.758196      8      40
          female |  9920   .5333669   .4989106      0       1
             age |  9920   45.49526   18.02041     18      97
            fimn |  9582   758.1702   742.0371      0   11297
       activerel |  9572   .1019641   .3026169      0       1

We can see that the mean of genderroles is 25.50 with a standard deviation of 4.76. The sample is about 53% female, and the average age is 45.50 years. As well, the average monthly income (fimn) is £758.17 and about 10% of the respondents are actively involved in religious groups (activerel).

We will now tabulate the categorical variables in our data set.

    ta mastat

      marital status |   Freq.   Percent     Cum.
             married |   6,009     60.57    60.57
    living as couple |     670      6.75    67.33
             widowed |     866      8.73    76.06
            divorced |     434      4.38    80.43
           separated |     189      1.91    82.34
       never married |   1,752     17.66   100.00
               Total |   9,920    100.00

We can see that the majority of sample members are married (just over 60%), with the next largest category being never married (about 18%).

    ta qfachi

    highest academic |
       qualification |   Freq.   Percent     Cum.
       higher degree |     122      1.28     1.28
          1st degree |     598      6.25     7.53
    hnd,hnc,teaching |     496      5.18    12.71
             a level |   1,349     14.10    26.81
             o level |   2,320     24.25    51.06
                 cse |     469      4.90    55.96
       none of these |   4,213     44.04   100.00
               Total |   9,567    100.00

The largest category in the variable measuring highest academic qualification (qfachi) is 'none of these', which can be interpreted as having only compulsory schooling or less. Just over 7% of the sample had a university degree or higher.

    ta race

        ethnic group |
          membership |   Freq.   Percent     Cum.
               white |   9,196     96.15    96.15
         black-carib |      65      0.68    96.83
       black-african |      42      0.44    97.27
         black-other |      26      0.27    97.54
              indian |      99      1.04    98.58
           pakistani |      42      0.44    99.02
         bangladeshi |       6      0.06    99.08
             chinese |       9      0.09    99.17
    other ethnic grp |      79      0.83   100.00
               Total |   9,564    100.00

In terms of ethnic group membership (race), we can see that over 96% of the sample is White. Some of the categories, like 'Black other', 'Bangladeshi', and 'Chinese', are also very small: 26, 6, and 9, respectively. We need to think if there are ways of collapsing the categories so that we do not have problems with this variable later. If we try to make a number of dummy variables out of this variable the way it is currently coded, we will run into problems with the smaller groups - they will be associated with a lot of 'error' (indicated by large standard errors), or the estimation techniques will simply kick them out of the estimation procedure due to collinearity problems.

There are always debates around how to 'best' collapse ethnic group categories, and there is no one best way. Here, we are going to group all the 'Black' categories together, create a single group for Indian, Pakistani, and Bangladeshi called 'Asian', and group 'Chinese' with 'other'. Of course, the Asian group masks the differences between Muslim and non-Muslim Asians, and creating the single category 'Black' also loses the major cultural differences between Caribbean Blacks and African Blacks. Also, putting Chinese with 'Other' simply loses the uniqueness of the Chinese in a very heterogeneous and basically undefined group. But in real-life research, such decisions must be made.

    gen race2=race
    recode race2 3=2 4=2 5=3 6=3 7=3 8=4 9=4
    lab def race2 1 "white" 2 "black" 3 "asian" ///
        4 "other"
    lab val race2 race2

As a final step, we check our new variable to make sure the recoding was done properly.

    tab race2

          race2 |   Freq.   Percent     Cum.
          white |   9,196     96.15    96.15
          black |     133      1.39    97.54
          asian |     147      1.54    99.08
          other |      88      0.92   100.00
          Total |   9,564    100.00

When reading academic articles and reports, the first table that you often see is a table of descriptive statistics. It is a good idea to include such a table to give the reader some indication of the characteristics of your sample. However, notice that in the previous tables, the N differs quite a bit. For age and female, there are 9,920 observations, but there are fewer for genderroles, fimn (9,582) and activerel (9,572).

Why are the numbers of observations different for the variables? This is due to people not answering survey items. As the variable genderroles is a composite score of eight items, a person had to have answered every one of the eight items to be included in the scale - at least that is how we constructed it here (see Chapter 3 for alternative techniques that allow for individuals to be missing on one or more of the items). And some people simply don't like to answer certain types of questions, such as those concerning income or religious beliefs.

Because of these differing N sizes, it is better to wait to produce our final table of descriptive statistics (i.e. the one to include in our report) until after we have done our multivariate estimations. This is because after we do regressions, we can get the descriptive statistics for our 'estimation sample' - that is, the sub-sample of cases that have data on all our variables of interest.

We also need to check the distribution of our dependent variable so that we know its properties for when we want to conduct multivariate analyses. We can check this visually with the histogram command.

    histogram genderroles, discrete
    (start=8, width=1)
[Figure: histogram of genderroles; x-axis: genderroles, 10 to 40]

From the histogram we can see that our dependent variable of interest is reasonably normally distributed. It is surprising to see how 'normal' it is, as it is remarkable how few interval variables (in our experience at least) display such tidy distributive characteristics.

BIVARIATE TESTS

Before we directly test our hypotheses, we should undertake some bivariate tests. One of the assumptions of many multivariate techniques is that the independent variables are not highly correlated with one another. We can check this assumption with the corr command. We have three categorical variables in our analysis - qfachi, mastat and race2. We cannot simply correlate these variables with the others because the numbers associated with their categories are nominal. We need to convert these variables to sets of dummy variables. We can do this with the xi: command.

    xi: su i.qfachi i.mastat i.race2

    i.qfachi    _Iqfachi_1-7    (naturally coded; _Iqfachi_1 omitted)
    i.mastat    _Imastat_1-6    (naturally coded; _Imastat_1 omitted)
    i.race2     _Irace2_1-4     (naturally coded; _Irace2_1 omitted)

[Output: Obs, Mean, Std. Dev., Min and Max for the dummies _Iqfachi_2 to _Iqfachi_7 (9567 observations each), _Imastat_2 to _Imastat_6 (9920 each) and _Irace2_2 to _Irace2_4 (9564 each); every dummy ranges from 0 to 1]

You can see that this has created a set of dummies for qfachi, mastat and race2. This process by default drops the lowest coded category as the reference category. However, for the corr command, we will need all categories for qfachi, mastat, and race2 to be dummy coded. We can make the ones corresponding to category 1 for each variable manually. In the case of mastat, category 1 corresponds to being married; for qfachi, respondents in category 1 have a higher degree; for race2, category 1 corresponds to being White.

    gen married=mastat==1
    gen higherdeg=qfachi==1
    gen white=race2==1

Now we can create a correlation matrix (partial table shown) for all the variables:

    corr genderroles age female married _Im* ///
        higherdeg _Iq* fimn activerel _Irace2* white

[Output: correlation matrix, partial table]

In the correlation matrix, we look for correlations that are higher than about 0.60. We want variables to be correlated with the dependent variable. Because we put genderroles first in our list of variables, the correlations with it will be in the first column. What we are trying to spot is if the correlations between our independent variables are of concern. In the full matrix (not presented) we observe that having low education (_Iqfac~7) is quite strongly correlated with age (0.4877). Of course, the categories of a variable converted to dummies will be correlated with each other, often quite highly. In this example being White and being Asian (_Irace~3) are correlated at -0.6211 (not shown). Apart from these unavoidable correlations between the dummies, there is nothing that raises alarm in this correlation matrix.

If there were a large correlation of, say, 0.70 between age and fimn, for example, we would have to make a decision about dropping one of these variables, as multivariate estimation techniques would not be able to properly capture the individual effects of variables that are so highly correlated.
Substantively, they are obviously different, but if they are correlated so highly, they are not 'mathematically' different enough for Stata (or any other software program, or even hand calculation for that matter!) to tell them apart.

Note that it was quite 'in fashion' to publish correlation matrices up until about 10 years ago. Now it is very rarely done. If you are writing a technical report, you may want to include such a matrix in an Appendix, but nowadays it is rarely a main part of a scholarly social sciences paper.

We conclude our bivariate tests with some scatterplots.

    scatter genderroles age, msymbol(point) jitter(3)

[Figure: scatterplot of genderroles (y-axis, 10 to 40) against age (x-axis, 20 to 100)]

The scatterplot between genderroles and age reveals that there is evidence of a downward (negative) linear association. We can also add a linear fit line to display this association:

    scatter genderroles age, msymbol(point) ///
        jitter(3) || lfit genderroles age

[Figure: scatterplot of genderroles against age with fitted values line; legend: genderroles, Fitted values]

Now we look at the relationship between genderroles and total income (fimn):

    scatter genderroles fimn

[Figure: scatterplot of genderroles against fimn; x-axis: total income: last month, 0 to 10000]

Immediately you can see that the association between these two variables does not look strongly linear. And there are a number of outliers - one in particular in excess of £10,000. We need to examine this income variable more closely.

    su fimn, detail

    total income: last month
    Percentiles   50% 550   75% 1023.333   90% 1602.562   95% 2006.54   99% 3447.667
    Largest       8628.875   8716.667   9455.773   11297
    Obs 9582      Sum of Wgt. 9582
    Mean 758.1702     Std. Dev. 742.0371     Variance 550619
    Skewness 3.412333     Kurtosis 27.19502
    [lower percentiles and smallest values illegible in source]

We know that income variables are often highly skewed - and the details provided from the su output reveal this. The mean (758) and median (550) are very different, and the skewness (3.41) and kurtosis (27.20) are also very large. In the previous chapter, we took the natural logarithm of income to help normalize it, as it was our dependent variable. We could do this here too, but transforming income is less often done when it is an independent variable, because normality matters less for an independent variable than for a dependent one. The multivariate tests that are commonly used are robust enough to cope with non-normally distributed independent variables. What we should do, however, is examine what happens if we eliminate one of the bigger outliers:

    corr genderroles fimn
    (obs=9188)

                 | genderroles     fimn
     genderroles |      1.0000
            fimn |      0.0278   1.0000

    corr genderroles fimn if fimn [condition illegible in source]

[Text and output illegible in source: the remainder of the bivariate tests and the opening of the multivariate tests, including regression output with interaction terms, could not be recovered. One surviving fragment notes that dummy variables are a special case of categorical variables (see Chapter 3).]

You can see here that the results are much tidier, with no 'dropped' messages. This solves the problem of how to tidy up your output with an interaction of a dummy variable with numerous predictors. If the variable you want to interact with numerous other predictors is a categorical variable with numerous categories, however, there is no 'quick fix' to get Stata to stop inserting the main effects several times. There is no 'error' in the results - you will just have to remove the 'dropped' comments manually when you are creating your tables.
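For readers on a more recent release, one possible way around the repeated main effects is factor-variable notation, which replaces the xi: prefix altogether. This is a different technique from the one used in this chapter and is offered only as a sketch: it assumes Stata 11 or later, the coefficient names in the output will not have the _I prefix that the rest of the chapter refers to, and the model below is an illustration built from this chapter's variable names rather than a replacement for any model in the text.

```stata
* Sketch only, assuming Stata 11 or later (factor-variable notation).
* ## requests main effects plus the interaction; c. marks a continuous
* variable. female appears in two interaction terms, but its main effect
* is entered only once, so no 'dropped' notes are produced.
regress genderroles age i.female##i.mastat ///
    i.female##c.fimn activerel
```

Output from this form labels the interaction rows with the category values (for example, a row for female 1 crossed with mastat 2), so tables built from it need less manual tidying than the xi: output.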
To test all of our hypotheses in one model, we can use the command:

    xi: regress genderroles age i.female*i.mastat ///
        i.qfachi i.female|fimn activerel i.race2

[Output: regression results; figures illegible in source]

The information at the top of the output tells us that category 1 in mastat has been omitted, as have the categories 1 for qfachi and 1 for race2. These correspond to being married, having a higher degree, and being White. If the categories that are omitted seem reasonable for testing our hypotheses, you can just leave them. But if it seems more logical to change the reference category, use the char command. One of our hypotheses is that education will be positively associated with liberal gender role attitudes, so it probably makes more sense to have the lowest education coded as the reference category, because if support for our hypothesis is found with the default reference category, our coefficients will all be negative - which isn't 'wrong', but less intuitive.

    char qfachi[omit] 7
    xi: regress genderroles age i.female*i.mastat ///
        i.qfachi i.female|fimn activerel i.race2

[Output and the first part of its interpretation illegible in source]

The overall effect of income (fimn) was positive and non-significant, with each unit (£1) increase in income being associated with an increase in genderroles of 0.00007. This is a very small coefficient, but this is due to the way the variable fimn is measured. The interaction between female and fimn (_IfemXfimn_1) was significant, though. This suggests that as females earn more, they tend to have higher scores on genderroles, and that this is statistically different from the effect that fimn has on males' gender role attitudes.
We will explore this interaction in more depth later in this chapter.

The relationship between being active in a religious group (activerel) and genderroles is in the expected direction. Compared to respondents who were not active in a religious group, being active in a religious group was associated with a 1.306 decrease in genderroles. Finally, in terms of ethnic group membership, we find that relative to Whites, being Black (_Irace2_2) is associated with an increase in genderroles, being Asian (_Irace2_3) with a decrease in genderroles, and there is no significant difference for 'other' (_Irace2_4).

Let us review our hypotheses and see what we can determine:

H1: Women will have more liberal gender role attitudes than men: supported.
H2: Younger people will have more liberal gender role attitudes than older people: supported.
H3: Married people will have less liberal gender role attitudes than other marital statuses: supported.
H4: Religious people will have less liberal gender role attitudes than non-religious people: supported.
H5: Education will be positively associated with liberal gender role attitudes: somewhat supported.
H6: Ethnic minorities, particularly Asians, will be less liberal than White respondents: somewhat supported.
H7: There will be an interaction between sex and income such that there is a stronger positive association between income and liberal gender role attitudes for women: supported.
H8: There will be an interaction between sex and marital status on gender role attitudes: supported.

We have found at least some support for all of our hypotheses. In real life, it is very rare to find support for all your hypotheses. Even when you fail to find support for your hypotheses, such a 'non-finding' can be a finding in itself (but again, in 'real life' it is actually quite difficult to publish such findings, unfortunately).
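Because H8 concerns a whole set of interaction dummies rather than a single coefficient, one complementary check is a joint test of all the female by marital status terms. The sketch below is an addition to the chapter's workflow, not part of it. It assumes that xi has named the interaction dummies with the stem _IfemXmas_ (by analogy with the _IfemXfimn_1 name reported above); the actual names can be confirmed with `describe _I*` before running the test.

```stata
* Sketch only: the stem _IfemXmas_ is an assumption about how xi has
* named the female x mastat dummies; confirm with: describe _I*
quietly xi: regress genderroles age i.female*i.mastat ///
    i.qfachi i.female|fimn activerel i.race2

* Joint F test that all female x marital status interaction
* coefficients are zero
testparm _IfemXmas_*
```

A small p-value from this test would indicate that the marital status effects differ by sex taken as a set, which is a slightly different statement from any single interaction coefficient being significant.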
If we are satisfied with our models and don't want to make any further adjustments, we can now start thinking about making tables and graphs for our paper.

Making tables

Recall that, earlier in the chapter, we urged you not to make tables of descriptive statistics until you have run your final model and have an 'estimation sample'. We can get descriptive statistics now using the if e(sample) option or by keeping only the cases in the final regression by using the command keep if e(sample). It is very important to note that the following command must be used immediately after the final regression you are using, because it is only the last estimates that are stored in memory.

    gen noqual=qfachi==7
    xi: su genderroles age female i.mastat married ///
        i.qfachi noqual fimn activerel i.race2 white if e(sample)

    i.mastat    _Imastat_1-6    (naturally coded; _Imastat_1 omitted)
    i.qfachi    _Iqfachi_1-7    (naturally coded; _Iqfachi_7 omitted)
    i.race2     _Irace2_1-4     (naturally coded; _Irace2_1 omitted)

[Output: summary statistics for each variable; figures illegible in source]

You will notice now that the N for all the variables is 8692, which is the same as the N for our regression model. If you just copy and paste the output into a Word document, as we have done, it doesn't really resemble a journal quality table. You will have to highlight the table in the Results window and then go to Edit -> Copy table.
Now open a blank Word document and press Paste. The result will be ugly. Now, in Word, select the contents of the table then go to Table -> Convert Text to Table and the following dialogue box will appear. More often than not, it is 'smart' enough to guess the number of columns and rows that you want. Check that these are correct then click OK.

[Figure: Convert Text to Table dialogue box]

Then you should get this:

[Figure: the converted table in Word]

Some may prefer to copy the table into Excel to do the initial formatting rather than Word. With a bit of tidying up, we have the following table:

Table of descriptive statistics (N = 8692)

    Variable                        Mean   Std. Dev.   Min    Max
    Dependent variable
      Gender roles                25.545       4.755     8     40
    Independent variables
      Age                         45.102      17.995    18     97
      Female                       0.561       0.496     0      1
      Marital status
        Living as a couple         0.070       0.255     0      1
        Widowed                    0.086       0.280     0      1
        Divorced                   0.045       0.207     0      1
        Separated                  0.020       0.139     0      1
        Single                     0.177       0.382     0      1
        Married                    0.602       0.489     0      1
      Educational attainment
        Higher degree              0.009       0.092     0      1
        University degree          0.055       0.228     0      1
        HND, HNC, teaching         0.050       0.218     0      1
        A levels                   0.139       0.346     0      1
        O levels                   0.250       0.433     0      1
        CSE                        0.052       0.221     0      1
        No qualifications          0.446       0.497     0      1
      Income                     649.749     479.790     0   2005
      Active in religious group    0.100       0.300     0      1
      Ethnicity
        Black                      0.013       0.111     0      1
        Asian                      0.014       0.117     0      1
        Other                      0.009       0.095     0      1
        White                      0.964       0.185     0      1

The next table you will want to produce is a table of your regression findings. Let us run the regression again to make sure it is the most recent thing in Stata's memory.

    xi: regress genderroles age i.female*i.mastat ///
        i.qfachi i.female|fimn activerel i.race2

We will now use the command esttab to make a regression table. You might have to install it first. If so, type

    findit esttab

and follow the instructions. There is also a useful online tutorial by the author of esttab at http://repec.org/bocode/e/estout/index.html (Jann 2005, 2007). If you just type esttab after the regression, you will get the following, partial, output in your Results window:

[Output: partial esttab table]

As you can see, the results have the unstandardized coefficient and the t statistic (in parentheses). At the bottom of the output, there is a note about the t statistics being in parentheses and that the stars correspond to * p<0.05, ** p<0.01, *** p<0.001.

Understanding interactions

We have two statistically significant interactions: between female and a category of mastat and between female and income (fimn). It is often useful, particularly if your audience is non-technical, to have more information about what your interaction means. Because both of our interactions are with female, what the regression coefficients are telling us is that the slopes for males and females on these variables are significantly different from one another.

One useful way of getting to the bottom of the interaction is to run the model separately for the groups in the interaction term. So, for example, we can run the models separately for men and women. Instead of just pasting the output for the separate estimations below, we are going to save the results and make a table with both of them using the estimates store and esttab commands.

First, we run a regression for only females (remembering to take out the interactions):

    xi: regress genderroles age i.mastat ///
        i.qfachi fimn activerel i.race2 if female==1

We then get Stata to store these results as a model called 'female'.

    estimates store female

We then run the same model on males:

    xi: regress genderroles age i.mastat ///
        i.qfachi fimn activerel i.race2 if female==0

We store the results as a model called 'male'.
    estimates store male

We then use esttab to create a table with results by requesting that models 'female' and 'male' be displayed, with variable labels (label), standard errors (se), adjusted R² (ar2), and with only the set of marital status variables (*mastat*) and income (fimn) displayed using the keep option (as these are the coefficients we are interested in comparing). We write mtitles so that each model is given the name we stated above (i.e. 'female' and 'male'). If we don't specify it, the dependent variable would appear instead. We are using the option replace in case we want to rerun the models for whatever reason. This option overwrites any existing files with the same name. If we wanted to fix any mistakes and we hadn't written replace, Stata would return the following message:

    file interaction.rtf already exists
    r(602);

    esttab female male using interaction.rtf, label se ar2 ///
        keep(*mastat* fimn) mtitles replace
    (output written to interaction.rtf)

After clicking on the active link, we obtain the following table:

                                       (1)            (2)
                                    female           male
    mastat==2                       0.574*       1.566***
                                   (0.268)        (0.277)
    mastat==3                     0.858***         0.0882
                                   (0.227)        (0.390)
    mastat==4                      0.890**         0.896*
                                   (0.287)        (0.386)
    mastat==5                      1.182**          0.262
                                   (0.404)        (0.627)
    mastat==6                     0.878***        0.537**
                                   (0.196)        (0.203)
    total income: last month    0.00211***    0.000000752
                                (0.000177)     (0.000162)
    Observations                      4876           3816
    Adjusted R²                      0.165          0.150
    Standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001

Graphing interactions

Interactions presented in a table of regression results are difficult to interpret and not very intuitive. So it is useful to visually display what they are telling you.
Understanding and graphing interaction terms between a categorical variable and an interval variable is considerably easier than getting to grips with an interaction between two categorical variables. So, we'll start with the interaction between sex and income. First run the estimation:

    xi: regress genderroles age i.female*i.mastat ///
        i.qfachi i.female|fimn activerel i.race2

Then request predicted/fitted values of genderroles:

    predict xb

We will show you two different ways to graph the interaction. The graphs are slightly different, but both show substantively that, for women, as income increases so too do their liberal gender role attitudes, whereas for men there is little, if any, effect, as shown by the flat line.

First, we graph the predicted values (xb) against income (fimn) separately for males and females, using a linear fit (lfit) graph. We use a linear fit so that a single line is plotted. Not using this option and simply requesting a line graph would result in a crazy looking graph resembling a large scribble. The legend option tells Stata to label the lines as 'Females' and 'Males'.

    twoway (lfit xb fimn if female==1) ///
        (lfit xb fimn if female==0), ///
        legend(order(1 "Females" 2 "Males"))

[Figure: linear fit of the predicted values against total income: last month, with separate lines for females and males]

Alternatively, we could use the xi3 and postgr3 combination of commands introduced in Chapter 8. One issue with this method is that to get accurate predictions of the regression model a variable can only be in one interaction term. In our previous model female was in two interaction terms so, for this graph, we remove the i.female*i.mastat term and run the regression model. The postgr3 command produces the graph below which tells us pretty much the same as the one above.
    xi3: regress genderroles age i.mastat ///
        i.qfachi i.female*fimn activerel i.race2
    postgr3 fimn, by(female)

[Figure: postgr3 graph of predicted genderroles against total income: last month, by female]

Box 9.2: Doing commands 'quietly'

If you want to rerun a regression just to make sure you have the right estimates in the memory but don't want to see the results you can prefix the command with qui: (short for quietly:).

[...] genderroles for males is over twice the size of the effect for females. Being widowed, on the other hand, has a very large effect for women (0.858), but no statistically significant effect for men. Similarly, being separated has a larger effect on women (1.182) than men (0.262).

While the coefficients for these marital statuses look rather different for males and females, the lack of their statistical significance in an interaction suggests that their slopes are not significantly different from one another. This is likely to be due to the smaller number of cases when you break down the separated and divorced categories by sex:

    ta mastat sex if e(sample)

                      |         sex
       marital status |     male    female |    Total
    ------------------+--------------------+---------
              married |    2,370     2,865 |    5,235
     living as couple |      292       317 |      609
              widowed |      139       608 |      747
             divorced |      129       261 |      390
            separated |       47       124 |      171
        never married |      839       701 |    1,540
    ------------------+--------------------+---------
                Total |    3,816     4,876 |    8,692

Interactions between dummy coded categorical variables are quite tricky to understand and even more tricky to present in a meaningful way. This is mainly because the regression coefficients are not really 'slopes' but differences between groups. An interaction between a categorical variable and an interval variable, as above, clearly shows a difference in the slopes for the effect of income on gender roles for men and women.
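To spell out what a difference in slopes means for the sex and income interaction, the relevant part of the model can be written as follows. This is only a sketch in our own notation: the β symbols and the shorthand 'other terms' are assumptions for illustration, not Stata output, and the marital status interaction is folded into 'other terms' (the block assumes the amsmath package).

```latex
\begin{align*}
\widehat{y} &= \beta_0 + \beta_1\,\text{female} + \beta_2\,\text{fimn}
  + \beta_3\,(\text{female}\times\text{fimn}) + \text{other terms} \\
\left.\frac{\partial \widehat{y}}{\partial\,\text{fimn}}\right|_{\text{female}=0} &= \beta_2 \\
\left.\frac{\partial \widehat{y}}{\partial\,\text{fimn}}\right|_{\text{female}=1} &= \beta_2 + \beta_3
\end{align*}
```

Here the interaction coefficient β3 is the difference between the income slope for women and the income slope for men. A significant β3 is what appears in a graph as two fitted lines that are not parallel.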
Rather than talk about differences in differences, we shall call the coefficients for the marital status categories 'slopes' [...]. Let us look at the regression results again:

[Output: regression results]

We can see that there is a main effect for the female variable where women, on average, report more liberal attitudes to gender roles than men. Then three of the marital status categories have significant main effects in that those living together, those separated, and those who have never been married all have, on average, more liberal attitudes to gender roles. The one significant interaction term between sex and marital status is _IfemXmas_~2 which is for the 'living together' category. As marital status categories are dummy coded with married as the reference category, this interaction tells us that the 'slope' between 'married' and 'living together' categories is different for men and women. None of the other interaction terms are significant, which tells us that the 'slope' between those categories and being married is not significantly different for men and women. These results do not tell us if those who are divorced are different from those who are separated. This is a drawback with using dummy coding, and some prefer to use effect coding to get round this issue of choosing a reference category. See Chapter 8 for an example of effect coding.

At the risk of being redundant, let's have a look at this using some graphs. We have done these graphs using the postgr3 command and then using the Graph Editor to show you what is possible in the Editor.

    postgr3 mastat, by(female)

The basic postgr3 command for the sex and marital status interaction model produces the following line chart. A line chart is not technically correct for this, but it gives enough information.
In this graph you can see that the solid line is for women and the dashed line is for men. The average difference between these lines represents the main effect of the female variable, but what the interaction is looking at is the difference in the 'slopes' between each of the categories and the married category.

[Figure: postgr3 line chart of predicted genderroles by marital status, solid line for women and dashed line for men]

Below we have used the Graph Editor and taken out the lines and replaced the category points with markers: circles for women and triangles for men. We have also taken out the information for the widowed, separated, and divorced categories. To help make our point three categories are enough. We have plotted the 'slopes' between the 'married' category and the 'living together' (cohabiting) category with solid lines and between the married category and the never married (single) category with dashed lines. Hopefully, this makes it clear that the _IfemXmas_~2 interaction term is the difference between the two solid line 'slopes'. In other words, the difference (slope) between those who are married and those who are living together is significantly different for men and women. The difference (slope) is greater for men, which is also shown by the negative sign on the interaction term's coefficient in the regression results as women are represented by female=1.

Now compare the solid line 'slopes' with the dashed line 'slopes'. The dashed lines are almost parallel which indicates that there is no gender difference in the differences (slopes) between those being married and those who have never been married. This is shown in the regression results by the _IfemXmas_~6 interaction term not being significant.
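The same reasoning can be written out for two dummy coded variables. The following is a minimal sketch in our own notation (β, γ and δ are illustrative symbols, not Stata variable names), showing only the 'living together' dummy, here called cohab, with married as the reference category (the block assumes the amsmath package).

```latex
\begin{align*}
\widehat{y} &= \beta_0 + \beta_1\,\text{female} + \gamma_2\,\text{cohab}
  + \delta_2\,(\text{female}\times\text{cohab}) + \text{other terms} \\
\text{men: } \widehat{y}_{\text{cohab}} - \widehat{y}_{\text{married}} &= \gamma_2 \\
\text{women: } \widehat{y}_{\text{cohab}} - \widehat{y}_{\text{married}} &= \gamma_2 + \delta_2 \\
\text{women's difference} - \text{men's difference} &= \delta_2
\end{align*}
```

In this sketch δ2 plays the role of the _IfemXmas_~2 coefficient. With a positive γ2, a negative δ2 means that the difference between married and living together is smaller for women than for men, which is the pattern described above.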
Again, it is worth noting that from these results we cannot say anything about differences in 'slopes' between other pairs of categories such as between divorced and widowed. If you wish you can draw in the other three 'slopes' between married and widowed, married and divorced, and married and separated for both men and women in the first line graph and see how they are reasonably parallel, which is reflected in the nonsignificant interaction terms _IfemXmas_~3, _IfemXmas_~4 and _IfemXmas_~5 respectively.

[Figure: Graph Editor version of the chart, showing only the married, cohabiting and single categories on the marital status axis; the solid 'slopes' are labelled _IfemXmas_~2 and the dashed 'slopes' _IfemXmas_~6]

The usefulness of interactions between categorical variables is open to debate. Take this example and ask what this difference of differences (slopes) actually means. The way we have worded the example, where all categories are relative to those who are married, implies what happens when someone changes from that category to another, but is that change logical or the norm? It might make a bit more sense to compare those who are living together and those who are married with those who have never been married, as a common social process is from single to cohabiting to married. Not all who are married moved from the cohabiting or single categories as there will be people who were in the divorced or widowed categories who then married. However, it makes little sense to compare those who have never been married with those who are separated, divorced or widowed as it is not possible to move from being single to being separated, divorced or widowed without first being married.

WRITING UP YOUR FINDINGS

A 'typical' research article in the social and behavioural sciences is organized in the following way:

1. Introduction
2. Literature review and theory
3. Rationale for current study (highlighting any gaps, shortcomings, and/or contradictions in the existing literature) and hypotheses
4. Description of data, variables, and analytic approach
5. Results
6. Discussion
7. Conclusion

The best way to learn how to do these steps is to read lots of articles in your discipline and organize your papers in a similar way. We've discussed here how to create hypotheses, test them, understand your output, and make tables and graphs to display your results. In our opinion, the graphical display of results is something that is truly underrated in the teaching of social statistics, and it is a skill that is much appreciated by novices, policy-makers, and non-technical people who are trying to make sense of quantitative reports and articles. You should always try to make complex statistical output as simple to understand as possible. While you may very much like large tables of numbers (we sympathize completely), they can be daunting and far from user-friendly to your proposed readership.

Discussing your results and tying them back in with the literature review is a skill that you can only develop over time. Your first attempts are likely to sound like the Results section regurgitated, but it is important to link the findings with previous research and theory. It is an art, if you don't mind our saying so. This is likely to be the section that you will have to rewrite several times. It should also include any shortcomings in your analysis. If you don't acknowledge shortcomings, people reviewing your work will be certain to remind you of them. In the example analysis undertaken in this chapter, we would be sure to talk about how the results for education were interesting and unexpected, and why this might be so (i.e. that older people might be in some of the classifications). We would also highlight the shortcomings of how we measured ethnicity and how the results might be masking important differences between people in the groups. The discussion section is also a good place to talk about recommendations for future research.