Chapter 12

Kim Sheehan and Mariea Hoy

ON-LINE SURVEYS

From 'Using e-mail to survey internet users in the United States: methodology and assessment', Journal of Computer Mediated Communication 4(3), March 1999

THE INTERNET PRESENTS ENORMOUS potential for interaction between on-line users and researchers. . . . [We present] evidence based on previous research that discusses the strengths and limitations of web page-based surveys and assesses the viability of using e-mail as a survey data collection method. . . .

Web-based surveys

To date, the Internet offers both web page-based surveys and e-mail for prospective researchers to use for data collection. Web page-based surveys tend to collect broad-based data from individuals all over the world who self-select to respond to surveys that are posted on web sites. These web page-based polls can collect demographic information, as well as other types of purchase, psychographic and opinion data.

Numerous benefits to web-based surveys have been noted. A web page-based survey can take advantage of the graphic power available through programming languages such as HTML and JavaScript to create an attractive, interesting, and compelling survey that is inviting to respondents. . . . The use of CGI scripts allows adaptive questioning, which means that the questions that a respondent is asked depend on his or her answers to previous questions (Kehoe and Pitkow 1996). This allows for follow-up questions that can enrich responses as well as easier navigation for respondents.

Web page-based polls have been noted for their ability to generate a high number of responses (Kehoe and Pitkow 1996): the GVU polls at Georgia Institute of Technology generate more than 10,000 responses per poll. The sheer number of responses suggests that the results represent a diverse set of users. For example, it was estimated that one out of every 100 on-line users responded to each of the GVU polls (Kehoe and Pitkow 1996).
This high volume of responses can be collected very quickly. . . . For example, studies have shown that several hundred responses can be generated over the course of a single weekend (McCullough 1998). This time factor alone suggests huge benefits over traditional surveying techniques in terms of being able to collect and analyze data quickly, and implement decisions based on the findings.

The costs of both data collection and analysis can be minimized by the use of web-based surveys (McCullough 1998). Outside of high start-up costs for equipment and web page design, the actual implementation of a survey can be almost free, with no costs for paper or postage. Data analysis can be simplified by a direct transfer from the form to the analysis software, where limited data cleaning would be necessary (McCullough 1998).

Web page-based surveys allow for anonymity in responses, since the respondent can choose whether to provide his or her name or not. Previous research (Kiesler and Sproull 1986) has indicated that anonymity may affect response rates positively, as respondents may be more willing to respond without fear that their answers may be identifiable to them.

Since respondents type in their answers directly to a form on a web page, there is no need for an interviewer to have contact with the respondents. . . . Therefore, survey responses will be free from errors caused by interviewers, resulting in cleaner data (McCullough 1998). Similarly, the lack of an interviewer eliminates any potential for bias that the interviewer brings to the survey. An interviewer's mood, prejudices or opinions will not be reflected in the data (McCullough 1998).

However, web-based surveys do present some limitations that researchers must recognize when they are considering this method. Web page-based surveys must attract respondents to a web page with messages posted in newsgroups, links on other web pages, banner ads, and other types of methods.
As a result, all segments of a Web population may not be represented in the sample (Kehoe and Pitkow 1996).

All Internet users do not use the same browsers, and different browsers may not present images and text on web pages in the same manner. For example, some users (such as those subscribing to freenets) use only a text-based web browser (such as Lynx), and may not be able to respond to the survey. Some web-based polls are announced in Usenet newsgroups. Therefore, if potential respondents are not frequent visitors to newsgroups, they may not be aware of the survey announcement posted in newsgroups, and thus may not have the opportunity to complete the survey.

The self-select nature of web page-based surveys also may affect their generalizability. . . . Web page-based polls generally allow for multiple responses from a single individual, as well as responses from individuals outside of the population of interest (e.g. persons in countries where a product or service is not available, or from persons who are younger or older than the population of interest). This could also bias the results.

One way to validate a method is to compare it to other methods that are accepted within the research community. Since it is almost impossible to develop response rates to web page-based surveys (Kehoe and Pitkow 1996), it is difficult to compare web page-based survey methods to traditional survey data collection methods such as postal mail and telephone surveys. This leads to another generalizability issue. Without an understanding of the size of the respondent pool in comparison to the size of the universe and the sampling pool, it is also difficult to generalize research findings beyond the universe of those responding to the survey.

E-mail as a data collection method

Using e-mail as a survey data collection method comparable to postal mail may ameliorate some of the issues inherent in web page-based data collection.
Previous research presents several reasons to support the idea that e-mail offers much promise as a means of administering surveys as well as pitfalls to be avoided.

Today, as many as 100 million people worldwide have access to e-mail (DOC 1998). Eighty per cent of all users use the Internet daily, with many reporting that 'surfing' replaces 'TV viewing' as entertainment (Kehoe, Pitkow and Morton 1997). The sheer number of individuals using the medium coupled with the frequency and ease with which they could be contacted suggest that e-mail is a viable survey method.

A lack of a national directory of e-mail addresses could be seen as a limitation to e-mail surveys. For example, Schuldt and Totten (1994) reported a problem with obtaining names for their sample. This situation has changed in recent years. Many content providers compile their own databases and should be able to access names quickly from these sources. Some organizations (such as universities and trade associations) publish directories, both paper and on-line, with e-mail addresses. Online search engines such as Lycos provide 'People finders' for e-mail addresses.

When respondents use the 'reply' function of their e-mail programs to return their completed surveys, their names and e-mail addresses can be automatically written on the electronic message (i.e. the survey) the researcher would receive. While previous research (Kiesler and Sproull 1986) has indicated that anonymity may have affected response rates positively, other researchers (Couper, Blair and Triplett 1997) suggest that the lack of anonymity may not have any effect on response rates. With e-mail surveys, anonymity could be guaranteed through the use of encryption technology, and confidentiality can be guaranteed through confidentiality assurances. This study chose to guarantee confidentiality.
Assuring that responses will be confidential throughout the data collection process should help to build respondent trust and enhance response rates.

An additional benefit to using e-mail is that duplicate responses can be eliminated. Steele, Schwendig and Kilpatrick (1992) suggested that duplicate responses can become problematic since researchers using postal mail often send out multiple copies of questionnaires to their entire sample in order to increase response rates. E-mail presents a benefit over postal mail, then, since e-mail responses can be tracked and previous respondents can be eliminated from follow-up e-mail.

E-mail surveys may allow the researcher to develop a profile of non-respondents. Depending on the search engine used and the respondents' server, some demographic information about persons with e-mail accounts is available on-line and some demographic information such as gender and location may be compiled. It might also be possible to attempt to contact non-respondents using an alternative method (such as postal mail or telephone) to solicit responses that could be compared to the e-mail sample for similarities. It should be noted that demographic information about persons with e-mail accounts may not be completely accurate, as individuals may have changed locations or jobs since the information was provided. However, the availability of such data allows for options that the researcher can consider when assessing non-response.

As with web page-based surveys, there appears to be some cost savings inherent in using new technology. Parker (1992) indicates that cost savings from e-mail compared to traditional mail and telephone surveys are based on low transmission costs and elimination or reduction of paper costs. E-mail may also present cost savings over web page-based surveys, as costs for page design and posting to a server would not be incurred.
However, some savings may be offset by the on-line server used (costs vary by Internet service provider) and time considerations (transmission costs may increase by the minute, which may impact the length of the survey).

When respondents perceive technology as easy to use, they seem more likely to respond (Parker 1992). As more people become familiar with the Internet, these individuals should become comfortable using the technology to answer surveys. An additional advantage to e-mail is that respondents can return it in one of three ways: e-mail, fax or postal mail (Parker 1992). This flexibility may enhance the perception of ease of use. Unless the respondent purposely deletes the survey, it cannot be accidentally tossed or misplaced like a mail survey. Yet, comparable to a mail survey, the respondents still have the benefits of completing the survey at their own pace and convenience.

There is not clear evidence that new technology produces a higher response than postal mail. In a review of nine studies that have used both postal mail and e-mail, four studies show postal mail achieving higher response rates than e-mail, three studies indicate that e-mail response rates are higher than postal mail, and two studies did not show significant differences in response rates. Researchers indicated that the lack of familiarity with the technology may have impacted some of the response rates. It is also important to note that many of these studies are from small, homogenous populations, and thus may not represent larger population groups' response tendencies.

Past studies found direct marketers can collect data more quickly using e-mail than with postal mail methods. In the five studies that reported response time results, e-mail responses were collected significantly faster than postal mail responses. The variety of populations used in these studies suggest that this rapid rate of response might be seen among larger Internet populations.
Current research has also identified two key limitations unique to e-mail that must be considered when planning an e-mail survey. First, researchers must recognize that unsolicited surveys may be considered aggressive by respondents, and not in keeping with Internet culture (Mehta and Sivadas 1995). Minimizing a perception of intrusiveness should help to address this problem (Schillewaert, Langerak and Duhamel 1998). Second, the changing nature of the Internet suggests that it is possible that e-mail addresses may become out-of-date fairly quickly (Smith 1997). Addressing this issue early on can prepare the researcher for dealing with delivery failures. . . .

Considerations

. . . It would not be possible to generalize results to mass markets including both Internet users and non-Internet users based on knowledge attained solely from online respondents. This has also been shown as a limitation to web-based polls (Coomber 1997; Kehoe and Pitkow 1996). However, depending on the research question, it is possible that sample information can be used to generalize to the on-line population.

One of the most challenging limitations is the changing nature of the Internet. The composition of the Internet changes daily with new individuals logging on and others adding or switching Internet service providers. Thus, some directories may contain information that is out of date or incomplete. . . . The changing nature of the Internet is also seen in changes to how search engines operate. Any ownership changes of a search engine or other web content provider may result in unanticipated changes to this methodology. Additionally, the technology allows individuals to set up mail filters, which delete messages from those senders not on the receiver's 'approved' list. This deletion may or may not be reported to the sender. As use of mail filters grows, response rates may be affected.
Researchers should anticipate these changes by testing search engines prior to address generation to make sure that the method is still appropriate and pre-testing the study with a random sample of names to determine and plan for non-deliverable mail.

While response rates now appear promising, respondent distrust of data collection may influence response rates in the future. . . . One respondent wrote, 'if you are a student then I am the Emperor of Japan'. The novelty of using e-mail to collect data may be partly to blame. Until this method becomes more ingrained with academics and popularized among on-line users, respondent concern and distrust is likely to continue.

Additionally, individual ISPs have policies and procedures that may limit the success of e-mail surveys. We encountered one ISP that monitored the number of e-mails delivered to its users that originated from a single address. If the number was very large, the ISP assumed that the sender was 'spamming', and the system operator blocked the originator from sending additional messages to the ISP's subscribers. . . .

How government regulation will affect the promise of e-mail remains to be seen. Federal courts have barred specific companies from sending unsolicited e-mail advertisements to subscribers of CompuServe (Kanaley 1997). The courts are ruling that ISPs have the right to restrict access by 'spammers', mostly for economic reasons. Users who pay hourly access rates complain about spending too much time and money reading messages they have no interest in. How this will affect mailing in the future is not yet clear. Options being discussed include charging mailers a fee for each piece of mail sent. Some believe this will cause companies to be more selective in the addresses to which they send mail. Obviously, this would increase the costs of e-mail surveying. . . .
While e-mail surveying will probably never replace the broad-based data available via postal mail surveys, it will probably provide adequate data for the study of on-line populations, and given the propensity of 'hard to reach' individuals to respond, may provide richer data about on-line behavior than postal mail surveys. As on-line usage continues to grow, and as more and more consumers have access to e-mail, it is conceivable that this method may be eventually used in place of postal mail to gather information about broad-based consumer segments.

References

Coomber, R. (1997) 'Using the Internet for survey research.' Sociological Research Online 2 (2) [On-line]. Available: http://www.socresonline.org.uk/2/2/2.html

Couper, M.P., Blair, J. and Triplett, T. (1997) 'A comparison of mail and e-mail for a survey of employees in federal statistical agencies.' Paper presented at the annual conference of the American Association for Public Opinion Research, Norfolk, Va.

Kanaley, R. (1997, February 5) 'Judge bars bulk mailer from online.' Philadelphia Inquirer, p. Cm

Kehoe, C., and Pitkow, J. (1996) 'Surveying the territory: GVU's five WWW user surveys.' The World Wide Web Journal, 1, (3). [Also on-line]. Available: http://www.cc.gatech.edu/gvu/user_surveys/papers/w3j.html

Kehoe, C., Pitkow, J., and Morton, K. (1997) Eighth WWW user survey [On-line]. Available: http://www.gvu.gatech.edu/user_surveys/survey-1997-10/

Kiesler, S., and Sproull, L.S. (1986) 'Response effects in the electronic survey.' Public Opinion Quarterly, 50: pp. 402-413.

McCullough, D. (1998) 'Web-based market research, the dawning of a new era.' Direct Marketing 61 (8), pp. 36-39.

Mehta, R. and Sivadas, E. (1995) 'Comparing response rates and response content in mail versus electronic mail surveys.' Journal of the Market Research Society 37 (4), pp. 429-440.

Parker, L. (1992) 'Collecting data the e-mail way.' Training and Development, July: pp. 52-54.

Schillewaert, N., Langerak, F., and Duhamel, T. (1998) 'Non probability sampling for WWW surveys: a comparison of methods.' Journal of the Market Research Society 40 (4), pp. 307-313.

Schuldt, B.A., and Totten, J.W. (1994) 'Electronic mail vs. mail survey response rates.' Marketing Research, Winter: pp. 1-7.

Smith, C. (1997) 'Casting the net: Surveying an Internet population.' Journal of Computer Mediated Communication 3(1). Available: http://www.ascusc.org/jcmc/vol3/issue1/smith.html

Steele, T.J., Schwendig, L.W., and Kilpatrick, J.A. (1992) 'Duplicate responses to multiple survey mailings: A problem?' Journal of Advertising Research 32 (1): pp. 26-33.