Journal of Research Practice
Volume 2, Issue 2, Article D1, 2006
Internet-Based Data Collection: Promises and Realities
Jacob A. Benfield
Department of Psychology, Colorado State University, Fort Collins, CO 80521-1876, USA
William J. Szlemko
Department of Psychology, Colorado State University, Fort Collins, CO 80521-1876, USA
The use of the Internet to aid research practice has become more popular in recent years. In fact, some believe that Internet surveying and electronic data collection may revolutionize many disciplines by allowing for easier data collection, larger samples, and therefore more representative data. However, others are skeptical of its usability as well as its practical value. This paper highlights both positive and negative outcomes experienced in a number of e-research projects, focusing on several common mistakes and difficulties experienced by the authors. The discussion focuses on ethics and review board issues, recruitment and sampling techniques, technological issues and errors, and data collection, cleaning, and analysis.
Keywords: Internet; data collection; research ethics; sampling
Suggested Citation: Benfield, J. A., & Szlemko, W. J. (2006). Internet-based data collection: Promises and realities. Journal of Research Practice, 2(2), Article D1. Retrieved [date of access] from http://jrp.icaap.org/index.php/jrp/article/view/30/51
1. Internet as a Research Tool
With the advancement of information and communication technology, researchers have found new methods of data collection and analysis. This has evolved from telephone surveys, computerized data analysis, and use of cell phones and pagers, to collecting information at random intervals, use of Personal Digital Assistants (or "PalmPilots"), and use of the Internet in research. Although the Internet is fast becoming a common fixture in contemporary life in many parts of the world, it remains relatively unused for primary data collection in many research fields. For example, social science research is yet to respond to the emergence of the Internet, as shown by only 494 peer-reviewed articles with keywords "Internet research" published within major social science journals over the decade 1996-2006 (as per our search in the CSA Illumina® bibliographic database). Increasingly, however, the Internet is being treated as a rich source for literature and secondary data in social science research.
Until relatively recently, use of the Internet for primary data collection required the researcher to either know HTML or have someone else create a new program. Fortunately, within the past few years a number of new technological solutions and services have emerged that allow the researcher to create studies (i.e., surveys, experiments, etc.) online without requiring knowledge of computer programming. This has coincided with a large increase in studies using the Internet to collect primary data. A search in the Web of Science® bibliographic database indicates that the number of publications during the six-year period 2000-2005, using "Internet research" as keywords, is 128, which is 312 per cent higher than the corresponding figure during the six-year period prior to 2000, i.e., 1994-1999. Similar results are seen for "Internet data collection" (325 per cent), "web based research" (333 per cent), and "electronic data collection" (327 per cent). Of course, these impressive percentages are based on low base figures; Internet use in research still remains rather limited.
By its very nature, the Internet appears to be a very promising medium for researchers. As a vehicle for data collection, it promises increased sample size, greater sample diversity, easier access and convenience, lower costs and time investment, and many other appealing features. It is even possible to use the Internet for pilot testing media messages and advertisement campaigns. But without careful attention, the researcher may get into difficulties. It is the purpose of this article to expose some of the potential pitfalls awaiting the unwary researcher. Along with the potential pitfalls, solutions utilized by the authors are also discussed.
2. Manual vs. Internet-Based Data Collection
We have encountered a number of issues in our various attempts at using the Internet for primary data collection. A list of such issues must include those associated with research ethics guidelines, technical snags arising from power failures, data cleaning requirements, and low response rate. Sometimes, the experience has been so frustrating as to make manual data collection through paper-and-pencil research packets appear more attractive. However, with experience, we have learnt to be judicious in selecting the appropriate data collection method for a given research project and taking the necessary precautions if we choose to use the Internet.
Researchers, especially psychologists, have often looked at the method of data collection with regard to the impact it can have on results. Questionnaire design issues, for example the implications of using forced-choice, Likert-scale, open-response, or multiple-response formats, are all much older than the Internet (Orlich, 1978; Schuman & Presser, 1981; Sudman & Bradburn, 1982). These will always be important when designing data collection instruments. The design of the instrument should be informed by the research question being addressed. Any advantages or disadvantages offered by a specific question format will not be altered by technology, but technology may introduce additional issues (Manfreda, Batagelj, & Vehovar, 2002). Each of these response types is easily available in an electronic format. Some researchers have compared manual and electronic formats, examining the issues of validity and reliability of research instruments (Berrens, Bohara, Jenkins-Smith, Silva, & Weimer, 2003; Schillewaert & Meulemeester, 2005; Sethuraman, Kerin, & Cron, 2005). They have found test-retest reliabilities for both formats to be nearly equal, indicating that both formats can generate equally reliable data assuming that the participants are cooperative and truthful, and the questions are valid. They have also found internal consistency, predictive validity, and recruitment trends within socio-demographic categories to be comparable between the two formats. In essence, the mode of data collection (i.e., manual or electronic) does not, in itself, seem to significantly alter the type of respondent recruited or the quality of data given by the respondent.
Collecting data from people with poor reading comprehension or those not accustomed to taking paper-and-pencil tests is already known to be difficult. Similarly, while using electronic data collection methods, the respondents' lack of familiarity with computers could be an issue. In some of our survey research projects, we have compared the paper-and-pencil method with the computer-based method. In our pilot tests, we found that the computer-based method was usually faster (because of the respondents' familiarity and ease with the computer keyboard and the mouse). However, during the actual data collection, the mobile laboratory had a touchpad instead of a mouse, which slowed down the respondents using the electronic version, in comparison with those who used the paper version. In short, computer skills and familiarity with the input devices affect a respondent's ability to complete an electronic survey. This is in addition to problems experienced by respondents who have poor reading comprehension or who are not comfortable with filling out questionnaires.
Another relevant difference between paper-and-pencil and electronic formats is the level of rapport possible with the respondent. The impact of such rapport may be unpredictable. For some respondents, the signed letter accompanying a paper-and-pencil format may be more persuasive than an e-mail from a stranger, commonly sent with the electronic format. It is uncertain whether face-to-face interaction with a person or the relative anonymity of the Internet would produce more authentic responses.
With identity theft (i.e., the deliberate assumption of another person's identity without the latter's knowledge) being a major issue of current concern, Internet data collection may not seem as legitimate as data collected in a community center or a university laboratory. Internet data collection could indeed be problematic from the point of view of source credibility--an important issue in persuasive communication, as research in the area of persuasion indicates (Hong, 2006; Hovland & Weiss, 1951; Olson & Cal, 1984). Additionally, as the psychologist Stanley Milgram (1974) argues, people are more likely to obey an authority that is present in the room compared to one that is in the next room or on the phone. Accordingly, the manual paper-and-pencil method can be expected to produce higher-quality data compared to the Internet-based method, the former being more tangible, more personal, and in short, more credible to the respondents, especially if the research staff is in the room with them (Nosek, Banaji, & Greenwald, 2002).
On the more positive side, Internet-based data collection, if utilized properly, can reduce costs and make unfunded projects feasible, yield larger and more representative samples, and obviate hundreds of hours of data entry. Table 1 compares the advantages and disadvantages of manual and online modes of data collection.
Table 1. Comparison between Manual and Internet-Based Data Collection*
Internet is a tool that is out there, for better or for worse. Its usefulness in research is largely dependent on its judicious use. As depicted in Figure 1, a series of questions pertaining to different stages of the research project need to be answered before making a final choice regarding the data collection format. In this figure, the solid lines represent the progression of the decision making process concerning the use of electronic data collection. The broken lines lead to the likely decision, with the lines on the right representing a negative answer to the question posed at each stage (thus favoring manual data collection) and the lines on the left representing a positive answer (thus favoring electronic data collection).
Figure 1. Considerations for Incorporating Internet-Based Data Collection in a Research Project
3. Research Ethics
The Institutional Review Board (IRB) is the US version of the research ethics committees created in many universities and other research institutions in response to rising concerns about the use of both humans and animals in research. The IRB's role is to oversee research being conducted within an institution in an attempt to ensure that participants' rights and privileges are being upheld. In the United States, IRBs generally focus on the principles laid out in the Belmont Report (1978). When considering whether a specific research project should be allowed to proceed, IRB reviewers focus on three key principles: (a) beneficence (i.e., lack of harm and/or received benefit), (b) respect for persons (i.e., confidentiality and ability to withdraw from research), and (c) justice (i.e., opportunity for all participants to benefit from outcome). In essence, the IRB serves as the research participants' informed and trained advocate.
Some IRB members may have some special concerns when dealing with proposals involving primary data collection via the Internet (Naglieri et al., 2004; Nosek, Banaji, & Greenwald, 2002). Anonymity and confidentiality are always concerns in data collection, but the potential for recording the IP (Internet Protocol) addresses, thereby the identity of the remote computers, makes Internet-based proposals more complicated (Berry, 2004). Other issues, such as data security during transmission, are unique to Internet-based data collection. Some common IRB issues the authors have encountered are discussed in the following paragraphs.
Primary data collection via the Internet presents a unique issue during data transmission (Hewson, Laurent, & Vogel, 1996). The data are most susceptible to hacking, corruption, etc., while these are being transferred from the respondents' computers to the researchers' computer. One relatively easy method of limiting these possibilities is the encryption of data during transmission. Data encryption may be accomplished through various methods, but from the IRB viewpoint, the method of encryption appears to be of less importance than the fact that encryption is being done. Of course, providing for data encryption can add to the cost of the project.
Irrespective of the mode of data collection, physical security of data is a major issue once data have been collected. With Internet-based data collection, physical security includes much more than a locked file cabinet in a secure room. Consideration must be given to both physical and electronic security of the server where data are stored. Physical security of the server should minimally include a room with restricted access. Internet data collection can be facilitated by numerous agencies that specialize in allowing researchers to create their own study. These agencies often provide adequate physical security. One physical security measure that may be overlooked is environmental controls that regulate temperature, humidity, and air flow. Environmental controls are particularly relevant for electronic data. Papers locked in a file cabinet will not be affected by a temperature of 105 degrees Fahrenheit, but such heat may cause problems with computer hard drives. These extensive safeguards may not be necessary depending on the IRB, but having them will provide peace of mind for researchers and IRB members alike. Electronic security begins with the encryption process described above; it does not, however, end there. It would be necessary for the server to have firewalls, which protect the server from unauthorized electronic entry (i.e., hacking). Other electronic security commonly includes the use of passwords, PIN codes, and access codes.
When conducting Internet surveys, there is a potential threat to the anonymity of the respondent that needs to be considered (Pittenger, 2003; Waern, 2001). It is possible for a computer program to record the IP address of the computer being used by the respondent. The IP address is a numerical code that is unique to each computer connected to the Internet. It is also possible to record the time when the data were entered. These capabilities mean that the actual respondents can be traced in many cases. We have dealt with this issue by either deleting the IP addresses from the dataset early in the cleaning process or electing to not record the IP addresses, wherever possible. As an interesting aside, IP addresses collected from personal computers may be useful for matching sets of longitudinal data without collecting specific identifiers or using matched lists of identities and participant codes. In this case, recording IP address is an advantage--not an ethical liability. However, IRBs should be made aware that this is the intent behind recording IP addresses in such a case.
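Where IP addresses are recorded for longitudinal matching, one way to retain the matching capability while removing the identifier is to replace each address with a salted one-way hash early in the cleaning process. The sketch below is a minimal illustration of this idea, not the procedure we actually used; the record field name "ip" and the salt value are assumptions for the example.

```python
import hashlib

# Project-specific secret; must be kept out of the released dataset
# (hypothetical value for illustration).
SALT = "project-specific-secret"

def pseudonymize(records, drop=False):
    """Return copies of records with the raw IP removed or hashed."""
    cleaned = []
    for rec in records:
        rec = dict(rec)            # do not mutate the caller's data
        ip = rec.pop("ip", None)   # raw address never reaches the output
        if not drop and ip is not None:
            # Same address -> same hash, so longitudinal waves can be linked
            rec["ip_hash"] = hashlib.sha256((SALT + ip).encode()).hexdigest()
        cleaned.append(rec)
    return cleaned

rows = [{"ip": "192.0.2.10", "q1": 4}, {"ip": "192.0.2.10", "q1": 5}]
matched = pseudonymize(rows)
```

Calling `pseudonymize(rows, drop=True)` instead strips the addresses entirely, which is the appropriate choice when no matching is intended.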
Research involving persons requires some form of informed consent, wherein the persons agree to participate and acknowledge the risks, benefits, and their rights. This can take the form of a verbal consent or a written one. In both verbal and written consents it is possible to ascertain whether the person providing the consent is indeed the person participating in the research. With Internet-based data collection this is not possible, as there is no visual reference (Pittenger, 2003). Additionally, it is not possible to determine that the person providing the consent meets the inclusion or exclusion criteria, as may be specified by the researcher. Thus, the issue of consent for Internet-based data collection includes issues of the respondent's personal integrity. Commonly, consent to participate in Internet surveys takes the form of checking a box on the screen and pressing a button, or choosing the "agree" option. Some IRBs may not consider this to be true informed consent, viewing it simply as the respondent's acknowledgement of reading the page. Since verifying this is next to impossible, some version of a "waiver of consent" becomes appropriate before conducting Internet-based data collection. This is especially relevant considering the possibility of respondents being minors without parental consent. Seeking and securing waivers from the IRB for both parental and individual consent has been our approach to avoid subsequent disputes regarding consent, acknowledgement, and participation by minors.
One of the usual conditions of informed consent is that withdrawal from participation or refusal to participate cannot void a participant's entitlement to incentives. In Internet surveys offering incentives, this means that in the event of refusal to participate or early exit from the survey, the participant must be routed to the page meant for debriefing and incentive enrollment. Clearly, this is not perfect, as the participant may simply close the Web browser to exit, rather than choose a button marked "exit survey." There is no simple and effective way to ensure that this does not happen and that participants always have access to the incentives they are entitled to.
Other considerations that must be weighed are issues of burden and beneficence. Does using the Internet constitute an undue burden on a specific population, for example, computer illiterate individuals? There is no easy answer to this and it may in part depend on the subject matter being researched. Similarly, if participants receive benefits from being involved in the research, are these benefits available to non-computer users? These are difficult questions that each IRB would view differently; however, the best answer is that it depends on the research being conducted and the population being targeted for data collection. Our practice has been to anticipate these issues and, when applicable, justify the decisions in the design of the survey. Open communication with the IRB representative has helped us avoid unforeseen issues, thus leading to faster, more efficient approval processes.
4. Recruitment of Respondents
The Internet appears to be a mechanism to access the most representative participant pool in the world. Because of this, consumer researchers and marketing firms have created dedicated websites and electronic mailing lists designed to send out surveys to the willing public (e.g., NPD Online Research). However, it may not be correct to assume that recruitment of respondents in a virtual setting must be easy. We have utilized a variety of recruitment techniques and learned that (a) different recruitment procedures can have different effects on the resulting sample and (b) the right recruitment procedure, with some luck, can yield surprisingly large samples for the study.
Issues of recruitment have been widely discussed in the context of survey research (Cochran, 1977; McCready, 1996; Rosnow & Rosenthal, 2005; Sudman, 1983). Some of the recruitment methods are discipline-specific while others are more general. Most of these methods can be applied in an Internet-based project with simple alterations (Andrews, Nonnecke, & Preece, 2003; Hewson, Laurent, & Vogel, 1996; Koo & Skinner, 2005; Schillewaert & Meulemeester, 2005). For example, psychologists often utilize student pools from psychology classes--a convenience sample, while sociologists are usually more purposive in trying to sample groups meeting certain criteria (e.g., low-income minorities). Researchers using the Internet can recruit these same groups by either mass e-mailing the survey to the target group or sending out the survey Web site link to community leaders or organizations that interact with the target group.
If an electronic survey is being used simply to speed up data entry and analysis, the common method involving a group of participants meeting at a specified location and time can be used, with the provision of computers at the desired location. In this case, the recruitment procedure would be based on the accessibility of the population being sampled. Of course, the benefit of speedy data entry needs to be weighed against the risks associated with technology and those involved in data preparation processes (see Sections 5 and 6).
Despite the potential participant pool of hundreds of millions, the actual number of respondents in an Internet survey can be quite low (Zhang, 2000). In fact, response rates can be dismal enough to make the time-honored mail-in surveys seem more attractive. Using four of our Internet surveys as a basis, we have presented a discussion of the recruitment techniques which worked for us and those which did not. Our experience indicates the prudence in following multiple recruitment strategies in any project. Moreover, strategies that worked before the Internet generally also work with the Internet.
In a project concerning health behaviors and activity, designed to survey college students, all 25,000 students on a college campus were e-mailed the Web link to the 15-page survey containing several validated and time-tested scales, along with an explanation of the study and the opportunity to win prizes. A second e-mail was sent out two weeks later with a reminder and the link. One month after the original recruitment e-mail we had only 509 respondents (i.e., a 2 per cent response rate). The inclusion of paper reminders placed in dormitory mailboxes increased participation among freshmen from about 2 per cent to 5 per cent.
A second project of ours with severe recruitment woes involved an attempt to get a community sample of driving behaviors within six cities in three states. The original recruitment procedure involved placing 600 paper leaflets or flyers per community (N=3,600) on vehicles parked in public parking lots during business hours. The flyers contained information about the study employing several persuasion tactics, the link to the Internet survey, and the contact information for the researchers--should a potential respondent have any questions or need help accessing the survey. Because of an inability to give reminders and the need for the respondent to manually enter the Web address, we planned on a 90 per cent non-response rate in order to get 60 participants per community (n=360). After 1,200 flyers had been distributed in two cities and one month of waiting, five respondents had attempted the online survey, with only two finishing it in its entirety. Interestingly enough, two respondents had accessed the survey the day before recruitment leaflets were sent out, which indicates that perhaps the IRB members were checking on the link and survey materials.
Not discouraged by a 0.5 per cent response rate, we adopted a snowball sampling technique in which we sent the survey out to friends, family, and colleagues. This recruitment e-mail contained study information, the link to the survey, and instructions to forward the e-mail to friends, family, and colleagues. Using this approach, the 60 initial e-mails yielded more than three times as many responses (189 to be precise) within the first month. Follow-up information seems to indicate that the snowballing process stopped at the third or fourth iteration. While this technique did yield higher response rates, it did not allow for community-specific analyses to be conducted because the e-mail contacts were dispersed across other cities. It did, however, provide a broad sample with several professions and ages represented. As an interesting aside, the e-mail sent out by one of the authors reached the other author, at the fourth iteration of snowballing, through routes that neither could have foreseen.
A team of researchers (including one of us) interested in individual attitudes related to the loss of local wildlife also utilized the electronic method to collect data. These researchers focused on college students for their sample and recruited participants by going into a diverse range of courses and verbally recruiting students by providing them with the Internet link on an overhead projector. Interestingly, some course instructors offered extra credit for participation while others did not. For courses providing extra credit, more than 90 per cent of the students responded. The response rate was only 10 per cent where this incentive did not exist. This drastic difference based on extra credit was found to hold irrespective of the class size.
In another study we utilized a participant pool from an Introductory Psychology course. The participants received research credit for their participation that counted towards a course requirement. As is to be expected, recruitment turned out to be a virtual non-issue. We just posted the study on the sign-up page and then e-mailed the link to those who signed up. Sections 5 and 6 below, focusing on technological and data preparation issues, discuss this project and other similar projects which use the electronic method to reduce data entry time and labor.
Recruitment methods such as community sampling, telephone surveys, and mail-in surveys, widely used in different fields of research (Dillman, 1978), have also proved their merit in our Internet surveys. An incentive to participate is not essential but definitely helps and that has been known for some time (Brennan, 1992). In our projects, offering guaranteed benefits yielded greater than 90 per cent response rates. Surveys offering the possibility of some benefit, but no guarantee, had much lower response rates but were better than those without the possibility of such benefit. Reminders have also been shown to improve response rates in manual surveys (Nederhof, 1988; Sheehan & McMillan, 1999). Reminders doubled responses among college freshmen in our health survey even though the resulting response rate was not sufficiently high. The driving behaviors project had no reminders or incentive for the first round of data collection and was a total failure. However, we overcame the lack of incentive and our inability to offer reminders by utilizing snowball sampling that originated with people motivated to help--our friends, family, and colleagues.
5. Technical Snags
Using the Internet to collect data is convenient and can greatly extend sample representativeness; however, the use of the Internet is not without some risk. During the doctoral research of one of the authors, data were being collected using a mobile computer laboratory with an array of laptop computers, so as to avoid the time-consuming data entry process. Participants arrived every hour, completed the questionnaire online, and left. Shortly into one of the sessions, the electricity supply to the building went out. Fortunately, the laptop batteries were fully charged, so no data were lost and data collection continued. With desktop computers lacking an uninterruptible power supply (UPS), the data entered before the power failure would have been lost and data collection would have had to stop until power was restored. Even with laptops this could have resulted in major inconvenience had the batteries not been charged or had the server been located in the building where the power supply was disrupted. After this experience the researcher printed out research packets to have on hand for future emergencies.
During the same project, the wireless Internet connection was lost for a period of time. This resulted in incomplete data from 18 respondents and created delays for the next session of data collection. A solution used in another Internet project conducted by the authors was to have a disc with the survey materials on it and have the respondents record their answers directly onto a Word document, which could later be transferred. To use this option, it is necessary to save each respondent's responses into a separate file for later retrieval, which requires sufficient disk space and the access rights needed to save files.
In another research study conducted by one of the authors in a computer laboratory, all the computers contracted a virus. This was rather unfortunate, resulting in incomplete data from 14 respondents and lost data from 35 respondents. Considering that the sample size was 150, this meant approximately one-third of the sample was lost. Amendments for more participants had to be sent to the IRB, since one of the experimental conditions was severely compromised by the sheer luck of random assignment. Additionally, the 14 participants who were completing the study at the time had their university Internet accounts temporarily deactivated for using an infected computer. Prior to starting data collection, each computer had been scanned for viruses and had antivirus updates installed. The virus came from another computer laboratory using the same server and infected the entire university network. Apart from keeping current on antivirus updates and timely virus scans, backing up the data more frequently during data collection can minimize virus-induced losses of already collected data. Paper-and-pencil back-ups can prevent the loss of participants who are present when a computer infection occurs.
Another technology issue, especially in a laboratory setting, relates to the hardware devices used. In one of the studies mentioned above (i.e., the one with power-failure), the respondents were required to navigate the survey Web site using a touchpad. This resulted in delays and some confusion because the respondents were more used to a mouse, rather than a touchpad. Similarly, the type of screen and keyboard used can also make a difference. Specific screen sizes may be more appropriate for specific groups. Small screen size might be a disadvantage for groups with vision impairment. Similarly, perhaps a touch sensitive screen would be better than a keyboard while working with younger children.
In situations where multiple users may use the same computer to complete the study, it is necessary to determine whether the survey software enters the data as new data or whether, recognizing the same IP address, it records over the previous data. This is not only a concern in laboratory settings; some hostel or dormitory rooms may have a single computer for multiple users. Even in a private home, different family members may respond from the same computer. Another software issue is how it handles a respondent who exits the survey or closes the Web browser without completing the survey, whether accidentally or otherwise. Are they allowed to pick up at the point they exited, or do they need to start over? One study the authors were involved in did not allow the respondents to start where they left off. This resulted in numerous partially duplicated data points. For example, a respondent would answer the first third of the survey, accidentally exit, and then discover that the survey had to be taken again from the beginning. This would result in the first third of the survey being duplicated, requiring increased time in data cleaning later. Perhaps this also resulted in frustration and withdrawal from the study, indicated by the fact that, after data cleaning to eliminate duplicate entries, approximately 7 per cent of the data sets were incomplete.
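Removing such partially duplicated entries can be scripted rather than done by hand. The following sketch is a hypothetical cleaning step, not the software we actually used; it assumes each attempt carries some respondent key (here called "rid") and stores unanswered items as None, and it keeps only the most complete attempt per respondent.

```python
def most_complete(attempts):
    """Return one record per respondent: the attempt with the most answers."""
    best = {}
    for rec in attempts:
        # Count answered items, ignoring the key field itself
        answered = sum(v is not None for k, v in rec.items() if k != "rid")
        if rec["rid"] not in best or answered > best[rec["rid"]][0]:
            best[rec["rid"]] = (answered, rec)
    return [rec for _, rec in best.values()]

raw = [
    {"rid": "a17", "q1": 3, "q2": 4, "q3": None},    # abandoned first attempt
    {"rid": "a17", "q1": 3, "q2": 4, "q3": 2},       # completed restart
    {"rid": "b02", "q1": 5, "q2": None, "q3": None}, # genuinely incomplete
]
cleaned = most_complete(raw)
```

Genuinely incomplete respondents (like "b02" above) survive this step; deciding whether to retain or discard them remains a separate analytic choice.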
When using flyers to recruit respondents, the Web address of the survey can cause a practical difficulty. Since IRBs tend to require data encryption, this necessitates the use of secure Web sites. Secure Web sites are designated with "https" in their address (rather than the usual "http"). This can lead to respondents not typing the address correctly and consequently being unable to locate the survey. In a laboratory setting, one of the authors discovered that about 13 per cent of the respondents typed the Web address incorrectly. Specifically, they were all making the same error mentioned above. Even when respondents were told to be sure to type "https", with the letter "s" emphasized, the error rate was approximately 4 per cent. This tendency may be even more pronounced when using paper flyers or windshield leaflets for recruitment, and it possibly contributed to the dismal 0.5 per cent response rate encountered in the driving behaviors study.
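One technical mitigation, sketched here as an assumption rather than anything we deployed, is to run a plain-http listener at the printed host that simply redirects every visitor to the secure survey address, so that typing "http" instead of "https" still reaches the survey. The survey URL below is hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical secure survey address used for illustration only.
SECURE_URL = "https://survey.example.edu/driving"

class RedirectToSurvey(BaseHTTPRequestHandler):
    """Answer any plain-http GET with a permanent redirect to the survey."""
    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", SECURE_URL)
        self.end_headers()

# To deploy on the standard http port:
#     HTTPServer(("", 80), RedirectToSurvey).serve_forever()
```

This only helps when the hostname itself is typed correctly; it does nothing for misspelled hostnames, so short, memorable addresses on the flyer remain important.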
The authors, jointly or individually, have been involved in over ten Internet-based surveys. Not a single one of those surveys has avoided technical or recruitment problems. Keeping back-up plans ready seems to be the major lesson from these experiences.
6. Data Preparation Issues
The single most appealing advantage of electronic data collection is the elimination of the tedious data entry process. With the electronic method the data are entered into a database at the same time as the respondent completes the survey. For a researcher who plans to collect large amounts of data or use a large sample, electronic data collection can be invaluable, replacing mountains of paper and weeks' worth of data entry. An additional advantage is that typing errors by the researcher are avoided: the data file is an exact replica of the responses received. However, electronic data files can easily lead to other types of error.
Electronic data files almost always need to be transformed, merged, and/or reformatted before use. Most available electronic formats separate the survey into sections and provide the data in a separate file for each section; these must be merged before analyses can be performed. Additionally, some programs that facilitate creating e-surveys use their own coding schemes, which may not match the researcher's. For example, 1-7 Likert scales may be recorded as 0-6 scales by the computer, and many established subscales have specific scoring criteria, so simple transformations usually have to be performed on the data. Also, when the data are downloaded into a database program, some programs default every field to string format, even if the data were meant to be numeric, making yet another reformatting of the data necessary. None of these issues is hard to correct. However, the more steps we add to the process, the more likely we are to make a mistake.
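Transformations of this sort are simple enough to script, which also makes them repeatable and auditable. The sketch below assumes a hypothetical export in which every value arrives as a string and the Likert items were coded 0-6 by the survey tool; it converts values to numbers and shifts the named items back to the instrument's 1-7 scale.

```python
def clean_record(raw, likert_items=("q1", "q2", "q3")):
    # raw: one exported row as a dict of column name -> string value
    # (an assumed layout). Convert strings to integers and shift the
    # tool's 0-6 coding back to the published 1-7 Likert scale.
    cleaned = {}
    for col, val in raw.items():
        n = int(val)  # exports often default everything to string format
        cleaned[col] = n + 1 if col in likert_items else n
    return cleaned
```

Applying such a function to every exported row replaces several error-prone manual steps with one scripted step.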
Data collection over the Internet has many potential benefits. Unfortunately, it also has many potential problems. Properly used, Internet-based data collection can generate large samples, be a solution to funding problems, ease logistics, and eliminate data entry. However, problems can arise during any phase of the research. With careful planning, many issues can be avoided altogether. While not all inclusive, this paper presents many of the issues the authors have encountered while conducting Internet-based data collection.
The advantages of Internet-based research have allowed us to dream a little bigger and pursue projects and research questions we would never otherwise have considered. Who would want to collect data in six cities in three states without formal funding? The Internet and some "creative budgeting" allowed the two of us to put the finishing touches on a project that had been two years in the making but confined to the available student pool for data collection. However, we will not discard the paper-and-pencil format either. For some projects, the inclusion of electronic data collection is not only unnecessary but also impractical: it can add needless costs, time commitments, and headaches when used for smaller samples that are easily available. Conducting Internet-based research remains a decision that the researcher must weigh carefully.
Acknowledgements

We wish to thank Paul A. Bell for providing feedback on early drafts of this article and Donna Merwarth for her competent technical assistance during many of the research projects discussed here and many that have not been discussed. We also acknowledge the contribution of the five anonymous referees and the detailed editorial support received from JRP.
References

Andrews, D., Nonnecke, B., & Preece, J. (2003). Electronic survey methodology: A case study in reaching hard-to-involve Internet users. International Journal of Human-Computer Interaction, 16(2), 185-210.
Belmont report: Ethical principles and guidelines for the protection of human subjects of research (Vol. 2, DHEW No. 5, 78-0014). (1978). Washington, DC: US Government Printing Office.
Berrens, R. P., Bohara, A. K., Jenkins-Smith, H., Silva, C., & Weimer, D. L. (2003). The advent of Internet surveys for political research: A comparison of telephone and Internet samples. Political Analysis, 11(1), 1-22.
Berry, D. M. (2004). Internet research: Privacy, ethics, and alienation--An open source approach. Internet Research, 14(4), 323-332.
Brennan, M. (1992). The effect of monetary incentive on mail survey response rates: New data. Journal of the Market Research Society, 34, 173-177.
Cochran, W. G. (1977). Sampling techniques (3rd ed.). New York: Wiley.
Dillman, D. A. (1978). Mail and telephone surveys. Newbury Park, CA: Sage.
Hewson, C. M., Laurent, D., & Vogel, C. M. (1996). Proper methodologies for psychological and sociological studies conducted via the Internet. Behavior Research Methods, Instruments, & Computers, 28(2), 186-191.
Hong, T. (2006). The influence of structural and message features on web site credibility. Journal of the American Society for Information Science and Technology, 57(1), 114-127.
Hovland, C. I., & Weiss, W. (1951). The influence of source credibility on communication effectiveness. Public Opinion Quarterly, 15, 635-650.
Koo, M., & Skinner, H. (2005). Challenges of Internet recruitment: A case study with disappointing results. Journal of Medical Internet Research, 7(1), Article e6. Retrieved April 25, 2006, from http://www.jmir.org/2005/1/e6/
Manfreda, K. L., Batagelj, Z., & Vehovar, V. (2002). Design of web survey questionnaires: Three basic experiments. Journal of Computer-Mediated Communication, 7(3). Retrieved April 25, 2006, from http://jcmc.indiana.edu/vol7/issue3/vehovar.html
Milgram, S. (1974). Obedience to authority: An experimental view. New York: Harper and Row.
McCready, W. C. (1996). Applying sampling procedures. In F. T. L. Leong & J. T. Austin (Eds.), The psychology research handbook: A guide for graduate students and research assistants (pp. 98-110). Thousand Oaks, CA: Sage.
Naglieri, J. A., Drasgow, F., Schmidt, M., Handler, L., Prifitera, A., Margolis, A., et al. (2004). Psychological testing on the Internet: New problems, old issues. American Psychologist, 59(3), 150-162.
Nederhof, A. J. (1988). Effects of a final telephone reminder and questionnaire cover design in mail surveys. Social Science Research, 17, 353-361.
Nosek, B. A., Banaji, M. R., & Greenwald, A. G. (2002). E-research: Ethics, security, design, and control in psychological research on the Internet. Journal of Social Issues, 58(1), 161-176.
Olson, J. M., & Cal, A. V. (1984). Source credibility, attitudes, and the recall of past behaviours. European Journal of Social Psychology, 14, 203-210.
Orlich, D. (1978). Designing sensible surveys. Pleasantville, NY: Redgrave.
Pittenger, D. J. (2003). Internet research: An opportunity to revisit classic ethical problems in behavioral research. Ethics and Behavior, 13(1), 45-60.
Rosnow, R. L., & Rosenthal, R. (2005). Beginning behavioral research: A conceptual primer (5th ed.). Upper Saddle River, NJ: Prentice Hall.
Sheehan, K. B., & McMillan, S. J. (1999). Response variation in e-mail surveys: An exploration. Journal of Advertising Research, 39(4), 45-54.
Schillewaert, N., & Meulemeester, P. (2005). Comparing response distributions of offline and online data collection methods. International Journal of Market Research, 47(2), 163-178.
Schuman, H., & Presser, S. (1981). Questions and answers in attitude surveys. New York: Academic Press.
Sethuraman, R., Kerin, R. A., & Cron, W. L. (2005). A field study comparing online and offline data collection methods for identifying product attribute preferences using conjoint analysis. Journal of Business Research, 58, 602-610.
Sudman, S. (1983). Applied sampling. In P. Rossi, J. Wright, & A. Anderson (Eds.), Handbook of survey research (pp. 145-194). New York: Academic Press.
Sudman, S., & Bradburn, N. M. (1982). Asking questions: A practical guide to questionnaire design. San Francisco: Jossey-Bass.
Waern, Y. (2001). Ethics in global Internet research (Report from the Department of Communication Studies, Linkoping University). Retrieved April 25, 2006, from http://www.cddc.vt.edu/aoir/ethics/public/YWaern-globalirethics.pdf
Zhang, Y. (2000). Using the Internet for survey research: A case study. Journal of the American Society for Information Science, 51(1), 57-68.
Received 24 January 2006 | Accepted 17 April 2006 | Published 4 October 2006
Copyright © 2006 Journal of Research Practice and the authors
The past decade has seen a tremendous increase in internet use and computer-mediated communication (Fox, Rainie, Larsen, Horrigan, Lenhart, Spooner, & Carter, 2001; Horrigan, 2001; Nie & Erbring, 2000; Nie, Hillygus, & Erbring, 2002). As an increasing amount of communicative activity takes place through this new medium, there has likewise been a significant increase in primary research on virtual communities, online relationships, and a variety of other aspects of computer-mediated communication (Flaherty, Pearce, & Rubin, 1998; Matheson, 1991; Nonnecke, Preece, Andrews, & Voutour, 2004; Preece, 1999; Preece & Ghozati, 2001; Walther, 1996; Walther & Boyd, 2002; Wood & Smith, 2001; Wright, 2000a, 2002a, 2002b, 2004). Studies of online populations have led to an increase in the use of online surveys, presenting scholars with new challenges in terms of applying traditional survey research methods to the study of online behavior and Internet use (Andrews, Nonnecke, & Preece, 2003; Bachmann & Elfrink, 1996; Stanton, 1998; Witmer, Colman, & Katzman, 1999; Yun & Trumbo, 2000).
The technology for online survey research is young and evolving. Until recently, creating and conducting an online survey was a time-consuming task requiring familiarity with web authoring programs, HTML code, and scripting programs. Today, survey authoring software packages and online survey services make online survey research much easier and faster. Yet many researchers in different disciplines may be unaware of the advantages and disadvantages associated with conducting survey research online. Advantages include access to individuals in distant locations, the ability to reach difficult to contact participants, and the convenience of having automated data collection, which reduces researcher time and effort. Disadvantages of online survey research include uncertainty over the validity of the data and sampling issues, and concerns surrounding the design, implementation, and evaluation of an online survey.
This article considers and evaluates the advantages and disadvantages related to conducting online surveys identified in previous research. In addition, it reviews the current state of available web survey software packages and services, various features of these software packages and services, and their advantages and limitations. The purpose of the article is to provide an overview of issues and resources in order to assist researchers in determining if they would benefit from using online surveys, and to guide them in the selection and use of online survey techniques. To facilitate these goals, which are both methodological and applied, the author draws on published research dealing with online survey methods, as well as his experience conducting more than 10 online surveys.
Advantages of Online Survey Research
Researchers in a variety of disciplines may find the Internet a fruitful area for conducting survey research. As the cost of computer hardware and software continues to decrease, and the popularity of the Internet increases, more segments of society are using the Internet for communication and information (Fox et al., 2001; Nie et al., 2002). Thousands of groups and organizations have moved online, many of them aggressively promoting their presence through the use of search engines, email lists, and banner advertisements. These organizations not only offer information to consumers, they also present opportunities for researchers to access a variety of populations who are affiliated with these groups.
Communication researchers may find the Internet an especially rich domain for conducting survey research. Virtual communities have flourished online, and hundreds of thousands of people regularly participate in discussions about almost every conceivable issue and interest (Horrigan, 2001; Wellman, 1997; Wellman & Haythornthwaite, 2002). Areas as diverse as interpersonal (Parks & Floyd, 1996; Tidwell & Walther, 2002; Wright, 2004), group (Hollingshead, McGrath, & O'Connor, 1993; Hobman, Bordia, Irmer, & Chang, 2002), organizational (Ahuja & Carley, 1998), health (Rice & Katz, 2001; Wright, 2000a), and mass communication (Flaherty et al.,1998; Flanagin & Metzger, 2001) have been studied using online surveys.
Access to Unique Populations
One advantage of online survey research is that it takes advantage of the ability of the Internet to provide access to groups and individuals who would be difficult, if not impossible, to reach through other channels (Garton, Haythornthwaite, & Wellman, 1999; Wellman, 1997). In many cases, communities and groups exist only in cyberspace. For example, it would be difficult to find a large, concentrated group of people conducting face-to-face discussions of topics such as cyber-stalking, online stock trading, and the pros and cons of virtual dating. While people certainly discuss such issues among friends, family members, and co-workers, few meet face-to-face in large groups to discuss them. One advantage of virtual communities as sites for research is that they offer a mechanism through which a researcher can gain access to people who share specific interests, attitudes, beliefs, and values regarding an issue, problem, or activity. For example, researchers can find a concentrated number of older individuals who use computers on the Internet-based community SeniorNet (Furlong, 1989; Wright, 2000a, 2000c). In contrast, with traditional survey research methods it may be more difficult to reach a large number of demographically-similar older people who are interested in computers.
Another example is the case of individuals with diseases or conditions, such as HIV, eating disorders, and physical disabilities. Individuals with these conditions and diseases are often difficult to reach because they are stigmatized offline. Health communication researchers have been able to go online to study these populations, including examining how features of the computer medium help people cope with the social stigma of their condition (Braithwaite, Waldron, & Finn, 1999; Wright, 2000b). More generally, the Internet enables communication among people who may be hesitant to meet face-to-face. For example, individuals with unpopular political views may hesitate to express themselves openly, and groups of individuals such as Arab-Americans may feel uncomfortable talking about anti-Arab sentiment in public places (Muhtaseb, 2004). These individuals and groups often can be reached on the Internet in larger numbers than would be possible using face-to-face research methods.
A second advantage is that Internet-based survey research may save time for researchers. As already noted, online surveys allow a researcher to reach thousands of people with common characteristics in a short amount of time, despite possibly being separated by great geographic distances (Bachmann & Elfrink, 1996; Garton et al., 2003; Taylor, 2000; Yun & Trumbo, 2000). A researcher interested in surveying hard-to-reach populations can quickly gain access to large numbers of such individuals by posting invitations to participate to newsgroups, chat rooms, and message board communities. In the face-to-face research environment, it would take considerably longer, if it were possible at all, to find an equivalent number of people with specific attributes, interests, and attitudes in one location.
Online surveys may also save time by allowing researchers to collect data while they work on other tasks (Llieva, Baron, & Healey, 2002). Once an invitation to participate in a survey is posted to the website of a community of interest, emailed to people through a listserv service, or distributed through an online survey research service, researchers may collect data while working on other projects (Andrews et al., 2003). Responses to online surveys can be transmitted to the researcher immediately via email, or posted to an HTML document or database file. This allows researchers to conduct preliminary analyses on collected data while waiting for the desired number of responses to accumulate (Llieva et al., 2002). First generation online survey researchers often used email-based surveys, which involved creating online survey forms using word processing software, and later used products such as Macromedia's Dreamweaver. Researchers had to “cut and paste” responses from the email responses into statistical software programs such as SAS and SPSS. More recently, online survey creation software packages provide a variety of templates to create and implement online surveys more easily, as well as to export data to statistical software packages. Moreover, a number of online survey services provide survey design assistance, generate samples, and analyze and interpret data. Some of the newer software packages and web-based services are detailed below.
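The first-generation cut-and-paste workflow described above can itself be partly automated. The sketch below assumes plain-text email replies in a hypothetical "question: answer" line format (not any standard) and collects them into a single CSV string ready for import into a statistics package.

```python
import csv
import io

def emails_to_csv(bodies, fieldnames):
    # bodies: plain-text email replies, one string per respondent, with
    # lines like "q1: 4" (an assumed format for illustration only).
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for body in bodies:
        row = {}
        for line in body.splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip() in fieldnames:
                row[key.strip()] = value.strip()
        writer.writerow(row)
    return out.getvalue()
```

Even a throwaway parser like this removes the manual transcription step that introduced copy errors in the email-survey era.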
Online survey researchers can also save money by moving to an electronic medium from a paper format (Bachmann & Elfrink, 1996; Couper, 2000; Llieva et al., 2002; Yun & Trumbo, 2000). Paper surveys tend to be costly, even when using a relatively small sample, and the costs of a traditional large-scale survey using mailed questionnaires can be enormous. The use of online surveys circumvents this problem by eliminating the need for paper and other costs, such as those incurred through postage, printing, and data entry (Llieva et al., 2002; Watt, 1999; Witmer et al., 1999). Similarly, conducting online interviews, either by email or in a synchronous "chat" format, offers cost savings. Costs for recording equipment, travel, and the telephone can be eliminated, and transcription costs can be avoided since online responses are automatically documented. The costs of newer online survey creation software and web survey services vary from very little to thousands of dollars, depending upon the types of features and services selected; even so, they are relatively inexpensive compared to the cost of traditional paper-and-pencil surveys.
Disadvantages Associated with Online Survey Research
As discussed above, online surveys offer many advantages over traditional surveys. However, there are also disadvantages that should be considered by researchers contemplating using online survey methodology. Although many of the problems discussed in this section are also inherent in traditional survey research, some are unique to the computer medium.
When conducting online research, investigators can encounter problems with sampling (Andrews et al., 2003; Howard, Rainie, & Jones, 2001). For example, relatively little may be known about the characteristics of people in online communities, aside from some basic demographic variables, and even this information may be questionable (Dillman, 2000; Stanton, 1998). A number of recent web survey services provide access to certain populations by offering email lists generated from other online surveys conducted through the service, and some offer access to specialized populations based on data from previous surveys. However, if those data were self-reported, there is no guarantee that participants from previous surveys provided accurate demographic or other personal information.
Generating Samples from Virtual Groups and Organizations
Some virtual groups and organizations provide membership email lists that can help researchers establish a sampling frame. However, not all members of virtual groups and organizations allow their email addresses to be listed, and some may not allow administrators to provide their email addresses to researchers. This makes accurately sizing an online population difficult.
Once an email list is obtained, it is possible to email an online survey invitation and link to every member on the list. Theoretically, this can give researchers a sampling frame. However, problems such as multiple email addresses for the same person, multiple responses from participants, and invalid/inactive email addresses make random sampling online a problematic method in many circumstances (Andrews et al., 2003; Couper, 2000). One solution is for researchers to require participants to contact them to obtain a unique code number (and a place to include this code number on the online questionnaire) prior to completing a survey. However, requiring this extra step may significantly reduce the response rate. Another solution that some newer web survey programs offer is response tracking. Participants are required to submit their email address in order to complete the survey. Once they have completed the survey, the survey program remembers the participant's email address and does not allow anyone using that email address access to the survey. This feature helps to reduce multiple responses, although someone could potentially complete the survey a second time using a secondary email address (Konstan, Rosser, Ross, Stanton, & Edwards, 2005).
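The response-tracking feature described above can be approximated with a simple server-side check. The sketch below is a minimal in-memory version (real survey services persist the list and typically verify the address); it also illustrates the limit Konstan et al. (2005) note, since a secondary email address defeats it.

```python
_completed = set()

def allow_access(email):
    # Admit an address only once. Normalization catches trivial
    # variants (case, stray whitespace) but not secondary addresses.
    key = email.strip().lower()
    if key in _completed:
        return False
    _completed.add(key)
    return True
```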
Generating a Sample from an Online Community
Establishing a sampling frame when researching an online community presents a number of challenges. Unlike membership-based organizations, many online communities, such as community bulletin boards and chat rooms, do not typically provide participant email addresses. Membership is based on common interests, not fees, and little information is required when registering to use these communities, if registration is required at all. Some researchers attempt to establish a sampling frame by counting the number of participants in an online community, or the published number of members, over a given period of time. In either case, the ebb and flow of communication in online communities can make it difficult to establish an accurate sampling frame. For example, participation in online communities may be sporadic depending on the nature of the group and the individuals involved in discussions. Some people are “regulars,” who may make daily contributions to discussions, while others only participate intermittently. Furthermore, “lurkers,” or individuals who read posts but do not send messages, may complete an online survey even though they are not visible to the rest of the community. The presence of lurkers in online communities appears to be highly variable (Preece, Nonnecke, & Andrews, 2004). Studies have found that in some online communities lurkers represent a high percentage (between 45% and 99%) of community members, while other studies have found few lurkers (Preece et al., 2004). Because lurkers do not make their presence known to the group, this makes it difficult to obtain an accurate sampling frame or an accurate estimate of the population characteristics.
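The lurker proportions reported by Preece et al. (2004) imply very wide bounds on true community size. If only posters are visible and lurkers make up a proportion p of members, the total is visible / (1 - p); the sketch below applies the reported 45-99 per cent range.

```python
def population_bounds(visible_posters, lurker_low=0.45, lurker_high=0.99):
    # Assumes lurker proportion p = lurkers / total members, so that
    # total = visible / (1 - p). The 0.45-0.99 range is the spread
    # reported by Preece, Nonnecke, and Andrews (2004).
    return (
        visible_posters / (1 - lurker_low),
        visible_posters / (1 - lurker_high),
    )
```

For a community with 100 visible posters, the plausible total membership ranges from roughly 180 to 10,000, which is why a head count of posters makes such a poor sampling frame.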
As internet communities become more stable, some community administrators are beginning to compile statistics on their community's participants. Many communities require a person to register with the community in order to participate in discussions, and some communities are willing to provide researchers with statistics about community membership (at least in aggregate form). Registration typically involves asking for the individual's name, basic demographic information such as age and gender, and email address. Other community administrators might ask participants for information about interests, income level, education, etc. Some communities are willing to share participant information with researchers as a validation technique by comparing the survey sample characteristics with those of the online community in general. Yet, because individuals easily can lie about any information they report to community administrators, there is no guarantee of accuracy.
When possible, using both online and traditional paper surveys helps to assess whether individuals responding to the online version are responding in systematically different ways from those who completed the paper version. For example, Query and Wright (2003) used a combination of online and paper surveys to study older adults who were caregivers for loved ones with Alzheimer's disease. The researchers attempted to assess whether the online responses were skewed in any way by comparing the responses from both subsamples. While no significant differences between the two subsamples were found in this particular study, real differences in responses between Internet users and non-Internet users might exist in other populations. This may make it difficult to assess whether the observed differences are due to factors such as participant deception or actual differences due to characteristics associated with computer and non-computer users.
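A mode-effect check of the kind Query and Wright (2003) performed amounts to comparing the two subsamples measure by measure. A minimal sketch, using Welch's t statistic on one scale score (degrees of freedom and p-values omitted for brevity):

```python
from statistics import mean, stdev

def welch_t(online_scores, paper_scores):
    # Welch's t statistic for two independent subsamples; a large
    # absolute value suggests the collection mode may have skewed
    # responses on this measure.
    va = stdev(online_scores) ** 2 / len(online_scores)
    vb = stdev(paper_scores) ** 2 / len(paper_scores)
    return (mean(online_scores) - mean(paper_scores)) / (va + vb) ** 0.5
```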
Other Sampling Concerns
Although some studies of online survey methods have found that response rates in email surveys are equal to or better than those for traditional mailed surveys (Mehta & Sivadas, 1995; Stanton, 1998; Thompson, Surface, Martin, & Sanders, 2003), these findings may be questionable because non-response rates are difficult to ascertain in most large online communities (Andrews et al., 2003). One relatively inexpensive technique used by market researchers to increase response rates is to offer a financial incentive such as a lottery: individuals who participate in the survey are given a chance to win a prize or gift certificate, and the winner is selected randomly from the pool of respondents. However, this technique is not without problems. Internet users frequently encounter bogus lotteries and other "get rich quick" schemes online, so a lottery approach to increasing response rates could potentially undermine the credibility of the survey. In addition, offering a financial incentive may increase multiple responses to the survey as participants try to "stack the deck" to increase their chances of winning (Konstan et al., 2005). Straight incentives, such as a coupon redeemable for real merchandise (e.g., books), may be more effective and more credible.
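One partial mitigation for incentive-driven multiple responses is to draw the lottery winner from unique respondents rather than from raw submissions, so extra entries do not improve anyone's odds. A minimal sketch (identifying respondents by email address is itself an assumption, and is defeated by secondary addresses):

```python
import random

def draw_winner(respondent_emails, seed=None):
    # Draw uniformly from the set of unique normalized addresses, so
    # duplicate submissions from one address carry no extra advantage.
    unique = sorted({e.strip().lower() for e in respondent_emails})
    return random.Random(seed).choice(unique)
```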
Self-selection bias is another major limitation of online survey research (Stanton, 1998; Thompson et al., 2003; Witmer et al., 1999). In any given Internet community, some individuals are more likely than others to complete an online survey. Many Internet communities pay for their operations with advertising, which can desensitize participants to worthwhile survey requests posted on the website. In short, some individuals tend to respond to an invitation to participate in an online survey while others ignore it, leading to a systematic bias.
These sampling issues inhibit researchers' ability to make generalizations about study findings. This, in turn, limits their ability to estimate population parameters, which presents the greatest threat to conducting probability research. For researchers interested only in conducting nonprobability research, these issues are somewhat less of a concern. Researchers who use nonprobability samples assume that they will not be able to estimate population parameters.
Many of the problems discussed here are not unique to online survey research. Mailed surveys suffer from the same basic limitations. While a researcher may have a person's mailing address, he or she does not know for certain whether the recipient of the mailed survey is the person who actually completes and returns it (Schmidt, 1997). Moreover, respondents to mailed surveys can misrepresent their age, gender, level of education, and a host of other variables as easily as a person can in an online survey. Even when the precise characteristics of a sample are known by the researcher, people can still respond in socially desirable ways or misrepresent their identity or their true feelings about the content of the survey.
The best defense against deception that researchers may have is replication. Only by conducting multiple online surveys with the same or similar types of Internet communities can researchers gain a reliable picture of the characteristics of online survey participants.
Some researchers access potential participants by posting invitations to participate in a survey on community bulletin boards, discussion groups, and chat rooms. However, members of online communities often find this behavior rude or offensive (Hudson & Bruckman, 2004), or consider this type of posting to be “spam” (Andrews et al., 2003). A community moderator may delete the unwanted post, or the researcher may be inundated with emails from irate members of the community. Researchers using email invitations to participate in a survey may face similar rejection. An unwanted email advertisement is often considered an invasion of privacy. The invitation for the survey may be deleted, or the researcher may receive email from participants complaining about it.
Some participants in Internet communities actually welcome studies by researchers, especially when members are interested in how their community is perceived by others. With some diplomatic dialogue initiated by the researcher, it is often possible to work with web community administrators and participants when proposing a study idea (Reid, 1996). This is a more ethnographic approach. Although accessing some online communities can be extremely challenging, seeking permission from the community and taking time to explain the purpose of the study might help a researcher to gain access. Nonetheless, it may take a long time before receiving a response to a request, and community sponsors may reject the researcher's request despite his or her attempts to convey the possible benefits of the study for the community (Andrews et al., 2003). Researchers might apologize in advance for the potentially unwanted posting, with an explanation of the importance of conducting the research and possible benefits to members.
Researchers can foster “good will” between themselves and community participants by offering to provide information about the results of their study to the community. One way to do this is to create a study report, highlighting the most interesting results to the online community audience, post it on a web page, and have community administrators post a link to the page on the community web site. Study results should be presented so that audience members can understand them. For example, the author of this article created a summary of research findings for the SeniorNet community after completing a study of social support among participants (Wright, 2000a). SeniorNet administrators created a special link to this web page so that the participants in the study (as well as other SeniorNet members) could learn about the results and their possible implications.
It is important for researchers to include contact information, information about the study, and something about their credentials when creating an invitation to participate in a survey. In addition to being a requirement of most institutional research review boards in universities in the United States, this helps to enhance the credibility of the survey and it can create opportunities for email interaction between the researcher and participants. This is valuable, especially when participants have questions. However, as Andrews et al. (2003) point out, providing researcher contact information has its downside. Researchers can sometimes become the targets of abusive individuals who resent the invasion of privacy when they encounter an online survey. Hate email or worse types of abuse can occur if some individuals on the Internet find online surveys offensive. A man once called the author's home phone number repeatedly and left threatening messages on his voice mail after obtaining the number from his department secretary (the department number appeared on the informed consent for the online survey). While such incidents tend to be rare, the possibility of irate responses is something to consider.
The above does not constitute an exhaustive review of the advantages and disadvantages of conducting online surveys, but it reflects the experiences of many researchers and points to factors that should be weighed when deciding whether to use, and how to design, an online survey. The next section surveys current web survey software packages and online survey-related services available to researchers who may be interested in conducting online survey research.
Current Web Survey Software Packages and Online Survey-Related Services
As noted above, today's researchers have help with online survey work. There are currently dozens of online survey software packages and web survey services available to researchers willing to pay for them. Table 1 lists 20 of the more prominent packages and services, along with their web addresses.
The author examined each of the websites for these 20 online survey product and service companies in order to assess current features, pricing, and limitations, as well as to identify current trends in the online survey product and services market. Table 2 presents a comparison of features, pricing issues, and limitations of the 20 online product and service companies.
Table 2. Features, Pricing, and Limitations of 20 Online Survey Products and Services

| Company Name/Product | Features | Pricing | Service Limitations |
| --- | --- | --- | --- |
| Active Websurvey | Unlimited surveys; software automatically generates HTML code for survey forms | Information unavailable on website | Customer required to purchase software; limited to 9 question formats |
| Apian Software | Full service web design and hosting available | $1,195 up to $5,995 depending on number of software users; customer charged for technical support | Customer required to purchase software |
| Create Survey | Standard features; educational discount | $99 a month for unlimited surveys and responses; free email support | Survey housed on company server for a set amount of time |
| EZSurvey | Unlimited surveys; mobile survey technology available; educational discount | $399 for basic software; additional software is extra; telephone training is $150 an hour | Customer required to purchase software |
| FormSite | Weekly survey traffic report; multiple language support | $9.95 up to $99.95 per month depending on desired number of responses | Survey housed on company server for only a set amount of time; limited number of responses per month |
| HostedSurvey | Standard features; educational discount | Charge is per number of responses; first 250 responses are free, then around $20 per 50 responses | Survey housed on company server for only a set amount of time |
| InfoPoll | Standard features; software can be downloaded for free | Information unavailable on website; limited customer support; training available for a fee | Software can be downloaded free, but works best on InfoPoll server; customers appear to be charged for using InfoPoll server |
| InstantSurvey | Standard features; supports multimedia | Information unavailable on website; free 30-day trial | Survey housed on company server for only a set amount of time |
| KeySurvey | Online focus group feature; unlimited surveys | $670 per year for a basic subscription; free 30-day trial | Survey housed on company server for only a set amount of time; limited to 2,000 responses |
| Perseus | Educational discount; mobile survey technology available | Information unavailable on website; free 30-day trial | Survey housed on company server for only a set amount of time |
| PollPro | Standard features; unlimited surveys | $249 for single user; access to PollPro server is an additional fee | Customer required to purchase software |
| Quask | Supports multimedia | $199 for basic software; access to Quask server for an additional fee | Customer required to purchase software; more advanced features only come with higher priced software |
| Ridgecrest | Standard features; educational discount | $54.95 for 30 days | Survey housed on company server for only a set amount of time; limited to 1,000 responses for basic package |
| SumQuest | Standard features; user guidebook for creating questionnaires available | $495 to purchase software; free unlimited telephone support | Customer required to purchase software |
| SuperSurvey | Standard features | $149 per week for basic package | Survey housed on company server for only a set amount of time; 2,000 responses per week limit |
| SurveyCrafter | Standard features; educational discount | $495 for basic software package; free and unlimited technical support | Customer required to purchase software |
| SurveyMonkey | Standard features; unlimited surveys | $20 a month for a basic subscription; free email support | Survey housed on company server for a set amount of time; limited to 1,000 initial responses |
| SurveySite | Company helps with all aspects of survey design, data collection, and analysis; online focus group feature | Information unavailable on website | Company staff rather than customer create and conduct survey |
| WebSurveyor | Standard features; unlimited surveys | $1,495 per year for software license | Customer required to purchase software |
| Zoomerang | Standard features; educational discount | $599 for software | Customer required to purchase software |
This is not, of course, an exhaustive list of online survey software and service businesses. However, it represents a good cross-section of the types of online survey products and services currently available to researchers. The following sections consider some of the current features of online survey products and services, pricing issues, limitations, and the implications of using these products and services for online survey research.
Survey Creation Software vs. Expanded Services
The businesses listed in Tables 1 and 2 offer researchers two basic options for creating and conducting online survey research. The first option is an online survey software package: a computer program that researchers use to create and conduct online surveys on their own computer and server. The companies that offer such packages also provide options for customer support, server space for the online survey (in some cases), and several data tracking and analysis options. Other companies offer a wider range of services for conducting online surveys, including research design, online questionnaire development, sampling and data collection services, and data analysis and interpretation services. The major features of, and problems with, each option are discussed below.
Purchasing Software Option
Some companies (see Table 2) require customers to purchase online survey creation software. Owning the software enables researchers to create multiple online surveys of any length as opposed to being charged per survey, per time period (e.g., by the month), by number of responses, by survey length, or by some combination of these options. Many of these companies also offer customer support, including help with design, data collection, participant tracking, and data analysis. One disadvantage of owning the software is that customers have to pay to upgrade software. Given rapid advances in software development, this software may be outdated in a relatively short period of time. Customers who have purchased software receive discounted upgrades, however. An example of this option is EZ Survey, which allows researchers to run the software on their own computer and a server of their choice. This may be an attractive choice for researchers who have access to free server space on their university or research organization server.
Online Questionnaire Features
The businesses listed in Tables 1 and 2 offer a wide array of options for creating online surveys, including many different templates to help first-time web survey researchers. Each of the online survey products reviewed offered some type of online form to collect data from participants. A “form” is an interactive type of web page that allows Internet users to send information across computer networks. After completing an online survey, participants click on a “submit” button on the web page, which transmits the survey responses to the researcher. Online survey questions are of the same types as those on a traditional paper-and-pencil questionnaire, except that participants submit their responses over the Internet rather than returning questionnaires in person or by mail. Common Gateway Interface (CGI) scripting, a type of computer language run on the web server where an online survey is housed, tells the server how to process the submitted information.
Most Internet users are quite familiar with web forms, since search engines such as Yahoo! and Google are built around sophisticated forms. Writing scripts to process forms can be done manually, but this work can be cumbersome for a busy researcher, especially one who is not technologically proficient. All of the reviewed companies offering online survey products provide a variety of useful questionnaire options and a user-friendly process for developing online questionnaires.
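As a minimal sketch of what such a form-processing script does behind the scenes (ours for illustration, not code from any reviewed product), the Python fragment below decodes a URL-encoded form submission into a response record; the field names `q1`, `q2`, and `comments` are hypothetical survey items.

```python
from urllib.parse import parse_qs

def process_submission(query_string):
    """Decode a URL-encoded survey submission into a response record."""
    fields = parse_qs(query_string)
    # parse_qs returns a list per field, because a field such as a
    # checklist may legitimately carry multiple values
    return {name: values if len(values) > 1 else values[0]
            for name, values in fields.items()}

# Clicking "submit" sends the responses as an encoded string like this:
raw = "q1=4&q2=strongly+agree&comments=no+issues"
record = process_submission(raw)
print(record)  # {'q1': '4', 'q2': 'strongly agree', 'comments': 'no issues'}
```

In a classic CGI setup the query string would arrive via the request body or the QUERY_STRING environment variable; modern server frameworks wrap this same decoding step.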
The businesses listed in Tables 1 and 2 typically offered a range of question types, although the number of options varied from business to business. Basic question options usually include Likert-type scales, semantic differential scales, checklists, textboxes (for qualitative responses), drop-down menus (for nominal or categorical items), and filter questions (to tailor surveys to individual characteristics of survey respondents). In addition, the majority of the reviewed products offer randomized answer choices for participants, so as to vary the order of question responses and thus reduce question order bias.
Some products support multiple language versions of an online survey and versions for visually impaired respondents. Additionally, some products offer more complex question-type options, such as multiple response matrices and the ability to use multimedia, i.e., asking participants to respond to a video or audio clip. A multimedia video or audio clip can be used to jog the memories of respondents or as a reference point for participant responses. For example, researchers who want to measure participant perceptions of a political candidate's positions on foreign policy could include a video clip from a recent speech. Multimedia can also be useful when targeting low literacy populations, since video and audio messages guide participants through an online survey. However, including multimedia can increase download times and may be frustrating to participants who must download media players or other types of programs in order to participate in the survey (Andrews et al., 2003). Taking the use of multimedia a bit further, the technology exists to easily construct a web page that uses video and/or audio clips as stimuli for online experimental and quasi-experimental designs. It is also possible to develop computer scripts that randomly send participants to one of several other web pages. Each web page could contain a different audio and video stimulus, enabling the random assignment of participants to different levels of an independent variable. All respondents (regardless of which condition they were assigned to) could then be linked to an online questionnaire containing dependent variable measures. Researchers who are interested in more sophisticated designs such as these would probably benefit from selecting a business that offers a greater degree of consulting and technical support.
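The random-assignment idea described above can be sketched in a few lines; the URLs below are hypothetical placeholders for two stimulus pages, not addresses from any reviewed product.

```python
import random

# Hypothetical stimulus pages, one per level of the independent variable
CONDITION_PAGES = [
    "https://example.org/study/speech_clip_a.html",
    "https://example.org/study/speech_clip_b.html",
]

def assign_condition(pages=CONDITION_PAGES):
    """Pick the stimulus page an arriving participant is redirected to.

    Every stimulus page then links onward to the same questionnaire
    containing the dependent-variable measures.
    """
    return random.choice(pages)

destination = assign_condition()
```

With equal selection probabilities this yields simple random assignment; a script could instead alternate or block-randomize if balanced cell sizes were needed.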
Data Collection and Analysis Features
In addition to helping researchers create online surveys, most of the reviewed businesses offer features that aid the data collection and analysis processes, as well as customer support. These range from basic features to more in-depth involvement by company consultants. Basic survey process features include tracking of survey respondent email, email response notification, real time tracking of item responses, and the ability to export survey responses to statistical software packages such as SAS and SPSS. In addition, most of the reviewed companies offer a required answer feature, which prevents survey data submission unless certain items are responded to. This reduces missing data, especially for key survey measures. Most online survey companies offer a redirect feature to display a “thank you for participating” web page, or any web page a researcher chooses, after a participant submits the data to the researcher. Other basic features include the ability to share data with other researchers, enabling research teams with members at different locations to share survey results.
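The export feature mentioned above amounts to writing responses as a flat file that packages such as SPSS or SAS can read. A minimal sketch, assuming responses are held as one dictionary per participant (the item names are hypothetical):

```python
import csv
import io

responses = [
    {"participant": 1, "q1": 4, "q2": 5},
    {"participant": 2, "q1": 3, "q2": 2},
]

# Write a CSV with one row per participant and one column per item
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["participant", "q1", "q2"])
writer.writeheader()
writer.writerows(responses)
print(buffer.getvalue())
```

In practice the buffer would be a file on disk, which the statistical package then imports directly.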
Although most of the reviewed companies offer free technical support, researchers are generally charged a fee for extensive consultations and/or training. For example, SurveySite offers consultation throughout the entire survey research process, including method design, questionnaire creation, data collection, data analysis, and interpretation of results. Zoomerang offers access to tailored email lists and multisource recruiting for sampling, allowing researchers to target specific demographic groups within a population of interest. Other companies will help researchers collect data by advertising the survey on certain websites. Some companies offer other types of features to aid with the survey research process. For example, EZ Survey offers a free sample size calculator, and businesses such as SurveyMonkey offer pop-up advertising to aid in recruiting participants. Some companies, such as InstantSurvey, unsubscribe respondents from an email list after they have completed a survey, which may help to reduce multiple responses from the same participant.
Several of the companies offer researchers even more sophisticated options for conducting survey research. Perseus can conduct mobile surveys using wireless handheld devices such as PalmPilots. Data are sent through wireless technology to a server (as with other online survey forms), where the information is posted to a database file. Mobile Internet surveys offer a number of advantages to researchers. Using a wireless device (as opposed to a laptop computer), researchers can bring a survey to otherwise inaccessible populations in the non-virtual world, such as patients in a busy healthcare setting, individuals in rural settings, or socioeconomic groups that do not have access to computers or the Internet. This allows researchers to conduct face-to-face interviews with participants while using the wireless device to store and transmit survey responses to a database. In addition, some companies, such as Perseus, can merge computer technology with traditional survey methods: they offer telephone survey capabilities where participants use a touch-tone phone to enter responses.
Other companies, such as KeySurvey and SurveySite, provide the ability to conduct online focus groups. The Internet allows researchers to include participants from multiple geographic locations in the same focus group. Participants view the same video, audio, and/or text in real time from remote locations. Researchers can interact with participants via chat room applications or webcam and audio teleconferencing technologies. Real-time computer applications are important in focus groups because researchers want participants to interact with the focus group facilitator and with each other at the same time. The responses of one participant can trigger ideas and responses among other participants, leading to richer results. These qualitative focus groups are often used as a precursor to developing a quantitative survey to reach broader numbers of individuals.
Costs of survey products and their services vary. In general, the more features and services needed from a web survey company, the more it will cost. However, it is a “buyer beware” situation. Basic features can be purchased for a relatively small amount of money. For instance, SurveyMonkey provides authoring tools, server space, and simple automated survey analysis for about $20 a month ($240 per year); however, there are limitations, such as a cap of 1,000 responses per month. SurveyMonkey charges an additional 5 cents per survey response over that limit. Moreover, paying more does not necessarily mean more services. Other businesses, such as KeySurvey, charge substantially more ($670 per year for a basic subscription) for products and services similar to those offered by SurveyMonkey. Other companies charge researchers by the survey. Companies that charge less typically do not recruit participants for customers and do not provide consultation throughout all stages of the research process. However, for many web survey researchers, these basic, less expensive approaches may be sufficient, especially for those experienced in conceptualizing survey projects, analyzing data, and interpreting results. In general, if sample generation or help with analyzing data is not needed, then businesses that include these services in the price should be avoided, or else these services should be negotiated out of the price. Pricing for the businesses reviewed here varied considerably even though they offered similar products, features, and services. For example, SuperSurvey offers products, features, and services similar to SurveyMonkey's for $500 to $2,000 per business quarter (depending upon the number of users and responses desired), as opposed to only $20 a month.
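The SurveyMonkey figures quoted above can be checked with a short calculation; the function below simply restates the quoted plan ($20 base, 1,000 included responses, 5 cents per extra response) and is an illustration, not an official rate card.

```python
def monthly_cost(responses, base=20.00, included=1000, per_extra=0.05):
    """Estimated monthly charge under the quoted basic plan."""
    extra = max(0, responses - included)
    return base + extra * per_extra

print(monthly_cost(800))   # within the included 1,000 responses: base rate only
print(monthly_cost(1500))  # 500 extra responses at $0.05 each
```

At 1,500 responses the estimate comes to $45 for the month, which shows how quickly overage fees can overtake the advertised base price.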
As previously stated, while most companies offer free technical support, researchers are generally charged extra for extended training and consultation. In some cases, consultation can be expensive. For example, Perseus charges $2,000 per day for personalized training, but also offers discounts for group consultation and training. Moreover, many of the reviewed business websites offer educational discounts for academics, including discounts on software, consultations, and other support services. For example, Zoomerang offers educators one year of access to its online web survey creation services, server space for surveys, and customer support for around $350 (about $100 less than the regular price). Other businesses, such as Perseus and SurveyCrafter, advertise educational discounts on a wider variety of services. Researchers should inquire about these discounts since they may reduce the overall cost of purchasing web survey software or services.
As noted above, there may be limitations associated with using web survey products and services. Some specific limitations include issues of time, space, and number of responses allowed for a given price. For example, companies such as SurveyMonkey and SuperSurvey will host an online survey for a set amount of time. If a researcher wants to keep a survey on the company's server for an extended period of time (such as more than a year), this costs extra. In addition, some companies often charge more for longer surveys and for a number of respondents exceeding a certain amount (generally over 1000). Purchased software, in contrast, generally does not have space or response number restrictions.
There are also generally limitations to the amount of free customer support a researcher can obtain. Customer support may be available for minor technical problems and customer questions, but customers are generally charged extra for extended consultations and training. Typically, minor questions can be answered for free via telephone, email, or chat applications, but a researcher may be charged for extensive training, such as learning advanced web page creation techniques or data analysis instruction. Researchers who use a company's email lists to generate a sample are limited by the quality of this type of sampling frame. In cases where a company uses the same lists again and again for different clients, the individuals who receive the advertisements about a survey on these lists may become weary of being targeted by multiple surveys, and this could negatively impact response rates.
Implications of Using Web Survey Products and Services
Current web survey products and services have greatly facilitated the process of creating and conducting online surveys. Thanks to the variety of attractive features offered by the businesses highlighted in this article, researchers can save considerable time compared with building an online survey themselves using a web authoring program. The cost of these products and services varies depending on the types of features and services a researcher desires. As with purchasing any product or service, researchers should assess their research needs, budget, and time frame, and should comparison shop when deciding which business to use.
As we have seen, however, these products and services are not without limitations. While attractive, features of the surveys themselves (such as multimedia) and the services offered by web survey businesses (e.g., using company email lists to generate samples) can affect the quality of data in a variety of ways. Furthermore, using these products and services does not necessarily circumvent the disadvantages of online surveys, including issues related to sampling frames, response rates, participant deception, and access to populations. In short, researchers should view current web survey products and services as another research tool that, like the online survey itself, has its own unique advantages and disadvantages.
The author would like to thank the anonymous reviewers for their helpful and insightful suggestions for improving this manuscript.