
Internet Research

Emerald Article: Social research 2.0: virtual snowball sampling method using Facebook Fabiola Baltar, Ignasi Brunet

Article information:
To cite this document: Fabiola Baltar, Ignasi Brunet, (2012), "Social research 2.0: virtual snowball sampling method using Facebook", Internet Research, Vol. 22 Iss: 1, pp. 57-74
Permanent link to this document: http://dx.doi.org/10.1108/10662241211199960
Downloaded on: 24-11-2012
References: This document contains references to 48 other documents
To copy this document: permissions@emeraldinsight.com


The current issue and full text archive of this journal is available at www.emeraldinsight.com/1066-2243.htm

Social research 2.0: virtual snowball sampling method using Facebook


Fabiola Baltar
University of Mar del Plata, Mar del Plata, Argentina, and

Social research 2.0

Received 9 March 2011; revised 10 March 2011, 10 July 2011, 26 July 2011 and 4 August 2011; accepted 12 August 2011

Ignasi Brunet
University Rovira I Virgili, Reus, Spain
Abstract
Purpose - The aim of this paper is to present a sampling method that uses virtual networks to study hard-to-reach populations. In social research, the use of new technologies is still questioned because selection bias is an obstacle to carrying out scientific research on the Internet. In this regard, the authors' hypothesis is that social networking sites (Web 2.0) can be effective for the study of hard-to-reach populations. The main advantages of this technique are that it can expand the geographical scope and facilitate the identification of individuals with barriers to access. Therefore, the use of virtual networks in non-probabilistic samples can increase the sample size and its representativeness.

Design/methodology/approach - To test this hypothesis, a virtual method was designed using Facebook to identify Argentinean immigrant entrepreneurs in Spain (214 cases). A characteristic of this population is that some individuals are administratively invisible in national statistics because they have double nationality (non-EU and EU). Virtual sampling was combined with an online questionnaire as a complementary tool for Web 2.0 research in the behavioural sciences.

Findings - The number of cases detected through Facebook and the virtual response rate are higher than those obtained with the traditional snowball technique. The explanation is that people's level of confidence increases because the researcher shows his personal information (Facebook profile) and also participates in their groups of interest (Facebook groups). Moreover, administering the questionnaire online allows the quality of the information to be controlled and avoids duplication of cases.

Originality/value - The present article is the first to use Facebook as an instrument to study immigrants. Its adoption therefore represents a great challenge in the social research field, because there are many barriers of access and search. The article also proposes a novel mix of traditional methodologies updated with new virtual possibilities for studying hard-to-reach populations, especially in areas of social research where the contributions of these methods are less developed.

Keywords Web 2.0, Snowball sampling, Hard to reach population, Social research methodology, Sampling methods
Paper type Research paper

1. Introduction
We cannot ignore the importance of virtual relationships in people's lives. Every day a large number of activities take place in this online reality, where individuals express thoughts, intentions and opinions about events that happen in their real world. In fact, in many activities such as commerce, banking, services, firm strategies and politics, the Internet occupies a central role. Moreover, different disciplines have incorporated the use of these information technologies in their practices, for example in areas such as learning (Yardi, 2007), health, banking (Birch and Young, 1997),

Internet Research, Vol. 22 No. 1, 2012, pp. 57-74, © Emerald Group Publishing Limited, 1066-2243, DOI 10.1108/10662241211199960

INTR 22,1


marketing and industry (Burns and Bush, 2006; Birch and Young, 1997) and politics (Kushin and Kitchener, 2009).

However, whether the Internet can be a viable scientific research tool is still under discussion. The main question concerns the ability of online methodologies to produce valid and reliable data. The existence of sample bias (the Internet population constitutes a biased sample of the total population in terms of demographic characteristics) raises doubts about its usefulness in social research (Coomber, 1997; Stanton, 1998). Nevertheless, the Internet opens new ways to carry out research in the social and behavioural sciences, because many scientific questions about a target population do not seek generalised results but representative ones. As many scholars recognise, the Internet provides new opportunities for non-random survey data (Couper, 2000; Fricker and Schonlau, 2002). In particular, these authors highlight the advantages of the Internet for capturing hard-to-reach/hard-to-involve populations. Accordingly, this paper analyses whether social networking sites (SNSs), specifically Facebook, can help researchers to contact hidden or hard-to-reach persons.

Although researchers have become interested in the study of SNSs over the last decade, they have focused mainly on the characteristics of SNS users (Back et al., 2010), on the analysis of social capital and its influence on users' behaviour (Ellison et al., 2007; Bigge, 2006) or on ethical issues concerning the disclosure of private information on the Internet (Acquisti and Gross, 2006; Boyd and Ellison, 2008; Boyd, 2008; Boyd and Hargittai, 2010). Until now, however, it has not been common to exploit SNSs as a research tool (Brickman-Bhutta, 2009).

Boyd and Ellison (2008, p. 1) define SNSs as web-based services that allow individuals to construct a public or semi-public profile within a bounded system, articulate a list of other users with whom they share a connection, and view and traverse their list of connections and those made by others within the system. Given these characteristics, the emergence of SNSs has transformed the Internet into an efficient tool for snowball sampling. As Brickman-Bhutta (2009, p. 1) observes, online social networking sites offer new ways for researchers to run surveys quickly, cheaply, and single-handedly, especially when seeking to construct snowball samples of small or stigmatised subsets of the general population.

Following this idea, the exploratory hypothesis is that SNSs are a good complement for sampling hard-to-reach/hard-to-involve populations because they make it possible to expand the size and scope of the sample, the two features that constitute the main limitations of this kind of research. SNSs are therefore an appropriate tool for applying snowball sampling and can improve the representativeness of the results.

In order to test this hypothesis, we applied virtual online sampling to detect Argentinean entrepreneurs in Spain. The characteristic of this target population is that nearly 60 per cent of them live in Spain as European citizens. This group is not administratively visible in Spanish national statistics as Argentineans, so it is impossible to build a sample frame to obtain a probabilistic sample. Moreover, if this hidden population (the double nationality citizens) is not taken into account, the representativeness of the results is threatened.

2. Objectives
The purpose of this article is to present a methodology applied to the study of Argentinean entrepreneurs living in Spain. The novelty of the study is the incorporation of Facebook and online questionnaires to improve the efficacy of snowball sampling and data collection. Although the discussion and application of these methodologies are better developed in some other fields of research (e.g. public health), their development in business and economic research is scarce. This paper contributes to the literature by providing empirical evidence of the benefits and weaknesses of using online surveys in social research. Furthermore, it shows that it is possible to do rigorous and efficient social research by incorporating new technologies into the scientific research process.

The article has the following structure. First, the literature review on online sampling is presented, including the snowball sampling strategy for studying hard-to-reach populations and the use of Facebook as a sample frame. Second, the use of online questionnaires as a complementary tool for doing online social research is discussed. Third, the results obtained in the application of SNSs to the study of Argentinean entrepreneurs are shown. Finally, the conclusions of the study are presented.

3. Online and traditional sampling in hard to reach populations: an overview
The methodological debate around using the Internet as a sample frame can be summarised in one question: how can online samples improve the quality and the response rate in the data collection phase of a study and reduce the selection bias problem? Maronick (2009) considered that the wide variation in response rates on Internet surveys is due to the use of different methods to contact people and encourage them to participate in the study. In fact, some authors demonstrate that the response rate of online studies depends on personalised contact strategies (Cook et al., 2000), the interest of the individuals in the topic considered (Groves et al., 2004), the incentives and the length of the survey (Moskowitz and Birgi, 2008) and technical factors (Couper, 2000).
Moreover, the quality of the information and the selection bias problem are associated with sampling error (i.e. missing virtual groups), coverage (i.e. Internet accessibility of the target population), non-response patterns (i.e. differences between respondents and non-respondents on a variable of interest) and measurement error (i.e. deviation of respondents' answers from those of the population). All these barriers affect the external validity of online research and reduce the feasibility of using the Internet for scientific purposes (Fricker et al., 2005). However, in many research fields, online research can be a powerful instrument to widen the scope of studies, improve the time-cost trade-off and increase the size of the sample. One suitable application is the sampling of hard-to-reach/hard-to-involve populations.

3.1 Traditional sampling strategies to study hard to reach populations
Marpsat and Razafindratsima (2010, p. 4) have summarised the main characteristics that define a population as hard to reach:
• The population of interest has relatively low numbers, which makes an investigation throughout the general population very expensive.
• Members of the population of interest are hard to identify.
• What they have in common is not easy to detect and is only rarely recorded; there is no sampling frame or only a very incomplete one; the persons concerned do not wish to disclose that they are members of this population of interest, because their behaviour is illicit, is socially stigmatised, they have no desire to revisit a painful past or because they refuse to allow any meddling in their affairs.
• The behaviour of the population of interest is not known, which leads to a poor choice of places in which to approach them.


There are many areas where the characteristics of the unit of analysis and the accessibility of the information are strictly bounded. Under those conditions, authors have identified different sampling methods according to the cost of administration, the geographical constraints and the specificities of the hidden population:
• snowball sampling (Browne, 2005);
• targeted sampling (Watters and Biernacki, 1989);
• time space sampling (Muhib et al., 2001); and
• respondent-driven sampling (Heckathorn, 1997).

Snowball sampling is defined as:
[. . .] a technique for finding research subjects. One subject gives the researcher the name of another subject, who in turn provides the name of a third, and so on. This strategy can be viewed as a response to overcoming the problems associated with sampling concealed hard to reach populations such as the criminal and the isolated (Atkinson and Flint, 2001, p. 1).
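
The chain-referral idea in this definition can be illustrated with a minimal sketch. It is not part of the original study: the referral map, the names in it and the `max_waves` cut-off are hypothetical, chosen only to show how a sample grows wave by wave from an initial seed.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_waves):
    """Collect subjects wave by wave by following referrals from each recruit.

    referrals: hypothetical mapping of a subject to the people they name.
    seeds: initial subjects chosen by the researcher.
    max_waves: number of referral waves to follow after the seeds.
    """
    sampled = list(dict.fromkeys(seeds))  # keep seed order, drop duplicates
    seen = set(sampled)
    frontier = deque(sampled)
    for _ in range(max_waves):
        next_frontier = deque()
        while frontier:
            person = frontier.popleft()
            for contact in referrals.get(person, []):
                if contact not in seen:  # each subject enters the sample once
                    seen.add(contact)
                    sampled.append(contact)
                    next_frontier.append(contact)
        frontier = next_frontier
    return sampled


# Hypothetical referral chains, for illustration only.
referrals = {
    "ana": ["bruno", "carla"],
    "bruno": ["dario"],
    "carla": ["dario", "elena"],
    "dario": ["franco"],
}
sample = snowball_sample(referrals, seeds=["ana"], max_waves=2)
print(sample)
```

With two waves the sketch stops before reaching anyone named only in the third wave, which shows how the choice of seeds and the number of waves shape the final sample.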

Snowball sampling is a useful methodology in exploratory, qualitative and descriptive research, especially in studies where respondents are few in number or where a high degree of trust is required to initiate contact (hard-to-reach/hard-to-involve populations). This technique is frequently associated with the study of vulnerable or stigmatised populations that are reluctant to participate in studies using traditional research methods. Although the initial seeds in snowball sampling are in theory randomly chosen, this is difficult to carry out in practice and they are usually selected via a convenience sampling method. Magnani et al. (2005) pointed out that the sample composition is influenced by the choice of initial seeds. Thus, those samples tend to be biased towards more cooperative individuals or those who have a large personal network.

To address the limitations observed in the traditional snowball method, targeted sampling includes an initial ethnographic assessment in order to identify the networks that might exist in a given population. Subgroups are then treated as a cluster sample, which reduces coverage bias and therefore increases representativeness.

In order to improve the external validity of non-probabilistic samples, two methods were developed to bring them closer to a probabilistic sample. Time space sampling seeks to identify suitable subjects in certain types of locations. Researchers enumerate a probability sample of sites and define the meeting times, and the data are finally collected from all or a sample of the members found there. Because geographical conditions can change over time, there is a potential sampling bias that makes it necessary to update the sampling frame, which means increased costs. Furthermore, it is important to consider that the hidden population might not always be reachable in a specific geographical context.

For this reason, the most developed approach applied recently is respondent-driven sampling (Heckathorn, 1997). The method combines chain referral sampling with a recruitment process based on respondents' social networks that allows the calculation of selection probabilities. In this approach, the seeds are recruiters. They have a limited number of coupons and they have monetary incentives to participate in the study. Each referred respondent can also act as a recruiter until the desired sample size is reached. The main advantage is that recruitment bias, in particular the oversampling of respondents with larger social networks, can be controlled.

Many authors observe that snowball samples have a number of deficiencies (Van Meter, 1990; Johnston and Sabin, 2010). The main problem relates to representativeness and selection bias, which limit the external validity of the sample. Atkinson and Flint (2001, p. 2) suggest that:
[. . .] the problem of selection bias may be partially addressed, firstly through the generation of large sample and secondly by the replication of results to strengthen any generalizations.


In traditional snowball sampling it is often complex or costly to find a considerable number of cases that are valid for the purpose of the study. The main barriers to obtaining a considerable sample size are the geographical scope and the time needed to build trust between the researcher and the individuals. In spite of these limitations, we consider it important to pay attention to the concept of statistical generalisation and to other concepts associated with external validation, such as theoretical validity or the possibilities of replication.

Van Meter (1990) associates statistical inference with descending methodologies. A descending methodology involves strategies for the study of general populations. In fact, standardised questionnaires and rigorous population samples are necessary when traditional statistical analysis is applied. Ascending methodologies, on the other hand, involve research strategies that are adapted to selected social groups with intensive data collection methods. Techniques like snowball sampling, ethnography and narratives are commonly used, especially when the population of the study is hard to reach. As Van Meter (1990, p. 32) points out:
[the] methods of analysis in this type of methodology must also be adapted to the specific form of data furnished and also to the specific objectives of the research. Typical forms of ascending data analysis are social network analysis and classification analysis, often called cluster analysis.

The main disadvantages are the biases particularly associated with data collection: sampling error, sample bias and response bias. The selection bias that originates in the higher participation of individuals with large networks and strong ties is the main constraint in non-probabilistic samples. However, Wong (2008, p. 4) believes that matching purposive samples and the population characteristics using uncorrelated properties can minimise selection bias. The possibility of comparing the sample with official statistics and with similar demographic data can improve the representativeness of the sample (Witte et al., 2000).

Similarly, in the case of ascending methodologies, the sample is strictly related to the objectives of the research. Although results cannot be generalised to the population, because the units of observation are not randomly selected, other aspects must be considered in the analysis of external validity, for example the importance of the theory (Wong, 2008). Our point is that statistical generalisation refers only to the predictive validity of the study, and not all research questions aim to predict results for the population. For instance, many authors have explained that it is possible to make valid inferences from any sample within a theory's domain (Wong, 2008). In fact, the real concern in generalising theoretical explanations is ensuring that the operationalisations of the constructs allow for generalisable inferences (Salganik and Heckathorn, 2004). Therefore, the concept of transferability (operative constructs applied among different samples) acquires great importance in exploratory studies that do not pursue statistical generalisation but theoretical validation of the results.

3.2. Virtual snowball sampling and Internet data collection: advantages and disadvantages
In this section we discuss the benefits and problems associated with the application of virtual instruments in the study of hidden populations. Although the literature on the use of online data collection tools (e.g. online surveys, interviews, etc.) is extensive (Wilson and Laskey, 2003; Benfield and Szlemko, 2006; Davidovich and Uhr, 2006), it is less developed in the analysis of the online sample recruitment process. Moreover, the study of online recruitment of hard-to-reach populations in the economics and business research field is highly limited (Ilieva et al., 2002; Burns and Bush, 2006). In this regard, the main contributions and applications of online sampling come from the health sciences (e.g. drug addiction, alcoholism) and psychology (e.g. sex, discrimination, etc.).

Virtual snowball sampling not only facilitates access to hard-to-reach populations, but can also expand the sample size and the scope of the study and reduce costs and time (Benfield and Szlemko, 2006). Following Evans and Mathur (2005), the main advantages of online surveys are:
• the flexibility to apply them in different formats or to have several versions according to the respondent (e.g. language);
• online surveys can be administered in a time-efficient manner, minimising the period needed to collect and process data;
• technological innovations make questionnaires more attractive and easier to use, even for respondents without computer skills;
• respondents can answer at a time that is convenient for them;
• once the last questionnaire for a study is submitted, the researcher instantaneously has all the data stored in a database;
• online surveys can include all kinds of questions (e.g. dichotomous, multiple-choice, scales, open-ended questions);
• costs are lower because there are specialised online questionnaire development firms, and surveys are self-administered and do not require personal interviews;
• online surveys increase the response rate because it is easier to follow up non-respondents;
• the respondents answer the questions in the order intended by the study designer; and
• online surveys can be constructed so that the respondent must answer a question before advancing to the next one, and so that respondents answer only the questions that apply specifically to them.

In the same fashion, Evans and Mathur (2005) list a set of problems associated with the use of online surveys:
• The perception that the e-mail is spam, which increases the non-response rate.
• Selection bias related to the Internet population (gender, age, education level, socioeconomic level, etc.).
• The sample selection methods produce volunteer samples. Blanket e-mailing is sent to huge numbers of potential respondents in an unsolicited manner and only attracts the attention of proactive participants.
• Respondents' lack of online experience.
• Unclear answering instructions, because online questionnaires are self-administered.
• Impersonality: there is usually no human contact in online surveys.
• Privacy issues related to how data will be used.
• Some studies show that many online surveys have a low response rate (Fricker and Schonlau, 2002; Wilson and Laskey, 2003).


These limitations can directly affect the validation and quality of data, and so scientists are sometimes suspicious of its use. However, the recruitment of the units of observation by online means must be considered carefully. Traditional recruitment methods such as community sampling, mail and telephone surveys can be applied in an Internet-based project. In fact, the main problem is not the technique used in the research but the criteria selected to obtain the sample.

Benfield and Szlemko (2006) made a comparative analysis of four online surveys applied in different studies. The first project sent students an e-mail with a web link, the second project attempted to obtain a community sample through leaflets and a web link, the third project used snowball sampling techniques, sending e-mails to known people, and the fourth project recruited students by posting the study on a sign-up page and sending the link to the survey by e-mail. The results showed that snowball sampling was more effective in contacting participants from different places. It also reached higher response rates and provided a broad sample in which professions and ages were well represented.

Other authors have considered the possibility of applying online recruitment strategies. Rhodes et al. (2007) classified online strategies into active ones, where the researcher asks the respondents to participate (e.g. newsgroups), and passive ones, where the researcher searches for respondents using keywords in search engines (e.g. META tags related to the topic of interest). Miller and Sonderlund (2010) summarised the different online sampling methods in Table I.

Table I. Sampling methods for internet research

Newsgroups/listservs
  Advantages: targeted at group of interest; inexpensive
  Disadvantages: may be intrusive; may be geographically constrained
Spamming
  Advantages: broad coverage; more representative; inexpensive
  Disadvantages: highly intrusive; may affect future recruitment efficacy; over subscription/inappropriate responses; ethically problematic
Online advertising
  Advantages: targeted at group of interest
  Disadvantages: coverage limited to web site visitors; may be more expensive
Conventional advertising
  Advantages: broad coverage/not only initial web recruitment; more representative; targeted at group of interest
  Disadvantages: most expensive; over subscription/inappropriate responses

Source: Miller and Sonderlund (2010, p. 1563)

3.3 Facebook as a sample frame: combining random seeds and online/offline snowball sampling
Some empirical studies have applied Internet surveys to gain access to hard-to-reach samples (Miller and Sonderlund, 2010; Rhodes et al., 2007; Wilson and Laskey, 2003; Cambiar, 2006; Weiwu et al., 2010). For example, Duncan et al. (2003) created a web link to capture information from anonymous recreational drug users. Koo and Skinner (2005) compared a defined recruited community (registered members of a specialised web site) with open recruitment (searching by several routes such as Usenet forums, web discussion boards, etc.) for the study of young smokers. Although many authors argue that the appearance of social networking sites has changed individuals' behaviour and the way they relate to each other (Zhou, 2011), the use of these sites in social research, especially in the study of hard-to-reach populations, is still incipient.

As Boyd and Ellison (2008) describe, the first recognisable social networking site, SixDegrees.com, appeared in 1997. From 1997 to 2001 many communities were launched (AsianAvenue, BlackPlanet and MiGente), allowing users to create personal and professional profiles. The massive impact of SNSs, however, came in 2003 with the emergence of sites such as MySpace or LinkedIn. Many researchers recognise that with the creation of Facebook in 2004, the social network phenomenon has attracted more users, not only in the USA but all around the world.

Facebook is a popular free social networking web site that allows registered users to create profiles, upload photos and video, send messages and keep in touch with friends, family and colleagues. The site, which is available in 37 different languages, includes public features such as:
• Marketplace. This allows members to post, read and respond to classified advertisements.
• Groups. This allows members who have common interests to find one another and interact.
• Events. This allows members to publicise an event, invite guests and track who plans to attend.
• Pages. This allows members to create and promote a public page built around a specific topic.
• Presence technology. This allows members to see which contacts are online and chat.

Each Facebook profile has a wall, where friends can post comments. Wall postings are basically a public conversation. Alternatively, a user can send a person a private message, which will show up in his or her private inbox, similar to an e-mail message. Facebook was initially created to connect Harvard students, but it rapidly spread to include high school students, professionals and everyone else. Currently, more than 500 million users worldwide share their interests and personal information on Facebook. Individuals link to others, creating online communities of friends who show public information or send private messages. Furthermore, they can also create new groups or join existing groups based on common interests (school, fans, religion, ethnicity, etc.). The advantage of these groups is that each member has his or her own Facebook profile and can be contacted individually. Since Facebook was created, the applications and uses of the site have been changing.

We consider that the barriers to online research described above can be reduced by using SNSs to contact participants. We believe that the use of Facebook to contact individuals can minimise problems associated with spam messages, impersonal contact, unclear answers and low response rates. Moreover, the possibility of gaining access to offline contacts through the recommendations of online ones can reduce problems associated with selection bias and representation. In conclusion, SNSs applied to social research can benefit a study in two dimensions. First, they are a useful means to identify hard-to-reach populations and expand the sample size. Second, they can minimise some barriers associated with online data collection techniques. We observed these issues in the study of Argentinean entrepreneurs living in Spain. To gain access to information, online surveys can be a useful complement for generating valid and rigorous data. As Brickman-Bhutta (2009, p. 4) explained:
Facebook and other social network sites allow us to carry chain-referral methods into the age of the Internet, while also exploiting the strengths of online questionnaires. A single scholar can complete projects that previously required large teams. Costs of printing, postage, and data entry virtually disappear. Feedback is instantaneous. Turnaround times shrink from weeks to days. It becomes much easier to reach remote, diffuse, and alienated subpopulations.


Thus, in this paper we explore how Facebook and online surveys can reduce these barriers and improve the representativeness of non-probabilistic samples. We formulate the following propositions:

PI. Facebook can reduce time and space constraints by using social network contacts and can minimise the cost of updating the sample frame, in contrast with the traditional snowball sampling technique.

PII. Facebook can be an efficient source of information that can extend the sample size of studies that use ascending methodologies, increasing the validation and representativeness of the results.

PIII. The use of random selection of virtual groups reduces the selection bias observed in convenience sampling and increases the representativeness of exploratory studies using non-probabilistic samples.

PIV. The combination of Facebook recruitment with an online survey reduces time and monetary costs and minimises response bias.

4. Methodology
4.1 Recruiting survey participants
When we started the study of Argentinean entrepreneurs living in Spain we faced a methodological challenge due to the characteristics of the population. This group of immigrants is a hard-to-reach population for two reasons. First, there is no sample frame of foreign firms in Spain that relates them to the place of birth of the entrepreneur. Thus, as many of them have double nationality (45 per cent of Argentinean immigrants live in Spain with EU citizenship), it is not possible to reach those cases through secondary information such as official statistics. Second, they are spread throughout the country and do not form the ethnic enclaves observed in other groups of immigrants (Chinese, Arabs, etc.). In fact, it is not feasible to identify them by racial attributes or by using a traditional geographical sampling technique.

To look for this population we decided to use a virtual networking site: Facebook. Because they are immigrants living abroad, we assumed that they would maintain contact with their country of origin and be familiar with Internet technologies. We explored 52 virtual groups of immigrants living in Spain (e.g. Argentineans in Madrid, Argentineans in Barcelona, etc.). Inside each group we contacted the members by sending them private messages, asking whether they were Argentineans and whether they had started up a firm in Spain. Furthermore, we tried to extend the sample size by asking each member whether they knew anyone else (an online or offline contact) who could meet the sample criteria and participate in the study.

With the sample cases identified by traditional and virtual snowball sampling, a comparative analysis of both sources of information was done. Specifically, a chi-square test was applied in order to detect similarities and differences between both techniques.

As regards the validity of the study, an important issue related to the use of online sampling is the truthfulness of the information obtained through the Internet (Fricker and Schonlau, 2002). In the study, we used different methods to check the results obtained in the online survey.
First, we asked the entrepreneurs for some personal data (telephone, addresses, name of the firm, etc.), which we validated against different sources of information (blogs, statistics, telephone book, etc.). Second, we conducted personal interviews with some of them to cross-validate the results we had found previously. Third, we considered the contributions of previous research that has concluded that the quality of the information individuals share on social networks is higher on Facebook. Furthermore, people tend to behave reliably in this virtual context (Zhou, 2011; Acquisti and Gross, 2006; Dwyer et al., 2007).

5. Results

As mentioned in the literature review, two of the main problems associated with the use of the Internet in social research are selection bias and low response rates. Table II summarises the cases detected by using Facebook and traditional snowball sampling.

Table II. Argentineans detected by virtual and traditional snowball sampling

Sample                          Response          Non-response      Total
                                n        (%)      n       (%)       n        (%)
Facebook snowball sampling      1,023    53.6     887     46.4      1,910    100
Traditional snowball sampling   80       49.1     83      50.9      163      100
Total                           1,103             970               2,073

Source: Authors' own survey
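
The percentages in Table II follow directly from the published counts. The short Python sketch below recomputes them as a consistency check; the function and variable names are illustrative and are not part of the original study.

```python
# Counts transcribed from Table II: (responses, non-responses) per recruitment channel.
TABLE_II = {
    "Facebook snowball sampling": (1023, 887),
    "Traditional snowball sampling": (80, 83),
}


def response_rate(responses: int, non_responses: int) -> float:
    """Return the response rate as a percentage of all individuals contacted."""
    return 100.0 * responses / (responses + non_responses)


def summarise(table: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Response rate per channel plus the pooled rate, rounded to one decimal."""
    rates = {name: round(response_rate(r, nr), 1) for name, (r, nr) in table.items()}
    total_r = sum(r for r, _ in table.values())
    total_nr = sum(nr for _, nr in table.values())
    rates["Total"] = round(response_rate(total_r, total_nr), 1)
    return rates


if __name__ == "__main__":
    for name, rate in summarise(TABLE_II).items():
        print(f"{name}: {rate} per cent")
```

The pooled rate is computed from the summed counts rather than by averaging the two channel rates, because the two channels differ greatly in size.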

We managed to contact 2,073 individuals. From this sample frame we obtained 1,103 responses (53.2 per cent), of which 343 were entrepreneurs (31 per cent). An online questionnaire was sent to these entrepreneurs, of whom 214 answered (62.4 per cent). Table III summarises the distribution of valid cases (entrepreneurs/non-entrepreneurs) between virtual and traditional snowball sampling. Facebook was more effective than traditional snowball sampling in expanding the sample size (84.58 per cent of the 999 valid cases in Table III), but the latter was more efficient in detecting entrepreneurs (92.9 per cent of the contacts found by traditional snowball sampling were entrepreneurs). This is reasonable if we consider that in the traditional snowball sampling method there is previous knowledge of the characteristics of the unit of observation, while on Facebook a random process is applied for contacting cases (virtual groups were selected at random and every member inside them was contacted without previous knowledge).

Moreover, if we consider the characteristics of the sample it is possible to discuss the selection bias problem referred to in the literature. In the sample of Argentinean entrepreneurs, 73.3 per cent are male and 63.6 per cent have dual nationality (Argentinean nationality and that of an EU country). The largest age group is 26 to 41 years old (60.7 per cent) and 42.4 per cent have university studies (complete or incomplete). As can be seen from these results, there are many cases of adults and of people without a high level of education. We consider that the combination of virtual and traditional snowball sampling can improve the selection process. Although the online contact may be a young person, it is possible to access offline individuals recommended by them (parents, friends, etc.). Another important benefit related to the characteristics of the sample is the demonstrated effectiveness in capturing hard-to-reach populations.
In our sample, 60.7 per cent are EU (Community) citizens, who are not registered in official statistics as Argentineans.

Table III. Argentinean entrepreneurs detected by virtual and traditional snowball sampling

Sample                          Entrepreneurs     Non-entrepreneurs   Total
                                n        (%)      n        (%)        n       (%)
Facebook snowball sampling      200      23.7     645      76.3       845     100
Traditional snowball sampling   143      92.9     11       7.1        154     100
Total                           343               656                 999

Notes: Dif. p < 0.01 level of significance; (a) 99 per cent
Source: Authors' own survey

5.1 Response rate

With regard to the response rate, Table IV compares the responses obtained by virtual and traditional snowball sampling. The level of response in the total sample is considerable (62.4 per cent). Furthermore, statistically significant differences in response rate can be observed between the traditional (42 per cent) and virtual (77 per cent) search for units of observation. We believe that those differences are due to the form of contact applied in each case. For example, contacting by Facebook requires gaining the trust of the participant because the researcher comes with no reference from someone who recommends him. However, the possibility for the individual to access the researcher's profile information, to see others participating or to know their contacts and activities, increases the level of confidence about the purpose of the study in which they are involved. This result is consistent with the conclusions reported by other authors (Johnston and Sabin, 2010; Balden, 2008). The contacts found by traditional snowball sampling were detected mainly through formal channels, such as institutions like the embassy or organizations. In this case, the impersonal approach adopted by the researcher and the incomplete and outdated information given by those informants may have influenced the response rate in this group.

5.2 The quality of information using online questionnaires

The online questionnaire was sent by private message. By clicking a link, individuals opened an online address and started answering the survey. In this sense we highlight some aspects that are important in terms of the quality of the information collected:
• the instructions to complete the questionnaire;
• the operative actions needed to answer;
• the time spent doing the survey; and
• technical issues associated with the control of information.

Regarding the instructions to complete the questionnaire, we consider that the structure of online questionnaires reduces the possibility of making mistakes. First, it is possible to program specific instructions for each question (e.g. whether it is an open question, a single-answer question or a multiple-answer question). The answers are easily visible and better displayed in window form, which makes it possible to choose an answer with a single click on the proper option. Furthermore, filter questions can be applied which show, according to the respondent's answers, the questions that each person must complete. This also reduces the possibility of making mistakes.

Moreover, the operative actions needed to successfully complete the survey are easier than in other self-administered questionnaires (e.g. e-mail or postal surveys). In this case, only four instructions, in the respondents' own language, are used: open, finish later, next and send.
Although the survey clearly shows the location of these instructions, we also included them in the private message in which we sent the link to the questionnaire.

With regard to the completeness of the information, analyzing each of the 50 questions, we found that the highest item non-response rate was 17 per cent. On average, 5 per cent of the sample chose the "no answer" option on each question. In addition, only 3.74 per cent did not finish the survey. We consider that the high level of completeness is associated with the time spent on answering the survey and with the attractiveness of the online survey design.
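
The filter questions mentioned in this section (questions shown or hidden according to earlier answers) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the survey tool used in the study: the question identifiers, wording and routing rule are hypothetical, since the paper does not reproduce its questionnaire.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Answers = dict[str, str]


@dataclass
class Question:
    qid: str
    text: str
    # Filter rule: the question is shown only if this returns True.
    # None means the question is always shown.
    show_if: Optional[Callable[[Answers], bool]] = None


# Hypothetical questionnaire fragment (not taken from the paper).
QUESTIONS = [
    Question("q1", "Have you started a firm in Spain?"),
    Question("q2", "In which year did you start it?",
             show_if=lambda a: a.get("q1") == "yes"),
    Question("q3", "In which region do you live?"),
]


def applicable_questions(questions: list[Question], answers: Answers) -> list[str]:
    """Return the ids of the questions this respondent should see."""
    return [q.qid for q in questions
            if q.show_if is None or q.show_if(answers)]
```

Keeping the routing rule next to each question means a respondent who answers "no" to the first question is never shown the follow-up, which is the error-reducing effect the text describes.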

Table IV. Response rate in virtual and traditional snowball sampling

Sample (a)                      Response          Non-response      Total
                                n        (%)      n       (%)       n       (%)
Facebook snowball sampling      154      77       46      23        200     100
Traditional snowball sampling   60       42       83      58        143     100
Total                           214               129               343

Notes: Dif. p < 0.01 level of significance; (a) 99 per cent
Source: Authors' own survey
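
The methodology section states that a chi-square test was used to compare the two techniques, and the notes to Tables III and IV report differences at the 0.01 level of significance. The sketch below recomputes a Pearson chi-square statistic (without continuity correction) for both 2x2 tables from the published counts alone. Whether the authors used exactly this variant is an assumption, and the function name is illustrative. The value 6.635 is the standard chi-square critical value for one degree of freedom at the 0.01 level.

```python
def chi_square_2x2(table: list[list[int]]) -> float:
    """Pearson chi-square statistic for a table of counts (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Rows: Facebook, traditional snowball sampling.
TABLE_III = [[200, 645], [143, 11]]  # columns: entrepreneurs, non-entrepreneurs
TABLE_IV = [[154, 46], [60, 83]]     # columns: response, non-response

CRITICAL_0_01_DF1 = 6.635  # chi-square critical value, 1 degree of freedom, alpha = 0.01

if __name__ == "__main__":
    for name, table in (("Table III", TABLE_III), ("Table IV", TABLE_IV)):
        stat = chi_square_2x2(table)
        significant = stat > CRITICAL_0_01_DF1
        print(f"{name}: chi-square = {stat:.1f}, significant at 0.01: {significant}")
```

Both statistics fall well above the critical value, which is in line with the p < 0.01 notes under the two tables.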

In relation to the time required to answer, three aspects of the online survey are relevant: the 24-hour access, the "finish later" option and the click-to-answer format. On average, entrepreneurs needed 25 minutes to complete the survey. It is interesting to observe that the combination of the "finish later" option with 24-hour access favours the response rate. In addition, most questionnaires were completed after business hours (60.7 per cent). This is a strength of Facebook as a sample frame, especially in the business research field, where entrepreneurs' lack of time usually affects the response rate. If entrepreneurs log on to Facebook, it is because they have spare time to spend on other activities.

Finally, this type of format allows instant control of information. If someone abandons the survey, it is possible to contact them immediately to find out why they did not continue answering. Moreover, at an operational level, the researcher can obtain quick and cheap answers. In fact, several respondents can enter the survey simultaneously and the information can be transferred to statistical programs. Additionally, the system controls for duplicate cases, allowing only one access for each IP address (personal identification code). We could also verify the real existence of each case by asking entrepreneurs for personal and business data (telephone, address, location, etc.).

5.3 The representativeness of virtual snowball sampling

Traditional snowball sampling, as previously presented, can be seen as a biased sampling technique because it is not random. However, virtual networks incorporate random elements (the random selection of the virtual groups, the contact with every member inside them, etc.) that should be considered in the analysis of representation bias. Cohen (1990, p. 64) considers that a snowball procedure with random sequences is not a random sample but is the best way to select users in a representative way.
To examine this, we compared the territorial distribution of the units of observation in the sample with the census data regarding the Argentinean population living in Spain, both with EU citizenship and with residence (Table V). The relationship between the territorial distribution of the total population and that of the sample gave a correlation index of 0.92, which suggests that, although we cannot generalise the results, it is possible to obtain a representative sample. In the same way, we observed that we could contact persons and institutions that are not expected to be found by virtual search (e.g. older people, Argentinean organizations, etc.).

5.4 Limitations of using Facebook as sample frame

Although we have shown some advantages of using SNSs for the study of hard-to-reach populations, there are several limitations specifically related to samples drawn from virtual networks. The main limitation is that Facebook is not designed for mass mailing and, if the same message is sent many times, the administrator can block the account. The private message guarantees privacy and the individual's agreement to participate, but there are technical barriers to sending mass e-mails. In our research we had to open many e-mail accounts to send the messages. An alternative is to write directly to the administrator of each group, asking them to communicate with their members. In this case, the problem is that the e-mail can be seen as spam, which can reduce the level of participation.

Table V. Comparison of territorial distribution of Argentineans in the sample and in Spain

Region               Population (%)   Sample (%)
Andalucía            15.38            14
Aragón               1.37             1.86
Asturias             1.49             0
Islas Baleares       8.23             8.87
Islas Canarias       6.94             0.93
Cantabria            0.61             0
Galicia              5.71             4.20
Granada              1.86             0
Castilla la Mancha   1.61             0.46
Cataluña             23.49            23.36
Extremadura          0.43             0.46
Madrid               14.5             26.63
Murcia               1.42             1.40
Navarra              0.72             0
País Vasco           1.98             0.93
La Rioja             0.5              0
Ceuta                0                0
Melilla              0                0
Valencia             13.63            14
Non response                          3.73
Total (n)            290,281          214

Sources: Authors' own survey and National Spanish Statistics
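
Section 5.3 reports a correlation index of 0.92 between the two distributions in Table V but does not say which coefficient was used. The sketch below assumes it is Pearson's r and computes it over the 19 regional percentages transcribed from the table, leaving out the "Non response" and "Total" rows. The function and variable names are illustrative.

```python
import math

# Percentages transcribed from Table V, in the table's row order
# (Andalucía ... Valencia); "Non response" and "Total" rows are excluded.
POPULATION = [15.38, 1.37, 1.49, 8.23, 6.94, 0.61, 5.71, 1.86, 1.61, 23.49,
              0.43, 14.5, 1.42, 0.72, 1.98, 0.5, 0, 0, 13.63]
SAMPLE = [14, 1.86, 0, 8.87, 0.93, 0, 4.20, 0, 0.46, 23.36,
          0.46, 26.63, 1.40, 0, 0.93, 0, 0, 0, 14]


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length series."""
    if len(xs) != len(ys):
        raise ValueError("both series must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    print(f"Pearson r (assumed coefficient) = {pearson_r(POPULATION, SAMPLE):.2f}")
```

If the authors' index is indeed Pearson's r, the result should be comparable to the reported 0.92; small differences could arise from rounding in the table or from the use of a different coefficient.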

Another limitation is that people contacted through virtual networks can disappear in the future. This can affect sample size and representativeness. For this reason we consider that the virtual contact is just the first approach. It is then necessary to maintain other means of contact such as telephone, e-mail or personal interview if required. Additionally, as discussed previously, there is a selection bias in the sample because only part of the target population uses the Internet, and in particular Facebook. Even so, we believe that for the study of hard-to-reach populations and for descriptive and exploratory objectives, this sampling method is still effective despite its limitations. Finally, it must be considered that although the virtual snowball sampling technique implies a semi-random selection procedure, it is not possible to apply statistical analysis for the generalisation of the results.

Nevertheless, this paper has described a method that can extend the sample size in hard-to-reach populations and improve the response rate and the effectiveness of recruitment. Researchers must be careful in the use of this method for predictive generalisation, but it is important to highlight that this innovation can provide alternative methods to increase theoretical validity in some research fields.

6. Conclusions

This paper has discussed snowball sampling and the effects of incorporating virtual social networks (Facebook) to detect hard-to-reach populations. According to Brickman-Bhutta (2009), Facebook can be a remarkably good substitute for data obtained through more expensive procedures. SNS sampling shares most of the limitations associated with other forms of web-based research, but it is an appropriate tool for research on hard-to-reach populations that are difficult to study through conventional survey methods. Although we did not build a random sample, its geographical distribution preserved the statistical relation with the distribution of the total population. In our study of immigrant entrepreneurs in Spain the main advantages of using Facebook as a sample frame are the time and cost savings, the increase in sample size and the geographical scope of the study. In fact, this virtual sampling technique allowed us to improve on previous contributions about ethnic entrepreneurship by accessing a sample of considerable size from a minority group, in contrast to the method commonly applied in this research field, that is, small samples studied in depth. Thus, we managed to reduce the selection bias observed in research using ascending methodologies, extending the sample size and improving the representativeness of the sample.

Future research has to provide more empirical evidence to test the propositions presented above. In fact, many research questions can be addressed by applying online techniques and tools. Furthermore, in many research fields where subjects are hard to reach, more contributions are necessary to test in which way the extension of the sample size allows the researcher to use alternative methodologies and triangulation processes to increase validity. It is important to consider that virtual research needs to apply the same methodological steps to guarantee scientific rigor and validation. Virtual networks are a dynamic space (e.g. people enter and abandon the net; information is temporarily online, etc.). Online research needs to define methodological actions such as setting time boundaries for the collection of data, building a sample frame with personal data of the units of observation, saving the documentation that supports the research results and defining control variables to validate the results.
We can discuss whether or not the Internet is an appropriate means for conducting scientific research. However, if research is done with the same methodological rigor, we can use online research practices for many scientific objectives. That is, online research can be as serious as online transactions, political campaigns or learning courses are. Scientific social research seeks to understand reality and its changes. Nowadays it is impossible to understand human behaviour and its context without taking account of this virtual reality. This paper does not seek to offer a final definition of virtual snowball sampling. Instead, it seeks to begin a discussion of this often used but seldom debated research technique. Social networks, as channels for recruitment, can be an advantage for studying topics with barriers of access. In particular, establishing the trust of respondents is essential. From both an ethical and a practical point of view, respondents need to be reassured of the protection of the information they provide.

References

Acquisti, A. and Gross, R. (2006), "Imagined communities: awareness, information sharing, and privacy on the Facebook", in Golle, P. and Danezis, G. (Eds), Proceedings of 6th Workshop on Privacy Enhancing Technologies, Robinson College, Cambridge, pp. 36-58.
Atkinson, R. and Flint, J. (2001), "Accessing hidden and hard-to-reach populations: snowball research strategies", Social Research Update, Vol. 33, pp. 1-5.
Back, M.D., Stopfer, J.M., Vazire, S., Gaddis, S., Schmukle, S.C., Egloff, B. and Gosling, S.D. (2010), "Facebook profiles reflect actual personality, not self-idealization", Psychological Science, Vol. 21, pp. 372-4.

Balden, W. (2008), "Can you trust the data you collect from an online survey?", American Marketing Association Webcast, available at: http//amaevents.webex.com/eco6001/event (accessed 14 June 2010).
Benfield, J.A. and Szlemko, W.J. (2006), "Internet-based data collection: promises and realities", Journal of Research Practice, Vol. 2 No. 2, available at: http://jrp.icaap.org/index.php/jrp/article/view/30/51 (accessed 20 June 2010).
Bigge, R. (2006), "The cost of (anti-)social networks: identity, agency and neo-Luddites", First Monday, Vol. 11 No. 12, pp. 1-10.
Birch, D. and Young, M. (1997), "Financial services and the Internet - what does cyberspace mean for the financial services industry?", Internet Research, Vol. 7 No. 2, pp. 120-8.
Boyd, D. (2008), "Facebook's privacy trainwreck: exposure, invasion, and social convergence", Convergence, Vol. 14 No. 1, pp. 13-20.
Boyd, D. and Ellison, N. (2008), "Social network sites: definition, history, and scholarship", Journal of Computer Mediated Communication, Vol. 13, pp. 210-30.
Boyd, D. and Hargittai, E. (2010), "Facebook privacy settings: who cares?", First Monday, Vol. 15 No. 8, pp. 1-24.
Brickman-Bhutta, C. (2009), "Not by the book: Facebook as sampling frame", available at: www.thearda.com/../Not%20by%20the%20Book%20-%20Bhutta.doc (accessed 10 November 2010).
Browne, K. (2005), "Snowball sampling: using social networks to research non-heterosexual women", International Journal of Social Research Methodology, Vol. 8 No. 1, pp. 47-60.
Burns, A.C. and Bush, R.F. (2006), Marketing Research: Online Research Applications, Prentice Hall, Englewood Cliffs, NJ.
Cambiar (2006), "The online research industry: an update on current practices and trends", available at: www.gmi-mr.com/press/release (accessed 18 June 2010).
Cohen, P. (1990), Drugs as a Social Construct, Universiteit van Amsterdam, Amsterdam.
Cook, C., Heath, F. and Thompson, R. (2000), "A meta-analysis of response rates in web- or Internet-based surveys", Educational & Psychological Measurement, Vol. 60 No. 6, pp. 821-37.
Coomber, R. (1997), "Using the Internet for survey research", Sociological Research Online, Vol. 2 No. 2, available at: www.socresonline.org.uk/2/2/2.html (accessed 5 May 2010).
Couper, M.P. (2000), "Web surveys, a review of issues and approaches", Public Opinion Quarterly, Vol. 64 No. 4, pp. 464-94.
Davidovich, U. and Uhr, H. (2006), "Qualitative research online: self-reported pros and cons of being chat-interviewed online with web cameras", General Online Research Conference GOR 2006, available at: http://gor.de/gor06/index.php (accessed 26 July 2011).
Duncan, D., White, J.B. and Nicholson, T. (2003), "Using Internet-based surveys to reach hidden populations: case of nonabusive illicit drug users", American Journal of Health Behavior, Vol. 27 No. 3, pp. 208-18.
Dwyer, C., Starr, H. and Passerini, K. (2007), "Trust and privacy concerns within social networking sites: a comparison of Facebook and MySpace", American Conference on Information Systems Proceedings, available at: http://aisel.aisnet.org/amcis2007/339 (accessed 27 July 2011).
Ellison, N., Steinfield, C. and Lampe, C. (2007), "The benefits of Facebook friends: social capital and college students' use of online social network sites", Journal of Computer-mediated Communication, Vol. 12 No. 4, available at: http://jcmc.indiana.edu/vol12/issue4/ellison.html (accessed 8 May 2010).

Evans, J. and Mathur, A. (2005), "The value of online surveys", Internet Research, Vol. 15 No. 2, pp. 195-219.
Fricker, R. and Schonlau, M. (2002), "Advantages and disadvantages of Internet research surveys: evidence from the literature", Field Methods, Vol. 14 No. 4, pp. 347-67.
Fricker, S., Galesic, M., Torangeau, R. and Yan, T. (2005), "An experimental comparison of web and telephone surveys", Public Opinion Quarterly, Vol. 69 No. 3, pp. 370-973.
Groves, R.M., Presser, S. and Dipko, S. (2004), "The role of topic interest in survey participation decisions", Public Opinion Quarterly, Vol. 68 No. 1, pp. 2-31.
Heckathorn, D. (1997), "Respondent-driven sampling: a new approach to the study of hidden populations", Social Problems, Vol. 44 No. 2, pp. 174-99.
Ilieva, J., Baron, S. and Healey, N.M. (2002), "Online surveys in marketing research: pros and cons", International Journal of Market Research, Vol. 44 No. 3, pp. 361-76, available at: http://elibrary.ru/item.asp?id6241312 (accessed 26 July 2011).
Johnston, L. and Sabin, K. (2010), "Sampling hard-to-reach populations with respondent driven sampling", Methodological Innovations Online, Vol. 5 No. 2, pp. 38-48.
Koo, M. and Skinner, H. (2005), "Challenges of Internet recruitment: a case study with disappointing results", Journal of Medical Internet Research, Vol. 7 No. 1, available at: www.jmir.org/2005/1/e6/ (accessed 8 May 2011).
Kushin, M. and Kitchener, K. (2009), "Getting political on social network sites: exploring online political discourse on Facebook", First Monday, Vol. 14 No. 11, available at: http://firstmonday.org/htbin/giwrapbin/ojs/index.php/fm/article/viewArticle/2645 (accessed 10 June 2011).
Magnani, R., Sabin, K., Saidel, T. and Heckathorn, D. (2005), "Sampling hard to reach and hidden populations for HIV surveillance", AIDS, Vol. 19 No. 2, pp. 67-72.
Maronick, T. (2009), "Role of the Internet in survey research: guidelines for researchers and experts", Journal of Global Business and Technology, available at: http://findarticles.com/p/articles/mi_qa5493/is_200904/ai_n32128065/ (accessed 16 October 2010).
Marpsat, M. and Razafindratsima, N. (2010), "Survey methods for hard-to-reach populations: introduction to the special issue", Methodological Innovations Online, Vol. 5 No. 2, pp. 3-16.
Miller, P.G. and Sondelund, P. (2010), "Using the Internet to research hidden populations of illicit drug users: a review", Addiction, Vol. 105 No. 9, pp. 1557-67.
Moskowitz, H.R. and Birgi, M. (2008), "Optimising the language of e-mail survey invitations", International Journal of Marketing Research, Vol. 20 No. 4, pp. 491-510.
Muhib, F.B., Lin, L.S., Stueve, A., Miller, R.L., Ford, W.L., Johnson, W.D. and Smith, P.J. (2001), "A venue-based method for sampling hard-to-reach populations", Public Health Reports, Vol. 116, pp. 216-22.
Rhodes, S.D., Hergenrather, K.C., Yee, L.J., Knipper, E., Wilkin, A.M. and Omli, M.R. (2007), "Characteristics of a sample of men who have sex with men, recruited from gay bars and Internet chat rooms, who report methamphetamine use", AIDS Patient Care and STDs, Vol. 21 No. 8, pp. 575-83.
Salganik, M.J. and Heckathorn, D.D. (2004), "Sampling and estimation in hidden populations using respondent-driven sampling", in Stolzenberg, R. (Ed.), Sociological Methodology, Vol. 34, Blackwell, Oxford.
Stanton, J.M. (1998), "An empirical assessment of data collection using the Internet", Personnel Psychology, Vol. 51 No. 3, pp. 709-25.
Van Meter, K. (1990), "Methodological and design issues: techniques for assessing the representatives of snowball samples", NIDA Research Monograph, pp. 31-43.

Watters, J. and Biernacki, P. (1989), "Targeted sampling: options for the study of hidden populations", Social Problems, Vol. 36 No. 4, pp. 416-30.
Weiwu, Z., Johnson, T.J., Seltzer, T. and Bichard, S. (2010), "The revolution will be networked: the influence of social networking sites on political attitudes and behavior", Social Science Computer Review, Vol. 28, pp. 75-92.
Wilson, A. and Laskey, N. (2003), "Internet based marketing research: a serious alternative to traditional research methods?", Marketing Intelligence & Planning, Vol. 21 No. 2, pp. 79-84.
Witte, J.C., Amoroso, L.M. and Howard, P.E.N. (2000), "Research methodology: method and representation in Internet-based survey tools", Social Science Computer Review, Vol. 18, pp. 179-95.
Wong, T. (2008), "Purposive and snowball sampling in the study of ethnic and mainstream community organizations", paper presented at the Annual Meeting of the Western Political Science Association, Manchester Hyatt, San Diego, CA, available at: www.allacademic.com/meta/p238387_index.html (accessed 3 March 2011).
Yardi, S. (2007), "Whispers in the classroom", in McPherson, T. (Ed.), The John D. and Catherine T. MacArthur Foundation Series on Digital Media and Learning, MIT Press, Cambridge, MA, pp. 143-64.
Zhou, T. (2011), "Understanding online community user participation: a social influence perspective", Internet Research, Vol. 21 No. 1, pp. 67-81.

Corresponding author
Fabiola Baltar can be contacted at: fabaltar@mdp.edu.ar

