Sampling and Generalization1
In: The SAGE Handbook of Qualitative Data Collection
By: Margrit Schreier
Edited by: Uwe Flick
Pub. Date: 2018
Access Date: October 12, 2021
Publishing Company: SAGE Publications Ltd
City: London
Print ISBN: 9781473952133
Online ISBN: 9781526416070
DOI: https://dx.doi.org/10.4135/9781526416070
Print pages: 84-97
© 2018 SAGE Publications Ltd All Rights Reserved.
This PDF has been generated from SAGE Research Methods. Please note that the pagination of the
online version will vary from the pagination of the print book.
SAGE
SAGE Research Methods
2018 SAGE Publications, Ltd. All Rights Reserved.
Sampling and Generalization1
Margrit Schreier
Introduction
In their textbook about empirical research methodology in the social sciences, 6 and Bellamy write: ‘Making
warranted inferences is the whole point and the only point of doing social research’ (2012, p. 14). Empirical
research, in other words, does not limit itself to describing those instances included in a given study. It wants
to go beyond those instances and arrive at conclusions of broader relevance.
This is true of qualitative just as much as of quantitative research (on the extent and type of generalizations
in qualitative research see Onwuegbuzie and Leech, 2010).2 When Lynd and Lynd (1929), for example,
set about studying the community they called ‘Middletown’ in the early twentieth century, their goal was not
to provide an in-depth description of this one community. Instead, they wanted to draw conclusions about
contemporary life in the US in a Midwestern community on the threshold of industrialization in more general
terms. And they selected the community in question very carefully to make sure that the community was
indeed typical of Midwestern communities at that time. They looked at climate, population size, growth rate,
presence and number of industries, presence of local artistic life, any local problems, and chose ‘Middletown’
on all of those grounds (see Gobo, Chapter 5, this volume). They were thus very much aware that the
conclusions we can draw, that is, the kinds of generalizations we can make, are closely connected to the
instances we study, that is, our sample. The instances, in 6 and Bellamy's (2012) terms, act as the ‘warrants’
for our conclusions.
Qualitative research, with its holistic and in-depth approach, typically limits itself to a few instances or units
only, ranging from the single case study (as in the study of ‘Middletown’) to a sample size of around 20 to 40
(although sample sizes can, in rare cases, also be considerably larger). These units or instances can be very
diverse in nature: not only people, but documents, events, interactions, behaviours, etc. can all be sampled.
Sample sizes in qualitative research are much smaller than in quantitative research. If qualitative research
wants to arrive at conclusions that go beyond the instances studied, but can only include comparatively
few units, one would expect qualitative researchers to reflect all the more carefully about selection and
generalization. But this is not the case. The topic of sampling has long been neglected (Higginbottom, 2004;
Onwuegbuzie and Leech, 2007; Robinson, 2014), although there has been an increased interest in the topic
in recent years (e.g. the monograph on the topic by Emmel, 2013, and the increasing attention to sampling
in textbooks). What it means to generalize in qualitative research, what kinds of conclusions can be drawn
based on the units we have studied, is a topic that is only occasionally touched upon (e.g. Gobo, 2008; Lincoln
and Guba, 1979; Maxwell and Chmiel, 2014; Stake, 1978).
In the following, I will start out with some methodological considerations, focusing first on concepts of
generalizing, next on sampling and criteria and considerations underlying sampling in qualitative research.
This includes the question of sample size and the use of saturation as a criterion for deciding when to
stop sampling. In the next section, selected sampling strategies in qualitative research will be described
in more detail, followed by considerations of how generalization and selection strategies are related in
some selected qualitative research traditions. The chapter closes by describing some recent developments
in qualitative sampling methodology. Throughout the chapter, the terms ‘sampling’, ‘selecting units’, and
‘selecting instances’ will be used interchangeably.
Generalizing in qualitative research: Methodological considerations
Generalizing in Quantitative Social Science
In quantitative research, the concept of generalization is closely linked to that of external validity, specifically
to the concept of population generalization, namely the extent to which we can generalize from the sample
to the population. This type of generalization has also been termed empirical generalization (Lewis et al.,
2014; Maxwell and Chmiel, 2014) or numerical generalization (Flick, 2004). In quantitative research, empirical
generalization is typically realized as statistical or probabilistic generalization: It is possible to generalize
from the sample to the population to the extent that the sample is indeed representative of the population,
and statistics is used to provide the level of confidence or conversely the margin of error that underlies this
estimate of representativeness (Williams, 2002).
Statistical generalization in this sense has become the default understanding of generalization in the social
sciences. But empirical generalization does not equal statistical generalization. Statistics is a tool that in
quantitative research is used to warrant the conclusion from the sample to the population. But there may
be other ways of justifying this conclusion. It is also worth keeping in mind another characteristic of both
statistical and empirical generalization: They are essentially context-free. The conclusion from the sample to
the population applies, regardless of the specific context and the specific circumstances (Williams, 2002).
When quantitative researchers argue that qualitative research does not allow for generalization, this criticism
is typically based on an understanding of generalizability in the sense of statistical generalizability. Indeed,
samples in qualitative research are mostly not representative of a population, and using statistics as a warrant
underlying the conclusion from sample to population is then not an option (although Onwuegbuzie and Leech,
2010, report that 36 per cent of the qualitative studies they examined used statistical generalization). Some
qualitative methodologists have been just as sceptical of achieving generalizability in qualitative research.
This is expressed by the famous dictum of Lincoln and Guba (1979, p. 110): ‘The only generalization is that
there is no generalization.’ For quantitative methodologists the supposed inability of qualitative research to
arrive at empirical generalizations constitutes a criticism of qualitative research. Lincoln and Guba, on the
other hand, as well as Denzin (1983), reject the notion of statistical and empirical generalizability precisely
because they do not take context into account – because they are, one might say, too general:
It is virtually impossible to imagine any kind of human behaviour that is not heavily mediated by
the context in which it occurs. One can easily conclude that generalizations that are intended to be
context-free will have little that is useful to say about human behaviour. (Guba and Lincoln, 1981, p.
62)
Reconceptualizing Generalization in Qualitative Research
Several suggestions have been made for reconceptualizing generalization so as to make it more compatible
with the principles underlying qualitative research. These suggestions fall into three groups or types:
modifying the notion of empirical generalization, transferability as an alternative conceptualization, and
theoretical generalization as an alternative conceptualization (Lewis et al., 2014; Maxwell and Chmiel, 2014;
Polit and Beck, 2010; for a more complex and extensive classification see Gobo, 2008; and Gobo, Chapter 5,
this volume).
Modifying the notion of empirical generalization entails ‘lowering the threshold’ for what can be called a
generalization. This applies, for example, to the notion of so-called moderatum generalizations proposed by
Williams (2002) for interpretive social research. They are based on the assumption of ‘cultural consistency’,
of a constant structural element in the area under research, and they involve an inductive inference from
the particular instance(s) studied to this underlying structure. Moderatum generalizations constitute a ‘weaker
version’ of the type of empirical generalization used elsewhere in the social and the natural sciences.
Another reconceptualization of generalization in qualitative research focuses on the concept of transferability
(Maxwell and Chmiel, 2014; Schofield, 1990). This notion takes the highly contextualized nature of qualitative
research as its starting point, that is, the very characteristic which, from the perspective of the quantitative
social sciences, stands in the way of empirical generalization in qualitative research. With transferability, the
core concern is not to generalize to an abstract and decontextualized population, but to determine whether
the findings obtained for one instance or set of instances in one specific context also apply to other instances
in a different context. The extent to which the findings can be transferred from one case to another depends
on the similarity between the respective contexts. Assessing the similarity of a ‘source’ and a ‘target’ context
in turn requires detailed information about the context in which the study was conducted. Lincoln and Guba
(1979) speak of the degree of fittingness between the two contexts and refer to the need for thick description,
according to Geertz (1973), of the context in which the first study was carried out. It is noteworthy that the
notion of transferability entails, so to speak, a division of tasks between the authors and the readers of a
study. It is the responsibility of the authors to provide a sufficiently ‘thick description’, but only the reader can
assess the degree of fittingness between the context of the study and any other context to which the findings
may or may not apply. The idea of transferability underlies several reconceptualizations of generalization that
have been proposed in the literature, such as the concept of naturalistic generalization developed by Stake
(1978) or the notion of generalization as a working hypothesis suggested by Cronbach (1975).
Both the notions of moderatum generalization and transferability are based on considerations of the
relationship between a sample and a population (or the relationship between a sample and another sample),
that is, what Yin (2014, pp. 57–62) calls a sampling logic. The third alternative conceptualization of
generalization, that of theoretical generalization (Schwandt, 2001, pp. 2–3; also called analytic
generalization), moves away from the idea of population and sample and is based on what Yin terms a
replication logic (2014, pp. 57–62). With theoretical generalization, the purpose of the research is not to
generalize to a population or to other instances, but to build a theory or to identify a causal mechanism.
Instances are selected either so as to be similar to each other (literal replication) or so as to differ from each
other in one key aspect (theoretical replication). Cases then build upon each other in the same way that studies do,
leading to a differentiation of theory. This notion of theoretical generalization is – in somewhat different versions – used in
case study research, in grounded theory methodology, and in analytic induction.
External and Internal Generalization
As mentioned in the previous section, the concept of generalizability has, in the quantitative research tradition,
been discussed as one aspect of external validity. This suggests a focus on the relationship between the
sample and the population, on how representative the instances included in the sample are of the population.
But in qualitative research, where typically few instances are examined in detail, another relationship gains
in importance, namely the relationship between our observations and the case in its entirety: how well is the
variability within a given instance represented in our observations? Hammersley and Atkinson (1995) point to
the importance of adequately representing contexts as well as points in time or a time period. They refer to
this as within-case sampling. Along similar lines, Onwuegbuzie and Leech (2005) refer to the ‘truth space’ of
an interviewee and whether the data obtained in any given interview adequately represent that truth space
(see also the concept of internal validity in Maxwell and Chmiel, 2014). The considerations underlying internal
and external generalization are similar in structural terms. Generalizing within an instance is subject to the
same restrictions and considerations as generalizing beyond that instance: in both situations, we have to
ask ourselves what kind of generalization is appropriate – moderatum empirical, transferability, or theoretical
generalization – and select our sampling strategy accordingly.
Sampling in qualitative research: Methodological considerations
Sampling Strategies: An Overview
In the methodological literature, three types of sampling strategies are distinguished: random, convenience,
and purposive sampling. In the following, key characteristics and underlying concepts of the three groups of
sampling strategies are briefly described.
Random sampling
Random sampling is typically used in quantitative research, especially in survey-type research, in order to
support empirical generalization, that is, generalizing from a sample to a population (on random sampling,
see Daniel, 2012, chapter 3). This is possible to the extent that the sample is indeed representative of the
population. The importance of random sampling in quantitative research derives from its role in generating
such a representative sample. Based on a sampling frame, that is, a list of all members of the population,
the sample is chosen such that every member of the population has an equal chance (above zero) of being
included in the sample, and the members of the sample are selected using a truly random procedure (e.g. a
random number generator). If these steps are followed and if the population and the sample are sufficiently
large, the procedure of random sampling results in a sample that is (sufficiently) representative. The margin
of error in generalizing from the sample to the population can be specified using confidence intervals and
inferential statistics. Various subtypes of random sampling have been developed alongside the process of simple
random sampling described above, such as systematic, cluster, or stratified random sampling.
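The procedure described above can be illustrated with a minimal sketch. It assumes a hypothetical sampling frame of 10,000 population members, 30 per cent of whom share some attribute, and it uses the common normal approximation for the confidence interval of a proportion; the function names and all numbers are illustrative assumptions, not part of any study cited here.

```python
import math
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement.

    Every unit in the frame has the same chance of being included.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)


def margin_of_error(p_hat, n, z=1.96):
    """Half-width of the normal-approximation confidence interval for a proportion.

    z = 1.96 corresponds to a 95 per cent level of confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical sampling frame: 1 = attribute present, 0 = attribute absent.
frame = [1] * 3000 + [0] * 7000

sample = simple_random_sample(frame, n=400, seed=42)
p_hat = sum(sample) / len(sample)
moe = margin_of_error(p_hat, len(sample))
print(f"estimate = {p_hat:.3f}, 95% CI = [{p_hat - moe:.3f}, {p_hat + moe:.3f}]")
```

The sketch also shows why sample size matters for this type of generalization: the margin of error shrinks with the square root of n, which is out of reach for the small samples typical of qualitative research.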
But a random sample is not necessarily a representative sample (Gobo, 2004). In the first place, random
sampling will result in a representative sample only if the above conditions are met. Also, representativeness
of a sample with respect to a population constitutes a goal, whereas random sampling is a procedure, a
means towards that goal. And random sampling is not the only way of obtaining a representative sample.
Other strategies include, for instance, selecting typical cases or even – depending on the population –
selecting any case at all (see the section on ‘phenomenology’ below).
Purposive sampling
The term purposive sampling (also called purposeful sampling) refers to a group of sampling strategies
typically used in qualitative research. The key idea underlying purposive sampling is to select instances that
are information rich with a view to answering the research question (for an overview see Emmel, 2013; Flick,
2014, chapter 13; Mason, 2002, chapter 7; Patton, 2015, module 30; Ritchie et al., 2014). A large variety
of purposeful sampling strategies has been described in the literature, including, for example, homogeneous
sampling, heterogeneous sampling, maximum variation sampling, theoretical sampling, sampling according
to a qualitative sampling guide, snowball sampling, sampling typical, extreme, intense, critical, or outlier
cases, and many others (Patton, 2015, module 30; Teddlie and Yu, 2007).
The precise meaning of ‘information rich’, however, and therefore the selection of a specific strategy, depends
on the research question and on the goal of the study (Marshall, 1996; Palinkas et al., 2015). Describing a
phenomenon in all its variations, for example, requires a type of maximum variation sampling (Higginbottom,
2004; Merkens, 2004). If the goal is to generate a theory, theoretical sampling in the tradition of grounded
theory methodology will often be the strategy of choice (see the section on ‘theoretical sampling’ below). If a
theory is to be tested, selecting an atypical or critical case would be a useful strategy (Mitchell, 1983). And
transferability requires a detailed description of specific types of cases, for example, a typical or common
case or an intense case (Yin, 2013, pp. 51–5).
Several criteria have been used to distinguish between different types of purposive sampling strategies and
different ways of conducting purposive sampling. A first criterion concerns the point in time when a decision
concerning the sample composition is made (Flick, 2014, pp. 168–9; Merkens, 2004): this can either be
specified in advance, in analogy to the sampling procedure in quantitative research (examples would be
stratified purposive sampling or selecting specific types of cases). Or else the composition of the sample can
emerge over the course of the study, as in sequential sampling, snowball sampling, theoretical sampling,
or analytic induction. This latter emergent procedure is generally considered to be more appropriate to
the iterative, emergent nature of qualitative research (Palinkas et al., 2015). In actual research, advance
decisions about sample composition are often combined with modifications as they emerge during the
research process.
A second criterion relates to the relationship between the units in the sample (Boehnke et al., 2010; Palinkas
et al., 2015; Robinson, 2014), distinguishing between homogeneous samples (as in criterion sampling, or
in selecting typical cases) and heterogeneous samples (e.g. maximum variation sampling, or theoretical
sampling). A third criterion for distinguishing between purposeful sampling strategies relates to the underlying
goal (Onwuegbuzie and Leech, 2007; Patton, 2015, chapter 5), such as selecting specific types of cases
(typical, intense, extreme, etc.), selecting with a view to representativeness or to contrast.
Convenience sampling
Convenience sampling (also called ad hoc sampling, opportunistic sampling) constitutes the third type of
sampling strategy. Here cases are selected based on availability. Asking one's fellow students, for example,
to participate in a study for this semester's research project, would constitute a case of convenience sampling.
This sampling strategy has a ‘bad reputation’ with both quantitative and qualitative researchers: from the
perspective of quantitative research, it fails to produce a representative sample (Daniel, 2012, chapter 3);
from the perspective of qualitative research, it has been criticized for insufficiently taking the goal of the
study and the criterion of information richness into account. Depending upon the goal of the research and
the population under study, ‘any case’ can, however, be perfectly suitable (Gobo, 2008; see the section on
‘phenomenology’ below).
Is My Sample Large Enough?
The role of sample size in qualitative research
The role of sample size in qualitative research is highly contested. Some authors argue
that, unlike in quantitative research, where sample size in relation to the population is crucial for statistical
generalization, sample size is irrelevant in qualitative research or at best of secondary concern. According
to this position, selecting information-rich instances that are relevant to the research question and sample
composition are considered more important than sample size (e.g. Crouch and McKenzie, 2006; Patton,
2015, chapter 5). Others argue that sample size plays a role in qualitative research as well (e.g. Onwuegbuzie
and Leech, 2005; Sandelowski, 1995).
In purely practical terms, researchers are often required to specify an approximate sample size, for example,
when submitting grant applications or presenting a PhD proposal. Not surprisingly, methodologists also
differ when it comes to making recommendations for sample size in such cases. Some authors do make
recommendations (for overviews, see Guest et al., 2006; Guetterman, 2015; Mason, 2010). Others argue
that deciding on sample size before engaging in data collection contradicts the emergent nature of qualitative
research, and call for a sampling process that is constantly adjusted as the research unfolds (e.g. Mason,
2010; Palinkas et al., 2015; Robinson, 2014; Trotter, 2012). A middle way between these two extremes is the
suggestion to work with an advance specification of minimal sample size, which is then adjusted during the
research process (Francis et al., 2010; Patton, 2015, module 40). Positions concerning recommendations for
sample size in qualitative research thus range from specific numbers to ‘it depends’.
Key factors to take into consideration when deciding on a suitable sample size include the extent of variation
in the phenomenon under study (Bryman, 2016; Charmaz, 2014; Francis et al., 2010; Palinkas et al., 2015;
Robinson, 2014), the research goal (Marshall, 1996; Patton, 2015, module 40), the scope of the theory or
and the scope of the theory or conclusions (Charmaz, 2014; Morse, 2000). The overall recommendation is that sample size should increase
with the heterogeneity of the phenomenon and the breadth and generality of the conclusions aimed for.
Depending on the research goal, however, a single instance may be perfectly sufficient (e.g. Patton, 2015,
module 40; Yin, 2014, pp. 51–6). Some authors also draw attention to external constraints, such as the time
and the resources available for the study or the requirements by external agencies such as review boards
(Flick, 2014; Patton, 2015, module 40). Another factor concerns the research tradition in which the study is
carried out (Guest et al., 2006; see the section on ‘sampling in different traditions’ below).
The advance specification of sample size runs the danger of oversampling, that is, including more instances
than necessary (Francis et al., 2010). In his analysis of sample size of the 51 most frequently cited
qualitative studies in five research traditions, Guetterman (2015) found a surprisingly high average sample
size of 87 participants. Mason (2010), in examining sample size in qualitative dissertations, found sample sizes
of 20–30 participants to be most frequent, with a surprising number of sample sizes constituting multiples
of 10. Guetterman concludes from this that such round numbers are most likely the result of an advance
specification of sample size – which may well be higher than needed. Oversampling carries the
methodological danger of allowing for only an insufficient analysis of each individual case (Guetterman,
2015), and it carries the ethical danger of unnecessarily drawing upon the resources of participants (Francis
et al., 2010). Sampling in qualitative research should therefore include as many cases as are needed with a
view to the research question, but it should not model itself on quantitative standards of ‘the more, the better’
or a preference for round numbers.
The criterion of saturation
But how exactly do we know that we have included as many cases as needed? The criterion that is most
often used in qualitative research to conclude the sampling process is the criterion of saturation. Saturation
was initially developed in the context of grounded theory methodology, and it specifies that sampling should stop when
including more cases does not contribute any new information about the concepts that have been developed
and about their dimensions (Schwandt, 2001, p. 111). Today, however, the concept of saturation is often used
in the more general sense of thematic saturation (Bowen, 2008; Guest et al., 2006).
But it often remains unclear what the exact criteria are for determining when saturation has been reached
(Bowen, 2008; Francis et al., 2010; O'Reilly and Parker, 2012). Guest et al. (2006) examined their own
interview analysis for evidence of saturation. They concluded that saturation was reached after 12 interviews,
with key themes emerging from the analysis of only 6 interviews. Francis et al. (2010) arrive at similar
conclusions. They suggest specifying an initial sample size of n = 10 at which to examine the degree of
saturation reached. This initial sample size, they argue, should be combined with a set number of additional
cases at which the degree of saturation is to be re-examined; for their own research, they set this number
at n = 3. Both Guest et al. (2006) and Francis et al. (2010) emphasize, however, that their recommendation
targets interview studies on comparatively homogeneous phenomena.
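The stopping rule suggested by Francis et al. (2010) can be read as a simple procedure, sketched below. The sketch rests on one possible interpretation, namely that sampling stops once a set number of consecutive interviews after the initial sample yields no new theme. The function name, the representation of each interview as a set of theme labels, and the example data are all hypothetical.

```python
def saturation_point(interview_themes, initial=10, stopping=3):
    """Return the number of interviews after which sampling could stop.

    interview_themes: one collection of theme labels per interview, in the
    order in which the interviews were analysed.
    initial: size of the initial analysis sample (n = 10 in the text).
    stopping: number of consecutive further interviews without any new
    theme that is taken to indicate saturation (n = 3 in the text).
    Returns None if the criterion is never met.
    """
    seen = set()
    run = 0  # consecutive interviews, after the initial sample, adding nothing new
    for count, themes in enumerate(interview_themes, start=1):
        new = set(themes) - seen
        seen |= new
        if count <= initial:
            continue  # the initial sample is always analysed in full
        run = 0 if new else run + 1
        if run >= stopping:
            return count
    return None


# Hypothetical coding results: interview 11 still adds a theme ("d"),
# interviews 12 to 14 add nothing new.
interviews = [{"a"}, {"b"}, {"a", "c"}] + [{"a"}] * 7 + [{"d"}, {"a"}, {"b"}, {"c"}]
print(saturation_point(interviews))  # 14
```

Under this reading, the earliest possible stopping point is interview 13, and every new theme resets the count. Whether themes count as ‘new’ remains a matter of judgement that no procedure of this kind can settle.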
Saturation, despite its prevalence, has been criticized on methodological grounds. Dey (1999) argues that
saturation is always a matter of degree. Also, saturation is not always the most appropriate criterion for
deciding when to stop sampling (O'Reilly and Parker, 2012).
Selected purposive sampling strategies
In this section, some selected purposive sampling strategies that are frequently used in qualitative research
are presented in more detail: theoretical sampling, stratified purposive sampling, criterion sampling, and
selecting specific cases. It should be noted that the different strategies are not mutually exclusive and that
several strategies can be combined in one study.
Theoretical Sampling
Theoretical sampling was developed in the context of grounded theory methodology and is very much a part
of the overall iterative grounded theory methodology in combination with a process of constant comparison
(Glaser, 1978; Strauss, 1987; for an overview see Draucker et al., 2007). Theoretical sampling takes place in
constant interrelation with data collection and data analysis, and it is guided by the concepts and the theory
emerging in the research process. More instances and more data are added so as to develop the emerging
categories and their dimensions, and relate them to each other:
Theoretical sampling means that the sampling of additional incidents, events, activities, populations,
and so on is directed by the evolving theoretical constructs. Comparisons between the explanatory
adequacy of the theoretical constructs and the empirical indicators go on continuously until
theoretical saturation is reached (i.e. additional analysis no longer contributes to anything new about
this concept). (Schwandt, 2001, p. 111)
In terms of sample composition, theoretical sampling yields a heterogeneous sample that allows for
comparing different instantiations of a concept. The sampling process is emergent and flexible, and the goal
of the sampling strategy, as the name says, is to develop a theory that is grounded in the data. While the
strategy is well established in the methodological literature and is the default strategy in grounded theory
studies, it is nevertheless difficult to find detailed descriptions of how sampling choices are revised and
modified in response to emergent concepts (Draucker et al., 2007). The above case study by Wuest (2001)
on the experiences of women providing care to others constitutes an exception. While Wuest acknowledges
the difficulty of documenting what is essentially a process of following up on various conceptual dimensions
simultaneously, she describes the theoretical sampling process and the choices she made at various stages
of the research process in exemplary and enlightening detail.
Stratified Purposive Sampling
Like theoretical sampling, stratified purposive sampling results in a heterogeneous sample that represents
different manifestations of the phenomenon under study (Ritchie et al., 2014; termed ‘quota sampling’ in
Patton, 2015, p. 268). In contrast to theoretical sampling, however, stratified purposive sampling entails a top-down
approach, that is, decisions about the composition of the sample are made before data collection. In a
first step, the researcher has to decide which factors are known or likely to cause variation in the phenomenon
of interest. In a second step, two to a maximum of four such factors are selected for constructing a sampling
guide. Step three involves combining the factors of choice in a cross-table. At this point the researcher has
to decide whether all possible combinations of all factors are to be realized or, if not, which factor
combinations will be included. With more than two factors, it is usually not possible to conduct sampling
for all possible factor combinations. The resulting sampling guide is displayed in a table, with each factor
combination corresponding to a cell. In a final step, the researcher will decide how many units to sample for
each cell or factor combination. Depending on how many cells there are in the sampling guide (‘sample matrix’
according to Ritchie et al., 2014), one or two units will typically be included. Stratified purposive sampling is
useful for exploring the various manifestations of a phenomenon for similarities and differences.
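The steps for constructing a sampling guide can be sketched as follows. The two factors, their levels, and the number of units per cell are hypothetical choices made for illustration only; in an actual study they would follow from the research question.

```python
from itertools import product

# Steps one and two: factors assumed to cause variation (hypothetical example).
factors = {
    "age_group": ["younger", "middle", "older"],
    "training_qualification": ["yes", "no"],
}


def build_sampling_guide(factors, units_per_cell):
    """Cross all factor levels and assign a target number of units to each cell."""
    levels = [factors[name] for name in factors]
    return {cell: units_per_cell for cell in product(*levels)}


# Steps three and four: cross-table of all combinations, two units per cell.
guide = build_sampling_guide(factors, units_per_cell=2)

for cell, target in guide.items():
    print(cell, "->", target, "units")
print("cells:", len(guide), "| total units:", sum(guide.values()))
```

With three and two levels, the guide has 6 cells and 12 units in total. The number of cells is the product of the numbers of levels, so four factors with three levels each would already yield 81 cells, which illustrates why not all combinations can usually be realized with more than two factors.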
As the term ‘sampling guide’ implies, a sampling guide is not cast in stone and can be modified as the
selection process unfolds (Morse, 2000). In the above study, for example, it proved difficult to find participants
without any training qualification, especially in the oldest age group, and participants from other age groups
or with training qualification were substituted. A sampling guide can and should also be modified if it emerges
during the study that factors other than the ones informing the sampling guide affect the phenomenon under
study. In this case a combination of concept-driven and data-driven sampling is realized (for an example, see
Johnson, 1991).
Another, more flexible variant of stratified purposive sampling is maximum variation sampling (Patton, 2015,
p. 267). As in stratified purposive sampling, the researcher starts out by identifying factors that lead to
variation in the phenomenon under study. But instead of systematically combining these factors, they serve
as a broad framework orienting the sampling process, with a view to including as much variation in the
sample as possible. In their interview study on how older persons experience living at home, for example, De
Jonge et al. (2011) included participants who differed with respect to age, gender, living conditions, type of
dwelling, tenure, and location in the city or in the countryside, without specifying which combinations were to
be included. They used a matrix, however, to record the specific combination of characteristics represented
by the participants in their sample, to ensure and document sufficient variation of their cases.
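How a matrix of characteristics can support the search for variation may be sketched as follows. The greedy rule used here (always add the candidate who contributes the most characteristics not yet represented) is an assumption chosen for illustration; it is not the procedure described by Patton (2015) or used by De Jonge et al. (2011), and the candidate pool and attribute values are invented.

```python
def greedy_max_variation(candidates, attributes, k):
    """Pick up to k candidates, each time taking the one that adds the most
    attribute values not yet represented in the sample."""
    covered = set()          # (attribute, value) pairs already in the sample
    sample = []
    pool = list(candidates)
    while pool and len(sample) < k:
        best = max(pool, key=lambda c: len({(a, c[a]) for a in attributes} - covered))
        sample.append(best)
        covered |= {(a, best[a]) for a in attributes}
        pool.remove(best)
    return sample, covered

# Hypothetical pool of potential participants.
candidates = [
    {"id": "P1", "age": "65-74", "gender": "f", "dwelling": "house", "location": "city"},
    {"id": "P2", "age": "65-74", "gender": "f", "dwelling": "house", "location": "city"},
    {"id": "P3", "age": "75-84", "gender": "m", "dwelling": "flat", "location": "countryside"},
    {"id": "P4", "age": "85+", "gender": "f", "dwelling": "flat", "location": "city"},
    {"id": "P5", "age": "75-84", "gender": "m", "dwelling": "house", "location": "countryside"},
]
attributes = ["age", "gender", "dwelling", "location"]

sample, covered = greedy_max_variation(candidates, attributes, k=3)
for person in sample:
    # One row of the matrix documenting the variation in the sample.
    print(person["id"], [person[a] for a in attributes])
```

The printed rows correspond to the kind of matrix used to document variation: each row records the combination of characteristics one participant contributes to the sample.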
Criterion Sampling
In criterion sampling, the objective is to include instances in the sample that match a predefined profile
(Coyne, 1997; Patton, 2015, p. 281). Usually this involves a combination of characteristics, which, together,
specify a phenomenon under study or a restricted population in which this phenomenon is likely to occur. The
resulting sample is homogeneous with respect to the selected criteria (but may be heterogeneous in other
respects), and decisions about these criteria are made in advance. This sampling strategy is especially useful
for exploring a phenomenon in depth.
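In procedural terms, criterion sampling amounts to checking each potential participant against a profile fixed in advance. The following minimal sketch uses the inclusion criteria reported for the Dam and Eyles (2012) case study in this chapter; the class, the field names, and the candidate records are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    refugee_history: str
    city: str
    years_in_city: int

def meets_criteria(c, min_years=15):
    """Predefined profile: former boat people living in Hamilton for at least min_years."""
    return (c.refugee_history == "boat people"
            and c.city == "Hamilton"
            and c.years_in_city >= min_years)

# Invented records, for illustration only.
pool = [
    Candidate("A", "boat people", "Hamilton", 22),
    Candidate("B", "boat people", "Hamilton", 9),
    Candidate("C", "boat people", "Toronto", 30),
    Candidate("D", "sponsored immigrant", "Hamilton", 18),
    Candidate("E", "boat people", "Hamilton", 15),
]

sample = [c for c in pool if meets_criteria(c)]
print([c.name for c in sample])
```

The resulting sample is homogeneous on the three criteria, while any characteristic not included in the profile is left free to vary.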
Sampling in different research traditions
One of the factors influencing the type of generalization aimed for and, consequently, the optimal sample
size is the methodological tradition in which a study is carried out (Bryman in Baker and Edwards, 2012;
Higginbottom, 2004; Robinson, 2014). Empirical analyses of sample sizes in different qualitative research
traditions show that the number of cases differs quite substantially between approaches (Guetterman,
2015; Mason, 2010). In the following, a few selected approaches will be discussed with a view to sampling
and generalization issues: interview studies, the case study, and phenomenology (for grounded theory
methodology see the sections on ‘theoretical sampling’ and ‘theoretical saturation’).
Interview Studies
Many recommendations concerning sample size in qualitative research relate to the use of interview data (see
Roulston and Choi, Chapter 15, this volume) in particular (e.g. Crouch and McKenzie, 2006; Mason, 2010).
Recommendations range from 10 to 13 units (Francis et al., 2010) up to between 60 and 150 (Gerson and
Horowitz, 2002, p. 223). This large variety of recommendations is not surprising, considering that as a method
for data collection, the interview can be used within a great variety of different research traditions and with a
view towards different kinds of generalization. Instead of examining optimal sample size in interview studies,
it seems more promising to look at sample sizes in different approaches where interviews, observation, and
other methods for collecting qualitative data are used (cf. the analysis of interview-based dissertations from
different traditions in Mason, 2010).
The Case Study
In the case study, different methods of data collection are combined to allow for an in-depth analysis of one
or several cases and their subunits (Yin, 2014, chapter 1). Descriptive case studies are especially suitable for
yielding ‘thick descriptions’ and therefore lend themselves well to generalization in the sense of transferability.
Explanatory case studies are more suitable for analytic generalization, being used for generating and for
building theory (Mitchell, 1983).
In terms of sampling, case studies require complex decisions on multiple levels. In a first step, the case
or cases have to be selected. In single case studies, this often involves selecting a case with a view to
a population, e.g. a typical case or an intense case (Yin, 2014, pp. 51–6). Single case studies allow for
moderatum generalizations based on structural and cultural consistency (Williams, 2002). In multiple case
studies, the key concern in sampling is the underlying logic of replication, that is, the question of how the
cases relate to each other (Yin, 2014, pp. 56–63). Based on a literal replication logic, cases are selected
so as to be similar to each other. If the study follows a theoretical replication logic, cases are selected so
as to contrast with each other on relevant dimensions. In a second step, within-case sampling is necessary
to ensure internal generalizability (Maxwell and Chmiel, 2014). In principle, any purposive sampling strategy
can be used, but maximum variation sampling seems especially useful for representing different aspects of a
case, such as persons, points in time, or contexts (Higginbottom, 2004).
In his analysis of sample size for case studies in the fields of health and education, Guetterman (2015) found
that the number of cases ranged from 1 to 8, and the number of participants or observations ranged from 1
to 700. This suggests that researchers conducting case studies tend to limit themselves to a few cases only,
thus allowing for a detailed analysis of each case. The wide range of number of participants and observations,
however, indicates that the ‘thickness’ of the resulting descriptions varies considerably.
Phenomenology
Phenomenology has the aim of identifying the ‘essence’ of the human experience of a phenomenon (overview
in Lewis and Staehler, 2010). By aiming to describe the common characteristics of that experience,
phenomenology by definition also aims for (empirical) generalization. Because the respective experience is
assumed to be universal, the experience of any human being qualified to have that experience is considered
a case in point. Consequently, no special sampling strategy is required, that is, convenience sampling would
be sufficient: Any individual who meets the conditions for having the experience under study would be
a suitable participant, and because of the relative homogeneity of the phenomenon, comparatively small
samples would be acceptable. This is, indeed, reflected in smaller sample sizes found in Guetterman's (2015)
analysis, ranging from 8 to 52 participants. Similar considerations concerning the assumed universal nature
of a phenomenon or an underlying structure are found in studies on the organization of everyday talk (e.g.
Schegloff and Sacks, 1974; see Jackson, Chapter 18, this volume) or in objective hermeneutics (overview in
Wernet, 2014).
Conclusion
Recently there has been increasing attention to what qualitative researchers actually do during the sampling
process. The relevant studies have already been cited in the previous sections: Onwuegbuzie and Leech (2010)
examined studies published in Qualitative Report for whether the authors generalized their findings and what
type of generalization they drew upon. Guetterman (2015) and Mason (2010) conducted research on sample
sizes in studies from different qualitative traditions, and Guest et al. (2006) examined their own data for
evidence of saturation. More empirical studies of this kind are needed to better understand what qualitative
researchers actually do in terms of sampling and generalization.
Another development is related to the increasing use of mixed methods designs in the social sciences (see
Hesse-Biber, Chapter 35, this volume; Morse et al., Chapter 36, this volume). Discussions of sampling in
mixed methods designs include descriptions of purposive sampling strategies, thereby contributing to making
them more visible in the social science methods literature (e.g. Teddlie and Yu, 2007). Some strategies that
are considered purposive in a qualitative research context, such as stratified purposive sampling or random
purposive sampling, have in fact been classified as ‘mixed’ strategies within this mixed methods context.
The mixed methods research tradition is relevant to developments in purposive sampling in yet another
respect. Many purposive sampling strategies, such as stratified purposive sampling or selecting certain types
of cases, require prior knowledge about the phenomenon in question and its distribution. This is where mixing
methods can be highly useful in informing the purposive sampling process. Within a sequential design, for
example, the findings of a first quantitative phase can be used in order to purposefully select instances for a
second qualitative phase of the research (for sequential designs see Creswell and Plano Clark, 2011, chapter
3).
While the topic of purposive sampling has been receiving increasing attention, this is not the case for the topic
of generalizing in qualitative research, and even less so for the relationship between types of generalization
and sampling strategies. This will be an important focus of future qualitative research methodology in this
area.
Case Study: ‘Precarious Ordering’
In her study of the experiences of women providing care, Wuest (2001) develops a middle-range theory
with a focus on what she calls precarious ordering, based on a total of 65
interviews with women facing a variety of demands for care (caring for children with otitis
media with effusion, caring for family members with Alzheimer's disease, and leaving abusive relationships). Precarious
ordering involves a two-stage iterative process of negotiating demands for care and own
resources, moving from daily struggles to re-patterning care.
She starts out her process of theoretical sampling by talking to childrearing middle-class
women, both employed and unemployed. Her initial data analysis points her to
the importance of increasing demands and types of demands and of the dissonance
between demands and resources. To further explore the role of varying demands, she
continues by interviewing women who face higher than average demands (mothers of
physically and mentally disabled children) and women who face fewer demands (women
with adult children and children away from home). Her interviews with mothers of disabled
children lead her to the question of strengths and strategies developed by the women
during the coping process and the role of the relationship with the partner. To explore
these two factors, she talks to other women with heavy, but different demands for care
(sick relatives) and women with diverse partners (e.g. a lesbian partner, a partner from a
different culture). The role of resources and the distinction between helpful and unhelpful
resources are further highlighted through interviews with women in economically difficult
situations. Wuest thus moves through cycles of comparing and contrasting different types
of caring demands, different kinds of settings, support systems, and resources, especially
in terms of the relationship with a partner.
Case Study: ‘Stakeholder Opinions on Priority Setting in Health Care’
In the following study we used stratified purposive sampling to explore the range of
opinions from different stakeholder groups and their reasons surrounding the setting of
priorities in the German health care system (Schreier et al., 2008; Winkelhage et al.,
2013). In a first step, we selected six stakeholder groups representing a variety of different
positions and roles. We assumed that, because of these different positions, they would
be likely to differ in their interests and opinions concerning priority setting in health
care: healthy members of the general population, patients, physicians, nursing personnel,
representatives of the public health insurance system, and politicians. In a second step,
a literature search was carried out for each stakeholder group separately in order to
identify factors likely to affect attitudes towards health care. Taking patients as an example,
relevant factors included age (18–30, 31–62, above 62), the severity of a patient's disease
(light versus severe; as judged by a physician), level of education (no training qualification,
training qualification, university degree), and area of origin (former Federal Republic of
Germany versus former German Democratic Republic). With four relevant factors, not
all factor combinations could be realized, and some cells remain empty. The factors
were combined into the following sampling guide, resulting in a sample of 12 participants
(see Table 6.1). The study showed, for example, the different interests and attitudes of
physicians compared to patients. Patients were more likely to place hope even in less
effective treatments, and they emphasized the importance of obtaining the consent of the
family when deciding about life-prolonging measures (Winkelhage et al., 2013).
Table 6.1 Sampling guide for stratified purposive sampling of patients
Case Study: ‘Meanings of House, Home, and Family among Vietnamese Refugees in Canada’
Huyen Dam and John Eyles (2012) used criterion sampling in their study exploring the
meanings of house, home, and family among former Vietnamese refugees in the Canadian
city of Hamilton. To be included in the study, participants had to be former Boat people who
had lived in Hamilton for at least 15 years. Bounding the phenomenon in terms of origin,
refugee history, and in terms of place ensured that the experiences of the participants were
sufficiently comparable. Requiring the participants to have lived in Hamilton for a period
of 15 years allowed the researchers to capture the experience of settling and how this
changed over time. The study shows, for example, the importance of culture and family for
the participants, and how these core values allow them to re-establish a sense of home
after having been uprooted and relocated.
Notes
1. I thank Uwe Flick, Giampietro Gobo, and an anonymous reviewer for their helpful comments.
2. The term ‘qualitative research’ encompasses so many different research traditions and approaches that it
constitutes a gross oversimplification to lump them together under this one label. It would be more appropriate
to examine each tradition separately. As this is beyond the scope of this chapter, I will continue to use the term
throughout, but ask the reader to keep in mind the diversity of approaches. I will also look at a few approaches
in more detail in section 5 below.
Further Reading
Gobo, Giampietro (2004) ‘Sampling, representativeness, and generalizability', in Clive Seale, Giampietro
Gobo, Jaber F. Gubrium, and David Silverman (eds), Qualitative Research Practice. London: Sage, pp.
435–56.
Maxwell, Joseph A., and Chmiel, Margaret (2014) ‘Generalization in and from qualitative analysis', in Uwe
Flick (ed.), SAGE Handbook of Qualitative Data Analysis. London: Sage, pp. 540–53.
Patton, Michael Q. (2015) ‘Designing qualitative studies', in Michael Q. Patton (ed.), Qualitative Evaluation
and Research Methods (4th edn). Newbury Park: Sage, pp. 243–326.
References
6, Perri, and Bellamy, Christine (2012) Principles of Methodology. Research Design in Social Science.
London: Sage.
Baker, Sarah E., and Edwards, Rosalind (2012) ‘How many qualitative interviews is enough? Expert voices
and early career reflections on sampling and cases in qualitative research', National Centre for Research
Methods Review Paper, available at http://eprints.ncrm.ac.uk/2273/4/how_many_interviews.pdf.
Boehnke, Klaus, Lietz, Petra, Schreier, Margrit, and Wilhelm, Adalbert (2010) ‘Sampling: The selection
of cases for culturally comparative psychological research', in Fons van de Vijver and David Matsumoto
(eds), Methods of Cross-cultural Research. Cambridge: Cambridge University Press, pp. 101–29.
Bowen, G. A. (2008) ‘Naturalistic enquiry and the saturation concept: A research note', Qualitative Research,
8(1): 137–52.
Bryman, Alan (2016) Social Research Methods (5th edn). Oxford: Oxford University Press.
Charmaz, Kathy (2014) Constructing Grounded Theory: A Practical Guide Through Qualitative Analysis (2nd
edn). London: Sage.
Coyne, I. T. (1997) ‘Sampling in qualitative research. Purposeful and theoretical sampling: merging or clear
boundaries?' Journal of Advanced Nursing, 26(3): 623–30.
Creswell, John W., and Plano Clark, Vicky L. (2011) Designing and Conducting Mixed Methods Research
(2nd edn). Thousand Oaks, CA: Sage.
Cronbach, L. J. (1975) ‘Beyond the two disciplines of scientific psychology', American Psychologist, 30(2):
116–27.
Crouch, M., and McKenzie, H. (2006) ‘The logic of small samples in interview-based qualitative research',
Social Science Information, 45(4): 483–99.
Dam, H., and Eyles, J. (2012) ‘“Home tonight? What? Where?” An exploratory study of the meanings
of house, home and family among the former Vietnamese refugees in a Canadian city [49 paragraphs]',
Forum Qualitative Sozialforschung/Forum: Qualitative Social Research, 13(2): Art. 19, available at http://nbn-resolving.de/urn:nbn:de:0114-fqs1202193.
Daniel, Johnny (2012) Sampling Essentials. London: Sage.
De Jonge, D., Jones, A., Philipps, R., and Chung, M. (2011) ‘Understanding the essence of home: Older
people's experience of home in Australia', Journal of Occupational Therapy, 18(1): 39–47.
Denzin, Norman K. (1983) ‘Interpretive interactionism', in Gareth Morgan (ed.), Beyond Method. Strategies
for Social Research. Beverly Hills, CA: Sage, pp. 129–48.
Dey, Ian (1999) Grounding Grounded Theory: Guidelines for Qualitative Inquiry. Bingley: Emerald Group.
Draucker, C. B., Martsolf, D. S., Ross, R., and Rusk, T. B. (2007) ‘Theoretical sampling and category
development in grounded theory', Qualitative Health Research, 17(8): 1137–48.
Emmel, Nick (2013) Sampling and Choosing Cases in Qualitative Research. A Realist Approach. London:
Sage.
Flick, Uwe (2004) ‘Design and process in qualitative research', in Uwe Flick, Ernst von Kardorff and Ines
Steinke (eds), A Companion to Qualitative Research. London: Sage, pp. 146–53.
Flick, Uwe (2014) Introduction to Qualitative Research (5th edn). London: Sage.
Francis, J. J., Johnston, M., Robertson, C., Glidewell, L., Entwistle, V., Eccles, M. P., and Grimshaw,
J. M. (2010) ‘What is an adequate sample size? Operationalising data saturation for theory-based interview
studies', Psychology and Health, 25(10): 1229–45.
Geertz, Clifford (1973) ‘Thick description: Toward an interpretive theory of culture', in Clifford Geertz (ed.),
The Interpretation of Cultures. New York: Basic Books, pp. 3–30.
Gerson, Kathleen, and Horowitz, Ruth (2002) ‘Observation and interviewing: Options and choices', in Tim
May (ed.), Qualitative Research in Action. London: Sage, pp. 199–224.
Glaser, Barney (1978) Theoretical Sensitivity. Mill Valley, CA: Sociology Press.
Gobo, Giampietro (2004) ‘Sampling, representativeness, and generalizability', in Clive Seale, Giampietro
Gobo, Jaber F. Gubrium and David Silverman (eds), Qualitative Research Practice. London: Sage, pp.
435–56.
Gobo, Giampietro (2008) ‘Re-conceptualizing generalization. Old issues in a new frame', in Pertti
Alasuutari, Leonard Bickman and Julia Brannen (eds), The SAGE Handbook of Social Research Methods.
London: Sage, pp. 193–213.
Guba, Egon G., and Lincoln, Y.S. (1981) Effective Evaluation. San Francisco, CA: Jossey-Bass.
Guest, G., Bunce, A., and Johnson, L. (2006) ‘How many interviews are enough? An experiment with data
saturation and variability', Field Methods, 18(1): 59–82.
Guetterman, Timothy C. (2015) ‘Descriptions of sampling practices within five approaches to qualitative
research in education and the health sciences [48 paragraphs]', Forum Qualitative Sozialforschung/Forum:
Qualitative Social Research, 16(2): Art. 25, available at http://nbn-resolving.de/urn:nbn:de:0114-fqs1502256.
Hammersley, Martyn, and Atkinson, Paul (1995) Ethnography: Principles in Practice (2nd edn). Milton
Park: Routledge.
Higginbottom, G. (2004) ‘Sampling issues in qualitative research', Nurse Researcher, 12(1): 7–19.
Johnson, Jeffrey C. (1991) Selecting Ethnographic Informants. London: Sage.
Lewis, Jane, Ritchie, Jane, Ormston, Rachel, and Morrell, Gareth (2014) ‘Generalising from qualitative
research', in Jane Ritchie, Jane Lewis, Carol M. Nicholls and Rachel Ormston (eds), Qualitative Research
Practice. A Guide for Social Science Students and Researchers. London: Sage, pp. 347–63.
Lewis, Michael, and Staehler, Tanja (2010) Phenomenology: An Introduction. New York: Continuum.
Lincoln, Yvonna S., and Guba, Egon G. (1985) Naturalistic Inquiry. Newbury Park, CA: Sage.
Lynd, Robert S., and Lynd, Helen M. (1929) Middletown. A Study in Modern American Culture. New York:
Harcourt Brace Jovanovich.
Marshall, M. N. (1996) ‘Sampling for qualitative research', Family Practice, 13(6): 522–25.
Mason, Jennifer (2002) Qualitative Researching (2nd edn). London: Sage.
Mason, M. (2010) ‘Sample size and saturation in PhD studies using qualitative interviews [63 paragraphs]',
Forum Qualitative Sozialforschung/Forum: Qualitative Social Research, 11(3): Art. 8, available at http://nbn-resolving.de/urn:nbn:de:0114-fqs100387.
Maxwell, Joseph A., and Chmiel, Margaret (2014) ‘Generalization in and from qualitative analysis', in Uwe
Flick (ed.), SAGE Handbook of Qualitative Data Analysis. London: Sage, pp. 540–53.
Merkens, Hans (2004) ‘Selection procedures, sampling, case construction', in Uwe Flick, Ernst von
Kardorff and Ines Steinke (eds), A Companion to Qualitative Research. London: Sage, pp. 165–71.
Mitchell, C. (1983) ‘Case and situation analysis', The Sociological Review, 31(2): 187–211.
Morse, J. (2000) ‘Editorial: Determining sample size', Qualitative Health Research, 10(1): 3–5.
Onwuegbuzie, A. J., and Leech, N. (2005) ‘Taking the “q” out of research: Teaching research methodology
courses without the divide between quantitative and qualitative paradigms', Quality & Quantity, 39(3): 267–96.
Onwuegbuzie, Anthony J., and Leech, Nancy (2007) ‘A call for qualitative power analyses', Quality &
Quantity, 41(1): 105–21.
Onwuegbuzie, A. J., and Leech, N. (2010) ‘Generalization practices in qualitative research: A mixed
methods case study', Quality & Quantity, 44(5): 881–92.
O'Reilly, M., and Parker, N. (2012) ‘“Unsatisfactory saturation”: A critical exploration of the notion of
saturated sample sizes in qualitative research', Qualitative Research, 13(2): 190–7.
Palinkas, L. A., Horwitz, S. M., Green, C. A., Wisdom, J. P., Duan, N., and Hoagwood, K. (2015)
‘Purposeful sampling for qualitative data collection and analysis in mixed method implementation research',
Administration and Policy in Mental Health and Mental Health Services Research, 42(5): 533–44.
Patton, Michael Q. (2015) Qualitative Evaluation and Research Methods (4th edn). Newbury Park: Sage.
Polit, D. F., and Beck, C. (2010) ‘Generalization in quantitative and qualitative research: Myths and
strategies', International Journal of Nursing Studies, 47(11): 1451–8.
Ritchie, Jane, Lewis, Jane, Elam, Gilliam, Tennant, Rosalind, and Rahim, Nilufer (2014) ‘Designing and
selecting samples', in Jane Ritchie, Jane Lewis, Carol M. Nicholls and Rachel Ormston (eds), Qualitative
Research Practice. A Guide for Social Science Students and Researchers. London: Sage, pp. 111–45.
Robinson, O. C. (2014) ‘Sampling in interview-based qualitative research: A theoretical and practical guide',
Qualitative Research in Psychology, 11(1): 25–41.
Sandelowski, M. (1995) ‘Sample size in qualitative research', Research in Nursing & Health, 18(2): 179–83.
Schegloff, E. A., and Sacks, H. (1974) ‘A simplest systematics for the organization of turn-taking for
conversation', Language, 50(4): 696–735.
Schofield, Janet W. (1990) ‘Increasing the generalizability of qualitative research', in Elliot W. Eisner and
Alan Peshkin (eds), Qualitative Inquiry in Education: The Continuing Debate. New York: Teachers College
Press, pp. 201–42.
Schreier, Margrit, Schmitz-Justen, Felix, Diederich, Adele, Lietz, Petra, Winkelhage, Jeanette, and
Heil, Simone (2008) Sampling in qualitativen Untersuchungen: Entwicklung eines Stichprobenplanes zur
Erfassung von Präferenzen unterschiedlicher Stakeholdergruppen zu Fragen der Priorisierung medizinischer
Leistungen, FOR655, 12, available at http://www.priorisierung-in-der-medizin.de/documents/FOR655_Nr12_Schreier_et_al.pdf.
Schwandt, Thomas A. (2001) Dictionary of Qualitative Inquiry (2nd edn). Thousand Oaks, CA: Sage.
Stake, R. E. (1978) ‘The case study method in social inquiry', Educational Researcher, 7(2): 5–8.
Strauss, Anselm (1987) Qualitative Analysis for Social Scientists. Cambridge: Cambridge University Press.
Teddlie, C., and Yu, F. (2007) ‘Mixed methods sampling: A typology with examples', Journal of Mixed
Methods Research, 1(1): 77–100.
Trotter, R. L. (2012) ‘Qualitative research sample design and sample size: Resolving and unresolved issues
and inferential imperatives', Preventive Medicine, 55(5): 398–400.
Wernet, Andreas (2014) ‘Hermeneutics and objective hermeneutics', in Uwe Flick (ed.), SAGE Handbook of
Qualitative Data Analysis. London: Sage, pp. 125–43.
Williams, Malcolm (2002) ‘Generalization in interpretive research', in Tim May (ed.), Qualitative Research in
Action. London: Sage, pp. 125–43.
Winkelhage, J., Schreier, M., and Diederich, A. (2013) ‘Priority setting in health care. Attitudes of physicians
and patients', Health 2013, 5(4): 712–19.
Wuest, J. (2001) ‘Precarious ordering: Toward a formal theory of women's caring', Health Care for Women
International, 22(1–2): 167–93.
Yin, Robert K. (2014) Case Study Research. Design and Methods (5th edn). Thousand Oaks, CA: Sage.
http://dx.doi.org/10.4135/9781526416070.n6