Non-random sampling: quota sampling

At last, our series of posts on sampling, has reached the all-star of non-random sampling: quota sampling. This method is most often used in online research conducted through panels. We could look at quota sampling as the nonrandom version of stratified sampling. It consists of three phases:

1. Segmentation

To start with, we divide the population we hope to study into mutually exclusive groups, in such a way that every individual belongs to one and only one group (no one is left out), just as when we define the strata for stratified sampling. This segmentation is normally achieved by using a sociodemographic variable such as sex, age, region or social class.

2. Setting the size of the quotas

Next, we set the target number of individuals to be surveyed for each of these groups. We normally define these targets in proportion to the group’s size within the population. For example, if we defined several segments by gender in a population that is 60% women and 40% men, and we want a 1,000-person sample, we would define

a target of 600 women and 400 men. These targets are known as quotas. In this example, we would have a gender quota of 600 women and 400 men. Sometimes we set quotas that are not proportional to the population; such quotas might be used to conduct in-depth analysis of a specific group.

3. Participant selection and quota monitoring

Finally, we seek participants to cover each of the quotas that we have defined. It is here that we diverge from random sampling: in quota sampling, we accept that the selection of individuals is not random; in fact, we could even use availability sampling. For example, in a study in which we have defined a quota of 100 individuals younger than twenty-five and 100 individuals who are twenty-five or older, we could stop people on the streets, ask them their age and, if we haven’t yet hit our target, survey them.

Given the above description, the difference between stratified sampling and quota sampling lies in the way we choose participants. With stratified sampling, I have access to a list of possible interviewees, all with a certain (known) probability of being selected. With quota sampling, it's a different story: we choose candidates to be part of the study in a nonrandom way, and as part of the selection process, before interviewing them, we determine whether they are valid for our study (that is, whether they can count towards our quotas or if we have already hit our target). When we have to dismiss a possible participant because we have met our quota (the 101st woman for my 100-woman quota), we say that the individual was dismissed due to a full quota

CHOICE OF VARIABLESes

What variables should we choose for quota sampling? How should we segment the population? This is a key question for this method.

We must consider the goal of using quotas: making the sample as representative as possible of the population being studied. When we define quotas by sex and age in a sample, what we’re guaranteeing is that, even though the method used to select individuals for the sample isn’t strictly random, at least the sample will retain proportions that are identical to those of the population in terms of sex and age.

From this point of view, we should choose to define quotas with variables that meet two conditions: (1) they can be altered, with respect to the population, by the nonrandom selection process that we use, and (2) they can have the greatest effect on the data that we want to measure.

Let’s take a look at those two criteria at play in a real-life example: a sample drawn from an online panel. Suppose that we want to measure the percent of a population that smokes by sampling from an online panel. What variables should we choose to define our quotas?

Well, first of all, those variables that we’re thinking of might be distorted by the fact that we’re sampling for the population from an online panel in the first place: age (youths participate in online panels more than their older counterparts) and social class (it is difficult for these panels to recruit participants form lower classes, especially in Latin America) might play a role.

We can do away with regional quotas. Online panels don’t usually draw from a specific region. Rather, they recruit participants through online media, which can be accessed from any region. Unless we’re in a country with significant socioeconomic differences between regions, regional quotas aren’t necessary. Furthermore, if we don’t expect there to be differences in smoking habits between regions, there’s no advantage to forcing this sort of quota.

As for the second criterion (quotas that could affect the result measured), we could opt to add a gender quota: smoking habits are different for men and women. Unless we’re working with a panel with guaranteed gender composition, it is advisable to control for this quota, too.

QUOTA SAMPLING AND REPRESENTATION

Using quotas for nonrandom sampling doesn’t mean that we can transform it into a random sampling process. We are still unable to calculate the margin of error and the confidence level of the results. In other words, using quotas does not enable us to measure the precision of our results.

Does that mean that there's no difference between using and not using quotas? That availability sampling is just as good as quota sampling? Definitely not! Using quotas gives us a certain degree of control over biases that may arise from the selection method used; it ensures that in a key series of variables, the composition of the population will be faithfully mirrored in our sample. The problem is that we cannot state just how representative our sample is—even though this is common practice for many researchers. Quotas improve representation, but we don’t know just how much.

Despite all that, quota sampling is among the most popular sampling methods, and is practically the only viable method when conducting online research (unless we have a random panel). Quotas are an effective and affordable system for obtaining samples that provide relevant information.

The biggest advantage of quota sampling is that it offers usable results at an affordable cost and, if the variables for segmentation have been chosen properly, those results tend to be reliable.

There are two main disadvantages: (1) It is impossible to delimit any error we may be committing when using this sampling method and (2) there is the risk that some relevant quota could be left out of the study. For example, if we don’t set a regional quote for an electoral study and it turns out that voting trends vary widely by region, the global results will be significantly distorted.

COMMOM MISTAKES WHEN USING QUOTAS ONLINE

Quota sampling is very popular. Most telephone and in-person studies, for lack of a precise sampling frame (such as a population census) use quotas to ensure an acceptable level of representation. This is also the dominant method used in online study, through the use of panels. However, conducting studies online comes with its own idiosyncrasies, which researchers often fail to account for; these researchers end up using the exact same methods they would use for offline studies. Consequently, they attain lower-quality results, and in some cases they incur greater costs.

Here are a few examples:

Geographic quotas

• Offline... The region of the respondent is a key variable that must be controlled for when we conduct live surveys, for obvious reasons. If we survey residents of one city, all of the respondents will be from that city. That is why region is a key quota. It is also common practice to limit the sample to a country’s most important cities in order to minimize costs.
• Online... Region is not so important; finding individuals from different cities is easy. There is no difference in cost between surveying people from one city or from ten. Therefore, if the geographic factor isn’t key to the study, there is no need to control for geographic quotas. If the geographic factor is important, we can set quotas so that we obtain responses from every region, not just from a handful of cities. This way, we’ll get better data at a lower cost, since we can use the whole panel to obtain results.

Quotas by social class

• Offline... Social class isn’t often considered a key quota in Europe and North America, at least not in all studies. Differences between social classes exist, but they are not so pronounced as in other regions, such as Latin America, where individuals from higher social classes are more difficult to access in order to gather data through personal interviews, while data from lower social classes is easier to obtain.
• Online... Social class is more relevant than online, especially in countries where Internet use is intermediate or low. Curiously, in these countries, the situation is flipped: it is easier to connect with higher classes in Latin America, while lower classes are more difficult to access online.

Gender and age quotas

• Offline... Gender and age are variables that are typically controlled for with quotas. There is no real difference when conducting live interviews, but men and young people are more difficult to access over the phone. The emergence of the cellphone has aggravated the problem: young people hardly ever use landlines.
• Online... Both variables must be controlled for, as offline. It is easier to access young people through online panels, especially those between 20 and 35 years old. These panels also typically recruit more women than men, because they are more highly sought for market studies, so gender must be controlled for.

We hope that you’ll join us for the next post in this series, which is dedicated to a method known as snowball sampling.