Whenever I look at the stats for this modest blog, I notice the same pattern. The number of visits closely follows the Pareto principle: 20% of our posts generate 80% of our page views. Of that 20%, the majority discuss how to calculate the size of a representative sample in order to conduct an opinion poll.
Given the apparent interest in this topic, today we are launching a series of posts on sampling: what it is, different sampling methods, when it’s useful to use one method or another, and so on. We hope that this information will be useful to students, to stats enthusiasts, and to professionals whose statistical expertise is a little rusty.
WHAT IS SAMPLING?
Sampling is the process of selecting a group of individuals from a population in order to study them and characterize the population as a whole.
It’s a pretty simple idea. Let’s say we want to know something about a population: the percentage of people in Mexico who smoke, for example. One way to go about this would be to call up everyone in Mexico (122 million people) and ask them if they smoke. The other way would be to select a subgroup of individuals (1,000 people, for example), ask them if they smoke, and then use their answers as an approximation of the figure we really want. This group of 1,000 people, which makes it possible for us to understand the behavior of Mexicans in general, is called a sample, and the way we select them is called sampling.
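To make the idea concrete, here is a minimal Python sketch of the smoking example. Everything in it is an assumption made for illustration: the population is scaled down to 100,000 people, the 15% share of smokers is invented (it is not a real figure for Mexico), and `estimate_share` is just a name we made up for the helper. The sample is drawn completely at random, which is only one of several possible sampling methods.

```python
import random


def estimate_share(population, sample_size, rng):
    """Estimate the share of True values from a simple random sample."""
    # Draw `sample_size` individuals without replacement.
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


# Fixed seed so that the run is repeatable.
rng = random.Random(42)

# Hypothetical, scaled-down population. Both numbers are invented
# for this example and are not real data.
POPULATION_SIZE = 100_000
NUM_SMOKERS = 15_000
true_share = NUM_SMOKERS / POPULATION_SIZE

# True = smoker, False = non-smoker.
population = [True] * NUM_SMOKERS + [False] * (POPULATION_SIZE - NUM_SMOKERS)

# Ask only 1,000 people instead of everyone.
estimate = estimate_share(population, 1_000, rng)

print(f"True share of smokers:      {true_share:.1%}")
print(f"Estimate from 1,000 people: {estimate:.1%}")
```

The estimate will usually land close to the true share without matching it exactly, and a different seed gives a slightly different answer. That gap between the sample and the population is the price we pay for not having to ask everyone.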