What are representative samples?
For most market research surveys it is impractical, in terms of time, budget and other factors, to interview everyone in your target population. To get a view of how the population of the UK feels, we don't need to interview everyone in the country. Instead, we ask a sample for their opinions.
As far as possible the sample of people interviewed should be representative of the target population. To be representative the characteristics (demographic, attitudinal and behavioural) of the people interviewed should, as far as is possible, match those of the entire target population. As a group the people interviewed should look no different to those we haven't spoken to.
Why are representative samples important?
Representative samples are important because they ensure that all relevant types of people are included in your sample and that the right mix of people is interviewed. If your sample isn’t representative it will be subject to bias. Certain groups may be over-represented and their opinions magnified, while others may be under-represented.
One of the most famous examples of sample bias was the 1936 poll carried out by The Literary Digest in the US ahead of the presidential election. Around 10 million questionnaires were mailed out and more than 2 million completed surveys were sent back. Based on the results of the survey, The Literary Digest predicted that Alfred Landon would beat the incumbent president, Franklin D. Roosevelt. However, when it came to the actual election, Roosevelt won by a landslide.
The reason for the inaccuracy of the poll was an unbalanced, unrepresentative sample. The Literary Digest's mailing list, drawn from its own subscribers, telephone directories and car registration records, was made up of people who tended to be wealthier than average and therefore more likely to oppose Roosevelt. It is also thought that those unhappy with the incumbent president were more motivated to respond to the survey and say they would vote for Landon.
This survey also showed that large sample sizes don’t guarantee accurate survey results. George Gallup, using a much smaller but more controlled survey, correctly predicted the result.
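The point that sample size cannot compensate for sample bias can be illustrated with a small simulation. The sketch below is a toy example, not a reconstruction of the 1936 poll: the population size, the 60% level of support and the response rates are all invented for illustration. It compares a large self-selected sample, in which opponents are assumed to be more likely to reply, with a much smaller simple random sample.

```python
import random

rng = random.Random(1936)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60% support candidate A (True), 40% do not (False).
population = [True] * 120_000 + [False] * 80_000
true_share = sum(population) / len(population)

# Large self-selected sample. ASSUMPTION: opponents are more motivated to
# reply, so they respond at 25% while supporters respond at only 10%.
RESPONSE_RATE = {True: 0.10, False: 0.25}
self_selected = [person for person in population
                 if rng.random() < RESPONSE_RATE[person]]

# Small simple random sample: everyone has the same chance of selection.
random_sample = rng.sample(population, 2_000)


def share(sample):
    """Proportion of a sample that supports candidate A."""
    return sum(sample) / len(sample)


biased_estimate = share(self_selected)
random_estimate = share(random_sample)

print(f"True support:          {true_share:.1%}")
print(f"Self-selected (n={len(self_selected)}): {biased_estimate:.1%}")
print(f"Random sample (n={len(random_sample)}):  {random_estimate:.1%}")
```

Under these assumptions the self-selected sample is more than ten times larger than the random one, yet its estimate is far from the true figure, while the small random sample lands close to it.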
Nowadays many surveys are sent out through social media with seemingly no control over the type or mix of respondents completing the survey. As many interviews as possible are sought, with no concern for sample structure. If we don’t know who has responded, we don’t know how well or badly the resulting data represents the complete target audience. Without controls it is likely that the sample will be significantly biased, making the resulting data impossible to interpret and therefore unusable.
Sample bias will always exist to a certain extent because we can’t force people to complete surveys. Those who don’t accept our invitations to participate may well be different in some way to those who take part. For example, the busier people are, the less likely they generally are to take part in surveys, so busy people are likely to be under-represented in research. We can't eliminate sample bias, but we should do all that we reasonably can to minimise it and to understand it.
How do we achieve representative samples?
There are two ways of deriving representative samples for research surveys: probability sampling and non-probability sampling.
Probability or random sampling involves choosing respondents from your target population at random, which minimises potential sample bias. However, to be able to sample randomly you need up-to-date details of everyone in your target population, and for many audiences no such list exists. You also need to be able to actually survey a large proportion of those chosen at random, which can be time-consuming and expensive. In layman's terms, samples drawn in this way are "purer" than those constructed using non-probability methods but, due to the resources needed, this sort of sampling tends to be confined to high-quality, well-funded social and government research.
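As a minimal sketch of what choosing at random means in practice, the example below draws a simple random sample from a complete list of the target population. The list (`sampling_frame`), its size and the sample size are hypothetical; in real research, building and maintaining such a list is usually the hard part.

```python
import random

# Hypothetical list of everyone in the target population (one ID per member).
sampling_frame = [f"member_{i:05d}" for i in range(1, 50_001)]

rng = random.Random(42)  # fixed seed so the draw can be reproduced

# Simple random sample without replacement: every member has the same
# chance of selection and nobody can be chosen twice.
selected = rng.sample(sampling_frame, 500)

print(selected[:5])
```

Drawing the names is only the first step. The sample stays close to random only if a large proportion of those selected actually go on to complete the survey.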
Non-probability or purposive sampling is much more widely used. Controls, in the form of quotas, are placed on the types of respondents chosen for the survey, and we deliberately seek out different types of people to interview so that the sample is correctly balanced.
To set up these quotas we first need to find out the profile of our target population in terms of its key characteristics. For example, if we wanted to interview a sample of teenagers in the UK, we could use census data available on the Office for National Statistics website to find information on the demographic characteristics of this group. We might want to interview the right mix of teenagers by age. If there are fewer 14 year olds than 18 year olds in the UK, our sample should reflect this. Census data can tell us the number of teenagers of each age as well as providing splits by gender, region and other useful characteristics.
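Turning a population profile into interview quotas is simple proportional arithmetic. The sketch below shows one way to do it; the function name `proportional_quotas` is hypothetical and the age counts are made up for illustration (they are not real census figures). Rounding uses the largest-remainder method so that the quotas always add up to the intended total.

```python
def proportional_quotas(population_counts, total_interviews):
    """Split total_interviews across groups in proportion to population size."""
    total_population = sum(population_counts.values())
    # Exact (fractional) share of interviews for each group.
    exact = {group: total_interviews * count / total_population
             for group, count in population_counts.items()}
    # Round down first, then hand the leftover interviews to the groups
    # with the largest fractional remainders.
    quotas = {group: int(value) for group, value in exact.items()}
    shortfall = total_interviews - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - quotas[g], reverse=True)
    for group in by_remainder[:shortfall]:
        quotas[group] += 1
    return quotas


# ASSUMPTION: invented population counts by age, for illustration only.
teen_counts = {14: 690_000, 15: 700_000, 16: 710_000, 17: 730_000, 18: 770_000}

age_quotas = proportional_quotas(teen_counts, 1000)
print(age_quotas)
```

With these invented counts there are fewer 14 year olds than 18 year olds in the population, so the quota for 14 year olds comes out smaller than the quota for 18 year olds.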
Profiles of some groups can be harder to obtain. Industry-wide research can provide profiles of groups such as readers of a particular newspaper or shoppers at a certain supermarket chain. Past research can also be used if relevant. Sometimes, however, it might be necessary to use an omnibus survey. An omnibus survey regularly interviews a representative sample of the population of a country and includes a range of questions from different clients about different subjects. If we wanted to know something like the profile of people who wear sunglasses regularly, we could add a question to the omnibus survey. We would then get back profile data for this group, typically on key demographics, which could be used to set up quotas for a further, more detailed survey on the use of sunglasses.
Once we know the profile of our target population we can set quotas on key respondent characteristics. For example, if we know that 70% of the population we want to interview are male, we would set up a quota specifying that 70% of the interviews will be with men and 30% with women. If we wanted to interview a total of 1000 respondents, we would therefore set a maximum of 700 interviews with men and 300 with women, and make sure that interviewing with each group stopped once its maximum had been reached.
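The quota control described here can be sketched in a few lines. The `QuotaTracker` class below is a hypothetical illustration rather than a feature of any particular survey platform: it accepts an interview only while the respondent's group is still below its maximum.

```python
class QuotaTracker:
    """Accept interviews until each group reaches its quota."""

    def __init__(self, targets):
        self.targets = dict(targets)
        self.completed = {group: 0 for group in targets}

    def try_accept(self, group):
        """Record the interview if the group's quota is still open."""
        if self.completed[group] >= self.targets[group]:
            return False  # quota full: screen this respondent out
        self.completed[group] += 1
        return True


total_interviews = 1000
profile_percent = {"male": 70, "female": 30}  # population profile from the text

# 70% of 1000 interviews is 700 men; 30% is 300 women.
targets = {group: total_interviews * pct // 100
           for group, pct in profile_percent.items()}

tracker = QuotaTracker(targets)

# Simulate 800 men trying to take part: only the first 700 are accepted.
accepted_men = sum(tracker.try_accept("male") for _ in range(800))
print(targets, accepted_men)
```

Once the male quota is full, further men are turned away while interviews with women continue until their own quota of 300 is reached.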
Quotas are typically set on 3 or 4 different variables, usually demographics such as gender, age and region. Using more variables than this can make the structure of the sample complicated and, in turn, make obtaining interviews with exactly the right people difficult, lengthy and costly. Additional quotas are also unlikely to reduce sample bias by more than a marginal amount.
The actual quotas set depend heavily on your target population, because different types of audience have different key characteristics. For example, for business-to-business research, quotas are often set on variables such as business sector and number of employees.
Before you launch your survey, think about how you can minimise potential sample bias and achieve as representative a sample as possible. No matter how well designed your questionnaire is, the resulting data can be worthless if your sample is unbalanced.