Bias in Surveys
Surveys are a cost-effective means of obtaining information from a wide variety of persons. For that reason they are often used in research and evaluation as a central data collection activity. Unfortunately, many of the persons who develop surveys for use as data collection tools have no training in survey development and may be unaware of the issues of bias and error.
One of the critical concerns when developing surveys is the issue of error. Error in surveys is the difference between one’s true answer and one’s actual answer. In an ideal situation error would be zero and there would be no difference between the true and actual answer. In algebraic terms, where T is the true answer, A is the actual answer, and e is the error, T = A + e, or equivalently e = T - A.
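As a small worked illustration of the relation T = A + e, the sketch below computes the error for a single respondent. The function name and the numbers are hypothetical, chosen only to make the arithmetic concrete; in practice the true answer cannot be observed.

```python
def response_error(true_answer: float, actual_answer: float) -> float:
    """Return e in T = A + e, that is, e = T - A."""
    return true_answer - actual_answer


# Hypothetical respondent: really withdrew cash 5 times but reported 8.
e = response_error(true_answer=5, actual_answer=8)
print(e)  # negative value: the reported answer overshoots the true one
```

An error of zero would mean the reported answer matches the true answer exactly.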
Although we cannot calculate a person’s true answer, we can reduce error by addressing the bias that is known to inflate it. When actually constructing the survey, including developing survey items/questions and response options for closed-ended questions, reducing measurement error must be a critical concern to preserve the quality of the data resulting from the survey.
Measurement error refers to the bias introduced as part of the measurement process. It can be introduced through question wording, question placement, the context in which questions are asked, and the response options provided for closed-ended questions. Unfortunately, very little research exists on the ways in which response options introduce bias in survey responses. Without empirical research in this area it is hard to estimate how large a problem measurement error may be for a particular survey because of the response options provided. Understanding the ways in which response options may introduce bias is critical to changing survey practices regarding the development of response options and to understanding the quality of the data obtained by the survey.
Typical survey response options for a frequency scale
One type of closed-ended question often asked on a survey concerns the frequency with which an event occurred. For example, a survey on student engagement may ask the respondent, the student in this case, how frequently, on average, throughout a typical week, he or she raises his or her hand to answer a question in class. Because respondents are very unlikely to know the exact number of times, on average in a week, that they do this, they are often asked to provide a best guess. To help them estimate this frequency, a surveyor may provide the following responses to choose from, rather than just having the respondent provide a number: Never, A couple of times, A few times, Several times. Unfortunately, when these responses are provided with no additional information, such as the number or range of numbers associated with each one, bias is introduced into the responses, directly affecting the reliability of the resulting data.
Bias in the use of frequency descriptors with no numerical anchoring
To assess the degree to which bias is introduced by the commonly used frequency descriptors above, 27 persons attending a workshop on survey quality responded to the following directions:
Given the question “On average, how often do you withdraw cash from your bank account each month?” and the following response options “Never, Once, A couple of times, A few times, and Several times”, please assign a number or range of numbers to these response options.
A couple of times =
A few times =
Several times =
The question, “On average, how often do you withdraw cash from your bank account each month?” was provided to respondents because it placed a time boundary on the potential values for the descriptors and ensured that all respondents were responding within the same context.
The table below shows the minimum and maximum values associated with each descriptor, as well as the mean of the ranges and the standard deviation of these means. As can be seen, the only descriptor where the standard deviation was zero was “Never”. That finding is not surprising, as by definition never means at no time. What is surprising is that the descriptor “Once” has a mean greater than 1.00 and a nonzero standard deviation. Although the mean is very close to 1.00 (1.10) and the standard deviation is quite small (0.28), it is somewhat surprising that a range of numbers was associated with “Once” instead of just the number one. “A couple of times” was also associated with numbers other than two: its minimum was two, but the maximum value associated with it was eight, as in eight times. While the mean value associated with that descriptor is fairly close to 2.00 (2.60), its standard deviation is quite large at 1.04. This suggests that the value or values respondents associate with that descriptor vary greatly. Not surprisingly, the descriptors “A few times” and “Several times” appear to be fairly ambiguous in their meaning to respondents as well, as is evident from the large standard deviations associated with them (2.19 and 4.24, respectively).
Table 1: Numeric values associated with frequency descriptors
Descriptor, n, Min., Max., Mean, sd
Never, 27, 0, 0, 0.00, 0.00
Once, 27, 1, 2, 1.10, 0.28
A couple of times, 27, 2, 8, 2.60, 1.04
A few times, 27, 2, 14, 4.73, 2.19
Several times, 27, 4, 24, 9.13, 4.24
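For readers who want to repeat this exercise with their own respondents, the following is a minimal sketch of how summary values like those in Table 1 could be computed. It rests on assumptions not stated in the workshop write-up: each respondent's answer is stored as a (low, high) range, a range is reduced to its midpoint before averaging, and the sample standard deviation is used. The function name and the data are hypothetical and are not the workshop responses.

```python
from statistics import mean, stdev


def summarize(responses):
    """Summarize one descriptor.

    responses: list of (low, high) tuples, one per respondent.
    A single number n is entered as (n, n).
    """
    # Assumption: each respondent's range is represented by its midpoint.
    midpoints = [(low + high) / 2 for low, high in responses]
    return {
        "n": len(responses),
        "min": min(low for low, _ in responses),
        "max": max(high for _, high in responses),
        "mean": round(mean(midpoints), 2),
        "sd": round(stdev(midpoints), 2),  # sample standard deviation
    }


# Hypothetical answers for "A couple of times" (illustrative only).
hypothetical = [(2, 2), (2, 3), (2, 2), (2, 4), (3, 5)]
print(summarize(hypothetical))
```

A standard deviation well above zero for a descriptor indicates that respondents do not share a single numeric meaning for it.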
The findings presented above suggest that commonly used response option descriptors for the frequency of an event or behavior may introduce bias and thus error into survey responses. Although this finding may not be surprising to some who believe that frequency response options should be defined numerically, the degree of error associated with “Once” and “A couple of times” is surprising, given that by convention they are assumed to mean one time and two times, respectively. Not only is the range of numbers associated with the descriptors “A couple of times” and “Several times” quite large, but the standard deviation of the means of these ranges is also quite large, further suggesting that there is room for much measurement error when these descriptors are used without accompanying numbers.
The implication for survey developers is both clear and significant: these findings suggest that there is no agreed-upon definition for these typically used frequency descriptors. The remedy may seem clear: rather than use a description alone, associate a number or range of numbers with each descriptor, or use numbers alone to represent the frequency of events or behaviors. Yet the fact that numbers and number ranges are often not included suggests that surveyors are unaware of the degree to which these descriptors introduce measurement error into survey findings.
To improve survey developers’ understanding of how response options may introduce bias, and thus error, into survey responses and undermine the quality of the resulting data, additional research is needed both to replicate the findings of this study and to assess bias in other response options.
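One way to picture the remedy is a set of response options in which every descriptor carries an explicit number or range. The sketch below is illustrative only: the cut points, labels, and function name are hypothetical and are not recommendations drawn from the workshop data.

```python
# Hypothetical numeric anchors: (label shown to respondent, low, high).
# A high of None means "this many or more".
ANCHORED_OPTIONS = [
    ("Never (0 times)", 0, 0),
    ("Once (1 time)", 1, 1),
    ("A couple of times (2 times)", 2, 2),
    ("A few times (3 to 5 times)", 3, 5),
    ("Several times (6 or more times)", 6, None),
]


def option_for(count: int) -> str:
    """Return the anchored option that contains a given count."""
    if count < 0:
        raise ValueError("count must be non-negative")
    for label, low, high in ANCHORED_OPTIONS:
        if count >= low and (high is None or count <= high):
            return label
    raise ValueError("no option covers this count")


print(option_for(4))
```

Because the ranges do not overlap and together cover every non-negative count, two respondents with the same true frequency are pointed to the same option.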