I was recently given the results of a survey whose questions the participants were likely to answer at random, and whose population is known to prefer the first answer listed. Survey takers participated because of a free prize, and now I need to figure out how to account for the bias introduced by that preference for the first answer. How would I go about doing this mathematically?
Here is an equivalent example:
Favorite Representatives Survey
Pick your favorite Utah representative:
- Jason Chaffetz (2080 selected)
- Rob Bishop (1380 selected)
- Chris Stewart (580 selected)
- Mia Love (746 selected)
Pick your favorite Kansas representative:
- Lynn Jenkins (1890 selected)
- Tim Huelskamp (2910 selected)
- Mike Pompeo (540 selected)
Pick your favorite New York representative:
- Lee Zeldin (2584 selected)
- Peter King (1123 selected)
- Steve Israel (1790 selected)
- Kathleen Rice (1521 selected)
- Gregory Meeks (1207 selected)
Pick your favorite Alaska representative:
- Donald Young (4585 selected)
Pick your favorite Hawaii representative:
- Mark Takai (1900 selected)
- Tulsi Gabbard (2700 selected)
As you can see from the data, there is a pretty clear trend toward the first choice, but there are also cases where a later choice won (Kansas and Hawaii). What general methods can I use to quantify the first-choice advantage and estimate what the results would have been without that bias?
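
To make the question concrete, here is a rough sketch of one simple way to put a number on the first-choice advantage: compare the observed share of the first option in each question with the share it would get if all k options were picked uniformly at random (1/k). The uniform baseline is an assumption that follows from the "random answers" premise, and the names `counts` and `first_option_excess` are made up for illustration. This cannot separate position bias from genuine preference; it only describes the data above.

```python
# Counts copied from the example survey, in the order the options were listed.
counts = {
    "Utah": [2080, 1380, 580, 746],
    "Kansas": [1890, 2910, 540],
    "New York": [2584, 1123, 1790, 1521, 1207],
    "Alaska": [4585],
    "Hawaii": [1900, 2700],
}


def first_option_excess(votes):
    """Return (observed first-option share, uniform baseline 1/k, difference).

    Assumption: without any position bias, respondents would pick each of
    the k options with equal probability.
    """
    k = len(votes)
    share = votes[0] / sum(votes)
    baseline = 1 / k
    return share, baseline, share - baseline


# A question with a single option says nothing about ordering, so skip it.
results = {
    state: first_option_excess(votes)
    for state, votes in counts.items()
    if len(votes) > 1
}

for state, (share, baseline, excess) in results.items():
    print(f"{state}: first={share:.3f} uniform={baseline:.3f} excess={excess:+.3f}")
```

A positive excess means the first option did better than the uniform baseline, and a negative one means it did worse. With only one ordering per question, this is a description of the data, not a corrected result.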