3.1 a. Gender, measured as male or female. (It measures a non-numerical variable, and truly measures that feature.)
b. Time, measured on a clock that is always 5 minutes fast. (Everyone would get the same reading so it is reliable, but it would be consistently off by 5 minutes, so it is biased.)
c. Weight on a postal scale that sometimes measures too high and sometimes measures too low, with equal likelihood. (It is unbiased since there is no systematic bias in one direction, but unreliable because different people measuring the same package would get different answers.)
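The distinction between bias and reliability in parts b and c can be illustrated with a small simulation. This is only a sketch under assumed numbers: the true values, the noise level of 2 units for the scale, and the helper name `simulate_readings` are hypothetical and do not come from the exercise.

```python
import random
import statistics

def simulate_readings(true_value, bias, noise_sd, n, rng):
    """Return n readings: true value plus a fixed bias plus random noise."""
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Clock that is always 5 minutes fast: biased but perfectly reliable.
# (True offset from the correct time is taken to be 0 minutes.)
clock = simulate_readings(true_value=0.0, bias=5.0, noise_sd=0.0, n=1000, rng=rng)

# Postal scale that errs high or low equally often: unbiased but unreliable.
# (A hypothetical 100-gram package with an assumed noise SD of 2 grams.)
scale = simulate_readings(true_value=100.0, bias=0.0, noise_sd=2.0, n=1000, rng=rng)

clock_bias = statistics.mean(clock) - 0.0      # systematic error of the clock
clock_spread = statistics.pstdev(clock)        # disagreement between readings
scale_bias = statistics.mean(scale) - 100.0    # systematic error of the scale
scale_spread = statistics.pstdev(scale)        # disagreement between readings

print(f"clock: bias={clock_bias:.2f}, spread={clock_spread:.2f}")
print(f"scale: bias={scale_bias:.2f}, spread={scale_spread:.2f}")
```

The clock readings all agree with one another (no spread) yet are off by the same amount, while the scale readings average close to the true weight but disagree from one reading to the next.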
3.2 a. "Don't you agree that our whole tax system, which is far too complicated for anyone to understand, should be overhauled?" (The key here is to include a leading question like "Don't you agree that ..." or to include emotional wording like "Because abortion involves killing innocent babies...")
b. Questions that are unintentionally biased usually involve ambiguous wording. For example, asking "Do you exercise at least three times a week?" may mean something different to college students than to senior citizens. For instance, the former group might not consider riding a bicycle for transportation as exercise, while the latter group does. Terms need to be clearly defined.
c. These are often questions that ask more than one question but allow only one answer, like "Do you support doing away with subsidized school lunches and giving food vouchers to the poor?"
d. Questions that require knowledge people aren't likely to have, or questions that have socially desirable or even illegal answers fit this. For example, "How many times in the past year have you used illegal drugs?"
3.3 a. Questions for which you want to allow for unique answers, like "Give an example of a situation for which an experiment would be unethical."
b. Questions for which you want to summarize relative responses from a fixed set of choices. For instance, the student government at a university might ask: "Which of the following areas do you think should receive the highest priority for funding with student fees?" then list choices of projects they are thinking about funding.
3.4 a. Pitfalls 1 (deliberate bias) and 5 (unnecessary complexity) both may be present. The question suggests that there is a good reason to support banning prayers, thus encouraging people to agree they should be banned. It also asks a more complicated question than simply whether the respondent supports banning prayers.
b. Pitfall 1, deliberate bias, is present. The form of the question indicates that the person asking it does think marijuana should be legal.
c. Pitfall 3, desire to please, is probably most relevant, although pitfalls 1 (deliberate bias) and 6 (order of questions, in this case, order of information presented) may also apply. The statement preceding the question indicates that consuming one drink may actually be good for you, so people would be more likely to admit that they drink.
3.5 a. Do you support banning prayers in schools, or not?
b. Do you think that the use of marijuana should be legal, or not legal?
c. How many alcoholic drinks do you consume daily? It might make sense to present categories like "none, 0 to 1 per day, 2 to 4 per day, 5 or more per day."
3.6 a. Measurement.
b. Categorical, but may be ambiguous because it could be quantified to within a certain range of years.
3.7 a. Years of formal education is ratio. It makes sense to talk about someone having twice as many years as someone else.
b. Highest level of education is ordinal. There is a natural order to the categories.
c. Brand of car is nominal. There is no natural ordering.
d. Price paid for a car is ratio. It makes sense to talk about one car costing twice as much as another.
e. Type of car owned is ordinal, although there is a problem with a few of the categories, namely sports car and pickup because they don’t fit anywhere in the natural ordering.
3.8 a. Discrete.
3.9 a. The number of floors in a building is ratio. It makes sense to talk about one building having twice as many floors as another.
b. The height of a building is ratio. It makes sense to talk about one building being twice as tall as another.
c. The number of words in a book is ratio. It makes sense to talk about one book having twice as many words as another.
d. The weight of a book is ratio. It makes sense to talk about one book weighing twice as much as another.
e. IQ is interval. There is no meaningful zero, and it doesn’t make sense to talk about someone having twice as high an IQ as someone else.
3.10 a. Yes, nominal variables are one type of categorical variable.
b. No, categorical variables are either nominal or ordinal.
c. No, interval variables are measurement variables, not categorical variables.
d. Yes, an example is year of someone’s birth.
3.11 A reliable measurement. The reported price may consistently deduct factors like the cost of necessary repairs, but we could make the year-to-year comparison as long as the measurements were taken the same way each year (i.e., were reliable).
3.12 It would be easier to detect a difference if there were only a little variability, since then the difference due to the heartbeat might completely separate the groups. For instance, if without the heartbeat all babies gained 10 grams, then even a 1-gram difference in the heartbeat group would be detected. If weight gains ranged from 0 to 20 grams, the 1-gram difference might be masked by the large natural variability.
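The argument above can be checked with a rough simulation. The sketch below assumes 10 babies per group, a 1-gram benefit from the heartbeat, and uniformly distributed gains; these choices and the function name `separation_rate` are illustrative assumptions, not part of the original study.

```python
import random

def separation_rate(spread, n_per_group=10, shift=1.0, trials=2000, seed=1):
    """Fraction of simulated studies in which every heartbeat-group gain
    exceeds every control-group gain (complete separation of the groups)."""
    rng = random.Random(seed)
    separated = 0
    for _ in range(trials):
        # Control babies gain 10 grams on average, give or take `spread`.
        control = [10 + rng.uniform(-spread, spread) for _ in range(n_per_group)]
        # Heartbeat babies gain `shift` grams more on average.
        heartbeat = [10 + shift + rng.uniform(-spread, spread)
                     for _ in range(n_per_group)]
        if min(heartbeat) > max(control):
            separated += 1
    return separated / trials

low = separation_rate(spread=0.0)    # every control baby gains exactly 10 grams
high = separation_rate(spread=10.0)  # gains range over roughly 0 to 20 grams

print(f"no variability:    groups fully separated in {low:.0%} of studies")
print(f"large variability: groups fully separated in {high:.0%} of studies")
```

With no variability the 1-gram difference separates the groups in every simulated study; with gains spread over about 20 grams, complete separation almost never happens, so the same difference is easily masked.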
3.13 They are not likely to be a valid measure, because many crimes (like rape or domestic violence) go unreported. They are a reliable measure, because we would all agree on how to get the answer, namely, just look at what is in the records of reported crimes.
3.14 As discussed in Case Study 2.2, the wording was biased, perhaps unintentionally so. Also, desire to please would certainly be a factor in a face-to-face interview like this one, especially in a question like "How long have you known about Brooks Running Shoes?" with the clear implication that you should know about them.
3.15 a. Closed-form question; four response choices were given.
b. There is only one choice of aspirin, namely Brand B. Further, those who prefer Tylenol would have to choose between plain Tylenol or Extra-strength Tylenol, thus splitting the vote.
c. The claim is very misleading and biased in favor of Brand B aspirin. Anyone who prefers aspirin at all would have to choose Brand B aspirin. Further, those who prefer Tylenol have had their votes split into two groups. Perhaps the number of people who prefer Tylenol in general is higher than the number who prefer aspirin in general, but the Tylenol vote is split while the aspirin vote all goes to Brand B.
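A quick tally shows how vote splitting can produce the misleading result described in parts b and c. The counts below are invented for illustration only, and the fourth choice is simply labeled "Other" because it is not named here.

```python
from collections import Counter

# Hypothetical tallies for illustration only; not data from the survey.
votes = Counter({
    "Brand B aspirin": 35,
    "Tylenol": 30,
    "Extra-strength Tylenol": 30,
    "Other": 5,  # placeholder for the fourth choice
})

# Winner among the choices as listed on the questionnaire.
top_choice, top_count = votes.most_common(1)[0]

# Regroup by type of pain reliever to undo the split in the Tylenol vote.
by_type = Counter()
for choice, count in votes.items():
    kind = "Tylenol (any)" if "Tylenol" in choice else choice
    by_type[kind] += count

top_type, top_type_count = by_type.most_common(1)[0]

print(top_choice, top_count)      # Brand B aspirin 35
print(top_type, top_type_count)   # Tylenol (any) 60
```

Under these assumed counts Brand B aspirin "wins" among the listed choices even though Tylenol products together are preferred by far more respondents.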
3.16 Although only one-fifth favored forbidding public speeches (version A), almost one-half did not want to allow them (version B). Americans appear to be reluctant to forbid this type of free speech (4/5 didn’t want to do so), but they aren’t as reluctant to withhold approval, with almost half willing to do so.
3.17 The idea is that one question might suggest a topic that is then considered important or informative for the second question, whereas if the wording had been reversed the topic may not have come to mind. As an example, in the study by Brooks Shoes (Case Study 2.2) participants were told "I am going to hand you a shoe. Please tell me what brand you think it is." Later, they were asked "How long have you known about Brooks Running Shoes?" If they had been asked the second question first, they would surely have thought of Brooks immediately when asked about the brand of the displayed shoe.
3.18 The order of the questions most likely influenced the answer to the second one. Since the majority of respondents first acknowledged thinking most presidents have had extramarital affairs, they were then not in a position to describe Clinton's faults as worse than most presidents, when the alleged fault of his was an extramarital affair. (Note that the allegations of perjury and other offenses arose after this poll was taken.)
3.19 Anonymous. If it were confidential, the names would be known but not released to anyone.
3.20 a. Some airline companies are thinking of banning cigarette smoking on all flights. Do you agree or disagree with this policy?
b. Don't you agree that because of the known harmful effects of passive smoke, airlines should ban cigarette smoking on all flights?
c. Don't you agree that on long airline flights, smokers should have the right to smoke cigarettes, as long as a separate, well-ventilated section of the plane is provided for them?
3.21 A discrete variable measures something numerical with a logical ordering, whereas a categorical variable does not measure something numerical and generally has no logical ordering to the categories. An example of a discrete variable is shoe size. An example of a categorical variable is type of shoe (sandal, athletic shoe, and so on).
3.22 a. No, there is so much overlap in the times that it would be difficult to make a definite conclusion about which one is really faster based on only five times for each route.
b. If one route always took 14 minutes and the other always took 16 minutes, that would be convincing.
c. When measurements are extremely variable, as they were for part a, it is hard to determine whether or not a difference between two groups or conditions really exists. When measurements have almost no variability, like in part b, a difference is easy to detect.
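One simple way to see the contrast between parts a and b is to check whether the ranges of the two sets of times overlap. The variable commute times below are made up for illustration (the exercise's own data are not reproduced here), and `ranges_overlap` is a hypothetical helper, not a formal statistical test.

```python
def ranges_overlap(a, b):
    """True if the intervals [min(a), max(a)] and [min(b), max(b)] overlap."""
    return min(a) <= max(b) and min(b) <= max(a)

# Hypothetical, highly variable commute times in minutes (five per route).
variable_route_1 = [12, 19, 14, 22, 15]
variable_route_2 = [17, 13, 21, 16, 20]

# Part b scenario: one route always takes 14 minutes, the other always 16.
steady_route_1 = [14] * 5
steady_route_2 = [16] * 5

print(ranges_overlap(variable_route_1, variable_route_2))  # True
print(ranges_overlap(steady_route_1, steady_route_2))      # False
```

When the ranges overlap heavily, five times per route say little about which route is faster; when there is no variability, the 2-minute difference is obvious.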
3.23 Anything that gets placed into discrete packets, like candy bar consumption, could be measured in terms of ounces (continuous) or number of bars (discrete) consumed. The size of a book can be measured in number of pages (discrete) or by weight. The height of a building can be measured in inches or by number of floors.
3.24 It is not completely valid, because people may interpret "on time" to mean "by the scheduled time" and not 15 minutes later. It is reliable, because everyone using that definition would agree on whether or not the flight made it on time.
3.25 a. Systolic blood pressure is likely to differ due to all three causes. People are different, each person’s blood pressure changes over time, and there is measurement error.
b. Blood type is likely to differ due to natural variability across individuals. It doesn’t change across time, and it can be measured accurately.
c. Natural variability across individuals. Because they are all measured at the same time, and the measurement should be accurate, the other two sources are not involved.
d. Natural variability across individuals. Measurements are taken in a way that they cannot change across time, and the measurements should be accurate.
3.26 a. Natural variability across time and measurement error are both likely to cause variability in blood pressure measurements for the same person across days.
c. Natural variability across time, as the student’s watch may be off by different amounts on different days.
d. Natural variability across time.
3.27 Here is the quote from the article that explains it: “The prevalence of positive emotions and positive personality traits were assessed using the Positive Emotions facet of the Extraversion domain for the NEO personality inventory. Patients reported how often they experienced positive emotions such as the urge to jump for joy, intense joy, optimism, light-heartedness and the ability to laugh easily.”
3.28 a. Every 200 seconds they were asked to rate their level of sleepiness on a scale from 1 to 9, called the Karolinska Sleepiness Scale, where 1 = extremely alert and 9 = very sleepy, great effort to stay awake, fighting sleep.
b. It was an ordinal variable.
3.29 Here is how the story explains it: “To measure levels of depression, the researchers examined adolescents' answers to 11 questions about the previous week, such as how often they felt they couldn't shake off the blues, felt lonely or sad or got bothered by things that normally wouldn't faze them.” (From News Story 19 in the Appendix)
3.30 a. Question 1 asked the respondent's age in years, and would be easy to categorize. Question 11 asked about activities, hobbies and sports, and would be harder to categorize because there are so many possible responses.
b. Sports was specifically listed as a possible activity.
c. The wording did not seem to make much difference. About the same proportions answered yes and no with each wording.
3.31 a. Nominal
3.32 They listed 13 possible hangover symptoms and asked people to rate the percent of times they had experienced each one of them the morning after drinking, on a 5-point scale ranging from never (0%) to every time (100%).
3.33 a. It is likely to be valid because it measures severity of 13 of the most common hangover symptoms.
b. It is likely to be reliable because people would probably respond about the same way if you were to ask them to do it over again.
3.34 They measured gender, which is a categorical variable, and they measured the hangover symptoms scale, which is a measurement variable.
3.35 The women may have been less willing to admit that they had hangover symptoms than the men, because it isn’t as socially acceptable for women to get very drunk as it is for men, at least in some circles. Or, the men may have been less willing to admit that they had hangover symptoms, because men are supposed to be able to drink a lot and not be affected, at least in some circles. If this latter phenomenon occurred, it could have contributed to the finding that women are harder hit by hangovers.
NOTES ABOUT THE MINI-PROJECTS FOR CHAPTER 3
This project has two purposes. One is to illustrate that it is not easy to precisely measure even something as well defined as height. For instance, it should not be acceptable to ask people their heights, since they may not know or may not want to give you an accurate answer. The measurements, if taken properly, should be valid and would probably be fairly reliable. The second purpose of the project is to recognize that establishing a difference between men's and women's heights, which we all know exists, requires more measurements than might be suspected. With only five of each, the overlap is often sufficient that it would be less than convincing to an alien being that men are really taller. The project should address that concern in the context of the heights actually obtained. Sometimes it turns out that for the specific individuals chosen, the variability is low within each sex and the difference is apparent.
The purpose of this project is to demonstrate that a minor change in wording can result in a major change in responses. Make sure that the "unbiased" versions of the questions are truly neutral. The report should include an explanation of why the biased questions elicited the responses they did. For example, were they biased to produce socially acceptable answers, or to express a strong opinion in hopes that respondents would not disagree, or in some other way?
The main purpose of this project is to encourage students to think about how emotions are measured in studies. Most studies use rating scales based on questions about the emotion and its consequences. The validity of the scale depends on whether the questions asked really do measure that type of emotion. The reliability of the scale is determined by whether people would respond the same way if they took the test repeatedly. A scale may not be reliable if it’s highly dependent on how someone is feeling at the moment.