Order in Product Customization Decisions: Evidence from Field Experiments

Jonathan Levav

Mark Heitmann

Andreas Herrmann

Sheena S. Iyengar

Acknowledgements: We are grateful to Ray Fisman, Daniel Kahneman, Emir Kamenica, Dean Karlan, and Catherine Thomas for their comments on previous drafts and to Colin Camerer, Michael Riordan and Richard Thaler for helpful discussions. We are particularly grateful to Eric Johnson for facilitating our collaboration and for his helpful comments. We would also like to acknowledge the Center for Excellence in E-Business at Columbia Business School for its financial support.


A distinctive feature of the modern consumer world is the possibility of customizing a product to a consumer’s exact specifications. In economic models of demand for differentiated products, a product’s utility is derived from the summation of utilities for its individual attributes or characteristics. Embedded in this view is the assumption that the order in which attributes are considered in the customization process should not affect revealed preferences. In this paper we argue that order of attribute presentation can exert an important influence on what bundle of attributes a consumer purchases because considering alternative attribute levels is mentally depleting. Perhaps more importantly, we show that mental depletion has a particular pattern that firms can exploit to increase their revenues. We test the effect of order in one framed field experiment and two natural field experiments in the domains of custom-made suits and automobile purchases. We find that order affects the design of a suit that people configure and the design and price of a car that people purchase by influencing the likelihood that people will accept the default option suggested by the firm.

A distinctive feature of the modern consumer world is the possibility of customizing a product to a consumer’s exact specifications. In lieu of choosing between preset bundles of attributes or characteristics, consumers create their preferred bundle via a sequence of attribute decisions; each attribute might have different numbers of options (or “attribute levels”) to choose from. So, for instance, a homeowner who renovates her apartment engages in a complicated sequence of decision steps, with each step including dozens of options that she can select from (e.g., a typical US paint manufacturer offers 2000 different colors); a jogger who wants to customize her Nike shoes must decide between a dozen possible colors for her shoe lining, and then do the same for the swoosh, mid-sole and laces; Starbucks coffee customers can configure no fewer than 19,000 different drink combinations. Customization contributes to consumer welfare because it enables each consumer to select an attribute bundle that comes as close as possible to matching her preferences; greater variety within each attribute increases the likelihood that she will obtain exactly the option that maximizes her utility. One decision variable for firms that provide customizable products is how to order the product attributes in the configuration process. Does this order matter, even in cases where any attribute decision is reversible at any point in the configuration?

In economic models of demand for differentiated products, a product’s utility is derived from the summation of utilities for its individual attributes or characteristics (Kelvin Lancaster 1966; Daniel McFadden 1974; Sherwin Rosen 1974; Steven Berry, James Levinsohn, and Ariel Pakes 1995 or “BLP”). Differentiated product models have been a popular tool to estimate demand for a wide variety of goods, including housing, wines, automobiles and computers. Despite their many attractive properties, these models have been criticized for being improbable in certain markets or for their (sometimes unavoidable) econometric shortcomings. For instance, Rosen’s (1974) hedonic model assumes perfect competition and cannot account for unobserved product characteristics. In the BLP model the additive error term forces an implausible view of how consumers respond to increases in product offerings. A number of papers have presented variants of differentiated product models that are designed to overcome these constraints and increase the models’ usefulness for demand estimation (see, e.g., Dennis Epple 1987; Patrick Bajari and C. Lanier Benkard 2005; Berry and Pakes forthcoming).

In this paper we test the generalizability of another basic property that is embedded in differentiated product models. Namely, we test an implication of the models’ underlying principle that a product’s utility is constructed from the utility derived from each of the product’s attributes. Configuration decisions are particularly well-suited to test this principle because they require consumers literally to construct their preferred product piecemeal, attribute by attribute. If consumers are utility maximizing agents whose preferences are transitive and complete, then the order in which they consider the product attributes should be irrelevant to their preference for the product itself—any order should yield an equivalent “final” bundle. Any difference due to order (particularly where any attribute decision is reversible) highlights an important exception to the view that the utility from products is derived from the summed utilities of the product’s attributes.

We argue and demonstrate empirically in framed and natural field experiments (Glenn W. Harrison and John A. List 2004) involving financially consequential decisions that order of attribute presentation can exert an important influence on what bundle of attributes a consumer purchases because considering alternative attribute levels is mentally depleting. Perhaps more importantly, we characterize the pattern of mental depletion and show that it creates an opportunity that firms can exploit.

The basic experimental treatment we discuss below involves a product with multiple attributes that is configured by a consumer. Each attribute includes multiple options for the consumer to choose from; different attributes have different numbers of options. The configuration process is ordered either such that the attributes with a greater number of options come first in the sequence and are followed by the attributes with a smaller number of options or vice versa; this is our only experimental treatment.

Our premise is that in many cases the prospective utility from an option is assessed at the time of the decision (John W. Payne, James R. Bettman and Eric J. Johnson 1993; Paul Slovic 1995); options that elicit utility or are arousing beyond some minimum threshold level are more likely to be chosen. For instance, if a shopper is deciding which ceiling-fan to purchase for her home, she must assess the utility she will derive from each fan’s elegance, functionality, and features (Daniel Kahneman 1994); following this assessment she selects the fan that elicits sufficient arousal (i.e., the option that “satisfices”; Herbert Simon 1955).

We contend that the ability to detect arousal is a mental resource. And, just as people are boundedly rational, their capacity to evaluate options is limited, and the process of utility or arousal detection depletes what we call arousal detection capacity (or simply, arousal capacity). The idea that people might have a (limited) stock of arousal capacity is drawn from models of self-control in experimental psychology and economics. Experimental psychologists Mark Muraven and Roy Baumeister (2000) conceptualize self-control as a muscle: as self-control is exerted, one’s capacity for self-control in subsequent situations is diminished unless there is adequate rest. In a typical experiment a group of participants is asked to avoid temptation (e.g., a brownie) and then, in an ostensibly unrelated task, is presented with another self-control dilemma. Compared with a group that did not complete the first task, participants in the two-task group are less likely to exercise self-control in the second task (see, e.g., Baumeister, Ellen Bratslavsky, Muraven and Dianne Tice 1998). Emre Ozdenoren, Stephen Salant and Dan Silverman (2006) capture this phenomenon in an economic model that explains intertemporal choice anomalies by invoking the concept of willpower capacity; the stock of willpower is reduced when self-control is exercised (Ozdenoren et al. 2006). An important implication from their model is that when the stock of willpower is low, it decreases at a (weakly) faster rate. In other words, after exercising self-control, a subsequent encounter with a tempting option will require greater willpower resources than an identical earlier encounter. Similarly, we suggest that early decisions in a sequence affect subsequent decisions in a sequence because the early decisions deplete the stock of arousal capacity.

More importantly, arousal capacity depletion is not only a function of the decision sequence, but also of the number of options that decision-makers must evaluate at each stage of the sequence. People’s limited arousal capacity may increase the difficulty—and sometimes also reduce the likelihood—of evaluating increasingly large numbers of options. Such decision difficulty or “choice overload” has been documented in a number of experiments that examine the probability of choice given different numbers of options. A notable study conducted by Sheena Iyengar and Mark Lepper (2000) at an upscale grocery store in Menlo Park, California, presented shoppers with one of two displays of gourmet jams. Every hour the display alternated between 24 different jams or six different jams (representing a subset of the 24). Each shopper who approached the display was given a discount coupon. Coupon redemption (and purchase) rates were ten times greater for shoppers who had encountered the smaller subset of jams rather than the complete display. Similarly, a field study by Marianne Bertrand et al. (2006) offers evidence that loan take-up is significantly greater when a single loan is offered rather than multiple loans. Iyengar and Lepper (2000) argue that these experimental findings are explained by the fact that large choice sets sap decision-makers of their motivation to make a choice, even though the likelihood of matching their exact preference is greater when the choice set is larger. Instead of making what might be a more beneficial choice, decision makers forgo making the choice altogether.

We suggest that choice overload occurs because evaluating a larger variety of options depletes arousal capacity to the point that the evaluation requirements exceed people’s remaining arousal capacity. As a result, people tend to simplify subsequent choices because their ability to detect the arousal elicited by an option is hampered. The simplifying strategy that we focus on in our experiments is people’s likelihood of accepting the default alternative for a given decision in the sequence. In many situations defaults appear to exert a strong effect on people’s revealed preferences. For instance, Brigitte Madrian and Dennis Shea (2001) demonstrate that employees are much more likely to contribute to their retirement savings plan (401[k]) if enrollment in the plan is automatic. Similar results have been found for preferences for privacy and participation in e-mail lists online (Johnson, Steven Bellman and Gerald Lohse 2002) and, more dramatically, for participation in national organ donation programs (Johnson and Daniel Goldstein 2003; see also William Samuelson and Richard Zeckhauser 1988).

The sequential nature of product customization decisions potentially magnifies the depleting effects of a decision sequence and of variety. Our investigation focuses on the combined effect of these factors, and how they influence revealed preferences. Our thesis is that early decisions in a sequence affect subsequent decisions in the sequence because the early decisions deplete arousal capacity, but that this depletion effect depends upon whether or not the early decisions involve attributes that are high in variety or low in variety. Even though consumers should know to allocate their arousal capacity efficiently across the configuration process so that variety at different stages should not matter, we suggest that, as in Xavier Gabaix, David Laibson, Guillermo Moloche, and Stephen Weinberg’s (2006) directed cognition model, consumers are myopic in the sense that they behave as if the current decision is their last (despite the fact that in our experiments it is obvious that subsequent decisions will follow).

As mentioned earlier, our experimental treatment manipulates the configuration process such that the attributes with a greater number of options come first in the sequence and are followed by the attributes with a smaller number of options or vice versa. Normatively the sequence should not affect choices or willingness to pay; the same preference should be revealed irrespective of the sequence. If, however, choices are sensitive to the stock of arousal detection capacity, then each sequence should yield different revealed preferences because decision makers will be depleted at different parts of the sequence depending on the experimental condition. In particular, we predict that people who encounter high variety, depleting choices early in the sequence will evince a tendency to accept the default alternative in subsequent decisions even if these decisions involve relatively few options that would ordinarily require less capacity to evaluate. In contrast, those who begin the sequence with less complex decisions offering fewer options to choose from will evince little effect of depletion later in a sequence, even if these subsequent decisions are of the complex, high variety sort. This differential depletion pattern provides firms with an opportunity to extract higher revenues from their customers by manipulating the order in which firms present product attributes and the option that they select as the default alternative. We conduct our empirical tests of depletion and its consequences to revealed preferences in one framed and two natural field experiments involving real choices in financially consequential domains: custom-made men’s suits and Audi automobiles.

I. Suit Study


We recruited 73 MBA students at St. Gallen University in Switzerland under the aegis of a study about clothing taste in Switzerland and the United States. Participants were told that we would be raffling two business suits, custom made according to their specifications and taste by a well-known local tailor shop that the students were familiar with. In order to create a cover story to explain why we were offering this opportunity, participants were informed that a similar study was being conducted at a US business school. MBA students are an ideal participant pool for a study involving suits because at some point or another all of them purchase at least one suit for job interviews, summer internships, etc. Participants were told that they would be asked to design a suit, including a shirt and tie, and that they would have to contribute 75 Swiss Francs toward its cost (which was approximately 2000 Swiss Francs) in the event that they won the raffle. The fee, a substantial charge for the typical student participant, was included in order to ensure that participants would understand that their selections had a financial consequence.

Under the tailor’s close guidance we created a makeshift tailor shop in a laboratory space at the University. The tailor provided the shop’s seven booklets of swatches: suit fabric (100 options), suit lining fabric (5), shirt fabric (50), tie fabric (42), suit buttons (20), dress belts (8), and dress socks (20). Upon arrival in the lab participants were asked to complete a short survey in order to be provided “a standard set of recommendations” by the tailor. The survey asked participants to indicate their prospective use for the suit (multipurpose, business, private), whether and how often they intended to travel in their suit, and a subjective rating of their preference for a classic versus a modern look (on a 7-point scale with “rather modern” and “rather classic” as anchors). The survey was handed to an assistant who proceeded to compile its results.

Next participants were randomly assigned to one of two treatment conditions, Hi-to-Lo (n = 34) or Lo-to-Hi (n = 39). In the Hi-to-Lo condition participants were presented with the booklets beginning with the attribute that had the most options (suit fabric, 100 options) and ending with the attribute that had the fewest options (suit lining, 5 options) in descending order, while in the Lo-to-Hi condition the order was exactly the opposite, i.e., in ascending order. The final choice for all participants in both conditions was the sock category (20 options). Participants were presented with each booklet in succession. For each booklet one of the options was randomly chosen to be the tailor’s recommended option given the participant’s survey responses.1 This option was indicated by a small piece of poster board that was labeled “standard recommendation” and was attached to the item; the recommended option was considered the default option in our analysis. Participants’ choices were recorded by the experiment’s administrator. The dependent variable was whether or not the participant accepted the standard recommendation for each suit attribute.
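The two presentation orders described above can be sketched in a few lines of Python. This is only an illustration of the design: the option counts come from the text, while the condition labels `hi_to_lo` and `lo_to_hi` and the function name `build_sequence` are hypothetical names introduced here.

```python
# Option counts for the six manipulated suit attributes, as reported in the text.
OPTION_COUNTS = {
    "suit fabric": 100,
    "shirt fabric": 50,
    "tie fabric": 42,
    "suit buttons": 20,
    "dress belts": 8,
    "suit lining": 5,
}
# Socks (20 options) were presented last in both conditions.
FINAL_ATTRIBUTE = "dress socks"


def build_sequence(condition):
    """Return the attribute order for a condition label (hypothetical labels)."""
    if condition not in ("hi_to_lo", "lo_to_hi"):
        raise ValueError(f"unknown condition: {condition}")
    descending = condition == "hi_to_lo"
    ordered = sorted(OPTION_COUNTS, key=OPTION_COUNTS.get, reverse=descending)
    return ordered + [FINAL_ATTRIBUTE]


if __name__ == "__main__":
    for condition in ("hi_to_lo", "lo_to_hi"):
        print(condition, build_sequence(condition))
```

Because no two of the six manipulated attributes share an option count, the ascending sequence is exactly the reverse of the descending one, with socks appended to both.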

After participants completed the suit configuration process, they were asked to complete a self-reported satisfaction survey that asked them to rate (on a 1-7 scale) their satisfaction with the outfit that they had selected, how certain they were that their selections matched their preferences, their likelihood of making a similar selection in the future, and their satisfaction with the decision making process. Finally, we asked them to rate their knowledge about suits relative to an average peer.


The first six choices were estimated using the following logistic model form (socks were excluded and analyzed separately)2:

Pr(Default) = Stage Variety + Order + Knowledge

The Stage variable was an index that took on a value from 1 to 6 and denoted the attribute’s position in the decision sequence. It was included in order to control for any fatigue effects that were simply due to making choices. The Variety variable took on the number of options for that decision stage (e.g., for tie fabric there were 42 options, so Variety was 42) and was added because heterogeneity in tastes leads the likelihood of choosing the default option to decrease as variety increases. Our key variable of interest was the dummy variable Order (0 = Lo-to-Hi; 1 = Hi-to-Lo). A significant effect of Order would suggest a greater propensity to accept the default in one kind of sequence, and would serve as evidence that people’s choices are sensitive to attribute order. Finally, we included Knowledge because we assumed that more knowledgeable participants—even if self-proclaimed—would be less affected by our experimental treatment.
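As a rough illustration of how such a logistic specification turns the four predictors into a probability of default acceptance, consider the sketch below. It assumes the standard logistic link with an intercept; the coefficient values are placeholders chosen only to show plausible signs and are not the estimates reported in Table 1.

```python
import math


def default_probability(stage, variety, order, knowledge, coef):
    """Probability of accepting the default under a logistic model.

    `coef` maps term names to coefficients. The names and values are
    illustrative assumptions, not the paper's estimates.
    """
    linear = (
        coef["intercept"]
        + coef["stage"] * stage
        + coef["variety"] * variety
        + coef["order"] * order
        + coef["knowledge"] * knowledge
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Placeholder coefficients (hypothetical): later stages raise default taking,
# more variety and more knowledge lower it, and Hi-to-Lo order raises it.
EXAMPLE_COEF = {
    "intercept": -1.0,
    "stage": 0.2,
    "variety": -0.01,
    "order": 0.5,
    "knowledge": -0.1,
}

if __name__ == "__main__":
    # Ceteris paribus comparison: only the Order dummy differs.
    lo = default_probability(6, 5, 0, 4, EXAMPLE_COEF)
    hi = default_probability(6, 5, 1, 4, EXAMPLE_COEF)
    print(f"Order = 0: {lo:.3f}   Order = 1: {hi:.3f}")
```

The comparison holds Stage and Variety fixed while switching Order, which is what a regression coefficient measures; in the actual design a given attribute occupies different stages in the two conditions.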


Table 1 presents the results of the logistic regression analysis. Not surprisingly, both Stage and Variety exert significant and opposite effects on default acceptance: as participants proceeded through the decision sequence they were more likely to accept the recommended option and as variety increased they were less likely to accept the recommended option. Self-rated expertise was also negatively correlated with default acceptance. Most critically, the Order variable was significant; Figure 1 presents the pattern of default acceptance. It is apparent from the figure that, whereas the propensity to accept the default was roughly equal throughout the decision making process in the Lo-to-Hi condition, this propensity increased steeply as participants advanced from stage to stage in the Hi-to-Lo condition. In other words, participants’ revealed preference for their suit was determined by the order of attribute presentation. The results are consistent with the notion that differential arousal capacity depletion exerts a significant effect on people’s choices.

The results for the sock decision also hint at a depletion effect. Hi-to-Lo condition participants were much more likely to accept the default sock than their counterparts in the Lo-to-Hi condition (53% versus 36%, respectively). Unfortunately, due to sample size limitations this difference was not statistically significant at conventional alpha levels (χ² = 2.14, p = .14).
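The sock comparison can be checked with a Pearson chi-square test on a 2×2 table. The sketch below assumes cell counts inferred from the reported percentages and group sizes (53% of 34 is about 18 acceptances; 36% of 39 is about 14); the raw counts are not given in the text, so they are an assumption, as is the use of the uncorrected Pearson statistic.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Assumed counts, inferred from the reported percentages (not stated in the text).
HI_ACCEPT, HI_REJECT = 18, 16   # Hi-to-Lo, n = 34
LO_ACCEPT, LO_REJECT = 14, 25   # Lo-to-Hi, n = 39

if __name__ == "__main__":
    chi2 = chi_square_2x2(HI_ACCEPT, HI_REJECT, LO_ACCEPT, LO_REJECT)
    print(f"chi2 = {chi2:.2f}, p = {p_value_df1(chi2):.2f}")
```

Under these assumed counts the statistic and p-value round to the reported values of 2.14 and .14.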

The responses to the satisfaction questions were highly correlated with each other (Cronbach’s α = 0.84) so we created an overall satisfaction index. Participants were overwhelmingly more satisfied in the Lo-to-Hi condition than the Hi-to-Lo condition (5.0 versus 4.2, respectively, t(71) = 2.97, p = .002). This difference is important because customer satisfaction is related to stock price and financial performance measures such as ROI (Sunil Gupta and Valerie Zeithaml 2006), and firms spend considerable financial resources on surveying satisfaction and use it as an input to business policies.
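For readers unfamiliar with the reliability statistic used to justify the index, the following sketch computes Cronbach's alpha from item-level ratings using the standard formula. The ratings are made up for illustration (five hypothetical respondents, four items on a 1 to 7 scale) and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per survey item)."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent total score
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))


# Hypothetical ratings: each inner list holds one item's scores across respondents.
EXAMPLE_ITEMS = [
    [5, 4, 6, 3, 5],
    [5, 3, 6, 3, 4],
    [6, 4, 7, 2, 5],
    [4, 4, 6, 3, 5],
]

if __name__ == "__main__":
    print(f"alpha = {cronbach_alpha(EXAMPLE_ITEMS):.2f}")
```

When alpha is high, averaging the items into a single index loses little information, which is the rationale for the combined satisfaction measure.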

II. Car Study I

Next we elected to test our order effect on an actual purchase of a high-priced durable good: an automobile. This test also enables us to show how arousal capacity depletion can be exploited by firms to increase their revenue. The next two experiments that we conducted were natural field studies that included 900 car buyers (600 in the first study and 300 in the second) at Audi car dealerships in Frankfurt, Munich and Stuttgart, Germany. There are some small but important design differences between the two studies so we present them separately.

In Germany car customers configure their vehicle to their own specification and purchase it in advance of delivery; cars are typically not available for immediate purchase at the dealership. Customers either configure their car using Audi’s configuration software on the World Wide Web or use a catalog that is presented to them by the salesperson at the dealership. We restricted our sample to customers who were interested in purchasing an A4 model sedan and who had not configured their car previously online. The studies were conducted at a computer terminal using the same configuration software available to Audi customers who configure their car on the World Wide Web. Participants were not informed of the purpose of the study. Instead, they were told by the salesperson that Audi was testing the use of its configurator at its dealerships. In exchange for using the configurator, participants were given a free miniature toy car (approximate value $7). (Note that the salespeople were not aware of the purpose of this experiment.)

The configuration process includes a sequence of 67 decisions about attributes of the car, made one at a time, and takes approximately thirty minutes to complete. Each decision appears on a different screen, with a side-screen indicating the total price of the car to that point and all the features it includes. With each configuration decision the price is updated on the screen and at any point customers are free to revise their previous choices or scroll (click) forward. This is an important aspect because it means that all customers can have access to information about any attribute at any point in time. Each attribute consists of a different variety of options and different options have different prices; for instance, there are 56 interior colors and 13 types of wheel rims to choose from. At every screen there is a default option that is already checked-off by the manufacturer (e.g., the default engine is 1.6 liter five-speed manual transmission). For all attributes except exterior color the default is the cheapest option and appears at the top of the list (e.g., engine).3 We selected eight “target attributes” for the purposes of our experiment (number of options in parentheses): interior color (56), exterior color (26), engine and gearbox (25), wheel rims/tires (13), steering wheel (10), rearview mirror (6), interior décor style (4), and gearshift knob style (4).4 The target attributes were placed at the beginning of the configuration sequence.


Each customer-participant began by completing a short questionnaire where he (or she) was asked to state his willingness to pay for a new A4 sedan. This question was designed to make the customer’s budget constraint salient. Next the participant was asked a series of self-rated knowledge questions including whether he had ever owned an Audi, had ever driven an Audi, and felt knowledgeable about Audi cars (all on 1 to 7 scales). Finally, in the last phase of the pre-configuration survey participants were asked to rate the importance of each of the eight target attributes using a constant sum scale in which they allocated 100 points across the eight attributes according to subjective importance. The software forced participants to allocate all 100 points but allowed for ties and for zeros. The purpose of this survey item was to test whether self-reported importance exerted any effect on customers’ choices in our study.
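The constant sum scale described above amounts to a simple validity rule on each response. A minimal sketch follows, assuming that points are entered as whole numbers (the text does not say) and using a hypothetical function name, `validate_allocation`.

```python
# The eight target attributes named in the study description.
TARGET_ATTRIBUTES = [
    "interior color",
    "exterior color",
    "engine and gearbox",
    "wheel rims/tires",
    "steering wheel",
    "rearview mirror",
    "interior décor style",
    "gearshift knob style",
]
TOTAL_POINTS = 100


def validate_allocation(points):
    """Check a constant-sum response given as {attribute: points}.

    Valid responses cover exactly the eight attributes with non-negative
    whole numbers summing to 100. Ties and zeros are allowed.
    """
    if set(points) != set(TARGET_ATTRIBUTES):
        return False
    if any(not isinstance(v, int) or v < 0 for v in points.values()):
        return False
    return sum(points.values()) == TOTAL_POINTS


if __name__ == "__main__":
    response = dict(zip(TARGET_ATTRIBUTES, [25, 25, 10, 10, 10, 10, 5, 5]))
    print(validate_allocation(response))
```

Forcing the full 100 points to be spent makes the importance ratings comparable across participants, since each rating is a share of the same total.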

Next participants were randomly assigned by the configurator software to one of three groups. As in the Suit Study, we varied the order in which participants made their decisions regarding the (eight) target attributes. In the Hi-to-Lo group (n = 150) participants were presented with a sequence that was sorted by descending variety such that the attribute that had the most options (interior color, 56) appeared first and the attribute with the fewest options (gearshift style, 4) appeared eighth. The Lo-to-Hi group (n = 150) was presented with the exact opposite, ascending sequence (i.e., gearshift style was first and interior color eighth). Control condition participants (n = 300) were presented with a randomly determined sequence of the eight attributes. The remainder of the configuration was identical for all participants. Our dependent variable was whether or not the customer-participant accepted the default option at each of the eight stages of the decision sequence. Thus, each customer provided eight observations. In addition to recording their selection, the software also recorded the price of the chosen option as well as the time taken to make the choice. It also recorded the total price paid for the car.

At the end of the configuration process participants were asked to indicate their satisfaction with the configuration software and the car, their likelihood of configuring the same car again, and the extent to which the car they configured matched their preferences (all measured on 1 to 7 scales). Having completed the configuration process, participants proceeded to complete the paperwork necessary to complete the purchase of their configured car.5


The eight manipulated choices were analyzed using a logistic regression of the following form:

Pr(Default) = Stage Variety + Order + Importance + Importance*StageKnowledge

Stage and Variety, as in the Suit Study, were expected to have significant and opposite effects on default taking. The Order dummy (0 = Lo-to-Hi; 1 = Hi-to-Lo) was the critical variable; a significant parameter would indicate that our sequence manipulation affected participants’ choices as predicted, even after controlling for Stage and Variety. We added the Importance parameter in order to ascertain whether participants resist the tendency to accept the default when they consider the attribute to be important and the interaction term in order to test whether the effect of importance was uniform throughout the sequence. Finally, we control for self-rated expertise with the Knowledge variable.
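To see why the interaction term tests whether the effect of importance is uniform across the sequence, it can help to write the model on the log-odds scale. The coefficient symbols below are labels introduced here for illustration, assuming a standard logistic specification with an intercept.

```latex
\begin{aligned}
\log\frac{\Pr(\text{Default})}{1-\Pr(\text{Default})}
  &= \beta_0 + \beta_1\,\text{Stage} + \beta_2\,\text{Variety} + \beta_3\,\text{Order}
   + \beta_4\,\text{Importance} + \beta_5\,(\text{Importance}\times\text{Stage})
   + \beta_6\,\text{Knowledge},\\[4pt]
\frac{\partial}{\partial\,\text{Importance}}
\log\frac{\Pr(\text{Default})}{1-\Pr(\text{Default})}
  &= \beta_4 + \beta_5\,\text{Stage}.
\end{aligned}
```

If the interaction coefficient is zero, importance shifts the log-odds of default acceptance by the same amount at every stage. A negative main effect combined with a positive interaction coefficient would instead mean that importance protects against default taking early in the sequence and that this protection weakens at later stages.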


Table 2 presents the key summary statistics and Table 3 presents the results of our logistic regression analysis for this study. First, we find a significant effect of Stage: customers were more likely to accept the default offering as they advanced through the decision sequence. Second, as one would expect from simple taste heterogeneity, we find the intuitive result that as variety increases, the likelihood of accepting the default decreases significantly. Third, greater (self-rated) attribute importance is associated with a decrease in default choice for that attribute. However, this relationship is qualified by an Importance × Stage interaction, such that the effect of importance on the probability of accepting the default diminishes as participants advance in the choice sequence. It appears that the choice process overwhelms participants’ ability or desire to best match their preferences even on the attributes that they reportedly consider to be the most important.

Most relevant to this investigation, we find a significant effect of Order: where participants began the choice sequence with the highest variety attributes, i.e., interior color, they were more likely to later accept the default for the lower variety attributes compared with the condition where they began with the lowest variety attributes, i.e., gearshift style (see Figure 2). Note that a very subtle change in attribute order gives rise to a significant change in revealed preferences and subsequent real purchases, even where attribute information is equally available to all participants at all times. Such a result in a high stakes decision is surprising given the assumption of complete preferences and suggests that people often evaluate options at the time of decision using a limited mental resource.

We also obtained data on how long it took each participant to make his or her choice of options at each stage, as well as the duration of the overall configuration process. The total configuration completion times were within 3 seconds of each other in all conditions, and stood at just under 30 minutes (see Table 2). The timing data for the eight manipulated attributes are plotted in Figure 3. Note that the pattern of the times tracks the choice pattern. If time is taken as a proxy for decision effort, then it is apparent that participants who took the default actually invested more effort than their counterparts who rejected the default in favor of a different option. That is, default acceptance was unlikely to be due to simple laziness.

An analysis of the total price paid for the automobile demonstrates the financial consequences of our experimental manipulation in this study. Table 2 shows the prices paid for each of the eight target attributes and for the car overall in each condition. Since the order manipulation affected the features of the vehicle that participants chose and where participants accepted the default, it also affected its price: participants in the Hi-to-Lo condition paid 1482.37 Euros more than in the Lo-to-Hi condition, a statistically significant difference (t(298) = 3.18, p-value < 0.01). This difference indicates that the effects of our subtle order manipulation—recall that order was altered for only eight of the automobile’s 67 configurable attributes—were of significant financial consequence both to our respondents and to the manufacturer. The results suggest that the firm can increase its revenues with a relatively costless manipulation of its configuration process that consists of strategically altering the order of the configuration as well as the default option for certain attributes (more on this in the next experiment).

Finally, due to the high correlation between the satisfaction measures, we combined them to form a satisfaction index (Cronbach’s α = 0.92). Replicating the Suit Study, participants reported greater satisfaction in the Lo-to-Hi condition than the Hi-to-Lo condition (t(298) = 5.12, p < .0001). It is noteworthy that there was no statistically significant correlation between self-reported satisfaction and purchase price (r = 0.01, n.s.).
