Most of the economists in this group use experiments as part of their empirical research. In this post I will highlight some recently published work that is in the tradition of Vernon Smith, who influenced all of us so much.
Abstract: We study minimal conditions for competitive behavior with few agents. We adapt a price-quantity strategic market game to the indivisible commodity environment commonly used in double auction experiments, and show that all Nash equilibrium outcomes with active trading are competitive if and only if there are at least two buyers and two sellers willing to trade at every competitive price. Unlike previous formulations, this condition can be verified directly by checking the set of competitive equilibria. In laboratory experiments, the condition we provide turns out to be enough to induce competitive results, and the Nash equilibrium appears to be a good approximation for market outcomes. Subjects, although possessing limited information, are able to act as if complete information were available in the market.
This small excerpt from their results shows a market converging toward equilibrium over time, under different treatment conditions. With some opportunities for practice and feedback, agents create surplus value by trading.
Figure 4 plots the average efficiency in each round in the four treatments. Efficiency is defined as the percentage of the maximum social surplus realized. … learning takes longer under the clearing house institution; hence, average efficiency under the clearing house institution presents a stronger upward trend over time. Under the clearing house institution, the average efficiencies start at levels lower than under the double auction institution, and remain statistically lower in the second half of the experiment. Nevertheless, we can observe from Fig. 4 that the upward trend of the efficiencies in clearing house treatments persists over time, and at the end of the experiment, the efficiency levels from the two institutions are close.
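The efficiency measure quoted above is simple enough to sketch in code. Here is a minimal Python illustration with invented buyer values and seller costs; the experiment's actual parameters differ:

```python
def max_surplus(values, costs):
    """Maximum social surplus: match the highest-value buyers with the
    lowest-cost sellers for every trade that creates positive surplus."""
    surplus = 0
    for v, c in zip(sorted(values, reverse=True), sorted(costs)):
        if v > c:
            surplus += v - c
    return surplus

def efficiency(trades, values, costs):
    """Percentage of the maximum surplus captured by the observed trades.
    `trades` is a list of (buyer_value, seller_cost) pairs."""
    realized = sum(v - c for v, c in trades)
    return 100 * realized / max_surplus(values, costs)

values = [10, 9, 8, 4]   # buyers' valuations, one unit each (made up)
costs = [2, 3, 5, 9]     # sellers' costs (made up)

# If only the two highest-value buyers trade with the two lowest-cost
# sellers, the market captures 14 of the 17 possible units of surplus.
print(efficiency([(10, 2), (9, 3)], values, costs))
```

The upward trend in Figure 4 corresponds to this percentage rising toward 100 as subjects learn which trades to make.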
Grosch, Kerstin, Simone Haeckl, and Martin G. Kocher. “Closing the gender STEM gap: A large-scale randomized-controlled trial in elementary schools.” (2022).
These authors were thinking about the same problem at the same time, unbeknownst to me. In their introduction they write, “We currently know surprisingly little about why women still remain underrepresented in STEM fields and which interventions might work to close the gender STEM gap.”
My conclusion from my paper is that, by college age, subjective attitudes toward tech are very important. This leads to the question of whether those subjective attitudes are shaped at younger ages. Grosch et al. ran an experiment targeting 3rd-graders with a STEM-themed game. I’ll quote their description:
The treatment web application (treatment app) intends to increase interest in STEM directly by increasing knowledge and awareness about STEM professions and indirectly by addressing the underlying behavioral mechanisms that could interfere with the development of interest in STEM. The treatment app presents both fictitious and real STEM professionals, such as engineers and programmers, on fantasy planets. Accompanied by the professionals, the children playfully learn more about various societal challenges, such as threats from climate change and to public health, and how STEM skills can contribute to combating them. The storyline of the app comprises exercises, videos, and texts. The app also informs children about STEM-related content in general. To address the behavioral mechanisms, the app uses tutorials, exercises, and (non-monetary) rewards that teach children a growth mindset and improve their self-confidence and competitive aptitude. Moreover, the app introduces female STEM role models to overcome stereotypical beliefs. To test the app’s effect, we recruited 39 elementary schools in Vienna (an urban area) and Upper Austria (a predominantly rural area).
This is a preview of their results, although I recommend reading their paper to understand how these measurements were made:
Girls’ STEM confidence increases significantly in the treatment group (difference: 0.047 points or 0.28 standard deviations, p = 0.002, Wald test), and the effect for girls is significantly larger than the effect for boys.
Result 2: Children’s competitiveness is positively associated with children’s interest in STEM. We do not find evidence that stereotypical thinking and a growth mindset is associated with STEM interest.
Lastly, my kids play STEM-themed tablet games. PBS Kids has a great suite of games that are free and educational. Unfortunately, I have not tried to treat one kid while giving the other kid a placebo app, so my ability to do causal inference is limited.
This is the second of two blog posts on my paper “Willingness to be Paid: Who Trains for Tech Jobs”. Follow this link to download the paper from Labour Economics (free until November 27, 2022).
Women did not reject a short-term computer programming job at a higher rate than men.
For the incentivized portions of the experiment, women had the same reservation wage to program. Women also seemed equally confident in their ability after a belief elicitation.
The main gender-related outcomes were, surprisingly, null results. I ran the experiment three times with slightly different subject pools.
However, I did find that women might be less likely to pursue programming outside of the experiment based on their self-reported survey answers. Women are more likely to say they are “not confident” and more likely to say that they expect harassment in a tech career.
In all three experiments, the attribute that best predicted whether someone would program was whether they said they enjoy programming. This subjective attitude appears more important even than having previously taken classes.
Beyond “enjoying programming” or “liking math,” subjects with a high opportunity cost of time were less willing to return to the experiment to do programming at a given wage level.
I wrote this paper partly to understand why more people are not attracted to the tech sector, where wages are high. This recent tweet indicates that, although perhaps more young people are training for tech than ever before, the market price for labor is still quite high.
The tech wages are too damn high.
— Antonio García Martínez (agm.eth) (@antoniogm) October 7, 2022
The neat thing about controlled experiments is that you can randomly assign treatment conditions to subjects. This post is about what happened after adding either extra information or providing encouragement to some subjects.
Informed by reading the policy literature, I assumed that a lack of confidence was a barrier to pursuing tech. A large study done by Google in 2013 suggested that women who major in computer science were influenced by encouragement.
I provided an encouraging message to two treatment groups. The long version of this encouraging message was:
If you have never done computer programming before, don’t worry. Other students with no experience have been able to complete the training and pass the quiz.
Not only did this not have a significant positive effect on willingness to program, but there is some indication that it made subjects less confident and less willing to program. For example, in the “High Stakes” experiment, the reservation wage for subjects who had seen the encouraging message was $13 more than for the control subjects.
My experiment does not prove that encouragement never matters, of course. Most people think that a certain type of encouragement nudges behavior. My results could serve as a cautionary tale for policy makers who would like to scale up encouragement. John List’s latest book The Voltage Effect discusses the difficulty of delivering effective interventions at scale.
The other randomly assigned intervention was extra information, called INFO. Subjects in the INFO treatment saw a sample programming quiz question. Instead of just knowing that they would be doing “computer programming,” they saw some chunks of R code with an explanation. In theory, someone who is not familiar with computer programming could be reassured by this excerpt. My results show that INFO did not affect behavior. Today, most people already know what programming is. About half of subjects said that they had already taken a class that taught programming. Perhaps, if there are opportunities for educating young adults, they lie in career paths rather than just the technical basics.
Since the differences between treatments turned out to be negligible, I pooled all of my data (686 subjects total) for certain types of analysis. In the graph below, I group every subject as either someone who accepted the programming follow-up job or as someone who refused to return to program at any wage. Recall that the highest wage level I offered was considerably higher on a per-hour basis than what I expect their outside earning option to be.
Fig. 5. Characteristics of subjects who do not ask for a follow-up invitation, pooling all treatments and samples
I’ll discuss the three features in this graph in what appears to be their order of importance for predicting whether someone wants to program. There was an enormous difference in the percentage of people who were willing to return for an easy, tedious task that I call Counting. By inviting all of these subjects to return to count at the same hourly rate as the programming job, I got a rough measure of their opportunity cost of time. Someone with a high opportunity cost of time is less likely to take me up on the programming job. This might seem very predictable, but it is a large part of the reason why more Americans are not going into tech.
Considering the first batch of 310 subjects, I have a very clean comparison between the programming reservation wage and the reservation wage for counting. People who do not enjoy programming require a higher payment to program than they do to return for the counting job. Self-reported enjoyment is a very significant factor. The orange bar in the graph shows that the majority of people who accepted the programming job say that they enjoy programming.
Lastly, the blue bar shows the percent of female subjects in each group. The gender split is nearly the same. As I show several ways in the paper, there is a surprising lack of a gender gap for incentivized decisions.
I hope that my experiment will inspire more work in this area. Experiments are neat because someone could try to replicate this one with a different group of subjects or with a change to the design. Interesting gaps could open up between subject types under new circumstances.
The topic of skill problems in the US represents something reasonably new for labor market and public policy discussions. It is difficult to think of a labor market issue where academic research or even research using standard academic techniques has played such a small role, where parties with a material interest in the outcomes have so dominated the discussion, where the quality of evidence and discussion has been so poor, and where the stakes are potentially so large.
Cappelli, P.H., 2015. Skill gaps, skill shortages, and skill mismatches: evidence and arguments for the United States. ILR Rev. 68 (2), 251–290.
I am pleased to announce that my paper “Willingness to be Paid: Who Trains for Tech Jobs?” has been accepted at Labour Economics.
Having a larger high-skill workforce increases productivity, so it is useful to understand how workers self-select into high-paying technology (tech) jobs. This study examines how workers decide whether or not to pursue tech, through an experiment in which subjects are offered a short programming job. I will highlight some results on gender and preferences in this post.
Most of the subjects in the experiment are college students. They started by filling out a survey that took less than 15 minutes. They could indicate whether they would like an invitation to return to do computer programming.
Subjects indicate whether they would like an invitation to return to do a one-hour computer programming job for $15, $25, $35, …, or $85.[1] This is presented as 9 discrete options, such as:
“I would like an invitation to do the programming task if I will be paid $15, $25, $35, $45, $55, $65, $75 or $85.”,
or,
“I would like an invitation to do the programming task if I will be paid $85. If I draw a $15, $25, $35, $45, $55, $65 or $75 then I will not receive an invitation.”,
and the last choice is
“I would not like to receive an invitation for the programming task.”
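Mechanically, this discrete elicitation works like a lottery over wages: the subject states the lowest wage at which they want an invitation, one wage is drawn at random, and the invitation goes out only if the drawn wage clears that threshold. Here is a minimal Python sketch; the uniform draw and the `None`-for-refusal convention are my assumptions for illustration, not the paper’s exact procedure:

```python
import random

WAGES = [15, 25, 35, 45, 55, 65, 75, 85]  # the eight possible wage levels

def invitation(reservation_wage, drawn_wage):
    """Subject receives an invitation only if the randomly drawn wage is at
    or above the lowest wage they said they would accept. A reservation wage
    of None stands for the ninth option: 'do not invite me at any wage.'"""
    if reservation_wage is None:
        return False
    return drawn_wage >= reservation_wage

# Example: a subject who will program for $35 or more.
drawn = random.choice(WAGES)  # assumed uniform draw over the wage levels
print(invitation(35, drawn))  # True only when the drawn wage is $35 or higher
```

Because each threshold choice is a single row on the form, the subject’s answers trace out one point on the labor supply curve at every wage level.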
Ex-ante, would you expect a gender gap in the results? In 2021, there was only 1 female employee working in a tech role at Google for every 3 male tech employees. Many technical or IT roles exhibit a gender gap.
To find a gender gap in this experiment would mean that female subjects reject the programming follow-up job, or at least have a different reservation wage. In economics, the reservation wage is the lowest wage at which a worker is willing to accept a job. I might have observed that women were willing to program but would reject the low wage levels. If that had occurred, then the implication would be that more men are available to do the programming job at any given wage level.
However, the male and female participants behaved in very similar ways. There was no significant difference in reservation wages or in the choice to reject the follow-up invitation to program. The average reservation wage for the initial experiment was very close to $25 for both males and females. A small number of male subjects said they did not want to be invited back at even the highest wage level. In the initial experiment, 5% of males and 6% of females refused the programming job.
The experiment was run in 3 different ways, partly to test the robustness of this (lack of) gender effect. About 100 more subjects were recruited online through Prolific to observe a non-traditional subject pool. Details are in the paper.
Ex-ante, given the obvious gender gap in tech companies, there were several reasons to expect a gender gap in the experiment, even on a college campus. Ex-post, readers might decide that I left something out of the design that would have generated a gender gap. This experiment involves a short-term individual task. Maybe the team culture or the length of the commitment is what deters women from tech jobs. I hope that my experiment is a template that researchers can build on. Maybe even a small change in the format would cause us to observe a gender gap. If that can be established, then that would be a major contribution to an important puzzle.
For the decisions that involved financial incentives, I observed no significant gender gaps in the study. However, subjects answered other questions and there are gender gaps for some of the self-reported answers. It was much more likely that women would answer “Yes” to the question
If you were to take a job in a tech field, do you expect that you would face discrimination or harassment?
Women said they were less confident when simply asked whether they are “confident”. However, when I ran an incentivized belief elicitation about performance on a programming quiz, women appeared quite similar to men.
Since wages are high for tech jobs, why aren’t more people pursuing them? The answer to that question is complex. It does not all boil down to subjective preferences for technical tasks; however, in my results, enjoyment is one of the few variables that was significant.
People who say they enjoy programming are significantly more likely to do it at any given wage level, in this experiment.
Fig. 3 Histogram of reservation wage for programming job, by reported enjoyment of computer programming (CP) and gender, pooling all treatments and samples
Figure 3 from the paper shows the reservation wages of participants from all three waves. Subjects who say that they enjoy programming usually pick a reservation wage at or near the lowest possible level. This pattern is quite similar for males and females.
Interestingly, enjoyment mattered more than some of the other factors that I thought would predict willingness to participate. About half of subjects said they had taken a class that taught them some coding, but that factor did not predict their behavior in the experiment. Enjoyment, or subjective preference, seemed to matter more than training. To my knowledge, policy makers talk a lot about training and very little about these subjective factors. I hope my experiment helps us understand what is happening when people self-select into tech. Later, I will write another blog post about the treatment manipulations and results, and perhaps I will have the official link to the article by then.
Buchanan, Joy. “Willingness to be Paid: Who Trains for Tech Jobs.” Labour Economics.
[1] We use a quasi-BDM to obtain a view of the labor supply curve at many different wages. The data is not as granular as what a traditional Becker-DeGroot-Marschak (BDM) mechanism obtains, but the procedure is easy for subjects to understand. The BDM, while theoretically appropriate for this purpose, has come under suspicion for being difficult for inexperienced subjects to understand (Cason and Plott, 2014). We follow Bartling et al. (2015) and use a discrete version.
You can download my full paper “If Wages Fell During a Recession” with Dan Houser from the Journal of Economic Behavior and Organization (only free until September 24, 2022).
There is a simulated recession in our experiment. We ask what happens if employers cut wages in response. Although nominal wage cuts are rare in the outside world, some of our lab subjects cut the wages of their “employee”. Employees retaliated against nominal wage cuts by shirking, such that the employers probably would have been better off keeping wages rigid.
We also tried the same thing with an inflation shock that allowed the employer to institute a real wage cut without a nominal wage cut. The reaction to that real wage cut was muted compared to the retaliation against the obvious nominal wage cut.
Inflation was implemented after 3 rounds of the same wage to create a reference point.
The Great Recession happened when I was an undergraduate. As I started my career in research, the issue of employment and recessions seemed like THE problem to work on. The economy of 2022 is so different from the years that inspired this experiment! Below I’ll highlight current events and work from others on this topic.
Inflation used to be something Americans could almost ignore, and now it’s at the highest level I have seen in my lifetime. Suddenly, people are so mad about inflation that politicians named their bill the Inflation Reduction Act just to make it popular.
The EWED crew has made lots of good posts on inflation. Although job openings and (nominal) wage increases are noticeable right now, Jeremy explored whether inflation has wiped out apparent wage growth.
More recently, the WSJ reports that real wages are down because inflation is so high. “Wage gains haven’t kept pace with inflation. Private-sector wages and salaries declined 3.1% in the second quarter from a year earlier, when accounting for inflation.”
Firms in 2022 did not just sit back and let real wages get eroded exactly proportional to inflation. But it is also not the case that Americans got a raise of 9% to exactly offset inflation. According to our experiment, there would be outrage if workers were experiencing a nominal wage cut in proportion to the real wage cut they are getting right now.
The high inflation combined with a hot job market makes this current economy hard to compare to anything in our recent history. Brian at Price Theory explained that inflation pressure is coming from both supply and demand factors.
In one sense, it seems like advice does not work. Advice is often ignored and sometimes even resented. People are going to just do what they want.
And yet, many people were in fact influenced by advice at some point in some situation. Many people can tell you about a mentor they spoke with or a book they read. Somehow, we do indeed need to learn about our environments and make choices about career and health and relationships. So, advice does work, sometimes.
A trivial example is why I stopped putting sugar in my coffee. A random anonymous message board post said that you should stop putting sugar in your coffee and your taste will adjust. “You won’t even miss it,” the anonymous poster told me. From that day forward, I stopped putting sugar in my coffee. I’m healthier and I don’t miss it. I was “nudged”. I was also predisposed to make this healthy decision, and I had sought out advice.
We might overestimate the effectiveness of advice because when people bother to talk about it, they mention the one time it affected them. First, they fail to mention the thousands of messages that had no effect (personally I still eat all kinds of junk food that contain sugar despite getting warnings to stop). And secondly, some decisions (perhaps including my coffee-sugar example) would have been made eventually without the advice event. Even recognizing those limitations, I still believe that messaging works sometimes.
It is tempting to think that, at almost zero cost, you could nudge people into making different decisions, just by sending them messages. There is a growing literature on this topic. Economists like myself are collecting data on whether it works.
One of these papers was just published:
Halim, Daniel, Elizabeth T. Powers, and Rebecca Thornton. 2022. “Gender Differences in Economics Course-Taking and Majoring: Findings from an RCT.” AEA Papers and Proceedings, 112: 597-602.
We implemented an RCT among undergraduate students enrolled in large introductory economics courses at the University of Illinois at Urbana Champaign. Two treatment arms provided encouragement to major in economics. A “prosocial” treatment provided information emphasizing the wide variety of career options and personal benefits associated with the major, while an “earnings” treatment provided information on financial returns. We evaluate the effects of the two treatments on subsequent choices to take another economics course and declaration of the economics major by the end of the student’s junior year using student-level matched administrative data. … Our primary aim is to evaluate whether women can be “nudged” into a major with low-cost, theoretically grounded, encouragement/information interventions.
Our primary sample consists of 1,976 students who were freshmen or sophomores during the focal course.
We find that the average male student receiving either treatment is more likely to take at least one more economics course after the focal course, but there is little evidence of increased majoring. The average woman appears unresponsive to either treatment.
Treated women with better-than-expected focal-course performance are nudged to take an additional economics course. The likelihood that a woman takes another course in response to treatment increases by 5.6-5.9 percentage points with a favorable one-third-grade “surprise”. The hypothesis of treatment effects on women’s majoring, mediated or not, is rejected. Men’s susceptibility to treatment is invariant with respect to focal-course performance.
Women did not demonstrate a bias towards a pro-social framing, and men did not demonstrate a bias towards a pro-earnings framing.
The pile of null results for messaging, when it is randomly assigned, is growing. It’s good to see null results get published though.
One of my current projects is related, but with a focus on computer programming instead of majoring in economics.
“How Dictators Use Information about Recipients” is my new project with Laura Razzolini. A working paper is up at SSRN. We use the Dictator Game to measure whether people are generous toward others who made a similar choice.
In the first stage of the experiment, every player gets to make their own choice about whether or not to invest in a risky option (called Option B). Players can pick Option A if they do not want to invest.
In the second stage, participants get to decide if they will send any money to another anonymous player. If a “dictator” (the person who determines the final allocation of money) decided to take the risk on Option B in stage 1, would they be more generous toward a counterpart if they know that person also picked Option B?
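As a sketch, the two-stage design can be laid out as follows. The endowment and the 50/50 double-or-halve gamble are invented for illustration; the paper’s actual stakes and probabilities differ:

```python
import random

def stage_one(invest_in_B: bool, endowment: int = 10) -> int:
    """Stage 1: keep the safe Option A, or invest in the risky Option B.
    The coin-flip payoff here is an assumption, not the paper's parameters."""
    if not invest_in_B:
        return endowment  # Option A: keep the endowment for sure
    # Option B: the gamble doubles the stake or cuts it in half
    return endowment * 2 if random.random() < 0.5 else endowment // 2

def stage_two(dictator_pie: int, amount_sent: int) -> tuple:
    """Stage 2: the dictator splits the pie with an anonymous recipient.
    Returns (dictator's payoff, recipient's payoff)."""
    assert 0 <= amount_sent <= dictator_pie
    return dictator_pie - amount_sent, amount_sent

# The treatments vary what the dictator learns about the recipient (their
# stage-1 choice, their luck) before choosing `amount_sent`.
```

The design question is whether `amount_sent` rises when the dictator is told the recipient made the same stage-1 choice.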
We explain in our paper why the literature indicates such a form of favoritism could be expected.
Social identity theory is the psychological basis for intergroup discrimination. Economic experiments have created feelings of group identity in various ways, leading to significant effects on behavior. Chen & Li (2009) demonstrate that group identity formation can affect social preferences.
Chen and Li (2009) started by having subjects review paintings by two different modern artists. The subjects were divided into two groups, based on their reported painting preferences. Subjects were informed about their group membership by the experimenter.
The Chen and Li paper has been cited almost 2000 times. Group identity is a topic of interest. Several experimental papers demonstrate that strangers can have team feelings induced quickly with the right procedures. Those team loyalties affect behavior in incentivized tasks.
Group feelings artificially induced in the lab by Eckel & Grossman (2005) influence levels of cooperation and contributions to public goods. Pan & Houser (2013) induce group identities by asking subjects to complete tasks in groups. Pan & Houser (2019) found that investors trust in-group members more. The in-group has been induced in several different ways in lab experiments. In this paper, we investigate whether in-group effects arise from making a common financial decision in the first stage of the experiment.
Do you think our manipulation in the beginning affected giving?
Nope. There was no effect. Dictators who chose Option B did not give more to recipients who also chose Option B.
Not every result in the paper is a null result. One piece of information caused a large increase in giving. If we inform the dictator that their counterpart started with less money in the first stage (due to bad luck) then the dictator would give more. Sympathy was inspired, as we predicted, by knowing if a recipient was “poor” in the experiment. Conversely, if dictators are informed that their counterpart is “rich” then they excused themselves from having to give up money to help.
Information about financial choices, at least in our sterile, simple environment, neither polarized nor united the participants. Giving with only choice information was higher than giving to the “rich” but lower than giving to the “poor”. Lastly, we provided all of the information at once. With full information, dictators were still heavily influenced by the starting endowments, and choice information had no effect.
Understanding polarization is important. Humans exhibit tribal instincts to not help those who are perceived as different. In our experiment we seem to have found one difference that people are willing to tolerate or overlook.
Chen, Yan, and Sherry Xin Li. “Group Identity and Social Preferences.” American Economic Review 99, no. 1 (March 2009): 431–57.
Eckel, Catherine C., and Philip J. Grossman. “Managing Diversity by Creating Team Identity.” Journal of Economic Behavior & Organization 58, no. 3 (2005): 371–92.
Pan, Xiaofei, and Daniel Houser. “Why Trust Out-Groups? The Role of Punishment under Uncertainty.” Journal of Economic Behavior & Organization 158 (2019): 236–54.
Pan, Xiaofei Sophia, and Daniel Houser. “Cooperation during Cultural Group Formation Promotes Trust towards Members of Out-Groups.” Proceedings of the Royal Society B: Biological Sciences 280, no. 1762 (July 7, 2013): 20130606.
Josh Hendrickson and Brian Albrecht have a Substack called Economic Forces that is a source of economics news and examples. We have linked to EF before at EWED.
Albrecht just published an op-ed titled “Behavioral Economics Is Fine. Just Keep It Away from Our Kids”. I’ll respond to this, just as I responded to that other blog. I think the group of people pitting themselves against “behavioral economics” is small. They might even think of themselves as a minority embattled against the mainstream. So, why bother responding? That’s what blogs are good for.
I agree with Albrecht’s main point. The first thing an undergraduate should learn in economics classes is the classic theory of supply and demand. Even in its simplest form, the idea that demand curves slope down and supply curves slope up is powerful and important.*
Albrecht points out that there are some results that have been published in the behavioral economics literature that turned out not to replicate or, in the recent case of Dan Ariely, might be fraudulent. Then he makes a jump from there by calling the behavioral field of inquiry a “fad”. That’s not accurate. (See Scott Alexander on Ariely and related complaints.)
In his op-ed, Albrecht names the asset bubble as a faddish behavioral idea. Vernon Smith (with Suchanek and Williams) published “Bubbles, Crashes and Endogenous Expectations in Experimental Spot Asset Markets” in Econometrica in 1988. Bubbles have been replicated all around the world many times. There is no doubt in anyone’s mind that the “dot com” bubble had an element of speculation that became irrational at a certain point. This is not a niche topic or a very rare occurrence. Bubbles are observed in the lab and out in the naturally occurring economy.
Should we start undergrads on bubbles before explaining the normal function of capital markets? No. Lots of people think that stock markets generally work well, communicate reliable information, and should be allowed to function with minimal regulation. Behavioral finance is usually right where it should be in the college curriculum: offered as an upper-division elective for finance and economics majors. I am not going to do a systematic survey, but I looked up courses at Cornell, and there it is: Behavioral Economics is one of many advanced electives offered to economics students. I don’t know how they teach ECON 101 at Cornell, but it seems they bin most of the behavioral content into later optional courses.
In a social media exchange, Albrecht pointed me to one of Hendrickson’s posts on how they handle situations where economic forces do not seem to explain everything. Currently, for example, it seems like the labor market is not clearing: firms want to hire, but wages are not rising. The quantity supplied seems lower than the quantity demanded at the market wage. Hendrickson claims that this market condition is temporary. He says that firms are cleverly paying bonuses to attract workers so that they won’t have to lower wages in the future when conditions return to normal post-Covid. This would be a perfect time to discuss downward nominal wage rigidity, a pervasive behavioral phenomenon.** It has been studied extensively in lab settings. Nominal wage rigidity has implications for monetary policy. Wage rigidity might be a “temporary” thing, but it helps to explain unemployment. Some of the research done by behavioral economists in this area follows the Akerlof 1982 paper on the gift exchange model. It was published 40 years ago by a Nobel prize winner and cited extensively.*** The seminal lab study of that theory is Fehr et al. 1993. There have been hundreds of replications of the main result that people will trade out of equilibrium due to positive reciprocity.
A blog post titled “The Death of Behavioral Economics” went viral this summer. The clickbait headline was widely shared. After Scott Alexander debunked it point-by-point on Astral Codex Ten, no one corrected their previous tweets. I recommend Scott’s blog for the technical stuff. For example, there is an important distinction between saying that loss aversion does not exist versus saying that its underlying cause is the Endowment Effect.
The author of the original death post, Hreha, is angry. Here’s how he describes his experience with behavioral economics.
I’ve run studies looking at its impact in the real world—especially in marketing campaigns.
If you read anything about this body of research, you’ll get the idea that losses are such powerful motivators that they’ll turn otherwise uninterested customers into enthusiastic purchasers.
The truth of the matter is that losses and benefits are equally effective in driving conversion. In fact, in many circumstances, losses are actually *worse* at driving results.
Why?
Because loss-focused messaging often comes across as gimmicky and spammy. It makes you, the advertiser, look desperate. It makes you seem untrustworthy, and trust is the foundation of sales, conversion, and retention.
He’s trying to sell things. I wade through ads every day and, to mix metaphors, beat them off like mosquitoes. Knowing how I feel about sales pitches, I don’t envy Hreha’s position.
I don’t know Hreha. From reading his blog post, I get the impression that he believes he was promised certain big returns by economists. He tried some interventions in a business setting and did not get his desired results or did not make as much money as he was expecting.
According to him, he seeks to turn people into “enthusiastic purchasers” by exploiting loss aversion. But what would consumers be losing if you are trying to sell them something new? I’m not in marketing research, so I won’t comment on those specifics. Now, though, Hreha claims that all behavioral studies are misleading or useless.
The failure to replicate some results is a big deal, for economics and for psychology. I have seen changes within the experimental community and standards have gotten tougher as a result. If scientists knowingly lied about their results or exaggerated their effect sizes, then they have seriously hurt people like Hreha and me. I am angry at a particular pair of researchers who I will not name. I read their paper and designed an extension of it as a graduate student. I put months of my life into this project and risked a good amount of my meager research budget. It didn’t work for me. I thought I knew what was going to happen in the lab, but I was wrong. Those authors should have written a disclaimer into their paper, as follows:
Disclaimer: Remember, most things don’t work.
I didn’t conclude that all of behavioral research is misleading and that all future studies are pointless. I refined my design by getting rid of what those folks had used and eventually I did get a meaningful paper written and published. This process of iteration is a big part of the practice of science.
The fact that you can’t predict what will happen in a controlled setting seems like a bad reason to abandon behavioral economics. It all got started because theories were put to the test and they failed. We can’t just retreat and say that theories shouldn’t get tested anymore.
I remember meeting a professor at a conference who told me that he doesn’t believe in experimental economics. He had tried an experiment once and it hadn’t turned out the way he wanted. He tried once. His failure to predict what happened should have piqued his curiosity!
There is a difference between behavioral economics and experimental economics. I recommend Vernon Smith’s whole book on that topic, which I quoted from yesterday, for those interested.
The reason we run experiments is that you don’t know what will happen until you try. The only good justification for shutting down behavioral studies would be if we got so good at predicting which interventions will work that new data ceased to be informative.
Or, what if you think nudges are not working because people are highly sensible and rational? That would also imply that we can predict what they are going to do, at least in simple situations. So, again, the fact that we are not good at predicting what people are going to do is not a reason to stop the studies.
I posted last week about how economists use the word “behavioral” in conversation. Yesterday, I shared a stinging critique of the behavioral scientist community written by the world’s leading experimental researcher long before the clickbait blog.
Today, I will share a behavioral economics success story. There are lots of papers I could point to. I’m going to use one of my own, so that readers can truly ask me anything about it. My paper is called “My reference point, not yours”.
I started with a prediction based on previous behavioral literature. My design depended on the fact that in the first stage of the experiment, people would not maximize expected value. You never know until you run the experiment, but I was pretty confident that the behavioral economics literature was a reliable guide.
Some subjects started the experiment with an endowment of $6. Then they could invest to have an equal chance of either doubling their money (earn $12) or getting $1. To maximize expected value, they should take that gamble. Most people would rather hold on to their endowment of $6 than risk experiencing a loss. It’s just $5. Why should the prospect of losing $5 blind them to the expected value calculation? Because most humans exhibit loss aversion.
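The expected-value arithmetic behind that gamble is easy to check. Here is a quick sketch in Python using the payoffs described above (the variable names are mine, for illustration):

```python
# Stage-1 gamble from the experiment: keep a $6 endowment,
# or invest for a 50/50 chance of ending up with $12 or $1.
keep = 6.0
p_win = 0.5
invest_ev = p_win * 12 + (1 - p_win) * 1  # expected value of investing

print(invest_ev)         # 6.5
print(invest_ev > keep)  # True: investing maximizes expected value
```

So a risk-neutral expected-value maximizer invests, but a loss-averse subject who treats the $6 endowment as a reference point may still decline.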
I was relying on this pattern of behavior in stage 1 of the experiment for the test to be possible in stage 2. The main topic of the paper is whether people can predict what others will do. High endowment people fail to invest in stage 1, so then they predict that most other participants failed to invest. The high endowment people failed to incorporate easily available information about the other participants, which is that starting endowments {1,2,3,4,5,6} were randomly assigned and uniformly distributed. The effect size was large, even when I added in a quiz to test their knowledge that starting endowments are uniformly distributed.
Here’s a chart of my main results.
Investing always maximizes expected value, for everyone. The $1 endowment people think that only a quarter of the other participants fail to invest. The $6 endowment people predict that more than half of other participants fail to invest.
Does this help Mr. Hreha get Americans to buy more stuff at Walmart, which he consults for? I’m not sure. Sorry.
My results do not directly imply that we need more government interventions or nudge units. One could argue instead that what we need is market competition to help people navigate a complex world. The information contained in prices helps us figure out what strangers want, so we don’t have to try to predict their behavior at all.
Here’s the end of my Conclusion:
One way to interpret the results of this experiment is that putting yourself in someone else’s shoes is costly. We often speak of it as a moral obligation, especially to consider the plight of those who are worse off than ourselves. Not only do people usually decline to do this for moral reasons, they fail to do it for money. Additionally, this experiment shows that, if people are prompted to think about a specific past experience that someone else had, then mutual understanding is easier to establish.
I’m attempting to establish general purpose laws of behavior. I’ll end with a quote from Scott Alexander’s reply post.
A thoughtful doctor who tailors treatment to a particular patient sounds better (and is better) than one who says “Depression? Take this one all-purpose depression treatment which is the first thing I saw when I typed ‘depression’ into UpToDate”. But you still need medical journals. Having some idea of general-purpose laws is what gives the people making creative solutions something to build upon.
Like last week, this post is adjacent to the internet chatter over whether behavioral economics is “dead”.
Vernon Smith wrote a book, Rationality in Economics, that came out in 2008. I’m going to pull some quotes from that book that I think are relevant. This is not an attempt to summarize its main point.
I began developing and applying experimental economics methods to the study of behavior and market performance in the 1950s and 1960s…
Preface, pg xiii
Repetitive or real-time action in incomplete information environments is an operating skill different from modeling based on the “given” information postulated to drive the economic environment that one seeks to understand in the sense of equilibrium, optimality, and welfare. This decision skill is based on a deep human capacity to acquire tacit knowledge that defies all but fragmentary articulation in natural or written language.
Preface, pg xv
I think that improved understanding of various forms of ecological rationality will be born of a far better appreciation that most of human knowledge of “how,” as opposed to knowledge of “that,” depends heavily on autonomic functions of the brain. Human sociality leads to much unconscious learning in which the rules and norms of our socioeconomic skills are learned with little specific instructions… Humans are not “thinking machines” in the sense that we always rely on self-aware cognitive processes…
Introduction, pg 5, emphasis his
Research in economic psychology[footnote 6] has prominently reported examples where “fairness” and other considerations are said to contradict the rationality assumptions… Footnote 6: I will use the term “economic psychology” generally to refer to cognitive psychology as it has been applied to economics questions, and to a third subfield of experimental methods in economics and recently product-differentiated as “behavioral economics”… Behavioral economists have made a cottage industry of showing that SSSM assumptions seem to apply almost nowhere… their research program has been a candidly deliberate search “Identifying the ways in which behavior differs from the standard model…”
Introduction, pg 22, italics mine
Vernon Smith doesn’t always like the direction of the behavioral economics literature as a whole; however, he agrees in the book that humans don’t always behave rationally. Chapter 6 has the very un-fuzzy title FCC Spectrum Auctions and Combinatorial Designs. Here’s an example of the way Vernon uses the word behavioral, which I offer, as I did last week, as an example of how “behavioral” is never going away.
I will provide a brief review of the theoretical issues and some… experimental findings that bear most directly on the conceptual and behavioral foundation of the FCC design problem.
Chapter 6, pg 116
Unfortunately, the popular press… has often interpreted the contributions of Kahneman as proving that people are “irrational,” in the popular sense of stupid… In the Nobel interview, Kahneman seems clearly to be uncomfortable with this popular interpretation and is trying to correct it.
Chapter 7, pg 150
Chapter 7 is about loss aversion, fairness, and other “behavioral” phenomena of interest. I recommend that anyone following the current conversation read all of Chapter 7 for themselves. Vernon sees the best in everyone whenever possible, despite being annoyed that certain academics have used a tool he developed to make points that he believes are wrong. He forges a way forward for everyone in this book.
Experiments help us understand how human beings who are prone to error can arrive at good outcomes when they are working within good/effective institutions.