Introducing Students to Text Mining II

In the Fall of 2020, I blogged about how I introduce students to text mining, as part of a data analytics class.

Could Turing ever have imagined that a human seeking customer service from a bank could chat with a bot? Maybe text mining is a big advance over chess, but it only took about one decade longer for a computer (developed by IBM) to beat a human in Jeopardy. Winning Jeopardy requires the computer to get meaning from a sentence of words. Computers have already moved way beyond playing a game show to natural language processing.

https://economistwritingeveryday.com/2020/11/07/introducing-students-to-text-mining/

I told the students that “chat bots” are getting better and NLP is advancing. By July 2020, OpenAI had released a beta API playground to external developers to play with GPT-3, but I did not sign up to use it myself.

In April of 2022, I added some slides inspired by Alex’s post about the Turing Test that included output from Google’s Pathways Language Model. According to Alex, “It seems obvious that the computer is reasoning.”

This week in class, I did something that few people could have imagined 5 years ago. I signed into the new, free ChatGPT and typed in questions from my students.

We started with questions that we assumed would be easy to answer:

Then we were surprised that it answered a question we had thought would be difficult:

And then we asked two questions that prompted the program to hedge, although for different reasons.

It seems like the model is smarter than it lets on. For now, the creators are trying hard not to offend anyone or get in the way of Google’s advertising business. Overall, the quality of the answers is high.

Because of when I was born, I believe that something I have published will make it into the training data for these models. Will that turn out to be more significant than any human readers we can attract?

Of course, GPT can still make mistakes. I’m horrified by this mischaracterization of my tweets:

Thankful List in 2022

  1. I was able to get a free Covid-19 booster shot reformulated to fight new Omicron strains. This was easy to schedule at Walgreens, and I got a flu shot at the same time to save time. Vaccines for all types of diseases are advancing.  
  2. I didn’t lose money on crypto.
  3. When Russia invaded Ukraine in February of this year, I ordered Potassium Iodide tablets. I have not needed them.
  4. The shrinking ozone hole shows that the world can actually solve an environmental crisis.
  5. NASA can save us from an incoming asteroid.
  6. Bringing one over from Dynomight (HT: Tyler) “That there’s been a 93% decline in stomach cancer deaths over the past 100 years—from by far the biggest killer among cancers to one of the smaller ones—and mostly this was an accident, it happened because better food refrigeration reduced infections of H. pylori, a bacterium that wasn’t even identified until 1982 after most of the decline had already happened.” 

Blogging about Tweet Threads

Elon Musk bought Twitter almost a month ago.

The community I follow was speculating, half-seriously, that Musk had abruptly fired so many employees that the entire site would just crash.

My prediction is that many economists will stay on Twitter because it is such a useful venue for sharing work and ideas. There was also a sizable migration to Mastodon this week, where a critical mass of economists will likely remain for a while. I made an account there: @JoyBuchanan@econtwitter.net. I don’t think my habits are going to change much right now, meaning I will not add being active on Mastodon to my current slate of activities. Personally, I find Facebook to be a place for meaningful connection in my professional community, as well, but that might be much harder for younger researchers to break into.

Here’s my “What if it all goes down?” moment. I’ll document a great tweet thread by Dennie van Dolder, who is an inspiration for the art form of announcing a working paper on Twitter. (Mastodon does not currently have the thread format, which might point people back to blogging on the margin?)

Dennie van Dolder of the University of Essex provides a tweet thread

I’ll screenshot that thread (just in case).

He also provides a blog post with high production quality and the more traditional SSRN working paper. It’s 2022 and that’s the trifecta for putting work out online, or at least it was.  I have a “News” and a “History” category for classifying these blog posts. Where does this post belong? I settled on “Technology”.

New Double Auction Paper

This weekend I am at the Economic Science Association meeting.

Most of the economists in this group use experiments as part of their empirical research. In this post I will highlight some recently published work that is in the tradition of Vernon Smith, who influenced all of us so much.

Martinelli, C., Wang, J. & Zheng, W. Competition with indivisibilities and few traders. Experimental Economics (2022). https://doi.org/10.1007/s10683-022-09772-9

Abstract: We study minimal conditions for competitive behavior with few agents. We adapt a price-quantity strategic market game to the indivisible commodity environment commonly used in double auction experiments, and show that all Nash equilibrium outcomes with active trading are competitive if and only if there are at least two buyers and two sellers willing to trade at every competitive price. Unlike previous formulations, this condition can be verified directly by checking the set of competitive equilibria. In laboratory experiments, the condition we provide turns out to be enough to induce competitive results, and the Nash equilibrium appears to be a good approximation for market outcomes. Subjects, although possessing limited information, are able to act as if complete information were available in the market.

This small excerpt from their results shows a market converging toward equilibrium over time, under different treatment conditions. With some opportunities for practice and feedback, agents create surplus value by trading.

Figure 4 plots the average efficiency in each round in the four treatments. Efficiency is defined as the percentage of the maximum social surplus realized. … learning takes longer under the clearing house institution; hence, average efficiency under the clearing house institution presents a stronger upward trend over time. Under the clearing house institution, the average efficiencies start at levels lower than under the double auction institution, and remain statistically lower in the second half of the experiment. Nevertheless, we can observe from Fig. 4 that the upward trend of the efficiencies in clearing house treatments persist over time, and at the end of the experiment, the efficiency levels from the two institutions are close.
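As a rough illustration of the efficiency measure quoted above (the percentage of the maximum social surplus realized), here is a minimal Python sketch for a market with indivisible units. The buyer values, seller costs, and trades are made-up numbers, and the function names are hypothetical; none of this comes from the paper's data or code.

```python
from typing import List, Tuple


def max_surplus(buyer_values: List[float], seller_costs: List[float]) -> float:
    """Largest possible gains from trade with one indivisible unit per trader.

    Match the highest-value buyers with the lowest-cost sellers and stop
    as soon as a match would no longer create surplus.
    """
    values = sorted(buyer_values, reverse=True)
    costs = sorted(seller_costs)
    total = 0.0
    for value, cost in zip(values, costs):
        if value <= cost:
            break
        total += value - cost
    return total


def efficiency(trades: List[Tuple[float, float]],
               buyer_values: List[float],
               seller_costs: List[float]) -> float:
    """Percent of the maximum surplus realized by the observed trades.

    Each trade is a (buyer_value, seller_cost) pair. The price only splits
    the surplus between the two sides, so it does not enter the calculation.
    """
    best = max_surplus(buyer_values, seller_costs)
    if best == 0:
        raise ValueError("no gains from trade are possible in this market")
    realized = sum(value - cost for value, cost in trades)
    return 100.0 * realized / best


# Hypothetical market: four buyers and four sellers, one unit each.
values = [10, 8, 6, 4]
costs = [3, 5, 7, 9]
# Suppose the value-10 buyer traded with the cost-5 seller,
# and the value-6 buyer traded with the cost-3 seller.
observed = [(10, 5), (6, 3)]
print(efficiency(observed, values, costs))
```

In this made-up market the best possible matching yields a surplus of 10, and the observed trades yield 8, so the round is 80 percent efficient.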

EWED Recommends Gifts 2022

Every year I request posts about stuff the writers actually use. My logic is that a great wave of stuff-buying is coming, so let’s try to highlight the good items and reduce holiday waste.

For Children

James recommends buying a whole bounce house. It might seem like something you could only afford to rent once a year, but the price of buying one you can use at home is now less than $300. In a big room, you can even do this indoors. Be the Christmas hero. Check on the space requirements.

I recommend two games that help kids learn to read. These are a great complement to Kindergarten or 1st-grade reading assignments. With enough confidence, you can convince kids that these games are toys and not “a book?”.

Sight Word Swat

Zingo sight words

SPOT IT is a card game that takes up almost zero space in the house or car. No reading or numeracy required and yet fun for adults!

The Phantom Tollbooth – A book for school-aged kids.

Little Tikes Easy Store Picnic Table with Umbrella – Scott says it’s worth the price if you have young kids around the house. Let them do messy food or activities there.

Food

Scott found a relatively affordable Black Rice. Sounds like a good gift for adults who like to cook.

Office to Garden

Compressed gas for computer maintenance. See Scott’s explanation on PC care.

Velcro Cut to Length – Zachary suggests: “Do you have a phone charger beside your bed that keeps falling on the ground? Just Velcro it to the nightstand lamp and it will stay exactly where you want it.”

Minute Soil is better than the dirt you have. This makes growing plants more fun and easier. Sounds like a great gift to wrap up for someone who likes gardening.

Set for Life

I agree with Zachary that cordless men’s hair clippers are a great investment.  

Barge All Purpose TF Cement Rubber – Praise from Scott: “Unlike most “superglues”, it will work on rough or porous surfaces, including situations like leather where flexibility is needed.”

Qwix Mix windshield – Windshield wiper fluid concentrate that is easy to store at home for when you need it.

Stoner Car Care 91154 10-Ounce Tarminator Tar, Sap, and Asphalt Remover Safe on Automotive Paint and Chrome on Cars, Trucks, RVs, Motorcycles, and Boats

Lastly, Mike has some correct life advice. Give yourself what your future self would want. For example, if you enjoy video games but don’t exercise enough, then try setting up an exercise bike right in front of your video games. That way you’ll get your cardio in and not have regrets the next week.

An intervention for children to change perceptions of STEM

Here is a new paper related to the topic of women getting into technical fields (see previous post on my paper about programming).

Grosch, Kerstin, Simone Haeckl, and Martin G. Kocher. “Closing the gender STEM gap-A large-scale randomized-controlled trial in elementary schools.” (2022).

These authors were thinking about the same problem at the same time, unbeknownst to me. In their introduction they write, “We currently know surprisingly little about why women still remain underrepresented in STEM fields and which interventions might work to close the gender STEM gap.”

My conclusion from my paper is that, by college age, subjective attitudes toward tech are very important. This leads to the question of whether those subjective attitudes are shaped at younger ages. Grosch et al. have run an experiment to target 3rd-graders with a STEM-themed game. I’ll quote their description:

The treatment web application (treatment app) intends to increase interest in STEM directly by increasing knowledge and awareness about STEM professions and indirectly by addressing the underlying behavioral mechanisms that could interfere with the development of interest in STEM. The treatment app presents both fictitious and real STEM professionals, such as engineers and programmers, on fantasy planets. Accompanied by the professionals, the children playfully learn more about various societal challenges, such as threats from climate change and to public health, and how STEM skills can contribute to combating them. The storyline of the app comprises exercises, videos, and texts. The app also informs children about STEM-related content in general. To address the behavioral mechanisms, the app uses tutorials, exercises, and (non-monetary) rewards that teach children a growth mindset and improve their self-confidence and competitive aptitude. Moreover, the app introduces female STEM role models to overcome stereotypical beliefs. To test the app’s effect, we recruited 39 elementary schools in Vienna (an urban area) and Upper Austria (a predominantly rural area).

This is a preview of their results, although I recommend reading their paper to understand how these measurements were made:

Girls’ STEM confidence increases significantly in the treatment group (difference: 0.047 points or 0.28 standard deviations, p = 0.002, Wald test), and the effect for girls is significantly larger than the effect for boys.

Result 2: Children’s competitiveness is positively associated with children’s interest in STEM. We do not find evidence that stereotypical thinking and a growth mindset is associated with STEM interest.
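The first quoted result reports the same effect in two units: 0.047 points and 0.28 standard deviations. As a back-of-the-envelope check, and assuming the standardized effect is simply the raw difference divided by the standard deviation of the confidence measure (the quote does not say which standard deviation is used), the two numbers together imply the spread of that measure:

```python
# Numbers taken from the quoted result for girls' STEM confidence.
raw_difference = 0.047   # treatment minus control, in points
standardized = 0.28      # the same difference, in standard deviations

# Assumption: standardized effect = raw difference / standard deviation.
implied_sd = raw_difference / standardized
print(round(implied_sd, 3))
```

Under that assumption the implied standard deviation is roughly 0.17 points, which gives a sense of how narrow the confidence scale is.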

Lastly, my kids play STEM-themed tablet games. PBS Kids has a great suite of games that are free and educational. Unfortunately, I have not tried to treat one kid while giving the other kid a placebo app, so my ability to do causal inference is limited.

Joy Recommends Stuff for Kids 2022

I recommend two games for teaching kids to read their “sight words”. In early school grades, learning sight words can mean doing boring homework or rote memorization of flash cards. Instead use

Zingo Sight Words

and

 Sight Word Swat 

These are both fun interactive games that will get kids reading and talking about sight words. Zingo Sight Words is easier, so I recommend starting there. It’s a lot like bingo with a fun plastic dispenser. Kids can do the matching task to win the game even if they are not yet confident with reading.

Sight Word Swat is a little more advanced but good for expanding vocabulary past the first 50 words. It’s fast paced and fun. Someone yells out a word and then two players compete to “swat” with a plastic mallet the correct “fly” that has the word. Also, if the kid isn’t competitive, they could swat the correct word without time pressure.

Next, I’ll recommend a game that will not remotely feel like an educational exercise. “Spot It” is a genius card game. The tin is small, so you can store it easily and travel with it. The game is easy to teach to new friends because it’s just matching visual patterns. Spot It requires zero reading – not even reading numbers. So, a kid as young as 4 could potentially jump in and start trying to get matches. One of the great things about Spot It is that you play a series of mini games. It’s not the nightmare of a Monopoly game that could take multiple days to finish. So, if you are a parent with limited time to spend on card games, you can parachute in and out quickly.

All of these items are under $20 and potentially all of them could make fun holiday gifts, although your mileage may vary for gifting books and getting smiles. Personally, I bought the sight word games when we needed them for learning instead of trying to make them Christmas gifts.

I had been looking forward to reading the Phantom Tollbooth with my kids for a long time. This is the kind of book that you should read as soon as they are ready to understand most of the action, but not before. If too much is going over their heads, then it isn’t fun. In my case, this book prompted a lot of questions and great conversations with the 7-year-old. The book will teach kids a lot, but if you keep your tone light it feels like just another adventure story.

Willingness to be Paid Treatments

This is the second of two blog posts on my paper “Willingness to be Paid: Who Trains for Tech Jobs”. Follow this link to download the paper from Labour Economics (free until November 27, 2022).

Last week I focused on the main results from the paper:

  • Women did not reject a short-term computer programming job at a higher rate than men.
  • For the incentivized portions of the experiment, women had the same reservation wage to program. Women also seemed equally confident in their ability after a belief elicitation.
  • The main gender-related outcomes were, surprisingly, null results. I ran the experiment three times with slightly different subject pools.
  • However, I did find that women might be less likely to pursue programming outside of the experiment based on their self-reported survey answers. Women are more likely to say they are “not confident” and more likely to say that they expect harassment in a tech career.
  • In all three experiments, the attribute that best predicted whether someone would program is if they say they enjoy programming. This subjective attitude appears more important even than having taken classes previously.
  • Along with “enjoy programming” or “like math”, subjects who have a high opportunity cost of time were less willing to return to the experiment to do programming at a given wage level.

I wrote this paper partly to understand why more people are not attracted to the tech sector where wages are high. This recent tweet indicates that, although perhaps more young people are training for tech than ever before, the market price for labor is still quite high.

The neat thing about controlled experiments is that you can randomly assign treatment conditions to subjects. This post is about what happened after either adding extra information or providing encouragement to some subjects.

Informed by reading the policy literature, I assumed that a lack of confidence was a barrier to pursuing tech. A large study done by Google in 2013 suggested that women who major in computer science were influenced by encouragement.

I provided an encouraging message to two treatment groups. The long version of this encouraging message was:

If you have never done computer programming before, don’t worry. Other students with no experience have been able to complete the training and pass the quiz.

Not only did this not have a significant positive effect on willingness to program, but there is some indication that it made subjects less confident and less willing to program. For example, in the “High Stakes” experiment, the reservation wage for subjects who had seen the encouraging message was $13 more than for the control subjects.

My experiment does not prove that encouragement never matters, of course. Most people think that a certain type of encouragement nudges behavior. My results could serve as a cautionary tale for policy makers who would like to scale up encouragement. John List’s latest book The Voltage Effect discusses the difficulty of delivering effective interventions at scale.

The other randomly assigned intervention was extra information, called INFO. Subjects in the INFO treatment saw a sample programming quiz question. Instead of just knowing that they would be doing “computer programming,” they saw some chunks of R code with an explanation. In theory, someone who is not familiar with computer programming could be reassured by this excerpt. My results show that INFO did not affect behavior. Today, most people know what programming is already. About half of subjects said that they had already taken a class that taught programming. Perhaps, if there are opportunities for educating young adults, it would be in career paths rather than just the technical basics.

Since the differences between treatments turned out to be negligible, I pooled all of my data (686 subjects total) for certain types of analysis. In the graph below, I group every subject as either someone who accepted the programming follow-up job or as someone who refused to return to program at any wage. Recall that the highest wage level I offered was considerably higher on a per-hour basis than what I expect their outside earning option to be.

Fig. 5. Characteristics of subjects who do not ask for a follow-up invitation, pooling all treatments and samples

I’ll discuss the three features in this graph in what appear to be the order of importance for predicting whether someone wants to program. There was an enormous difference in the percent of people who were willing to return for an easy, tedious task that I call Counting. By inviting all of these subjects to return to count at the same hourly rate as the programming job, I got a rough measure of their opportunity cost of time. Someone with a high opportunity cost of time is less likely to take me up on the programming job. This might seem very predictable, but this is a large part of the reason why more Americans are not going into tech.

Considering the first batch of 310 subjects, I have a very clean comparison between the programming reservation wage and the reservation wage for counting. People who do not enjoy programming require a higher payment to program than they do to return for the counting job. Self-reported enjoyment is a very significant factor. The orange bar in the graph shows that the majority of people who accepted the programming job say that they enjoy programming.

Lastly, the blue bar shows the percent of female subjects in each group. The gender split is nearly the same. As I show several ways in the paper, there is a surprising lack of a gender gap for incentivized decisions.

I hope that my experiment will inspire more work in this area. Experiments are neat because this is something that someone could try to replicate with a different group of subjects or with a change to the design. Interesting gaps could open up between subject types under new circumstances.

The topic of skill problems in the US represents something reasonably new for labor market and public policy discussions. It is difficult to think of a labor market issue where academic research or even research using standard academic techniques has played such a small role, where parties with a material interest in the outcomes have so dominated the discussion, where the quality of evidence and discussion has been so poor, and where the stakes are potentially so large.

Cappelli, PH, 2015. Skill gaps, skill shortages, and skill mismatches: evidence and arguments for the United States. ILR Rev. 68 (2), 251–290.

Postmodernism to Poastmodernism

Authors of the kinds of books I read present themselves as a voice of reason against our declining society that no longer can evaluate arguments or define moral principles. (I’m fun at parties.) “Postmodernism” has been attacked all my life.

For a while, I have been looking for a successor of postmodernism. To simply define our age as the one that came after modernism seems unsatisfactory. How many more decades can we coast along on this antithesis idea?

One reason I don’t like the term postmodernism is that it gives a sense of progress where we might be losing ground. If you aren’t modern, then you are pre-modern. If you aren’t a verbal culture, then you have regressed to pictographs. If you aren’t engaging arguments, then you have degenerated to tribalism. So, postmodern might be dressing up a decline with a word that is too respectable sounding.

Calling people who use smartphones premodern does not seem right. But, what information are they consuming on those screens? Is it mostly low-quality videos and quick poasts? That doesn’t seem like what someone in 1900 would expect of a modern person.

Here’s an idea for the new century. We are in an age of poastmodernism, beginning with the founding of Twitter. This is different from the kind of skepticism or moral relativism that defined postmodernism. The poasters and their followers can be earnest. They retweet like evangelists. (A “poast” is a message posted in an internet forum.)

Poasts are short. This does not allow for nuance or traditional rational forms of argumentation. A poast could be referencing a rich history or body of literature, but if this generation has not evaluated those original sources then they are really just getting the meme. The poast does not provide its own context. Tyler Cowen says that people who think “modern art” is absurd have no context. Context for modern art would be the classical art and realistic landscape paintings that came before. Most Americans including myself are pretty ignorant about classical art. Similarly, how much value would teenagers get from Lord of the Rings internet memes if they have never seen the movies or read the books?

I’m on Twitter. The pace of discourse is more fun than reading a 50-page econ journal article. I get the appeal of poasting. It’s easy. Our first pediatrician told us not to let our baby use touchscreen games. She told us that it is good for a child to struggle to touch a ball that is two feet away across the floor. Better that they cry over the ball than get the dopamine too easily on a tablet game. Tapping on a screen trains kids for instant rewards. Something that concerns me about a generation that was not raised on books is that they will actually enjoy poasting less than I do, because they will be used to the rapid pace of reward. Twitter as a company benefits from the current generation of people who did not grow up with Twitter.

Poasting affects politics. This week two US Senate candidates had a debate. What would someone who gets most of their news from social media learn about the debate? Some top poasts about the debate have almost zero positive policy substance. Campaigners use the internet medium to dunk on their opponents instead of offering solutions to problems. What attracts engagement is the fire emoji.

This is not meant as a comment on either man as a candidate. I share these jabs because lots of Americans are consuming their “news” in this form (see Pew Research chart). In postmodernism a successful political candidate has to appeal to feelings as much as reason. In poastmodernism, they only have 280 characters to work with. (Donald Trump was a skilled poaster.)

Getting elected today might require great poasting, but that has little to do with being good at governing. Most people think the details of government are dull. Ten minutes into a city council meeting, I’m bored and ready to check the notifications on my phone. And yet, we cannot just poast about poasting. It’s the physical political world and the classic books that make the best subjects of conversation. So, I’m not sure if the era of poastmodernism will last for a long time, or simply to the end of my lifetime. Millennials are not going to give up the dog fire meme.

You’ll have to pry it from our hands after our large generation has passed on. But will it inspire people in the future? I have already been informed that teenagers are calling our gifs “cringe”. They seem to prefer 90-second videos of their peers dancing to pop music. Don’t ask me what comes next after that.

I’ll end on a positive note by saying that sometimes shorter is better. Get to the point quickly, if you can. Some of the novels produced in the modern era were too long. Adam Smith’s books would be more widely read if they were shorter. Long-winded speeches are not necessarily good and I’m glad I am not forced to listen to them. (I get the tl;dr the next day.)

A lot of bad ideas were dressed up in pages of smart-sounding language and then passed off for wisdom in the modern era. It might be harder to pull that off today. Authoritarian regimes in the past relied on being able to lie about conditions on the ground. Today, we know what is happening because of Twitter. American elites believed lies about what was going on inside the Soviet Union for years. That would be impossible today.

Willingness to be Paid Paper Accepted

I am pleased to announce that my paper “Willingness to be Paid: Who Trains for Tech Jobs?” has been accepted at Labour Economics.

Having a larger high-skill workforce increases productivity, so it is useful to understand how workers self-select into high-paying technology (tech) jobs. This study examines how workers decide whether or not to pursue tech, through an experiment in which subjects are offered a short programming job. I will highlight some results on gender and preferences in this post.

Most of the subjects in the experiment are college students. They started by filling out a survey that took less than 15 minutes. They could indicate whether or not they would like an invitation to return to do computer programming.

Subjects indicate whether they would like an invitation to return to do a one-hour computer programming job for $15, $25, $35, …, or $85.[1] This is presented as 9 discrete options, such as:

“I would like an invitation to do the programming task if I will be paid $15, $25, $35, $45, $55, $65, $75 or $85.”,

or,

“I would like an invitation to do the programming task if I will be paid $85. If I draw a $15, $25, $35, $45, $55, $65 or $75 then I will not receive an invitation.”,

and the last choice is

“I would not like to receive an invitation for the programming task.”
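To make the mechanics of these options concrete, here is a minimal Python sketch of how a subject's choice could map to an outcome. It assumes each of the eight wages is equally likely to be drawn, which the post does not state, and the function names are hypothetical rather than taken from the paper.

```python
import random
from typing import Dict, List, Optional

# The eight wage levels from the post: $15, $25, ..., $85.
WAGES: List[int] = list(range(15, 86, 10))


def outcome(threshold: Optional[int], drawn_wage: int) -> Optional[int]:
    """Return the wage paid if the subject is invited back, else None.

    `threshold` is the lowest wage the subject said they would accept
    (their reservation wage on this grid). None stands for the ninth
    option, "I would not like to receive an invitation."
    """
    if threshold is None or drawn_wage < threshold:
        return None
    return drawn_wage


def supply_curve(thresholds: List[Optional[int]]) -> Dict[int, float]:
    """Share of subjects willing to program at each wage level."""
    n = len(thresholds)
    return {
        wage: sum(1 for t in thresholds if t is not None and t <= wage) / n
        for wage in WAGES
    }


rng = random.Random(0)
drawn = rng.choice(WAGES)  # assumption: every wage is equally likely
print(drawn, outcome(25, drawn))

# Four hypothetical subjects: thresholds of $15, $25, $85, and a refusal.
print(supply_curve([15, 25, 85, None]))
```

The second function shows why a discrete menu is useful to the experimenter: one choice per subject is enough to trace out the share of people willing to work at every wage on the grid.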

Ex-ante, would you expect a gender gap in the results? In 2021, there was only 1 female employee working in a tech role at Google for every 3 male tech employees. Many technical or IT roles exhibit a gender gap.

To find a gender gap in this experiment would mean female subjects reject the programming follow-up job or at least they would have a different reservation wage. In economics, the reservation wage is the lowest wage an employee would accept to continue doing their job. I might have observed that women were willing to program but would reject the low wage levels. If that had occurred, then the implication would be that there are more men available to do the programming job for any given wage level.

However, the male and female participants behaved in very similar ways. There was no significant difference in reservation wages or in the choice to reject the follow-up invitation to program. The average reservation wage for the initial experiment was very close to $25 for both males and females. A small number of male subjects said they did not want to be invited back at even the highest wage level. In the initial experiment, 5% of males and 6% of females refused the programming job.

The experiment was run in 3 different ways, partly to test the robustness of this (lack of) gender effect. About 100 more subjects were recruited online through Prolific to observe a non-traditional subject pool. Details are in the paper.

Ex-ante, given the obvious gender gap in tech companies, there were several reasons to expect a gender gap in the experiment, even on a college campus. Ex-post, readers might decide that I left something out of the design that would have generated a gender gap. This experiment involves a short-term individual task. Maybe the team culture or the length of the commitment is what deters women from tech jobs. I hope that my experiment is a template that researchers can build on. Maybe even a small change in the format would cause us to observe a gender gap. If that can be established, then that would be a major contribution to an important puzzle.

For the decisions that involved financial incentives, I observed no significant gender gaps in the study. However, subjects answered other questions and there are gender gaps for some of the self-reported answers. It was much more likely that women would answer “Yes” to the question

If you were to take a job in a tech field, do you expect that you would face discrimination or harassment?

I observed that women said they were less confident if you just asked them if they are “confident”. However, when I did an incentivized belief elicitation about performance on a programming quiz, women appear quite similar to men.

Since wages are high for tech jobs, why aren’t more people pursuing them? The answer to that question is complex. It does not all boil down to subjective preferences for technical tasks. However, in my results, enjoyment is one of the few variables that was significant.

People who say they enjoy programming are significantly more likely to do it at any given wage level, in this experiment.

Fig. 3 Histogram of reservation wage for programming job, by reported enjoyment of computer programming (CP) and gender, pooling all treatments and samples

Figure 3 from the paper shows the reservation wage of participants from all three waves. Subjects who say that they enjoy programming usually pick a reservation wage at or near the lowest possible level. This pattern is quite similar whether you are considering males or females.

Interestingly, enjoyment mattered more than some of the other factors that I thought would predict willingness to participate. About half of subjects said they had taken a class that taught them some coding, but that factor did not predict their behavior in the experiment. Enjoyment or subjective preferences seemed to matter more than training. To my knowledge, policy makers talk a lot about training and very little about these subjective factors. I hope my experiment helps us understand what is happening when people self-select into tech. Later, I will write another blog post about the treatment manipulation and results, and perhaps I will have the official link to the article by then.

Buchanan, Joy. “Willingness to be Paid: Who Trains for Tech Jobs.” Labour Economics.


[1] We use a quasi-BDM to obtain a view of the labor supply curve at many different wages. The data is not as granulated as that which a traditional Becker-DeGroot-Marschak (BDM) mechanism obtains, but it is easy for subjects to understand. The BDM, while being theoretically appropriate for this purpose, has come under suspicion for being difficult for inexperienced subjects to understand (Cason and Plott, 2014). We follow Bartling et al. (2015) and use a discrete version.