https://open.substack.com/pub/astralcodexten/p/the-extinction-tournament?r=1r8dq&utm_medium=ios&utm_campaign=post

This month’s big news in forecasting: the Forecasting Research Institute has released the results of the Existential Risk Persuasion Tournament (XPT). XPT was supposed to use cutting-edge forecasting techniques to develop consensus estimates of the danger from various global risks like climate change, nuclear war, etc.

The plan was: get domain experts (eg climatologists, nuclear policy experts) and superforecasters (people with a proven track record of making very good predictions) in the same room. Have them talk to each other. Use team-based competition with monetary prizes to incentivize accurate answers. Between the domain experts’ knowledge and the superforecasters’ prediction-making ability, they should be able to converge on good predictions.

They didn’t. In most risk categories, the domain experts predicted higher chances of doom than the superforecasters. No amount of discussion could change minds on either side.

[Figure: summary of superforecaster vs. domain expert forecasts (https://substack-post-media.s3.amazonaws.com/public/images/27f0f6a6-2853-43c5-be4b-4f5420f5880f_794x207.png)]

The tournament asked about two categories of global disaster. “Catastrophe” meant an event that killed >10% of the population within five years. It’s unclear whether anything in recorded history would qualify; Genghis Khan’s hordes and the Black Plague each killed about 20% of the global population, but both events were spread out over a few decades.

[Figure: catastrophe forecasts by risk category (https://substack-post-media.s3.amazonaws.com/public/images/6d08ca51-63ee-4dc0-9657-47c328ec48b7_799x492.png)]

“Extinction” meant reducing total human population below 5,000 (it didn’t require literal extinction). This is very hard! Nuclear war is very unlikely to do this; people in bunkers or on remote islands would survive at least the original blasts, and probably any ensuing winter. Even the worst pandemic might not get every remote island or uncontacted Amazonian tribe. Participants assigned the highest literal-extinction risk to AI, maybe because it can deliberately hunt down survivors.

[Figure: extinction forecasts by risk category (https://substack-post-media.s3.amazonaws.com/public/images/7112282d-0822-4ae0-9c6b-1d53e6fad884_797x528.png)]

You might notice that all of these numbers are pretty low! I’ve previously said I thought there was a 33% chance of extinction from AI alone (and lots of people are higher than me). Existential risk expert Toby Ord estimated a 16% total chance of extinction by 2100, about 16x the superforecasters’ total estimate here and 2.5x the domain experts’.
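
For concreteness, here is the arithmetic behind those multiples; the ~1% and ~6% totals below are just back-solved from the ratios quoted above, not figures taken directly from the XPT report:

```python
# Back-solve the tournament totals implied by the ratios above
# (a rough sketch, not numbers pulled from the report itself).
toby_ord_total = 0.16                        # Ord's ~16% extinction risk by 2100

superforecaster_total = toby_ord_total / 16   # implied ~1.0%
domain_expert_total = toby_ord_total / 2.5    # implied ~6.4%

print(f"Implied superforecaster total: {superforecaster_total:.1%}")
print(f"Implied domain expert total:   {domain_expert_total:.1%}")
```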

In some sense, this is great news. These kinds of expert-plus-superforecaster tournaments seem like a trustworthy method. Should we update our estimates of human extinction risk downward?

Cancelling The Apocalypse?

It’s weird that there’s so much difference between experts and superforecasters, and awkward for me that both groups are so far away from my own estimates and those of people I trust (like Toby). Is there any reason to doubt the results?

Were the incentives bad?

The subreddit speculates about this - after all, you can’t get paid, or congratulated, or given a trophy, if the world goes extinct. Does that bias superforecasters - who are used to participating in prediction markets and tournaments - downward? What about domain experts, who might be subconsciously optimizing for prestige and reputation?

This tournament tried to control for that in a few ways.

First, most of the monetary incentives were for things other than predicting extinction. There were incentives for making good arguments that persuaded other participants, for correctly predicting intermediate steps to extinction (for example, a small pandemic, or a limited nuclear exchange), or for correctly guessing what other people would guess (this technique, called “reciprocal scoring”, has been validated in past experiments).
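
To give a sense of the reciprocal-scoring idea, here is a minimal toy sketch (my own illustration, not the XPT’s actual scoring rule): each participant is rewarded for how closely their guess of the group’s forecast matches the forecast the group actually produces.

```python
import statistics

def reciprocal_score(guess_of_group_forecast: float, group_forecasts: list[float]) -> float:
    """Toy reciprocal-scoring rule: quadratic loss against the group's median forecast.

    Scores are negative; closer to zero is better. Illustrative only; the XPT
    used its own incentive scheme.
    """
    realized_median = statistics.median(group_forecasts)
    return -(guess_of_group_forecast - realized_median) ** 2

# Hypothetical example: a participant guesses the group will land around 2%,
# while the group's actual forecasts have a median of 3%.
others = [0.01, 0.03, 0.05, 0.03, 0.02]
print(reciprocal_score(0.02, others))   # -0.0001: small penalty for being one point off
```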

Second, this wasn’t really an incentive-based prediction market. Although they kept a few incentives as described above, it was mostly about asking people who had previously demonstrated good predictive accuracy to give their honest impressions. At some point you just have to trust that, absent incentives either way, reasonable people with good track records can be smart and honest.

Third, a lot of the probabilities here were pretty low. For example, the superforecasters gave a 0.4% probability of AI-caused extinction, compared to the domain experts’ 3%. At these levels it’s probably not worth optimizing your answers super-carefully to get a tiny amount of extra money or credibility. If it’s the year 2100 and we didn’t die from AI, who was right: the people who said there was a 3% chance, or the people who said there was a 0.4% chance? Everyone in this tournament was smart enough to realize that survival in one timeline wouldn’t provide much evidence either way.
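
To make that concrete, here is a quick Bayesian sketch using the 0.4% and 3% numbers above: surviving to 2100 favors the lower estimate only by a likelihood ratio of about 1.03, which barely moves the odds between the two camps.

```python
# How much would surviving to 2100 favor the 0.4% forecaster over the 3% one?
p_survive_if_low = 1 - 0.004    # superforecasters' AI-extinction estimate
p_survive_if_high = 1 - 0.03    # domain experts' AI-extinction estimate

bayes_factor = p_survive_if_low / p_survive_if_high
print(f"Likelihood ratio from surviving: {bayes_factor:.3f}")   # ~1.027

# Starting from even odds between the two camps, the posterior barely moves:
posterior_odds = 1.0 * bayes_factor
posterior_prob_low_camp = posterior_odds / (1 + posterior_odds)
print(f"Posterior credence in the 0.4% camp: {posterior_prob_low_camp:.1%}")  # ~50.7%
```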