
Elliott Thornley (EJT)

Research Fellow @ Global Priorities Institute
1047 karma
www.elliott-thornley.com

Bio

I work on AI alignment. Right now, I'm using ideas from decision theory to design and train safer artificial agents.

I also do work in ethics, focusing on the moral importance of future generations.

You can email me at thornley@mit.edu.

Comments (81)

Makes sense! Unfortunately any x-risk cost-effectiveness calculation has to be a little vibes-based because one of the factors is 'By how much would this intervention reduce x-risk?', and there's little evidence to guide these estimates.

Whether longtermism is a crux will depend on what we mean by 'long,' but I think concern for future people is a crux for x-risk reduction. If future people don't matter, then working on global health or animal welfare is the more effective way to improve the world. The more optimistic of the calculations that Carl and I do suggest that, by funding x-risk reduction, we can save a present person's life for about $9,000 in expectation. But we could save about 2 present people if we spent that money on malaria prevention, or we could mitigate the suffering of about 12.6 million shrimp if we donated to SWP.
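
For reference, here's the back-of-the-envelope comparison those figures imply (a minimal sketch; all inputs are just the estimates quoted above, not independent numbers):

```python
# Rough per-unit costs implied by the figures quoted above.
budget = 9_000                  # dollars

xrisk_lives = 1                 # present lives saved in expectation (optimistic x-risk estimate)
malaria_lives = 2               # present lives saved via malaria prevention
shrimp_helped = 12_600_000      # shrimp whose suffering is mitigated via SWP

print(f"X-risk reduction:   ~${budget / xrisk_lives:,.0f} per present life saved")
print(f"Malaria prevention: ~${budget / malaria_lives:,.0f} per present life saved")
print(f"SWP:                ~${budget / shrimp_helped:.4f} per shrimp helped")
```

(The shrimp figure is a different metric, suffering mitigated rather than lives saved, so the last line isn't directly comparable to the first two.)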

Oops yes, the fundamentals of my case and Bruce's are very similar. I should have read Bruce's comment!

The claim we're discussing - about the possibility of small steps of various kinds - sounds kinda like a claim that gets called 'Finite Fine-Grainedness'/'Small Steps' in the population axiology literature. It seems hard to convincingly argue for, so in this paper I present a problem for lexical views that doesn't depend on it. I sort of gestured at it above with the point about risk without making it super precise. The one-line summary is that expected welfare levels are finitely fine-grained.

Oh yep nice point, though note that - e.g. - there are uncountably many reals between 1,000,000 and 1,000,001 and yet it still seems correct (at least talking loosely) to say that 1,000,001 is only a tiny bit bigger than 1,000,000.

But in any case, we can modify the argument to say that S* feels only a tiny bit worse than S. Or instead we can modify it so that S is the temperature in degrees Celsius of a fire that causes suffering that can just about be outweighed, and S* is the temperature in degrees Celsius of a fire that causes suffering that just about can't be outweighed.

Nice post! Here's an argument that extreme suffering can always be outweighed.

Suppose you have a choice between:

(S+G): The most intense suffering S that can be outweighed, plus a population G that's good enough to outweigh it, so that S+G is good overall: better than an empty population.

(S*+nG): The least intense suffering S* that can't be outweighed, plus a population that's n times better than the good population G.

If extreme suffering can't be outweighed, we're required to choose S+G over S*+nG, no matter how big n is. But that seems implausible. S* is only a tiny bit worse than S, and n could be enormous. To make the implication seem more implausible, we can imagine that the improvement nG comes about by extending the lives of an enormous number of people who died early in G, or by removing (non-extreme) suffering from the lives of an enormous number of people who suffer intensely (but non-extremely) in G.

We can also make things more difficult by introducing risk into the case (in this sort of way). Suppose now that the choice is between:

(S+G): The most intense suffering S that can be outweighed, plus a population G that's good enough to outweigh it, so that S+G is good overall: better than an empty population.

(Risky S*+nG): With probability 1−ε, the most intense suffering S that can be outweighed. With probability ε, the least intense suffering S* that can't be outweighed. Plus (with certainty) a population that's n times better than the good population G.

We've amended the case so that the move from S+G to Risky S*+nG now involves just an ε increase in the probability of a tiny increase in suffering (from S to S*). As before, the move also improves the lives of those in the good population G by as much as you like. Plausibly, each ε increase (for very small ε) in the probability of getting S* instead of S (together with an n-fold increase in the quality of G, for very large n) is an improvement. Then with Transitivity, we get the result that S*+nG is better than S+G, and therefore that extreme suffering can always be outweighed.
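
To make the transitivity step explicit, here's one way the chain of lotteries could be spelled out (a sketch only; exactly how the good population improves at each step, the (1+kn) factor below, is an illustrative choice rather than something fixed by the case):

\[
L_k \;=\; \bigl[\,k\varepsilon : S^{*},\; 1-k\varepsilon : S\,\bigr] \;+\; (1+kn)G,
\qquad k = 0, 1, \dots, 1/\varepsilon,
\]
\[
L_0 = S+G \;\prec\; L_1 \;\prec\; \dots \;\prec\; L_{1/\varepsilon} = S^{*} + (1+n/\varepsilon)G.
\]

Here (1+kn)G abbreviates 'a population (1+kn) times better than G'. Each adjacent pair L_k, L_{k+1} differs only by an ε increase in the probability of S* and a large improvement to the good population, so if every such step is an improvement, Transitivity delivers the second line: a lottery containing S* for certain is better than S+G.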

I think the view that extreme suffering can't always be outweighed has some counterintuitive prudential implications too. It implies that basically we should never think about how happy our choices would make us. Almost always, we should think only about how to minimize our expected quantities of extreme suffering. Even when we're - e.g. - choosing between chocolate and vanilla at the ice cream shop, we should first determine which choice minimizes our expected quantity of extreme suffering. Only if we conclude that these quantities are exactly the same should we even consider which of chocolate and vanilla tastes nicer. That seems counterintuitive to me.

Note also that you can accept outweighability and still believe that extreme suffering is really bad. You could - e.g. - think that 1 second of a cluster headache can only be outweighed by trillions upon trillions of years of bliss. That would give you all the same practical implications without the theoretical trouble.

Nice point, but I think it comes at a serious cost.

To see how, consider a different case. In X, ten billion people live awful lives. In Y, those same ten billion people live wonderful lives. Clearly, Y is much better than X. 

Now consider instead Y* which is exactly the same as Y except that we also add one extra person, also with a wonderful life. As before, Y* is much better than X for the original ten billion people. If we say that the value of adding the extra person is undefined and that this undefined value renders the value of the whole change from X to Y* undefined, we get the implausible result that Y* is not better than X. Given plausible principles linking betterness and moral requirements, we get the result that we're permitted to choose X over Y*. That seems very implausible, and so it counts against the claim that adding people results in undefined comparisons.

Wait, all the LLMs get 90+ on ARC? I thought LLMs were supposed to do badly on ARC.

You should read the post! Section 4.1.1 makes the move that you suggest (rescuing PAVs by de-emphasising axiology). Section 5 then presents arguments against PAVs that don't appeal to axiology. 
