[Subtitle.] The utilitarians were right!
This is a crosspost of Infinite Dust Specks Are Worse Than One Torture by Bentham's Bulldog, originally published on Bentham's Newsletter on 20 March 2026. Bentham's Bulldog published a follow-up post 2 days later responding to comments on the first post.
1 Introduction
You hear the torture vs dust specks example come up a lot when discussing the alleged vices of utilitarianism. Emile Torres once wrote:
On another occasion, Yudkowsky argued that in a forced-choice situation you should prefer that a single person is tortured relentlessly for 50 years than for some unfathomable number of people to suffer the almost imperceptible discomfort of having a single speck of dust in their eyes. Just do the moral arithmetic — or, as he puts it, “Shut up and multiply!” Suffice it to say that most philosophers would vehemently object to this conclusion.
This is extremely misleading. If one bothers to read Eliezer’s piece on the subject (rather than opportunistically scanning it for things that sound bad out of context and then gleefully spreading them across the internet), they will see he has arguments for biting the bullet. He doesn’t just instruct you to shut up and multiply. There is a well-known philosophical paradox in the torture vs. dust specks case, and to resolve it, you’ll have to accept something weird. If every view provably has to say something absurd-sounding, it’s dishonest to take a contextless potshot at someone for saying one of the absurd things.
A number of very competent philosophers without any utilitarian sympathies—Michael Huemer, Dustin Crummett, Philip Swenson—agree with Eliezer here. Pointing to a counterintuitive result that is supported by strong arguments without noting that the arguments exist is dishonest.
But enough metacommentary. Here, I’ll explain why the judgment that a bunch of dust specks are worse than a torture is simply correct. I think of all the counterintuitive utilitarian judgments, this is one of the best supported. At the very least, this should be thought of as a puzzle for everyone, not a uniquely damning utilitarian result.
2 The opposite of love on the spectrum
The most famous argument for a bunch of dust specks being worse than a torture is called the spectrum argument. Here’s how it goes. Imagine one person being tortured. That seems pretty bad! But now imagine making the torture one iota less intense—perhaps even imperceptibly less intense—but inflicted on 100,000 times as many people. Surely things have gotten worse. Now take each of those tortures and replace it with a torture one iota less intense, inflicted on 100,000 times as many people. Seems like things are worse once more.
But this process can simply continue until the pains in question are bid down to the level of a dust speck. At each step along the way, the difference in suffering is undetectable, and the number of victims is much greater. But if things continually get worse, then at the end, they are worse than at any previous state. A giant pile of dust specks is worse than a single torture.
This is a relatively simple argument but people sometimes get confused about it. Let me be clear on the minimal commitments. They are simply:
- Replacement: If you take some unit of suffering and make it a tiny bit less intense while affecting 100,000 times as many people, things have gotten worse.
- Transitivity: If A is worse than B and B is worse than C, then A is worse than C.
Both of these are extremely intuitive premises that we’d accept in any other domain.
Here’s a concrete way to set up the scenario. Let’s say that being boiled in 200-degree water is torture. Well surely 100 people being boiled in 199.99 degree water is worse. And 10,000 people being boiled in 199.98 degree water is worse than that. The process can continue until we arrive at the conclusion that a bunch of people being in water at some temperature where it’s just mildly uncomfortable is worse than one person being painfully boiled.
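The escalating structure of the argument can be sketched numerically. The starting intensity (100) and the 0.1% per-step drop below are made-up stand-ins; only the structure matters: each step multiplies total suffering by 0.999 × 100,000, so it strictly grows.

```python
# Numerical sketch of the spectrum argument with illustrative stand-in numbers.
intensity = 100.0     # made-up intensity score for the original torture
victims = 1
total_before = intensity * victims

for step in range(20):
    intensity *= 0.999        # an "iota" less intense
    victims *= 100_000        # 100,000 times as many people
    total_now = intensity * victims
    assert total_now > total_before   # Replacement: each step is worse
    total_before = total_now

print(f"intensity per victim after 20 steps: {intensity:.2f}")
print(f"victims: 10^{len(str(victims)) - 1}")
```

After 20 steps each individual pain is barely 2% milder, yet the victim count has grown to 10^100, so on any roughly additive accounting things have gotten vastly worse at every step.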
You could deny Replacement. But this seems like quite a tough pill to swallow. In addition, we can turn the screws on the denier of Replacement by varying a number of the features in question. The denier of replacement must think that there’s a pain at some amount of intensity so that any number of pains at lower intensity is less bad than that single pain at the higher level of intensity.
But then let’s imagine taking that pain and varying its duration rather than its intensity. Assume the pain in question lasts 10 minutes. Surely replacing each of those pains with 100,000 pains of equal intensity that last 9 minutes and 59 seconds is worse. And replacing those with pains that last 9 minutes and 58 seconds is worse. At the end of this road, we’re left with the conclusion that a very large number of second-long pains at that level of intensity is worse than the ten minute pain at that level. But then replace each of those pains which last only a second with a pain that lasts 10 minutes at a slightly lower level of intensity. Surely that is worse. But then by transitivity, some number of pains at the lower intensity level must be worse than the pains at the higher intensity level.
Thus, the denier of Replacement must think something stronger. They must think that either:
- There exists some level of pain intensity, so that if you make the pain imperceptibly less intense, and last 100,000 times as long, things haven’t gotten worse.
or
- There exist pains at some levels of intensity so that shortening their duration by an imperceptible amount and inflicting them on 100,000 times more people doesn’t make things worse.
I know intuitions differ about this to some degree, but these strike me as about as obviously false as anything could be. Thus, I think we should hold on to Replacement.
What about transitivity, the idea that if A is better than B which is better than C then A is better than C? There’s a pretty extensive philosophical literature on this, culminating in most people accepting transitivity. I can’t go into it in a ton of detail, but let me summarize the main reasons people think transitivity is right. They strike me as very decisive.
- Transitivity is just very intuitive. It’s on its face hard to make sense of A being better than B, B better than C, but C better than A. That just seems impossible. Note: this isn’t some idiosyncratic intuition that I have. It seems this way to almost everyone, and people only give up the judgment when they feel there are strong counterarguments.
- Giving up transitivity requires giving up dominance. Dominance is the idea that if A is better than B, C is better than D, and there are no significant relationships between any of them, then A + C is better than B + D. For instance, if the Earth is better than Mars and Jupiter is better than Venus, assuming no relationships between the goodness of any parts of them, then the Earth and Jupiter together would be better than Mars and Venus together.
Why does giving up transitivity force giving up dominance? Suppose there are three things, A, B, and C, with A>B, B>C, and C>A. Now ask: is A + B better than B + C? Dominance implies each is better than the other. A is better than B, and B is better than C, so A + B should be better than B + C (each of its parts is better than one of the parts of B + C). But B is as good as B and C is better than A—so B + C should be better!
Thus, if you give up transitivity you’ll give up dominance—thus becoming an effeminate soy-latte-drinking avocado-toast-eating BETA MALE!
- Intransitive preferences are vulnerable to money pumps. Suppose you prefer A to B, B to C, and C to A. Well, because you like A better than B, you’d pay a penny to trade B for A. Similarly, you’d pay a penny to trade A for C, and C for B. But now you’re back where you started and down three cents. This can continue forever. Now, there are subtle ways of modifying decision theory to get around this, and even more subtle money pumps, but overall, I don’t think there’s much hope for the person with intransitive preferences to avoid money pumps without violating plausible norms of rationality.
There’s also a deeper problem for any way of avoiding money pumps: they all require saying either that whether you should make the trades that make up the money pump will depend on which bets you’ve made in the past or whether you’ll have good options in the future (for if you ignore both of those, then you’d always make the trade and get money pumped). But it’s very hard to believe that whether you should trade A for C will depend on how you got A. It’s similarly hard to believe that the fact that trading A for C will allow you to take future bets that you like will count against it. The fact that an action gives you more options that you’ll rationally want to take shouldn’t count against it.
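The money pump can be made concrete with a toy simulation. The items, prices, and preference table below are all illustrative assumptions, not anything from the literature:

```python
# Toy money pump against cyclic preferences A > B > C > A.
# The agent pays 1 cent whenever a trade moves it to something it prefers.
prefers = {("A", "B"): True, ("B", "C"): True, ("C", "A"): True}

def trade(holding, offered, wallet):
    """Accept the offered item (paying 1 cent) iff it is preferred to the holding."""
    if prefers.get((offered, holding), False):
        return offered, wallet - 1
    return holding, wallet

holding, wallet = "B", 100        # start holding B with 100 cents
for offered in ["A", "C", "B"]:   # A beats B, C beats A, B beats C
    holding, wallet = trade(holding, offered, wallet)

print(holding, wallet)  # B 97: back where we started, three cents poorer
```

Each trade looks rational in isolation, yet the cycle returns the agent to its starting item with strictly less money, and the loop can be repeated indefinitely.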
- Here’s a plausible principle: suppose something becomes better, and then becomes better again, it will at the end be better. This seems very obvious. But suppose A>B>C>A. If you take C and replace it with B and then A, it will get worse. Thus, giving up transitivity requires rejecting this principle.
(See here for an argument from Theron Pummer for why rejecting transitivity isn’t enough to defuse the spectrum argument. The core idea is that there will still have to be some weird kind of hypersensitivity, wherein pains at some level of intensity can be worse than very intense pain if sufficiently numerous, but no pains at a slightly lower level of intensity can outweigh the very intense pain.)
Now, suppose that you think transitivity and replacement both seem intuitive, but it also seems like no number of dust specks is worse than a torture. Which should you give up? In this case, we have a conflict between a specific case—torture vs dust specks—and two principles. But I think there are a number of reasons to think that intuitions about broad principles are more trustworthy than intuitions about cases. I go over the case for this in more detail here, so let me just briefly describe the argument I found most persuasive.
Principles apply to many different cases. Our intuitions are fallible. For this reason, we should expect true principles to appear to go wrong in some cases. If we assume that our intuitions are right 90% of the time, then if a principle applies to 100 cases, it will appear counterintuitive in ten. Thus, very broad principles suffer less from facing a counterexample than cases do. We’d expect true principles to conflict with case judgments, but we wouldn’t expect true case judgments to conflict with principles.
Even the most obvious-seeming principles often have cases that seem like counterexamples. Many cases appear to violate the law of non-contradiction. My model explains this—it’s just not that surprising that a true principle that applies to every single proposition would wrongly appear to go wrong occasionally. But this is a reason why when cases conflict with principles, you should revise the case judgment not the principle. This isn’t an infallible rule but it’s a decent heuristic.
The other thing is that the utilitarian view is by far the more natural one. It says simply: if you’re comparing between the badness of sets of agony, and there aren’t other relevant considerations in play, the worse one is the one where more pain is present. Certainly that should be our default assumption. Just as many dust specks can collectively be bigger than a mountain, the simplest view of how these things compare implies that dust specks can be worse than a torture. We should give up this view if there’s a strong counterexample, but if none of the principles seem clearly worse than the other, it’s not clear there’s much reason to give them up.
3 Risk
(This is a rehash of Michael Huemer’s argument).
Suppose that you are given the following two options:
- Prevent everyone on Earth from painfully stubbing their toe. Stipulate that in none of these cases would it cause death or serious injury—it would never rise beyond the level of a toe stub.
- Have a one in Rayo’s number chance of preventing a person from being tortured.
Rayo’s number, for those unaware, is a stupidly large number. It is bigger than the biggest number you can think of. You could take 100 trillion and then fill the known universe with factorial signs after it, then put that many factorial signs after 100 trillion, then put that many factorial signs after 100 trillion once more, and take the number you have and repeat the process that many times, and you’d still have a number that was effectively zero compared to Rayo’s number.
So all this is to say it’s a lot more than 11.
Intuitively it seems the first option is much better. You could keep doing the second option every millisecond throughout the whole history of the universe—and a billion billion billion billion universes besides—and the odds you’d have prevented a torture would still be basically zero.
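Rayo's number has no usable numeric representation, but the flavor of this claim can be checked in log space with a far smaller stand-in exponent (1 in 10^(10^100), purely illustrative and already absurdly generous to the lottery):

```python
import math

# Stand-in for the per-draw chance: 1 in 10**(10**100), vastly smaller
# than 1 in Rayo's number would be.
log10_p = -1e100  # log10 of the per-draw chance

# Draws: one per millisecond for ~13.8 billion years, times a
# billion billion billion billion extra universes (10**36).
seconds = 13.8e9 * 365.25 * 24 * 3600
draws = seconds * 1000 * 1e36
log10_expected = math.log10(draws) + log10_p  # expected tortures prevented

print(f"log10(draws) ≈ {math.log10(draws):.1f}")
print(f"log10(expected preventions) ≈ {log10_expected:.3g}")
```

Roughly 10^56 draws barely dent an exponent of 10^100: the expected number of tortures prevented still has a base-ten logarithm of about minus a googol, i.e. it is indistinguishable from zero.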
In addition, if you think the first option is worse, then you must think that tiny risks of torture totally dominate all small bads. On such a view, the primary harm when you stub your toe is that it might lead to torturous pain, and pains below the threshold are wholly ignorable in practice on non-instrumental grounds—they are simply outclassed by any risk of intense pain.
But here’s a principle that seems plausible: if you’re comparing the desirability of two lotteries, other causally-isolated lotteries happening galaxies away doesn’t affect their desirability. For example, suppose that you were trying to decide whether a 50% chance of feeding two hungry people was better than a guarantee of giving a bit less food to one hungry person. This principle says: if you’re trying to make this decision, you don’t have to take into account stuff that is happening a million galaxies away and that the lotteries have no effects on.
Seems straightforward enough! But together these principles imply that a bunch of dust specks are worse than a torture.
Why? Well, imagine that across Rayo’s number galaxies, there is either lottery 1 or lottery 2 (lottery 1 remember is a guarantee of preventing everyone on the planet from stubbing their toe, while lottery 2 is a 1/Rayo’s number chance of preventing a single torture). Lottery 1 is better, or so I’ve argued. But if lottery 1 is better in each individual location, then it’s better to have lottery 1 in every location.
So then from these principles, we get the result that lottery 1 in every location is better. But lottery 1 in every location involves a vast number of people being spared a toe stub, rather than around one person being spared torture. Together, then, these principles imply that some number of mild pains like toe stubs—and likewise dust specks—are worse than a torture.
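Why "around one person"? With N independent lotteries, each a 1/N chance of preventing a torture, exactly one torture is prevented in expectation, and the chance that at least one is prevented tends to 1 − 1/e ≈ 0.632. N below is a small stand-in for Rayo's number; the limiting behaviour is the same.

```python
from fractions import Fraction

N = 10**6
expected = N * Fraction(1, N)            # exactly 1 torture prevented in expectation
p_at_least_one = 1 - (1 - 1 / N) ** N    # approaches 1 - 1/e for large N

print(expected)                  # 1
print(round(p_at_least_one, 3))  # 0.632
```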
Or, to put the core insight plainly: repeatedly risking something makes it likely it will occur. So if dust specks are worse than a minuscule risk of torture, then some number of dust specks must be worse than a guaranteed torture.
4 The extremely simple argument
Here is an argument that I find persuasive.
1. A mild pain is bad.
2. Infinite mild pains are infinity times worse than a single mild pain.
3. Therefore, infinite mild pains are infinitely bad.
4. Very intense pain is not infinitely bad.
5. Things that are infinitely bad are worse than things that aren’t.
6. Therefore, infinite mild pains are worse than one very intense pain.
Where should one get off the boat? You might be tempted to reject 2. After all, suppose the badness of the first dust speck is 1, the next is .5, the next is .25, etc. Their total badness would be 2 even though they each have some badness. But that can only work if the badness of the dust specks is dependent on the number of other dust specks. This is hard to believe—why would the badness of my eye being mildly irritated by a speck of dust getting into it be affected by whether other people in distant galaxies get dust specks in their eyes too?
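The difference between the two accounting schemes is easy to exhibit. The epsilon value below is an arbitrary stand-in for a dust speck's badness:

```python
from fractions import Fraction

# Converging scheme: the k-th speck counts for (1/2)**k, so the total
# stays below 2 no matter how many specks there are.
converging = sum(Fraction(1, 2**k) for k in range(100))

# Constant scheme: every speck counts the same tiny amount epsilon,
# so the total grows without bound as specks accumulate.
epsilon = Fraction(1, 10**6)          # arbitrary stand-in badness per speck
constant_total = 10**9 * epsilon      # a billion specks

print(float(converging))   # ≈ 2.0: bounded forever
print(constant_total)      # 1000: already huge, and still growing linearly
```

The converging scheme is exactly the one that requires each speck's badness to depend on how many other specks exist, which is the claim the paragraph above finds hard to believe.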
My guess is that they should reject 4 and say that very intense pains are infinitely bad. But that doesn’t seem very plausible to me. Really, infinitely bad? Mild pains have a finite amount of badness—could it really be that some small jump in pain level spikes things from a finite level to an infinite one?
Imagine trying to make a graph of the badness of a pain as a function of its intensity. The graph would have to run along at finite heights and then, at some threshold intensity, leap discontinuously to infinity.
(A brief note: often there are thought to be multiple scales of badness—so it’s not so much that torture scores infinite on the badness scale dust specks are on, but that it’s lexically worse. Basically, there’s a linear scale on which things as bad as torture are measured, and on that scale, dust specks count for zero or an infinitesimal. For the present graph, I’m using the dust speck scale.)
Now, maybe you think it’s vague at which level of intensity things spike to infinity. But I don’t think this is right. It can’t be vague whether something is infinitely bad. I don’t even think badness can be vague, nor can other fundamental facts, but it definitely seems super sus if it’s vague whether something is infinitely bad.
5 Scope neglect
So far we’ve seen three arguments for thinking that a bunch of dust specks is worse than a torture. On the other side we have something like a single brute intuition. Here I’m going to argue that that one intuition isn’t even trustworthy!
The basic problem is that humans have very bad intuitions when it comes to large numbers. People will pay about the same amount to save 200,000 birds as to save 2,000 birds. Intuitively, a billion years of torture registers to us almost exactly the same as a million years, even though it is 1,000 times worse. This bias leads to a number of clear errors.
In light of this, we don’t have trustworthy intuitions about the badness of infinite dust specks. If we didn’t suffer from this bias, so that a million dust specks intuitively registered to us 0.1% as intensely as a billion dust specks, it’s not at all clear we’d have this intuition.
I find there’s a frame of mind I can get into where I can see why so many dust specks is worse than a torture. Remember, the total amount of time people will spend in mild discomfort from the dust specks is much more than the number of years conscious beings will ever experience. If everyone on Earth had a speck of dust in their eye for their whole life, it would still be infinitesimal compared to infinite dust specks. When one vividly grasps how bad infinite dust specks are, then it begins to seem not so crazy that they’re worse than one torture.
6 Conclusion
In this article, I’ve given three arguments for dust specks being worse than a torture. They all strike me as strong, especially the spectrum argument in section 2. I’ve also argued that the alternative intuition is positively unreliable, because humans don’t think accurately about large numbers. This case was supposed to be an embarrassment for the utilitarian—something that compelled us to abandon our view—and yet in the end, it seems like we have the better side of things!

Even if there were a 'Super-Observer' in the universe who experienced the sum of every independent event, an infinite sum of mild annoyances might still fail to add up to a single instance of torture.
In fact, such a claim is highly plausible. Sometimes, even if you have a trillion small things, their addition is not enough to create a higher level of intensity. We see this phenomenon everywhere in nature. In physics, for example, you can gather a trillion low-frequency radio waves, but they will never have the power to displace an electron like a single gamma ray can. In thermodynamics, a trillion raindrops at 20°C will never "add up" to the scorching heat of a single 10,000°C plasma bolt. We might similarly suggest that a trillion small bad feelings can never equal the horror of one true moment of agony. Simply increasing the quantity of something does not necessarily change its fundamental quality.
In my opinion, the core flaw of the "Replacement Argument" lies right there, in its assumption that suffering is a perfectly linear and infinitely additive variable. Under this purely quantitative view, if we let ϵ represent an infinitesimal unit of discomfort, the theory dictates that a sufficiently large accumulation of these trivial annoyances must eventually outweigh a singular state of profound agony, expressed mathematically as:
$$\lim_{N \to \infty} (N \times \epsilon) > S_{\text{torture}}$$
However, this continuous model might be fundamentally misrepresenting the physiological realities of sentience. Our brains are not simple 19th-century sliders; they do not process information on a linear scale. Instead, they are hyperoptimized data processing machines designed by evolution to sort signals into tiered categories of "minor significance" versus "catastrophic priority."
Viewing the difference between a trillion small discomforts and a single moment of true agony as a massive "state transition" or a "quantum leap" in importance is well grounded in neurobiological facts. Mechanistically speaking, a dust speck triggers low-threshold Aβ fibers that signal the thalamus. As the brain’s gatekeeper, the thalamus identifies these as low-priority "background noise" and filters most of them out. The signals that do survive are processed as minor sensory inputs that lack the biological weight required to engage the brain's survival systems. Torture, conversely, triggers a completely different set of high-threshold nociceptors (Aδ and C fibers). This recruitment ignites the "agony circuits" (the anterior cingulate cortex and the insular cortex), triggering a systemic breakdown of the psychological and physiological self.
This is not merely "intense touch"; it is a fundamentally different state of being. Firing a dust signal a trillion times is never equivalent to firing an agony signal once; you cannot stack low intensity inputs to force a high intensity neurological state. Because evolution has built a sharp "cliff" between these levels of importance, you can never simply add up low priority signals to create a high priority emergency.
Ultimately, the idea that agony possesses a unique intensity that no amount of lower-level pain can ever reach might not only be plausible but analytically necessary if we adopt the view that 'suffering' is not a uniform currency, but a series of discrete state transitions. And as I explained, this model would be far more congruent with evolutionary biology as our neural architecture is hardwired for survival-critical prioritization, rather than the mere arithmetic summation of inputs.
I think by decoupling moral philosophy from the actual mechanics of the nervous system, we risk creating a "theoretically consistent" but biologically impossible ethics. Think of it like this: I can create a fictional physics where gravity works in reverse. My math for calculating orbital mechanics in that universe will be perfectly "internally consistent," but I’ll still never launch a rocket in THIS one.
Ethics should be treated like a branch of physics (specifically, the physics of affective experience), not just a branch of math. In other words, our "moral arithmetic" must be built on the actual hardware of the brain, not on abstract lines that stretch to infinity, and we should treat affective neuroscience as our "Law Book" in the process.
Additional Thought:
While scope neglect is real, I think it is not the reason why we reject the utilitarian calculus. We reject it because we recognize qualitative lexicality. On an experience level we know that certain states are not merely quantitative intensifications of the same feeling but belong to an entirely different ontological order.
Aggregative utilitarians talk about 'Total Badness' as if there’s a giant, cosmic Excel sheet in the sky:) But one might simply reject these frameworks in favor of a person-affecting view, which I find far more intuitive.
Suffering is subject-dependent; it exists only within a conscious vessel. A trillion dust specks in a trillion different eyes are a trillion isolated events. They never 'meet' to form a collective mountain of pain.
1. In Case A (Torture), one consciousness experiences 100% of the agony.
2. In Case B (Dust Specks), no single consciousness experiences more than a 0.000001% discomfort.
If no single observer in the universe experiences a 'catastrophe,' can we truly say a catastrophe has occurred? In my opinion, by aggregating across separate minds, we create a 'phantom suffering' that no one actually feels. There is no 'Super-Observer' in the universe who feels the sum of those trillion specks:)
Additional Thought:
We can also apply John Rawls’s 'Veil of Ignorance' to test whether a trillion dust specks are truly worse than a single case of torture. Imagine you are behind a curtain, about to be born into the world, but you have no idea which 'conscious vessel' you will inhabit. You are given two choices: World A, where one of a trillion beings (possibly you) is tortured, or World B, where every one of the trillion beings gets a dust speck in the eye.
If the 'Total Badness' of a trillion specks were truly greater than torture, a rational person behind the Veil would have to choose World A to avoid the 'larger' catastrophe. I don't know about you but I would never take that gamble. And if you also choose the specks, you admit that the 'Phantom Suffering' of the aggregate is a mathematical fiction:)
Hi Elif. Thanks for the comment.
Aggregation across existing individuals still matters in person-affecting views.
It depends on how catastrophe is defined. I think most people would consider a catastrophe all life on Earth dying painlessly, even though no one would experience anything in the process.
Would you take the gamble if it involved 1 min of excruciating pain? If not, do you think the probability of you experiencing more than 1 min of excruciating pain in your real future is lower than 1 in 1 trillion, 10^-12? The probability of dying in a road injury in high income countries in 2023 was 8.61*10^-5. So a probability of 0.1% of experiencing more than 1 min of excruciating pain in a road injury death would result in an annual probability of a random person in high income countries experiencing more than 1 min of excruciating pain of at least 8.61*10^-8 (= 8.61*10^-5*0.001), 86.1 k (= 8.61*10^-8/10^-12) times as large as 1 in 1 trillion. "At least" because there are other events besides road injury deaths which could lead to more than 1 min of excruciating pain. If you believe the probability of someone experiencing more than 1 min of excruciating pain in their real future is higher than 1 in 1 trillion, would you prefer their life to end painlessly to eliminate the risk of them experiencing excruciating pain?
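The arithmetic in the comment above can be sanity-checked directly, using the comment's own figures (the 0.1% conditional probability is the comment's assumption):

```python
# Checking the comment's arithmetic.
p_road_death = 8.61e-5               # annual probability, high-income countries, 2023
p_excruciating_given_death = 0.001   # assumed 0.1% per the comment

p_excruciating = p_road_death * p_excruciating_given_death
ratio = p_excruciating / 1e-12       # vs. a 1-in-1-trillion chance

print(f"{p_excruciating:.3g}")  # 8.61e-08
print(f"{ratio:.3g}")           # 8.61e+04, i.e. 86.1 k times larger
```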
I think "speck of dust in the eye" was a bad choice for the central example of this debate, because in some situations a speck in your eye can be literally zero painful, and in others it can be actually quite painful and distressing. I think this leads to miscommunications and poor intuitions.
My preferred alternative would be something like "lightly scratching your palm with your fingernail". And while this is technically pain, I find a single light scratch to be so minor that it has literally zero effect on my levels of happiness: in fact I will sometimes do this to myself on purpose when I get sufficiently bored.
I therefore think that premise 1, "mild pain is bad", is wrong for sufficiently small definitions of "mild pain". I think you need a threshold of badness for the argument to work. Furthermore, I think most people who would side with the "dust specks" also have some threshold where they would pick the torture: for example, if it was "punch a billion people in the face vs torture one person".
Hi titotal. A "speck of dust in the eye" is supposed to represent something which decreases welfare very little (considering all effects, including decreasing boredom), thus being very slightly bad, and worth avoiding (all else equal). So one can interpret premise 1 ("mild pain is bad") as "mildly decreasing welfare is bad". I believe the arguments in the post work for an arbitrarily small decrease in welfare. Do you agree?
Executive summary: The author argues that, despite strong contrary intuitions, a sufficiently large number of very mild harms (like dust specks) is worse than a single extreme harm (like torture), and that rejecting this leads to more implausible commitments.
This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.