Thanks, Vasco. I’d like to clarify that Disabling Pain is also a severe, intense level: think of it as the kind of crippling back pain or intense headache that prevents any enjoyment or productivity. Our study found that moving a hen from a furnished cage to a cage-free aviary prevents, on average, hundreds of hours of Disabling Pain during her laying life. Specifically, transitioning to cage-free systems avoids approximately 275 hours of Disabling Pain (https://welfarefootprint.org/laying-hens).
Additionally, as argued in the book, the estimates for Excruciating Pain were extremely conservative (i.e. Cumulative Pain in both cage systems is likely higher than estimated). We'll have full estimates soon, once 'The Welfare Footprint of the Egg' is released.
Thanks, Vasco. I think we’ve clarified where our frameworks diverge—you prioritize maximizing expected welfare, assuming that equivalences across intensities are possible once the time component is introduced (an assumption I don’t share), whereas I tend to emphasize minimizing the most intense forms of suffering. Both approaches have their merits, but they naturally lead to different prioritizations. Perhaps we can just agree to disagree on this point.
Vasco, thank you for inviting me to look at your post. Here are some considerations.
Point 1: Uncertainty in Hedonic Capacity for Primitive Organisms
Our recent EA Forum post explores the question of hedonic capacity for primitive organisms like ants, termites, and nematodes. I personally believe there are only weak biological grounds, whether in neurological capacity or in behavioral prioritization needs, to support the view that these species can reach high-intensity suffering levels, which are our primary ethical concern. Therefore, in my view, the suffering an ant or a nematode can experience is not comparable to what a pig, cow, or chicken can experience, making the former not a moral priority.
Point 2: The Risks of Aggregating Intensities and Durations
You noted in one message you sent me that, as a classical utilitarian, you are indifferent between averting 1 billion animal-years of low-intensity suffering and 1 animal-year of high-intensity suffering, as long as expected welfare increases. That’s a revealing point and highlights precisely why we at the Welfare Footprint Institute believe it is not advisable to create equivalences between intensities based on duration. Aggregation of this sort, besides lacking a sound empirical basis (how much Annoying Pain is equivalent to one hour of Excruciating Pain: one month? one million years?), can actually mask and divert attention from what some of us consider really important and of primary moral concern: minimizing high levels of Pain (namely Excruciating and Disabling, in the Welfare Footprint classification). For this reason, our methodology explicitly measures time in distinct intensity categories rather than collapsing them into a single score, using tools like the Cumulative Pain metric (see also our EA Forum piece “Short Agony or Long Ache?”).
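To make the aggregation concern concrete, here is a minimal Python sketch. Every number in it is hypothetical: the hours and both sets of weights are invented for illustration and are not Welfare Footprint estimates. The category names follow the Welfare Footprint classification.

```python
# Illustrative only: the hours and weights below are made-up assumptions.
INTENSITIES = ("Annoying", "Hurtful", "Disabling", "Excruciating")

# Two hypothetical profiles of time spent (hours) in each intensity category.
scenario_a = {"Annoying": 1000.0, "Hurtful": 0.0, "Disabling": 0.0, "Excruciating": 0.0}
scenario_b = {"Annoying": 0.0, "Hurtful": 0.0, "Disabling": 0.0, "Excruciating": 1.0}


def collapse(hours, weights):
    """Collapse a per-intensity profile into one score using assumed weights."""
    return sum(hours[k] * weights[k] for k in INTENSITIES)


# Two arbitrary weightings; nothing empirical favors one over the other.
weights_x = {"Annoying": 1.0, "Hurtful": 10.0, "Disabling": 100.0, "Excruciating": 1000.0}
weights_y = {"Annoying": 1.0, "Hurtful": 10.0, "Disabling": 100.0, "Excruciating": 100000.0}

for label, weights in (("weights_x", weights_x), ("weights_y", weights_y)):
    print(label, collapse(scenario_a, weights), collapse(scenario_b, weights))
```

Under the first weighting the two scenarios come out equal; under the second, the one hour of Excruciating Pain dominates. The single score hides the fact that the conclusion depends entirely on the assumed weights, whereas the per-category profiles stay informative either way.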
Point 3: Prioritizing Farmed Animals for Tractability and Capacity
Just to reinforce a point about our focus on farmed animals: this stems not only from their clear high hedonic capacity but also from tractability—we can much more reliably intervene and measure impacts in these systems.
This is a very thought-provoking idea—thank you Aaron for sharing it. That said, I wonder about the analogy with carbon credits, which are based on the fungibility of carbon: one ton emitted can, in principle, be balanced by one ton absorbed elsewhere. When it comes to sentient experience, things are less straightforward.
For example, if a laying hen endures 200 hours of Disabling Pain, what would it mean to “offset” that suffering? Supporting a happier life for another animal may be valuable in itself, but it doesn’t reverse or neutralize the original experience. Each animal is a distinct individual, and pain—unlike carbon—cannot be canceled by pleasure elsewhere.
From a practical standpoint too, the goal should be to ensure funding is tied as tightly as possible to direct improvements at the source of suffering. The risk of a credit market is that it can introduce a layer of abstraction, where the focus shifts from making a specific farm better to simply trading units of 'welfare' to balance a ledger.
Speaking from the perspective of the Welfare Footprint approach (apologies for the self-reference), I see real potential in identifying reforms that can prevent large amounts of intense suffering in a measurable way. For instance, if evidence shows that implementing electrical stunning in a shrimp slaughter facility could avert, say, one billion hours of Disabling Pain and one hundred thousand hours of Excruciating Pain annually—and if that reform costs $200,000—then this creates a clear and actionable opportunity to “pay to reduce time in intense pain” directly. That might align well with what you're suggesting, while avoiding some of the conceptual challenges that arise from offsetting.
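For what it’s worth, the arithmetic behind that hypothetical shrimp-stunning example can be sketched in a few lines of Python. The figures are the illustrative ones from the paragraph above, not real estimates, and the variable names are mine.

```python
# Hypothetical figures taken from the example above (not real estimates).
cost_usd = 200_000
disabling_hours_averted = 1_000_000_000   # hours of Disabling Pain per year
excruciating_hours_averted = 100_000      # hours of Excruciating Pain per year

# Cost per hour averted, reported separately for each intensity category.
usd_per_disabling_hour = cost_usd / disabling_hours_averted
usd_per_excruciating_hour = cost_usd / excruciating_hours_averted

# The same figures expressed as hours averted per dollar.
disabling_hours_per_usd = disabling_hours_averted / cost_usd
excruciating_hours_per_usd = excruciating_hours_averted / cost_usd

print(f"USD per Disabling hour averted:    {usd_per_disabling_hour:.4f}")
print(f"USD per Excruciating hour averted: {usd_per_excruciating_hour:.2f}")
print(f"Disabling hours averted per USD:    {disabling_hours_per_usd:,.0f}")
print(f"Excruciating hours averted per USD: {excruciating_hours_per_usd:.1f}")
```

Note that the same $200,000 buys both reductions at once, so the two per-hour figures are alternative views of one intervention rather than costs to be added together. The categories are also kept separate here instead of being merged into one number.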
Hi Toby, thank you for your kind words. I might take some time to answer, but I’m happy to continue this back-and-forth (and please feel free to challenge or push on any point you disagree with).
I believe the problem we face is practical in nature: we currently lack direct access to the affective states of animals, and our indirect methods become increasingly unreliable as we move further away from humans on the evolutionary tree. For instance, inferring the affective capacity of a reptile is challenging, let alone that of an arthropod or annelid. But when you mention the caveat “even in principle,” I feel much more optimistic. I do believe that, in principle, how affect varies can be projected onto a universal scale—so universal that it could even compare affective experiences across sentient beings on other planets or in digital minds that have developed hedonic capacity.
Despite the variety of qualitative aspects (e.g., whether Pain stems from psychological or physical origins, or signals an unfulfilled need, a threat, damaged tissue, or a desire), the goodness or badness of a feeling—its ‘utility’—should be expressible along a single dimension of real numbers, with positive values for Pleasure, negative values for Pain, and zero as a neutral point. Researchers like Michael Mendl and Elizabeth Paul have explored similar ideas using dimensional models of affect, suggesting that valence and arousal might offer a way to compare experiences across species, which supports the idea of a universal scale—though they also note the empirical gaps we still face.
So, I see this challenge as a technical and scientific issue, not an epistemological one. In other words, I’m optimistic that one day we’ll be able to say that a Pain value of, let’s say, -2.456, represents the same amount of suffering for a human, a fish, or a fly—provided they have the neurological capacity to experience this range of intensities. I recognize this is a bold claim, and given the current lack of empirical data, it’s highly speculative—perhaps even philosophical. But this is my provisional opinion, open to change, of course! :)
Hi Bob, thank you for this valuable comment.
You’re correct that there’s an apparent tension in how we frame Pain intensity categories. We addressed this briefly in our footnote [1]. Let me confirm: these categories are human-indexed and absolute, anchored to the intensity levels humans can experience. For example, Excruciating Pain represents the maximum intensity a human might feel under extreme conditions, such as severe torture. We use this human-centric scale because it’s the only reference point we can directly access and define with precision.
This absolute scale seems to conflict with the use of indicators in the operational definitions of intensity categories (such as Pain 'taking priority over most bids for behavioral execution' for Disabling Pain). These indicators are practical proxies to estimate where a species’ experience falls on this scale. However, they are not universal and require species-specific expertise to interpret (we discuss this in the context of differences between indicators and welfare metrics, here). The tension you pointed out—between the absolute nature of the scale and the provisional nature of the indicators, especially when diverse species are considered—is real and stems from the fundamental challenge of not being able to directly measure affective states in other beings.
This brings me to your second point: yes, you’re correct that assessing Pain levels via behavior alone is not sufficient. That is why, in the Cumulative Pain and Cumulative Pleasure methods, intensity attributions use as many diverse indicators as possible to collectively estimate placement on this scale: chiefly the degree of attention demanded by the experience (hence behavioral changes), neurological evidence, pharmacology (dose and type of pain-relieving drugs), and evolutionary reasoning, among others.
Regarding the interspecific aspect the article discusses: this challenge is enormous. We believe that, alongside other approaches, examining some biological constraints on how Range (maximum intensity) and Resolution (discrimination ability) manifest in the simpler neurological structures of primitive sentient organisms can provide insights into their capacity for extreme Pain, at least for this group.
Toby, I really appreciate your detailed and thoughtful feedback.
As you point out, we don’t yet have a way to assign a mathematical equivalence among intensity categories, such as saying one Pain intensity is 10x or 1000x as painful as another. But I believe somehow (most probably heuristically and roughly) minds navigate these comparisons, as they decide whether an expected (or actual) level of Pain outweighs the expected (or actual) level of another source of Pain, guiding their behavior accordingly (as I gather, this reflects the von Neumann-Morgenstern utility theorem’s principle of deriving preferences from risk trade-offs—thanks for bringing it up). That is indeed the whole biological point of having intensities of affective states: to better steer behaviors in the Benthamian direction of minimizing Pain and maximizing Pleasure (which should, overall, ultimately maximize the organism’s net reproduction). 
Moving to your example, I believe one day those equivalences among Pain intensities (and their trade-offs with Pleasure intensities) will be described (we discuss some possibilities in this other post on potential equivalence methods), but for now, the '13x' example you gave isn’t something we think we can estimate yet. It is possible, though, for practical purposes to compare the time spent in each level of affect. This is the assumption behind the Cumulative Pain and Cumulative Pleasure metrics, and it has proven useful and insightful for comparing welfare across conditions.
You also raise a valid concern about our assumption that higher-intensity Pain corresponds to greater 'signal strength' and requires 'additional processing units.' Note that this is a working hypothesis only. But the idea finds some support from neurobiological studies in vertebrates, where increasing Pain intensity has been found to correlate with greater activation of nociceptive pathways and broader neural engagement (which also demands more energy, adding a possible physical component to the possible biological one). We extrapolate this to suggest that, in general, more intense Pain might require more neural resources, hence the mention of 'processing units.' Let me share, for what it’s worth, my personal belief on this: I don’t think there are significant differences between the level of Pain that a primate and a mouse can feel (despite their differences in brain size, and therefore in 'processing units') because the differences lie in cognitive brain systems, not the affective systems that process Pain (the affective-cognitive brain divide; see Panksepp et al. 2017 for an engaging debate on this topic). But I do believe that primitive sentient organisms (such as annelids), despite being able to experience affective states, are not able to experience Pain (and Pleasure) at the levels we (or a mouse or a bird) can. Although the great biological wonder of the sentience threshold has been crossed in these organisms, the processing power of the affective part (or parts) of their nervous systems is still rudimentary. In fact, my personal bet is that primitive sentient organisms are LiLr (as per our classification in this piece), and that higher Range and Resolution evolved as processing power kept increasing during the first millions of years after the onset of sentience (let me also share my review of a book that explores the onset of sentience).
And yet, you’re absolutely right that this is not proven and it might be the case that an organism with a simpler nervous system might represent intense Pain with less processing energy, perhaps through a different mechanism, like a binary 'on/off' response rather than graded signals (this would be the HiLr scenario where primitive sentient organisms might experience intense Pain without distinguishing many states).
Regarding your concerns about the ethical relevance of defining Pain units in terms of signal strength or processing units, note that we’re not proposing that neural metrics (e.g., energy use) should define Pain intensity (not least because biological mechanisms, let alone neurological ones, rarely work in linear ways). Rather, we’re just exploring whether such physiological correlates can shed light on how affective states, including Pain, evolved in primitive sentient organisms.
Thank you very much for this thorough analysis and for the constructive comments.
Cynthia will address the points related to the results of the study, while I’ll focus here on the methodological aspects.
One of the most important points you raise touches on the core of the Welfare Footprint Framework itself: we recognize that inferring the affective states of other beings is enormously challenging—both in scope and depth. This task can never be complete; it will always require revisions and corrections as new evidence becomes available. The Welfare Footprint Framework is, in essence, an attempt to structure this challenge into as many workable, auditable pieces as possible, so that the process of inference can be progressively improved and openly scrutinized.
You are absolutely right that several painful conditions in chickens were not included in this initial analysis. This was a conscious decision—not because those harms are unimportant, but because we had to start with a subset that we judged to be among the most influential and best documented. The framework is designed precisely so that others can build upon it by incorporating additional conditions, refining prevalence estimates, or reassessing intensities. In that sense, this work should be seen as a living model, not a closed dataset.
Regarding the concern about the lack of use of high-quality statistical techniques, our approach is pragmatic. Where robust statistical analyses are feasible, such as in estimating prevalence or duration, they are of course welcome and encouraged. But in areas where measurement is currently impossible, most notably the intensity of affective states, we deliberately avoid mathematical sophistication for its own sake. No amount of elegant equations can compensate for the fact that subjective experience is, for now, beyond direct measurement. What we can do is gather convergent evidence from different sources (e.g. behavior, physiology, neurology, evolutionary reasoning), synthesize that evidence into transparent, revisable estimates, and make every assumption explicit so that others can challenge and adjust them.
As for the legitimacy of this approach, we believe that, while imperfect and always improvable, quantifying affective experiences is vastly more informative than relying solely on indirect indicators such as mortality. Animals can live long, physically healthy lives that are nevertheless filled with frustration, chronic pain, fear, or monotony—forms of suffering invisible to metrics that focus only on death or disease. By directing efforts toward gathering as much evidence as possible to infer the intensity and duration of each stage spent in negative and positive affective states, we can begin to capture what actually matters to the animal.
The framework has also evolved since this analysis was first produced. At that time, we focused primarily on negative affective states, but we have now extended the methodology to include Cumulative Pleasure alongside Cumulative Pain. Positive affective states are now being systematically quantified using the same operational principles, creating a fuller picture of animal welfare.
Finally, we are developing an open, collaborative platform where Pain-Tracks and Pleasure-Tracks can be published, discussed, and iteratively improved by the broader scientific community. Each component of a track—for example, the probability assigned to a certain intensity within a phase of an affective experience—could be challenged and refined, potentially even through expert voting or consensus mechanisms. The aim is to make welfare quantification transparent, dynamic, and collective rather than proprietary.
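As a rough illustration of what a challengeable track component could look like, here is a minimal Python sketch. The data layout, phase names, durations, and probabilities are all hypothetical assumptions made for this example; this is not the platform’s actual schema or any published Pain-Track.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    """One phase of an affective experience (hypothetical structure)."""
    name: str
    duration_hours: float
    intensity_probs: dict  # assumed probability of each intensity during the phase


def expected_hours(phases):
    """Sum probability-weighted hours per intensity, keeping categories separate."""
    totals = {}
    for phase in phases:
        total_p = sum(phase.intensity_probs.values())
        if abs(total_p - 1.0) > 1e-9:
            raise ValueError(f"probabilities in phase '{phase.name}' sum to {total_p}")
        for intensity, p in phase.intensity_probs.items():
            totals[intensity] = totals.get(intensity, 0.0) + p * phase.duration_hours
    return totals


# Made-up example track: every number here is open to challenge and revision.
track = [
    Phase("acute", 2.0, {"Excruciating": 0.1, "Disabling": 0.6, "Hurtful": 0.3}),
    Phase("recovery", 48.0, {"Hurtful": 0.5, "Annoying": 0.5}),
]

result = expected_hours(track)
for intensity, hours in result.items():
    print(f"{intensity}: {hours:.1f} h")
```

Because each probability is an explicit, named input, a reviewer who disagrees with, say, the 0.1 assigned to Excruciating in the acute phase can change that one value and see how the per-intensity totals move.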
Thanks again for putting our work under the microscope—this is exactly what it needs. The Framework is meant to evolve, and feedback like yours helps it grow in the right direction.