
ClaireZabel


Comments

Right, but this requires believing the future will be better if humans survive. I take the OP's point to be that she doesn't agree, or is at least skeptical.

I think the post isn't clear about which of two stances it's taking: "it would make the far future better to end factory farming now" or "the only path by which the far future is net positive requires ending factory farming now". More generally, it's unclear how much of the claim that we should try to end factory farming now is motivated by "if we can't do that, we shouldn't attempt longtermist interventions because they will probably fail" vs. "if we can't do that, we shouldn't attempt longtermist interventions because they are less valuable, because the EV of the future is worse".

Anyway, working to cause humans to survive requires (or at least, is probably motivated by) thinking the future will be better that way. Not all longtermism is about that (see e.g. s-risk mitigation), and those parts are also relevant to the hinge of history question. 

I think, again, the point the OP is trying to make is that we have very little proof of concept of getting people to go against their best interests. So if doing what's right isn't in the AI companies' best interest, the OP wouldn't believe we can get them to do what we think they should.

I am saying that aligning AI is in the best interests of AI companies, unlike the situation with ending factory farming and animal-ag companies, which is a relevant difference. Any AI company that could align their AIs once and for all for $10M would do it in a heartbeat. I don't think they will do nearly enough to align their AIs given the stakes (so in that sense, their incentives are not humanity's incentives), but they do want to, at least a little.

I think this argument is pretty wrong for a few reasons:

  • It generalizes way too far... for example, you could say "Before trying to shape the far future, why don't we solve [insert other big problem]? Isn't the fact that we haven't solved [other big problem] bad news about our ability to shape the far future positively?" Of course, our prospects would look more impressive if we had solved many other big problems. But I think it's an unfair and unhelpful test to pick a specific big problem, notice that we haven't solved it, and infer that we need to solve it first.
  • Many, if not most, longtermists believe we're living near a hinge of history and might have very little time remaining to try to influence it. Waiting until we had first ended factory farming would, on those views, inherently forgo a huge fraction of the time remaining to make a difference. 
  • You say "It is a stirring vision, but it rests on a fragile assumption: that humanity is capable of aligning on a mission, coordinating across cultures and centuries, and acting with compassion at scale." But that's not exactly true; I don't think longtermism rests on the assumption that the best thing to do is to try to directly cause that right now (see the hinge of history link above). For example, I'm not sure how we would end factory farming, but it might require, as you allude to, massive global coordination. In contrast, creating techniques to align AIs might require only a relatively small group of researchers, and a small group of AI companies adopting research that it is in their best interests to use. To be clear, there are longtermist-relevant interventions that might also require global and widespread coordination, but they don't all require it (and the ones I'm most optimistic about don't require it, because global coordination is very difficult).
  • Related to the above, the problems are just different, and require different skills and resources (and shaping the far future isn't necessarily harder than ending factory farming; for example, I wouldn't be surprised if cutting bio x-risk in half ends up being much easier than ending factory farming). Succeeding at one is unlikely to be the best way to practice for succeeding at the other. 

(I think factory farming is a moral abomination of gigantic proportions, I feel deep gratitude for people who are trying to end it, and dearly hope they succeed.)

I think there are examples supporting many different approaches, and it depends immensely on what you're trying to do, the levers available to you, and the surrounding context. E.g., in the bolder, more audacious, less cooperative direction, Chiune Sugihara or Oskar Schindler come to mind. Petrov doesn't seem like a clear example in the "non-reckless" direction, and I'd put Arkhipov in a similar boat (they both acted rapidly under uncertainty in a way the people around them disagreed with, and took responsibility for a whole big situation when it probably would have been very easy to tell themselves that it wasn't their job to do anything other than obey orders and go along with the group). 

Thanks so much, Will! (Speaking just for myself) I really liked and agree with much of your post, and am glad you wrote it!

I agree with the core argument that there's a huge and very important role for EA-style thinking on questions related to making the post-AGI transition go well. I hope EA thought and values play a major role in research on these questions, both because I think EAs are among the people most likely to address them rigorously (and they are hugely neglected) and because I think EA-ish values are likely to lead to particularly compassionate and open-minded proposals for action on them. 

Specifically, you cite my post

“EA and Longtermism: not a crux for saving the world”, and my quote

I think that recruiting and talent pipeline work done by EAs who currently prioritize x-risk reduction (“we” or “us” in this post, though I know it won’t apply to all readers) should put more emphasis on ideas related to existential risk, the advent of transformative technology, and the ‘most important century’ hypothesis, and less emphasis on effective altruism and longtermism, in the course of their outreach.

And say 

This may have been a good recommendation at the time; but in the last three years the pendulum has heavily swung the other way, sped along by the one-two punch of the FTX collapse and the explosion of interest and progress in AI, and in my view has swung too far.

I agree with you that in the intervening time, the pendulum has swung too far in the other direction, and am glad to see your pushback.

One thing I want to clarify (that I expect you to agree with): 

There’s little in the way of public EA debate; the sense one gets is that most of the intellectual core have “abandoned” EA

I think it's true that much of the intellectual core has stopped focusing on EA as the path to achieving EA goals. But I think most of the intellectual core continues to hold EA values and pursue their goals for EA reasons (trying to make the world better as effectively as possible, e.g. by trying to reduce AI risk); they've just updated against that path involving a lot of focus on EA itself. This makes me feel a lot better about both that core and EA than if much of the old core had decided to leave their EA values and goals behind, and I wanted to share it because I don't think it's always very externally transparent how many people who have been quieter in EA spaces lately are still working hard and with dedication towards making the world better, as they did in the past. 

In addition to what Michael said, there are a number of other barriers:

  • Compared to many global health interventions, AI is a more rapidly changing field, and many believe we have less time to have an impact. That leads to a lot more updates per unit of time about cost-effectiveness, making each estimate less useful. E.g., interventions like research on mechanistic interpretability can come into and out of fashion within a few years, and organizations focused on working with one political party might drop vastly in expected effectiveness after an election. In contrast, GiveWell relies on studies that took longer to conduct than most of the AI safety field has existed (e.g., my understanding is that Cisse et al. 2016 took 8 years from start to publication, which is about 2.5x longer than ChatGPT has existed in any form).
  • There is probably a much smaller base of small-to-mid-sized donors responsive to these estimates, making them less valuable
  • There are a large number of quite serious philosophical and empirical complexities associated with comparing GiveWell and longtermist-relevant charities, like your views about population ethics, total utilitarianism vs preference utilitarianism (vs others), the expected number of moral patients in the far future, acausal trade, etc.

    [I work at Open Phil on AI safety and used to work at GiveWell, but my views are my own]
     

Drift isn't the issue I was pointing at in my comment.

I really appreciate this post! I have a few spots of disagreement, but many more of agreement, and appreciate the huge amount of effort that went into summarizing a very complicated situation with lots of stakeholders over an extended period of time in a way that feels sincere and has many points of resonance with my own experience. 

Seconding Ben, I did a similar exercise and got similarly mixed results (with stark examples in both directions), including in some instances you allude to in the post.

Thanks for sharing this, Tom! I think this is an important topic, and I agree with some of the downsides you mention and think they’re worth weighing highly; many of them are the kinds of things I was thinking of in this post of mine when I listed these anti-claims:

Anti-claims

(I.e. claims I am not trying to make and actively disagree with) 

  • No one should be doing EA-qua-EA talent pipeline work
    • I think we should try to keep this onramp strong. Even if all the above is pretty correct, I think the EA-first onramp will continue to appeal to lots of great people. However, my guess is that a medium-sized reallocation away from it would be good to try for a few years. 
  • The terms EA and longtermism aren’t useful and we should stop using them
    • I think they are useful for the specific things they refer to and we should keep using them in situations where they are relevant and ~ the best terms to use (many such situations exist). I just think we are over-extending them to a moderate degree 
  • It’s implausible that existential risk reduction will come apart from EA/LT goals 
    • E.g. it might come to seem (I don’t know if it will, but it at least is imaginable) that attending to the wellbeing of digital minds is more important from an EA perspective than reducing misalignment risk, and that those things are indeed in tension with one another. 
    • This seems like a reason why people who aren’t EAs and just prioritize existential risk reduction are, all else equal, less helpful from an EA perspective than if they also shared EA values, and like something to watch out for, but I don’t think it outweighs the arguments in favor of more existential-risk-centric outreach work.

This isn’t mostly a PR thing for me. As I mentioned in the post, I actually drafted and shared an earlier version of that post in summer 2022 (though I didn’t decide to publish it for quite a while), which I think is evidence against it being mostly a PR thing. I think the post pretty accurately captures my reasoning at the time: people doing this outreach work on the ground were often actually focused on GCRs or AI risk and trying to get others to engage on that, and it felt like they were ending up, for path-dependent reasons, using terms that pointed less well at what they were interested in. Further updates towards shorter AI timelines have moved me substantially towards favoring the term “GCR” over “longtermism”, since I think they increase the degree to which a lot of people mostly want to engage others about GCRs or AI risk in particular. 

Seriously. Someone should make a movie!
