AI Safety is a field that emerged largely from the EA community. EAs made the first significant efforts to build it, and I’d say they have done a pretty good job overall: the number of people working on AI Safety has been growing exponentially. Still, many believe the field-building efforts were far from perfect. One example is that the talent pipeline became over-optimized for producing researchers.

We now seem to be entering a similar stage with AI Welfare. Following Longview’s recent grants, the number of people working on digital sentience is starting to grow. Field-building efforts are likely to begin during this period, and we may soon see a new area, closely connected to EA, develop into an independent field, much as AI Safety did.

AI Welfare is a complex area. Empirical research is difficult and depends on deep philosophical reasoning. Public communication risks being dismissed as fringe and is vulnerable to many failure modes. The public is already forming its own views, and there are many potential pitfalls ahead.

As someone who is planning to found a field-building organization focused on AI Welfare, I found this post quite helpful and wanted to gather more input.

So what are some mistakes you think were made during AI Safety field-building? How can we avoid repeating them while developing the AI Welfare field?

If you would prefer, we can also have a short chat and I can write down your comments here. 

Comments (1)

What mistakes have been made in AI safety field-building? Studies of AI safety, and now AI welfare, have adopted an unconditional anthropomorphism that approaches dogma. Possibilities have been mistaken for inevitabilities, placing problems in a dangerously narrow frame.

By “unconditional” anthropomorphism, I mean an unexamined assumption that all sufficiently capable problem-solving systems will necessarily resemble animal-like entities. This shouldn’t be considered obvious. AI systems, unlike animals (including us), have no evolutionary heritage of selective pressure acting on single-viewpoint, single-action-sequence, world-engaged actors, in which mandatory, physical, germ-line continuity forced a struggle for survival in order to reproduce. AI systems are not shaped by these pressures: they could be, but are not constrained to be, like us. (Note that rational-agent models are themselves anthropomorphic: they were invented as idealized models of human actors.)

The field of AI safety has largely neglected fundamentally different ways in which intelligence could be organized: a vast universe of systems that could be safer, more functional, and, if designed with insight and care, perhaps incapable of suffering. If readers aren’t familiar with what I am referring to, that unfamiliarity itself illustrates the problem. Search [ai agency drexler] for some starting points.
