
I'm considering participating in the ARC (Abstraction and Reasoning Challenge) competition (https://arcprize.org/), and I'm trying to figure out whether participating in it is socially beneficial. Please feel free to double-check my reasoning on that or point out something that I don't know.

I should have thought about this before starting to work on a solution, and I did, but I have learned new things since I started.

ARC is a competition in which the solutions contribute to open-source artificial intelligence progress. I want to know whether that is a good thing or not. Here's my reasoning.

First, let's consider what strategy to use to figure out the right action.

I can see two strategies:

  1. Write down the possible paths along which artificial intelligence could play out, assign probabilities to those paths, assign probabilities conditional on the actions I can choose, and choose the action that most increases the probability of AI going well (a minimal sketch of this calculation follows the list).
  2. Same as before, but ask a reliable prediction market (if one like that exists) to figure out the probabilities of each path. Ask the people who vote in the prediction market to familiarize themselves with my analysis of possible paths so that they make at least as informed a judgment as I would make alone.
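
To make strategy 1 concrete, here is a minimal sketch of the calculation. All path names and probabilities in it are hypothetical placeholders, not my actual estimates; the point is only the shape of the computation: for each action, sum the probabilities of the paths that end well conditional on that action, and pick the action that maximizes that sum.

```python
# Minimal sketch of strategy 1 (all paths and probabilities below are
# hypothetical placeholders, not actual estimates).

# For each path: whether it ends well, and P(path | action) for each action.
paths = {
    "path A": {"good": True,  "participate": 0.32, "abstain": 0.20},
    "path B": {"good": False, "participate": 0.13, "abstain": 0.10},
    "path C": {"good": True,  "participate": 0.25, "abstain": 0.35},
    "path D": {"good": False, "participate": 0.30, "abstain": 0.35},
}

def p_good(action: str) -> float:
    """Probability of a good outcome conditional on taking `action`."""
    return sum(p[action] for p in paths.values() if p["good"])

for action in ("participate", "abstain"):
    print(f"P(good outcome | {action}) = {p_good(action):.2f}")
print("Chosen action:", max(("participate", "abstain"), key=p_good))
```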

Assumptions

I am making the following assumptions.

Potential close-to-immortality

People can potentially become close to immortal thanks to technological advancement. They might not become close to immortal, but what happens in the case where they do is more important than anything else, because the consequences will be experienced for much longer. Therefore, I will only care about what will happen assuming that people do become close to immortal.

It's better to get it right than to get it quickly (but I'm not sure)

Artificial intelligence will have diminishing returns, like every other resource. That is because it can be applied to achieve a finite number of goals. If the goals each have some cost (in terms of how much of the resource you need to use) and some value, then choosing which goals to pursue is a knapsack problem (google it if you don't know it). In the knapsack problem, the maximum attainable value will grow logarithmically with the amount of the resource.
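
As a rough illustration of this diminishing-returns intuition, here is a toy sketch: a standard 0/1 knapsack solver applied to a fixed set of randomly generated goals, evaluated at increasing resource budgets. The goals are made up, and this illustrates the flattening of attainable value rather than proving the logarithmic rate.

```python
import random

# Toy sketch: 0/1 knapsack with randomly generated "goals" (cost, value),
# solved by standard dynamic programming, to show how the maximum
# attainable value flattens out as the resource budget grows.
random.seed(0)
goals = [(random.randint(1, 50), random.randint(1, 100)) for _ in range(40)]  # (cost, value)

def max_value(budget: int) -> int:
    """Optimal total value of goals whose total cost fits within `budget`."""
    best = [0] * (budget + 1)
    for cost, value in goals:
        for b in range(budget, cost - 1, -1):
            best[b] = max(best[b], best[b - cost] + value)
    return best[budget]

for budget in (25, 50, 100, 200, 400, 800):
    print(budget, max_value(budget))
# Doubling the budget adds less and less value once the goals with the
# highest value per unit of cost are already covered.
```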

For that reason, it's more important that humans get AI right than that they get it quickly. Getting it right vs. getting it wrong has consequences for a close-to-infinite amount of time, whereas getting it later costs a little at the beginning, but far in the future it won't matter much exactly when superintelligence started, because of diminishing returns.

Counterargument: if we slow down AI so that we get it right, then some people will die in that time, and they might lose close-to-immortality, i.e. lose out for a close-to-infinite amount of time. About 1% of the population dies each year, so that is quite a lot.
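
To put a rough, very simplified number on this counterargument (assuming a population of about 8 billion and the ~1% annual mortality mentioned above):

```python
# Rough, simplified estimate of the counterargument's magnitude.
population = 8_000_000_000     # ~8 billion people (assumed)
annual_mortality = 0.01        # ~1% of the population dies each year
for delay_years in (1, 5, 10):
    deaths = population * annual_mortality * delay_years
    print(f"{delay_years}-year delay: ~{deaths / 1e6:.0f} million deaths in that time")
```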

Possible paths

  1. Pause -> solving alignment between humans -> good outcome.
  2. Pause or no pause -> Closed AI -> concentration of power -> bad outcome.
  3. Pause or no pause -> Closed AI -> no concentration of power -> good outcome.
  4. Pause or no pause -> Open AI -> alignment -> good outcome.
  5. Pause or no pause -> Open AI -> no alignment -> bad outcome.
  6. Pause or no pause -> Open AI -> biological weapons -> bad outcome.

Explanation of paths:

  1. Pause - means that artificial intelligence development is paused, for example by imposing a limit on how much computational power groups of people (e.g. countries or companies) can use, and, if/when enforceable, also by imposing a limit on algorithmic research.
  2. Solving alignment between humans - means that inequality is stopped from growing, there is no concentration of power, and people can solve collective action problems (problems where one person benefits from doing something that is harmful to society overall). Those problems can be solved through wealth insurance, decentralized government and strong decentralized surveillance.
  3. Closed AI - means that the closed AI (owned by a small group of people, e.g. Anthropic) is much better than open AI (public, locally-run AI, e.g. Llama 3).
  4. Open AI - means that the open AI (public, locally-run AI, e.g. Llama 3) is very close to closed AI (owned by a small group of people, e.g. Anthropic), and it's possible to verify that the open-source AI doesn't contain any malicious backdoors.
  5. Concentration of power - means that power is concentrated in the hands of a small group of people.
  6. Alignment - means that superintelligence acts in the interest of humans.
  7. Biological weapons - means that >25% of people get killed by biological weapons in whose development AI was significantly used.

There are also other paths, but I don't consider them very likely.

Some paths that I don't consider very likely:

  1. Pause AI -> solving alignment between humans -> bad outcome - I don't think a bad outcome is likely after alignment between humans is solved in time (and with a pause we can get enough time), because all negative outcomes can be avoided with the right rules, as long as they can be enforced, and solving alignment between humans involves making it easier to enforce rules (using strong decentralized surveillance). More precisely, maybe some negative outcomes can't be avoided, but on any other path they would be even less likely to be avoided.
  2. Open AI -> concentration of power - if open-source is strong, then concentration of power is unlikely because open-source levels the playing field.
  3. Closed AI -> no alignment or biological weapons - if closed AI is much stronger than open-source, then no alignment or biological weapons are unlikely, because if there are only a few highly capable actors, they can coordinate among themselves and implement safety precautions in their products.

There are also other risks (e.g. AI welfare, cybersecurity, lack of purpose) that I'm not talking about here, because I have reason to believe that they will either solve themselves or won't have an impact for close-to-infinity, so they are not likely to sway my decision.

What would solution to alignment between humans look like?

It could be:

  1. Decentralized governance.
  2. Continuous wealth insurance to stop inequality from growing - with wealth insurance, people give up part of their wealth, which is redistributed among all people; in exchange they receive financial security in the future: if they get unexpectedly poor, they will receive wealth from this redistribution system (a toy sketch follows this list).
  3. Strong decentralized surveillance - thanks to strong surveillance people would be able to enforce any rules. If the surveillance is centralized, then we get Orwell's 1984 which is very bad. But if it's decentralized, then we get utopia.
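
Here is a toy sketch of the wealth-insurance idea from point 2. The contribution rate, the random wealth shocks and the "top up to a floor" payout rule are my own illustrative assumptions, not a worked-out proposal; the point is only to show the basic shape of giving up some wealth now in exchange for a guarantee of not ending up too poor later.

```python
import random

# Toy sketch of wealth insurance (illustrative assumptions only):
# each period everyone pays a fraction of their wealth into a pool,
# everyone's wealth is hit by a random shock, and the pool is used to
# top people up towards a guaranteed floor.
random.seed(1)
CONTRIBUTION_RATE = 0.02   # assumed: 2% of wealth paid in per period
FLOOR = 30.0               # assumed: guaranteed wealth level

wealth = [random.uniform(10, 200) for _ in range(1000)]

for period in range(50):
    # Collect contributions into the common pool.
    pool = sum(w * CONTRIBUTION_RATE for w in wealth)
    wealth = [w * (1 - CONTRIBUTION_RATE) for w in wealth]
    # Random shocks: wealth can unexpectedly grow or collapse.
    wealth = [w * random.uniform(0.5, 1.6) for w in wealth]
    # Payouts: top up the unexpectedly poor towards the floor, while funds last.
    for i, w in enumerate(wealth):
        if w < FLOOR and pool > 0:
            top_up = min(FLOOR - w, pool)
            wealth[i] += top_up
            pool -= top_up

print(f"min wealth after 50 periods: {min(wealth):.1f}")
print(f"max/median wealth ratio: {max(wealth) / sorted(wealth)[len(wealth) // 2]:.1f}")
```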

I may share more about how I see this later.

Increased surveillance can also lead to a loss of uncertainty. Uncertainty helps with achieving equality, because if there is uncertainty about the future, then wealthy people have an interest in trading their wealth for future financial security. Therefore, perhaps increased surveillance should only be introduced after some form of wealth-insurance agreement is established.

Additional thoughts

  1. Speed can help with alignment: if strong AI arrives sooner, while there is less computational power available, then there is a lower risk that AI will become so powerful that we'll lose control.
  2. Open-sourcing things can improve the equality situation. It might also harm it, because people will get to powerful AI more quickly, but on balance it probably helps (if we never open-sourced anything, closed AI would be certain to win).
  3. The solution that I'd share would also help with equality, because it would make decentralized training of AI more probable.
  4. I am very optimistic about inner alignment, because I have ideas for how to solve it that have strong theoretical justification (I might share them later). They work especially well together with my other ideas. I'm not saying that I'm 100% certain that alignment will go well, but I have a very strong conviction relative to other problems.
  5. Participating in ARC could popularize my ideas regarding inner alignment and outer alignment (I could link to them in a paper; I want to compete for the paper award).
  6. Making biological weapons requires materials that are controlled, which lowers the probability of a bad outcome.

Conclusion

Based on the above, it's a trade-off between equality and the fact that fewer people will die if strong AI is developed sooner, versus the risk from biological weapons.

Contributing to ARC will increase equality, because it will level the playing field.

It helps to develop AI faster, so fewer people will die in the meantime.

Additionally, I could use the paper to potentially popularize my ideas about inner alignment and outer alignment.

It also increases the probability of biological weapons killing a lot of people, because it means more capable open AI (from which people can remove safety guardrails).

It's difficult to weigh the benefits against the costs.

It would be good for humans to solve alignment between humans before approaching AI, but I believe that until that is solved, it's best for people to strengthen open-source/decentralized AI so that it doesn't fall behind closed AI companies. So, at the moment, I lean towards participating in the competition. I have a bit more reasoning behind that decision, but I need more time to verbalize it, so I plan to describe it later.

I also don't have a good idea right now of how I could use a prediction market to make a more informed decision.

I'd be happy to hear your thoughts.
