I do independent research on EA topics. I write about whatever seems important, tractable, and interesting (to me).
I have a website: https://mdickens.me/ Much of the content on my website gets cross-posted to the EA Forum, but I also write about some non-EA stuff over there.
My favorite things that I've written: https://mdickens.me/favorite-posts/
I used to work as a software developer at Affirm.
The next-gen LLM might pose an existential threat
I'm pretty sure that the next generation of LLMs will be safe. But the risk is still high enough to make me uncomfortable.
How sure are we that scaling laws are correct? Researchers have drawn curves predicting how AI capabilities scale with the amount of compute and data that goes into training them. If you extrapolate those curves, it looks like the next level of LLMs won't be wildly more powerful than the current level. But maybe there's a weird bump in the curve that happens in between GPT-5 and GPT-6 (or between Claude 4.5 and Claude 5), and LLMs suddenly become much more capable in a way that scaling laws didn't predict. I don't think we can be more than 99.9% confident that there isn't.
How sure are we that current-gen LLMs aren't sandbagging (that is, deliberately hiding their true skill level)? I think they're still dumb enough that their sandbagging can be caught, and indeed they have been caught sandbagging on some tests. I don't think LLMs are hiding their true capabilities in general, and our understanding of AI capabilities is probably pretty accurate. But I don't think we can be more than 99.9% confident about that.
How sure are we that the extrapolated capability level of the next-gen LLM isn't enough to take over the world? It probably isn't, but we don't really know what level of capability is required for something like that. I don't think we can be more than 99.9% confident.
Perhaps we can be >99.99% confident that the next-gen LLM, at its extrapolated capability level, still won't be as smart as the smartest human. But an LLM has certain advantages over humans: it can work faster (at least on many sorts of tasks), it can copy itself, and it can operate computers in ways that humans can't.
Alternatively, GPT-6/Claude 5 might not be able to take over the world, but it might be smart enough to recursively self-improve, and that might happen too quickly for us to do anything about.
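How sure are we that we aren't wrong about something else? I thought of three ways we could be disastrously wrong: about scaling laws, about sandbagging, and about the level of capability required to take over the world.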
But we could be wrong about some entirely different thing that I didn't even think of. I'm not more than 99.9% confident that my list is comprehensive.
On the whole, I don't think we can say there's less than a 0.4% chance that the next-gen LLM forces us down a path that inevitably ends in everyone dying.
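A rough sketch of where a figure like 0.4% could come from, assuming it is the result of combining the four 0.1% doubts above (scaling laws, sandbagging, the capability threshold, and the unknown-unknown). That reading, the labels, and the independence assumption are illustrative and are not spelled out in the text. Simply adding the four numbers gives an upper bound on the chance that at least one goes wrong, and treating them as independent gives almost the same answer.

```python
# Illustrative only: the labels and the independence assumption are
# hypothetical, chosen to mirror the four 0.1% doubts in the text.

def union_bound(probabilities):
    """Upper bound on P(at least one event): the sum, capped at 1."""
    return min(1.0, sum(probabilities))


def at_least_one(probabilities):
    """P(at least one event), assuming the events are independent."""
    p_none = 1.0
    for p in probabilities:
        p_none *= 1.0 - p
    return 1.0 - p_none


# Each doubt is "not more than 99.9% confident", i.e. at least 0.1% risk.
doubts = {
    "scaling laws are wrong": 0.001,
    "LLMs are sandbagging": 0.001,
    "capability threshold is lower than we think": 0.001,
    "something else entirely": 0.001,
}

summed = union_bound(doubts.values())
independent = at_least_one(doubts.values())

print(f"sum of doubts:        {summed:.4%}")
print(f"assuming independence: {independent:.4%}")
```

The two numbers differ only in the fourth significant digit, so at this scale it doesn't matter much which combination rule you use.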
You have a list of "learn to learn" methods, and then you said "Can we haz nice thingss? Futureburger n real organk lief maybs?" I'm not sure I'm interpreting you correctly, but it sounds like you're saying something like:
If we biological humans get sufficiently good at learning to learn, using methods such as the Doman method, mnemonics, etc., then perhaps we can keep up with the rate at which ASI learns things, and thus avoid bad outcomes where humans get completely dominated by ASI.
If that's what you mean, then I disagree. I don't think our current understanding of the science of learning is remotely near where it would need to be to keep up with ASI. In fact, I would guess that even a perfect-learner human brain would never be able to keep up with ASI, no matter how well it learns. Human brains have physical limits. An ASI need not share those limits because it can (e.g.) add more transistors to its brain.
Harangue old-hand EA types to (i) talk about and engage with EA (at least a bit) if they are doing podcasts, etc; (ii) post on Forum (esp if posting to LW anyway), twitter, etc, engaging in EA ideas; (iii) more generally own their EA affiliation.
I think the carrot is better than the stick. Rather than (or in addition to) haranguing people who don't engage, what if we reward people who do engage? (Although I'm not sure what "reward" means exactly)
You could say I'm an old-hand EA type (I've been involved since 2012) and I still actively engage in the EA Forum. I wouldn't mind a carrot.
Will, I think you deserve a carrot, too. You've written 11 EAF posts in the past year! Most of them were long, too! I've probably cited your "moral error" post about a dozen times since you wrote it. I don't know how exactly I can reward you for your contributions but at a minimum I can give you a well-deserved compliment.
I see many other long-time EAs in this comment thread, most of whom I see regularly commenting/posting on EAF. They're doing a good job, too!
(I feel like this post sounds goofy, but I mean it genuinely. I've been up since 4am so I'm not doing my best work right now.)
I don't think it's possible to get the PDF because the publisher owns distribution rights. But if you haven't seen it already, you may be interested in this: https://intelligence.org/the-problem/
It's an article explaining MIRI's views on AI risk. It's not as detailed as the book, but the basic concepts are the same.
What changed between yesterday and today? How did you manage to overcome the 5th obstacle? What I get from section 5 is that you overcame social pressure essentially by deciding to. But why did you decide to, and why now rather than (say) a month ago? Do you think there are any lessons others could take from your experience about how to overcome social pressure?
A list of ideas:
I've seen a number of people I respect recommend Horizon, but I've never seen any of them articulate a compelling reason why they like it. For example in that comment you linked in the footnote, I found the response pretty unpersuasive (which is what I said in my follow-up comment, which got no reply). Absence of evidence is evidence of absence, but I have to weigh that against the fact that so many people seem to like Horizon.
A couple weeks ago I tried reaching out to Horizon to see if they could clear things up, but they haven't responded. Although even if they did respond, I made it apparent that the answer I'm looking for is "yes Horizon is x-risk-pilled", and I'm sure they could give that answer even if it's not true.