
David Mathers🔸

5759 karma · Joined

Bio

Superforecaster, former philosophy PhD, Giving What We Can member since 2012. Currently trying to get into AI governance. 

Posts: 11 (sorted by new)

Comments: 684

I think my basic reaction here is that longtermism is importantly correct about the central goal of EA if there are longtermist interventions that are actionable, promising and genuinely longtermist in the weak sense of "better than any other cause because of long-term effects", even if there are zero examples of LT interventions that meet the "novelty" criterion, or lack some significant near-term benefits. 

Firstly, I'd distinguish here between longtermism as a research program, and longtermism as a position about which causes should be prioritized right now by people doing direct work. At most, criticisms about novelty seem relevant to evaluating the research program, and to deciding whether to fund more research into longtermism itself. I feel like they should be mostly irrelevant to people actually doing cause prioritization over direct interventions. 

Why? I don't see why longtermism wouldn't count as an important insight for cause prioritization if thinking longtermistically didn't turn up any new interventions that weren't already known to be good, but did change the rankings of interventions so that I changed my mind about which interventions were best. That seems to be roughly what longtermists themselves think is the situation with regard to longtermism. It's not that there is zero reason to do X-risk-reduction-type interventions even if LT is false, since they do benefit current people. But the case for those interventions being many times better than other things you can do for current people and animals rests on, or at least is massively strengthened by, Parfit-style arguments about how there could be many happy future people. So the practical point of longtermism isn't necessarily to produce novel interventions, but also to help us prioritize better among the interventions we already knew about. Of course, the idea that Parfit-style arguments are correct in theory is older than using them to prioritize between interventions, but so what? Why does that affect whether or not it is a good idea to use them to prioritize between interventions now?

The most relevant question for what EA should fund isn't "is longtermist philosophy post-2017 simultaneously impressively original and of practical import" but "should we prioritize X-risk because of Parfit-style arguments about the number of happy people there could be in the future?" If the answer to the latter question is "yes", we've agreed EAs should do what longtermists want in terms of direct work on causes, which is at least as important as how impressed we should or shouldn't be with the longtermists as researchers.* At most the latter is relevant to "should we fund more research into longtermism itself?", which is important, but not as central as what first-order interventions we should fund. 
To put the point slightly differently, suppose I think the following: 

1) Based on Bostrom- and Parfit-style arguments (and don't forget John Broome's case that making happy people is good, which I think was at least as influential on Will and Toby), the highest-value thing to do is some form of X-risk reduction, say biorisk reduction for concreteness.

2) If it weren't for the fact that there could exist vast numbers of happy people in the far future, the marginal benefits to current and near-future people of global development work would be higher than those of biorisk reduction, and global development should be funded by EA instead, although biorisk reduction would still have significant near-term benefits, and society as a whole should have more than zero people working on it. 

Well, then I am pretty clearly a longtermist, and it has made a difference to what I prioritize. If I am correct about 1), then it has made a good difference to what I prioritize, and if I am wrong about it, it might not have done. But how novel 1) would have been if said in 2018, or what other insights LT produced as a research program, is completely irrelevant to whether I am right to change cause prioritization based on 1) and 2).  
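To make 1) and 2) concrete, here is a toy sketch of the comparison I have in mind. Every number in it is invented purely for illustration and is not a real cost-effectiveness estimate:

```python
# Toy illustration of claims 1) and 2). All numbers are invented for the
# example and are not real cost-effectiveness estimates.

# Near-term benefit per unit of funding, in arbitrary "welfare units".
near_term_value = {
    "global_development": 100.0,  # assumed clearly better on near-term effects alone
    "biorisk_reduction": 30.0,    # still a significant near-term benefit
}

# Extra expected value from long-run effects, driven by the Parfit-style premise
# that a catastrophe would prevent vast numbers of happy future people.
long_term_value = {
    "global_development": 5.0,
    "biorisk_reduction": 500.0,
}

def best_option(include_long_term: bool) -> str:
    """Return the cause with the highest total expected value per unit of funding."""
    totals = {
        cause: near_term_value[cause]
        + (long_term_value[cause] if include_long_term else 0.0)
        for cause in near_term_value
    }
    return max(totals, key=totals.get)

print(best_option(include_long_term=False))  # global_development -> claim 2)
print(best_option(include_long_term=True))   # biorisk_reduction  -> claim 1)
```

The point is just that the same near-term facts are consistent with either ranking; what flips it is the weight given to long-run effects.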

None of this is to say 1), or its equivalent about some other purported X-risk, is true. But I don't think you've said anything here that should bother someone who thinks it is. 

Year of AGI

25 years seems about right to me, but with huge uncertainty. 

I think on the racism front, Yarrow is referring to the perception that the reason Moskovitz won't fund rationalist stuff is that either he thinks a lot of rationalists believe Black people have lower average IQs than whites for genetic reasons, or he thinks that other people believe that and doesn't want the hassle. I think that belief genuinely is quite common among rationalists, no? Although there are clearly rationalists who don't believe it, and most rationalists are not right-wing extremists as far as I can tell. 

What have EA funders done that's upset you? 

Not everything being funded here even IS alignment techniques. But also, insofar as you just want a generally better understanding of AI as a domain through science, why wouldn't you learn useful stuff from applying techniques to current models? If the claim is that current models are too different from any possible AGI for this info to be useful, why do you think "do science" would help prepare for AGI at all? Assuming you do think that, which still seems unclear to me. 

I asked about genuine research creativity, not AGI, but I don't think this conversation is going anywhere at this point. It seems obvious to me that "does stuff mathematicians say makes up the building blocks of real research" is meaningful evidence that the chance that models will do research-level maths in the near future is not ultra-low, given that capabilities do increase with time. I don't think this is analogous to IQ tests or the bar exam, and for other benchmarks, I would really need to see what you're claiming is the equivalent of the transfer from Frontier Math 4 to real maths that was intuitive but failed. 

The forum is kind of a bit dead generally, for one thing. 

I don't really get on what grounds you are saying that the Coefficient grants are not to people to do science, apart from the governance ones. I also think you are switching back and forth between "no one knows when AGI will arrive, and the best way to prepare just in case is more normal AI science" and "we know that AGI is far off, so there's no point doing normal science to prepare against AGI now, although there might be other reasons to do normal science." 

I guess I still just want to ask: if models hit 80% on Frontier Math by, like, June 2027, how much does that change your opinion on whether models will be capable of "genuine creativity" in at least one domain by 2033? I'm not asking for an exact figure, just a ballpark guess. If the answer is "hardly at all", is there anything short of a 100% clear example of a novel publishable research insight in some domain that would change your opinion on when "real creativity" will arrive? 

I think what you are saying here is mostly reasonable, even if I am not sure how much I agree: it seems to turn on very complicated issues in the philosophy of probability/decision theory, what you should do when accurate prediction is hard, and exactly how bad predictions have to be to be valueless. Having said that, I don't think you're going to succeed in steering conversation away from forecasts if you keep writing about how unlikely it is that AGI will arrive near term. Which you have done a lot, right? 

I'm genuinely not sure how much EA funding for AI-related stuff even is wasted on your view. To a first approximation, EA is what Moskovitz and Tuna fund. When I look at Coefficient's (i.e. what was previously Open Phil's) 7 most recent AI safety and governance grants, here's what I find: 

1) A joint project of METR and RAND to develop new ways of assessing AI systems for risky capabilities.

2) "AI safety workshop field building" by BlueDot Impact

3) An AI governance workshop at ICML 

4) "General support" for the Center for Governance of AI. 

5) A "study on encoded reasoning in LLMs at the University of Maryland"

6) "Research on misalignment" here: https://www.meridiancambridge.org/labs 

7) "Secure Enclaves for LLM Evaluation" here https://openmined.org/

So is this stuff bad or good on the worldview you've just described? I have no idea, basically. None of it is forecasting; plausibly it all broadly falls under empirical research on current and very near-future models, training new researchers, or governance stuff, though that depends on what "research on misalignment" means. But of course, you'd only endorse it if it is good research. If you are worried about lack of academic credibility specifically, as far as I can tell 7 out of the 20 most recent grants are for academic research in universities. It does seem pretty obvious to me that significant ML research goes on at places other than universities, though, not least the frontier labs themselves. 
 

I guess I feel like: if being able to solve mathematical problems designed by research mathematicians to be similar to the kind of problems they solve in their actual work is not decent evidence that AIs are on track to be able to do original research in mathematics in less than, say, 8 years, then what would you EVER accept as empirical evidence that we are on track for that but not there yet?  

Note that I am not saying this should push your overall confidence to over 50% or anything, just that it ought to move you up by a non-trivial amount relative to whatever your credence was before. I am certainly NOT saying that skill on Frontier Math 4 will inevitably transfer to real research mathematics, just that you should think there is a substantial risk that it will. 

I am not persuaded by the analogy to IQ test scores, for the following reason. It is far from clear that the tasks LLMs can't do despite scoring 100 on IQ tests resemble IQ-test items anything like as closely as Frontier Math 4 tasks are (at least allegedly) designed to resemble real research questions in mathematics*, because the latter are deliberately designed for similarity, whereas IQ tests are just designed so that skill on them correlates with skill on intellectual tasks in general among humans. (I also think the inference towards "they will be able to DO research math" from progress on Frontier Math 4 is rather less shaky than "they will DO proper research math in the same way as humans". It's not clear to me what tasks actually require "real creativity", if that means a particular reasoning style rather than just the production of novel insights as an end product. I don't think you or anyone else knows this either.) Real math is also, I think, uniquely suited to question-and-answer benchmarks, because things really are often posed as extremely well-defined problems with determinate answers, i.e. prove X. Proving things is not literally the only skill mathematicians have, but being able to prove the right stuff is enough to be making a real contribution. In my view that makes claims for construct validity here much more plausible than, say, inferring ChatGPT can be a lawyer if it passes the bar exam. 

In general, your argument here seems like it could be deployed against literally any empirical evidence that AIs were approaching being able to do a task, short of them actually performing that task. You can always say "just because in humans ability to do X is correlated with ability to do Y, doesn't mean the techniques the models are using to do X can do Y with a bit of improvement." And yes, it is always true that it doesn't *automatically* mean that. But if you allow this to mean that no success on any task ever significantly moves you at all about future real-world progress on intuitively similar but harder tasks, you are basically saying it is impossible to get empirical evidence that progress is coming before it has arrived, which is just pretty suspicious a priori. What you should do, in my view, is think carefully about the construct validity of the particular benchmark in question, and then, roughly, update your view based on how likely you think it is to be basically valid, and what it would mean if it was. You should take into account the risk that success on Frontier Math 4 is giving real signal, not just the risk that it is meaningless. 
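To make that last point concrete, here is a minimal sketch of the kind of mixture update I have in mind. All the probabilities are made up purely for illustration:

```python
# Minimal sketch of updating on a benchmark result while being uncertain about
# its construct validity. All probabilities are invented for illustration only.

p_valid = 0.5               # credence that Frontier Math 4 tracks real research-math ability
p_capable_if_valid = 0.7    # P(real research-math contributions in the relevant window | valid signal)
p_capable_if_invalid = 0.1  # P(same | the benchmark success is essentially meaningless)

# Overall credence after seeing strong benchmark progress is a weighted mixture:
p_capable = p_valid * p_capable_if_valid + (1 - p_valid) * p_capable_if_invalid
print(p_capable)  # 0.4 with these made-up inputs

# Even someone who thinks validity is unlikely (say p_valid = 0.2) should not
# end up treating the result as literally zero evidence:
p_capable_sceptic = 0.2 * p_capable_if_valid + 0.8 * p_capable_if_invalid
print(p_capable_sceptic)  # 0.22
```

Even the sceptical inputs leave you noticeably above where the "it's pure noise" reading would put you, which is all I mean by taking the risk of real signal into account.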

My personal guess is that it is somewhat meaningful, and that we will see the first real AI contributions to maths in 6-7 years, i.e. a 60% chance by then of AI proofs important enough for credible mid-ranking journals. To be clear, I say "somewhat" because this is several years after I expect the benchmark itself to saturate. EDIT: I forgot my own forecast here; I expect saturation in about 5 years, so "several" years is an exaggeration. Nonetheless I expect some gap between Frontier Math 4 being saturated and the first real contributions to research mathematics: I guess 6-9 years until real contributions is more like my forecast than 6-7. But I am not shocked if someone thinks "no, it is more likely to be meaningless". I do think, though, that if you're going to make a strong version of the "it's meaningless" case, where you don't see the results as signal to any non-negligible degree, you need more than to just say "some other benchmarks, in far less formal domains, apparently far less similar to the real-world tasks being measured, have low construct validity." 

In your view, is it possible to design a benchmark that a) does not literally amount to "produce a novel important proof", but b) is such that improvements on it nonetheless give decent evidence that we are moving towards models being able to do this? If it is possible, how would it differ from Frontier Math 4? 

*I am prepared to change my mind on this if a bunch of mathematicians say "no, actually the questions don't look like they were optimized for this." 

 
