The aforementioned study reported that generative AI adoption in the U.S. has been faster than personal computer (PC) adoption, with 40% of U.S. adults adopting generative AI within two years of the first mass-market product release compared to 20% within three years for PCs. But this comparison does not account for differences in the intensity of adoption (the number of hours of use) or the high cost of buying a PC compared to accessing generative AI. Depending on how we measure adoption, it is quite possible that the adoption of generative AI has been much slower than PC adoption.
14. Alexander Bick, Adam Blandin, and David J. Deming. 2024. The Rapid Adoption of Generative AI. National Bureau of Economic Research.
re1: I can't think of a single metric for a "PC" or "computer" analogue where you start with <<1% usage (as is the case with LLM-mediated chatbots) and get to >20% in 3 years, so I don't think the PC analogy is correct. It's obviously extremely disanalogous/suspicious that they set up a foil and only criticize the minor problems that make the analogue look better for LLM adoption speeds, when the much more obvious disanalogy makes LLM adoption speeds look worse.
"Point 3 is not even an argument, just a restatement of what they believe" Drawing a highly unusual and unmotivated reference class without defending against the most obvious counterarguments and objections is a bad move! Stating reasons for X is not the same as arguing for X against the strongest version of not-X. They do the first; the objection is that they don't do the second, and the unargued reference class is doing all the work. This is also what I mean by "vibes" doing much more of the argument than you seem to believe.
"Point 5 is not an argument either: they are not to blame for how you interpret their "vibes"." It's the title of their post! The equivocation is load-bearing for the paper's reception. If they had titled it "AI as Slow Transformative Technology" or "AI Will Reshape the Economy Over Decades, Not Months," it would have gotten a fraction of the citations, etc, etc. The title and framing do the rhetorical work of 'AI is not a big deal'; the technical content predicts electricity-scale transformation; when talking to journalists or among useful idiots, clarification is not needed; when criticized, the authors retreat to the technical content while keeping the rhetorical benefit of the title.
re 6 "Do you think that AI systems are merely cheating on every single benchmark" No, I think models are systematically good at easily measurable short time-horizon tasks relative to humans.
First, benchmarks have construct-validity problems even when honestly measured. A benchmark is a sample of tasks chosen to be tractable, verifiable, and gradeable, often with short time horizons (and not requiring long-term planning). The set of tasks with those properties is systematically biased toward what models are good at (at least relative to humans): tasks with crisp answers, short context, well-specified inputs, non-novel circumstances, and clean evaluation criteria[1].
Second, even setting construct validity aside, optimization pressure on any specific metric degrades that metric's correlation with the underlying capability, because labs (entirely ~legitimately!) train on data that resembles the benchmark, design architectures that excel at benchmark-shaped problems, and iterate on whatever moves the benchmark number. This is Goodhart's Law operating normally. Most people in AI would not consider this fraud or cheating.
Note that (as I alluded to earlier) my worldview makes different predictions under frozen AI capabilities than N&K's does. N&K believe current (and early 2025-era) AI capabilities will cause dramatic shifts in expert labor, just with decades to diffuse. Whereas my perspective (construct-validity issues mean models are dramatically good at a few things now, but mostly the benchmarks overpredict true ability) says frozen capabilities would not lead to changes more than ~5x larger than what we currently observe, because the binding constraint is in the parts benchmarks don't test.
I probably won't engage further on this thread.
(I have a lot of sympathy towards models having this shape as someone who's maybe 0.5 sd above average at taking tests relative to my estimation of my actual capabilities, myself).
Many people hold up 'AI As Normal Technology' as a reasonable "normal-people" case against the doomer position. I actually think it's wrong in a number of ways and falls flat on its own terms. I think I believe this for reasons mostly orthogonal to being a doomer (except insofar as being a doomer makes me more interested in thinking about AI). If anybody here is interested in fighting the good fight, it might be valuable to do an Andy Masley-style annihilation of the AI As Normal Technology position, trying to stick to minimally controversial arguments and just destroying their arguments with reference to obvious empirical and logical arguments. I suspect it won't be very hard. Eg here are a few obvious reasons they fail:
Overall I think it’s a deeply unserious form of futurism, only held up by Serious Policy People who want to believe in a pre-determined comfortable conclusion.
Should be fun to take down for any of my friends who are bored undergraduates or graduate students interested in destroying bad arguments. Could be an easy way to get a bunch of views on a moderately important topic.
"I feel like if you're going to put something out there in the public sphere as a leader in AI, a bit of timeline conservatism might be prudent."
I see and respect that position, but you can imagine someone saying the opposite: "I feel like if you're going to put something out there in the public sphere as a leader in AI, it's probably prudent to warn people of significant risks that could happen much sooner than people expect, even if you think they're less than 50% likely to happen by then."
Plausibly you can get away with reporting 3-5 numbers.
For 3 numbers, 25th percentile, median, 75th percentile. This is the approach ("interquartile range") used for reporting the SAT score ranges of admitted students at US colleges. So we have at least a prior example of a widely reported figure that people don't think "normal people"/high-schoolers and their parents would have too much trouble understanding.
For 5 numbers, something like 5th percentile, 25th percentile, median, 75th percentile, 95th percentile.
3-5 numbers are obviously harder to communicate than 1 number, and less precise than the full distribution. But hopefully they're clear and useful enough to be good here.
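To make the 3-number and 5-number summaries concrete, here is a minimal sketch of how one might compute them from samples of a forecast distribution. The sample years below are made up purely for illustration (they are not anyone's actual forecast), and `percentile` and `summarize` are hypothetical helper names; the interpolation rule is one common convention among several.

```python
def percentile(sorted_xs, p):
    """Linear-interpolation percentile (p in [0, 100]) of a pre-sorted list."""
    if not sorted_xs:
        raise ValueError("need at least one sample")
    rank = (len(sorted_xs) - 1) * p / 100
    lo = int(rank)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = rank - lo
    return sorted_xs[lo] + (sorted_xs[hi] - sorted_xs[lo]) * frac


def summarize(samples, percentiles):
    """Return {percentile: value} for the requested percentiles."""
    xs = sorted(samples)
    return {p: percentile(xs, p) for p in percentiles}


THREE_NUMBERS = (25, 50, 75)        # interquartile range plus median
FIVE_NUMBERS = (5, 25, 50, 75, 95)  # adds the tails

# Hypothetical forecast samples (arrival years), invented for this example.
forecast_years = [2027, 2028, 2028, 2029, 2030, 2031, 2033, 2035, 2040, 2045, 2060]

print(summarize(forecast_years, THREE_NUMBERS))
print(summarize(forecast_years, FIVE_NUMBERS))
```

The point of the sketch is just that the 5-number version exposes the long right tail that a single median hides, at the cost of two more figures to communicate.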
I think they were optimizing for a combination of concreteness (so there's an exact story to point to, where the 2027 story is "things go roughly as they expected" whereas 2028 and 2031 were pricing in different types of individually unexpected delays[1]), and for memetic value.
Compare: My best estimate is that this project will take me 6 months. However, if you ask me to write it out step-by-step, it'd take me 4 months. The 6 months include buffers for various delays, some expected and some unexpected.
I think for project time estimations as part of a larger plan, the 6-month reply is more useful. But for someone following along on my thinking process, or a manager/colleague/direct report trying to help me optimize, the 4-month step-by-step report might be easier to follow along and/or more useful to critique or improve.
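One way to read the gap between the two figures is as simple arithmetic: the step-by-step story sums the durations when nothing goes wrong, while the overall estimate also prices in delays. The sketch below uses invented steps, probabilities, and delay lengths chosen only so the totals match the 4-month and 6-month figures above, and it treats the "best estimate" as an expected value, which is an assumption on my part.

```python
# Hypothetical plan: (step name, months if nothing goes wrong,
# probability of a delay, extra months if the delay happens).
STEPS = [
    ("scoping", 1.0, 0.5, 1.0),
    ("building", 1.0, 0.5, 1.0),
    ("testing", 1.0, 0.5, 1.0),
    ("shipping", 1.0, 0.5, 1.0),
]


def step_by_step_months(steps):
    """Total duration of the 'everything goes as written' story."""
    return sum(base for _, base, _, _ in steps)


def expected_months(steps):
    """Expected total duration once each step's delay risk is priced in."""
    return sum(base + p_delay * extra for _, base, p_delay, extra in steps)


print("step-by-step story:", step_by_step_months(STEPS), "months")
print("with delay buffers:", expected_months(STEPS), "months")
```

No single step is expected to slip in the written-out story, yet the plan as a whole is expected to take longer, which is why the two numbers can both be honest.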
Oh that's a really good point, thanks. I also get annoyed when people in comments harp on a bad title without providing a better one, instead of engaging with the substance of my arguments.
(I thought it was fine to complain in this case because they clearly benefited a bunch from the equivocation in their title and clearly better alternatives were available, whereas when I have bad titles they tend to be clear own goals in the sense that I both got more flak and also less readership than if I had a better title).