At first I read the article as though the author were trying to show how ridiculous these people are simply by repeating what they say. I guess this is like those people who read Ayn Rand under the impression that her books are satire. It was not an effective approach.
my current favorite trick for reducing "cognitive debt" (h/t @simonw) is to ask the LLM to write two versions of the plan:
- The version for it (highly technical and detailed)
- The version for me (an entertaining essay designed to build my intuition)
I don't know about them, but I would be offended if I were planning something with a collaborator and they decided to give me a dumbed-down, entertaining, children's-storybook version of the plan while keeping all the technical details to themselves.
Also, this is absolutely not what "cognitive debt" means. Technical debt, as I understand it, refers to bad design decisions in software: doing something cheap and easy now and then constantly dealing with the maintenance headaches afterwards. But the very act of working through technical details? That's what we call "thinking". These people want to avoid the burden of thinking.
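To make the contrast concrete, here's a minimal sketch of what technical debt actually looks like (a hypothetical example of my own, not taken from anything quoted above):

```c
/* Hypothetical illustration of technical debt: a decision that is
 * cheap today and a recurring maintenance cost forever after. */
#include <stdio.h>

#define MAX_USERS 100  /* "100 is plenty" -- the cheap, easy choice */

static int scores[MAX_USERS];

void record_score(int user_id, int score) {
    /* Every caller now silently depends on the hardcoded limit.
     * Raising it later means auditing every loop, array, and file
     * format that baked MAX_USERS in; that ongoing cost is the debt. */
    if (user_id < 0 || user_id >= MAX_USERS)
        return;
    scores[user_id] = score;
}

int main(void) {
    record_score(7, 42);
    printf("%d\n", scores[7]);
    return 0;
}
```

The debt is the deferred maintenance, not the thinking you did up front. Working through the details before committing is precisely how you avoid it.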
This is why CCC being able to compile real C code at all is noteworthy. But it also explains why the output quality is far from what GCC produces. Building a compiler that parses C correctly is one thing. Building one that produces fast and efficient machine code is a completely different challenge.
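The distinction is easy to see with a toy function (an illustration of my own, not from the CCC write-up):

```c
#include <stdio.h>

/* Any correct compiler must preserve these semantics; only an
 * optimizing one will also produce good machine code for them. */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

int main(void) {
    printf("%d\n", sum_to(100));  /* 5050, whichever compiler you use */
    return 0;
}
```

A naive compiler can emit the loop exactly as written, spilling total and i to memory on every iteration, and still be a correct C compiler. An optimizer like GCC at -O2 keeps them in registers and can even replace the loop with the closed form n*(n+1)/2. Both count as "compiling real C code"; only one is competitive.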
Every single one of these failures is waved away because supposedly it's impressive that the AI can do this at all. Do they not realize the obvious problem with this argument? The AI has been trained on all the source code that Anthropic could get their grubby hands on! This includes GCC and clang and everything remotely resembling a C compiler! If I took every C compiler in existence, shoved them in a blender, and spent $20k on electricity blending them until the resulting slurry passed my test cases, should I be surprised or impressed that I got a shitty C compiler? If an actual person wrote this code, they would be justifiably mocked (or they're a student trying to learn by doing, and LLMs do not learn by doing). But AI gets a free pass because it's impressive that the slop can come in larger quantities now, I guess. These Models Will Improve. These Issues Will Get Fixed.
Congratulations to the maker of a tool that charges you $20 to remind you to buy milk the next morning.
I thought I was sticking my neck out when I said that OpenAI was faking their claims in math, such as in the whole International Math Olympiad gold-medal incident. But even many peers in my own field are starting to become receptive to these rumors about how AI is supposedly getting good at math. Sometimes I wonder if I'm the one going crazy and sticking my head in the sand.
All I can really do is remember that AI developers act in bad faith (and scientists are genuinely bad at dealing with bad-faith tactics like flooding the zone with bullshit). If the boy has cried wolf 10 times already, pardon me if I ignore him entirely the 11th time.
I would not underestimate how far OpenAI and friends will go to cheat on math benchmarks. In the techbro sphere, math is placed on a pedestal, to the point where Math = Intelligence.
“California is, I believe, the only state to give health insurance to people who come into the country illegally,” Kauffman said nervously. “I think we probably should not be providing that.”
“So you’d rather everyone just be sick, and get everyone else sick?” another reporter asked.
“That’s not what I’m saying,” said Kauffman.
“Isn’t that effectively what happens?” the reporter countered. “They don’t have access to health care and they just have to get sick, right?”
Kauffman contemplated that one for a moment. “Then they have to just get sick,” he said. “I mean, it’s unfortunate, but I think that it’s sort of impossible to have both liberal immigration laws and generous government benefits.”
Do I need to comment on this one?
It is how professors talk to each other in ... debate halls? What the fuck? Yud really doesn't have any clue how universities work.
I am a PhD student right now, so I have a far better idea of how professors actually talk to each other. The way most professors (in math/CS at least) communicate in a spoken setting is by giving talks at conferences. The cool professors use chalkboards, but most people these days use slides. As it turns out, debates are really fucking stupid for scientific research, for several reasons:
- Science assumes good faith from everyone, and debates are needlessly adversarial. This is why everyone just presents and listens to talks.
- Debates are actually really bad for the kind of deep analysis and thought needed to understand new research. Seriously considering a novel idea is hard when you're expected to come up with a response in the next few minutes.
- Debates generally favor the people with good rhetoric who can package their ideas neatly, not the people who actually have the more interesting ideas.
- If you want to justify a scientific claim, you do it with experiments and evidence (or a mathematical proof when applicable). What purpose does a debate serve?
I think Yud's fixation on debates and "winning" reflects what intellectualism is to him: merely a means to an end. The real goal is to be superior and to beat other people.
In my experience most people just suck at learning new things, and vastly overestimate the depth of expertise. It doesn't take that long to learn how to do a thing. I have never written a song (without AI assistance) in my life, but I am sure I could learn within a week. I don't know how to draw, but I know I could become adequate for any specific task I am trying to achieve within a week. I have never made a 3D prototype in CAD and then used a 3D printer to print it, but I am sure I could learn within a few days.
This reminds me of another tech bro from many years ago who also thought that expertise is overrated, and things really aren't that hard, you know? That belief eventually led him to publicly claim that he could beat Magnus Carlsen at chess after a month of practice. The WSJ picked up on this and decided to sponsor an actual match between him and Carlsen. They wrote a fawning article about it, but it did little to stop his enormous public humiliation in the chess community. Here's a reddit thread discussing the incident: https://www.reddit.com/r/HobbyDrama/comments/nb5b1k/chess_one_month_to_beat_magnus_how_an_obsessive/
As a sidenote, I found it really funny that he thought his best strategy was literally to train a neural network and ... memorize all the weights and run inference with mental calculations during the game. Of course, on the day of the match, the strategy was not successful because his algorithm "ran out of time calculating". How are so many techbros not even good at tech? Come on, that's the one thing you're supposed to know!
Just had a conversation about AI where I sent a link to Eddy Burback's "ChatGPT Made Me Delusional" video. The other person clarified that no, it's only smart people who become more productive with AI, since they can filter out all the bad outputs; only dumb people would suffer the negative effects. I don't know what to fucking say.
More AI bullshit hype in math. I only saw this just now, so this is my hot take. So far, I'm trusting this r/math thread the most, since it has opinions from actual mathematicians: https://www.reddit.com/r/math/comments/1o8xz7t/terence_tao_literature_review_is_the_most/
Context: Paul Erdős was a prolific mathematician with more of a problem-solving style of math (as opposed to a theory-building style). Fittingly, he posed over a thousand problems to the math community that he couldn't solve himself, and several hundred of them remain unsolved. With the rise of the internet, someone had the idea to compile and maintain the status of all known Erdős problems on a single website (https://www.erdosproblems.com/). The site is still maintained by that one person, a fact that will be important later.
Terence Tao is a present-day prolific mathematician, and in the past few years he has really tried to engage with AI in as much good faith as possible. Recently, some people used AI to find papers containing solutions to problems still listed as unsolved on the Erdős problems website, and Tao pointed this out as one possible use of AI. (I personally think there should be better algorithms for searching the literature. I also think conflating this with general LLM claims and the marketing term "AI" is bad-faith argumentation.)
You can see what the reasonable explanation is: math is such a large field now that no one can keep tabs on all the progress happening at once. The single person maintaining the website missed a few problems that got solved (he didn't see the solutions, and/or the authors never bothered to inform him). But of course, the AI hype machine got going real quick: GPT5 managed to solve 10 unsolved problems in mathematics! (https://xcancel.com/Yuchenj_UW/status/1979422127905476778#m, original now deleted due to public embarrassment) It turns out GPT5 had just searched the web/its training data for solutions that humans had already found. The math community gets a discussion about how to make the literature more accessible, and the rest of the world gets a scary story about how AI is going to be smarter than all of us.
There are a few promising signs that this is getting shut down quickly (even Demis Hassabis, CEO of DeepMind, called the hype out as blatantly obvious). I hope this is a sign of things to come for the AI bubble in general.
EDIT: Turns out it was not some rando spreading the hype, but an employee of OpenAI. He has retracted his original claim, though not without salvaging what he can by saying AI is still great at literature review. At this point, I am skeptical this episode even proves that much. After all, the issue was that a website maintained by a single person had not updated the status of 10 problems in a list of over 1000. Do we have any control experiments showing that a conventional literature review would have done much worse?
For all the talk about these people being "highly agentic", it is deeply ironic how all the shit they do has no meaning or purpose. I hear all this sound and fury about making millions off of ChatGPT wrappers, meeting senators in high school bathrooms, and sperm races (?), and I wonder what the point is. Silicon Valley hagiographies used to at least have a veneer that all of this was meaningful. Are we supposed to emulate people just because they happen to temporarily hold a few million dollars?
Even though the material conditions of working in science are not good, I'd still rather do science than whatever the hell they're doing. I would be sick at the prospect of being a "highly agentic" person in a "new and possibly permanent overclass", where my only sense of direction is a vague voice in my head telling me that I should be optimizing my life in various random ways, and my only motivation is the belief that I have to win harder and score more points on the leaderboard. (In any case, I believe this "overclass" is a lot more fragile than the author seems to think.)