Yeah for some reason they never covered that in the stats lectures
If the growth is superexponential, we make it so that each successive doubling takes 10% less time.
(From AI 2027, as quoted by titotal.)
This is an incredibly silly sentence and is certainly enough to determine the output of the entire model on its own. It necessarily implies that the predicted value becomes infinite in a finite amount of time, disregarding almost all other features of how it is calculated.
To elaborate, suppose we take as our "base model" any function f which has the property that lim_{t → ∞} f(t) = ∞. Now I define the corresponding "super-f" function by saying that each subsequent block of "virtual time", as seen by f, takes 10% less "real time" than the last. This gives a function like g(t) = f(-log(1 - t)) (up to rescaling constants), obtained by inverting the exponential rate of convergence of a geometric series. Then g has a vertical asymptote to infinity regardless of what the function f is, simply because we have compressed an infinite amount of "virtual time" into a finite amount of "real time".
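To see the blowup numerically, here's a minimal sketch in Python (the 10% shrink factor is the only thing taken from the quote; the starting value, the length of the first doubling, and the function names are my own assumptions): the doubled quantity grows without bound while the elapsed real time never exceeds a finite cap.

```python
# Minimal sketch of the finite-time blowup. The 10% shrink factor comes from the
# quoted rule; everything else (first doubling length, names) is assumed for illustration.

def super_doubling_times(n_doublings, shrink=0.9, first_interval=1.0):
    """Cumulative "real time" elapsed after each successive doubling."""
    times = []
    t, dt = 0.0, first_interval
    for _ in range(n_doublings):
        t += dt
        times.append(t)
        dt *= shrink  # each successive doubling takes 10% less time
    return times

times = super_doubling_times(200)
# The quantity being doubled is 2**k after k doublings, so it goes to infinity,
# but the elapsed real time is a geometric series bounded by 1 / (1 - 0.9) = 10.
print(times[9], times[99], times[199])  # ~6.51, ~9.9997, ~10.0
```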
The actual pathfinding algorithm (which is surely just A* search or similar) works just fine; the problem is the LLM which uses it.
Apparently MIT is teaching a vibe coding class:
How will this year’s class differ from last year’s? There will be some major changes this year:
- Units down from 18 to 15, to reflect reduced load
- Grading that emphasizes mastery over volume
- More emphasis on design creativity (and less on ethics)
- Not just permission but encouragement to use LLMs
- A framework for exploiting LLMs in code generation
ok i watched Starship Troopers for the first time this year and i gotta say a whole lot of that movie is in fact hot people shooting bugs
I read one of the papers. About the specific question you have: given a string of bits s, they're making the choice to associate the empirical distribution to s, as if s were generated by an i.i.d. Bernoulli process. So if s has 10 zero bits and 30 one bits, its associated empirical distribution is Ber(3/4). This is the distribution whose entropy they're calculating. I have no idea on what basis they are making this choice.
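For concreteness, here's a small sketch of that calculation as I read it (the helper name and example string are mine, not from the paper): take the fraction p of one-bits and evaluate the entropy of Ber(p).

```python
import math

def empirical_bernoulli_entropy(bits):
    """Entropy, in bits, of Ber(p), where p is the fraction of 1s in the string."""
    p = sum(bits) / len(bits)
    if p in (0.0, 1.0):
        return 0.0  # degenerate distribution: zero entropy
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# The example above: 10 zero bits and 30 one bits, so p = 3/4.
s = [0] * 10 + [1] * 30
print(empirical_bernoulli_entropy(s))  # ~0.811 bits, i.e. H(Ber(3/4))
```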
The rest of the paper didn't make sense to me - they are somehow assigning a number N of "information states" which can change over time as the memory cells fail. I honestly have no idea what it's supposed to mean and kinda suspect the whole thing is rubbish.
Edit: after reading the author's quotes from the associated hype article I'm 100% sure it's rubbish. It's also really funny that they didn't manage to catch the COVID-19 research hype train so they've pivoted to the simulation hypothesis.
~~For some reason the previous week's thread doesn't show up on the feed for me (and didn't all week)...~~ nvm, i somehow managed to block froztbyte by accident, no idea how
I honestly think anyone who writes "quantum" in an article should be required to take a linear algebra exam to avoid being instantly sacked
Yudkowskian Probability Theory
what a throwback
Isn't it like not real money? Or have they changed that?
decision theory is when there's a box with money but if you take the box it doesn't have money