Wolfram describes what ChatGPT is doing
Written in 2023, this is still a good grounding in just what’s happening under the covers of ChatGPT.
I read this essay by Wolfram when he first published it, and it’s worth revisiting today. ChatGPT was still novel and mysterious when the essay appeared; since then we’ve gone through a proliferation of models and a wave of public concern about AI doom. With the benefit of hindsight, here are some points that stick out for me:
“In the future, …there will be fundamentally better ways to train neural nets”. Our current architecture is not the terminal architecture, and our current hardware substrate is not the terminal hardware substrate. People have different views on Yann LeCun and Beff Jezos (link may be wrong), but on this they both agree, and each can point to interesting alternative paths forward.
Much of what’s important is computationally irreducible. You just cannot get there from here, and more pointedly, you cannot train your way to certain kinds of capabilities. We’ve been wrong about which capabilities those are (essay writing, for example), but the fact itself places a hard limit on what’s possible. I think a lot of real-world processes with dynamic, attentional responses will fall into this category. Reality is path-dependent.
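To make “you cannot get there from here” concrete: Wolfram’s classic example of computational irreducibility is the Rule 30 cellular automaton, where (as far as anyone knows) the only way to learn what row 100 looks like is to compute rows 1 through 99. A minimal sketch in Python, my own illustration rather than code from the essay:

```python
def rule30_step(row):
    """Advance a Rule 30 cellular automaton by one step.

    Rule 30's update is: new_cell = left XOR (center OR right),
    applied to every cell (wrapping at the edges).
    """
    n = len(row)
    return [row[(i - 1) % n] ^ (row[i] | row[(i + 1) % n]) for i in range(n)]

# Start from a single live cell. As far as anyone knows, there is no
# closed-form shortcut to a later row: you have to run every step.
row = [0] * 63
row[31] = 1
for _ in range(20):
    print("".join("#" if cell else "." for cell in row))
    row = rule30_step(row)
```

Three bitwise operations per cell, yet no amount of training data lets you skip ahead. That is the hard limit in miniature.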
As shown by ChatGPT’s struggles with axiomatic problems (e.g., knowing when to close the parentheses in a sentence), it is, at its core, an associative engine. I don’t know whether the most successful applications will lean into this fuzziness, or address the weakness with math-specific subroutines or deterministic approaches like knowledge graphs. You can make a good case for either. I actually find Wolfram’s discussion of this a little shallow (or maybe I don’t grasp his larger point).
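By contrast, the deterministic version of the parenthesis problem is trivial, which is what makes it such a tempting candidate for a subroutine-style fix. A sketch of the kind of thing I mean (again my own illustration, not anything from the essay):

```python
def parens_balanced(text: str) -> bool:
    """Return True if every '(' in text is matched by a later ')'.

    A single depth counter solves exactly the problem the
    associative engine has to approximate statistically.
    """
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:   # a ')' arrived before its '('
                return False
    return depth == 0       # every '(' was eventually closed

assert parens_balanced("a (b (c) d) e")
assert not parens_balanced("a (b (c) d e")
```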
If anyone else has read the essay, I’d love your thoughts on it 18 months later.