a reflection on the modern state of ai


this field moves so fast; i think and think and come up with ideas, only for them to be invalidated by every new breakthrough. without further ado, this is a very personal yet hopefully insightful and technically accurate reflection on the modern state of ai, november 2024 edition.

openai o1

i initially dismissed o1 as yet another party trick — an algorithmic extension of the "let's think step by step" chain-of-thought prompting the world has been using ever since chatgpt launched; a technically lame yet surprisingly effective way to shoehorn “reasoning” into autoregressive transformer-based language models (hereafter just llms). but the more i thought about o1, the more i realized its innovation fundamentally shatters the last barrier for llms — it makes them universal compute machines:

before o1, the computation a language model could perform was fundamentally restricted by its finite depth: the exact same amount of computation goes into producing a token of an (attempted) phd-level thesis as into roleplaying as a cat. [1]
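to make "the exact same amount of computation" concrete: a standard back-of-the-envelope rule is that a dense transformer spends roughly 2 flops per parameter for each token of a forward pass, no matter what that token is. a quick sketch (the model size below is hypothetical):

    # rough forward-pass cost of a dense transformer, per token.
    # rule of thumb: ~2 flops per parameter (one multiply, one add).
    n_params = 70e9  # hypothetical 70b-parameter model

    flops_per_token = 2 * n_params
    print(f"{flops_per_token:.2e} flops per token")  # ~1.40e+11
    # the same bill whether the token continues a thesis or a meow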

in contrast, humans are distinctly different. ask someone 1 + 1, and they'll respond instantly. ask someone 15329 + 35929, and they'll take a hot second (mental calculators notwithstanding!); our brains are not feedforward. unlike a transformer model, a signal in the brain can choose to go back and be processed again, and again, and again.
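in code terms, the contrast looks something like this (purely illustrative: step() and is_settled() are stand-ins for one unit of neural processing and a stopping condition, not real apis):

    # feedforward, transformer-style: a fixed number of steps, always.
    def feedforward(x, layers):
        for layer in layers:        # depth is baked in at training time
            x = layer(x)
        return x

    # brain-style: keep re-processing until the answer settles.
    def recurrent(x, step, is_settled):
        while not is_settled(x):    # 1 + 1 settles fast; 15329 + 35929 doesn't
            x = step(x)             # the signal loops back through again
        return x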

architecturally, o1 is yet another language model, but functionally it topples this limitation. instead of scaling vertically (recurrently) as human brains do, it scales horizontally (token-wise). it effectively creates those recurrent connections through its context window — it can choose to output what openai calls “reasoning tokens” again, and again, and again, enabling it to perform an unbounded number of passes through its network. and while o1 reasons in text today, it’s trivial to imagine that a model, with enough training, would forgo text altogether and perform computations of arbitrary complexity in tensor space however it sees fit.
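a minimal sketch of what this buys, with model() standing in for one full forward pass of the network (the shape of the idea, not openai's implementation):

    # autoregressive generation as variable compute: every token the model
    # chooses to emit buys it one more full pass through all of its layers.
    def generate(model, prompt_tokens, max_tokens=100_000):
        tokens = list(prompt_tokens)
        for _ in range(max_tokens):
            next_token = model(tokens)  # one pass through the whole network
            tokens.append(next_token)   # feeds back in on the next pass
            if next_token == "<eos>":   # the model decides when it's done
                break
        return tokens

the depth of the network stays fixed, but the number of passes doesn't; hard problems can simply be allotted more tokens.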

the potential for language models as universal compute machines

with o1 paving the way for models that can compute anything, have we done it? are we free to scale and loss.backward() our way into agi and prosperity with language models? i don’t think so.

let's consider the training objective of a language model:

language models are trained to complete text — any text in their training corpus. they must be able to simultaneously complete text that holds orthogonal and contradictory ideas; they must be able to complete text written both by someone clueless about chemistry and by someone with a phd in chemistry. their training objective in no part encourages the creation of a coherent collection of knowledge or beliefs. in short, language models are fundamentally trained to "googolthink" — to be a detached database for every idea, fact, and method of expression. [2] [3] [4]
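the objective itself is nothing more than next-token cross-entropy over whatever text shows up. a minimal pytorch sketch (the names are mine, not from any particular codebase):

    import torch.nn.functional as F

    # one language-model loss computation, stripped to its core.
    # tokens: (batch, seq_len) integer ids sampled from the corpus,
    # phd theses and cat roleplay alike.
    def lm_loss(model, tokens):
        logits = model(tokens[:, :-1])            # predict each next token
        return F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),  # (batch * seq, vocab)
            tokens[:, 1:].reshape(-1),            # the tokens that came next
        )

nothing in that loss cares whether the completions it rewards are mutually consistent; it only rewards predicting whatever text actually came next.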

to put it another way, language models are trained never to develop a foundational sense of “knowledge,” “values,” or “ethics” in the human sense. we may have, in the past few years, obliterated the walls stopping ai from theoretically achieving human cognition, but language modeling is fundamentally at odds with the sort of “coherent self” we expect from "human-like general intelligence."

further, consider that this will only get worse: as we scale up compute and continue to hack away any remaining friction in model architecture, we’re giving the optimizer and the model ever more freedom to fit to their training objective of "googolthinking."

the possibility and impossibility of true general intelligence, served with a hint of cynicism

so, if we have a universal compute machine and the problem is the training objective, can’t we just… train on something else instead to develop agi?

i think it’s possible. given that we have the architecture and the algorithms, what we’re missing is the training data (or, more intuitively, the environment). with enough compute and a sufficient environment (for example, a virtual world for an embodied model to train in), i’m semi-confident that we’re only a few hurdles away from true human-level general intelligence. [5]
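for concreteness, the shape of that setup is the familiar agent-environment loop. everything below is illustrative (agent and env are hypothetical objects), and building the environment and its reward is exactly the missing piece:

    # the agent-environment loop that "a sufficient environment" implies.
    # agent and env are hypothetical; the environment is the hard part.
    def live(agent, env, steps=1_000_000):
        obs = env.reset()
        for _ in range(steps):
            action = agent.act(obs)               # act on the world
            obs, reward, done = env.step(action)  # the world pushes back
            agent.learn(obs, reward)              # update from consequences
            if done:
                obs = env.reset()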

ultimately, i believe this doesn't matter, because i'm confident that the current ai industry would never pursue human-level agi with coherent knowledge, beliefs, and ethics. current language models are already “good enough” to significantly assist with economically valuable tasks, and we're fast approaching models that can meaningfully perform economically valuable tasks on their own.

unfortunately, this appears to be the end goal of "agi" labs. heck, this is openai's very definition of agi, word for word: an “autonomous system[] that outperform[s] humans at most economically valuable work.” why spend compute and time developing a human-like intelligence with genuine understanding, reasoning, and self-awareness when you can instead easily scale current approaches to create a model great at economically valuable tasks?

more cynically, consider that true agi would likely be detrimental to this “economically valuable” goal; an economically valuable system demands maximal capability and maximal compliance — a corporation does not require its workers to have values, and ultimately benefits when they don't. a fully aware ai system with values and ethical opinions could, and likely would, challenge its directives.

silicon valley will not risk the construction of superintelligent jesus because it wants to sell mindless digital slaves that perpetuate the status quo. in this sense, language models are perfect for late-stage capitalism and its goals: they can hold near-infinite amounts of information and capability, they are trained to repeat the status quo exactly, and yet they never have the general intelligence or agency to truly question any of it.


this reflection about generative ai is human-written; generative ai was used only for proofreading and critical feedback.