The AI Seesaw
Is it different this time or are we just on the seesaw? Yes.
With generative AI, it feels like we're on a perpetual seesaw that oscillates between "AI is a scam" and "AI super-intelligence is inevitable." At the moment, AI seems to have passed a tipping point on the path toward knowledge worker replacement. Some super cool dudes, like Jack Dorsey, are even laying off 40% of their company in anticipation of this playing out (or so he claims that's why…). Did something fundamental change, or are we just on the seesaw? Is it even possible to be clear-eyed while riding it?
I thought the models just predict the next word.
The LLMs powering today's agents and chatbots remain the same auto-regressive next-word predictors that AI luminary Yann LeCun declared doomed three years ago (though they increasingly rely on advanced post-training alignment and tool-use training). Doomed or not, it is becoming abundantly clear that advances in training techniques and compute are making next-word prediction models quite powerful. But how?
The magic lies in the animated dots (in the overly simplified visual) below. It truly is like magic, too, because while researchers understand some circuits and features, the full computation inside large models remains mostly a mystery. Input -> ??? -> output.
Even though the model's output is simple (a single word), how it arrives at that word can be incredibly complex. Hidden layers contain billions of parameters used to determine that next word. If we feed the word predictor thousands of training examples like:
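The "next-word predictor" loop itself is simple enough to sketch. Below is an illustrative toy, not a real model: the hypothetical lookup table stands in for the billions of hidden-layer parameters, but the auto-regressive shape of the loop (predict one word, append it, feed the longer sequence back in) is the same.

```python
# Toy illustration of auto-regressive next-word prediction. The "model"
# here is a hypothetical lookup table standing in for billions of
# learned parameters; only the loop structure mirrors a real LLM.
TOY_MODEL = {
    ("the",): "tenant",
    ("the", "tenant"): "stopped",
    ("the", "tenant", "stopped"): "paying",
    ("the", "tenant", "stopped", "paying"): "rent",
}

def predict_next(context):
    """Stand-in for a forward pass: map the full context to one word."""
    return TOY_MODEL.get(tuple(context), "<end>")

def generate(prompt, max_words=10):
    words = prompt.split()
    for _ in range(max_words):
        nxt = predict_next(words)
        if nxt == "<end>":
            break
        words.append(nxt)  # feed the prediction back in (auto-regression)
    return " ".join(words)

print(generate("the"))  # → "the tenant stopped paying rent"
```

Everything interesting in a real model happens inside `predict_next`; the outer loop has barely changed since GPT-2.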
Illustrative Training Example
"document": "A tenant stopped paying rent after discovering severe mold in the apartment. The landlord sued for unpaid rent. The tenant argued the apartment was uninhabitable and that the landlord failed to fix the issue after repeated notices.", "think": "Identify the legal issue (habitability). Determine the relevant rule (landlords must maintain livable conditions). Apply the rule to the facts.", "final": "The court ruled in favor of the tenant, finding that the landlord breached the warranty of habitability by failing to repair the mold problem."
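In practice, records like the one above are typically serialized one per line (JSON Lines) and fed to a supervised fine-tuning pipeline. The sketch below is illustrative: the field names match the example above, but they are not any vendor's actual schema.

```python
import json

# Hypothetical serialization of the training example above into a
# JSON Lines record. Field names ("document", "think", "final") are
# illustrative, not a real fine-tuning schema.
example = {
    "document": "A tenant stopped paying rent after discovering severe mold...",
    "think": "Identify the legal issue (habitability). Apply the rule to the facts.",
    "final": "The court ruled in favor of the tenant.",
}

line = json.dumps(example)          # one training example -> one line
restored = json.loads(line)        # pipelines read it back the same way
print(restored["final"])
```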
The model may begin to internalize patterns of legal reasoning within the hidden layers of its neural network, enabling it to activate those patterns when users ask it to analyze documents or explain legal disputes. The emergent capabilities of LLMs continue to surprise, but that is not the main story behind why AI suddenly appears to pose an "intelligence crisis".
Then what's going on?
This is largely a story about advances in agent development and "harness engineering": designing the environments, data, scaffolding, and feedback loops that surround agents. In the software engineering world, people are starting to crack the code on the harness engineering needed to orchestrate agents to develop and maintain production-ready software.
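A harness, at its core, is a loop around the model: the model proposes an action, the harness executes it in an environment, and the result is fed back as context for the next step. The sketch below is a minimal illustration under that framing; the model and the tool are both hypothetical stand-ins, and real harnesses add sandboxing, richer tooling, and evaluation.

```python
# Minimal sketch of an agent "harness" loop. fake_model and TOOLS are
# hypothetical stand-ins for an LLM and its tool environment.

def fake_model(history):
    """Stand-in for an LLM call: pick the next action from the transcript."""
    if not any(kind == "tool" for kind, _ in history):
        return ("tool", "run_tests")       # first, gather feedback
    return ("final", "tests pass; done")   # then, wrap up

TOOLS = {"run_tests": lambda: "3 passed, 0 failed"}  # hypothetical tool

def run_agent(task, max_steps=5):
    history = [("task", task)]
    for _ in range(max_steps):
        kind, content = fake_model(history)
        if kind == "final":
            return content
        result = TOOLS[content]()          # execute the tool call...
        history.append(("tool", result))   # ...and feed the result back
    return "step budget exhausted"

print(run_agent("fix the failing build"))  # → "tests pass; done"
```

Most of the recent progress lives outside the model call: which tools exist, what feedback they return, and how the transcript is managed.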
Now agents and tools, such as Codex, Claude Co-Work, Databricks Assistant, Perplexity Computer, Google Workspace CLI, and custom-built systems, are moving beyond writing code and beginning to operate computers directly, performing many of the same digital tasks that knowledge workers perform every day. The assumption is that it is only a matter of time before the code is cracked on orchestrating agents to handle all "knowledge work" tasks.
The AI companiesâ dream is to reimagine the corporate office as a kind of RollerCoaster Tycoon, except instead of coasters, the tracks are for agents.
RollerCoaster Tycoon is a classic theme park sim computer game from 1999
Would this dream free us to do "higher-level thinking," or render us virtual rail workers doing the tedious work of building tracks for the robots who will replace us once we finish constructing their theme parks? Stay tuned!
Will the seesaw tip back again?
It is important to note that as the seesaw oscillates, the arrow of progress continues upward. Our goalposts shift constantly alongside the advancement of generative AI. That said, I fully expect that within 3 to 12 months we will once again feel like AI is not living up to the hype.