Transcript
ZUaHXwvTSF4 • Grok 5 vs GPT-6: Is AGI Actually Coming by 2027? (The Truth About the Timeline)
Kind: captions
Language: en
You've probably been watching the AI
space explode and wondering, "Okay,
Grok 5, GPT-6, AGI by 2030. Is any of
this actually real? Or is it just tech
Twitter hype?"
Well, I spent weeks digging through
research papers, model cards, benchmark
data, and executive interviews so you
don't have to. And honestly, what I
found was more surprising than I
expected.
The gap between today's AI and something
that could genuinely think like a human
is closing faster than most people
realize. Welcome back to BitBiased AI,
where we do the research so you don't
have to. Join our community of AI
enthusiasts with our free weekly
newsletter: click the link in the
description below to subscribe, and
you'll get the key AI news, tools, and
learning resources to stay ahead. So in this
video, we're going to break down exactly
what Grok 5 and GPT-6 are bringing to
the table, the architectures, the
training approaches, the reasoning
capabilities,
and then ask the big question, do these
models actually move the needle on AGI
timelines? By the end, you'll have a
clear picture of where AI is headed and
why the next 3 to 5 years might be the
most important in tech history.
Let's start with the two models
themselves and what makes them
fundamentally different.
What are Grok 5 and GPT-6?
Before we get into the weeds, let's set
the stage. Grok is Elon Musk's AI. It
lives on X, formerly Twitter, and xAI is
currently training Grok 5 on something
almost incomprehensibly large: the
Colossus supercomputer, with over 1
million H100 GPUs.
Let that sink in for a second. That is
an insane amount of compute. GPT-6, on
the other hand, is OpenAI's next big
move after GPT-5.
The details are still under wraps, but
Sam Altman has been dropping hints,
particularly around memory and
personalization,
suggesting GPT-6 won't just be smarter,
it'll actually remember you across
conversations.
We're working from confirmed specs where
available and reasonable projections
where things are still speculative.
And here's where it gets interesting.
These two models are taking very
different philosophical approaches to
the same problem.
Architecture:
two very different philosophies.
This is where things get technically
fascinating. Grok uses something called
a mixture-of-experts architecture, or
MoE.
Think of it like a company with hundreds
of specialists where only the relevant
ones are called in for each specific
job. Grok 1 was already a 314 billion
parameter model with only about 25% of
its weights activating per token. The
result: incredible efficiency without
sacrificing raw power.
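To make the "company of specialists" analogy concrete, here's a toy sketch of top-k expert routing in plain Python. Everything here, the sizes, the scoring rule, the softmax over chosen experts, is illustrative only, not xAI's actual implementation:

```python
import math
import random

random.seed(0)

# Toy mixture-of-experts (MoE) layer: 8 expert weight matrices,
# but only the top-2 (25% of the weights) run for each token.
N_EXPERTS, TOP_K, D = 8, 2, 4

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

gate = rand_matrix(D, N_EXPERTS)                  # router weights
experts = [rand_matrix(D, D) for _ in range(N_EXPERTS)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def moe_forward(x):
    """Route one token vector x through its top-k experts only."""
    scores = matvec([list(col) for col in zip(*gate)], x)  # one score per expert
    top = sorted(range(N_EXPERTS), key=lambda i: scores[i])[-TOP_K:]
    weights = [math.exp(scores[i]) for i in top]
    total = sum(weights)
    weights = [w / total for w in weights]                 # softmax over chosen experts
    out = [0.0] * D
    for w, i in zip(weights, top):                         # the other experts stay idle
        for j, v in enumerate(matvec(experts[i], x)):
            out[j] += w * v
    return out

token = [random.gauss(0, 1) for _ in range(D)]
out = moe_forward(token)
print(len(out), f"active fraction = {TOP_K / N_EXPERTS:.0%}")
```

The key point the sketch shows: the router scores all eight experts, but only two of them ever multiply the input, which is why sparse models get big parameter counts without paying the full compute cost per token.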
GPT historically has gone the dense
transformer route, more like one massive
generalist brain doing everything at
once.
GPT-5 actually introduced a clever twist:
two models, one fast, one thinking deep,
with a real-time router deciding which
one handles your query.
OpenAI's plan is to eventually merge
those into a single model, which is
likely what GPT-6 becomes.
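A hypothetical sketch of that fast-versus-thinking routing decision. Real routers are learned classifiers, and every name and heuristic below is a stand-in, not OpenAI's API:

```python
# Two stand-in models: a cheap fast one and a deeper "thinking" one.
def fast_model(query: str) -> str:
    return f"[fast] quick answer to: {query}"

def thinking_model(query: str) -> str:
    return f"[deep] step-by-step reasoning for: {query}"

# Illustrative keyword heuristic; a production router would be a
# learned classifier, not a word list.
HARD_HINTS = ("prove", "derive", "debug", "plan", "step by step")

def route(query: str) -> str:
    q = query.lower()
    model = thinking_model if any(h in q for h in HARD_HINTS) else fast_model
    return model(query)

print(route("What's the capital of France?"))   # fast path
print(route("Prove that sqrt(2) is irrational"))  # deep path
```

The design question a merged GPT-6-style model has to answer is exactly this decision point, just moved inside the network instead of in front of it.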
So in simple terms, Grok is sparse and
surgical; GPT is dense and unified. Both
approaches have real advantages, and the
benchmark battle between them is
genuinely close.
Training data: scale that's hard to comprehend.
Both of these models are being trained
on data sets that would take a human
thousands of lifetimes to read through.
Grok 5 likely pulls from web text, code,
multilingual sources, and multimodal
data: images, video, and speech across
dozens of languages. xAI already powers
voice agents in multiple languages, so
we know the data pipeline is broad. GPT-6
will go even bigger: multi-trillion
token corpora of internet data, books,
code, academic papers, and licensed
medical texts.
GPT-5 already scored impressively on a
hard medical benchmark, which tells you
OpenAI is deliberately expanding into
specialized domains.
But here's the thing that often gets
overlooked. It's not just about how much
data, it's about what kind. And both
companies are making very deliberate
choices about quality over pure quantity
as they scale.
Multimodality: way beyond text. Here's a
shift that I think people underestimate.
These are no longer just text models.
Grok already runs Grok Vision, Grok
Imagine for image and video generation,
and Grok Voice for spoken conversation
across languages.
Grok 5 is expected to fully unify all
of that. You'll be able to show it an
image, speak to it, and get a response
that blends all of those inputs
seamlessly.
GPT-6 will match this, building on GPT-5's
vision capabilities, which already hit
84% on a challenging multimodal
benchmark.
But the feature I'm most curious about
is GPT-6's rumored memory system.
Altman has said people want AI that
actually remembers their preferences and
context. Imagine a ChatGPT that six weeks
later still knows your project, your
tone, your goals without you
reexplaining everything. That's not a
small feature. That fundamentally
changes how useful these tools are in
daily work.
Reasoning and planning: this is where
AGI lives. Wait until you see
this part, because this is where the
conversation about AGI actually becomes
grounded. Grok 4 Fast, the current
version, not even Grok 5, already
scored 92% on AIME, one of the hardest
math competitions in the world. GPT-5
scored 94.6%.
Those numbers are separated by rounding
errors. And Grok did it using 40% fewer
thinking tokens, meaning it got there
more efficiently. GPT-5 also topped every
major benchmark: coding at nearly 90%,
science at record levels, multimodal at
84%.
GPT-6 is expected to push all of these
further.
But raw benchmark scores aren't the full
story.
The real test is multi-step reasoning.
Can these models break down a complex
problem, hold context across a long
chain of logic, and arrive at a
nonobvious answer?
Both Grok and GPT are getting much, much
better at this. And that's precisely
what starts to look like the early seeds
of general intelligence.
Tool use and agency:
AI that actually does things. Grok 4
Fast was trained end to end with tool
use built into its core. Not bolted on
as a feature, but baked into how it
thinks. It decides when to run code,
when to search the web, when to pull
data, and it currently sits at the top
of the LMArena search leaderboard,
beating every other model at real-world
search tasks.
GPT-6 will take a similar but distinct
path. ChatGPT already has plugins, code
execution, and API access. GPT-6 looks
set to internalize all of this, plus add
long-term memory as a planning tool.
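The write-run-fix loop described here can be sketched as a plan-act-observe cycle. This is a hard-coded illustration of the control flow, with hypothetical tool names; neither lab's agent code works literally like this:

```python
from typing import Callable

def run_agent(task: str, tools: dict[str, Callable[[str], str]],
              max_steps: int = 5) -> str:
    """Minimal plan-act-observe loop: write code, run it, fix bugs."""
    memory: list[str] = [f"task: {task}"]
    for step in range(max_steps):
        # 1. Plan: pick the next action from the task and memory.
        #    (A real agent would ask the model; we script the choice.)
        if step == 0:
            action, arg = "write_code", task
        elif "bug" in memory[-1]:
            action, arg = "fix_code", memory[-1]
        else:
            action, arg = "run_code", memory[-1]
        # 2. Act: call the chosen tool.
        observation = tools[action](arg)
        # 3. Observe: record the result and loop.
        memory.append(observation)
        if "tests passed" in observation:
            return observation
    return memory[-1]

# Stub tools standing in for a code editor, a runner, and a fixer.
tools = {
    "write_code": lambda t: "draft v1 (has a bug)",
    "run_code":   lambda c: "bug: off-by-one" if "bug" in c else "tests passed",
    "fix_code":   lambda r: "draft v2",
}
print(run_agent("sort a list", tools))  # prints "tests passed"
```

The point is the shape of the loop: each observation feeds the next decision, which is why small misalignments can compound across steps, a theme the safety section below comes back to.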
Think about what that means practically.
An AI that remembers your project from
last week, writes the code, runs it,
finds the bug, fixes it, and sends you
the result without you babysitting every
step. That's the agency threshold, and
we're approaching it faster than most
people expected two years ago.
Benchmarks and emergent abilities.
Let's look at the scoreboard for a
moment. GPT-5 set state-of-the-art on
AIME math at 94.6%, hit 88% on coding
benchmarks, and 84% on multimodal
tasks. Grok 4 Fast matches it nearly
point for point across the board. Both
Grok 5 and GPT-6 should
shatter these numbers.
But here's what matters more than
benchmark scores, emergent abilities.
These are capabilities that nobody
explicitly trained the model for, but
which appear naturally as scale
increases.
GPT-5 surprised researchers with its
instruction-following precision. Grok
surprised people with its reasoning
efficiency.
DeepMind's Demis Hassabis once said, "The
real AGI benchmark isn't whether an AI
can solve a known problem. It's whether
it can invent something from scratch,"
like how Einstein derived relativity.
Grok 5 and GPT-6 won't do that tomorrow.
But with each scale jump, these models
get a little closer to that kind of
genuinely creative self-directed
reasoning.
AGI timelines: what do these models
actually change? This is the question
that matters. And the honest answer is
these models make the optimistic
scenarios meaningfully more plausible.
Here's how experts currently break it
down.
The fast scenario, AGI by around 2030,
would require Grok 5 and GPT-6 to close
the remaining gaps in reasoning and
planning, leading to an intelligence
acceleration between 2028 and 2030.
Most researchers put this at roughly a
20% probability.
It's not the default, but it's not
fantasy either.
The medium scenario, AGI emerging
somewhere between 2032 and 2035, is where
the majority of expert consensus
currently sits. DeepMind's Hassabis has
estimated 3 to 5 years from 2025.
Daniel Kokotajlo, who runs the AI 2027
forecasting project, recently revised his
timeline toward the early 2030s. This
scenario accounts for about 50%
probability and assumes steady,
compounding progress without major
surprises. The slow scenario, AGI
arriving in the mid to late 2030s
reflects the possibility of hitting real
architectural or data bottlenecks.
Stanford's AI experts have explicitly
said they don't expect AGI by 2026.
This accounts for roughly 30%
probability.
The key insight is this: before GPT-4,
these numbers skewed much later. Each
new model generation has consistently
shifted expert medians earlier. Grok 5
and GPT-6 will almost certainly do the
same. Safety, risk, and what keeps
researchers up at night. It wouldn't be
an honest video if we didn't talk about
this. Both xAI and OpenAI publicly
commit to safety testing, model cards,
red teaming, and content filtering.
Grok's published model card explicitly
documents what it filters and how.
OpenAI describes a life cycle of testing
before any model touches the public. But here's
the tension. The more capable and
agentic these models become, the harder
alignment gets.
Picture a model that can plan, browse,
write code, and execute actions across
multiple steps. If that model has even a
small misalignment with what you
actually want, the consequences compound
quickly. That's not science fiction.
That's the core of what AI safety
researchers have been warning about for
years. There are also broader societal
questions. Wider access to Grok 5 on free
tiers is great for democratizing AI, but
the same capability that helps a
researcher accelerate drug discovery
could help someone else do something
harmful. The policy frameworks,
particularly in the EU with upcoming
frontier AI regulations, are trying to
catch up, but they're running behind the
technology. What should you actually
take away from this? Grok 5 and GPT-6 are
not just incremental upgrades. They
represent a genuine step change in what
AI can do in reasoning, in agency, in
multimodal understanding, and in
efficiency.
The AGI debate is no longer theoretical.
It's a timeline question and that
timeline keeps getting shorter.
Whether AGI arrives by 2030, 2033 or
2037, the direction is clear.
These models are closing the gap and the
people building them, Altman, Musk, and
Hassabis, all agree we're on a defined
path. What's not yet defined is what
happens when we arrive.
If this breakdown was useful, drop a
comment below. I'm genuinely curious
whether you think the fast, medium, or
slow scenario is most likely.
And if you want more deep dives like
this as Grok 5 and GPT-6 actually roll
out, make sure you're subscribed,
because this story is only getting more
interesting from here.