In the demo below, our prototype uses trial & error to learn a robotics task from only 3 examples.
Guesses theories about what's there and how it works.
Tests its theories by interacting with its environment.
Learns from its mistakes by revising its theories.
Based on principles from philosophy. No deep learning or statistics.
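The guess–test–revise loop above can be sketched in a few lines. This is a minimal illustration, not the prototype's actual implementation; all names (`conjecture`, `refute`, the toy "push" environment) are assumptions made for this example.

```python
import random

def conjecture(n=5):
    """Guess candidate theories. Here a theory is simply a guessed
    mapping from an action to its predicted effect."""
    return [{"push": random.choice(["moves", "stays"])} for _ in range(n)]

def refute(theories, action, outcome):
    """Keep only theories whose prediction matches what actually happened."""
    return [t for t in theories if t.get(action) == outcome]

# Trial and error: in this toy environment, pushing always moves the object.
theories = conjecture()
for step in range(3):
    surviving = refute(theories, "push", "moves")
    if not surviving:
        theories = conjecture()   # every theory failed: guess fresh ones
    else:
        theories = surviving      # learn by keeping what survived
```

Notably, nothing here is statistical: theories are not weighted or averaged, only kept or rejected outright.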
Sight, sound, etc.
One recording is the "training" episode; the other is the "testing" episode.
The agent's core principles were developed independently of the task, and they form the foundation for solving more complex tasks.
The sensory input at t0 is the first "data" ever seen by the agent.
First, the agent looks for any new problems in its theories and attempts to reconcile them.
In the first timestep, the agent has no theories yet, so there are no problems to reconcile.
For now, we choose the goal ourselves, and we can modify it at any time.
It chooses a strategy for reaching its goal. If it doesn't have any strategies, it needs to invent one.
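The per-timestep flow just described (reconcile problems, then pick or invent a strategy) can be sketched as follows. The data shapes here are assumptions for illustration: theories are predicates over observations, and strategies are functions from an observation to an action.

```python
def step(theories, strategies, observation, goal):
    """One timestep: reconcile problems, then choose a strategy.

    A 'problem' here is a theory the new observation falsifies;
    reconciling means rejecting those theories outright.
    """
    # Reconcile: discard every theory the observation contradicts.
    theories = [t for t in theories if t(observation)]

    # Choose a strategy for the goal; invent a placeholder one if none exists.
    if goal not in strategies:
        strategies[goal] = lambda obs: "explore"   # invented strategy (stub)
    action = strategies[goal](observation)
    return theories, action
```

In the first timestep there are no theories and no strategies, so the reconcile step is a no-op and a strategy must be invented before any action is taken.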
This time, the agent's sensory input reveals a problem with its theories. It reconciles the problem by rejecting one or more theories.
This time, it can rule out more theories because it has more sensory experience.
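Why does more experience rule out more theories? Each observation can only shrink the set of theories consistent with everything seen so far. A toy illustration (the "threshold" theory space below is hypothetical, chosen only to make the shrinking visible):

```python
def consistent(theory, history):
    """A theory survives only if it predicts every observed outcome so far."""
    return all(theory(state) == outcome for state, outcome in history)

# Hypothetical theory space: "the light is on whenever brightness s >= k".
theories = [lambda s, k=k: s >= k for k in range(5)]

history = []
for state, outcome in [(3, True), (1, False), (2, True)]:
    history.append((state, outcome))
    theories = [t for t in theories if consistent(t, history)]
# Each added observation leaves fewer survivors; only k == 2 remains here.
```

After one observation four theories survive; after three, only one does — the same monotone narrowing the agent exploits.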
At t3, it reaches the goal for the first time. But it doesn't yet understand how it reached the goal...
It discards the problematic theories despite reaching the goal.
Let's skip ahead to when it reaches the goal for the third time, at t8.
At t8, the agent reaches the goal for the third time... Importantly, it didn't experience any problems on the way to the goal.
It knows how to reach its goal without encountering any problems.
A problem can come from an error in its theories or from not knowing how to reach its goal.
If the agent doesn't discover any new problems, then it has no "fuel" to continue learning.
When the agent's theories are good-enough to reliably reach the goal without encountering problems, then learning is complete.
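The stopping rule above can be written down directly. This sketch assumes simple per-run bookkeeping, recording whether each run reached the goal and which problems it encountered; "reliably" is modeled as a streak of clean runs.

```python
def learning_complete(episode_log, streak=1):
    """True once the last `streak` runs all reached the goal
    while encountering no problems (no theory errors, no missing strategy).

    episode_log: list of (reached_goal, problems) tuples, one per run.
    """
    if len(episode_log) < streak:
        return False
    return all(reached and not problems
               for reached, problems in episode_log[-streak:])
```

This also captures the "fuel" remark: once runs are problem-free, there is nothing left to refute, so learning naturally halts.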
The agent retains all of its knowledge from training.
It only needs to conjecture theories about what's there, not about how it works.
The agent got lucky and guessed its situation on the first try.
This indicates a threshold level of understanding of this situation, simulation, and task.
Tasks that require less abstraction and reasoning, e.g. assembly, welding, factory tasks.
Tasks that require more abstraction and reasoning, e.g. language, math, decision-making, multi-step processes.
Tasks that require creating and testing new designs or new theories.
We've been able to make meaningful progress in a relatively short time, and we want to reach this threshold as quickly as possible.
There are many interesting problems ahead of us.
We'll systematically work to address increasingly complex tasks and environments.
Contact Collin Kindrom for more information, or with any other questions or ideas.
collinkindrom@gmail.com