███╗   ███╗███████╗███╗   ██╗ █████╗  ██████╗███████╗
████╗ ████║██╔════╝████╗  ██║██╔══██╗██╔════╝██╔════╝
██╔████╔██║█████╗  ██╔██╗ ██║███████║██║     █████╗
██║╚██╔╝██║██╔══╝  ██║╚██╗██║██╔══██║██║     ██╔══╝
██║ ╚═╝ ██║███████╗██║ ╚████║██║  ██║╚██████╗███████╗
╚═╝     ╚═╝╚══════╝╚═╝  ╚═══╝╚═╝  ╚═╝ ╚═════╝╚══════╝
#ai #algorithms #javascript
A 1960s AI That Still Teaches Us

In 1961, Donald Michie built a tic-tac-toe playing "machine" out of 304 matchboxes and colored beads. No computers. No code. Just physical objects and a simple learning rule.

It learned to play tic-tac-toe through trial and error – and it worked.

I recently implemented MENACE (Machine Educable Noughts And Crosses Engine) in TypeScript, and the experience taught me more about reinforcement learning than any neural network tutorial ever did.

→ Try the live demo

How The Original MENACE Worked

The physical MENACE was beautifully simple:

The Setup:
1. Matchboxes: Each matchbox represented a unique board state that MENACE might encounter (304 total).
2. Colored Beads: Inside each matchbox were colored beads. Each color represented a possible move from that board state.
3. The Game: To make a move, you'd find the matchbox for the current board state, shake it, and pull out one bead at random. The color told you which square to play.
4. Learning: After the game:
If MENACE won: add 3 beads of the color it played to each box it used
If MENACE drew: add 1 bead of the color it played to each box it used
If MENACE lost: remove 1 bead of the color it played from each box it used (minimum 1)

That's it. No gradients. No backpropagation. No neural networks. Just beads and matchboxes.

And it learned. After about 200 games, MENACE became nearly unbeatable.

Translating to TypeScript

The core idea translates beautifully to code. Instead of matchboxes and beads, we use objects and numbers.

// Board representation
type Board = Array<'' | 'X' | 'O'>; // 9 squares

// Brain: maps board states to move weights
interface BrainState {
  [boardKey: string]: BeadCount;
}

interface BeadCount {
  [position: number]: number; // position -> number of "beads"
}

// Example brain state:
const exampleBrain: BrainState = {
  "___-___-___": { // empty board
    4: 10,         // center has 10 beads (highly weighted)
    0: 5,          // corners have 5 beads each
    8: 5,          // etc.
  },
  "X__-___-___": { // after opponent plays a corner
    4: 12,         // center is strongly preferred
    2: 3,
    6: 3,
  },
};

The key insight: the number of beads = the probability weight for that move.

If center square has 10 beads and corner has 5, the AI is twice as likely to play center.
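The same arithmetic in code, with made-up bead counts (illustrative numbers, not the demo's actual state):

// 10 beads for the center, 5 for a corner -> the center is picked twice as often
const centerBeads = 10;
const cornerBeads = 5;
const total = centerBeads + cornerBeads;  // 15
console.log(centerBeads / total);         // ~0.67
console.log(cornerBeads / total);         // ~0.33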

Implementation Highlights

1. Choosing a move
function chooseMove(board: Board, brain: BrainState): number {
  const boardKey = serializeBoard(board);

  // Look up the beads for this board state, creating them on first encounter
  // (and storing them, so updateBrain can find the same entry later)
  let weights = brain[boardKey];
  if (!weights) {
    weights = initializeWeights(board);
    brain[boardKey] = weights;
  }

  // Weighted random selection: every bead is one ticket in the draw
  const totalBeads = Object.values(weights).reduce((a, b) => a + b, 0);
  let random = Math.random() * totalBeads;

  for (const [position, beads] of Object.entries(weights)) {
    random -= beads;
    if (random <= 0) {
      return Number(position);
    }
  }

  // Floating-point edge case: fall back to the last candidate move
  const positions = Object.keys(weights);
  return Number(positions[positions.length - 1]);
}
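chooseMove leans on two helpers the post doesn't show. Here is a minimal sketch of how they might look, assuming the "___-___-___" key format from the example brain state and a flat starting bead count; the actual demo may seed its boxes differently:

// Turn a board into the "___-___-___" key format used in the example above
function serializeBoard(board: Board): string {
  const cells = board.map((cell) => (cell === '' ? '_' : cell));
  return [cells.slice(0, 3), cells.slice(3, 6), cells.slice(6, 9)]
    .map((row) => row.join(''))
    .join('-');
}

// Give every empty square a starting bead count. The physical MENACE seeded
// each box with a few beads per legal move; a flat count of 3 plays the same role.
function initializeWeights(board: Board): BeadCount {
  const weights: BeadCount = {};
  board.forEach((cell, position) => {
    if (cell === '') weights[position] = 3;
  });
  return weights;
}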
2. Learning from results
function updateBrain(
  brain: BrainState,
  movesHistory: Array<{ board: Board; move: number }>,
  result: 'win' | 'draw' | 'loss'
): void {
  const beadChange = result === 'win' ? 3 : result === 'draw' ? 1 : -1;

  for (const { board, move } of movesHistory) {
    const boardKey = serializeBoard(board);
    const currentBeads = brain[boardKey][move];

    // Update bead count (minimum 1)
    brain[boardKey][move] = Math.max(1, currentBeads + beadChange);
  }
}
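To see how movesHistory is collected, here is a sketch of one full training game against a random opponent. It assumes MENACE plays 'X' and moves first; getResult and playOpponentMove are illustrative helpers written for this sketch, not taken from the demo:

// All eight winning lines on the 3x3 board
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8], // rows
  [0, 3, 6], [1, 4, 7], [2, 5, 8], // columns
  [0, 4, 8], [2, 4, 6],            // diagonals
];

// Result from MENACE's point of view (MENACE plays 'X'); null = game still running
function getResult(board: Board): 'win' | 'draw' | 'loss' | null {
  for (const [a, b, c] of LINES) {
    if (board[a] !== '' && board[a] === board[b] && board[b] === board[c]) {
      return board[a] === 'X' ? 'win' : 'loss';
    }
  }
  return board.every((cell) => cell !== '') ? 'draw' : null;
}

// A deliberately weak opponent: plays 'O' on a random empty square
function playOpponentMove(board: Board): void {
  const empty = board.flatMap((cell, i) => (cell === '' ? [i] : []));
  board[empty[Math.floor(Math.random() * empty.length)]] = 'O';
}

function playTrainingGame(brain: BrainState): 'win' | 'draw' | 'loss' {
  const board: Board = Array(9).fill('');
  const movesHistory: Array<{ board: Board; move: number }> = [];

  while (getResult(board) === null) {
    // MENACE's turn: remember the state it saw and the move it picked
    const move = chooseMove(board, brain);
    movesHistory.push({ board: [...board], move });
    board[move] = 'X';

    if (getResult(board) === null) {
      playOpponentMove(board);
    }
  }

  const result = getResult(board)!;
  updateBrain(brain, movesHistory, result);
  return result;
}

// Training loop: a few hundred games is enough to see the win rate climb
// const brain: BrainState = {};
// for (let i = 0; i < 300; i++) playTrainingGame(brain);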
3. Persistence with IndexedDB

One of the coolest parts: the brain persists across sessions. When you close the page and come back, MENACE remembers everything it learned.

This creates a powerful loop: the more you play, the smarter it gets. Leave it running for a few hundred games, and it becomes nearly impossible to beat.
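The post doesn't include the persistence code itself, but a minimal sketch with the raw IndexedDB API could look like this; the database and store names are made up, and the real demo may use a wrapper library instead:

// Open (or create) a small database with one object store for the brain
const DB_NAME = 'menace-db';
const STORE = 'brain';

function openDb(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open(DB_NAME, 1);
    req.onupgradeneeded = () => req.result.createObjectStore(STORE);
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

// Save the whole brain under a single key after every game
async function saveBrain(brain: BrainState): Promise<void> {
  const db = await openDb();
  await new Promise<void>((resolve, reject) => {
    const tx = db.transaction(STORE, 'readwrite');
    tx.objectStore(STORE).put(brain, 'current');
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}

// Load it back on page load; an empty brain means MENACE starts from scratch
async function loadBrain(): Promise<BrainState> {
  const db = await openDb();
  return new Promise((resolve, reject) => {
    const req = db.transaction(STORE, 'readonly').objectStore(STORE).get('current');
    req.onsuccess = () => resolve((req.result as BrainState) ?? {});
    req.onerror = () => reject(req.error);
  });
}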

What Makes This Algorithm Beautiful

1. It's explainable
You can literally see why MENACE made a move. "This square has 12 beads, that one has 3, so it's 4x more likely to pick this one." Try explaining that with a neural network.
2. It's minimal
The entire learning algorithm is ~50 lines of code. No libraries. No frameworks. Just weighted random selection and simple arithmetic.
3. It's intuitive
The learning rule makes sense to anyone: "if this move led to winning, do it more often. If it led to losing, do it less." That's how humans learn too.
4. It works
Despite being from 1961, the algorithm is still effective. After ~200 games, MENACE rarely makes mistakes. It discovers optimal strategies purely through experience.

Lessons For Modern AI Development

What MENACE taught me:
Reinforcement learning isn't new – we've understood the core idea for 60+ years
Simple algorithms can solve complex problems if you give them enough iterations
Explainability matters – being able to inspect the "brain" is incredibly valuable
Sometimes the best way to learn an algorithm is to implement it from scratch
Physical analogies (matchboxes and beads) can make abstract concepts click

Modern AI has gotten incredibly complex. But at its core, it's still doing what MENACE did: trying things, seeing what works, and adjusting probabilities.

We've added layers of sophistication – neural networks, backpropagation, gradient descent – but the fundamental idea remains: learn from experience.

Try It Yourself

I've built a live implementation you can play with. Watch MENACE learn in real-time:

The Demo Features:
Play against MENACE and watch it improve
See the brain state visualized (what it's "thinking")
Track win/draw/loss statistics
Export/import brain states to share trained AIs
Reset and train from scratch

Pro tip: Let it lose the first 20-30 games. Watch how it adapts. By game 100, you'll struggle to beat it.

That's reinforcement learning in action – no code changes, just experience.

Final Thoughts

Building MENACE was one of those rare projects where the implementation was even more educational than I expected.

It's a reminder that AI doesn't have to be black-box neural networks trained on millions of examples. Sometimes the simplest algorithms – the ones you can explain with matchboxes and beads – are the most elegant.

And in 2025, when we're drowning in transformer models and billion-parameter networks, there's something refreshing about an AI you can fully understand in an afternoon.

Donald Michie was onto something in 1961. We'd do well to remember it.

References & Further Reading

Donald Michie's original paper: "Trial and Error" (1961)
Matthew Scroggs' excellent writeup on MENACE
Reinforcement Learning: An Introduction (Sutton & Barto)
My implementation: Live Demo
Enjoyed this? Want to discuss AI, algorithms, or build something together?