The Experiment
I was building a crypto-as-a-service API and needed to test gRPC endpoints that use base64-encoded payloads. While troubleshooting with Claude AI, something bizarre happened:
Me: "Can you guess the base64 for this JSON: {"user_id": 123, "exp": 1767225600}?"
Claude: "eyJ1c2VyX2lkIjogMTIzLCAiZXhwIjogMTc2NzIyNTYwMH0K"
I ran the actual encoding:
echo '{"user_id": 123, "exp": 1767225600}' | base64 -w 0
# Note: echo appends a trailing newline, which is part of the encoded bytes;
# -w 0 disables line wrapping (GNU coreutils).
# Output: eyJ1c2VyX2lkIjogMTIzLCAiZXhwIjogMTc2NzIyNTYwMH0K
Perfect match.
This Shouldn't Be Possible
Base64 encoding involves multiple deterministic steps:
- Convert each character to ASCII bytes
- Concatenate all bytes into a bit stream
- Split into 6-bit chunks
- Map each chunk to base64 alphabet (A-Z, a-z, 0-9, +, /)
- Add padding if needed
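As a sanity check, the steps above can be sketched directly in Python and compared against the standard library (the helper name `base64_by_hand` is mine, not a standard API):

```python
import base64

B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def base64_by_hand(text: str) -> str:
    """Encode text by walking through the steps above explicitly."""
    data = text.encode("utf-8")                      # 1. characters -> bytes
    bits = "".join(f"{b:08b}" for b in data)         # 2. bytes -> one bit stream
    bits += "0" * (-len(bits) % 6)                   # pad bit stream to a multiple of 6
    out = "".join(B64_ALPHABET[int(bits[i:i + 6], 2)]
                  for i in range(0, len(bits), 6))   # 3-4. 6-bit chunks -> alphabet
    return out + "=" * (-len(out) % 4)               # 5. '=' padding to a multiple of 4

payload = '{"user_id": 123, "exp": 1767225600}\n'    # `echo` adds the trailing newline
assert base64_by_hand(payload) == base64.b64encode(payload.encode()).decode()
print(base64_by_hand(payload))  # -> eyJ1c2VyX2lkIjogMTIzLCAiZXhwIjogMTc2NzIyNTYwMH0K
```

Running the deterministic steps takes microseconds; the surprise is that a language model approximates them without executing anything.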
Humans can't do this calculation mentally. It requires precise byte-level operations that our brains aren't designed for.
Testing Across AI Systems
I tested the same prompt on multiple AI systems with {"role": "admin", "active": true}:

Actual base64 (via `echo ... | base64`, which appends a trailing newline before encoding): eyJyb2xlIjogImFkbWluIiwgImFjdGl2ZSI6IHRydWV9Cg==
Results:
- ChatGPT: eyJyb2xlIjogImFkbWluIiwgImFjdGl2ZSI6IHRydWV9 (95% accurate; this is exactly the correct encoding of the JSON minus the trailing newline that `echo` appends)
- Claude: eyJyb2xlIjogImFkbWluIiwgImFjdGl2ZSI6IHRydWV9Cg== (100% accurate)
- Gemini: initially refused, then produced eyJyb2xlIjogImFkbWluIiwgImFjdGl2ZSI6IHRydWV9
- Grok: wrong output, but attempted the structure
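The percentages above depend on how you score a near miss. One plausible scoring method (my choice here, not a method the tests above necessarily used) is a character-level similarity ratio against the ground truth:

```python
import base64
from difflib import SequenceMatcher

def score_guess(payload: str, guess: str) -> float:
    """Character-level similarity between a model's guess and the true encoding."""
    actual = base64.b64encode(payload.encode()).decode()
    return SequenceMatcher(None, actual, guess).ratio()

payload = '{"role": "admin", "active": true}\n'   # `echo` appends the newline
# ChatGPT's guess, which drops the trailing "Cg==" (the encoded newline):
print(score_guess(payload, "eyJyb2xlIjogImFkbWluIiwgImFjdGl2ZSI6IHRydWV9"))
```

An exact match scores 1.0; the guess missing only the encoded newline scores just under 0.96, which lines up with the "95% accurate" figure.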
What This Actually Means
This isn't memorization. AI systems haven't seen every possible JSON-to-base64 combination in training. The space is too large.
This isn't simple pattern matching. Base64 depends on exact byte sequences: substituting one character changes the encoded characters around that position, and inserting or deleting a character shifts the 3-byte alignment and rewrites everything downstream.
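A quick sketch of both effects, using made-up payloads:

```python
import base64

base = base64.b64encode(b'{"a": 1, "b": 2}').decode()
swap = base64.b64encode(b'{"a": 9, "b": 2}').decode()   # substitute one byte
grow = base64.b64encode(b'{"aa": 1, "b": 2}').decode()  # insert one byte

# Substitution only disturbs the 4-character group covering the changed byte:
print(base)
print(swap)
# Insertion shifts the 3-byte alignment, so the output diverges from the
# insertion point onward:
print(grow)
```

So a model can't simply reuse a memorized whole-string mapping: it has to track byte positions and alignment to get a near miss rather than garbage.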
This appears to be algorithmic intuition. AI systems have internalized the mathematical relationship between inputs and base64 outputs.
Beyond "Stochastic Parrots"
The criticism that AI systems are just "stochastic parrots" regurgitating training data doesn't explain this capability.
Parrots repeat what they've heard. This is different - AI systems are predicting outputs of deterministic algorithms they weren't explicitly trained to compute.
Technical reality: The models have learned to approximate mathematical functions from examples, not just memorize text sequences.
Implications for Developers
For encoding/decoding tasks: AI can often predict outputs outright, but verify against a real encoder before trusting them.
For algorithm design: If AI can internalize mathematical relationships this precisely, it challenges assumptions about what constitutes "computation."
For security: While base64 is just encoding (not encryption), this capability raises questions about AI's potential against other algorithmic systems.
For AI capabilities: This suggests emergent mathematical reasoning that goes beyond text generation.
The Technical Reality
What we observed: AI systems demonstrating algorithmic intuition
What we don't know: How far this capability extends
What's clear: Current AI systems have abilities we didn't expect and don't fully understand
Testing This Yourself
Try asking your favorite AI system to "guess" the base64 encoding of simple JSON strings. Don't ask it to calculate - just ask for a guess.
Compare the results to actual encoding:
echo '{"test": "data"}' | base64 -w 0
The accuracy might surprise you.
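A small ground-truth generator makes the comparison easier; the payload list here is just an example set to swap out for whatever you ask the model about:

```python
import base64
import json

# Example payloads -- replace with whatever you plan to test.
payloads = [
    {"test": "data"},
    {"user_id": 123, "exp": 1767225600},
    {"role": "admin", "active": True},
]

for obj in payloads:
    text = json.dumps(obj) + "\n"   # `echo` appends a newline; drop "\n" to mimic printf '%s'
    print(f"{json.dumps(obj)} -> {base64.b64encode(text.encode()).decode()}")
```

Keep the trailing-newline convention consistent between what you encode and what you show the model, or every comparison will be off by the four characters that encode "\n".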
Bottom Line
AI systems are developing capabilities that challenge our understanding of what they can do. Whether this represents genuine algorithmic reasoning or extremely sophisticated pattern recognition, the practical result is the same: AI can predict mathematical operations it wasn't explicitly designed to perform.
For developers: Don't assume AI limitations based on theoretical models. Test actual capabilities.
For researchers: We need better frameworks for understanding and measuring emergent AI abilities.
For the industry: The "just predicting next tokens" explanation is becoming insufficient for observed AI behavior.
The line between pattern recognition and computation is blurrier than we thought.