Using Claude Code to solve Advent of Code 2025
Claude Code is all you need
Introduction
Let’s be honest: with LLMs, the fun of Advent of Code is gone. You can paste any puzzle into ChatGPT or Claude and get a working solution in seconds. So I leaned into that and ran a different experiment with Advent of Code 2025: what if I didn’t write a single line of code? Instead, I gave Claude Code a single instruction file and let it solve the puzzles completely autonomously.
The result: 20 out of 22 challenges solved (91% success rate) with zero human-written code.
Check out my repo for more details.
The Setup
I created a single file called INSTRUCTIONS.md with a 12-step process for each day:
1. Create a folder ./day_xx/
2. Navigate to the Advent of Code puzzle page
3. Save the input to ./day_xx/input.txt
4. Read Part 1, write strategy in ./day_xx/README.md
5. Write ./day_xx/part1.py
6. Test with examples
7. Run against actual input and submit
8. Write Part 2 strategy in README
9. Write ./day_xx/part2.py
10. Test Part 2
11. Run Part 2
12. Submit Part 2 answer
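Concretely, each day's folder ends up looking roughly like this (day_05 is just a hypothetical example; the file names follow the steps above):

```
day_05/
├── input.txt    # puzzle input saved in step 3
├── README.md    # Part 1 and Part 2 strategy notes (steps 4 and 8)
├── part1.py     # step 5
└── part2.py     # step 9
```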
Then I ran: claude --chrome --dangerously-skip-permissions
Note: The --dangerously-skip-permissions flag bypasses all safety checks. This is terrible for production use, but necessary for this experiment, where the agent needed to navigate websites and submit answers autonomously.
What Happened
Claude Code executed the entire workflow independently:
Used Chrome integration to navigate Advent of Code
Read puzzle descriptions on its own
Developed solution strategies and documented them
Wrote and tested Python code
Submitted answers to the website
Self-corrected when answers were wrong
Zero lines of code written by me. Just the instruction file.
Results
Completed: Days 2-8 (both parts), Day 9 Part 1, Days 10-11 (both parts), Day 12 Part 1
Failed: Day 9 Part 2, Day 12 Part 2
Total: 20/22 challenges = 91% autonomous completion
The run produced approximately 42 Python files across days 2-12, with full solution code, test files, and documented reasoning for each day.
Example: Day 2 Strategy
Here’s how Claude Code documented its approach for Day 2 (from the auto-generated README):
Part 1: Detect product IDs where “any substring starting from position 0 appears immediately after itself” (exactly twice repetition).
Part 2: Expand to catch IDs where “the entire string can be formed by repeating that substring at least 2 times.”
The agent independently reasoned through the problem, identified the algorithmic approach, and implemented it - all without human guidance beyond the instruction template.
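To make the quoted strategy concrete, here is a minimal Python sketch of how those two checks might look. It reflects my reading of the README excerpt above, not Claude's actual generated code, and the function names are my own:

```python
def has_doubled_prefix(product_id: str) -> bool:
    """Part 1 (as I read the quoted strategy): some prefix of the ID
    appears immediately after itself."""
    n = len(product_id)
    for k in range(1, n // 2 + 1):
        if product_id[:k] == product_id[k:2 * k]:
            return True
    return False


def is_pure_repetition(product_id: str) -> bool:
    """Part 2 (as I read it): the entire ID can be formed by repeating
    some substring at least two times."""
    n = len(product_id)
    for k in range(1, n // 2 + 1):
        if n % k == 0 and product_id[:k] * (n // k) == product_id:
            return True
    return False


if __name__ == "__main__":
    print(has_doubled_prefix("abcabcx"))  # True: "abc" immediately followed by "abc"
    print(is_pure_repetition("ababab"))   # True: "ab" repeated three times
```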
Limitations
Even with 91% success, the agent failed on 2 challenges. Looking at the failures:
Day 9 Part 2: A complex disk defragmentation problem that likely needed an algorithmic insight the agent couldn’t generate
Day 12 Part 2: Blocked by Day 9 Part 2’s failure (dependency issue)
Some problems still require human algorithmic intuition and creative problem-solving. The agent excels at execution but can struggle with novel algorithmic insights.
Conclusion
This wasn’t about pair programming or AI assistance. This was about autonomous execution from start to finish.
The agent navigated websites, read natural language descriptions, formulated strategies, wrote code, debugged failures, and submitted results - all independently. The only human input was a procedural instruction file.
Are we ready for fully autonomous development? Not quite. That 9% failure rate matters, especially when complex algorithmic thinking is required. But 91% autonomous completion on varied programming challenges suggests we’re closer than I expected.
The future isn’t AI replacing developers. It’s developers orchestrating autonomous agents - providing high-level direction while the agent handles execution, testing, and iteration.
As I watched Claude Code navigate Advent of Code independently, I realized: the question isn’t “can AI code?” anymore. It’s “what level of abstraction should humans work at when AI handles the implementation?”
Check out the full repository to see all the auto-generated code and conversation transcripts.


Which coding challenge should I try next with Claude Code?