Posted April 21, 2026

Why we built ClickProbe

Your coding agent believes its own code works. You need something that doesn't.

Ingo Karstein

The problem with AI that writes and tests its own code

AI coding agents are remarkably productive. They write code, refactor modules, and ship features at a pace no individual developer can match. I use them every day, and they have changed how I build software.

But they have a blind spot that no amount of prompting fixes: they never actually click through the app they just built.

When a coding agent finishes a feature, it confirms its own work by reading the code it just wrote. It reasons: “I implemented the save handler, the handler writes to the database, therefore saving works.” This is not a test. This is a coding agent validating its own assumptions from inside its own context window. It is, in the most literal sense, the AI telling you what you want to hear.

I ran into this constantly while building with Claude Code. I would ask for a new feature. The agent would write it, describe it in detail, and declare it complete. I would open the browser, click the button, and nothing would happen. Or the form would submit without persisting. Or the navigation would link to a route that 404s. The agent was confident. The app was broken.

The agent isn’t lying. It genuinely cannot tell. It has no eyes.

What ClickProbe actually does

ClickProbe gives your app a pair of eyes that are not the same eyes that wrote the code.

It runs a separate AI instance — one with zero knowledge of how your app was built — and puts it in front of a real browser. That instance sees your app the way a user sees it: buttons, menus, forms, navigation. It guesses what things do, tries them, and compares what it expected to what actually happened.
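To make that guess, try, compare loop concrete, here is a minimal Python sketch. It is an illustration, not ClickProbe's implementation: the names `Action`, `Mismatch`, and `explore` are hypothetical, and a plain dictionary stands in for the browser state that a real session would observe.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration; not ClickProbe's actual data model.
State = Dict[str, str]


@dataclass
class Action:
    label: str                                  # what the user sees, e.g. "Save"
    expected: Callable[[State, State], bool]    # the guess: how should the state change?
    perform: Callable[[State], State]           # stand-in for a real browser click


@dataclass
class Mismatch:
    label: str
    before: State
    after: State


def explore(state: State, actions: List[Action]) -> List[Mismatch]:
    """Try each action and record those whose outcome contradicts the guess."""
    mismatches = []
    for action in actions:
        before = dict(state)
        after = action.perform(dict(state))
        if not action.expected(before, after):
            mismatches.append(Mismatch(action.label, before, after))
        state = after
    return mismatches


# A toy "app": one button works, the other renders but does nothing.
actions = [
    Action(
        "Open settings",
        expected=lambda before, after: after["route"] == "/settings",
        perform=lambda s: {**s, "route": "/settings"},
    ),
    Action(
        "Save",
        expected=lambda before, after: after != before,
        perform=lambda s: s,  # broken: the click changes nothing
    ),
]

found = explore({"route": "/"}, actions)
print([m.label for m in found])  # prints ['Save']
```

The point of the sketch is that the expectation is formed from what is visible, not from the source code, so a button that exists but does nothing shows up as a mismatch.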

Three things it finds reliably:

Broken buttons. The button renders. It is clickable. Nothing happens. The coding agent that wrote it would never catch this because the button exists in the code. ClickProbe catches it because the button doesn’t work in the browser.

Phantom saves. Fill a form, click Save, see a success toast, reload the page — the data is gone. The coding agent validated the save handler in isolation. ClickProbe ran the full user flow.

Console errors. JavaScript exceptions that don’t surface visually but indicate broken logic, failed network requests, or unhandled state. ClickProbe captures these on every action.

Every finding comes with a screenshot of before and after, the action that triggered it, and the full reasoning chain. Not a raw stack trace — a human-readable account of what the AI tried and what went wrong.
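As a rough sketch of the phantom-save check and the kind of record a finding could carry, consider the following Python. Everything here is an assumption made for illustration: `FakeApp` is a toy stand-in for a real browser session, and the `Finding` fields only loosely mirror the before and after screenshots, the triggering action, and the reasoning chain described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Finding:
    # Hypothetical report shape; not ClickProbe's actual schema.
    kind: str
    action: str
    before: str   # stand-in for a "before" screenshot
    after: str    # stand-in for an "after" screenshot
    reasoning: List[str] = field(default_factory=list)


class FakeApp:
    """Toy app whose save handler shows a toast but may never persist."""

    def __init__(self, persists: bool) -> None:
        self.persists = persists
        self.form: Dict[str, str] = {}
        self.stored: Dict[str, str] = {}
        self.toast = ""

    def fill(self, name: str, value: str) -> None:
        self.form[name] = value

    def click_save(self) -> None:
        if self.persists:
            self.stored = dict(self.form)
        self.toast = "Saved"            # success toast either way

    def reload(self) -> None:
        self.form = dict(self.stored)   # the page only shows what was stored
        self.toast = ""


def check_phantom_save(app: FakeApp, name: str, value: str) -> Optional[Finding]:
    """Fill, save, reload, and report a finding if the value did not survive."""
    app.fill(name, value)
    app.click_save()
    before = f"{name}={app.form.get(name)!r}, toast={app.toast!r}"
    app.reload()
    after = f"{name}={app.form.get(name)!r}, toast={app.toast!r}"
    if app.form.get(name) == value:
        return None
    return Finding(
        kind="phantom save",
        action=f"fill {name}, click Save, reload",
        before=before,
        after=after,
        reasoning=[
            f"Entered {value!r} into {name} and clicked Save.",
            "The app showed a success toast, so the value should persist.",
            "After reload the field no longer held the value.",
        ],
    )


healthy = check_phantom_save(FakeApp(persists=True), "title", "Q3 plan")
finding = check_phantom_save(FakeApp(persists=False), "title", "Q3 plan")
print(healthy)
print(finding.kind, "|", finding.after)
```

The check only passes if the value survives a reload, which is exactly the step a save handler validated in isolation never exercises.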

What it looks like in practice

Install the ClickProbe proxy on your dev machine. Point it at your local or staging URL. Open your browser, navigate to your app, and start exploring. ClickProbe’s Claude instance takes over from there.

Within five minutes it has mapped your navigation, found your forms, identified your interactive elements, and started logging findings. Within ten minutes you typically have the first bug report in your Slack channel or IDE.

You don’t write tests. You don’t maintain selectors. You don’t configure assertions. You describe your app in two sentences, and ClickProbe does the rest.

Read more about the architecture on the How it works page, or see the full list of supported tools on the Integrations page.

Where it is going

ClickProbe is currently in closed beta with a small group of development teams in the DACH region (Germany, Austria, Switzerland). We are using this phase to calibrate exploration depth: how much context to give the AI, and how many actions make a meaningful session before the cost outweighs the value.

The public beta opens when we can run 100 concurrent sessions without paging ourselves. We expect to hit that in Q3 2026.

If you want an invite before that, write to hello@clickprobe.ai. If you want to start the 10-minute trial immediately, head to app.clickprobe.ai.