✓ Solves CAPTCHAs · Mobile-first · AI-ready HTML · One POST request

The web scraping API that gets past the CAPTCHA — and stays this simple

Send a URL, get back clean minified HTML built for AI. The crawler renders pages on a mobile viewport and solves CAPTCHAs automatically — no proxies, no fingerprints, no headless browser fleet to run.

13 · Countries battle-tested
1 POST · URL in, HTML out
Mobile · Viewport by default
€0.20 · Per request, pay-as-you-go

Why we built this

We didn't set out to build a crawler.
The web made us.

We run a grocery shopping & recipe API. Which means reading public webpages all day — recipe blogs, supermarket listings, social posts, across 13 countries — and turning the mess into clean data our models can actually use.

Then the web fought back. Bloated JavaScript. The good stuff hiding on mobile. And CAPTCHAs everywhere.

Every scraper made us pick a side. The easy ones quit the second a CAPTCHA showed up. The tough ones wanted proxies, fingerprints, and a whole browser fleet before they'd read a single page.

So we built our own. Mobile viewport. CAPTCHAs solved on the fly. Pages stripped to lean HTML our AI reads in one pass. It's been quietly doing exactly this in production behind our /parse endpoint ever since.

Nothing out there was this simple. So we're handing it to you.


Every scraping API forces a trade-off. This one doesn't.

The market splits cleanly in two. One side is easy but stops at the wall. The other gets through the wall but is a project to set up. Almost no one does both — and no one defaults to mobile.

Camp 1 — Easy

Simple, but stops at the wall

Reader-style APIs and prefix tricks. Dead simple to call. Then a CAPTCHA or bot check shows up and you get an error page instead of content.

  • Trivial to call
  • Folds on CAPTCHAs
  • Desktop-first

Camp 2 — Powerful

Gets through, but it's a project

Proxy networks and unblockers. They can defeat bot protection — once you've configured proxies, fingerprints, sessions, and retries, and accepted the bill.

  • Defeats bot protection
  • Heavy setup & config
  • Desktop-first

Pepesto Crawl API

Both — plus mobile

One authenticated POST with a URL. CAPTCHAs solved inside the call. Output is mobile-rendered, minified, and ready for an LLM. Nothing to configure.

  • Trivial to call
  • Solves CAPTCHAs
  • Mobile-first & AI-ready

How it works

URL in, AI-ready HTML out. Three steps, one request.

Send a URL

POST any public HTTP or HTTPS URL to /api/crawl with your Bearer token. No options to learn.

We render & unblock

The page is rendered on a mobile viewport in a real browser. If a CAPTCHA appears, it's solved automatically — proxies and fingerprints handled for you.

Get clean HTML

You receive a JSON object with an html field: compact, minified HTML built for parsing and AI workflows.
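
As a rough illustration of the three steps above, here is a minimal Python sketch using only the standard library. The /api/crawl path, the Bearer token, and the html response field come from the description above. The host name (api.example.com) and the "url" key in the request body are assumptions made for illustration, so check them against the API docs before relying on them.

```python
import json
import urllib.request

# Assumption: placeholder host. Only the /api/crawl path comes from the text above.
API_BASE = "https://api.example.com"


def build_crawl_request(page_url: str, token: str) -> urllib.request.Request:
    """Build the single authenticated POST described above."""
    if not page_url.startswith(("http://", "https://")):
        raise ValueError("only public HTTP or HTTPS URLs are accepted")
    # Assumption: the JSON body carries the target page under a "url" key.
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/crawl",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def extract_html(response_body: bytes) -> str:
    """Pull the html field out of the JSON response."""
    return json.loads(response_body)["html"]


def crawl(page_url: str, token: str, timeout: float = 60.0) -> str:
    """Send the request and return the minified HTML string."""
    request = build_crawl_request(page_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_html(response.read())
```

Calling crawl("https://example.com/some-page", "YOUR_TOKEN") would then return the minified HTML as a string, provided the assumed host and body shape match the real endpoint.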

Want the request and response schema? Read the API docs →

Output an LLM can actually read

"Minified HTML" means a compact version of the page body: visible text is kept, hidden elements and <script>/<style> are removed, and only id and class survive. You feed it straight to a model — no boilerplate to strip, no token budget wasted on markup.

Raw rendered page
<body>
  <script src="analytics.js">…</script>
  <style>.hdr{display:flex;…}</style>
  <div class="product" data-ga="x9" style="…">
    <div hidden>tracking pixel</div>
    <h1 data-id="42">Organic Oats</h1>
    <span class="price">£2.40</span>
  </div>
  <!-- 40kb of nav, footer, modals -->
</body>

Pepesto minified HTML
<body>
  <div class="product">
    <h1>Organic Oats</h1>
    <span class="price">£2.40</span>
  </div>
</body>
// every token earns its place

Same content. A fraction of the tokens. Structure preserved via id and class so your parser still knows where things are.
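
To make the rules above concrete, here is a small Python sketch that applies the same three ideas to an HTML string: drop script and style, drop elements marked hidden, and keep only id and class. It is an approximation written for illustration, not the service's actual implementation. In particular, it only detects the hidden attribute, whereas a real browser-based crawler can presumably see computed visibility.

```python
import html
import re
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
DROPPED_TAGS = {"script", "style"}
KEPT_ATTRS = {"id", "class"}


class Minifier(HTMLParser):
    """Illustrative minifier: keeps visible text plus id and class only."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped or hidden subtree

    def handle_starttag(self, tag, attrs):
        hidden = any(name == "hidden" for name, _ in attrs)
        if self.skip_depth or tag in DROPPED_TAGS or hidden:
            if tag not in VOID_TAGS:  # void tags never get a closing tag
                self.skip_depth += 1
            return
        kept = "".join(
            f' {name}="{html.escape(value, quote=True)}"'
            for name, value in attrs
            if name in KEPT_ATTRS and value
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Collapse whitespace runs; whitespace-only nodes are dropped.
        text = re.sub(r"\s+", " ", data)
        if text.strip() and not self.skip_depth:
            self.out.append(html.escape(text, quote=False))


def minify(html_text: str) -> str:
    parser = Minifier()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Comments disappear because the parser's default comment handler does nothing. The sketch assumes well-nested markup; badly nested tags would throw off the depth counter.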


What you can build with it

It started as recipe parsing. It works for any public page you need turned into clean data.

AI agents

Give an agent eyes on the live web

Let a LangChain, AutoGPT-style, or custom agent read any public page — even CAPTCHA-protected ones — and get back HTML it can reason over directly.

RAG & LLM pipelines

Clean ingestion for retrieval

Feed minified HTML into your chunker and embeddings without writing a boilerplate stripper for every site. Less markup, more signal per token.
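
One possible ingestion step, sketched in Python with the standard library only: flatten the minified HTML to text, then split it into overlapping word windows for embedding. The chunk size and overlap values are arbitrary illustrative defaults, and the function names are hypothetical. A pipeline could just as well chunk the HTML itself to keep the id and class hints.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text nodes of an HTML string."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        text = " ".join(data.split())
        if text:
            self.parts.append(text)


def html_to_text(minified_html: str) -> str:
    collector = TextCollector()
    collector.feed(minified_html)
    collector.close()
    return " ".join(collector.parts)


def chunk_words(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of `size` words sharing `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
        start += size - overlap
    return chunks
```

Each chunk can then be passed to whatever embedding model the pipeline uses.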

Price & catalog monitoring

Track pages that fight back

Monitor product, listing, or pricing pages that throw bot checks. The crawler gets through and returns the rendered content every time.
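
Because class attributes survive minification, a monitor can pick out a value by class name. The sketch below, in Python with the standard library, collects the text of every element carrying a given class and parses a price from it. The class name "price" and the helper names are hypothetical, and the price parser assumes a dot as the decimal separator with no thousands separators.

```python
import re
from decimal import Decimal
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of each element whose class list contains target."""

    def __init__(self, target_class: str):
        super().__init__(convert_charrefs=True)
        self.target = target_class
        self.depth = 0  # nesting depth inside a matching element
        self.current = []
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.target in classes:
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.matches.append(" ".join("".join(self.current).split()))

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def extract_by_class(minified_html: str, target_class: str) -> list[str]:
    parser = ClassTextExtractor(target_class)
    parser.feed(minified_html)
    parser.close()
    return parser.matches


def parse_price(text: str) -> Decimal:
    """Parse the first number in text, assuming a dot decimal separator."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    return Decimal(match.group())
```

Comparing parse_price results between two crawls of the same page is then enough to flag a price change.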

Content & research tools

Turn any URL into structured input

Summarizers, readers, and research assistants that need the real rendered text — not a half-loaded SPA or a "verify you're human" page.

Recipe & food apps

Parse recipes from anywhere

The original use case. Pull a recipe page or social post and pair it with /parse for structured ingredients, steps, and nutrition.

Mobile-only content

Reach what desktop scrapers miss

Some sites serve their best content only to mobile. Because the crawler requests a mobile viewport by default, you get the lean version made for phones.


How it compares

A fair look at where the Pepesto Crawl API sits against the tools developers reach for first.

|  | Pepesto Crawl | Reader-style APIs | Proxy / unblocker platforms |
| --- | --- | --- | --- |
| Solves CAPTCHAs automatically | Yes | No | Yes |
| Setup to first call | One POST | One call | Proxies, fingerprints, config |
| Mobile viewport by default | Yes | No | Optional / manual |
| Output tuned for AI | Minified HTML | Markdown / text | Raw HTML |
| Pricing model | €0.20 / request, pay-as-you-go | Token / credit tiers | Subscription + usage |
| No subscription required | Yes | Varies | Usually no |

A general comparison of common approaches, not specific products. Capabilities vary by provider and plan.


Simple pricing

Pay only for what you crawl. No subscription, no per-site fee.

€0.20 / request

One price per crawled page — CAPTCHA solving, mobile rendering, and minification all included. Pay-as-you-go via Stripe; your API key is returned instantly.
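
For budgeting, the list price works out as a simple multiplication. The Python sketch below uses only the €0.20 figure stated above; the function name is hypothetical, and volume discounts are ignored because their rates are not published here.

```python
from decimal import Decimal

PRICE_PER_REQUEST_EUR = Decimal("0.20")  # list price stated above


def estimate_cost_eur(pages: int) -> Decimal:
    """Estimated cost at list price, before any volume discount."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_REQUEST_EUR * pages


print(estimate_cost_eur(1_000))    # 200.00
print(estimate_cost_eur(50 * 30))  # 50 pages a day for 30 days: 300.00
```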

Running real volume? Volume discounts are available, and we're happy to tailor a rate to what you're building. Tell us about your project — we'd love to chat.

See full pricing →

Frequently asked questions

What is the Pepesto Crawl API?

It's a REST endpoint that fetches a public webpage and returns a compact, mobile-optimized minified HTML representation of the rendered page. You send a single URL and receive clean HTML built for downstream parsing and AI workflows. It solves CAPTCHAs when they're encountered, so it returns content for CAPTCHA-protected public pages that most scrapers fail on.

Does it solve CAPTCHAs automatically?

Yes. When the crawler hits a CAPTCHA on a public page, it solves it and returns the content. You don't configure proxies, browser fingerprints, or third-party CAPTCHA services — it's all inside the single /api/crawl call. That's the core difference: easy scraping APIs stop at the CAPTCHA wall, and APIs that get past it usually require heavy setup.

What format does it return?

A JSON object with an html field — the minified HTML from the rendered page. Visible text is preserved, hidden elements and <script>/<style> tags are removed, and only id and class attributes are retained. It's an AI-ready representation of the page body for parsing and LLM workflows, not a byte-for-byte copy of the original source.

Why is it mobile-first?

The crawler requests pages with a mobile viewport. Mobile pages are leaner, render faster, and carry the core content without desktop-only clutter — so the resulting minified HTML is smaller and cleaner for AI analysis. Most competing scraping APIs default to a desktop browser; mobile-first rendering is a niche almost no one serves.

Can it scrape pages behind a login or paywall?

No. The endpoint works on public pages only. It doesn't work for anything that requires authentication: pages behind a login, a private account area, a paywall, or any flow that depends on a user-specific session.

How is the Pepesto Crawl API different from other scraping APIs?

Most scraping tools force a trade-off. The simple, easy-to-call ones tend to fold the moment a page is CAPTCHA-protected. The ones that get past bot protection usually require proxy configuration, fingerprinting, and real engineering setup. The Pepesto Crawl API does both in a single POST — it solves CAPTCHAs and stays trivial to call — and renders mobile-first by default, which almost no one else does.

How much does it cost and how do I start?

€0.20 per request, pay-as-you-go via Stripe credits, with no monthly subscription and no per-site fee. Volume discounts bring the price down further, and we're glad to tailor a rate, so get in touch if you're running real volume. Buy credits and your API key is returned instantly, with no approval process. See the pricing page for current rates across all endpoints.


Crawl your first page in the next five minutes

Built for ourselves, battle-tested in production, now open to everyone. One POST, a URL in, clean AI-ready HTML out — CAPTCHAs and all.

Questions? Email support@pepesto.com or book a quick call.