Robots.txt Parser API — fetch and parse robots.txt

Need to inspect robots.txt from your app, script, or CI job without writing your own parser? Use the TinyUtils Robots Parse API to fetch rules, sitemap references, and crawler groups as clean JSON.

The problem

robots.txt is easy to inspect by hand, but automating it is more work than it looks. You need to fetch the right URL (robots.txt always lives at the origin root, never under a path), parse multiple user-agent groups, handle a missing file gracefully, and extract sitemap references that are easy to miss in raw text.
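
Even the first step, resolving the right URL, deserves care. Here is that step on its own in JavaScript, using the standard URL API (the API below handles this for you, so this is just for context):

// robots.txt is served from the origin root, never from a subpath.
const page = new URL("https://example.com/blog/post?ref=newsletter");
const robotsUrl = new URL("/robots.txt", page.origin).href;
// -> "https://example.com/robots.txt"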

Validating crawl rules in scripts or CI tooling is awkward when all you have is a raw text response. The TinyUtils Robots Parse API does all of this in one GET request and returns structured JSON.

Quick solution

curl

curl "https://tinyutils.dev/api/robots-parse?url=https://example.com"

Example response

{
  "ok": true,
  "input_url": "https://example.com",
  "robots_url": "https://example.com/robots.txt",
  "found": true,
  "status": 200,
  "content_type": "text/plain",
  "sitemaps": [
    "https://example.com/sitemap.xml"
  ],
  "groups": [
    {
      "user_agents": ["*"],
      "allow": ["/public/"],
      "disallow": ["/admin/"],
      "crawl_delay": null,
      "host": null
    }
  ],
  "meta": {
    "responseTimeMs": 74,
    "cached": false,
    "rateLimitedScope": "global"
  },
  "error": null
}
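
The groups array is enough to answer "is this path blocked for this crawler?". Below is a minimal sketch using longest-prefix matching; it is illustrative only (real robots.txt matching also honors "*" and "$" wildcards, which this skips), and isDisallowed is a hypothetical helper, not part of the API:

// Decide whether a path is disallowed for a given crawler, using
// longest-prefix matching with ties going to allow (wildcards ignored).
function isDisallowed(groups, userAgent, path) {
  const group =
    groups.find((g) => g.user_agents.includes(userAgent)) ??
    groups.find((g) => g.user_agents.includes("*"));
  if (!group) return false;
  const longest = (rules) =>
    rules
      .filter((rule) => path.startsWith(rule))
      .sort((a, b) => b.length - a.length)[0] ?? "";
  return longest(group.disallow).length > longest(group.allow).length;
}

// With the example response above:
// isDisallowed(data.groups, "Googlebot", "/admin/settings") -> true
// isDisallowed(data.groups, "Googlebot", "/public/page")    -> false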

Use cases

Typical uses include validating crawl rules in CI before a deploy goes live, extracting sitemap references without scraping raw text, checking which paths are blocked for a specific crawler, and handling sites that have no robots.txt at all via the found flag.

JavaScript example

JavaScript (fetch)

const res = await fetch(
  "https://tinyutils.dev/api/robots-parse?url=https://example.com"
);
const data = await res.json();

if (!data.ok) {
  console.error("Lookup failed:", data.error);
} else if (data.found) {
  console.log("Sitemaps:", data.sitemaps);
  for (const group of data.groups) {
    console.log("User-agents:", group.user_agents);
    console.log("Disallow:", group.disallow);
  }
} else {
  console.log(`robots.txt not found (status: ${data.status})`);
}
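
In CI, the same call can gate a deploy. A minimal sketch for Node 18+ (the file name, the "sitemap required" policy, and the exit codes are assumptions for illustration, not API features):

// ci-robots-check.mjs: fail the build if robots.txt is missing
// or declares no sitemap.
const res = await fetch(
  "https://tinyutils.dev/api/robots-parse?url=https://example.com"
);
const data = await res.json();

if (!data.ok || !data.found) {
  console.error("robots.txt missing or unreadable:", data.error ?? data.status);
  process.exit(1);
}
if (data.sitemaps.length === 0) {
  console.error("robots.txt declares no sitemap");
  process.exit(1);
}
console.log("robots.txt OK:", data.robots_url);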

See also

Try the Robots Parse tool

Enter any URL and inspect its robots.txt instantly.

Open Robots Parse →