Robots.txt Parser API — fetch and parse robots.txt

Need to inspect robots.txt from your app, script, or CI job without writing your own parser? Use the TinyUtils Robots Parse API to fetch rules, sitemap references, and crawler groups as clean JSON.

The problem

robots.txt is easy to inspect by hand, but automating it is more work than it looks. You need to fetch the right URL (robots.txt always lives at the origin root, never under a path), parse multiple user-agent groups, handle a missing file gracefully, and extract sitemap references that are easy to miss in raw text.
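
Even the first step, resolving the right URL, deserves care. Here is that step on its own in JavaScript, using the standard URL API (the API below handles this for you, so this is just for context):

// robots.txt is served from the origin root, never from a subpath.
const page = new URL("https://example.com/blog/post?ref=newsletter");
const robotsUrl = new URL("/robots.txt", page.origin).href;
// -> "https://example.com/robots.txt"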

Validating crawl rules in scripts or CI tooling is awkward when all you have is a raw text response. The TinyUtils Robots Parse API does all of this in one GET request and returns structured JSON.

Quick solution

curl

curl "https://tinyutils.dev/api/robots-parse?url=https://example.com"

Example response

{
  "ok": true,
  "input_url": "https://example.com",
  "robots_url": "https://example.com/robots.txt",
  "found": true,
  "status": 200,
  "content_type": "text/plain",
  "sitemaps": [
    "https://example.com/sitemap.xml"
  ],
  "groups": [
    {
      "user_agents": ["*"],
      "allow": ["/public/"],
      "disallow": ["/admin/"],
      "crawl_delay": null,
      "host": null
    }
  ],
  "meta": {
    "responseTimeMs": 74,
    "cached": false,
    "rateLimitedScope": "global"
  },
  "error": null
}
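
The groups array is enough to answer "is this path blocked for this crawler?". Below is a minimal sketch using longest-prefix matching; it is illustrative only (real robots.txt matching also honors "*" and "$" wildcards, which this skips), and isDisallowed is a hypothetical helper, not part of the API:

// Decide whether a path is disallowed for a given crawler, using
// longest-prefix matching with ties going to allow (wildcards ignored).
function isDisallowed(groups, userAgent, path) {
  const group =
    groups.find((g) => g.user_agents.includes(userAgent)) ??
    groups.find((g) => g.user_agents.includes("*"));
  if (!group) return false;
  const longest = (rules) =>
    rules
      .filter((rule) => path.startsWith(rule))
      .sort((a, b) => b.length - a.length)[0] ?? "";
  return longest(group.disallow).length > longest(group.allow).length;
}

// With the example response above:
// isDisallowed(data.groups, "Googlebot", "/admin/settings") -> true
// isDisallowed(data.groups, "Googlebot", "/public/page")    -> false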

Use cases

Typical uses include validating crawl rules in CI before a deploy goes live, extracting sitemap references without scraping raw text, checking which paths are blocked for a specific crawler, and handling sites that have no robots.txt at all via the found flag.

JavaScript example

JavaScript (fetch)

const res = await fetch(
  "https://tinyutils.dev/api/robots-parse?url=https://example.com"
);
const data = await res.json();

if (!data.ok) {
  console.error("Lookup failed:", data.error);
} else if (data.found) {
  console.log("Sitemaps:", data.sitemaps);
  for (const group of data.groups) {
    console.log("User-agents:", group.user_agents);
    console.log("Disallow:", group.disallow);
  }
} else {
  console.log(`robots.txt not found (status: ${data.status})`);
}
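
In CI, the same call can gate a deploy. A minimal sketch for Node 18+ (the file name, the "sitemap required" policy, and the exit codes are assumptions for illustration, not API features):

// ci-robots-check.mjs: fail the build if robots.txt is missing
// or declares no sitemap.
const res = await fetch(
  "https://tinyutils.dev/api/robots-parse?url=https://example.com"
);
const data = await res.json();

if (!data.ok || !data.found) {
  console.error("robots.txt missing or unreadable:", data.error ?? data.status);
  process.exit(1);
}
if (data.sitemaps.length === 0) {
  console.error("robots.txt declares no sitemap");
  process.exit(1);
}
console.log("robots.txt OK:", data.robots_url);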

See also

Try the Robots Parse tool

Enter any URL and inspect its robots.txt instantly.

Open Robots Parse →