Robots Parse

Fetch and parse robots.txt for any site. The endpoint returns crawler rules, sitemap references, and user-agent groups as clean JSON.

GET /api/robots-parse?url={site-url}

Example response

{
  "ok": true,
  "input_url": "https://example.com",
  "robots_url": "https://example.com/robots.txt",
  "found": true,
  "status": 200,
  "content_type": "text/plain",
  "sitemaps": [
    "https://example.com/sitemap.xml"
  ],
  "groups": [
    {
      "user_agents": ["*"],
      "allow": ["/public/"],
      "disallow": ["/admin/"],
      "crawl_delay": null,
      "host": null
    }
  ],
  "meta": {
    "responseTimeMs": 74,
    "cached": false,
    "rateLimitedScope": "global"
  },
  "error": null
}

What it returns

robots_url is the resolved robots.txt location for input_url, and found and status report whether the file exists and the HTTP status of the fetch. content_type echoes the Content-Type of the fetched file. sitemaps lists every Sitemap directive discovered. groups contains one entry per user-agent block, each with its user_agents, allow and disallow rules, and any crawl_delay or host directive (null when absent). meta carries request diagnostics such as response time, cache status, and rate-limit scope, and error is null on success.

Use cases

Audit which paths a site blocks before crawling it, confirm that private or staging sections stay disallowed, discover sitemap URLs for indexing pipelines, or monitor robots.txt changes across deployments.

Quick API examples

curl

curl "https://tinyutils.dev/api/robots-parse?url=https://example.com"

JavaScript (fetch)

const res = await fetch(
  "https://tinyutils.dev/api/robots-parse?url=https://example.com"
);
const data = await res.json();
console.log(data.sitemaps);
console.log(data.groups);
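
The groups array can be checked against a specific path once the response is parsed. A minimal sketch, assuming simple prefix matching; real crawlers apply longest-match precedence and wildcard rules that this does not handle, and isDisallowed is an illustrative helper, not part of the API:

// Rough check: is `path` blocked for `userAgent` according to the parsed groups?
// `data` is the JSON response from the fetch example above.
function isDisallowed(data, path, userAgent = "*") {
  const group =
    data.groups.find((g) => g.user_agents.includes(userAgent)) ??
    data.groups.find((g) => g.user_agents.includes("*"));
  if (!group) return false; // no matching group: nothing is blocked
  const allowed = group.allow.some((rule) => path.startsWith(rule));
  const disallowed = group.disallow.some((rule) => path.startsWith(rule));
  return disallowed && !allowed; // prefix match only, no * or $ support
}

console.log(isDisallowed(data, "/admin/users")); // true for the example response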