Robots.txt Analyzer
Test directives against URLs and user-agents at scale.
robots.txt input
Paste your robots.txt or fetch /robots.txt directly.
Path access tester
Will a given UA be allowed to crawl a given path?
Parsed groups
Summary of the user-agent groups and sitemap lines found in the pasted file.
Start here · What is robots.txt?
Robots.txt is a small text file at the root of a website, usually example.com/robots.txt. It tells crawlers which paths they are allowed or not allowed to crawl.
Crawl control is not the same as indexing control. A blocked URL can still appear in Google if other pages link to it, but Google will not crawl the blocked page's content, so the result may show with little or no description.
Beginners should treat robots.txt carefully. One broad Disallow: / can hide an entire site from search engine crawlers.
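For orientation, a minimal robots.txt sketch could look like the lines below; the /admin/ directory and the sitemap URL are placeholders, not recommendations.

    # Applies to all compliant crawlers
    User-agent: *
    # Block everything under /admin/; all other paths stay crawlable
    Disallow: /admin/
    # Optional: point crawlers at the XML sitemap
    Sitemap: https://example.com/sitemap.xml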
When to use this tool
- Pre-launch check
Use it before a site goes live to make sure staging blocks are removed and important directories are crawlable.
- Debugging missing pages
Use it when a URL is not being crawled or Search Console mentions a robots.txt block.
- Testing rule changes
Paste proposed rules and sample URLs before asking developers to deploy them.
Examples
Walk through these with the form above — they are practice scenarios, not live data.
Accidental full-site block
Try this
Paste User-agent: * followed by Disallow: /, then test your home page URL.
What to look for
The analyzer should flag that all crawlers are blocked from crawling the site.
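For reference, the full-site block described in this scenario is just these two lines:

    # Every compliant crawler is blocked from every path
    User-agent: *
    Disallow: /

Testing your home page URL against a file like this should come back as blocked for any user-agent.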
Admin area only
Try this
Paste a rule blocking /admin/, then test /admin/login and /blog/post-1.
What to look for
The admin URL should be blocked, while the blog URL should remain allowed.
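A sketch of the rule this scenario describes, using the same sample paths:

    User-agent: *
    # Matches /admin/login and anything else under /admin/
    Disallow: /admin/
    # /blog/post-1 is not matched by any rule, so it stays allowed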
Short tutorial
Follow the steps in order the first time you use the tool; later you can skip straight to the step you need.
- Step 1 - Paste robots.txt
Use the live file or a proposed draft. Keep line breaks intact so user-agent groups stay readable.
- Step 2 - Add test URLs
Include important templates: home page, category pages, product pages, blog posts, and blocked private areas.
- Step 3 - Check user agents
Test Googlebot and * at minimum. Different user-agent groups can produce different crawl permissions; the sketch below shows why.
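For illustration only (the paths are placeholders), a file like the one below gives Googlebot different permissions than every other crawler for the same URL, which is why both should be tested:

    # Group that applies only to Googlebot
    User-agent: Googlebot
    Disallow: /admin/

    # Group that applies to every other compliant crawler
    User-agent: *
    Disallow: /

With rules like these, the home page should test as allowed for Googlebot but blocked for *.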
More detail
New here? Skim Start here first, then run one Examples scenario in the form above.
Robots.txt Analyzer does one job: test directives against URLs and user-agents at scale. It lives under Technical SEO on SEOToolkits, and the guiding idea for beginners is simple: technical SEO keeps pages crawlable, indexable, fast enough, and understandable to search engines.
FAQ
- Does robots.txt remove a page from Google?
- Not reliably. Use noindex or remove the page when you need deindexing. Robots.txt mainly controls crawling.
- Where should robots.txt live?
- It should be available at the root of the host, such as https://example.com/robots.txt.
- Can one bad rule block my whole site?
- Yes. A broad Disallow: / under User-agent: * tells all compliant crawlers not to crawl anything.
Related tools
Same workflow cluster on SEOToolkits — open another module without leaving context.