Robots.txt
Robots Exclusion Protocol
A text file at the site root instructing search engine crawlers which pages or directories to avoid.
Technical Details
The Robots Exclusion Protocol uses two mechanisms: robots.txt (file-level; prevents crawling but not indexing) and meta robots tags (page-level; control indexing and link following). Common meta robots directives: 'noindex' (exclude from search results), 'nofollow' (don't pass link equity), 'noarchive' (no cached copy). The X-Robots-Tag HTTP header provides the same controls for non-HTML resources such as PDFs and images. A page blocked only in robots.txt can still appear in search results if other pages link to it; applying 'noindex' (via the meta tag or the X-Robots-Tag header) is the reliable way to exclude it. Note that a crawler must be able to fetch a page to see its 'noindex' directive, so a page should not be both disallowed in robots.txt and marked 'noindex'.
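The page-level controls described above look like this in practice; the snippet below is an illustrative sketch (the directives are real, the context comments are assumptions):

```
<!-- In the page's <head>: exclude from search results, don't pass link equity -->
<meta name="robots" content="noindex, nofollow">

# Equivalent HTTP response header, e.g. for a PDF or image:
X-Robots-Tag: noindex, nofollow
```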
Example
```
# robots.txt
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/internal/

Sitemap: https://peasytools.com/sitemap.xml
```
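Rules like the ones above can be checked programmatically; a minimal sketch using Python's standard-library `urllib.robotparser` (the sample URLs are illustrative):

```python
from urllib import robotparser

# The Disallow rules from the robots.txt example above.
# The redundant "Allow: /" line is omitted here: allowing everything
# is already the default, and Python's parser applies rules in file
# order, so a leading "Allow: /" would mask the Disallow lines.
rules = [
    "User-agent: *",
    "Disallow: /admin/",
    "Disallow: /api/internal/",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "https://peasytools.com/admin/users"))  # False
print(rp.can_fetch("*", "https://peasytools.com/blog/post"))    # True
```

Note that major crawlers follow RFC 9309 precedence, where the most specific (longest) matching rule wins regardless of order, so real-world robots.txt files may behave slightly differently from Python's order-based parser.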