Aigentably-Bot

Aigentably-Bot is the crawler we run when an authenticated Aigentably user requests AI tool generation for a site they own. This page documents what it does, what it doesn't do, and how to control access.

Identification

User-agent string: Aigentably-Bot/1.0 (+https://aigentably.com/bot)
Operated by: Aigentably (aigentably.com)
Purpose: On-demand site analysis for WebMCP tool generation, triggered by authenticated site owners.
Frequency: At most one fetch per page per 24 hours (crawl results are cached). Pro users: up to 20 fresh crawls per 30 days.
Pages per crawl: Up to 5 pages selected from sitemap or homepage links.
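
To spot these requests in your own access logs or middleware, matching the product token in the User-Agent header is usually enough. The following is a minimal sketch, not part of Aigentably's tooling: the function name is_aigentably_bot is hypothetical, and it assumes future versions keep the "Aigentably-Bot/<version>" form. A User-Agent header can be spoofed, so treat a match as a hint rather than proof.

```python
import re

# Matches the product token from the documented User-agent string,
# e.g. "Aigentably-Bot/1.0 (+https://aigentably.com/bot)".
# Assumption: later versions keep the "Aigentably-Bot/<version>" form.
_BOT_TOKEN = re.compile(r"\bAigentably-Bot/\d+(?:\.\d+)*", re.IGNORECASE)


def is_aigentably_bot(user_agent: str) -> bool:
    """Return True if the header value carries the Aigentably-Bot token."""
    return bool(_BOT_TOKEN.search(user_agent or ""))
```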

What it does

  • Fetches /robots.txt and honors Disallow rules and Sitemap: directives.
  • Reads /sitemap.xml (or /sitemap_index.xml) when no Sitemap: entry is declared.
  • Fetches up to 5 representative HTML pages with standard GET requests.
  • Honors <meta name="robots" content="noindex"> (and googlebot / aigentably-bot variants) — those pages are not passed to the model.
  • Sends If-None-Match / If-Modified-Since on refresh — unchanged pages return 304 with no body transfer.
  • Extracts structured signals (forms, buttons, framework, JSON-LD) from each page and discards the raw HTML.
  • Caches the crawl result for 24 hours to avoid re-fetching unchanged sites.
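
If you want to check whether one of your pages carries a noindex directive that the crawler would honor, a check along the following lines may help. This is an illustrative sketch using Python's standard html.parser, not the crawler's actual implementation; the names RobotsMetaParser and has_noindex are hypothetical.

```python
from html.parser import HTMLParser

# Meta names listed above as honored by the crawler.
ROBOTS_META_NAMES = {"robots", "googlebot", "aigentably-bot"}


class RobotsMetaParser(HTMLParser):
    """Collects directives from robots-style <meta> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when written without a value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() in ROBOTS_META_NAMES:
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if any robots-style meta tag contains noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

A page for which has_noindex returns True should, per the list above, not be passed to the model.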

What it does not do

  • Does not execute JavaScript. Static HTML only.
  • Does not attempt to log in or bypass authentication. Pages returning 401/403 are reported as inaccessible and skipped.
  • Does not submit forms, click buttons, or trigger any side effects.
  • Does not crawl sites without an authenticated Aigentably user request. It is not a continuous web crawler.
  • Does not store full page content. Only the extracted structured signals are kept.

Blocking Aigentably-Bot

Add the following to your robots.txt to block the crawler entirely:

User-agent: Aigentably-Bot
Disallow: /

To block specific paths only:

User-agent: Aigentably-Bot
Disallow: /admin
Disallow: /internal/

Rules under User-agent: * are also honored.
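
Before deploying a change, you may want to confirm that your rules have the effect you expect. The sketch below uses Python's standard urllib.robotparser to evaluate a robots.txt body against the Aigentably-Bot user agent. The helper name bot_may_fetch is hypothetical, and Python's parser uses simple prefix matching, so its verdict may differ from the crawler's own matcher in edge cases.

```python
from urllib.robotparser import RobotFileParser

BOT_UA = "Aigentably-Bot"


def bot_may_fetch(robots_txt: str, path: str) -> bool:
    """Evaluate a robots.txt body for the Aigentably-Bot user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(BOT_UA, path)


# The path-specific example from this page.
rules = """\
User-agent: Aigentably-Bot
Disallow: /admin
Disallow: /internal/
"""

print(bot_may_fetch(rules, "/admin/users"))  # blocked by "Disallow: /admin"
print(bot_may_fetch(rules, "/pricing"))      # no rule matches this path
```

To test a live site instead of a string, RobotFileParser also offers set_url() and read(), which fetch the file over the network.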

Contact

Questions, abuse reports, or unexpected behavior: contact us.