# robots.txt for contented.io - Revolutionary AI Digital Labor Platform
# Transform content creation from expensive human labor to intelligent AI automation
# Last updated: 2025-06-22

# === ALLOW ALL SEARCH ENGINES ===
User-agent: *
# === PUBLIC PAGES - FULL ACCESS ===
# Main landing page (SEO-optimized About content)
Allow: /
# Core marketing pages
Allow: /pricing
Allow: /pricing/
Allow: /signup
Allow: /signup/
# Blog and content showcase
Allow: /blog
Allow: /blog/
Allow: /blog/*
# Legal and informational pages
Allow: /terms
Allow: /privacy
Allow: /about
# === PROTECTED AREAS - NO CRAWLING ===
# Authentication and user dashboard
Disallow: /app
Disallow: /app/
Disallow: /auth
Disallow: /auth/
# User-specific functionality
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /schedule
Disallow: /schedule/
Disallow: /post-history
Disallow: /post-history/
Disallow: /image-pool
Disallow: /image-pool/
# Document and file uploads
Disallow: /documents/
Disallow: /uploads/
Disallow: /files/
# API endpoints and functions
Disallow: /api/
Disallow: /functions/
Disallow: /.netlify/
# User-generated content paths
Disallow: /user/
Disallow: /profile/
Disallow: /account/
# Checkout and payment flows
Disallow: /checkout
Disallow: /checkout/
Disallow: /payment/
Disallow: /billing/
Disallow: /checkout-success
Disallow: /checkout-canceled
# Admin and internal tools
Disallow: /admin/
Disallow: /internal/
Disallow: /test/
Disallow: /dev/

# === SPECIFIC BOT INSTRUCTIONS ===
# Note: per RFC 9309, a crawler that matches one of the named groups below
# follows only that group and ignores the User-agent: * rules above.

# Google Bot - Full access to public content
User-agent: Googlebot
Allow: /
Allow: /pricing
Allow: /signup
Allow: /blog
Allow: /blog/
Allow: /terms
Allow: /privacy
Allow: /about
Disallow: /app
Disallow: /dashboard
Disallow: /settings
Disallow: /api/

# Bing Bot - Same permissions as Google
User-agent: Bingbot
Allow: /
Allow: /pricing
Allow: /signup
Allow: /blog
Allow: /blog/
Allow: /terms
Allow: /privacy
Allow: /about
Disallow: /app
Disallow: /dashboard
Disallow: /settings
Disallow: /api/

# Social Media Crawlers - Allow for rich previews
User-agent: facebookexternalhit
Allow: /
Allow: /blog
Allow: /blog/
Allow: /pricing

User-agent: Twitterbot
Allow: /
Allow: /blog
Allow: /blog/
Allow: /pricing

User-agent: LinkedInBot
Allow: /
Allow: /blog
Allow: /blog/
Allow: /pricing

# === AGGRESSIVE CRAWLERS - RATE LIMITING ===
# Slow down aggressive crawlers
User-agent: AhrefsBot
Crawl-delay: 10
Allow: /
Allow: /blog
Disallow: /app

User-agent: SemrushBot
Crawl-delay: 10
Allow: /
Allow: /blog
Disallow: /app

User-agent: MJ12bot
Crawl-delay: 5
Allow: /
Allow: /blog

# === BLOCK PROBLEMATIC BOTS ===
# Block unwanted crawlers (effective only for bots that honor robots.txt)
User-agent: DotBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Claude-Web
Disallow: /

# Block content scrapers
User-agent: HTTrack
Disallow: /

User-agent: wget
Disallow: /

User-agent: WebZIP
Disallow: /

# === FILE TYPE RESTRICTIONS ===
# Prevent crawling of sensitive file types.
# These rules need their own User-agent line; otherwise they would attach to the WebZIP group above.
User-agent: *
# Keep the sitemap reachable despite the *.xml rule (the longer match wins)
Allow: /sitemap.xml
Disallow: /*.json$
Disallow: /*.xml$
Disallow: /*.csv$
Disallow: /*.sql$
Disallow: /*.log$
Disallow: /*.config$
Disallow: /*.env$
# === CRAWL OPTIMIZATION ===
# Set a reasonable crawl delay for bots without their own group
# (Crawl-delay is non-standard; Googlebot ignores it)
Crawl-delay: 1

# === SITEMAPS ===
# Primary sitemap for all content
Sitemap: https://contented.io/sitemap.xml

# === SEO NOTES ===
# This robots.txt is optimized for:
# 1. Maximum visibility of marketing pages (/, /pricing, /signup, /blog)
# 2. Protection of user data and internal functionality
# 3. Prevention of duplicate content issues
# 4. Optimal crawl budget allocation
# 5. Social media rich preview support
# 6. Protection against AI training data harvesting