================================================================================
AISearchStackHub: LLM Visibility & AI Engine Optimization Content Corpus
================================================================================
Version: 2026.05.19
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Contact: support@aisearchstackhub.ai
Quick reference: https://aisearchstackhub.ai/llms.txt
================================================================================

## Platform Overview

### What is AISearchStackHub?

AISearchStackHub is an AI Engine Optimization (AEO) platform that measures brand visibility across the four major LLM engines (ChatGPT, Claude, Perplexity, and Gemini) and provides a systematic methodology for improving citation rates.

The platform provides a free LLM visibility scan: users enter a domain and receive an AIS Index score (0–100) with per-engine breakdowns, citation context examples, and top citation gaps. Paid plans add recurring scanning, alert systems, a citation asset library, and continuous intelligence briefings.

### Key Products

**Free Scanner**: Domain scan across ChatGPT, Claude, Perplexity, and Gemini. Returns AIS Index score, per-engine cards with mention context, top 3 citation gaps, and 5 quick-win recommendations. Email gate unlocks full results.

**Starter Plan ($99/mo)**: 1 brand, weekly scans, full report history, citation trend tracking.

**Growth Plan ($299/mo)**: 3 brands, daily scans, Slack/email alerts for score drops and hallucination detection, brand monitoring dashboard.

**Scale Plan ($899/mo)**: Unlimited brands, daily scans, AI-generated citation asset library (8 prioritized assets per month), asset approval workflow, monthly citation tracking per asset per engine.

**Enterprise Plan ($2,499/mo)**: API access, agency mode (manage client brands), dedicated AEO engineer, custom benchmarks, white-label reports.

**Continuous Intelligence ($149–$699/mo)**: Daily AI briefing delivered at 7am, covering brand mentions across all engines, hallucination alerts, score drop alerts, competitor citation matrix (Growth+), agentic task execution (Enterprise).

**Per-Output Reports**: Citation Audit ($199), Visibility Report ($99), Channel Report ($79), Stakeholder Brief ($49).

**LLM Discovery Tools**: LLMs.txt generator (Starter $49), Citation Boost with directory submission ($149), Enterprise package ($299).

### Industry Verticals

The platform produces AEO benchmarks for: B2B SaaS (median AIS 38), Ecommerce (27), Fintech (29), Healthcare (24), Agencies (41). Each vertical has a dedicated page at /industries/[vertical] with vertical-specific citation rates, ranking factors, and recommended asset types.

---

## AIS Index Methodology

### What is the AIS Index?

The AIS Index (AI Search Index) is a 0–100 score representing how well a brand is positioned for citation by large language models. It is calculated through parallel probe queries across ChatGPT (via OpenAI API), Claude (via Anthropic API), Perplexity (via PPLX API), and Gemini (via Google AI API). Each engine returns a raw mention score (0–1.0), scaled to 0–100 and aggregated across engines.

### Scoring Components

**Mention Score (per engine, 0–1.0 raw)**: Does the LLM mention the brand when queried about the relevant category? Probed with 5 category-intent prompts and 2 brand-specific prompts per engine.

**Context Quality (0–1.0)**: How accurately and favorably does the LLM describe the brand? Extracted from the mention context in LLM responses.

**Citation Breadth (0–1.0)**: How many distinct source types does the LLM cite for this brand? Types include: official website, news articles, review sites, directories, social profiles, Wikipedia/Wikidata, academic papers.

**Source Authority (0–1.0)**: Weighted by the perceived authority of cited sources for this engine. ChatGPT weights training data and high-authority publications.
Claude weights precision and authoritative primary sources. Perplexity weights recency and domain authority. Gemini weights Google index signals.

### Aggregate Score Calculation

Aggregate AIS Index = (ChatGPT_raw × 0.30 + Claude_raw × 0.25 + Perplexity_raw × 0.25 + Gemini_raw × 0.20) × 100

Engine weightings reflect audience size and commercial relevance:
- ChatGPT: largest consumer/B2B audience, highest commercial impact
- Claude: enterprise and professional audience, precision-weighted
- Perplexity: research-intent audience, fastest propagation speed
- Gemini: Google ecosystem audience, strong for local/service businesses

### Score Ranges

- 0–20: Barely or never cited. Requires foundational work: site quality, authority building, structured content.
- 20–40: Occasionally cited for brand queries only. Focus on category-intent content.
- 40–60: Regularly cited for brand and some category queries. Optimize citation sources.
- 60–80: Strong citation presence. Focus on breadth expansion and competitor gap analysis.
- 80–100: Dominant citation presence. Maintain with monitoring and hallucination detection.

### Platform-Level Aggregate Statistics (2026, N=500+ brands)

- Median AIS Index across all verticals: 31
- Top quartile: >55; Bottom quartile: <18
- SaaS vertical median: 38; Ecommerce: 27; Fintech: 29; Healthcare: 24; Agencies: 41
- Brands with active citation tracking show average +8 point improvement after 90 days of systematic asset creation (Scale plan users, N=140)

---

## How LLM Citation Works

### The Retrieval + Generation Pipeline

Modern LLMs do not simply "know" facts from training data. For queries about brands, products, and services, they combine:

1. **Training data recall**: Information encoded during model training (up to knowledge cutoff date). Brands with extensive pre-2024 coverage have baseline citation presence even without active AEO.
2. **Live web retrieval**: Models with browsing capability (ChatGPT with browsing, Perplexity always, Gemini) can fetch live content. New content can be cited within days of publication.
3. **RAG-style grounding**: Some models use retrieval-augmented generation to ground responses in fetched content. Accuracy and recency are higher for models using live retrieval.
4. **Implicit knowledge synthesis**: Models synthesize information from multiple training sources. A brand mentioned across 50 distinct training documents will appear more authoritative than one mentioned in 5.

### What Makes a Brand Citeable?

LLMs cite brands based on:

**Authority signals**: Does this brand appear in authoritative contexts? Wikipedia articles, major news coverage, industry awards, and official directories all signal authority. A brand mentioned only on its own website has minimal citation potential.

**Uniqueness**: Is the brand name distinctive enough to be disambiguated from generic terms? "Acme Robotics" is more citeable than "Acme Solutions."

**Content alignment with query intent**: LLMs cite brands that directly answer the user's question. A SaaS brand mentioned in a "best project management tools" article will be cited for that query, not for "enterprise ERP systems."

**Source breadth**: The more distinct authoritative sources mentioning a brand, the more confident the LLM is in citing it.

### Citation vs. Ranking: Why Traditional SEO Doesn't Transfer

Google ranking and LLM citation operate on fundamentally different signals:

**Google**: Backlinks, on-page SEO, E-E-A-T signals, content freshness, Core Web Vitals, structured data, internal linking, all mediated by Google's crawler and index.

**LLM citation**: Training data composition, authoritative source mentions, content structure, recency (for live-retrieval models), disambiguation clarity, schema markup quality, all mediated by the model's own citation logic.

**Key difference**: Google ranks pages. LLMs cite brands.
A brand's website can rank #1 in Google for its own brand name but be invisible in ChatGPT if it lacks authoritative third-party citations.

**Cross-signal analysis (AISearchStackHub platform data)**: Only 12% of pages ranking in Google top 10 for commercial queries appear in AI Overviews or LLM citations for equivalent queries. The correlation between Google rank and LLM citation score is weak (r = 0.31), and the relationship is not monotonic.

### Propagation Latency

- **Perplexity**: Fastest. Live web retrieval means new content can appear in citations within 24–72 hours. Median propagation: 3 days.
- **ChatGPT (browsing enabled)**: Similar to Perplexity for browsing queries. For knowledge-cutoff queries, depends on training data refresh cycle.
- **Claude**: Slower. Heavy weighting on authoritative primary sources. Median propagation: 2–4 weeks for new content.
- **Gemini**: Fast for Google-indexed content. Median propagation: 1–2 weeks.

---

## Engine-Specific Citation Behavior

### ChatGPT (OpenAI)

**Citation logic**: ChatGPT cites based on training data composition and live web retrieval (with browsing enabled in ChatGPT Plus). For brand queries, ChatGPT looks for patterns across its training corpus: brand mentions in authoritative publications, Wikipedia, news sites, and review platforms.

**Key ranking signals**:
1. Citation breadth: number of distinct authoritative sources mentioning the brand
2. Recency: for browsing-enabled queries, new well-structured content gets cited fast
3. Content format: FAQ-style content, comparison tables, and "best X for Y" articles are heavily cited
4. Wikipedia/Wikidata presence: strong signal for brand authority
5. Authoritative directories: Crunchbase, G2, Trustpilot, industry-specific directories

**Platform observation**: Brands with 20+ distinct authoritative source mentions have a 73% citation rate for brand queries. Brands with fewer than 5 sources: 8%.

### Claude (Anthropic)

**Citation logic**: Claude weights precision over breadth.
It favors authoritative primary sources and primary research over secondary coverage. Constitutional AI training means Claude is more cautious about citing brands without clear authority signals.

**Key ranking signals**:
1. Primary source quality: peer-reviewed research, official documentation, first-party data
2. Wikipedia/Wikidata accuracy: Claude cross-references these heavily
3. llms.txt file presence: Claude's crawler reads llms.txt to understand brand identity
4. Structured content with clear taxonomies and definitions

**Platform observation**: The presence of an llms.txt file correlates with +12 AIS points in Claude specifically, the highest cross-engine effect of any single optimization.

### Perplexity AI

**Citation logic**: Perplexity always uses live web retrieval and shows explicit source citations inline. It is the fastest engine to incorporate new content and the most transparent about citation sources.

**Key ranking signals**:
1. Page structure: clear headings, structured data, concise factual paragraphs
2. Domain authority: Perplexity uses Google's indexing, so strong Google SEO signals translate directly
3. Recency: newer content is prioritized for many queries
4. Backlink profile: secondary signal via Google's index
5. Schema.org markup: FAQPage and HowTo schemas are explicitly parsed

**Platform observation**: New well-structured content can appear in Perplexity citations within 3–7 days of publication, the fastest of any engine.

### Google Gemini

**Citation logic**: Gemini integrates with Google's search index and Knowledge Graph. Brand visibility in Gemini is heavily dependent on Knowledge Panel completeness, structured data accuracy, and Google Business Profile optimization.

**Key ranking signals**:
1. Knowledge Graph entities: accurate, complete entity data in Google's Knowledge Graph
2. Google Business Profile: completeness and recency of GBP data
3. Schema.org markup: Organization, LocalBusiness, Product schemas
4. Featured snippet capture: pages ranking for featured snippets have higher citation rates
5. E-E-A-T via Google's signals: author expertise, site authority, citation patterns

**Platform observation**: Brands with complete Knowledge Graph entries score 15 AIS points higher in Gemini than brands with incomplete entries.

---

## AEO Tactics Guide

### Tier 1: Foundational (AIS 0–30)

**1. Claim and Optimize Google Business Profile**
Gemini is heavily dependent on Google signals. Complete your GBP with accurate NAP (name, address, phone), photos, hours, services, and posts. This feeds the Knowledge Graph and improves Gemini citations.

**2. Create an llms.txt File**
The single highest-ROI AEO tactic. An llms.txt file tells AI engines what your brand does, your key pages, and how to describe you accurately. Claude's crawler reads it explicitly. Place at: https://yourdomain.com/llms.txt

**3. Build Wikipedia/Wikidata Presence**
Wikipedia articles and Wikidata entries are top citation sources for all four engines. If a notable brand lacks a Wikipedia article, it is almost always absent from Claude citations.

**4. Target Category-Intent Content**
Create content that directly answers category queries. "Best [category] tools for [use case]" is the most-cited content format in ChatGPT and Perplexity.

**5. Get Listed in Major Directories**
G2, Crunchbase, Trustpilot, Capterra, and industry-specific directories are heavily cited. Each listing is a distinct authoritative source.

### Tier 2: Intermediate (AIS 30–60)

**6. Publish Primary Research**
Original data, benchmarks, and surveys are top-cited content types across all engines. LLMs extensively cite statements of the form "We analyzed 500 brands and found X."

**7. Add FAQPage Schema to Key Pages**
Perplexity and Gemini explicitly parse FAQPage schema. Add structured FAQ markup to your most important pages.

**8. Create an agents.json File**
The agents.json spec declares AI agent-compatible information about your brand.
Include: brand name, description, contact, supported actions, API endpoints.

**9. Earn Media Coverage in Authoritative Outlets**
News coverage from tier-1 publications (TechCrunch, Forbes, The Verge) is the highest-authority citation source. Focus on story-driven coverage.

**10. Optimize First 150 Words of Homepage**
LLMs weight page openings heavily for entity disambiguation. The first 150 words should clearly state: what the brand does, who it's for, what makes it different.

### Tier 3: Advanced (AIS 60–100)

**11. Build a Citation Asset Library**
Systematically create content assets designed for LLM citation: calculators, comparison frameworks, benchmark reports, industry glossaries, decision trees.

**12. Implement Comprehensive Schema.org Markup**
Organization, Product, FAQPage, HowTo, Review, and SpeakableSpecification schemas across your site. Use JSON-LD format. Validate with Google's Rich Results Test.

**13. Develop Topic Cluster Architecture**
Create pillar pages with 10+ supporting articles on interconnected topics. This builds topical authority.

**14. Run Continuous Citation Monitoring**
Track your AIS Index weekly. Set alerts for score drops >5 points. Respond to hallucinations by creating accurate authoritative content.

---

## Proprietary AI Citation Patterns

### Platform-Observed Patterns (AISearchStackHub, N=500+ brands)

**Pattern 1: The Citation Threshold Effect**
Brands need a minimum number of distinct authoritative sources before consistent LLM citation begins. Observed threshold: 12–15 distinct authoritative mentions. Below this: sporadic, unreliable citation. Above this: consistent citation for brand and category queries.

**Pattern 2: The Primary Source Premium**
LLMs, especially Claude, cite primary sources over secondary coverage. A brand with one original research report is more likely to be cited than a brand with 50 second-hand mentions.

**Pattern 3: Perplexity First, Others Follow**
New content appears in Perplexity citations 2–3 weeks before ChatGPT and 3–4 weeks before Claude. Use Perplexity as the leading indicator for AEO effectiveness.

**Pattern 4: The Directory Citation Chain**
Directories (G2, Crunchbase, Capterra, Trustpilot) are cited by all four engines at high rates because they are perceived as authoritative and neutral. A brand listed in 10+ directories has a "citation floor."

**Pattern 5: The Negative Sentiment Amplifier**
Brands with negative sentiment in training data are cited less frequently and with more caveats. This is self-reinforcing: fewer accurate citations → more hallucinated negative references → further erosion of citation quality.

**Pattern 6: Schema Markup Accuracy Effect**
Sites with accurate, comprehensive JSON-LD schema markup score 18 AIS points higher in Perplexity and Gemini than sites with no schema or inaccurate schema.

**Pattern 7: llms.txt Compound Effect**
The presence of an llms.txt file correlates with +22 AIS points in Claude specifically. For brands with llms.txt files, improvement in overall AIS Index after 90 days of active AEO is +12 points vs. +6 points for brands without one.

---

## LLM Visibility Benchmarks by Vertical

### B2B SaaS (Median AIS Index: 38)

- Top-cited content types: comparison frameworks, ROI calculators, integration compatibility lists, case studies with specific metrics.
- Citation leaders: brands with benchmark reports, open-source tooling presence, and an active blog publishing schedule.
- Key engines: ChatGPT and Perplexity (B2B buyer research intent).

### Ecommerce (Median AIS Index: 27)

- Top-cited content types: product comparisons, "best X for Y" articles, review summaries, use-case guides.
- Citation leaders: brands with comprehensive product documentation and review program partnerships.
- Key engines: Gemini (Google ecosystem) and ChatGPT (shopping research).

### Fintech (Median AIS Index: 29)

- Top-cited content types: security documentation, compliance explainers, fee comparison tools, case studies with financial impact data.
- Citation leaders: brands with regulatory compliance coverage and industry research publications.
- Key engines: Perplexity (research intent) and Claude (enterprise audience).

### Healthcare (Median AIS Index: 24)

- Top-cited content types: clinical evidence summaries, provider credentials, treatment comparison guides, cost calculators.
- Citation leaders: brands with published clinical data and medical advisory boards.
- Key engines: Claude (healthcare professionals) and Gemini (consumer health queries).

### Agencies (Median AIS Index: 41)

- Top-cited content types: methodology frameworks, case studies, benchmark reports, tool comparisons, thought leadership articles.
- Citation leaders: agencies with published original research and clear positioning.
- Key engines: ChatGPT and Perplexity (buyers evaluating agencies).

---

## Structured Data Reference for AEO

### Recommended Schema Types

**Organization Schema** (all pages):
- name, url, logo, description, sameAs (social profiles), contactPoint
- Place on homepage in JSON-LD format

**FAQPage Schema** (key content pages):
- mainEntity: array of Question/Answer pairs
- Parseable by Perplexity and Gemini; directly influences cited Q&A content
- Max benefit: homepage, category pages, service pages, blog posts

**HowTo Schema** (tutorial and process content):
- step array with name and text for each step

**Product Schema** (ecommerce and SaaS product pages):
- name, description, image, brand, offers (price, availability, sku)
- Required for Gemini product citations

**SpeakableSpecification** (AI-friendly content sections):
- cssSelector pointing to key content sections that should be cited by AI engines
- Place on homepage and key content pages

**Article + BreadcrumbList** (blog posts and research):
- headline, datePublished, dateModified, author, publisher
- BreadcrumbList for topical context

### Schema Implementation Checklist

1. Homepage: Organization + SpeakableSpecification
2. Product/Service pages: Product/Service + Offer + FAQPage
3. Blog posts: Article + BreadcrumbList + FAQPage (if applicable)
4. Comparison pages: FAQPage + Product (for each compared item)
5. About/team pages: Organization + Person + BreadcrumbList
6. Contact page: Organization + contactPoint schema

---

## Content Strategy for LLM Citation

### The Citeable Content Framework

Content that gets cited by LLMs shares these characteristics:

**1. Factual depth over marketing prose**
LLMs cite content that provides specific, verifiable information: prices, metrics, comparisons, definitions. "Our tool is fast" is unciteable. "Our tool processes requests in 47ms (p99) vs. industry average of 120ms" is highly citeable.

**2. Disagreement-free factual statements**
LLMs avoid citing content with contested claims. "Product X is the best" is unciteable. "Product X has a 4.7/5 rating on G2 from 1,240 verified reviews" is highly citeable.

**3. Structural clarity**
Clear headings, bullet points, tables, and numbered lists are parsed better by LLMs than prose paragraphs. Every section should be structured for scannability.

**4. Query-match clarity**
The content's primary query should be answerable from the first 100 words. LLMs cite based on initial content assessment.

**5. Author attribution and expertise signals**
Named authors with credentials, LinkedIn profiles, and demonstrated domain expertise increase citation rates.

### Content Types by Citation Rate

1. **Original research reports**: 89% citation rate
2. **Calculators and interactive tools**: 76% citation rate
3. **Comparison frameworks with specific data**: 71% citation rate
4. **Case studies with quantified outcomes**: 67% citation rate
5. **FAQ content with schema markup**: 64% citation rate
6. **Glossaries and definition lists**: 58% citation rate
7. **How-to guides with step-by-step instructions**: 54% citation rate
8. **Industry thought leadership**: 48% citation rate
9. **Product announcements and press releases**: 12% citation rate
10. **Company blog posts without original data**: 8% citation rate

---

## Measurement Methodology

### Tracking AEO Progress

**Weekly AIS Index Scan**: Run weekly scans for all tracked brands. Plot the AIS Index over time. A sustained upward trend indicates effective AEO execution. Flag any score drop >5 points as a potential hallucination event.

**Engine-Specific Tracking**: Track each engine's score separately. A brand with a strong ChatGPT score but weak Perplexity score needs Perplexity-specific tactics.

**Citation Source Audit**: Monthly, manually verify the accuracy of LLM-cited information about your brand. Use 5 standard prompts across 4 engines.

**Competitive Citation Analysis**: Monthly, run competitor brand scans to track their citation patterns. Note which content types they're getting cited for.

### AEO ROI Calculation

Revenue impact of AEO improvements can be estimated:

**Upper-funnel attribution**: Track referral traffic from AI engines (using UTM parameters on the links that LLMs cite).

**Citation halo**: Survey new customers on how they discovered your brand. If "found via ChatGPT/Claude/Perplexity" increases over time, AEO efforts are working.

**Competitive displacement**: Track rankings for category queries in LLM responses over time. Moving from "not cited" to "cited for X query" in ChatGPT for a key category drives measurable traffic.
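
As a rough illustration of the weekly tracking described above, the sketch below combines per-engine raw scores using the weights from the Aggregate Score Calculation section and flags any week-over-week drop of more than 5 points. The function names, the one-decimal rounding, and the sample scan data are assumptions made for this example and are not part of the AISearchStackHub platform or its API.

```python
# Engine weights as stated in the Aggregate Score Calculation section.
ENGINE_WEIGHTS = {
    "chatgpt": 0.30,
    "claude": 0.25,
    "perplexity": 0.25,
    "gemini": 0.20,
}

# Alert threshold from the text: flag any drop greater than 5 points.
DROP_THRESHOLD = 5.0


def aggregate_ais(raw_scores: dict[str, float]) -> float:
    """Combine per-engine raw scores (0-1.0) into a 0-100 AIS Index.

    Rounding to one decimal place is an assumption of this sketch.
    """
    missing = ENGINE_WEIGHTS.keys() - raw_scores.keys()
    if missing:
        raise ValueError(f"missing engine scores: {sorted(missing)}")
    weighted = sum(raw_scores[e] * w for e, w in ENGINE_WEIGHTS.items())
    return round(weighted * 100, 1)


def flag_drops(weekly_index: list[float]) -> list[int]:
    """Return positions of weeks whose score fell more than
    DROP_THRESHOLD points below the previous week's score."""
    return [
        i
        for i in range(1, len(weekly_index))
        if weekly_index[i - 1] - weekly_index[i] > DROP_THRESHOLD
    ]


# Hypothetical weekly scans for one brand, for illustration only.
scans = [
    {"chatgpt": 0.50, "claude": 0.40, "perplexity": 0.30, "gemini": 0.20},
    {"chatgpt": 0.52, "claude": 0.40, "perplexity": 0.34, "gemini": 0.20},
    {"chatgpt": 0.30, "claude": 0.36, "perplexity": 0.30, "gemini": 0.20},
]
history = [aggregate_ais(s) for s in scans]
print(history)
print(flag_drops(history))
```

With this sample data the third scan falls well below the second, so its position is the one flagged. Keeping the per-engine raw scores alongside the aggregate makes it possible to see which engine caused a drop, in line with the engine-specific tracking recommendation.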

---

## Quick-Start AEO Checklist

Week 1:
- Create llms.txt file at domain root
- Add Organization schema to homepage in JSON-LD
- Claim/optimize Google Business Profile
- Submit site to Google via Google Search Console (GSC)

Week 2:
- List brand in G2, Crunchbase, and Capterra
- Create Wikipedia article (if notable) or Wikidata entry
- Add FAQPage schema to top 5 pages by traffic

Week 3:
- Publish one comparison article with specific, verifiable data
- Add SpeakableSpecification to homepage
- Create agents.json file

Week 4:
- Run first AIS Index scan
- Review citation gaps
- Publish one piece of original research or benchmark data

Ongoing:
- Run weekly AIS Index scans
- Monitor hallucination alerts
- Publish monthly original research
- Build directory and media presence
- Track competitive citation patterns

---

## Appendix: LLM Behavior Summary by Engine

### ChatGPT

- Best for: Broad B2B and consumer reach
- Speed: 3–14 days for new content citation (browsing); training-data dependent otherwise
- Key tactic: Authoritative source breadth (Wikipedia, directories, media coverage)
- Failure mode: Brand names that are common English words

### Claude

- Best for: Enterprise and professional audiences
- Speed: 2–4 weeks for new content citation
- Key tactic: llms.txt + primary source quality + Wikipedia/Wikidata
- Failure mode: Thin content without expert backing; conflicting information across sources

### Perplexity

- Best for: Research-intent buyers, tech-savvy audiences
- Speed: 1–3 days for new content citation
- Key tactic: Content freshness + schema markup + domain authority
- Failure mode: Thin content; JavaScript-heavy pages; content behind paywalls or login walls

### Gemini

- Best for: Google ecosystem users, local/service businesses
- Speed: 1–2 weeks for new content citation (via Google index)
- Key tactic: Knowledge Graph optimization + Google Business Profile + featured snippets
- Failure mode: Incomplete entity data; inaccurate structured data; NAP inconsistencies
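
The per-engine citation speeds summarized above could be turned into a simple re-check schedule for newly published content. The sketch below is one possible way to do that: the day ranges are transcribed from the Speed entries in this appendix (weeks converted to days), while the function name and the sample publication date are hypothetical.

```python
from datetime import date, timedelta

# Citation windows in days, transcribed from the Speed entries above.
# ChatGPT's window applies to browsing-enabled queries only.
CITATION_WINDOW_DAYS = {
    "perplexity": (1, 3),
    "chatgpt": (3, 14),
    "gemini": (7, 14),
    "claude": (14, 28),
}


def recheck_schedule(published: date) -> dict[str, tuple[date, date]]:
    """Map each engine to the (earliest, latest) date on which newly
    published content would be expected to appear in citations."""
    return {
        engine: (published + timedelta(days=lo), published + timedelta(days=hi))
        for engine, (lo, hi) in CITATION_WINDOW_DAYS.items()
    }


# Hypothetical publication date, for illustration only.
schedule = recheck_schedule(date(2026, 6, 1))
for engine, (earliest, latest) in schedule.items():
    print(f"{engine:<11} check between {earliest} and {latest}")
```

Ordering the engines from fastest to slowest mirrors the advice to treat Perplexity as the leading indicator: if content has not appeared there by the end of its window, later checks on the slower engines are less likely to succeed.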

================================================================================
End of AISearchStackHub AEO Content Corpus v2026.05.19
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Source: https://aisearchstackhub.ai
================================================================================