================================================================================
AISearchStackHub — LLM Visibility & AI Engine Optimization Content Corpus
================================================================================
Version: 2026.05.19
License: CC BY 4.0 — https://creativecommons.org/licenses/by/4.0/
Contact: support@aisearchstackhub.ai

Quick reference: https://aisearchstackhub.ai/llms.txt
================================================================================

## Platform Overview

### What is AISearchStackHub?

AISearchStackHub is an AI Engine Optimization (AEO) platform that measures brand
visibility across the four major LLM engines — ChatGPT, Claude, Perplexity, and
Gemini — and provides a systematic methodology for improving citation rates.

The platform provides a free LLM visibility scan: users enter a domain and receive
an AIS Index score (0–100) with per-engine breakdowns, citation context examples,
and top citation gaps. Paid plans add recurring scanning, alert systems, a
citation asset library, and continuous intelligence briefings.

### Key Products

**Free Scanner**: Domain scan across ChatGPT, Claude, Perplexity, and Gemini.
Returns AIS Index score, per-engine cards with mention context, top 3 citation
gaps, and 5 quick-win recommendations. Email gate unlocks full results.

**Starter Plan ($99/mo)**: 1 brand, weekly scans, full report history,
citation trend tracking.

**Growth Plan ($299/mo)**: 3 brands, daily scans, Slack/email alerts for
score drops and hallucination detection, brand monitoring dashboard.

**Scale Plan ($899/mo)**: Unlimited brands, daily scans, AI-generated
citation asset library (8 prioritized assets per month), asset approval workflow,
monthly citation tracking per asset per engine.

**Enterprise Plan ($2,499/mo)**: API access, agency mode (manage client brands),
dedicated AEO engineer, custom benchmarks, white-label reports.

**Continuous Intelligence ($149–$699/mo)**: Daily AI briefing delivered at 7am,
covering brand mentions across all engines, hallucination alerts, score drop
alerts, competitor citation matrix (Growth+), agentic task execution (Enterprise).

**Per-Output Reports**: Citation Audit ($199), Visibility Report ($99),
Channel Report ($79), Stakeholder Brief ($49).

**LLM Discovery Tools**: LLMs.txt generator (Starter $49), Citation Boost with
directory submission ($149), Enterprise package ($299).

### Industry Verticals

The platform produces AEO benchmarks for: B2B SaaS (median AIS 38), Ecommerce (27),
Fintech (29), Healthcare (24), Agencies (41). Each vertical has a dedicated page
at /industries/[vertical] with vertical-specific citation rates, ranking factors,
and recommended asset types.

---

## AIS Index Methodology

### What is the AIS Index?

The AIS Index (AI Search Index) is a 0–100 score representing how well a brand
is positioned for citation by large language models. It is calculated through
parallel probe queries across ChatGPT (via OpenAI API), Claude (via Anthropic API),
Perplexity (via PPLX API), and Gemini (via Google AI API). Each engine returns a
raw mention score (0–1.0), scaled to 0–100 and aggregated across engines.
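
As a rough illustration of the parallel probing described above, the following sketch fans out one probe per engine and scales each raw score to 0–100. The `probe_engine` function and its placeholder values are hypothetical stand-ins, not the platform's actual API clients.

```python
from concurrent.futures import ThreadPoolExecutor

ENGINES = ["chatgpt", "claude", "perplexity", "gemini"]


def probe_engine(engine: str, domain: str) -> float:
    """Hypothetical stand-in for a real API call; returns a raw score in [0, 1]."""
    # Placeholder values so the sketch runs offline. A real probe would send
    # prompts to the engine's API and score the responses.
    placeholder = {"chatgpt": 0.42, "claude": 0.30, "perplexity": 0.55, "gemini": 0.25}
    return placeholder[engine]


def scan(domain: str) -> dict[str, float]:
    """Probe all engines in parallel and scale each raw score to 0-100."""
    with ThreadPoolExecutor(max_workers=len(ENGINES)) as pool:
        # pool.map preserves input order, so results line up with ENGINES.
        raw_scores = pool.map(lambda engine: probe_engine(engine, domain), ENGINES)
        return {engine: round(raw * 100, 1) for engine, raw in zip(ENGINES, raw_scores)}


print(scan("example.com"))
```

Threads are a reasonable fit here because each probe would spend most of its time waiting on a network response.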

### Scoring Components

**Mention Score (per engine, 0–1.0 raw)**: Does the LLM mention the brand when
queried about the relevant category? Probed with 5 category-intent prompts and
2 brand-specific prompts per engine.

**Context Quality (0–1.0)**: How accurately and favorably does the LLM describe
the brand? Extracted from the mention context in LLM responses.

**Citation Breadth (0–1.0)**: How many distinct source types does the LLM cite
for this brand? Types include: official website, news articles, review sites,
directories, social profiles, Wikipedia/Wikidata, academic papers.

**Source Authority (0–1.0)**: Weighted by the perceived authority of cited sources
for this engine. ChatGPT weights training data and high-authority publications.
Claude weights precision and authoritative primary sources. Perplexity weights
recency and domain authority. Gemini weights Google index signals.
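
The exact scoring rules for these components are not spelled out here. As one possible reading of the Mention Score, the sketch below treats it as the fraction of probe responses that name the brand. The substring match and the sample responses are assumptions made for illustration only.

```python
def mention_score(brand: str, responses: list[str]) -> float:
    """Fraction of responses that mention the brand (case-insensitive substring match)."""
    if not responses:
        return 0.0
    hits = sum(1 for text in responses if brand.lower() in text.lower())
    return hits / len(responses)


# Seven toy responses standing in for 5 category-intent + 2 brand-specific prompts.
responses = [
    "Popular options include Acme Robotics and two competitors.",
    "Several vendors serve this market.",
    "Acme Robotics is often recommended for small teams.",
    "There is no single best tool.",
    "Consider pricing and integrations first.",
    "Acme Robotics builds warehouse automation software.",
    "ACME ROBOTICS was founded to automate picking.",
]
print(round(mention_score("Acme Robotics", responses), 2))  # 0.57 (4 of 7 responses)
```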

### Aggregate Score Calculation

Aggregate AIS Index = (ChatGPT_raw × 0.30 + Claude_raw × 0.25 + Perplexity_raw × 0.25 + Gemini_raw × 0.20) × 100

Engine weightings reflect audience size and commercial relevance:
- ChatGPT: largest consumer/B2B audience, highest commercial impact
- Claude: enterprise and professional audience, precision-weighted
- Perplexity: research-intent audience, fastest propagation speed
- Gemini: Google ecosystem audience, strong for local/service businesses
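
The weighted formula above can be sketched in a few lines. The function and dictionary names are illustrative; only the weights and the ×100 scaling come from the formula itself.

```python
ENGINE_WEIGHTS = {"chatgpt": 0.30, "claude": 0.25, "perplexity": 0.25, "gemini": 0.20}


def aggregate_ais_index(raw_scores: dict[str, float]) -> float:
    """Weighted sum of per-engine raw scores (each in [0, 1]), scaled to 0-100."""
    missing = set(ENGINE_WEIGHTS) - set(raw_scores)
    if missing:
        raise ValueError(f"missing engine scores: {sorted(missing)}")
    for engine, raw in raw_scores.items():
        if not 0.0 <= raw <= 1.0:
            raise ValueError(f"{engine} raw score {raw} is outside [0, 1]")
    total = sum(ENGINE_WEIGHTS[engine] * raw_scores[engine] for engine in ENGINE_WEIGHTS)
    return round(total * 100, 1)


# 0.30*0.40 + 0.25*0.30 + 0.25*0.50 + 0.20*0.20 = 0.36
print(aggregate_ais_index(
    {"chatgpt": 0.40, "claude": 0.30, "perplexity": 0.50, "gemini": 0.20}
))  # 36.0
```

Because the weights sum to 1.0, a brand scoring 1.0 on every engine lands at exactly 100.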

### Score Ranges

- 0–20: Barely or never cited. Requires foundational work: site quality, authority building, structured content.
- 20–40: Occasionally cited for brand queries only. Focus on category-intent content.
- 40–60: Regularly cited for brand and some category queries. Optimize citation sources.
- 60–80: Strong citation presence. Focus on breadth expansion and competitor gap analysis.
- 80–100: Dominant citation presence. Maintain with monitoring and hallucination detection.
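
The ranges above share their boundary values (20, 40, 60, 80). The sketch below maps a score to its range under the assumption that each lower bound is inclusive, so a score of exactly 40 falls in the 40–60 range. That tie-breaking rule is an assumption, not something stated above.

```python
# (lower bound, label) pairs, checked from the highest band down.
BANDS = [
    (80, "Dominant citation presence"),
    (60, "Strong citation presence"),
    (40, "Regularly cited for brand and some category queries"),
    (20, "Occasionally cited for brand queries only"),
    (0, "Barely or never cited"),
]


def score_band(ais_index: float) -> str:
    """Return the label of the score range that contains ais_index."""
    if not 0 <= ais_index <= 100:
        raise ValueError("AIS Index must be between 0 and 100")
    for lower, label in BANDS:
        if ais_index >= lower:
            return label
    return BANDS[-1][1]


print(score_band(31))  # Occasionally cited for brand queries only
```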

### Platform-Level Aggregate Statistics (2026, N=500+ brands)

- Median AIS Index across all verticals: 31
- Top quartile: >55; Bottom quartile: <18
- SaaS vertical median: 38; Ecommerce: 27; Fintech: 29; Healthcare: 24; Agencies: 41
- Brands with active citation tracking show average +8 point improvement after 90 days
  of systematic asset creation (Scale plan users, N=140)

---

## How LLM Citation Works

### The Retrieval + Generation Pipeline

Modern LLMs do not simply "know" facts from training data. For queries about
brands, products, and services, they combine:

1. **Training data recall**: Information encoded during model training (up to
   the knowledge cutoff date). Brands with extensive pre-2024 coverage have
   a baseline citation presence even without active AEO.

2. **Live web retrieval**: Models with browsing capability (ChatGPT with browsing,
   Perplexity always, Gemini) can fetch live content. New content can be cited
   within days of publication.

3. **RAG-style grounding**: Some models use retrieval-augmented generation to
   ground responses in fetched content. Accuracy and recency are higher for
   models using live retrieval.

4. **Implicit knowledge synthesis**: Models synthesize information from multiple
   training sources. A brand mentioned across 50 distinct training documents
   will appear more authoritative than one mentioned in 5.

### What Makes a Brand Citeable?

LLMs cite brands based on:

**Authority signals**: Does this brand appear in authoritative contexts?
Wikipedia articles, major news coverage, industry awards, and official directories
all signal authority. A brand mentioned only on its own website has minimal
citation potential.

**Uniqueness**: Is the brand name distinctive enough to be disambiguated from
generic terms? "Acme Robotics" is more citeable than "Acme Solutions."

**Content alignment with query intent**: LLMs cite brands that directly answer
the user's question. A SaaS brand mentioned in a "best project management tools"
article will be cited for that query — not for "enterprise ERP systems."

**Source breadth**: The more distinct authoritative sources mentioning a brand,
the more confident the LLM is in citing it.

### Citation vs. Ranking: Why Traditional SEO Doesn't Transfer

Google ranking and LLM citation operate on fundamentally different signals:

**Google**: Backlinks, on-page SEO, E-E-A-T signals, content freshness, Core Web Vitals,
   structured data, internal linking — all mediated by Google's crawler and index.

**LLM citation**: Training data composition, authoritative source mentions, content
   structure, recency (for live-retrieval models), disambiguation clarity,
   schema markup quality — all mediated by the model's own citation logic.

**Key difference**: Google ranks pages. LLMs cite brands. A brand's website can rank
#1 in Google for its own brand name but be invisible in ChatGPT if it lacks
authoritative third-party citations.

**Cross-signal analysis (AISearchStackHub platform data)**: Only 12% of pages
ranking in Google top 10 for commercial queries appear in AI Overviews or LLM
citations for equivalent queries. The correlation between Google rank and LLM
citation score is r=0.31 — weak, and not monotonic.

### Propagation Latency

- **Perplexity**: Fastest. Live web retrieval means new content can appear in
  citations within 24–72 hours. Median propagation: 3 days.
- **ChatGPT (browsing enabled)**: Similar to Perplexity for browsing queries.
  For knowledge-cutoff queries, depends on training data refresh cycle.
- **Claude**: Slower. Heavy weighting on authoritative primary sources.
  Median propagation: 2–4 weeks for new content.
- **Gemini**: Fast for Google-indexed content. Median propagation: 1–2 weeks.

---

## Engine-Specific Citation Behavior

### ChatGPT (OpenAI)

**Citation logic**: ChatGPT cites based on training data composition and live web
retrieval (with browsing enabled in ChatGPT Plus). For brand queries, ChatGPT
looks for patterns across its training corpus: brand mentions in authoritative
publications, Wikipedia, news sites, and review platforms.

**Key ranking signals**:
1. Citation breadth: number of distinct authoritative sources mentioning the brand
2. Recency: for browsing-enabled queries, new well-structured content gets cited fast
3. Content format: FAQ-style content, comparison tables, and "best X for Y" articles
   are heavily cited
4. Wikipedia/Wikidata presence: strong signal for brand authority
5. Authoritative directories: Crunchbase, G2, Trustpilot, industry-specific directories

**Platform observation**: Brands with 20+ distinct authoritative source mentions
have a 73% citation rate for brand queries. Brands with <5 sources: 8%.

### Claude (Anthropic)

**Citation logic**: Claude weights precision over breadth. Favors authoritative
primary sources and primary research over secondary coverage. Constitutional AI
training means Claude is more cautious about citing brands without clear authority
signals.

**Key ranking signals**:
1. Primary source quality: peer-reviewed research, official documentation, first-party data
2. Wikipedia/Wikidata accuracy: Claude cross-references these heavily
3. llms.txt file presence: Claude's crawler reads llms.txt to understand brand identity
4. Structured content with clear taxonomies and definitions

**Platform observation**: The presence of an llms.txt file correlates with +12 AIS
points in Claude specifically — the highest cross-engine effect of any single
optimization.

### Perplexity AI

**Citation logic**: Perplexity always uses live web retrieval and shows explicit
source citations inline. Fastest to incorporate new content. Most transparent
about citation sources.

**Key ranking signals**:
1. Page structure: clear headings, structured data, concise factual paragraphs
2. Domain authority: Perplexity uses Google's indexing — strong Google SEO signals
   translate directly
3. Recency: newer content is prioritized for many queries
4. Backlink profile: secondary signal via Google's index
5. Schema.org markup: FAQPage and HowTo schemas are explicitly parsed

**Platform observation**: New well-structured content can appear in Perplexity
citations within 3–7 days of publication — the fastest of any engine.

### Google Gemini

**Citation logic**: Gemini integrates with Google's search index and Knowledge
Graph. Brand visibility in Gemini is heavily dependent on Knowledge Panel
completeness, structured data accuracy, and Google Business Profile optimization.

**Key ranking signals**:
1. Knowledge Graph entities: accurate, complete entity data in Google's Knowledge Graph
2. Google Business Profile: completeness and recency of GBP data
3. Schema.org markup: Organization, LocalBusiness, Product schemas
4. Featured snippet capture: pages ranking for featured snippets have higher citation rates
5. E-E-A-T via Google's signals: author expertise, site authority, citation patterns

**Platform observation**: Brands with complete Knowledge Graph entries score +15
AIS points higher in Gemini vs. brands with incomplete entries.

---

## AEO Tactics Guide

### Tier 1: Foundational (AIS 0–30)

**1. Claim and Optimize Google Business Profile**
Gemini is heavily dependent on Google signals. Complete your GBP with accurate
 NAP (name, address, phone) details, photos, hours, services, and posts. This
 feeds the Knowledge Graph and improves Gemini citations.

**2. Create an llms.txt File**
The single highest-ROI AEO tactic. An llms.txt file tells AI engines what your
 brand does, your key pages, and how to describe you accurately. Claude's crawler
 reads it explicitly. Place at: https://yourdomain.com/llms.txt

**3. Build Wikipedia/Wikidata Presence**
Wikipedia articles and Wikidata entries are top citation sources for all four engines.
 If a notable brand lacks a Wikipedia article, it is almost always absent from
 Claude citations.

**4. Target Category-Intent Content**
Create content that directly answers category queries: "best [category] tools for
 [use case]" — the most-cited content format in ChatGPT and Perplexity.

**5. Get Listed in Major Directories**
G2, Crunchbase, Trustpilot, Capterra, and industry-specific directories are heavily
 cited. Each listing is a distinct authoritative source.

### Tier 2: Intermediate (AIS 30–60)

**6. Publish Primary Research**
Original data, benchmarks, and surveys are top-cited content types across all engines.
 "We analyzed 500 brands and found X" — LLMs cite this extensively.

**7. Add FAQPage Schema to Key Pages**
Perplexity and Gemini explicitly parse FAQPage schema. Add structured FAQ markup
 to your most important pages.

**8. Create an agents.json File**
The agents.json spec declares AI agent-compatible information about your brand.
 Include: brand name, description, contact, supported actions, API endpoints.

**9. Earn Media Coverage in Authoritative Outlets**
News coverage from tier-1 publications (TechCrunch, Forbes, The Verge) is the
 highest-authority citation source. Focus on story-driven coverage.

**10. Optimize First 150 Words of Homepage**
LLMs weight page openings heavily for entity disambiguation. The first 150 words
 should clearly state: what the brand does, who it's for, what makes it different.
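
A quick way to audit a page opening is to extract its first 150 words and check that the phrases you care about appear there. The sketch below assumes the page text has already been extracted from HTML; the required phrases are examples you would choose yourself.

```python
def opening_words(text: str, limit: int = 150) -> str:
    """Return the first `limit` whitespace-separated words of the text."""
    return " ".join(text.split()[:limit])


def missing_terms(text: str, required_terms: list[str], limit: int = 150) -> list[str]:
    """List the required phrases that do not appear in the opening words."""
    opening = opening_words(text, limit).lower()
    return [term for term in required_terms if term.lower() not in opening]


# Toy page text: a clear opening sentence, padding, and a phrase buried too late.
page_text = (
    "Acme Robotics builds warehouse automation software for logistics teams. "
    + "filler " * 200
    + "pricing"
)
print(missing_terms(page_text, ["Acme Robotics", "logistics teams", "pricing"]))
# ['pricing']
```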

### Tier 3: Advanced (AIS 60–100)

**11. Build a Citation Asset Library**
Systematically create content assets designed for LLM citation: calculators,
 comparison frameworks, benchmark reports, industry glossaries, decision trees.

**12. Implement Comprehensive Schema.org Markup**
Organization, Product, FAQPage, HowTo, Review, and SpeakableSpecification schemas
 across your site. Use JSON-LD format. Validate with Google's Rich Results Test.

**13. Develop Topic Cluster Architecture**
Create pillar pages with 10+ supporting articles on interconnected topics. This
 builds topical authority.

**14. Run Continuous Citation Monitoring**
Track your AIS Index weekly. Set alerts for score drops >5 points. Respond to
 hallucinations by creating accurate authoritative content.

---

## Proprietary AI Citation Patterns

### Platform-Observed Patterns (AISearchStackHub, N=500+ brands)

**Pattern 1: The Citation Threshold Effect**
Brands need a minimum number of distinct authoritative sources before consistent
LLM citation begins. Observed threshold: 12–15 distinct authoritative mentions.
Below this: sporadic, unreliable citation. Above this: consistent citation for
brand and category queries.

**Pattern 2: The Primary Source Premium**
LLMs, especially Claude, cite primary sources over secondary coverage. A brand
with one original research report is more likely to be cited than a brand with
50 second-hand mentions.

**Pattern 3: Perplexity First, Others Follow**
New content appears in Perplexity citations 2–3 weeks before ChatGPT and 3–4 weeks
before Claude. Use Perplexity as the leading indicator for AEO effectiveness.

**Pattern 4: The Directory Citation Chain**
Directories (G2, Crunchbase, Capterra, Trustpilot) are cited by all four engines
at high rates because they are perceived as authoritative and neutral. A brand
listed in 10+ directories has a "citation floor."

**Pattern 5: The Negative Sentiment Amplifier**
Brands with negative sentiment in training data are cited less frequently and
with more caveats. This is self-reinforcing: fewer accurate citations → more
hallucinated negative references → further erosion of citation quality.

**Pattern 6: Schema Markup Accuracy Effect**
Sites with accurate, comprehensive JSON-LD schema markup score +18 AIS points
higher in Perplexity and Gemini compared to sites with no or inaccurate schema.

**Pattern 7: llms.txt Compound Effect**
The presence of an llms.txt file correlates with +22 AIS points in Claude specifically.
For brands with llms.txt files, improvement in overall AIS Index after 90 days of
active AEO is +12 points vs. +6 points for brands without one.

---

## LLM Visibility Benchmarks by Vertical

### B2B SaaS (Median AIS Index: 38)

Top-cited content types: comparison frameworks, ROI calculators, integration
compatibility lists, case studies with specific metrics.
Citation leaders: brands with benchmark reports, open-source tooling presence,
and an active blog publishing schedule.
Key engines: ChatGPT and Perplexity (B2B buyer research intent).

### Ecommerce (Median AIS Index: 27)

Top-cited content types: product comparisons, "best X for Y" articles, review
summaries, use-case guides.
Citation leaders: brands with comprehensive product documentation and review
program partnerships.
Key engines: Gemini (Google ecosystem) and ChatGPT (shopping research).

### Fintech (Median AIS Index: 29)

Top-cited content types: security documentation, compliance explainers, fee
comparison tools, case studies with financial impact data.
Citation leaders: brands with regulatory compliance coverage and industry
research publications.
Key engines: Perplexity (research intent) and Claude (enterprise audience).

### Healthcare (Median AIS Index: 24)

Top-cited content types: clinical evidence summaries, provider credentials,
treatment comparison guides, cost calculators.
Citation leaders: brands with published clinical data and medical advisory boards.
Key engines: Claude (healthcare professionals) and Gemini (consumer health queries).

### Agencies (Median AIS Index: 41)

Top-cited content types: methodology frameworks, case studies, benchmark reports,
tool comparisons, thought leadership articles.
Citation leaders: agencies with published original research and clear positioning.
Key engines: ChatGPT and Perplexity (buyers evaluating agencies).

---

## Structured Data Reference for AEO

### Recommended Schema Types

**Organization Schema** (all pages):
- name, url, logo, description, sameAs (social profiles), contactPoint
- Place on homepage in JSON-LD format
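
A minimal sketch of generating the Organization markup with the properties listed above. All values are placeholders, and the `contactType` value is an assumption.

```python
import json


def organization_jsonld(
    name: str,
    url: str,
    logo: str,
    description: str,
    same_as: list[str],
    contact_email: str,
) -> str:
    """Return a JSON-LD <script> tag describing an Organization."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "description": description,
        "sameAs": same_as,
        "contactPoint": {
            "@type": "ContactPoint",
            "contactType": "customer support",  # assumed value for illustration
            "email": contact_email,
        },
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


tag = organization_jsonld(
    name="Example Co",
    url="https://yourdomain.com",
    logo="https://yourdomain.com/logo.png",
    description="Scheduling software for dental clinics.",
    same_as=["https://www.linkedin.com/company/example-co"],
    contact_email="support@yourdomain.com",
)
print(tag)
```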

**FAQPage Schema** (key content pages):
- mainEntity: array of Question/Answer pairs
- Parseable by Perplexity and Gemini; directly influences cited Q&A content
- Max benefit: homepage, category pages, service pages, blog posts
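
The `mainEntity` array of Question/Answer pairs can be built from plain question and answer strings, as in this sketch. The sample question is illustrative only.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Return FAQPage JSON-LD for a list of (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


faq = faq_jsonld([
    ("What does Example Co do?", "Example Co makes scheduling software for dental clinics."),
])
print(faq)
```

Embed the output in a `<script type="application/ld+json">` tag, and keep the answers identical to the visible FAQ text on the page.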

**HowTo Schema** (tutorial and process content):
- step array with name and text for each step

**Product Schema** (ecommerce and SaaS product pages):
- name, description, image, brand, offers (price, availability, sku)
- Required for Gemini product citations

**SpeakableSpecification** (AI-friendly content sections):
- cssSelector pointing to key content sections that should be cited by AI engines
- Place on homepage and key content pages
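
In schema.org, a SpeakableSpecification is attached through the `speakable` property of a WebPage or Article. The sketch below builds that structure; the page name, URL, and CSS selectors are hypothetical and should match your own markup.

```python
import json


def speakable_jsonld(page_name: str, page_url: str, css_selectors: list[str]) -> str:
    """Return WebPage JSON-LD whose `speakable` property points at key sections."""
    data = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": page_name,
        "url": page_url,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": css_selectors,
        },
    }
    return json.dumps(data, indent=2)


speakable = speakable_jsonld(
    page_name="Example Co: Scheduling Software for Dental Clinics",
    page_url="https://yourdomain.com/",
    css_selectors=["#summary", ".key-facts"],  # hypothetical selectors
)
print(speakable)
```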

**Article + BreadcrumbList** (blog posts and research):
- headline, datePublished, dateModified, author, publisher
- BreadcrumbList for topical context

### Schema Implementation Checklist

1. Homepage: Organization + SpeakableSpecification
2. Product/Service pages: Product/Service + Offer + FAQPage
3. Blog posts: Article + BreadcrumbList + FAQPage (if applicable)
4. Comparison pages: FAQPage + Product (for each compared item)
5. About/team pages: Organization + Person + BreadcrumbList
6. Contact page: Organization + contactPoint schema

---

## Content Strategy for LLM Citation

### The Citeable Content Framework

Content that gets cited by LLMs shares these characteristics:

**1. Factual depth over marketing prose**
LLMs cite content that provides specific, verifiable information: prices, metrics,
 comparisons, definitions. "Our tool is fast" is unciteable. "Our tool processes
 requests in 47ms (p99) vs. industry average of 120ms" is highly citeable.

**2. Disagreement-free factual statements**
LLMs avoid citing content with contested claims. "Product X is the best" is unciteable.
"Product X has a 4.7/5 rating on G2 from 1,240 verified reviews" is highly citeable.

**3. Structural clarity**
Clear headings, bullet points, tables, and numbered lists are parsed better by
LLMs than prose paragraphs. Every section should be structured for scannability.

**4. Query-match clarity**
The content's primary query should be answerable from the first 100 words. LLMs
cite based on initial content assessment.

**5. Author attribution and expertise signals**
Named authors with credentials, LinkedIn profiles, and demonstrated domain
expertise increase citation rates.

### Content Types by Citation Rate

1. **Original research reports**: 89% citation rate
2. **Calculators and interactive tools**: 76% citation rate
3. **Comparison frameworks with specific data**: 71% citation rate
4. **Case studies with quantified outcomes**: 67% citation rate
5. **FAQ content with schema markup**: 64% citation rate
6. **Glossaries and definition lists**: 58% citation rate
7. **How-to guides with step-by-step instructions**: 54% citation rate
8. **Industry thought leadership**: 48% citation rate
9. **Product announcements and press releases**: 12% citation rate
10. **Company blog posts without original data**: 8% citation rate

---

## Measurement Methodology

### Tracking AEO Progress

**Weekly AIS Index Scan**: Run weekly scans for all tracked brands. Plot the AIS
Index over time. A sustained upward trend indicates effective AEO execution.
Flag any score drop >5 points as a potential hallucination event.
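
The drop rule above can be sketched as a week-over-week comparison. The sample history is invented for illustration, and "drop" is assumed to mean the difference between consecutive weekly scans.

```python
def score_drops(history: list[float], threshold: float = 5.0) -> list[tuple[int, float]]:
    """Return (week_index, drop) for each week-over-week drop greater than threshold."""
    drops = []
    for week in range(1, len(history)):
        drop = history[week - 1] - history[week]
        if drop > threshold:
            drops.append((week, drop))
    return drops


# Toy weekly AIS Index history, oldest scan first.
weekly_scores = [31, 33, 36, 29, 30, 24.5]
print(score_drops(weekly_scores))  # [(3, 7), (5, 5.5)]
```

A drop of exactly 5 points is not flagged, which matches the ">5 points" wording.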

**Engine-Specific Tracking**: Track each engine's score separately. A brand with
a strong ChatGPT score but weak Perplexity score needs Perplexity-specific tactics.

**Citation Source Audit**: Monthly, manually verify the accuracy of LLM-cited
information about your brand. Use 5 standard prompts across 4 engines.

**Competitive Citation Analysis**: Monthly, run competitor brand scans to track
their citation patterns. Note which content types they're getting cited for.

### AEO ROI Calculation

Revenue impact of AEO improvements can be estimated:

**Upper-funnel attribution**: Track referral traffic from AI engines (using UTM
parameters on the links that LLMs cite).
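
A small sketch of tagging a link with UTM parameters using Python's standard library. The `utm_medium` and `utm_campaign` defaults are naming assumptions. Pick whatever convention your analytics setup already uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(
    url: str,
    source: str,
    medium: str = "ai-referral",
    campaign: str = "aeo",
) -> str:
    """Return the URL with UTM parameters added, keeping existing query parameters."""
    parts = urlsplit(url)
    # dict() keeps only the last value of any repeated key, which is fine for
    # simple links but would drop duplicates in more unusual query strings.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update({"utm_source": source, "utm_medium": medium, "utm_campaign": campaign})
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_utm("https://yourdomain.com/pricing?plan=growth", "perplexity"))
# https://yourdomain.com/pricing?plan=growth&utm_source=perplexity&utm_medium=ai-referral&utm_campaign=aeo
```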

**Citation halo**: Survey new customers on how they discovered your brand. If
"found via ChatGPT/Claude/Perplexity" increases over time, AEO efforts are working.

**Competitive displacement**: Track rankings for category queries in LLM responses
over time. Moving from "not cited" to "cited for X query" in ChatGPT for a key
category drives measurable traffic.

---

## Quick-Start AEO Checklist

Week 1:
- Create llms.txt file at domain root
- Add Organization schema to homepage in JSON-LD
- Claim/optimize Google Business Profile
- Submit site to Google via GSC

Week 2:
- List brand in G2, Crunchbase, and Capterra
- Create Wikipedia article (if notable) or Wikidata entry
- Add FAQPage schema to top 5 pages by traffic

Week 3:
- Publish one comparison article with specific, verifiable data
- Add SpeakableSpecification to homepage
- Create agents.json file

Week 4:
- Run first AIS Index scan
- Review citation gaps
- Publish one piece of original research or benchmark data

Ongoing:
- Run weekly AIS Index scans
- Monitor hallucination alerts
- Publish monthly original research
- Build directory and media presence
- Track competitive citation patterns

---

## Appendix: LLM Behavior Summary by Engine

### ChatGPT
- Best for: Broad B2B and consumer reach
- Speed: 3–14 days for new content citation (browsing); training-data dependent otherwise
- Key tactic: Authoritative source breadth (Wikipedia, directories, media coverage)
- Failure mode: Brand names that are common English words

### Claude
- Best for: Enterprise and professional audiences
- Speed: 2–4 weeks for new content citation
- Key tactic: llms.txt + primary source quality + Wikipedia/Wikidata
- Failure mode: Thin content without expert backing; conflicting information across sources

### Perplexity
- Best for: Research-intent buyers, tech-savvy audiences
- Speed: 1–3 days for new content citation
- Key tactic: Content freshness + schema markup + domain authority
- Failure mode: Thin content; JavaScript-heavy pages; content behind paywalls or login walls

### Gemini
- Best for: Google ecosystem users, local/service businesses
- Speed: 1–2 weeks for new content citation (via Google index)
- Key tactic: Knowledge Graph optimization + Google Business Profile + featured snippets
- Failure mode: Incomplete entity data; inaccurate structured data; NAP inconsistencies

================================================================================
End of AISearchStackHub AEO Content Corpus v2026.05.19
License: CC BY 4.0 — https://creativecommons.org/licenses/by/4.0/
Source: https://aisearchstackhub.ai
================================================================================