# How to Build an AI Visibility Dashboard for a GEO Audit

**Author:** John Morabito (Founder, /winston)
**Published:** June 10, 2026
**Reading time:** 16 minutes
**Canonical:** https://www.winstondigitalmarketing.com/playbooks/how-to-build-an-ai-visibility-dashboard/

A spreadsheet of citation data does not sell a GEO engagement. A dashboard does. This is the full architecture behind the AI visibility dashboards we ship: six phases from prompt research to a single self-contained HTML file a client can open, click through, and immediately understand where they are invisible.

## Why a dashboard, not a deck

The data behind a GEO audit is complex: thousands of AI-engine answers, brand mention counts across five engines, Reddit and YouTube citation maps, share-of-voice splits by category. In a slide deck it dies on slide four. In a spreadsheet the client never opens it.

A dashboard is interactive (the client explores their own categories), visual (the invisible-category problem hits them in the gut), and because we ship it as a single self-contained HTML file there is no login, no SaaS seat, no expired link. They double-click it and it works forever.

The deliverable: one HTML file, self-contained, all data inlined, no external dependencies. Tabs for overview, share of voice, engine breakdown, Reddit, YouTube, and prioritized recommendations. Dark theme so it looks like a product, not a report.

## Phase 1: Prompt research

Build 300 to 500 prompts, each tagged by category, sub-theme, and intent. This is its own discipline; the full method is in the GEO prompt research playbook. The output is a CSV that feeds phase 2.
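
As a rough illustration of what that CSV could look like when phase 2 loads it, here is a minimal sketch. The column names (`prompt`, `category`, `sub_theme`, `intent`) and the two sample rows are assumptions made up for this example, not the layout from the prompt research playbook.

```python
import csv
import io

# Assumed column names; adjust to whatever your prompt research actually exports.
REQUIRED_COLUMNS = ("prompt", "category", "sub_theme", "intent")

def load_prompts(csv_text):
    """Parse the prompt CSV and reject any row with a missing tag."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row_no, row in enumerate(reader, start=1):
        empty = [c for c in REQUIRED_COLUMNS if not (row[c] or "").strip()]
        if empty:
            raise ValueError(f"data row {row_no}: empty {empty}")
        rows.append({c: row[c].strip() for c in REQUIRED_COLUMNS})
    return rows

# Made-up sample rows, for illustration only.
sample = """prompt,category,sub_theme,intent
best protein powder for runners,protein,endurance,commercial
is whey protein safe daily,protein,safety,informational
"""
prompts = load_prompts(sample)
```

Failing fast on an untagged row is cheaper here than discovering it after the scrape, when the tag is needed for aggregation.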

## Phase 2: Multi-engine scrape

Run every prompt through ChatGPT, Gemini, Perplexity, Google AI Overview, and Google AI Mode. For 400 prompts that is 2,000 calls, so automate it. Engineering notes that save a day of pain:

- **Shard it** across parallel workers. A serial run can take all day; a sharded run takes about an hour.
- **Make it resumable.** Write each result to its own file. When the run dies at 1,400 of 2,000 (it will), resume instead of restarting.
- **Capture the full answer and the sources.** You need both: answer text for brand-mention detection, sources for the citation gap.
- **Carry the tags through** so you can aggregate without a join.
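
The four notes above can be sketched roughly as follows. This is a minimal outline, not our production scraper: `fetch_answer` is a hypothetical stand-in for whatever client you use per engine, and the engine identifiers, file naming, and worker count are assumptions.

```python
import hashlib
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumed identifiers for the five engines named above.
ENGINES = ["chatgpt", "gemini", "perplexity", "google_ai_overview", "google_ai_mode"]

def fetch_answer(engine, prompt):
    """Hypothetical stand-in: replace with your own client for each engine."""
    return {"answer": f"[{engine}] stub answer for: {prompt}", "sources": []}

def result_path(out_dir, engine, prompt):
    # One file per (engine, prompt) pair, named by a stable hash.
    key = hashlib.sha1(f"{engine}\n{prompt}".encode("utf-8")).hexdigest()[:16]
    return out_dir / f"{engine}_{key}.json"

def run_job(out_dir, engine, row):
    path = result_path(out_dir, engine, row["prompt"])
    if path.exists():
        return False  # resumable: finished jobs are skipped on the next run
    result = fetch_answer(engine, row["prompt"])
    # Tags travel with the answer and sources, so no join is needed later.
    record = {**row, "engine": engine, **result}
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(record), encoding="utf-8")
    tmp.replace(path)  # rename into place so a crash leaves no partial .json
    return True

def run_all(out_dir, rows, workers=8):
    """Run every prompt against every engine; return how many jobs were executed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    jobs = [(engine, row) for row in rows for engine in ENGINES]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        done = list(pool.map(lambda job: run_job(out_dir, *job), jobs))
    return sum(done)

# Made-up single-row input; a real run would pass the phase 1 rows.
rows = [{"prompt": "best protein powder", "category": "protein",
         "sub_theme": "general", "intent": "commercial"}]
out = Path(tempfile.mkdtemp())
first_run = run_all(out, rows)
second_run = run_all(out, rows)
```

Because each result lives in its own file, the second call finds all five files already present and does no work, which is the behavior you want after a crash partway through a real run.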

## Phase 3: Reddit citation analysis

AI engines cite Reddit constantly. Pull every cited Reddit URL from phase 2, scrape those threads plus the relevant subreddits (title, body, comments, upvotes, sub, URL), compute Reddit share of voice, find the citation gap (threads the engines cite where the client is absent), and extract sentiment themes with quotes.
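
A simplified sketch of the two numbers in that paragraph follows. It assumes share of voice is each brand's share of thread-level mentions, and that each scraped thread is a dict with `url`, `title`, `body`, and `comments` keys; both are assumptions for illustration, and the brand names and threads are made up.

```python
import re
from collections import Counter

def mentions(text, brand):
    """Case-insensitive match for a brand name, not embedded in a longer word."""
    pattern = rf"(?<!\w){re.escape(brand)}(?!\w)"
    return re.search(pattern, text, re.IGNORECASE) is not None

def thread_text(thread):
    return " ".join([thread["title"], thread["body"], *thread["comments"]])

def reddit_share_of_voice(threads, brands):
    """Share of thread-level mentions per brand (one assumed definition)."""
    counts = Counter()
    for thread in threads:
        text = thread_text(thread)
        for brand in brands:
            if mentions(text, brand):
                counts[brand] += 1
    total = sum(counts.values())
    return {b: (counts[b] / total if total else 0.0) for b in brands}

def citation_gap(threads, cited_urls, focal_brand):
    """Threads the engines cite in which the focal brand never appears."""
    return [t["url"] for t in threads
            if t["url"] in cited_urls and not mentions(thread_text(t), focal_brand)]

# Made-up threads and brands, for illustration only.
threads = [
    {"url": "https://www.reddit.com/r/example/comments/a1/", "title": "Acme vs Bolt",
     "body": "", "comments": ["Bolt is better"]},
    {"url": "https://www.reddit.com/r/example/comments/a2/", "title": "Best powder?",
     "body": "I use Bolt", "comments": []},
    {"url": "https://www.reddit.com/r/example/comments/a3/", "title": "Acme review",
     "body": "", "comments": []},
]
cited = {threads[0]["url"], threads[1]["url"]}
sov = reddit_share_of_voice(threads, ["Acme", "Bolt"])
gap = citation_gap(threads, cited, "Acme")
```

Plain string matching will miss misspellings and nicknames, so treat the gap list as a starting point for manual review.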

**Hard-won schema lesson:** the dashboard data layer is strict. Reddit threads need a consistent shape: one mismatched key throws a JS error mid-render, and every section after it silently fails to render. Validate the JSON against a known-good example before building, and syntax-check the final inlined script.

Reddit scraper items often have a null post id, so join comments to posts via the `/comments/<id>` segment of the URL, not the id field.
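
The URL-based join might look like the sketch below. The item shape (a `type` field of `"post"` or `"comment"`, plus `url` and `id`) is an assumption about scraper output, so adapt the field names to your own data.

```python
import re
from collections import defaultdict

# Reddit permalinks for posts and comments both contain /comments/<post id>.
POST_ID = re.compile(r"/comments/([a-z0-9]+)", re.IGNORECASE)

def post_id_from_url(url):
    """Return the post id embedded in a Reddit URL, or None if there is none."""
    match = POST_ID.search(url or "")
    return match.group(1).lower() if match else None

def attach_comments(items):
    """Group comments under their post using the URL, ignoring the id field."""
    posts = {}
    comments = defaultdict(list)
    for item in items:
        pid = post_id_from_url(item.get("url"))
        if pid is None:
            continue  # not a thread URL; nothing to join on
        if item.get("type") == "post":  # "type" is an assumed field name
            posts[pid] = {**item, "comments": []}
        else:
            comments[pid].append(item)
    for pid, post in posts.items():
        post["comments"] = comments.get(pid, [])
    return posts

# Made-up scraper items; note the null id on the post.
items = [
    {"type": "post", "id": None, "title": "Best powder?",
     "url": "https://www.reddit.com/r/example/comments/1abc9z/best_powder/"},
    {"type": "comment", "id": "k1", "body": "I use Bolt",
     "url": "https://www.reddit.com/r/example/comments/1abc9z/best_powder/k1/"},
    {"type": "comment", "id": "k2", "body": "orphan",
     "url": "https://www.reddit.com/r/example/comments/zzz000/other/k2/"},
]
joined = attach_comments(items)
```

Comments whose post was never scraped are dropped rather than attached to the wrong thread.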

## Phase 4: YouTube ownership analysis

Identify the YouTube videos the engines cite, then classify each: brand owns the channel, creator/partner, or brand absent. The absent-but-cited videos are a content gap and often a creator-partnership opportunity.
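
A first-pass classifier could be as simple as the sketch below. The heuristic is an assumption for illustration: a video on a brand-owned channel is `owned`, a video elsewhere that names the brand is treated as `partner`, and everything else is `absent`. The channel ids, field names, and sample videos are made up, and the results still need a manual pass.

```python
def classify_video(video, owned_channel_ids, brand):
    """Label a cited video as 'owned', 'partner', or 'absent' (rough heuristic)."""
    if video["channel_id"] in owned_channel_ids:
        return "owned"
    text = f'{video["title"]} {video.get("description", "")}'.lower()
    # Assumption: a brand mention on someone else's channel signals a creator/partner video.
    return "partner" if brand.lower() in text else "absent"

# Made-up cited videos, for illustration only.
videos = [
    {"channel_id": "UC_owned", "title": "Acme launch film"},
    {"channel_id": "UC_creator", "title": "I tried Acme for 30 days"},
    {"channel_id": "UC_other", "title": "Top 5 protein powders", "description": "My picks"},
]
labels = [classify_video(v, {"UC_owned"}, "Acme") for v in videos]
absent_but_cited = [v["title"] for v, label in zip(videos, labels) if label == "absent"]
```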

## Phase 5: Assemble the data layer

Normalize everything into a small set of JSON files, one per analysis, each matching the dashboard's expected schema exactly. Validate against a known-good reference before injecting. Escape your strings (brand names and Reddit quotes contain apostrophes that break JS-context injection). Keep the focal-brand key stable so the same template works for every engagement.
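
Two of those steps, the shape check and the escaping, can be sketched in a few lines. This is a minimal version under assumptions: `shape_errors` only compares keys and container types against a reference object, and `to_js_literal` relies on JSON being valid inside a script, with `<` escaped so a quote containing `</script>` cannot close the tag early. The reference schema and sample values are made up.

```python
import json

def shape_errors(reference, candidate, path="$"):
    """List key and container mismatches between candidate and a known-good reference."""
    errors = []
    if isinstance(reference, dict):
        if not isinstance(candidate, dict):
            return [f"{path}: expected object"]
        for key in sorted(reference.keys() - candidate.keys()):
            errors.append(f"{path}: missing key {key!r}")
        for key in sorted(candidate.keys() - reference.keys()):
            errors.append(f"{path}: unexpected key {key!r}")
        for key in sorted(reference.keys() & candidate.keys()):
            errors.extend(shape_errors(reference[key], candidate[key], f"{path}.{key}"))
    elif isinstance(reference, list):
        if not isinstance(candidate, list):
            return [f"{path}: expected array"]
        if reference:  # compare every item against the first reference item
            for i, item in enumerate(candidate):
                errors.extend(shape_errors(reference[0], item, f"{path}[{i}]"))
    return errors

def to_js_literal(data):
    """Serialize data so it is safe to paste into an inline script."""
    text = json.dumps(data, ensure_ascii=False)  # apostrophes are safe inside double quotes
    return (text.replace("<", "\\u003c")
                .replace("\u2028", "\\u2028")
                .replace("\u2029", "\\u2029"))

# Made-up reference and candidate; note the misspelled "titel" key.
reference = {"brand": "", "threads": [{"url": "", "title": ""}]}
candidate = {"brand": "Joe's", "threads": [{"url": "u", "titel": "t"}]}
errors = shape_errors(reference, candidate)
literal = to_js_literal({"quote": "it's </script> great"})
```

Serializing through `json.dumps` rather than hand-building strings is what removes the apostrophe problem: the quote characters are chosen and escaped for you.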

## Phase 6: Build the self-contained HTML dashboard

Start from a dashboard template (tab structure, chart rendering, dark theme already built) and transform per client: inject the JSON, swap brand name and accent color, write the per-client narrative into the recommendations tab.

- **Inline everything** so the file works with no network.
- **Syntax-check the final script** after every transform.
- **Map tabs to the analyses:** overview, share of voice, engine breakdown, Reddit, YouTube, and prioritized recommendations.
- **Make recommendations specific:** not "improve your AI visibility" but "you are absent from 17 of the 20 cited Reddit threads in the protein category; here are the three to target first."
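
The transform and its checks can be sketched as below. This assumes a template that keeps its data in a `<script type="application/json">` block and uses `{{BRAND}}` and `{{ACCENT}}` placeholders; all three are hypothetical conventions for this example, not a description of our template. With that layout the inlined data can be re-parsed from Python after every transform.

```python
import html as html_lib
import json
import re

# Hypothetical template conventions: one JSON data block plus two placeholders.
DATA_BLOCK = re.compile(
    r'(<script id="dashboard-data" type="application/json">)(.*?)(</script>)', re.DOTALL)
EXTERNAL_ASSET = re.compile(
    r'<(?:script|link|img)\b[^>]*\b(?:src|href)\s*=\s*["\'](?:https?:)?//', re.IGNORECASE)

TEMPLATE = """<!doctype html>
<html><head><title>{{BRAND}} AI visibility</title>
<style>:root { --accent: {{ACCENT}}; }</style></head>
<body><script id="dashboard-data" type="application/json"></script>
<script>const DATA = JSON.parse(document.getElementById("dashboard-data").textContent);</script>
</body></html>"""

def build_dashboard(template, data, brand, accent):
    """Swap the brand and accent color, then inline the data."""
    page = template.replace("{{BRAND}}", html_lib.escape(brand)).replace("{{ACCENT}}", accent)
    payload = json.dumps(data, ensure_ascii=False).replace("<", "\\u003c")
    # A function replacement avoids backslash processing in the payload.
    page, count = DATA_BLOCK.subn(lambda m: m.group(1) + payload + m.group(3), page)
    if count != 1:
        raise ValueError("template must contain exactly one data block")
    return page

def verify(page, data):
    """Re-parse the inlined data and check that no asset loads over the network."""
    match = DATA_BLOCK.search(page)
    if match is None or json.loads(match.group(2)) != data:
        raise ValueError("inlined data does not round-trip")
    if EXTERNAL_ASSET.search(page):
        raise ValueError("page references an external script, stylesheet, or image")

# Made-up client values, for illustration only.
data = {"quote": "it's </script> great"}
page = build_dashboard(TEMPLATE, data, "Joe's & Co", "#00c2a8")
verify(page, data)
```

If the template assigns its data to a JavaScript variable instead, the equivalent check needs a JavaScript parser; running `node --check` on the extracted script is one option when Node.js is available.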

## The whole pipeline

1. Prompt research: 300-500 tagged prompts.
2. Multi-engine scrape: 5 engines, sharded, resumable, sources captured.
3. Reddit analysis: share of voice + citation gap + sentiment.
4. YouTube ownership: cited videos classified owned / partner / absent.
5. Data layer: validated JSON per analysis, schema-checked.
6. Dashboard: one self-contained HTML file, syntax-checked, client-branded.

## Why we publish this

The pipeline is a lot of disciplined engineering, and the value is in running it correctly and repeatedly, not in keeping the recipe secret.

Service: https://www.winstondigitalmarketing.com/services/generative-engine-optimization/
Audit: https://www.winstondigitalmarketing.com/contact/#audit
