What we test, how we score, what we publish, and what we keep internal.
The Shortlist Index measures how often B2B SaaS companies are cited by name when AI tools answer software recommendation questions.
We test each company against a standardized set of buyer-intent questions across five AI engines: ChatGPT (OpenAI gpt-4o), Perplexity (sonar), Gemini (gemini-3-pro), Claude (claude-sonnet-4-6), and Google AI Overviews. Questions span category-level searches ("best CRM for a small team"), problem-led searches ("how do I reduce churn?"), and direct comparisons ("HubSpot vs Pipedrive").
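For illustration, here's what one pass of that test cycle might look like. The `query_engine` helper is a hypothetical stand-in for the per-engine API clients, and the three questions are just the examples above, not our actual (internal) question set:

```python
# Illustrative sketch of one test cycle. query_engine() is a hypothetical
# wrapper around each vendor's API; the questions are the three example
# prompt types above, not our production suite.
ENGINES = ["chatgpt", "perplexity", "gemini", "claude", "google_ai_overviews"]

QUESTIONS = [
    "best CRM for a small team",   # category-level search
    "how do I reduce churn?",      # problem-led search
    "HubSpot vs Pipedrive",        # direct comparison
]

def run_suite(query_engine) -> dict[tuple[str, str], str]:
    """Run every question against every engine; return raw responses."""
    responses = {}
    for engine in ENGINES:
        for question in QUESTIONS:
            responses[(engine, question)] = query_engine(engine, question)
    return responses
```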
Each company receives a Shortlist Score from 0 to 100. The score reflects four factors: how often the company is mentioned, how high it ranks in AI responses, how many of the five engines mention it, and how well its website content is structured for AI extraction.
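To make the shape of that calculation concrete, here's a minimal sketch. The equal weights are placeholders only; our production weights are internal (see below):

```python
# Illustrative scoring sketch. The real factor weights are internal;
# equal weights are used here purely to show the shape of the calculation.
def shortlist_score(
    mention_rate: float,       # share of test questions that cite the company, 0-1
    avg_rank_position: float,  # mean position within responses, normalized 0-1 (1 = first)
    engine_coverage: float,    # engines that mention it out of 5, as 0-1
    extractability: float,     # site-structure audit result, 0-1
) -> float:
    """Combine the four factors into a 0-100 Shortlist Score."""
    weights = (0.25, 0.25, 0.25, 0.25)  # placeholder, not our production weights
    factors = (mention_rate, avg_rank_position, engine_coverage, extractability)
    return round(100 * sum(w * f for w, f in zip(weights, factors)), 1)
```

With these placeholder weights, a company cited on 60% of questions, ranking high, covered by 4 of 5 engines, with decent site structure would score `shortlist_score(0.6, 0.8, 0.8, 0.7)` = 72.5.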
We update every category weekly. Scores are based on the prior 7 days of testing. We do not accept payments to improve a ranking, do not adjust rankings based on advertising relationships, and do not allow companies to submit corrections to their scores. Scores are determined entirely by what the AI engines say — we are the measurement layer, not the influence layer.
We keep the exact question set and scoring weights internal so the Index can't be gamed by optimizing specifically for our test set. Companies that want to improve their score must improve their general AI visibility, which is the correct outcome.
For each category, we source candidate companies from G2 top 50 + Capterra top 50, deduplicated, then filtered to a $500K–$500M ARR band via Crunchbase and LinkedIn headcount cross-references. We add up to 5 manual entries per category for emerging tools with active community presence (Hacker News mentions, Reddit threads, significant Twitter/X following).
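A simplified sketch of that sourcing pipeline. The `name` and `estimated_arr` fields are hypothetical; in practice the ARR estimate comes from the Crunchbase and LinkedIn cross-references above:

```python
# Simplified candidate-sourcing sketch. Field names are assumptions made
# for illustration; estimated_arr stands in for the Crunchbase/LinkedIn
# cross-referenced ARR estimate.
def source_candidates(g2_top50: list[dict], capterra_top50: list[dict],
                      manual_entries: list[dict]) -> list[dict]:
    # Merge the two directory lists and deduplicate on a normalized name.
    seen, merged = set(), []
    for company in g2_top50 + capterra_top50:
        key = company["name"].strip().lower()
        if key not in seen:
            seen.add(key)
            merged.append(company)
    # Keep only companies whose estimated ARR falls in the $500K-$500M band.
    in_band = [c for c in merged if 500_000 <= c["estimated_arr"] <= 500_000_000]
    # Add at most 5 manual entries for emerging tools with community traction.
    return in_band + manual_entries[:5]
```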
Target count per category: 20–30 companies. Below 20 isn't credible as a leaderboard. Above 35 isn't scannable.
Every Sunday at 02:00 UTC, our cron job runs the prompt suite across all 5 engines for every company in every active category. Monday morning, scores update. Anomaly checks (any 20+ point WoW change, any engine returning empty for 5+ companies) hold the publish until manually reviewed. We never partially publish — either the full category updates or it doesn't.
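Both anomaly checks are simple to state precisely. A minimal sketch of the publish gate, with hypothetical input shapes:

```python
# Sketch of the publish gate. A category publishes only if no score moved
# 20+ points week-over-week and no engine came back empty for 5+ companies.
def safe_to_publish(prev_scores: dict[str, float],
                    new_scores: dict[str, float],
                    empty_counts: dict[str, int]) -> bool:
    """Return True only when the full category passes both anomaly checks."""
    for company, score in new_scores.items():
        if company in prev_scores and abs(score - prev_scores[company]) >= 20:
            return False  # 20+ point WoW swing: hold for manual review
    if any(count >= 5 for count in empty_counts.values()):
        return False      # an engine returned empty for 5+ companies
    return True
```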
A citation = the brand is mentioned by name in the AI engine's response, with our parsing logic confirming the mention refers to the brand and not an unrelated entity. Soft mentions (e.g. "tools like X, Y, and Z") count. Indirect mentions ("the leading provider in the space") don't: a citation requires the name.
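The minimal version of that rule is a whole-word match on the brand name; production parsing adds the entity disambiguation described above. A sketch:

```python
# Simplified citation check. Production parsing also disambiguates brands
# from unrelated entities with the same name; this word-boundary match is
# the minimal version of "named mentions count, indirect mentions don't".
import re

def is_cited(brand: str, response: str) -> bool:
    """True if the brand name appears as a whole word in the response."""
    pattern = r"\b" + re.escape(brand) + r"\b"
    return re.search(pattern, response, flags=re.IGNORECASE) is not None

assert is_cited("Pipedrive", "tools like HubSpot, Pipedrive, and Close")  # soft mention counts
assert not is_cited("Pipedrive", "the leading provider in the space")     # indirect: no name
```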
The methodology version is currently v1.0. We'll publish a changelog when we revise scoring weights, add engines, or change the company sourcing process. Existing weekly snapshots are preserved at the version they were generated under.
Run your free Shortlist Score for the same audit applied to your specific domain and category. That gives you the breakdown that the public Index leaderboard doesn't show.