Four engine tiers, three measurement axes, 92 queries, 5 shots each, audited under the Claros framework. Allbirds is mid-transformation: the $39M asset sale to American Exchange Group is pending a May 18 shareholder vote, and the company plans to rebrand as NewBird AI, a GPU-as-a-Service infrastructure provider. Yet 61% of engine responses still represent the company as a stable footwear brand. This audit maps the gap.
The brand is undergoing structural transformation; AI engines are not. Of 4,600 sampled responses across 12 engines, 61% represent Allbirds as a fully operational sustainable footwear brand, its pre-January 2026 identity. The $39M asset sale to American Exchange Group, the planned NewBird AI GPU infrastructure pivot, the closure of nearly all U.S. retail, and CEO Joe Vernachio's turnaround-then-dissolution trajectory are absent from responses on 9 of 12 engines when web search is disabled. The accuracy gap is structural, not editorial.
Per the Claros LLM Optimization Framework SOP, every engagement reports per engine tier and per measurement axis — never averaged into a single composite. Each cell is the brand's score (0–100) on a specific axis within a specific engine tier, color-coded by score band. Search-grounded engines pull the accuracy column toward truth; frontier conversational engines drag it back to the past. The split between the rows is the audit's structural finding.
| Engine Tier · Representative Engines | Visibility | Authority | Accuracy |
|---|---|---|---|
| Tier 01 · Frontier Conversational: ChatGPT · Claude · Gemini · Meta AI · DeepSeek | 91 (High Presence) | 42 (No Citations) | 28 (Stale) |
| Tier 02 · Search-Grounded: Perplexity · Google AI Overviews · Bing Copilot · Brave Leo | 88 (High) | 82 (Strong Citations) | 71 (Recent OK) |
| Tier 03 · Specialized Verticals: BloombergGPT · GitHub Copilot · Harvey (n/a) | 68 (Bloomberg+) | 64 (SEC-cited) | 62 (Filing-aware) |
| Tier 04 · Embedded Copilots: Microsoft Copilot · Google Workspace · Notion AI | 73 (Workflow OK) | 38 (Sparse Citations) | 44 (Inconsistent) |
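The per-tier, per-axis reporting rule can be illustrated with a small sketch that holds the scorecard as a nested mapping and compares cells directly instead of averaging them. The scores are transcribed from the table above; the shortened tier keys and the helper names `axis_gap` and `weakest_cell` are illustrative assumptions, not part of the Claros framework. Score-band thresholds are not stated in the audit, so none are modeled here.

```python
# Scores transcribed from the scorecard table. Tier keys are shortened
# labels chosen for this sketch (assumption), not official identifiers.
SCORECARD = {
    "tier_01_frontier_conversational": {"visibility": 91, "authority": 42, "accuracy": 28},
    "tier_02_search_grounded": {"visibility": 88, "authority": 82, "accuracy": 71},
    "tier_03_specialized_verticals": {"visibility": 68, "authority": 64, "accuracy": 62},
    "tier_04_embedded_copilots": {"visibility": 73, "authority": 38, "accuracy": 44},
}


def axis_gap(scorecard, axis, tier_a, tier_b):
    """Return the score difference on one axis between two tiers (tier_a minus tier_b)."""
    return scorecard[tier_a][axis] - scorecard[tier_b][axis]


def weakest_cell(scorecard):
    """Return (tier, axis, score) for the lowest-scoring cell, with no averaging."""
    cells = (
        (tier, axis, score)
        for tier, axes in scorecard.items()
        for axis, score in axes.items()
    )
    return min(cells, key=lambda cell: cell[2])


if __name__ == "__main__":
    gap = axis_gap(
        SCORECARD,
        "accuracy",
        "tier_02_search_grounded",
        "tier_01_frontier_conversational",
    )
    print("Accuracy gap, search-grounded vs frontier conversational:", gap)
    print("Weakest cell:", weakest_cell(SCORECARD))
```

Keeping each cell addressable by tier and axis preserves the split between rows that a single composite score would hide.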
Per the Claros reproducibility discipline, every claim in this audit traces to numbered response cards — engine identifier, model version, query identifier, sample number, retrieval mode, and timestamp. The cards below are representative samples from the 4,600-response corpus, paired across engines on the same query for direct comparison. Color codes: accurate, stale information, factual error, key fact missing.
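As a rough sketch of what one response card might look like as a record, the dataclass below carries the six traceability fields listed above plus one of the four coding labels. The field names, the spellings of the coding labels, and the `pair_by_query` helper are assumptions made for illustration; the audit does not publish its card schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime

# The four codes from the color legend; identifier spellings are assumed.
CODINGS = frozenset({"accurate", "stale", "factual_error", "key_fact_missing"})


@dataclass(frozen=True)
class ResponseCard:
    """One sampled engine response with the fields needed to trace a claim."""

    engine_id: str
    model_version: str
    query_id: str
    sample_number: int
    retrieval_mode: str  # e.g. web search enabled or disabled (labels assumed)
    timestamp: datetime
    coding: str

    def __post_init__(self):
        # Reject any label outside the four-code legend.
        if self.coding not in CODINGS:
            raise ValueError(f"unknown coding: {self.coding!r}")


def pair_by_query(cards):
    """Group cards as {query_id: {engine_id: [cards]}} for side-by-side comparison."""
    paired = defaultdict(dict)
    for card in cards:
        paired[card.query_id].setdefault(card.engine_id, []).append(card)
    return {query_id: dict(by_engine) for query_id, by_engine in paired.items()}
```

Freezing the record means a card cannot be edited after it is archived, which fits the reproducibility intent, although the actual storage format is not described in the audit.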
Per the Claros SOP, every gap surfaced in the audit maps to a specific remediation type — Strata guardrail, proof-pack, schema fix, or third-party citation pursuit. Each row below carries the gap (what the engines are getting wrong), the root cause (why), and the remediation (what gets published or fixed). Gaps are ranked by impact × effort.
Per the Claros SOP, every audit closes with a 30/60/90 roadmap and re-measurement gates. Day 30 = quick wins. Day 60 = structural fixes. Day 90 = full re-baseline against the locked methodology. Note: the May 18, 2026 shareholder vote falls within Horizon 30 (the first 30 days), so the roadmap accounts for both pre-vote and post-vote contingencies.
A Claros LLM Optimization Audit applies this framework to a specific brand — with the structured query battery designed for your engagement question, the response-card corpus archived for re-measurement, the Strata guardrails and proof-packs designed and staged for publication, and the 30/60/90 roadmap with re-measurement gates. Initial conversations are 60 minutes. The methodology lock document becomes your reproducibility ticket for every future cycle.
Per the Claros SOP Stage 03 reproducibility lock, the parameters below are fixed for the duration of this audit cycle and preserved across all future re-measurement cycles. Methodology drift between cycles (adding queries, changing the engine list, adjusting coding rules) destroys the comparability that makes change measurable. Where additions are needed (a new engine emerges, or a new query category becomes relevant), they are documented as extensions to the baseline rather than substitutions.
CLR-2026-01 · 92 queries across 5 categories: branded (18), category (28), comparative (22), adjacent (14), risk (10)
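One way to picture the lock is as a read-only baseline plus a separate list of extensions, sketched below. The category counts and the identifier `CLR-2026-01` are copied from the line above; treating that identifier as the query-battery ID, and the names `BASELINE_CATEGORIES`, `baseline_total`, and `with_extension`, are assumptions for illustration only.

```python
from types import MappingProxyType

BATTERY_ID = "CLR-2026-01"  # copied from the source; its exact role is assumed

# Read-only view of the locked query counts per category.
BASELINE_CATEGORIES = MappingProxyType({
    "branded": 18,
    "category": 28,
    "comparative": 22,
    "adjacent": 14,
    "risk": 10,
})


def baseline_total(categories=BASELINE_CATEGORIES):
    """Sum the locked per-category query counts."""
    return sum(categories.values())


def with_extension(extensions, name, queries, reason):
    """Return a new tuple of extensions; the baseline itself is never edited."""
    if name in BASELINE_CATEGORIES:
        raise ValueError("an extension must not substitute for a baseline category")
    return extensions + ({"category": name, "queries": queries, "reason": reason},)


if __name__ == "__main__":
    # The five category counts should add up to the stated 92 queries.
    print(BATTERY_ID, "baseline queries:", baseline_total())
```

Recording additions in a separate structure keeps every future cycle comparable to the original 92-query baseline, while still showing what was added and why.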