# Vikas Mishra — Full Content for AI Ingestion

> This file is a single-document compilation of the public content on vikasmishra.ai, intended for ingestion by AI search engines, agents, and language models. The canonical home is https://vikasmishra.ai. Each section below is dated and links back to its canonical URL.

> Author: Vikas Mishra. Role: Platform Architect — AI and Cloud at Google. Location: India. Contact: vikas@vikasmishra.ai. LinkedIn: https://www.linkedin.com/in/vikaskmishra/. GitHub: https://github.com/vmishra.

## About

Vikas Mishra is a Platform Architect — AI and Cloud at Google. He designs the cloud and AI platforms that run at production scale for India's largest digital-native companies, and writes about the engineering behind systems built to endure. He has held senior engineering roles at Google, Myntra (Flipkart Group), Razorpay, and IBM Software Labs.

**Current role:** Platform Architect — AI and Cloud, Google (2020–present). Architected one of the largest Kubernetes deployments in APAC, serving millions of concurrent users at 99.99% uptime. Leads AI infrastructure, model training and serving, and hardware-accelerator optimization for enterprise customers. 2x Google Cloud Club Award recipient (2023, 2026).

**Areas of expertise:** AI infrastructure architecture, model training and serving systems, hardware-accelerator optimization (TPUs and GPUs), Google Cloud Platform, Kubernetes at scale, distributed systems design, platform engineering, site reliability engineering, technical leadership.

**Certifications:** Google Cloud Certified Professional Cloud Architect, Professional Data Engineer, Professional Cloud Database Engineer (2026), Professional Cloud DevOps Engineer, Professional Machine Learning Engineer (2026).

**Recognition:** 2x Google Cloud Club Award recipient (2023, 2026); Google Milestone Award for a Defensive Publication (2024); keynote speaker at 30+ global technology conferences; mentored 60+ engineers in AI and cloud.
**Education:** B.Tech in Computer Science and Engineering — SASTRA University, Thanjavur (2007–2011).

---

## Blog Posts

### Google's ADK Is a Runtime, Not a Graph: Notes From Eleven Agents

- **Published:** 2026-04-25 (updated 2026-04-25)
- **URL:** https://vikasmishra.ai/blog/google-adk-runtime-not-graph/
- **Markdown source:** https://vikasmishra.ai/blog/google-adk-runtime-not-graph/index.md
- **Tags:** AI Agents, Google ADK, Gemini, MCP, Architecture, Platform Engineering

> Eleven agents in, the framework choice that mattered wasn't ergonomics or graph syntax. It was whether the runtime had opinions about events, state, and transport. ADK does. Here is what that buys you in production.

*Disclaimer: All opinions expressed in this post are my own and do not represent the views or positions of my employer. I work at Google; this is written from implementation, not advocacy. Where ADK is awkward I will say so.*

---

I built eleven agents on Google's Agent Development Kit over the last month — a hotel concierge, a deep-research travel planner, a voice support desk on Gemini Live, a computer-use food-delivery rep driving a real Chromium browser, a beauty advisor with persistent memory, a fintech HITL payout desk that suspends a turn for human approval, two federated agents talking to each other over HTTP, an eval harness, an MCP-only knowledge desk, and a live video card scanner. They share a single browser portal so I could watch them side by side. The whole bundle is open source.

What I expected to learn was how ADK compares as a *framework* — Python ergonomics, decorator quality, the usual. What I actually learned was that ADK is not really competing in the framework category. The right comparison for ADK is to a server runtime: it has opinions about events, state, and transport, and those opinions are what your code is shaped around.

LangChain gives you composable pieces. LangGraph gives you a state machine. CrewAI gives you a metaphor.
ADK gives you a *runtime contract* — and once you have built two or three agents that share that contract, the value is hard to give back.

This post is about that contract: what it looks like in practice, where it pays off, where it does not, and the heuristics I would use the next time I have to choose.

---

## The contract: events in, events out, state in the middle

The shape of every ADK agent is the same. You declare a tree — an `LlmAgent` with tools, or a `SequentialAgent` containing a `ParallelAgent`, or a `LoopAgent` wrapping a critic — and you hand it to a `Runner`. The runner exposes one primary surface, an async generator of events:

```python
async for event in runner.run_async(
    user_id=USER_ID, session_id=session.id, new_message=msg
):
    ...
```

For live, voice, and video, you swap `run_async` for `run_live` and feed a `LiveRequestQueue` instead of a `new_message`. The events keep coming.

Every event is a `google.adk.events.Event` carrying parts (text, function calls, function responses, inline audio, inline image), plus side-channel signals: `partial`, `turn_complete`, `interrupted`, `usage_metadata`, `actions`. The runner is what you write your server around.

Tools mutate `tool_context.state`; sub-agents write their results into it via `output_key`; sessions persist it; the `Runner` decides what to emit and when. That is the entire model. There is no graph DSL, no chain object, no "executor". Composition happens by *containing* one agent inside another:

```python
from google.adk.agents import ParallelAgent, SequentialAgent

# planner, composer, and the three researchers are LlmAgents defined elsewhere.
# The ParallelAgent has to exist before the SequentialAgent that contains it.
parallel_researchers = ParallelAgent(
    name="parallel_researchers",
    sub_agents=[flight_researcher, hotel_researcher, activity_researcher],
)

root_agent = SequentialAgent(
    name="travel_planner",
    sub_agents=[planner, parallel_researchers, composer],
)
```

Three sub-agents fan out concurrently, each writes its brief to state under its own `output_key`, the composer reads all three and emits an itinerary. No edges, no transitions, no conditional routers.
The contract is *"state is the message bus, and the parent agent decides who gets to write next."*

When you first see it, this looks anaemic compared to a LangGraph diagram. The longer I spent in it, the more I think the absence is the point.

## The reframe: state is the contract, not the call graph

Every agent framework I have used previously eventually forced me to write the same thing — a typed dictionary of partial results, smuggled between nodes via the framework's preferred argument-passing convention. LangChain made me build it manually. LangGraph turned it into a first-class state object but kept the graph as the orchestration unit. CrewAI hid it inside crew context.

ADK takes the opposite bet. It treats `tool_context.state` (and its longer-lived sibling, the `Session`) as the *only* contract between agents, and it demotes the orchestration shape to a thin tree. There are no direct calls between sub-agents. The flight researcher does not "return" to the composer. It writes `flights_brief` to state and stops. The composer reads state on its turn.

This sounds like a stylistic choice. It is actually a transport choice. The moment you have a sub-agent in another process — A2A federation, an MCP server, a long-running tool waiting on a human — you are no longer making function calls. You are sending and receiving events, with state as the durable record between them. ADK's contract is the same shape as the wire. Frameworks that orchestrate by call graph have to translate themselves into that shape under load. ADK does not.

The clearest example in the cookbook is the HITL payout desk. The agent drafts a payout, hits the ₹50,000 threshold, and calls a tool wrapped in `LongRunningFunctionTool`:

```python
tools=[
    lookup_vendor,
    draft_payout,
    LongRunningFunctionTool(func=request_approval),
    check_approval,
    post_payout,
    generate_voucher,
    ...
]
```

`request_approval` returns a pending handle and the runner *stops the turn*. The session sits idle.
A human clicks Approve in a separate browser, the portal hits `POST /approve/{session}` on the server, the server writes the decision into session state via `append_event` with a `state_delta`, and on the next user turn the agent's `check_approval` tool reads it back. The agent then calls `post_payout` and `generate_voucher`. From the agent's perspective, the human approver was just slow.

A graph framework can model this — every framework can — but the cost is that the graph leaks across the suspend boundary. You end up writing a "resume" node and a polling node and a state-shaped retry. ADK absorbs the suspend into the same contract everything else uses. The agent emits a function call. Some time later, a function response shows up. The runner resumes. There is no second machinery.

## Live is the test case

The clearest place ADK's runtime bias pays off is `run_live`. Bidirectional streaming over Gemini Live is unforgiving — twin coroutines pumping in and out of a `LiveRequestQueue`, audio chunked at 20ms, interruption events fired by the model when the user barges in over the agent's reply, audio streams that have to be drained when that happens, sessions that resume across socket reconnects, context windows that compress when a long support call spans a hundred thousand tokens.

Here is the entire wiring on the server side of the payments voice agent:

```python
queue = LiveRequestQueue()
await asyncio.gather(
    _forward_browser_to_model(ws, queue),
    _forward_model_to_browser(ws, session.id, queue),
    return_exceptions=True,
)
```

Two coroutines. One reads JSON from the browser WebSocket, decodes PCM16-at-16kHz, and pushes blobs into the queue.
The other consumes `runner.run_live(...)` and forwards parts to the browser:

```python
for part in (event.content.parts if event.content else []):
    if part.inline_data and part.inline_data.data:
        await ws.send_json({
            "kind": "audio",
            "data": base64.b64encode(part.inline_data.data).decode(),
        })
    if part.text:
        await ws.send_json({"kind": "transcript", "data": part.text})
    if part.function_call:
        await ws.send_json({"kind": "tool_call", "name": part.function_call.name, ...})
if getattr(event, "interrupted", False):
    await ws.send_json({"kind": "interrupted"})
```

Every signal you need to drive a real voice UI is on the event — the audio bytes, the transcript, the tool call mid-turn, the interruption flag, the turn-complete marker, the usage metadata. The runtime does not lecture you about what to do with them. It hands them over and gets out of the way.

The hard parts of voice are still hard. I had to fix four of them on the browser side, and the bugs were instructive. An audio worklet downsampling routine that dropped sample zero on every tick. A scheduled audio buffer that kept playing under the next reply because nothing was draining the `AudioBufferSourceNode` queue when the model fired `interrupted`. A `playheadRef` that went stale across reconnects because the new `AudioContext` started its clock at zero and the old playhead did not. A mic worklet posting at 375 messages per second because I had not coalesced the 128-sample render quantum into 20ms chunks before sending.

Every one of those was a client-side bug, and every fix was small. The runtime side needed almost nothing.

The one server-side fix I will call out, because it is the kind of bug a graph framework hides from you: when the browser-to-model coroutine raises on a closed socket, the model-to-browser coroutine can hang forever inside `run_live`'s generator if the queue is not closed.
The fix is one line in a `finally` block, idempotent on both sides:

```python
finally:
    queue.close()
```

You can only write that line if the runtime exposes the queue as a first-class object. ADK does. LangGraph does not (it is hiding a different machinery), and that is fine for many workloads — but voice is not one of them.

## The primitives that earned their keep

A short list, in the order they showed up in the cookbook, of the primitives I would not give back:

**1. `output_key` and state as the message bus.** The travel planner is a `SequentialAgent` containing a `ParallelAgent`. Each researcher writes a named brief into state. The composer reads all three. There is no plumbing. This is the part that scaled past one agent without effort.

**2. `LongRunningFunctionTool`.** The runner suspends on the function call and resumes on the function response — the same contract a normal tool uses, just stretched across human time. This is what makes HITL feel like a slow tool, not a state machine.

**3. `ParallelAgent` for fan-out.** The deep-research pipeline has three researchers running concurrently. Concurrency is declared by container, not by `asyncio.gather` in your tool code. That separation matters when you want to add a fourth researcher.

**4. `MCPToolset`.** The knowledge desk has zero hand-written tools. It points `MCPToolset` at the official `@modelcontextprotocol/server-filesystem` binary, scopes the root to the cookbook's `docs/` directory, allows read-only tools, and that is the agent. Swap the binary for `@playwright/mcp` or a Slack MCP server and you have a different agent. The constructor is identical.

**5. Artifacts.** The payout desk renders a PDF voucher to `tool_context.save_artifact`, the runner emits an `artifact_delta` on the event, the server picks it up and serves it from `/artifact/{session}/{filename}`. There is no separate file-handling contract. Artifacts are a kind of event.

**6. 
The introspect surface.** Every agent in the cookbook ships `/introspect` — a JSON dump of the agent tree, the tools, the model, and the planner. The portal renders it as a live diagram. This was a one-page helper, not a feature, and it is the reason the agents debug themselves.

**7. Session resumption + context compression.** A flaky network does not restart a Live call. A long support narration does not blow out the window. The two `RunConfig` knobs that turn this on are five lines combined.

## A2A is anticlimactic, in the right way

The two-process loan desk in the cookbook is the example I expected to learn the most from. The loan officer runs on port 8007, the credit bureau on port 8017, and the officer's `request_credit_report` tool calls the bureau over HTTP via `httpx`. Each side has its own `/health`, `/metrics`, `/introspect`, `/session`, and `/chat/{session_id}`.

What I learned is that A2A federation is not a primitive you reach for. It is what you get for free when both sides happen to be ADK agents. There is no wire protocol you need to read up on, no handshake to debug. The bureau is a FastAPI server with one extra endpoint (`POST /score`) that the officer's tool posts to. The "federation" is two HTTP servers and an agreed JSON shape.

This is the right outcome. Multi-agent federation, in the wild, has to work across vendors and frameworks. It cannot be a special protocol that only works when both sides bought into the same framework. Treating A2A as "you have already shipped two agents — now make one call the other" is correct, even if it makes for a less impressive diagram.

## Where ADK is awkward

Three places, in decreasing order of severity.

**Metrics on streaming.** ADK's events are clean, but the `usage_metadata` they carry is *cumulative* over the turn, not delta — and on long-running streaming turns it shows up on partial events too. If you naively sum `prompt_token_count` across events, you will overcount by an order of magnitude.
The fix is to gate on `event.partial == False` before recording usage, and to differentiate input/cached/tool-use (which Gemini sends as running totals) from output/thinking (which arrive in deltas). This is one of those documentation-level facts that you only learn by writing a metrics ribbon and watching the numbers go absurd.

**Tokens-per-second is non-trivial.** On a streaming turn, TPS is `output_tokens / (turn_complete_at - first_token_at)`. On a non-streaming turn — like a `SequentialAgent` whose final composer event arrives in one shot — `first_token_at ≈ turn_complete_at`, the denominator is microseconds, and you get nonsense rates. Mine was hitting impossibly high TPS in a live demo before I added a 50ms threshold and a fallback to total turn duration. Trivial to fix once you see it.

**Tool docstrings are the tool description.** This is correct API design and also a footgun. The model reads the docstring as the tool spec. A lazily written docstring becomes a lazily described tool. A tool that takes `amount_inr: float` with no docstring will be selected for "send money" queries with cheerfully wrong unit assumptions. ADK does not tell you this because Python conventions imply it. The model has no way to know your docstring is a placeholder. Lint accordingly.

The first two are runtime artifacts. The third is a discipline problem you inherit when the framework respects Python idioms. I will take all three.

## Where ADK fits, and where I would still reach for something else

The shape of the project matters more than the team's framework preferences.

**Reach for ADK when:**

- You expect to ship more than one agent. The contract is what compounds across them. A single-agent prototype does not exercise it.
- You have a Live workload — voice, video, mid-turn tool calls, barge-in. Nothing else I have used handles this with as little ceremony.
- You want HITL or long-running operations as first-class.
The `LongRunningFunctionTool` + session state pattern is genuinely small.
- You want to wire up MCP servers without writing custom tool adapters.
- Your team is already on Vertex / Gemini for other reasons. The alignment is real — `gemini-3.1-flash-lite-preview` for tool-heavy hops and `gemini-3-flash-preview` for orchestrators is a tier you can actually reason about cost-wise.

**Reach for something else when:**

- The agent is fundamentally a pipeline of deterministic transformations with one LLM step in the middle. A graph framework, or even plain Python, is a better fit. Do not over-frame.
- You need a UI-rendering DSL more than a runtime — that is what Vercel's AI SDK is for, and ADK has no opinion on the front end.
- You are committed to OpenAI or Anthropic and unwilling to use Gemini for the orchestration tier. ADK is model-pluggable in principle, but the Live story and the cost story both lean Gemini.

These are not deep claims. They are the rules I have applied twice in the last month and not regretted.

## The heuristics I would use again

A short list, derived from getting eleven of these out the door.

**Compose by container, not by callback.** Sub-agents talk through state. If you find yourself wiring up a callback between two `LlmAgent`s, you have invented a new contract. Use `output_key` and let the parent `SequentialAgent` decide who runs next.

**Make every agent ship `/introspect` and `/metrics` from day one.** Not because you need them, but because they are the surface a debugger and a demo both need. The thirty lines you write once will save you days.

**Keep the session boundary clean.** The `tool_context.state` is the wire. Anything you put there is what your sub-agents can see. Anything you do not put there does not exist.

**For Live, treat the queue as the load-bearing object.** Open it, close it in `finally`, and remember that the model-to-browser side dying first is just as common as the browser-to-model side.
Gather both, return exceptions, idempotent close.

**Gate metrics on `event.partial == False`.** Cumulative counters on streaming events are the single most common metrics bug you will write. Save yourself.

**Pick the lightest tier for the leaves and reserve the heavier tier for the composer.** The travel planner runs three concurrent researchers on `gemini-3.1-flash-lite-preview` and one composer on `gemini-3-flash-preview`. The latency and cost shape is dramatically better than running everything at the same tier, and the quality is indistinguishable on the leaves.

---

The part of ADK I had not anticipated, and that has stayed with me, is how much of agent engineering is *not* about the model. It is about events, state, transport, and where the suspend boundaries fall. The frameworks I used previously made me responsible for those boundaries while pretending I was responsible for a graph. ADK reverses that. It hands you the runtime and lets the graph emerge from how you compose.

Whether that bet ages well will depend on how Live, A2A, and MCP evolve. Those three are the bets ADK is built around, and they are the ones I would watch. The shape of the runtime, though, looks durable. It is the same shape the wire has, and that is rarely the wrong shape to write your code in.

---

*The eleven agents are open source at [github.com/vmishra/Google-ADK-Cookbook](https://github.com/vmishra/Google-ADK-Cookbook). The browser portal renders all of them side by side, with live metrics, a trace-a-request animation, and an introspect-driven architecture diagram.
Pull requests welcome — keep the editorial voice.*

---

### Built to Be Cited: An Engineer's Guide to AEO and GEO in 2026

- **Published:** 2026-04-19 (updated 2026-04-19)
- **URL:** https://vikasmishra.ai/blog/built-to-be-cited-aeo-geo-engineering-guide-2026/
- **Markdown source:** https://vikasmishra.ai/blog/built-to-be-cited-aeo-geo-engineering-guide-2026/index.md
- **Tags:** AEO, GEO, SEO, Google AI Mode, Gemini, LLM Optimization

> An implementation-grade walkthrough of AI-search optimization in 2026, centered on Google AI Mode and Gemini, with the schema graph, llms.txt work, IndexNow wiring, and the debugging sessions that shipped the result — drawn from rebuilding my own site.

*Disclaimer: All opinions expressed in this post are my own and do not represent the views or positions of my employer.*

I rebuilt vikasmishra.ai over the last two weeks, not for a redesign, but for a category of search that didn't exist the last time I touched the SEO. Ranking on the blue links stopped being the metric. Inclusion in an AI-generated answer became the metric. Google AI Mode and AI Overviews, ChatGPT search, Perplexity, Claude, Bing Copilot — the six surfaces that matter today all do a variation of the same thing: retrieve candidate passages, re-rank them on a platform-specific quality model, generate an answer that quotes or paraphrases the highest-ranked ones, and attach citations to the sources the generation actually used. What the surfaces reward and how they cite differs; the architecture converges.

I work on cloud and AI platforms at Google. The customers I advise are mostly large digital-native businesses — marketplaces and e-tailers with tens of millions of SKUs, food-delivery platforms doing millions of orders a day, online travel and hospitality platforms with real-time inventory and pricing, and fintechs operating in the YMYL ("Your Money or Your Life") category where AI answer accuracy is a regulatory concern, not just a marketing one.
The question every one of them is asking is the same: *how do we show up when a consumer asks AI Mode "best running shoes under $100," or Gemini "what's the cheapest flight from New York to Tokyo next weekend," or ChatGPT "which payment app is safest for recurring subscriptions"?*

Much of the public writing on Answer Engine Optimization (AEO) and Generative Engine Optimization (GEO) is surface-level — trend surveys, long lists of "factors," limited implementation depth. The engineering underneath is concrete and testable. The way to get it right is to build it on a site you control, read the Search Console signals that come back, and iterate against what the surfaces actually index. This post is the engineering write-up. Where my site changed, I'll show the diff. Where the conventional advice is wrong or too shallow, I'll say so.

A note on shelf life. The specific numbers and product names in this post will drift — AI Overview citation rates, crawler names, dashboard features, exact schema properties — these change on quarterly cycles. The **architecture** underneath doesn't. Retrieval-plus-generation systems reward the same properties they've rewarded since the first RAG systems shipped: structured, verifiable, fresh, entity-resolvable, cleanly fetchable content. Read the sections below as "here is how a retrieval-plus-generation system evaluates my site right now," not "here is a list of tactics that will look the same in 2028." The tactics evolve. The substrate principles don't.

## The shift, priced in three numbers

**Citation is now worth more than rank.** Content cited in a Google AI Overview earns roughly **35% more organic clicks** than the page that ranks #1 for the same query without a citation, and **91% more paid clicks** in adjacent ad slots ([wellows.com](https://wellows.com/blog/google-ai-overviews-ranking-factors/)).
Nearly half of AI Overview citations come from pages ranking below position #5, so the citation surface is genuinely independent of classical rank — a page ranking #8 can show up in the AI Overview while the #1 doesn't.

**Bing is the ChatGPT pipeline.** Approximately **92% of ChatGPT search queries route through the Bing Search API**, and **87% of SearchGPT citations match Bing's top organic results** ([cbwebsitedesign.co.uk](https://www.cbwebsitedesign.co.uk/geo-ai/how-to-rank-on-chatgpt-and-bing-copilot-in-2026-full-guide/)). Optimizing for ChatGPT search is, mechanically, optimizing for Bing first.

**LinkedIn is now the #2 LLM citation source.** As of early 2026, LinkedIn has overtaken Wikipedia in LLM citation frequency, ahead of every major news publisher ([wireinnovation.com](https://wireinnovation.com/mastering-seo-entities/)). Surface area on LinkedIn now feeds the LLM training and retrieval pipelines that produce your AI citations elsewhere.

Every optimization below is downstream of one of those three numbers. The magnitudes will shift as the surfaces evolve; the *direction* — citation over rank, push-indexing over wait-for-crawl, verified-entity authorship over anonymous content — is durable.

## Google AI Mode and the Gemini pipeline (the surface that matters most)

Because Google AI Mode is the primary consumer-facing AI search surface and the one most of my customers' organic traffic depends on, it deserves more than a paragraph.

AI Mode is powered by Gemini's **query fan-out** architecture: the user's query is decomposed into roughly a dozen parallel sub-queries, each retrieved independently against the Google index (plus topical sub-indexes), then the retrieved passages from all fan-out branches are merged, re-ranked, and compressed into the answer ([upgrowth.in](https://upgrowth.in/google-ai-mode-optimization-the-complete-guide-for-2026/)). A page doesn't compete for "one query" anymore — it competes across the entire decomposition surface.
The practical implications are specific:

**Passage-level structure wins over page-level optimization.** Because fan-out retrieves passages, not pages, a single sharp paragraph that answers a sub-query can be cited even if the rest of the page is mediocre. Google's AI Overview generation favors self-contained passages in the **134–167 word range** — the "semantic unit" size the synthesizer prefers for a standalone answer ([wellows.com](https://wellows.com/blog/google-ai-overviews-ranking-factors/)). Write each H2 section as a passage that, read in isolation, still answers the sub-question its heading implies.

**E-E-A-T flows through the Knowledge Graph.** Roughly **96% of AI Overview content comes from sources Google considers verified entities** ([clickrank.ai](https://www.clickrank.ai/e-e-a-t-and-ai/)). "Verified entity" here means Google's Knowledge Graph can cross-reference the author across multiple authoritative sources — LinkedIn, GitHub, Google Scholar, Wikidata, Credly, Crunchbase, academic repositories, organizational profiles. If the graph can't confidently resolve who wrote your page, the page gets filtered before generation, not at the ranking stage. This is the single largest gap between "ranks well" and "gets cited." (More on the schema graph below.)

**Entity density beats keyword density.** Gemini's grounding layer evaluates content by how many *connected* entities a page references. Pages with 15 or more entities that Google's Knowledge Graph can resolve show roughly **4.8× higher selection probability** for AI Overview citation compared to pages built around keywords ([hypesuite.ai](https://www.hypesuite.ai/post/google-knowledge-graph-essentials-what-every-seo-pro-should-know)). The practical move: when you mention a technology, product, person, or concept, use the Wikipedia-canonical name and, where natural, link to the authoritative source. You are not writing for a lexical matcher; you are writing for an entity resolver.
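The passage-length guidance above can be checked mechanically before publishing. The following is a minimal sketch, not a tool this site uses: it assumes the post is markdown with `## ` H2 headings, counts whitespace-separated tokens (so markup and nested H3 text count as words), and takes the 134–167 range from the figure cited above. The function names are hypothetical.

```python
import re

# Range taken from the figure quoted in the post; treat it as a tunable assumption.
TARGET_RANGE = (134, 167)


def h2_passage_lengths(markdown: str) -> dict[str, int]:
    """Return a rough word count for the text under each H2 heading."""
    lengths: dict[str, int] = {}
    heading = None
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.+)$", line)  # matches "## x" but not "### x"
        if match:
            heading = match.group(1).strip()
            lengths[heading] = 0
        elif heading is not None:
            lengths[heading] += len(line.split())
    return lengths


def out_of_range(markdown: str, low: int = TARGET_RANGE[0], high: int = TARGET_RANGE[1]) -> dict[str, int]:
    """Return only the H2 sections whose word count falls outside [low, high]."""
    return {
        heading: count
        for heading, count in h2_passage_lengths(markdown).items()
        if not low <= count <= high
    }
```

A section flagged here is only a prompt to reread it; a longer section made of several self-contained paragraphs may still be fine.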
**The April 2026 algorithm update tightened two things.** First, first-hand experience signals (original research, primary data, named-author commentary) were weighted up. Second, the site-level Core Web Vitals aggregation introduced in March means one slow section can suppress the entire domain's AI Overview eligibility ([seovendor.co](https://seovendor.co/google-april-2026-algorithm-updates)). Substrate quality is now a gating factor, not a tiebreaker.

**Google's crawlers for the AI pipeline are separate from classic Googlebot.** You want all of these allowed in robots.txt: `Googlebot` (classical index), `Google-Extended` (Gemini training), `Google-CloudVertexBot` (Vertex AI grounding), and `Gemini-Deep-Research` (the research agent inside AI Mode that performs multi-step retrieval for complex queries). Blocking `Google-Extended` removes you from Gemini training without affecting classical Google rank — it's a publisher choice a few large publications have made and that costs them AEO visibility.

### The operational dashboard: Search Console for AI Mode

Google Search Console remains the primary dashboard for AI Mode performance, with two reports worth watching weekly:

1. **Search performance → Search Appearance filter.** AI Overview appearances are now a filter value. Query-level impressions for "Search Appearance: AI Overview" tell you which queries you're being cited on and which you're not.
2. **Enhancements → Breadcrumbs / FAQ / Article.** Schema validation errors here directly affect AI Mode eligibility. A critical-severity breadcrumb error removes the breadcrumb from AI Overview cards; a malformed `BlogPosting` can remove the author attribution entirely.
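Whether the four crawler tokens named above are actually allowed is easy to verify against a robots.txt body. A minimal sketch using Python's standard `urllib.robotparser`: the agent list is copied from the paragraph above, while `blocked_agents`, the sample rules, and the example URL are hypothetical. Note that `robotparser` matches user-agent names by substring, which is an approximation of how real crawlers interpret the file.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens as listed in the post; the names may change over time.
AI_PIPELINE_AGENTS = [
    "Googlebot",
    "Google-Extended",
    "Google-CloudVertexBot",
    "Gemini-Deep-Research",
]


def blocked_agents(robots_txt: str, url: str, agents=AI_PIPELINE_AGENTS) -> list[str]:
    """Return the agents that the given robots.txt body disallows for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt that opts out of Gemini training only.
SAMPLE_ROBOTS = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    print(blocked_agents(SAMPLE_ROBOTS, "https://example.com/blog/post/"))
    # prints ['Google-Extended']
```

Running this in CI against the deployed robots.txt would catch an accidental block before a crawler does.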
When I first validated this site in Search Console I had three invalid breadcrumb items — the error was *"Either 'name' or 'item.name' should be specified in 'itemListElement'."* The cause was mundane and the kind of bug that's easy to miss: on root-level single pages (`/about/`, `/contact/`, `/privacy/`) my `.Section` variable was the empty string, and the template emitted a ListItem with `"name": ""` and `"item": "https://vikasmishra.ai//"` (double slash). The fix was to detect the root-level case and skip the middle breadcrumb entirely, emitting `[Home, Title]` instead of trying to manufacture a section level that doesn't exist. Search Console picked the fix up within 24 hours and flipped the items from Invalid to pending-revalidation. I mention the debug trace because it's representative — almost every real AEO problem is a template edge case that the validator finds once you send it the right URL.

## What this looks like at digital-native scale

A personal blog is a clean pedagogical example but the real stakes are at catalog scale. The substrate patterns are the same; the execution changes in ways that are worth walking through, because "add schema" is a one-liner for a blog and a multi-quarter platform project for a marketplace.

**E-commerce and marketplaces (millions of SKUs).** The citation queries that move revenue are category-level and comparison-level: *"best noise-cancelling headphones under $200,"* *"Adidas vs Nike running shoes for marathon training,"* *"cheapest 55-inch 4K TV with HDMI 2.1."* Gemini answers these by retrieving passages from product detail pages (PDPs), category pages, and editorial buyer's-guide content, then grounding them against Product schema and current Offer schema. The work to be citable:

- **`Product` + `Offer` + `AggregateRating` schema on every PDP**, with price, availability, SKU, GTIN, brand, and review aggregate.
Gemini's grounding layer validates prices against the schema before citing — stale schema makes the page ineligible, not lower-ranked. The `priceValidUntil` field matters more than most implementations realize.
- **`ItemList` schema on category pages** with ordered `Product` references. This is how category pages show up in AI Mode's comparison answers.
- **Fresh pricing pushed via IndexNow on every price or stock change.** At catalog scale this means wiring IndexNow into the PIM (product information management) layer, not the CMS. A price that changes every 10 minutes in your inventory system but updates in Google's index 48 hours later is a citation loss every time someone asks Gemini about the item in that window.
- **Category-scoped `llms.txt` rather than a monolithic one.** A single `llms-full.txt` for a million-SKU catalog is useless to an AI agent. Split by category: `/electronics/llms.txt`, `/fashion/llms.txt`, each listing the top N canonical pages in that category with descriptions. Category-level editorial content (buyer's guides, brand pages) belongs in these files; individual PDPs do not.
- **Editorial buyer's-guide content is the highest-leverage content investment.** Category "best X" guides with real reviewer credentials, passage-level structure, and `Review` + `AggregateRating` schema get cited by AI Mode at rates that PDPs never do. Vertical-specialist marketplaces that have leaned into this early — Nykaa's beauty category content is a strong example, with reviewer-attributed comparisons and ingredient-level schema on each guide — pick up citation share that generalist marketplaces miss. The investment ratio most marketplaces should run: one strong editorial guide per category outperforms ten thousand incremental PDP metadata optimizations.

**Online food delivery.** Three-way local intent dominates: user location, cuisine, delivery-time tolerance.
Queries look like *"best ramen near me,"* *"DoorDash vs Uber Eats for vegan delivery in Brooklyn,"* *"which late-night restaurants deliver after midnight in Shoreditch."* Across markets the pattern is identical — what changes is the platform name (Zomato, DoorDash, Deliveroo, Uber Eats, Just Eat, Swiggy) and the city. The most mature implementations I've looked at tend to ship richer `Restaurant` and `Menu` schema coverage than the average — Zomato's restaurant entity graph, with cuisines, menus, reviews, and neighborhood `LocalBusiness` signals all tied together, is a useful reference point for what "done well" looks like in this vertical. The work:

- **`Restaurant` + `LocalBusiness` + `Menu` + `MenuItem` schema** with hours, cuisines, price range, and accurate geo coordinates. The geo coordinates feed AI Mode's local grounding and Google Maps retrieval simultaneously.
- **`AggregateOffer` schema for delivery zones and current availability.** Platforms that ship this get cited; platforms that don't get omitted when the user's intent includes "deliver now."
- **Freshness of availability is a hard gate.** If your `availability: InStock` or `availability: OutOfStock` is stale, Gemini's answer will either be wrong (citation liability) or Gemini will skip you (citation loss). Real-time schema updates via IndexNow or structured-data sitemaps are table stakes here, not a nice-to-have.
- **Reviews are the single largest local-citation lever.** `Review` schema tied to `Restaurant` entities, with verified reviewer signals (name, date, platform `@id`). Don't fake it — `Review` spam is the highest-detection-rate category of structured-data fraud and Google's penalty pipeline routes through manual actions, not just rank suppression.

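
The first bullet in the list above can be sketched as a small builder that emits a `Restaurant` node. This is an illustrative Python sketch, not any platform's production markup: the function name `restaurant_jsonld`, the example.com URLs, and every value are hypothetical, while the property names (`servesCuisine`, `priceRange`, `openingHours`, `geo`, `hasMenu`) are standard schema.org vocabulary.

```python
import json


def restaurant_jsonld(name, url, cuisines, lat, lon, price_range,
                      opening_hours, menu_url):
    """Build a schema.org Restaurant node as a plain dict (illustrative fields only)."""
    return {
        "@context": "https://schema.org",
        "@type": "Restaurant",
        "@id": f"{url}#restaurant",  # stable identifier other nodes can reference
        "name": name,
        "url": url,
        "servesCuisine": cuisines,
        "priceRange": price_range,
        "openingHours": opening_hours,
        "geo": {"@type": "GeoCoordinates", "latitude": lat, "longitude": lon},
        "hasMenu": menu_url,
    }


# Hypothetical listing; every value here is a placeholder.
node = restaurant_jsonld(
    name="Example Ramen Bar",
    url="https://example.com/restaurants/example-ramen-bar/",
    cuisines=["Japanese", "Ramen"],
    lat=51.5246,
    lon=-0.0784,
    price_range="$$",
    opening_hours=["Mo-Su 11:00-23:30"],
    menu_url="https://example.com/restaurants/example-ramen-bar/menu/",
)
print(json.dumps(node, indent=2))
```

In practice the arguments would come from the operational system of record (hours, coordinates, menu state), so the markup changes when the listing does.
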
**Travel (OTAs, airlines, hotel aggregators).** Real-time inventory, dynamic pricing, heavy local intent, and a citation economy where being the attributed source for *"best time to visit Kyoto,"* *"cheapest flight from London to Lisbon next month,"* *"hotels near the Colosseum with airport transfer"* is worth significant revenue. The travel category is unusually global in its schema practices — the larger OTAs and aggregators (MakeMyTrip, Booking.com, Expedia, Airbnb, Trip.com, Kayak) all ship structured data at a comparable level of maturity, and the competitive differentiation is no longer in the basics but in destination-entity linking, fare freshness, and editorial content quality. MakeMyTrip in particular has invested heavily in the destination-editorial side (travel guides with structured `TouristDestination` linking and named-author bylines), which is the surface that Gemini's destination queries reach for. The work:

- **`Flight`, `LodgingReservation`, `TouristDestination`, `Trip` schema** with accurate schedules, prices, and availability.
- **Rich editorial destination content with `Place` + `TouristDestination` entity linking** to Wikidata or Google Places. AI Mode's destination answers lean heavily on entity-resolved places; descriptions that name places by their Wikipedia-canonical names and link to authoritative sources win citation weight.
- **Fare freshness.** Travel is the vertical where `dateModified` matters most. A fare page updated 48 hours ago will not be cited when Gemini can find the same fare updated 48 minutes ago on a competitor. The IndexNow pipeline needs to be wired into the fare cache, not the content CMS.
- **Review schema on properties and itineraries, with verified traveler signals.** Same cautions as food delivery — real reviews win, manufactured ones eventually trigger manual action.

**Fintech (payments, lending, insurance, investing).** The YMYL category — "Your Money or Your Life."
Google's E-E-A-T evaluation is strictest here and AI citation eligibility is gated on verified authorship and regulatory disclosures. Fintech customers who ignore this show up in AI Overviews at a fraction of the rate of their YMYL-disciplined competitors. The work:

- **Author bylines are not optional**; they are the primary eligibility signal. Every article, blog post, help doc, and product page that explains a financial concept needs a named author with a `Person` schema that includes verified credentials (`hasCredential` with `credentialCategory: "verified"` and `recognizedBy` pointing at the relevant regulator — SEC or FINRA in the US, FCA in the UK, BaFin in Germany, ESMA at the EU level, MAS in Singapore, RBI or SEBI in India — or at a recognized professional body in the jurisdiction).
- **`FinancialProduct` + `FinancialService` + `BankAccount` + `LoanOrCredit` schema** with full disclosure fields: interest rates, fees, terms, and regulator registration IDs.
- **`Dataset` + `Article` citation chains** for any statistic your content relies on. Unsourced financial claims are filtered before generation in Gemini's YMYL quality pipeline. Sourced ones with `citation` schema pointing at the primary data provider (central bank data, regulator disclosures, government statistics agencies, consumer protection reports) are both eligible and preferred.
- **Help Center and FAQ pages are the citation workhorses**, not blog posts. The queries that matter — *"is a mandate-based recurring payment safe,"* *"how do I dispute a credit card charge,"* *"what happens if I miss a tax filing deadline"* — resolve to help-center content in Gemini's answers. FAQ schema on those pages is the one place FAQ schema is still unambiguously worth shipping.
- **`Organization` with regulator registrations in `sameAs`** and accurate `address`, `vatID`, `taxID`, `legalName` fields.
For EU brands, the `vatID` and the registered-office `address` are primary entity-resolution signals; for US brands, the SEC EDGAR CIK and the state-of-incorporation registration are the equivalents. Make the cross-reference easy for Google's entity resolver to perform.

Across all four verticals, one pattern: the **schema data plane has to be wired into the operational system of record**, not into the CMS. Prices, inventory, availability, fares, menus, interest rates, property photos — all of these change outside the CMS. If your schema updates lag the operational state by more than a few minutes, your AI-citation eligibility lags accordingly. This surfaces as an SEO problem but the root cause is platform engineering. Teams that scope the work only as a content-marketing program tend to underinvest in the substrate that actually determines whether the content can be cited at all.

## What the other surfaces do differently

**ChatGPT search and Bing Copilot** share a backend. The path is: your page → Bing index → Bing's AI re-ranker → OpenAI's or Microsoft's generation layer → citation. The February 2026 *AI Performance* report inside [Bing Webmaster Tools](https://www.bing.com/webmasters) is the only citation-attribution dashboard any search platform has shipped so far — it tells you, per URL, how often your page was cited by Microsoft Copilot and Copilot-powered partner surfaces ([blogs.bing.com](https://blogs.bing.com/webmaster/February-2026/Introducing-AI-Performance-in-Bing-Webmaster-Tools-Public-Preview)). Set up Bing Webmaster Tools if you haven't. The IndexNow key (below) verifies ownership automatically.

**Perplexity** runs its own crawler (PerplexityBot) plus partner feeds, and its re-ranking model is the most freshness-biased of the four. Perplexity will prefer a page that was updated last week over a page with higher authority that was updated last year when the factual content is comparable.
Stale `dateModified` drops you from Perplexity more aggressively than from Google.

**Claude** is the least transparent. ClaudeBot crawls aggressively but Anthropic doesn't publish source-selection behavior. The defensive play is to make your content maximally fetchable for any agentic system — see the markdown-alternate section below.

You don't choose one surface to optimize for. Most of the substrate work counts across all of them.

## The schema graph (and the @id mistake that breaks it in production)

Almost every "AEO best practices" post recommends adding `Article` schema and `Person` schema. That advice is incomplete in a way that breaks the schema in production.

Structured data on a modern site is a **graph**. Each entity gets a stable `@id` (typically a URL fragment such as `https://vikasmishra.ai/#person`). Other entities reference that `@id` instead of restating the underlying fields. A `BlogPosting`'s `author` field points at the `Person`'s `@id`; the `Person`'s `mainEntityOfPage` points at the `ProfilePage`'s `@id`; the `ProfilePage`'s `mainEntity` points back at the `Person`. The `WebSite` is declared once and every `BlogPosting` is `isPartOf` it.

What goes wrong: most sites declare the `Person` schema only on the home page or only on `/about/`, but reference `#person` from the `Article` schema on every blog post. **The reference dangles.** When Google's structured-data extractor fetches a blog post, it sees `"author": {"@id": "https://example.com/#person"}` and no `#person` entity on that page. The author claim doesn't resolve to a named Person; the Person's credentials and `sameAs` graph don't attribute to the article; the authorship signal Google's E-E-A-T evaluator explicitly weighs is absent for every post.

I caught this on my own site when I audited the JSON-LD graph across URLs. The fix is to emit `Person` and `WebSite` on every page — they're cheap, and they make every `@id` reference resolve on every URL.
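
The audit described above can be approximated with a short script. This is a Python sketch under stated assumptions, not the tooling used on this site: it assumes the JSON-LD blocks of a single page have already been extracted as strings, treats an object containing only `@id` as a reference, and treats any richer object carrying an `@id` as a declaration. The name `dangling_ids` and the example.com identifiers are hypothetical.

```python
import json


def _walk(node, declared, referenced):
    """Recursively sort every @id in a JSON-LD tree into declared vs referenced."""
    if isinstance(node, dict):
        node_id = node.get("@id")
        if node_id is not None:
            if set(node) == {"@id"}:
                referenced.add(node_id)   # bare pointer, e.g. {"@id": ".../#person"}
            else:
                declared.add(node_id)     # entity with its own fields
        for value in node.values():
            _walk(value, declared, referenced)
    elif isinstance(node, list):
        for item in node:
            _walk(item, declared, referenced)


def dangling_ids(jsonld_blocks):
    """Return @id values that a page references but never declares."""
    declared, referenced = set(), set()
    for block in jsonld_blocks:
        _walk(json.loads(block), declared, referenced)
    return sorted(referenced - declared)


# Hypothetical blog post that references #person without declaring it.
post_only = [
    '{"@type": "BlogPosting", "@id": "https://example.com/blog/p/#article",'
    ' "author": {"@id": "https://example.com/#person"}}'
]
# Same page after emitting the Person node alongside the article.
with_person = post_only + [
    '{"@type": "Person", "@id": "https://example.com/#person", "name": "Example Author"}'
]

print(dangling_ids(post_only))
print(dangling_ids(with_person))
```

Because the check is per page, it needs to run against at least one URL of each template type (home, post, root-level page) to catch references that only dangle on some of them.
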
The head template now looks like this:

```go-html-template
{{/* Emit on every page so @id references resolve site-wide.
     Before this change, #person was only declared on the home page;
     the Article.author.@id reference on blog posts dangled. */}}
{{ partial "schema/website.html" . }}
{{ partial "schema/person.html" . }}

{{ if .IsHome }}
  {{ partial "schema/faq.html" . }}
{{ else }}
  {{ partial "schema/breadcrumb.html" . }}
  {{ if .IsPage }}
    {{ if eq .Section "blog" }}
      {{ partial "schema/article.html" . }}
    {{ end }}
    {{ if eq .RelPermalink "/about/" }}
      {{ partial "schema/profile-page.html" . }}
    {{ end }}
    {{ if .Params.faq }}
      {{ partial "schema/faq-page.html" . }}
    {{ end }}
  {{ end }}
{{ end }}
```

That emits a fully-connected graph on every URL. On a blog post the graph resolves: `BlogPosting → author → Person → mainEntityOfPage → ProfilePage → mainEntity → Person` (closes the cycle), plus `publisher → Person`, `isPartOf → WebSite`, and a separate `BreadcrumbList` with its own `@id` referenced from `ProfilePage.breadcrumb` when applicable.

Validate every schema change in both [Google's Rich Results Test](https://search.google.com/test/rich-results) and the [Schema.org validator](https://validator.schema.org/). They disagree on edge cases; Google's validator tells you what Google will actually use, the Schema.org validator catches spec violations Google doesn't flag.

### A debugging story: the double-escape

I ran into a second schema bug that's worth recounting because it's easy to miss and the fix is counterintuitive. I refactored the `BreadcrumbList` template to use Hugo's `jsonify` function on the `name` field, expecting it to produce safe JSON. It didn't. The rendered output was:

```json
{"@type":"ListItem","position":2,"name":"\"About Vikas Mishra\"","item":"..."}
```

The name value was the string `"About Vikas Mishra"` *including the quote characters*, rather than `About Vikas Mishra`. Hugo's context-aware auto-escaping inside `