Flipkart Commerce Cloud Search
Product Documentation v1.0
Built By Retailers, For Retailers
Confidential — For Publisher and Partner Use Only © 2026 Flipkart Commerce Cloud
About This Document
This document is the product reference for Flipkart Commerce Cloud (FCC) Search. It covers the platform's architecture across four pillars — Organize, Understand, Rank, and Experience — the semantic search layer, capability status, strategic vision, and integration guidance for publishers and their technical teams.
Audience: Network publishers, e-commerce product managers, merchandising teams, and integration engineers.
Prerequisites: Familiarity with e-commerce concepts (product catalogs, search results pages, conversion metrics) is helpful but not required.
Table of Contents
- Introduction
- Platform Architecture — The Four Pillars
- Pillar 1 — Organize: Indexing and Taxonomy
- Pillar 2 — Understand: Query Processing Pipeline
- Pillar 3 — Rank
- Pillar 4 — Experience
- Semantic Search
- Capability Summary
- Market Context and Strategic Vision
- Roadmap
- Metrics and Measurement
- Onboarding Checklist
- Glossary
1. Introduction
Overview
FCC Search is a composable, multi-tenant e-commerce search platform built on the same engine that powers Flipkart, one of the world's largest commerce platforms by search volume. It is purpose-built for mid-to-large retailers and marketplaces seeking higher conversion on existing traffic, without the complexity and cost of building or maintaining search infrastructure in-house.
Search is not a feature — it is the primary discovery mechanism for e-commerce. Roughly 40–55% of all units sold in e-commerce flow through organic search. Customers who search convert significantly more than those who browse, and their average order value is higher. A customer who hits a dead end — null results, wrong results, confusing results — does not try again. They leave.
FCC Search is designed to ensure that does not happen.
The Retail Store Analogy
The simplest way to understand what a search system does is to think of a large physical retail store. Every element of the store is designed to help a shopper find what they came for: products grouped by type, arranged on shelves in a logical order, placed in sections that match how shoppers think. When a shopper asks a staff member for help, the staff member understands what was meant, even if the wrong word was used, and guides the shopper to the right aisle. Once in the aisle, the most popular or best-value products are at eye level, not buried at the back.
E-commerce search does all of this digitally:
- Organize — How the store is laid out and how products are shelved
- Understand — The staff member who interprets your query and points you in the right direction
- Rank — Which products are placed at eye level vs. buried on the bottom shelf
- Experience — What information is visible on the shelf tag to help you decide
FCC Search is built around these four pillars.
Who This Platform Is For
| Persona | Description |
|---|---|
| Network Publishers | Retailers and marketplaces ($100M–$2B GMV, 10K–5M SKUs) seeking higher conversion from existing traffic through best-in-class search relevance |
| Merchandising and Category Teams | Business users who need control over what products surface for which queries — without raising engineering tickets for every change |
| Integration Engineers | Technical teams integrating FCC Search APIs into storefronts, mobile apps, and analytics pipelines |
| E-Commerce Product Managers | Teams responsible for search quality, null search reduction, and conversion optimisation across discovery surfaces |
2. Platform Architecture — The Four Pillars
Overview
Every FCC Search query passes through four sequential layers. Each layer has a distinct responsibility, and together they produce a ranked, relevant result page for every user interaction.
User Types a Query
│
▼
┌─────────────────────────────┐
│ PILLAR 1 — ORGANIZE │ Products indexed into a demand-side store tree
│ (Indexing & Taxonomy) │ Real-time + batch ingestion pipelines
└──────────┬──────────────────┘
│
▼
┌─────────────────────────────┐
│ PILLAR 2 — UNDERSTAND │ Sequential query processing pipeline
│ (Query Processing) │ DNA → Spell → Classify → Intent → Retrieve
└──────────┬──────────────────┘
│ Candidate Set
▼
┌─────────────────────────────┐
│ PILLAR 3 — RANK │ 3-stage cascaded ranking (L0 → L1 → L2)
│ (Ranking) │ Coarse sort → product reranking → personalisation
└──────────┬──────────────────┘
│ Top N results
▼
┌─────────────────────────────┐
│ PILLAR 4 — EXPERIENCE │ AutoSuggest, Snippets, Spotlights, Filters
│ (Search Results Page) │ Null handling, sort, grid/list view
└──────────┬──────────────────┘
│
▼
Search Results Page (SRLP) — rendered by publisher storefront
Important: FCC Search is a headless, API-first service. It returns ranked result data via API — the publisher's storefront is responsible for rendering. This gives publishers complete control over the visual experience while FCC handles the intelligence layer.
3. Pillar 1 — Organize: Indexing and Taxonomy
Overview
Before any query can be answered, products must be indexed in a way that mirrors how shoppers think — not just how sellers list. The Organize layer structures the catalog into a demand-side taxonomy and makes every product retrievable in near real-time.
Status: ✅ Live
3.1 Store Tree (Demand-Side Taxonomy)
FCC Search organizes products into a Store Tree, a hierarchical taxonomy from L0 (All Products) down to Leaf Stores (specific product types). Every product maps to at least one Leaf Store, and the Leaf Store a query is classified into determines the entire universe of results returned.
All Products (L0)
├── Electronics (L1)
│   ├── Phones & Tablets (L2)
│   │   └── Smart TVs (L3 ● Leaf)
│   └── Computing (L2)
│       └── Laptops (L3 ● Leaf)
├── Home & Furniture (L1)
│   └── ...
└── Fashion (L1)
    ├── Men's T-Shirts (L3 ● Leaf)
    └── Dresses (L3 ● Leaf)
The Leaf Store determines:
- Which products are in the result universe for a given query
- Which filters are available (Electronics shows RAM/storage; Fashion shows size/fit)
- Which ranking signals are applied
- How the Search Results Page (SRLP) is laid out
A wrong taxonomy classification means wrong results — always. Taxonomy quality is a direct multiplier on search quality. A product filed in the wrong store will never surface for queries that belong to the correct store.
Demand indexing vs. supply indexing. Products are indexed into stores based on how buyers think, not only on the seller's vertical. A unisex t-shirt can live simultaneously in Men's, Women's, and Kids' stores. This demand-side mapping is configured by catalog and merchandising teams and is one of the highest-leverage controls in the system.
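The demand-side mapping can be pictured as a many-to-many relation between products and Leaf Stores. The sketch below is illustrative only: the SKUs, the `PRODUCT_STORES` table, and the `result_universe` helper are hypothetical names, not part of any FCC API.

```python
from collections import defaultdict

# Hypothetical demand-side mapping: one product may sit in several Leaf Stores.
PRODUCT_STORES = {
    "sku-laptop-1": ["Laptops"],
    "sku-unisex-tee": ["Men's T-Shirts", "Women's T-Shirts", "Kids' T-Shirts"],
}


def build_store_index(product_stores):
    """Invert product -> stores into store -> set of products."""
    index = defaultdict(set)
    for sku, stores in product_stores.items():
        for store in stores:
            index[store].add(sku)
    return index


def result_universe(index, leaf_store):
    """Products eligible for a query classified into leaf_store."""
    return set(index.get(leaf_store, set()))


store_index = build_store_index(PRODUCT_STORES)
print(result_universe(store_index, "Men's T-Shirts"))
print(result_universe(store_index, "Dresses"))  # no product mapped here
```

A product that is missing from a store's set can never be returned for queries classified into that store, which is why the mapping matters so much.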
3.2 Indexing Pipeline
FCC Search maintains two ingestion paths for the product index:
| Path | Latency | Purpose |
|---|---|---|
| Real-Time Ingestion | Near-real-time | Price changes, stock updates, availability flags. Staleness here is a direct revenue risk — a product shown as in-stock that is actually out of stock undermines buyer trust immediately. |
| Batch Ingestion | Periodic (scheduled) | Product attribute updates, taxonomy reclassification, image changes, new catalog additions. |
Catalog quality directly impacts search quality. Sparse product attributes — missing titles, incomplete specifications, absent category assignments — make products effectively invisible to relevant queries even if they are correctly indexed. Publishers should prioritise catalog completeness as a foundational dependency for search performance.
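One way a publisher might audit catalog completeness before onboarding is to compute an attribute fill rate per product. This is a minimal sketch under assumptions: the `REQUIRED_FIELDS` list, the field names, and the 0.8 cut-off are hypothetical and would need to match the publisher's own catalog schema.

```python
# Hypothetical set of attributes treated as required for search visibility.
REQUIRED_FIELDS = ("title", "category", "brand", "price", "image_url")


def fill_rate(product, required=REQUIRED_FIELDS):
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for field in required if product.get(field) not in (None, "", []))
    return filled / len(required)


def flag_sparse(catalog, min_fill=0.8):
    """Return SKUs whose fill rate is below the (assumed) minimum."""
    return [p["sku"] for p in catalog if fill_rate(p) < min_fill]


catalog = [
    {"sku": "A1", "title": "4K Smart TV", "category": "Televisions",
     "brand": "Acme", "price": 29999, "image_url": "tv.jpg"},
    {"sku": "B2", "title": "", "category": None, "brand": "Acme", "price": 499},
]
print(flag_sparse(catalog))
```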
4. Pillar 2 — Understand: Query Processing Pipeline
Overview
The Query Processing Pipeline fires sequentially on every query. It transforms raw user input into a structured retrieval instruction that the index can act on. The order of steps is critical — an error in an early stage propagates through all subsequent stages and degrades the final result set.
Raw Query
│
▼
[DNA] → [Spell Correction] → [Store Classification] → [Intent Understanding]
│
▼
[Query Rewrite] → [Quantifier Extraction] → [Partial Match]
│
▼
[Lexical Retrieval + Semantic Retrieval] → Candidate Set for Ranking
4.1 DNA — Do Not Augment Gate
Status: ✅ Live
The first step in the pipeline is a protection gate. DNA identifies brand names, trademarks, product codes, and other terms that must not be modified by any downstream module. Terms flagged by DNA are passed through unchanged: they are never spell-corrected, rewritten, or expanded.
This protection is essential: spell-correcting a brand name like "Defy" or "Cemento" into a common dictionary word corrupts everything that follows.
4.2 Spell Correction
Status: ✅ Live (with ongoing improvements)
Spell Correction fixes genuine typos in the query terms that the DNA gate left unprotected. The system must distinguish a real misspelling from an unusual-looking brand name, because getting this wrong corrupts all downstream processing.
FCC uses a trained spell correction model augmented with brand-specific override lists that protect known brand names from incorrect correction. New brand terms are added to the override list to maintain protection as the client's catalog evolves.
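The interplay between the DNA gate and spell correction can be illustrated with a toy example. This sketch swaps in Python's standard `difflib` fuzzy matcher for the trained spell model, and the `PROTECTED_TERMS` and `VOCABULARY` lists are hypothetical; it shows the idea of the override list, not FCC's implementation.

```python
import difflib

# Hypothetical override list (brand terms that must never be corrected).
PROTECTED_TERMS = {"defy", "cemento"}
# Hypothetical dictionary of known-good query tokens.
VOCABULARY = ["washing", "machine", "cement", "laptop", "shoes"]


def correct_query(query, protected=PROTECTED_TERMS, cutoff=0.8):
    """Correct each token unless it is protected or already known."""
    corrected = []
    for token in query.lower().split():
        if token in protected or token in VOCABULARY:
            corrected.append(token)
            continue
        # difflib stands in for a trained model in this sketch.
        matches = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return " ".join(corrected)


print(correct_query("Cemento washng machne"))
# Without protection, the brand name is "corrected" into a dictionary word.
print(correct_query("cemento", protected=set()))
```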
4.3 Store Classification
Status: ✅ Live
Store Classification is an ML model that predicts which Leaf Store a query belongs to. This is the single most consequential step in the pipeline — the classified store determines the entire result universe.
| Input | Output |
|---|---|
| Corrected user query | Predicted Leaf Store (e.g., "washing machine" → Washing Machines Leaf) |
When no Leaf Store reaches the model's confidence threshold, the query falls back to the parent-level store, which offers a broader result universe at lower precision. This fallback protects against null results at the cost of some relevance.
FCC's Store Classifier is trained on client-specific query and catalog data. For new client onboardings, the classifier is retrained on the publisher's local query patterns, taxonomy, and brand ecosystem — a key step in the onboarding process.
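The confidence fallback described above can be sketched as a simple rule applied to the classifier's output. The score dictionary, the `PARENT_STORE` table, the "Home Appliances" parent, and the 0.6 threshold are all assumptions made for illustration.

```python
# Hypothetical leaf -> parent relationships in the Store Tree.
PARENT_STORE = {
    "Washing Machines": "Home Appliances",
    "Dryers": "Home Appliances",
}


def classify_store(leaf_scores, threshold=0.6):
    """Return the top Leaf Store, or its parent when confidence is too low."""
    best_leaf = max(leaf_scores, key=leaf_scores.get)
    if leaf_scores[best_leaf] >= threshold:
        return best_leaf
    # Low confidence: fall back to the broader parent-level store.
    return PARENT_STORE[best_leaf]


print(classify_store({"Washing Machines": 0.91, "Dryers": 0.05}))
print(classify_store({"Washing Machines": 0.45, "Dryers": 0.40}))
```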
4.4 Intent Understanding
Status: ✅ Live
Intent Understanding classifies the semantic type of the query. Different query types receive different retrieval and ranking treatment:
| Intent Type | Example | Retrieval Treatment |
|---|---|---|
| Brand | "Nike" | Brand-filtered retrieval; brand store prioritised |
| Category | "running shoes" | Broad category retrieval within classified store |
| Feature | "earphones under 2k" | Feature + price constraint extraction |
| Thematic | "gym essentials" | Broad semantic matching across categories |
Understanding intent allows the system to optimise both what is retrieved and how results are ranked and presented for each query type.
4.5 Query Rewrite / Query Expansion
Status: 🔜 Coming Soon
Query Rewrite expands search recall without sacrificing precision. It adds semantically equivalent terms to the retrieval query. For example, a query for "hiking footwear" also retrieves products matching "trekking shoes" and "trail boots."
This capability is critical for tail queries where lexical matching alone fails. Without it, queries using non-standard terminology for a product category return no or poor results even when matching products exist in the catalog.
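As a rough illustration of query expansion, the sketch below uses a hand-written synonym table. The `SYNONYMS` table and `expand_query` function are hypothetical; a production system would more likely derive equivalents from query logs or a learned model.

```python
# Hypothetical synonym table keyed by normalised query.
SYNONYMS = {
    "hiking footwear": ["trekking shoes", "trail boots"],
}


def expand_query(query):
    """Return the original query followed by any equivalent phrasings."""
    normalised = query.lower().strip()
    return [normalised] + SYNONYMS.get(normalised, [])


print(expand_query("Hiking Footwear"))
print(expand_query("laptop"))
```

Each phrase in the returned list would be issued to retrieval, so products titled with any of the variants become eligible.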
4.6 Quantifier Extraction (Numeric Constraint Handling)
Status: 🔜 Coming Soon
The Quantifier module extracts numeric constraints from natural-language queries and encodes them into the retrieval query. Examples:
- "TVs under 30,000" → price ceiling constraint applied at retrieval
- "laptops 40k–60k" → price band constraint applied at retrieval
- "65-inch TV" → screen size constraint applied at retrieval
Without this capability, numeric terms in queries are treated as plain text tokens, so the price or size constraint the user expressed is not applied to the results.
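The three example queries above can be handled with a few regular expressions, as in this minimal sketch. It assumes that bare numbers and "k" suffixes refer to prices, and the constraint keys (`price_min`, `price_max`, `screen_inches`) are hypothetical names rather than FCC API fields.

```python
import re

# A number such as "30,000", "499.99" or "40k".
NUM = r"\d[\d,]*(?:\.\d+)?k?"


def _to_number(text):
    """Convert '30,000' or '40k' into a float."""
    text = text.lower().replace(",", "")
    if text.endswith("k"):
        return float(text[:-1]) * 1000
    return float(text)


def extract_constraints(query):
    """Pull price and screen-size constraints out of a raw query."""
    q = query.lower()
    constraints = {}
    ceiling = re.search(rf"under\s+({NUM})", q)
    if ceiling:
        constraints["price_max"] = _to_number(ceiling.group(1))
    # Range separated by a hyphen or an en dash, e.g. "40k-60k".
    band = re.search(rf"({NUM})\s*[-\u2013]\s*({NUM})", q)
    if band:
        constraints["price_min"] = _to_number(band.group(1))
        constraints["price_max"] = _to_number(band.group(2))
    size = re.search(r"(\d+(?:\.\d+)?)\s*-?\s*inch", q)
    if size:
        constraints["screen_inches"] = float(size.group(1))
    return constraints


for example in ("TVs under 30,000", "laptops 40k\u201360k", "65-inch TV"):
    print(example, "->", extract_constraints(example))
```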
4.7 Partial Match
Status: 🔜 Coming Soon
Partial Match is a null-rescue mechanism. If a query containing multiple tokens returns no results after all processing steps, Partial Match progressively relaxes the query — dropping less essential tokens and loosening constraints — until a result set is found.
This directly reduces null search rate. Without it, multi-term queries that partially match the catalog return zero results even when relevant products exist.
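Progressive relaxation can be sketched as a loop that drops the least important token until something matches. The `token_importance` weights and the strict all-tokens `lexical_search` are assumptions for illustration; how FCC will decide which tokens are "less essential" is not specified here.

```python
def lexical_search(tokens, catalog):
    """Return titles that contain every token (strict match)."""
    return [title for title in catalog
            if all(token in title.lower().split() for token in tokens)]


def partial_match(query, catalog, token_importance):
    """Relax the query token by token until results are found."""
    tokens = query.lower().split()
    while tokens:
        results = lexical_search(tokens, catalog)
        if results:
            return tokens, results
        if len(tokens) == 1:
            break
        # Drop the token with the lowest (hypothetical) importance weight.
        weakest = min(tokens, key=lambda t: token_importance.get(t, 0.0))
        tokens.remove(weakest)
    return [], []


catalog = ["red cotton shirt", "blue denim jacket"]
importance = {"shirt": 1.0, "cotton": 0.6, "red": 0.5, "waterproof": 0.2}
print(partial_match("red waterproof cotton shirt", catalog, importance))
```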
4.8 Retrieval — Lexical and Semantic
Status: ✅ Live
FCC Search runs two retrieval methods in parallel and combines their outputs into the candidate set before ranking:
| Signal | How It Works | Strength |
|---|---|---|
| Lexical Retrieval | Token-match query against the product index. Matches on title, attributes, brand, and category terms. | High precision for exact and near-exact matches. Fast. |
| Semantic Retrieval | Dense vector match by meaning using the FK V5 embedding model (LLM-augmented). Products are matched by semantic intent, not just token overlap. | Captures intent for natural-language, long-tail, and synonym-heavy queries that lexical retrieval misses. |
Both signals contribute to the candidate set passed to ranking. See Section 7 for a dedicated description of the Semantic Search layer and its current rollout status.
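The difference between the two signals can be shown with a toy catalog. In this sketch the two-dimensional vectors are hand-made stand-ins for real embeddings (they are not FK V5 output), and the product IDs, titles, and 0.9 similarity cut-off are hypothetical.

```python
import math

# Toy products with hand-made 2-D "embeddings" (assumed, not real model output).
PRODUCTS = {
    "p1": {"title": "athletic footwear for training", "vec": [0.9, 0.1]},
    "p2": {"title": "gym shoes black", "vec": [0.8, 0.2]},
    "p3": {"title": "office chair", "vec": [0.1, 0.9]},
}


def lexical_retrieve(query):
    """Products whose title shares at least one token with the query."""
    query_tokens = set(query.lower().split())
    return {pid for pid, p in PRODUCTS.items()
            if query_tokens & set(p["title"].split())}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def semantic_retrieve(query_vec, min_sim=0.9):
    """Products whose vector is close to the query vector."""
    return {pid for pid, p in PRODUCTS.items()
            if cosine(query_vec, p["vec"]) >= min_sim}


lexical = lexical_retrieve("gym shoes")
semantic = semantic_retrieve([0.85, 0.15])  # assumed embedding of "gym shoes"
candidates = lexical | semantic
print(sorted(lexical), sorted(semantic), sorted(candidates))
```

Here lexical retrieval misses "athletic footwear" because no token overlaps, while the vector match recovers it.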
5. Pillar 3 — Rank
Overview
Ranking determines which products in the candidate set are shown at the top of the results page. Position 1 receives approximately 40% of clicks; position 10 receives approximately 2%. Ranking decisions have an outsized impact on both conversion and revenue outcomes.
FCC Search uses a three-stage cascaded ranking architecture, where each stage narrows candidates with progressively richer signals.
~1M Candidate Documents
│
▼
┌────────────────────────────────────────────────┐
│ L0 — Coarse Sort │ ✅ Live
│ Primary signal: Units Per Impression (UPI) │
│ Fast, broad filter. Output: ~10K candidates │
└──────────────────────┬─────────────────────────┘
│
▼
┌────────────────────────────────────────────────┐
│ L1 — Collapse Sort + Product Reranking │ ✅ Live
│ Buy box selection (1 best listing per product)│
│ Speed + Quality + Price + Seller signals │
│ Output: top 1,000 products │
└──────────────────────┬─────────────────────────┘
│
▼
┌────────────────────────────────────────────────┐
│ L2 — Personalisation │ 🔜 Coming Soon
│ Near-real-time session + purchase history │
│ Top 120 results shown on SRLP │
└────────────────────────────────────────────────┘
5.1 L0 — Coarse Sort
Status: ✅ Live
L0 applies a fast, broad scoring to the full candidate set retrieved from the index. The primary signal is UPI (Units Per Impression), a measure of how often impressions of a product have historically converted into units sold. L0 eliminates clearly irrelevant or low-quality candidates before the more expensive L1 signals are applied.
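A coarse sort of this kind can be sketched in a few lines. The sketch assumes UPI is computed as units sold divided by impressions, which follows from the name but is not spelled out in this document; the field names and the `keep` parameter are hypothetical.

```python
def upi(units, impressions):
    """Units Per Impression, assumed here to be units sold / impressions."""
    return units / impressions if impressions else 0.0


def coarse_sort(candidates, keep):
    """Order candidates by UPI and keep only the strongest ones."""
    ranked = sorted(candidates,
                    key=lambda c: upi(c["units"], c["impressions"]),
                    reverse=True)
    return ranked[:keep]


candidates = [
    {"id": "a", "units": 5, "impressions": 1000},
    {"id": "b", "units": 40, "impressions": 2000},
    {"id": "c", "units": 0, "impressions": 0},
]
print([c["id"] for c in coarse_sort(candidates, keep=2)])
```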
5.2 L1 — Collapse Sort and Product Reranking
Status: ✅ Live
L1 performs two operations:
Buy Box Selection: For any product sold by multiple sellers, L1 selects the single best listing to represent the product in results — the "buy box" winner. This collapses multiple seller listings of the same product into one result card.
Product Reranking: The collapsed product list is re-scored using a richer signal set:
| Ranking Signal | Description |
|---|---|
| UPI / fRPI | Units Per Impression with time-decay. Penalises products whose historical popularity is stale; gives recently popular products a fair weighting. |
| Speed | Delivery speed to the user's location. Faster delivery receives a ranking boost — a key commercial signal for platforms where delivery SLA is a differentiator. |
| Listing Quality Score (LQS) | Completeness of the product listing: title richness, image count, attribute fill rate. Incomplete listings are demoted. |
| Product Quality Score (PQS) | Review count and average rating. High-quality, well-reviewed products are surfaced above equivalent products with sparse or poor reviews. |
| Price Value Score (PVS) | Competitive pricing relative to the category median. Products priced at or below the category median receive a ranking boost. |
| Seller Tier | Seller's historical fulfilment reliability, return rate, and responsiveness. Higher-tier sellers receive a ranking advantage. |
Configurable signal weights: L1 signal weights are configurable per publisher — different weight profiles can be applied for different business contexts (e.g., sale events, category promotions). A self-serve ranking configuration layer is currently in development to allow business users to adjust these weights without engineering intervention.
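The two L1 operations, together with per-context weight profiles, can be sketched as a weighted sum over normalised signals. Everything numeric here is an assumption: the weights, the profile names, and the choice to reuse the same score for buy box selection are illustrative, not FCC's actual configuration.

```python
# Hypothetical weight profiles; signal names follow the table above.
WEIGHT_PROFILES = {
    "default": {"upi": 0.4, "speed": 0.2, "lqs": 0.1,
                "pqs": 0.1, "pvs": 0.1, "seller_tier": 0.1},
    "sale_event": {"upi": 0.3, "speed": 0.1, "lqs": 0.1,
                   "pqs": 0.1, "pvs": 0.3, "seller_tier": 0.1},
}


def l1_score(signals, profile="default"):
    """Weighted sum of signals, each assumed to be normalised to 0..1."""
    weights = WEIGHT_PROFILES[profile]
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)


def select_buy_box(listings, profile="default"):
    """Keep the single best-scoring listing per product."""
    best = {}
    for listing in listings:
        pid = listing["product_id"]
        if pid not in best or (l1_score(listing["signals"], profile)
                               > l1_score(best[pid]["signals"], profile)):
            best[pid] = listing
    return list(best.values())


def rerank(listings, profile="default"):
    """Collapse to buy box winners, then sort by L1 score."""
    winners = select_buy_box(listings, profile)
    return sorted(winners, key=lambda l: l1_score(l["signals"], profile),
                  reverse=True)


listings = [
    {"product_id": "tv-1", "seller": "A", "signals": {"upi": 0.5, "pvs": 0.2}},
    {"product_id": "tv-1", "seller": "B", "signals": {"upi": 0.5, "pvs": 0.9}},
    {"product_id": "tv-2", "seller": "C", "signals": {"upi": 0.9, "pvs": 0.1}},
]
print([l["seller"] for l in rerank(listings)])
print([l["seller"] for l in rerank(listings, profile="sale_event")])
```

With the default profile the higher-UPI product leads; under the sale profile, where price value carries more weight, the order flips.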
5.3 L2 — Personalisation
Status: 🔜 Coming Soon
L2 applies personalisation signals to the top 120 products from L1 to produce the final ranking shown on the SRLP. It uses near-real-time signals from the user's current session and historical purchase behaviour to re-order results for that specific user.
L2 personalisation is expected to deliver meaningful uplift for returning users and high-frequency shoppers, where session context strongly predicts purchase intent.
5.4 Offline Ranking Calibration
Status: 🔜 Coming Soon
FCC is building an offline simulation system that runs ICU (Impressions, Clicks, Units) simulations to identify optimal signal weight configurations for L0/L1 ranking. This system will replace the current manual approach to signal weight setting with a principled, data-driven calibration process that can be run after every major catalog or traffic change.
6. Pillar 4 — Experience
Overview
The Experience layer covers everything the user sees — before, during, and after typing a query. A perfectly ranked result set that is hard to navigate, filter, or understand still fails. The Experience layer ensures the right information reaches the user in a way that makes choosing easy.
Status: ✅ Live (all components in this section)
6.1 AutoSuggest
AutoSuggest provides real-time query completions as the user types. Suggestions are drawn from a combination of popular queries, trending searches, past user queries, and category shortcuts.
| Feature | Description |
|---|---|
| Popular completions | Highest-volume query completions for the typed prefix |
| Trending suggestions | Queries gaining momentum in the last 24–72 hours |
| Category shortcuts | Direct links to relevant category stores based on the prefix |
| In-session context | Suggestions adapt based on what the user has already searched or browsed in the current session |
| Zero-prefix state | Before any character is typed, the system surfaces personalised starting suggestions based on user history and platform trends |
AutoSuggest has its own ranking model — it is not simply a UI feature but a ranked retrieval problem in its own right.
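At its simplest, the popular-completions feature is a prefix lookup ranked by query volume. The sketch below covers only that one signal; the `QUERY_COUNTS` data is hypothetical, and trending, session, and personalisation signals are left out.

```python
# Hypothetical query log aggregates: query -> search volume.
QUERY_COUNTS = {
    "running shoes": 1200,
    "running shorts": 300,
    "rain jacket": 800,
    "laptop": 5000,
}


def suggest(prefix, limit=3):
    """Return the most popular queries starting with the typed prefix."""
    typed = prefix.lower()
    matches = [q for q in QUERY_COUNTS if q.startswith(typed)]
    return sorted(matches, key=lambda q: QUERY_COUNTS[q], reverse=True)[:limit]


print(suggest("run"))
print(suggest(""))  # zero-prefix state falls back to overall popularity
```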
6.2 Sort and Filter
Sort options and faceted filters allow users to refine results after the initial search.
Sort options:
| Option | Description |
|---|---|
| Relevance | Default — system's ranked order balancing all ranking signals |
| Price: Low to High / High to Low | Price-sorted results |
| Popularity | Sorted by UPI / purchase volume |
| Rating | Sorted by average review rating |
| Newest First | Recently added products surfaced at the top |
Filters: Filter sets are Store-specific and generated dynamically from the index. Electronics surfaces RAM, storage, screen size, and connectivity filters. Fashion surfaces size, color, fit, and material filters. Filter quality depends directly on catalog attribute completeness: products with sparse attributes do not contribute to filter facets and effectively become unfilterable.
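Dynamic facet generation amounts to counting attribute values across the products in a result set. In this sketch the product records and the `attributes` field are hypothetical; it shows why a product with no attributes contributes nothing to the filters.

```python
from collections import Counter, defaultdict


def build_facets(products):
    """Count attribute values across products to form filter facets."""
    facets = defaultdict(Counter)
    for product in products:
        for attr, value in product.get("attributes", {}).items():
            if value is not None and value != "":
                facets[attr][value] += 1
    return {attr: dict(counts) for attr, counts in facets.items()}


laptops = [
    {"sku": "L1", "attributes": {"ram": "16GB", "storage": "512GB"}},
    {"sku": "L2", "attributes": {"ram": "8GB"}},
    {"sku": "L3", "attributes": {"ram": "16GB"}},
    {"sku": "L4", "attributes": {}},  # sparse product: invisible to filters
]
print(build_facets(laptops))
```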
6.3 Snippets
Snippets are structured information cards displayed on each product result — showing key attributes, price, delivery estimate, and rating without requiring the user to click through to the product page.
Snippets reduce click-through friction and help users make faster decisions at the results level. Snippet quality scales directly with catalog attribute richness: products with complete, structured attributes produce rich snippets; sparse products produce empty or minimal cards.
6.4 Spotlights
Spotlights are dynamic callout labels displayed on product cards that communicate urgency and value signals at a glance:
| Spotlight Type | Trigger |
|---|---|
| Limited Stock | Product inventory falls below a defined threshold |
| New | Product was recently added to the catalog |
| More 4 Less | Product is part of a promotional or value offer |
Spotlight signals are dynamic — they are computed in real time from inventory and promotions data rather than being manually applied labels.
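Computing Spotlights from live data rather than manual labels can be sketched as a set of rules over product state. The stock threshold, the 30-day "new" window, and the field names are assumptions; the document only says that thresholds are defined, not what they are.

```python
from datetime import date, timedelta

LOW_STOCK_THRESHOLD = 5   # hypothetical inventory threshold
NEW_WINDOW_DAYS = 30      # hypothetical definition of "recently added"


def spotlights(product, today):
    """Derive Spotlight labels from inventory and promotion data."""
    labels = []
    if 0 < product["stock"] < LOW_STOCK_THRESHOLD:
        labels.append("Limited Stock")
    if today - product["added_on"] <= timedelta(days=NEW_WINDOW_DAYS):
        labels.append("New")
    if product.get("on_promotion"):
        labels.append("More 4 Less")
    return labels


today = date(2026, 4, 1)
product = {"stock": 3, "added_on": date(2026, 3, 20), "on_promotion": False}
print(spotlights(product, today))
```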
6.5 Null Search Handling
Null search — queries that return zero results — is a top health metric for every search system. A user who hits a null result does not retry; they leave.
FCC's null search handling re-engages user intent rather than ending the session:
| Component | Description |
|---|---|
| Related category suggestions | Links to stores relevant to the failed query |
| Alternative query suggestions | "Did you mean X?" suggestions derived from similar successful queries |
| Partial match fallback | (Coming Soon) Progressive query relaxation to return near-matches when exact matches fail |
Null search rate is measured and tracked continuously. The introduction of semantic search, spell correction improvements, and the planned Partial Match module each target null rate reduction as a primary outcome metric.
6.6 Grid and List View
The results page layout adapts to the Store context:
| View | Default Context | Rationale |
|---|---|---|
| Grid | Fashion, Accessories, Home Décor | Visual-first categories where product images are the primary decision driver |
| List | Electronics, Appliances, B2B | Detail-first categories where specifications and price comparisons matter |
Publishers can configure default view preferences per Store and allow users to toggle between views.
7. Semantic Search
Overview
Semantic Search is FCC's most significant search quality investment — a dense vector retrieval layer that matches queries by meaning, not just by token overlap. It is currently live and being progressively rolled out to full traffic.
Status: ✅ Live — Rolling Rollout (30% → 100%)
How Semantic Search Works
Traditional lexical retrieval answers the question: "which products contain words that appear in the query?" Semantic retrieval answers a different question: "which products mean the same thing as what the user asked for?"
A user searching for "gym shoes" on a platform that lists products as "athletic footwear" or "training sneakers" gets no results from lexical retrieval — the words don't match. Semantic retrieval understands that these phrases describe the same thing and surfaces the right products.
| Approach | Retrieval Logic | Strength | Weakness |
|---|---|---|---|
| Lexical | Token matching against index | High precision for exact matches. Fast. | Fails on synonyms, natural language, long-tail queries |
| Semantic | Dense vector similarity by meaning | Captures intent; handles natural language and synonyms | Requires model training; higher compute cost |
| Blended (Coming Soon) | Both signals merged into a single ranked result set | Best of both — precision + recall | Requires careful merging and calibration |
The FK V5 Embedding Model
FCC's semantic retrieval is powered by the FK V5 embedding model, an LLM-augmented model trained in two stages on Flipkart's corpus of 200M+ products and years of user interaction data. The model generates dense vector representations of both queries and products, enabling semantic similarity matching at scale.
Key properties of the V5 model:
- Domain-specific training: Trained on e-commerce query and product data — not a general-purpose language model. Understanding of commerce-specific semantics (brands, categories, specifications) is built in.
- LLM augmentation: A second training stage uses LLM-generated signals to improve concept-level matching for abstract and thematic queries.
- Client adaptation: The model can be fine-tuned on client-specific catalog and query data during onboarding to improve performance for the publisher's category mix and market.
Current Rollout Status
Semantic search is currently live at 30% of traffic. The current implementation uses a position-split model — lexical retrieval fills positions 1–20 of the result set, and semantic retrieval contributes from position 21 onwards.
Next milestone — Blended Retrieval (Coming Soon): The position-split model is being replaced with true blended retrieval, where lexical and semantic signals are merged into a single unified ranking across all positions. This eliminates the artificial boundary and allows the best signal to win at every position in the result set.
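The position-split model that is currently live can be sketched as a simple concatenation. The default split of 20 comes from the description above; removing duplicates from the semantic tail, and letting semantic results start earlier when lexical retrieval returns fewer than 20 items, are assumptions of this sketch.

```python
def position_split(lexical_ranked, semantic_ranked, split=20):
    """Lexical results fill the first `split` positions, semantic results follow."""
    head = lexical_ranked[:split]
    seen = set(head)
    # Assumed behaviour: skip semantic hits already shown in the lexical head.
    tail = [pid for pid in semantic_ranked if pid not in seen]
    return head + tail


# Small split used here only to keep the example readable.
print(position_split(["a", "b", "c"], ["b", "d", "e"], split=2))
```

Blended retrieval would replace this fixed boundary with a single merged ordering, so a strong semantic match could appear in the top positions.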
What Semantic Search Improves
| Query Type | Without Semantic | With Semantic |
|---|---|---|
| Synonym queries | "hiking footwear" returns no results if catalog uses "trekking shoes" | Returns relevant products regardless of terminology used |
| Natural-language queries | "something to charge my laptop fast" returns null or unrelated results | Maps to fast chargers and USB-C adapters correctly |
| Long-tail queries | Multi-token descriptive queries frequently hit null | Meaning-based matching finds relevant products even for unusual phrasing |
| Thematic queries | "home office setup" fails lexical matching | Returns monitors, chairs, desks, and accessories through semantic clustering |
Primary outcome metric: Null search rate reduction. Secondary metrics: result quality uplift for long-tail queries (measured by position-level click data) and conversion rate improvement for semantic-served result positions.
8. Capability Summary
| Capability | Pillar | Status |
|---|---|---|
| Store Tree (Demand-Side Taxonomy) | Organize | ✅ Live |
| Real-Time Price / Stock Indexing | Organize | ✅ Live |
| Batch Attribute / Catalog Indexing | Organize | ✅ Live |
| DNA — Do Not Augment Gate | Understand | ✅ Live |
| Spell Correction | Understand | ✅ Live |
| Store Classification (ML Model) | Understand | ✅ Live |
| Intent Understanding | Understand | ✅ Live |
| Lexical Retrieval | Understand | ✅ Live |
| Semantic Retrieval (V5 Model) | Understand | ✅ Live — Rolling (30% → 100%) |
| Query Rewrite / Query Expansion | Understand | 🔜 Coming Soon |
| Quantifier Extraction (Numeric Constraints) | Understand | 🔜 Coming Soon |
| Partial Match (Null Rescue) | Understand | 🔜 Coming Soon |
| Blended Retrieval (Lexical + Semantic) | Understand | 🔜 Coming Soon |
| L0 Coarse Sort | Rank | ✅ Live |
| L1 Collapse Sort + Product Reranking | Rank | ✅ Live |
| Configurable Ranking Signals | Rank | 🔜 In Progress |
| Result Explainability (Client-Facing) | Rank | 🔜 In Progress |
| L2 Personalisation | Rank | 🔜 Coming Soon |
| Offline Ranking Calibration | Rank | 🔜 Coming Soon |
| AutoSuggest | Experience | ✅ Live |
| Sort and Filter | Experience | ✅ Live |
| Snippets | Experience | ✅ Live |
| Spotlights | Experience | ✅ Live |
| Null Search Handling | Experience | ✅ Live |
| Grid / List View | Experience | ✅ Live |
| Natural Language / Conversational Search | Experience | 🔜 Coming Soon |
| Audience Manager Signal Integration | Experience | 🔜 Coming Soon |
| Multi-Tenant Sandbox (New Client Demo) | Platform | 🔜 In Progress |
| Search Generalisation (Multi-Tenant) | Platform | 🔜 In Progress |
9. Market Context and Strategic Vision
The Opportunity
E-commerce search is an $8B global TAM, with a realistic addressable market of $800M across FCC's primary geographies (US, UAE, Singapore, ANZ). FCC's conservative 3-year SOM target is $45M across these four markets.
| Geography | 3-Year SOM Target | Key Context |
|---|---|---|
| United States | $26M | ~1,200 ICP accounts · largest mid-market depth globally |
| ANZ | $9M | ~320 accounts · English-first · underpenetrated by specialists |
| UAE / GCC | $7M | ~180 accounts · Hybris EoL tailwind · strong SaaS appetite |
| Singapore | $3M | ~90 accounts · SEA hub · gateway to regional marketplaces |
Ideal Customer Profile (ICP)
FCC Search is optimised for mid-market, business-led retailers and marketplaces:
| Attribute | Profile |
|---|---|
| Revenue scale | $100M – $2B GMV |
| Traffic | 1M – 50M monthly sessions |
| Catalog size | 10K – 5M SKUs |
| Team | 3–15 person digital or merchandising team; small or no dedicated search engineering team |
| Categories | T1 (Electronics, Auto, B2B MRO, Pharmacy) and T2 (Fashion, Beauty, Home, Grocery) |
| Buyer posture | Business-led (CPO, Head of Digital, VP E-commerce) · wants conversion outcomes, not infrastructure knobs |
What these clients want: Outcome-first dashboards, merchandising control without engineering tickets, fast onboarding, predictable pricing, and composable integration with their existing stack.
What FCC uniquely delivers: Baseline relevance from session one without requiring client data to cold-start; transparent merchandising controls; a free sandbox for evaluation before commitment; category-specific tuning; and agent-ready architecture.
Competitive Landscape
| Competitor | Profile | FCC Differentiation |
|---|---|---|
| Algolia | Tech-led SMB, ~$15–20K ACV | FCC targets business-led buyers; Algolia's buyer is the developer. FCC wins on merchandising controls and commerce depth. |
| Constructor.io | Mid-large business-led, ~$250–400K ACV | Constructor is the lone serious specialist in FCC's segment. FCC undercuts on pricing and wins where Constructor has thin presence — UAE, Singapore, ANZ. |
| Bloomreach | Mid-large suite, ~$150–300K ACV | Heavy bundled product; slow deployments. FCC is composable — plug in what the client needs, not the whole suite. |
| SFCC / Adobe | Enterprise suite, $500K+ ACV | Bundled search add-ons of uneven quality. FCC wins when clients are actively re-platforming or dissatisfied with their suite's search. |
| Coveo | Enterprise B2B, ~$300–500K ACV | Enterprise-only; legacy orientation. FCC wins in mid-market where Coveo doesn't play. |
6 Strategic Bets
| Bet | Description |
|---|---|
| 1 — Composable Stack + Baseline Relevance | One reusable search engine across clients; usable relevance from session one without client data cold-start |
| 2 — Business Glassbox | Merchandising rules, A/B testing, revenue attribution, and result explainability — from black-box to client-legible |
| 3 — Frictionless Sales Motion | From paid 6-week MVP to free 24-hour sandbox — cut pilot-to-close time dramatically |
| 4 — Agentic Commerce Readiness | Conversational on-site search surface + MCP/AP2 gateway for off-platform AI agents. Defend the agent-era pipeline. |
| 5 — Market Tuning for US / UAE / SG / ANZ | Region-specific language models, category packs, and compliance — beat Constructor where they barely play |
| 6 — Winning Categories vs. Generalised Search | Deep category packs (electronics, fashion, horizontals) — depth beats breadth in mid-market |
The Future of Search
Core keyword search share will shrink. Three forces are reshaping discovery:
- On-platform AI discovery: Shoppers shifting from "type and scroll" to "ask and get answered." Amazon Rufus, Walmart Sparky, and Flipkart Flippi are early signals of a broad shift toward conversational on-site search.
- Off-platform AI discovery: Discovery starting in ChatGPT, Perplexity, and Gemini rather than on the retailer's site. 500M+ ChatGPT weekly active users are a pool of buyers whose first product interaction may never touch a retailer's search bar.
- Agentic commerce: Software agents that research, compare, and transact without a browser. Gartner projects ~25% of shopping happening via agents by 2028.
FCC Search's agentic commerce readiness work (Bet 4) positions clients to capture discovery across all three surfaces — not just traditional on-site search.
10. Roadmap
Q2 2026 — Active Priorities
| Bucket | Item | Status |
|---|---|---|
| Semantic Search | Full rollout to 100% traffic | ✅ Live → Rolling |
| Semantic Search | Blended retrieval (replace position-split model) | 🔜 Planned |
| Ranking | Configurable ranking signals (self-serve for clients) | 🔄 In Progress |
| Ranking | Offline → online ranking plan improvements | 🔜 Planned |
| NL Search | Natural language / conversational query support | 🔜 Planned |
| Personalisation | Audience Manager signal integration | 🔜 Planned |
| Client Transparency | Client-facing ranking and result explainability | 🔄 In Progress |
| Sales Enablement | Sandbox environment for new client evaluation | 🔄 In Progress |
Longer Arc — Search Generalisation
Beyond Q2, the strategic north star is making FCC Search a genuinely portable, multi-tenant product. This means abstracting every client-specific assumption — brand lists, store taxonomy, query patterns, ranking signal weights — into configurable, tenant-aware layers. A new client in a different country, with a different catalog and brand ecosystem, should be onboardable without rebuilding the system from scratch.
Search Generalisation is the foundational work that unlocks new client acquisition at scale.
11. Metrics and Measurement
Core Search Health Metrics
| Metric | Definition | Target Direction |
|---|---|---|
| Null Search Rate | % of queries returning zero results | Minimise |
| Query Volume | Total search queries per day / per session | Track; a drop in volume signals a UX or platform issue |
| Click-Through Rate (CTR) | % of search sessions where a user clicks a result | Maximise |
| Conversion Rate (CVR) | % of search sessions that result in a purchase | Primary business outcome metric |
| Average Order Value (AOV) | Average revenue per order from search-originated sessions | Track for uplift from personalisation features |
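
The definitions in the table above can be sketched as a small aggregation over session records. This is a minimal illustration, not an FCC reporting API: the `SearchSession` record and its field names are assumptions made for the example, and CTR and CVR are computed per search session as defined in the table.

```python
from dataclasses import dataclass, field


@dataclass
class SearchSession:
    """One search session. Field names are hypothetical, not an FCC schema."""
    queries: int                 # queries issued in the session
    null_queries: int            # queries that returned zero results
    clicked: bool                # user clicked at least one result
    order_values: list[float] = field(default_factory=list)  # revenue per order


def core_metrics(sessions: list[SearchSession]) -> dict[str, float]:
    """Compute null search rate, CTR, CVR, and AOV from session records."""
    total_queries = sum(s.queries for s in sessions)
    null_queries = sum(s.null_queries for s in sessions)
    orders = [value for s in sessions for value in s.order_values]
    n = len(sessions)
    return {
        # share of queries returning zero results
        "null_search_rate": null_queries / total_queries if total_queries else 0.0,
        # share of sessions with at least one click
        "ctr": sum(s.clicked for s in sessions) / n if n else 0.0,
        # share of sessions with at least one order
        "cvr": sum(bool(s.order_values) for s in sessions) / n if n else 0.0,
        # mean revenue per order from search-originated sessions
        "aov": sum(orders) / len(orders) if orders else 0.0,
    }
```

Query Volume is simply the `total_queries` count per reporting window, so it is not returned separately here.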
Relevance Metrics
| Metric | Definition |
|---|---|
| NDCG (Normalised Discounted Cumulative Gain) | Ranking quality metric — measures how well the ranked result list matches an ideal ordering based on relevance judgements |
| Revenue per Search | Revenue attributed to search sessions ÷ number of search sessions |
| Position-Level CTR | Click-through rate at each result position (P1, P2, ... P10). Position 1 should receive ~40% of clicks in a well-tuned system. |
| Semantic vs. Lexical Attribution | % of clicks and conversions coming from semantic-served vs. lexical-served result positions |
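
As a rough illustration of two of the metrics above, the sketch below computes NDCG for a single ranked list and the share of clicks landing on each result position. It assumes the common linear-gain form of DCG (relevance divided by log2 of rank + 1); the document does not state which NDCG variant FCC uses, and the function names are hypothetical.

```python
import math
from collections import Counter


def dcg(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg(relevances: list[float], k: int = 10) -> float:
    """DCG of the served order divided by DCG of the ideal order."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


def position_click_share(click_positions: list[int], max_position: int = 10) -> dict[int, float]:
    """Fraction of all clicks that landed on each position P1..P<max_position>."""
    counts = Counter(p for p in click_positions if 1 <= p <= max_position)
    total = sum(counts.values())
    return {
        p: (counts[p] / total if total else 0.0)
        for p in range(1, max_position + 1)
    }
```

Here `relevances` holds the judged relevance of each result in the order it was served. A list already in ideal order scores 1.0, and any misordering scores lower.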
Null Search Diagnostic Metrics
| Metric | Definition |
|---|---|
| Brand Query Null Rate | Null rate specifically for queries classified as brand-intent |
| Category Query Null Rate | Null rate for queries classified as category-intent |
| Tail Query Null Rate | Null rate for queries with low historical search volume (long-tail) |
| Store Classifier Fallback Rate | % of queries falling back to parent store due to low classification confidence |
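
The diagnostic null rates above are the same null-rate calculation applied per query segment. A minimal sketch, assuming each log entry is a `(segment, result_count)` pair where the segment label (for example "brand", "category", or "tail") has already been assigned upstream; the log shape and labels are illustrative assumptions:

```python
from collections import defaultdict
from typing import Iterable


def null_rate_by_segment(query_log: Iterable[tuple[str, int]]) -> dict[str, float]:
    """Return the share of zero-result queries within each segment."""
    totals: dict[str, int] = defaultdict(int)
    nulls: dict[str, int] = defaultdict(int)
    for segment, result_count in query_log:
        totals[segment] += 1
        if result_count == 0:
            nulls[segment] += 1
    return {segment: nulls[segment] / totals[segment] for segment in totals}
```

Comparing the segment-level rates against the overall Null Search Rate shows whether nulls are concentrated in brand, category, or long-tail queries.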
Measurement Approach
Baseline definition. Before any new feature is rolled out, a measurement baseline — null rate, CTR, CVR — must be established. FCC works with publishers to define this baseline during onboarding.
Staged rollout. New ranking and retrieval features are rolled out in traffic tranches (e.g., 10% → 30% → 100%) with client sign-off at each stage. Metrics are reviewed at each gate before the next tranche is enabled.
Query-level analysis. For suspected quality issues, FCC provides query-level debugging access showing the full processing pipeline output for any specific query — which store was classified, which products were retrieved, and how they were ranked.
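
The staged rollout described above can be sketched as two pieces: a deterministic assignment of users to traffic tranches, and a gate check against the baseline before the next tranche is enabled. This is an assumption-laden illustration, not FCC's rollout mechanism: the hashing scheme, the function names, and the `tolerance` parameter are all hypothetical.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically place a user in the first `percent` of 100 buckets.

    The same user always maps to the same bucket for a given feature, so a
    user exposed at 10% stays exposed when the tranche widens to 30%.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


def gate_passes(baseline: dict[str, float], treatment: dict[str, float],
                tolerance: float = 0.0) -> bool:
    """Allow the next tranche only if no baseline metric has regressed."""
    return (
        treatment["null_search_rate"] <= baseline["null_search_rate"] + tolerance
        and treatment["ctr"] >= baseline["ctr"] - tolerance
        and treatment["cvr"] >= baseline["cvr"] - tolerance
    )
```

In practice the gate would sit alongside client sign-off rather than replace it, and the acceptable tolerance would be agreed when the baseline is defined.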
12. Onboarding Checklist
Overview
Standard FCC Search onboarding takes approximately 4 to 8 weeks from contract finalisation to first live traffic, depending on catalog size, taxonomy complexity, and how much training data is available for model adaptation.
Phase 1: Taxonomy and Index Setup
| Task | Owner |
|---|---|
| Define Store Tree: L0 → L3 Leaf Stores for the publisher's category structure | Publisher Catalog Team + FCC |
| Map demand-side stores to supply-side verticals (which stores are products indexed into?) | Publisher Catalog Team + FCC |
| Configure filter sets per Leaf Store (which attributes to expose as facets) | Publisher Catalog Team + FCC |
| Set up real-time indexing pipeline for price and stock updates | Publisher Engineering |
| Set up batch indexing pipeline for catalog attributes and taxonomy changes | Publisher Engineering |
| Validate catalog attribute completeness for key categories (impacts snippet and filter quality) | Publisher Catalog Team + FCC |
Phase 2: Model Adaptation and Query Processing
| Task | Owner |
|---|---|
| Retrain Store Classifier on publisher's taxonomy and local query patterns | FCC Data Science |
| Configure DNA (brand/trademark) override lists for publisher's brand ecosystem | Publisher + FCC |
| Build spell correction model and initial brand override list | FCC Data Science + Publisher |
| Configure intent understanding for publisher's category and brand mix | FCC |
| Adapt semantic retrieval model (V5) for publisher's catalog distribution | FCC Data Science |
| Validate query processing pipeline end-to-end with sample query set | FCC + Publisher |
Phase 3: Ranking Configuration
| Task | Owner |
|---|---|
| Define ranking signal weights for the publisher's business context (speed, price, quality priorities) | Publisher + FCC |
| Configure Spotlight triggers (inventory thresholds, promotional signals) | Publisher + FCC |
| Configure sort options and default sort per Store | Publisher + FCC |
| Establish null search rate baseline before go-live | FCC + Publisher |
| Define measurement framework: which metrics, attribution windows, reporting cadence | Publisher + FCC |
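
To make the "ranking signal weights" task concrete, the sketch below shows one way a publisher's priorities could be expressed as weights and blended into a single score. It is purely illustrative: the signal names ("speed", "price", "quality"), the normalisation step, and the weighted-sum blend are assumptions for the example and do not describe FCC's actual ranking formula or configuration format.

```python
def normalise_weights(weights: dict[str, float]) -> dict[str, float]:
    """Scale non-negative weights so they sum to 1."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("signal weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one signal weight must be positive")
    return {name: w / total for name, w in weights.items()}


def blended_score(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of per-product signal values, each assumed to lie in [0, 1].

    A signal missing for a product contributes 0.
    """
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)
```

For example, a publisher that prioritises delivery speed might start from `{"speed": 2, "price": 1, "quality": 1}`, which normalises to 0.5 / 0.25 / 0.25.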
Phase 4: Go Live and Optimisation
| Task | Owner |
|---|---|
| Launch at partial traffic (10–30%) for initial quality validation | Publisher + FCC |
| Monitor null search rate, CTR, and CVR for the first 2 weeks | Publisher + FCC |
| Review position-level click data to assess ranking quality | FCC |
| Expand semantic search rollout based on validated metrics | FCC + Publisher |
| Plan phased rollout of Coming Soon capabilities (Query Rewrite, Partial Match, L2 Personalisation) | Publisher + FCC |
| Establish quarterly search health review cadence | Publisher + FCC |
13. Glossary
| Term | Definition |
|---|---|
| AutoSuggest | Real-time query completion suggestions displayed as the user types, drawn from popular queries, trending searches, and session context |
| Blended Retrieval | A retrieval approach where lexical and semantic signals are merged into a single unified ranked result set, rather than being applied to separate position ranges. Coming Soon. |
| Buy Box | The single best seller listing selected to represent a product in search results when multiple sellers offer the same item |
| Cascaded Ranking | A multi-stage ranking architecture where each stage applies progressively richer signals to a progressively smaller candidate set (L0 → L1 → L2) |
| CTR | Click-Through Rate — the percentage of search sessions in which a user clicks a result |
| CVR | Conversion Rate — the percentage of search sessions that result in a purchase |
| DNA | Do Not Augment — the first gate in the query processing pipeline. Protects brand names, trademarks, and product codes from modification by downstream modules. |
| FCC | Flipkart Commerce Cloud |
| FK V5 Model | FCC's domain-specific, LLM-augmented semantic embedding model trained on Flipkart's e-commerce corpus. Powers semantic retrieval. |
| Headless Search | FCC's deployment model — a search API that returns ranked result data; the publisher's storefront controls rendering and display. |
| ICP | Ideal Customer Profile — mid-market ($100M–$2B GMV) business-led retailers and marketplaces in T1/T2 categories across US, UAE, Singapore, and ANZ |
| Intent Understanding | The pipeline step that classifies a query's semantic type — brand, category, feature, or thematic — to determine appropriate retrieval and ranking treatment |
| L0 / L1 / L2 | The three stages of cascaded ranking: L0 (Coarse Sort), L1 (Collapse Sort + Product Reranking), L2 (Personalisation) |
| Leaf Store | The lowest-level node in the Store Tree taxonomy. Every product maps to a Leaf Store. The classified Leaf Store determines the result universe, available filters, and SRLP layout for a query. |
| Lexical Retrieval | Token-based matching of query terms against the product index. Precise for exact and near-exact matches. |
| MACH | Microservices, API-First, Cloud-Native SaaS, Headless — FCC Search's architectural standard |
| MCP Gateway | Model Context Protocol gateway — an interface that exposes client catalog, inventory, and pricing data to AI agents (ChatGPT, Perplexity, Gemini) for off-platform discovery. Part of the agentic commerce roadmap. |
| NDCG | Normalised Discounted Cumulative Gain — a standard ranking quality metric measuring how closely a ranked list matches an ideal relevance ordering |
| Null Search Rate | The percentage of search queries that return zero results. A primary search health metric. |
| Nymeria | FCC's planned offline simulation system for ranking calibration — runs ICU (Impressions, Clicks, Units) simulations to identify optimal signal weights. Coming Soon. |
| Partial Match | A null-rescue mechanism that progressively relaxes a query (drops tokens, loosens constraints) until a result set is found. Coming Soon. |
| PQS | Product Quality Score — a ranking signal based on product review count and average rating |
| PVS | Price Value Score — a ranking signal that boosts products priced at or below the category median |
| Quantifier Extraction | A pipeline step that extracts numeric constraints from queries (price limits, size specifications) and encodes them into the retrieval query. Coming Soon. |
| Query Rewrite | A pipeline step that expands queries with semantically equivalent terms to improve recall without losing precision. Coming Soon. |
| Sandbox | A demo-ready evaluation environment where prospective clients can plug in their catalog, run test queries, and see FCC Search results in real time — before any commercial commitment. In Progress. |
| Search Generalisation | The platform work to make FCC Search a portable, multi-tenant product by abstracting all client-specific assumptions into configurable tenant layers. In Progress. |
| Semantic Search | Dense vector retrieval that matches queries by meaning rather than token overlap. Powered by the FK V5 embedding model. |
| Snippets | Structured product information displayed on result cards (key attributes, price, delivery, rating) without requiring click-through to the product page |
| Spotlights | Dynamic callout labels on product cards (Limited Stock, New, More 4 Less) computed in real time from inventory and promotions data |
| SRLP | Search Results Listing Page — the page rendered by the publisher's storefront showing search results |
| Store Classifier | An ML model that predicts which Leaf Store a user's query belongs to. The most consequential single step in the query processing pipeline. |
| Store Tree | The demand-side taxonomy hierarchy (L0 → L3 Leaf) that organises the product index. Products are indexed into stores based on how buyers think, not just how sellers list. |
| UPI | Units Per Impression — the primary L0 and L1 ranking signal. Measures how frequently a product in a given position has historically converted to a purchase. |