Flipkart Commerce Cloud Search

Product Documentation v1.0

Built By Retailers, For Retailers

About This Document

This document is the product reference for Flipkart Commerce Cloud (FCC) Search. It covers the platform's architecture across four pillars — Organize, Understand, Rank, and Experience — the semantic search layer, capability status, strategic vision, and integration guidance for publishers and their technical teams.

Audience: Network publishers, e-commerce product managers, merchandising teams, and integration engineers.

Prerequisites: Familiarity with e-commerce concepts (product catalogs, search results pages, conversion metrics) is helpful but not required.

Contents

Introduction
Platform Architecture — The Four Pillars
Pillar 1 — Organize: Indexing and Taxonomy
Pillar 2 — Understand: Query Processing Pipeline
Pillar 3 — Rank
Pillar 4 — Experience
Semantic Search
Capability Summary
Market Context and Strategic Vision
Roadmap
Metrics and Measurement
Onboarding Checklist
Glossary

1. Introduction

Overview

FCC Search is a composable, multi-tenant e-commerce search platform built on the same engine that powers Flipkart — one of the world's largest commerce platforms by search volume. It is purpose-built for mid-to-large retailers and marketplaces seeking higher conversion on existing traffic, without the complexity and cost of building or maintaining a search infrastructure in-house.

Search is not a feature; it is the primary discovery mechanism for e-commerce. Roughly 40–55% of all units sold in e-commerce flow through organic search. Customers who search convert at a significantly higher rate than those who browse, and their average order value is higher. A customer who hits a dead end (null results, wrong results, confusing results) does not try again. They leave.

FCC Search is designed to ensure that does not happen.

The Retail Store Analogy

The simplest way to understand what a search system does is to think of a large physical retail store. Every element of the store is designed to help a shopper find what they came for: products are grouped by type, arranged on shelves in a logical order, and placed in sections that match how shoppers think. When a shopper asks a staff member for help, the staff member understands what was meant, even if the shopper used the wrong word, and guides them to the right aisle. Once in the aisle, the most popular or best-value products are at eye level, not buried at the back.

E-commerce search does all of this digitally:

Organize — How the store is laid out and how products are shelved
Understand — The staff member who interprets your query and points you in the right direction
Rank — Which products are placed at eye level vs. buried on the bottom shelf
Experience — What information is visible on the shelf tag to help you decide

FCC Search is built around these four pillars.

Who This Platform Is For

Persona	Description
Network Publishers	Retailers and marketplaces ($100M–$2B GMV, 10K–5M SKUs) seeking higher conversion from existing traffic through best-in-class search relevance
Merchandising and Category Teams	Business users who need control over what products surface for which queries — without raising engineering tickets for every change
Integration Engineers	Technical teams integrating FCC Search APIs into storefronts, mobile apps, and analytics pipelines
E-Commerce Product Managers	Teams responsible for search quality, null search reduction, and conversion optimisation across discovery surfaces

2. Platform Architecture — The Four Pillars

Overview

Every FCC Search query passes through four sequential layers. Each layer has a distinct responsibility, and together they produce a ranked, relevant result page for every user interaction.

User Types a Query
        │
        ▼
┌─────────────────────────────┐
│  PILLAR 1 — ORGANIZE        │  Products indexed into a demand-side store tree
│  (Indexing & Taxonomy)      │  Real-time + batch ingestion pipelines
└──────────┬──────────────────┘
           │
           ▼
┌─────────────────────────────┐
│  PILLAR 2 — UNDERSTAND      │  Sequential query processing pipeline
│  (Query Processing)         │  DNA → Spell → Classify → Intent → Retrieve
└──────────┬──────────────────┘
           │  Candidate Set
           ▼
┌─────────────────────────────┐
│  PILLAR 3 — RANK            │  3-stage cascaded ranking (L0 → L1 → L2)
│  (Ranking)                  │  Coarse sort → product reranking → personalisation
└──────────┬──────────────────┘
           │  Top N results
           ▼
┌─────────────────────────────┐
│  PILLAR 4 — EXPERIENCE      │  AutoSuggest, Snippets, Spotlights, Filters
│  (Search Results Page)      │  Null handling, sort, grid/list view
└──────────┬──────────────────┘
           │
           ▼
  Search Results Page (SRLP) — rendered by publisher storefront

Important: FCC Search is a headless, API-first service. It returns ranked result data via API — the publisher's storefront is responsible for rendering. This gives publishers complete control over the visual experience while FCC handles the intelligence layer.

3. Pillar 1 — Organize: Indexing and Taxonomy

Overview

Before any query can be answered, products must be indexed in a way that mirrors how shoppers think — not just how sellers list. The Organize layer structures the catalog into a demand-side taxonomy and makes every product retrievable in near real-time.

Status: ✅ Live

3.1 Store Tree (Demand-Side Taxonomy)

FCC Search organizes products into a Store Tree — a hierarchical taxonomy from L0 (All Products) down to Leaf Stores (specific product types). Every product maps to a Leaf Store, and the Leaf Store a query is classified into determines the entire universe of results returned.

All Products (L0)
├── Electronics (L1)
│   ├── Phones & Tablets (L2)
│   │   └── Smartphones (L3 ● Leaf)
│   └── Computing (L2)
│       └── Laptops (L3 ● Leaf)
├── Home & Furniture (L1)
│   └── ...
└── Fashion (L1)
    ├── Men's T-Shirts (L3 ● Leaf)
    └── Dresses (L3 ● Leaf)

The Leaf Store determines:

Which products are in the result universe for a given query
Which filters are available (Electronics shows RAM/storage; Fashion shows size/fit)
Which ranking signals are applied
How the Search Results Page (SRLP) is laid out

A wrong taxonomy classification means wrong results — always. Taxonomy quality is a direct multiplier on search quality. A product filed in the wrong store will never surface for queries that belong to the correct store.

Demand indexing vs. supply indexing. Products are indexed into stores based on how buyers think, not just the seller's vertical. A unisex t-shirt can live simultaneously in Men's, Women's, and Kids' stores. This demand-side mapping is configured by catalog and merchandising teams and is one of the highest-leverage levers in the system.
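
The tree and the demand-side mapping described above can be sketched as a small data structure. This is a minimal illustration only: the Store class, the helper functions, and the SKU identifier are hypothetical and do not represent FCC's internal schema.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """One node of a hypothetical store tree; a node with no children is a Leaf Store."""
    name: str
    children: list["Store"] = field(default_factory=list)
    product_ids: set[str] = field(default_factory=set)

    def add(self, child: "Store") -> "Store":
        self.children.append(child)
        return child


def leaves(store: Store) -> list[Store]:
    """Collect every Leaf Store under the given node, depth first."""
    if not store.children:
        return [store]
    found = []
    for child in store.children:
        found.extend(leaves(child))
    return found


def stores_for_product(root: Store, product_id: str) -> list[str]:
    """Return the names of all Leaf Stores a product is indexed into."""
    return [leaf.name for leaf in leaves(root) if product_id in leaf.product_ids]


# Illustrative tree; store names and the SKU are made up for this example.
root = Store("All Products")
fashion = root.add(Store("Fashion"))
mens = fashion.add(Store("Men's T-Shirts"))
womens = fashion.add(Store("Women's T-Shirts"))
kids = fashion.add(Store("Kids' T-Shirts"))

# Demand-side indexing: one unisex t-shirt is placed in several Leaf Stores.
for leaf in (mens, womens, kids):
    leaf.product_ids.add("sku-unisex-tee")

print(stores_for_product(root, "sku-unisex-tee"))
```

The same product appears in three result universes, which is the effect the unisex t-shirt example describes.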

3.2 Indexing Pipeline

FCC Search maintains two ingestion paths for the product index:

Path	Latency	Purpose
Real-Time Ingestion	Near-real-time	Price changes, stock updates, availability flags. Staleness here is a direct revenue risk — a product shown as in-stock that is actually out of stock undermines buyer trust immediately.
Batch Ingestion	Periodic (scheduled)	Product attribute updates, taxonomy reclassification, image changes, new catalog additions.
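
As a rough illustration of the two paths in the table, the sketch below splits a catalog update by field. The field names and the split_update helper are assumptions made for this example; the actual feed schema is not specified in this document.

```python
# Hypothetical field names for the volatile attributes handled by the real-time path.
REAL_TIME_FIELDS = {"price", "stock", "available"}


def split_update(update: dict) -> dict[str, dict]:
    """Split one product update into a real-time part and a batch part."""
    real_time = {k: v for k, v in update.items() if k in REAL_TIME_FIELDS}
    batch = {k: v for k, v in update.items() if k not in REAL_TIME_FIELDS}
    return {"real_time": real_time, "batch": batch}


print(split_update({"price": 499, "title": "Electric Kettle 1.5L"}))
```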

Catalog quality directly impacts search quality. Sparse product attributes — missing titles, incomplete specifications, absent category assignments — make products effectively invisible to relevant queries even if they are correctly indexed. Publishers should prioritise catalog completeness as a foundational dependency for search performance.

4. Pillar 2 — Understand: Query Processing Pipeline

Overview

The Query Processing Pipeline fires sequentially on every query. It transforms raw user input into a structured retrieval instruction that the index can act on. The order of steps is critical — an error in an early stage propagates through all subsequent stages and degrades the final result set.

Raw Query
    │
    ▼
[DNA]  →  [Spell Correction]  →  [Store Classification]  →  [Intent Understanding]
    │
    ▼
[Query Rewrite]  →  [Quantifier Extraction]  →  [Partial Match]
    │
    ▼
[Lexical Retrieval + Semantic Retrieval]  →  Candidate Set for Ranking

4.1 DNA — Do Not Augment Gate

Status: ✅ Live

The first step in the pipeline is a protection gate. DNA identifies brand names, trademarks, product codes, and other terms that must not be modified by any downstream module. Terms flagged by DNA are passed through unchanged: they are never spell-corrected, rewritten, or expanded.

This protection is essential: spell-correcting a brand name like "Defy" or "Cemento" into a common dictionary word corrupts everything that follows.
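
A minimal sketch of such a gate, assuming a simple lookup against a protected-term list. The list contents and the tag_tokens helper are hypothetical; a production gate would be built from the client's brand and product-code data.

```python
# Hypothetical protected-term list, using the brand examples from the text above.
PROTECTED_TERMS = {"defy", "cemento"}


def tag_tokens(query: str) -> list[tuple[str, bool]]:
    """Return (token, is_protected) pairs; protected tokens must not be altered downstream."""
    return [(token, token.lower() in PROTECTED_TERMS) for token in query.split()]


print(tag_tokens("Defy frigde"))
```

Downstream modules would then skip any token whose flag is True.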

4.2 Spell Correction

Status: ✅ Live (with ongoing improvements)

Spell Correction fixes genuine typos in the query terms that the DNA gate did not protect. The system must distinguish between a real misspelling and an unusual-looking brand name, because getting this wrong corrupts all downstream processing.

FCC uses a trained spell correction model augmented with brand-specific override lists that protect known brand names from incorrect correction. New brand terms are added to the override list to maintain protection as the client's catalog evolves.
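
The interaction between a corrector and an override list can be sketched as follows. Python's difflib stands in for the trained model here, and the vocabulary, override list, and similarity cutoff are all assumptions chosen for illustration.

```python
import difflib

# Toy vocabulary and override list, both hypothetical.
VOCABULARY = ["fridge", "freezer", "deft", "washing", "machine"]
BRAND_OVERRIDES = {"defy", "cemento"}


def correct_token(token: str, overrides: set[str] = BRAND_OVERRIDES) -> str:
    """Return the token unchanged if it is protected or known, else its closest vocabulary word."""
    lowered = token.lower()
    if lowered in overrides or lowered in VOCABULARY:
        return token
    # difflib is a stand-in for the trained model; 0.7 is an arbitrary cutoff.
    matches = difflib.get_close_matches(lowered, VOCABULARY, n=1, cutoff=0.7)
    return matches[0] if matches else token


def correct_query(query: str) -> str:
    return " ".join(correct_token(token) for token in query.split())


print(correct_query("Defy frigde"))            # brand kept, typo fixed
print(correct_token("defy", overrides=set()))  # without the override, the brand is altered
```

The second call shows why the override list matters: with no protection, the brand name is rewritten to a nearby dictionary word.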

4.3 Store Classification

Status: ✅ Live

Store Classification is an ML model that predicts which Leaf Store a query belongs to. This is the single most consequential step in the pipeline — the classified store determines the entire result universe.

Input	Output
Corrected user query	Predicted Leaf Store (e.g., "washing machine" → Washing Machines Leaf)

When the model's confidence falls below a threshold for every candidate Leaf Store, the query falls back to the parent-level store, which gives a broader result universe. This fallback trades some relevance precision for protection against null results.

FCC's Store Classifier is trained on client-specific query and catalog data. For new client onboardings, the classifier is retrained on the publisher's local query patterns, taxonomy, and brand ecosystem — a key step in the onboarding process.
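
The confidence fallback can be sketched as below. The parent mapping, the store names other than the washing machine example, and the 0.6 threshold are assumptions for illustration; the classifier scores would come from the ML model.

```python
# Hypothetical parent links for two Leaf Stores.
PARENT = {"Washing Machines": "Home Appliances", "Dryers": "Home Appliances"}


def resolve_store(scores: dict[str, float], threshold: float = 0.6) -> str:
    """Pick the most confident Leaf Store, or fall back to its parent when confidence is low."""
    best_leaf = max(scores, key=scores.get)
    if scores[best_leaf] >= threshold:
        return best_leaf
    return PARENT[best_leaf]


print(resolve_store({"Washing Machines": 0.92, "Dryers": 0.05}))
print(resolve_store({"Washing Machines": 0.40, "Dryers": 0.35}))
```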

4.4 Intent Understanding

Status: ✅ Live

Intent Understanding classifies the semantic type of the query. Different query types receive different retrieval and ranking treatment:

Intent Type	Example	Retrieval Treatment
Brand	"Nike"	Brand-filtered retrieval; brand store prioritised
Category	"running shoes"	Broad category retrieval within classified store
Feature	"earphones under 2k"	Feature + price constraint extraction
Thematic	"gym essentials"	Broad semantic matching across categories

Understanding intent allows the system to optimise both what is retrieved and how results are ranked and presented for each query type.

4.5 Query Rewrite / Query Expansion

Status: 🔜 Coming Soon

Query Rewrite expands search recall without sacrificing precision. It adds semantically equivalent terms to the retrieval query. For example, a query for "hiking footwear" would also retrieve products matching "trekking shoes" and "trail boots."

This capability is critical for tail queries where lexical matching alone fails. Without it, queries using non-standard terminology for a product category return no or poor results even when matching products exist in the catalog.
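
A minimal sketch of expansion, assuming a static synonym table. The table and the expand_query helper are hypothetical; the planned module may derive equivalents very differently.

```python
# Hypothetical synonym table seeded with the example from the text above.
SYNONYMS = {"hiking footwear": ["trekking shoes", "trail boots"]}


def expand_query(query: str) -> list[str]:
    """Return the original query followed by any equivalent phrasings to retrieve on."""
    key = query.strip().lower()
    return [key] + SYNONYMS.get(key, [])


print(expand_query("Hiking Footwear"))
```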

4.6 Quantifier Extraction (Numeric Constraint Handling)

Status: 🔜 Coming Soon

The Quantifier module extracts numeric constraints from natural-language queries and encodes them into the retrieval query. Examples:

"TVs under 30,000" → price ceiling constraint applied at retrieval
"laptops 40k–60k" → price band constraint applied at retrieval
"65-inch TV" → screen size constraint applied at retrieval

Without this capability, numeric terms in queries are treated as raw text, so the price or size constraint the user expressed is not applied to results.
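
The three examples above can be reproduced with a small regular-expression sketch. The patterns, the output keys, and the assumption that a bare numeric range is a price band are all illustrative choices, not the planned module's behaviour.

```python
import re

NUM = r"\d[\d,]*k?"  # digits with optional thousands separators and a trailing "k"
UNDER = re.compile(rf"\bunder\s+({NUM})\b", re.IGNORECASE)
BAND = re.compile(rf"\b({NUM})\s*[-–]\s*({NUM})\b", re.IGNORECASE)
INCH = re.compile(r"\b(\d+)[\s-]*inch\b", re.IGNORECASE)


def _to_number(text: str) -> int:
    """Convert '30,000' or '40k' to an integer."""
    text = text.replace(",", "").lower()
    return int(text[:-1]) * 1000 if text.endswith("k") else int(text)


def extract_constraints(query: str) -> dict[str, int]:
    """Pull price and screen-size constraints out of a free-text query."""
    constraints: dict[str, int] = {}
    if match := BAND.search(query):
        constraints["price_min"] = _to_number(match.group(1))
        constraints["price_max"] = _to_number(match.group(2))
    elif match := UNDER.search(query):
        constraints["price_max"] = _to_number(match.group(1))
    if match := INCH.search(query):
        constraints["screen_size_inches"] = int(match.group(1))
    return constraints


for example in ("TVs under 30,000", "laptops 40k–60k", "65-inch TV"):
    print(example, "->", extract_constraints(example))
```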

4.7 Partial Match

Status: 🔜 Coming Soon

Partial Match is a null-rescue mechanism. If a query containing multiple tokens returns no results after all processing steps, Partial Match progressively relaxes the query — dropping less essential tokens and loosening constraints — until a result set is found.

This directly reduces null search rate. Without it, multi-term queries that partially match the catalog return zero results even when relevant products exist.
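
Progressive relaxation can be sketched as a loop that drops one token at a time until something matches. The sketch assumes the caller supplies tokens ordered from most to least essential; how the planned module ranks token importance is not stated here, and the toy catalog is hypothetical.

```python
from typing import Callable

# Toy catalog: product id mapped to its indexed terms (hypothetical data).
CATALOG = {"p1": {"red", "cotton", "shirt"}, "p2": {"blue", "denim", "shirt"}}


def search(tokens: list[str]) -> list[str]:
    """Return products whose terms contain every query token."""
    wanted = set(tokens)
    return sorted(pid for pid, terms in CATALOG.items() if wanted <= terms)


def partial_match(tokens: list[str], search_fn: Callable[[list[str]], list[str]]) -> tuple[list[str], list[str]]:
    """Drop the least essential token (the last one) until the search returns results."""
    current = list(tokens)
    while current:
        results = search_fn(current)
        if results:
            return current, results
        current = current[:-1]
    return [], []


print(partial_match(["shirt", "red", "waterproof"], search))
```

Here the full query finds nothing, so the loop drops "waterproof" and returns the red shirt as a near match.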

4.8 Retrieval — Lexical and Semantic

Status: ✅ Live

FCC Search uses two parallel retrieval signals that fire simultaneously and whose outputs are merged before ranking:

Signal	How It Works	Strength
Lexical Retrieval	Token-match query against the product index. Matches on title, attributes, brand, and category terms.	High precision for exact and near-exact matches. Fast.
Semantic Retrieval	Dense vector match by meaning using the FK V5 embedding model (LLM-augmented). Products are matched by semantic intent, not just token overlap.	Captures intent for natural-language, long-tail, and synonym-heavy queries that lexical retrieval misses.

Both signals contribute to the candidate set passed to ranking. See Section 7 for a dedicated description of the Semantic Search layer and its current rollout status.

5. Pillar 3 — Rank

Overview

Ranking determines which products in the candidate set are shown at the top of the results page. Position 1 receives approximately 40% of clicks; position 10 receives approximately 2%. Ranking decisions have an outsized impact on both conversion and revenue outcomes.

FCC Search uses a three-stage cascaded ranking architecture, where each stage narrows candidates with progressively richer signals.

~1M Candidate Documents
        │
        ▼
┌────────────────────────────────────────────────┐
│  L0 — Coarse Sort                              │  ✅ Live
│  Primary signal: Units Per Impression (UPI)    │
│  Fast, broad filter. Output: ~10K candidates   │
└──────────────────────┬─────────────────────────┘
                       │
                       ▼
┌────────────────────────────────────────────────┐
│  L1 — Collapse Sort + Product Reranking        │  ✅ Live
│  Buy box selection (1 best listing per product)│
│  Speed + Quality + Price + Seller signals      │
│  Output: top 1,000 products                    │
└──────────────────────┬─────────────────────────┘
                       │
                       ▼
┌────────────────────────────────────────────────┐
│  L2 — Personalisation                          │  🔜 Coming Soon
│  Near-real-time session + purchase history     │
│  Top 120 results shown on SRLP                 │
└────────────────────────────────────────────────┘

5.1 L0 — Coarse Sort

Status: ✅ Live

L0 applies a fast, broad scoring pass to the full candidate set retrieved from the index. The primary signal is UPI (Units Per Impression), a measure of how often an impression of a product has historically led to a purchased unit. L0 eliminates clearly irrelevant or low-quality candidates before the more expensive L1 signals are applied.
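
A minimal sketch of a coarse sort, assuming UPI is computed as units sold divided by impressions and that L0 simply keeps the highest-scoring candidates. The record fields and the keep parameter are hypothetical.

```python
import heapq


def upi(units: int, impressions: int) -> float:
    """Units Per Impression; products never shown score zero."""
    return units / impressions if impressions else 0.0


def coarse_sort(candidates: list[dict], keep: int) -> list[dict]:
    """Keep the `keep` candidates with the highest UPI, best first."""
    return heapq.nlargest(keep, candidates, key=lambda c: upi(c["units"], c["impressions"]))


candidates = [
    {"id": "a", "units": 5, "impressions": 100},
    {"id": "b", "units": 30, "impressions": 200},
    {"id": "c", "units": 0, "impressions": 0},
]
print([c["id"] for c in coarse_sort(candidates, keep=2)])
```

Using a heap-based top-k keeps this stage cheap relative to fully sorting a very large candidate set.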

5.2 L1 — Collapse Sort and Product Reranking

Status: ✅ Live

L1 performs two operations:

Buy Box Selection: For any product sold by multiple sellers, L1 selects the single best listing to represent the product in results — the "buy box" winner. This collapses multiple seller listings of the same product into one result card.

Product Reranking: The collapsed product list is re-scored using a richer signal set:

Ranking Signal	Description
UPI / fRPI	Units Per Impression with time-decay. Penalises products whose historical popularity is stale; gives recently popular products a fair weighting.
Speed	Delivery speed to the user's location. Faster delivery receives a ranking boost — a key commercial signal for platforms where delivery SLA is a differentiator.
Listing Quality Score (LQS)	Completeness of the product listing: title richness, image count, attribute fill rate. Incomplete listings are demoted.
Product Quality Score (PQS)	Review count and average rating. High-quality, well-reviewed products are surfaced above equivalent products with sparse or poor reviews.
Price Value Score (PVS)	Competitive pricing relative to the category median. Products priced at or below the category median receive a ranking boost.
Seller Tier	Seller's historical fulfilment reliability, return rate, and responsiveness. Higher-tier sellers receive a ranking advantage.

Configurable signal weights: L1 signal weights are configurable per publisher — different weight profiles can be applied for different business contexts (e.g., sale events, category promotions). A self-serve ranking configuration layer is currently in development to allow business users to adjust these weights without engineering intervention.
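
The two L1 operations can be sketched together as a weighted sum over the signals in the table. The weight values, the assumption that signals are normalised to 0..1, and the use of the same score for buy box selection are all illustrative; the production scoring function is not described in this document.

```python
# Assumed weight profile; a publisher could swap in a different one for a sale event.
DEFAULT_WEIGHTS = {"upi": 0.40, "speed": 0.20, "lqs": 0.15, "pqs": 0.15, "pvs": 0.05, "seller_tier": 0.05}


def score(listing: dict, weights: dict[str, float]) -> float:
    """Weighted sum of a listing's normalised signals."""
    return sum(weight * listing["signals"][name] for name, weight in weights.items())


def collapse_and_rerank(listings: list[dict], weights: dict[str, float] = DEFAULT_WEIGHTS) -> list[dict]:
    """Keep the best listing per product (buy box), then sort products by score."""
    best_per_product: dict[str, dict] = {}
    for listing in listings:
        current = best_per_product.get(listing["product_id"])
        if current is None or score(listing, weights) > score(current, weights):
            best_per_product[listing["product_id"]] = listing
    return sorted(best_per_product.values(), key=lambda item: score(item, weights), reverse=True)


def make_listing(product_id: str, seller: str, **overrides: float) -> dict:
    """Build a listing whose signals default to 0.5 unless overridden."""
    signals = {name: 0.5 for name in DEFAULT_WEIGHTS}
    signals.update(overrides)
    return {"product_id": product_id, "seller": seller, "signals": signals}


listings = [
    make_listing("P1", "seller-a"),
    make_listing("P1", "seller-b", speed=0.9),  # faster delivery wins the buy box for P1
    make_listing("P2", "seller-c", upi=0.8),
]
ranked = collapse_and_rerank(listings)
print([(item["product_id"], item["seller"]) for item in ranked])
```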

5.3 L2 — Personalisation

Status: 🔜 Coming Soon

L2 applies personalisation signals to the top 120 products from L1 to produce the final ranking shown on the SRLP. It uses near-real-time signals from the user's current session and historical purchase behaviour to re-order results for that specific user.

L2 personalisation is expected to deliver meaningful uplift for returning users and high-frequency shoppers, where session context strongly predicts purchase intent.

5.4 Offline Ranking Calibration

Status: 🔜 Coming Soon

FCC is building an offline simulation system that runs ICU (Impressions, Clicks, Units) simulations to identify optimal signal weight configurations for L0/L1 ranking. This system will replace the current manual approach to signal weight setting with a principled, data-driven calibration process that can be run after every major catalog or traffic change.

6. Pillar 4 — Experience

Overview

The Experience layer covers everything the user sees — before, during, and after typing a query. A perfectly ranked result set that is hard to navigate, filter, or understand still fails. The Experience layer ensures the right information reaches the user in a way that makes choosing easy.

Status: ✅ Live (all components in this section)

6.1 AutoSuggest

AutoSuggest provides real-time query completions as the user types. Suggestions are drawn from a combination of popular queries, trending searches, past user queries, and category shortcuts.

Feature	Description
Popular completions	Highest-volume query completions for the typed prefix
Trending suggestions	Queries gaining momentum in the last 24–72 hours
Category shortcuts	Direct links to relevant category stores based on the prefix
In-session context	Suggestions adapt based on what the user has already searched or browsed in the current session
Zero-prefix state	Before any character is typed, the system surfaces personalised starting suggestions based on user history and platform trends

AutoSuggest has its own ranking model — it is not simply a UI feature but a ranked retrieval problem in its own right.
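
As a simple illustration of the popular-completions feature only, the sketch below ranks prefix matches by query volume. The query log and counts are made up, and the live ranking model uses more signals than volume.

```python
# Hypothetical query volumes.
QUERY_COUNTS = {"running shoes": 900, "running shorts": 400, "rucksack": 250, "rain jacket": 300}


def suggest(prefix: str, limit: int = 3) -> list[str]:
    """Return the highest-volume queries that start with the typed prefix."""
    prefix = prefix.lower()
    matches = [query for query in QUERY_COUNTS if query.startswith(prefix)]
    return sorted(matches, key=lambda query: (-QUERY_COUNTS[query], query))[:limit]


print(suggest("ru"))
```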

6.2 Sort and Filter

Sort options and faceted filters allow users to refine results after the initial search.

Sort options:

Option	Description
Relevance	Default — system's ranked order balancing all ranking signals
Price: Low to High / High to Low	Price-sorted results
Popularity	Sorted by UPI / purchase volume
Rating	Sorted by average review rating
Newest First	Recently added products surfaced at the top

Filters: Filter sets are Store-specific and generated dynamically from the index. Electronics surfaces RAM, storage, screen size, and connectivity filters. Fashion surfaces size, color, fit, and material filters. The filter quality is directly dependent on catalog attribute completeness — products with sparse attributes do not contribute to filter facets and effectively become unfilterable.
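
The dependence of filters on attribute completeness can be shown with a small facet-counting sketch. The product records and the build_facets helper are hypothetical.

```python
from collections import Counter, defaultdict


def build_facets(products: list[dict], facet_attributes: list[str]) -> dict[str, dict[str, int]]:
    """Count attribute values across products; missing attributes contribute nothing."""
    facets: dict[str, Counter] = defaultdict(Counter)
    for product in products:
        for attribute in facet_attributes:
            value = product.get("attributes", {}).get(attribute)
            if value is not None:
                facets[attribute][value] += 1
    return {attribute: dict(counts) for attribute, counts in facets.items()}


laptops = [
    {"id": "l1", "attributes": {"ram": "8 GB", "storage": "256 GB"}},
    {"id": "l2", "attributes": {"ram": "16 GB"}},
    {"id": "l3", "attributes": {}},  # sparse listing: appears under no filter value
]
print(build_facets(laptops, ["ram", "storage"]))
```

Product l3 matches the store but can never be reached through a RAM or storage filter, which is the "unfilterable" case described above.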

6.3 Snippets

Snippets are structured information cards displayed on each product result — showing key attributes, price, delivery estimate, and rating without requiring the user to click through to the product page.

Snippets reduce click-through friction and help users make faster decisions at the results level. Snippet quality scales directly with catalog attribute richness: products with complete, structured attributes produce rich snippets; sparse products produce empty or minimal cards.

6.4 Spotlights

Spotlights are dynamic callout labels displayed on product cards that communicate urgency and value signals at a glance:

Spotlight Type	Trigger
Limited Stock	Product inventory falls below a defined threshold
New	Product was recently added to the catalog
More 4 Less	Product is part of a promotional or value offer

Spotlight signals are dynamic — they are computed in real time from inventory and promotions data rather than being manually applied labels.
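
A sketch of how such labels could be derived from live data. The threshold of 5 units, the 30-day window for "New", and the record fields are assumptions for illustration.

```python
from datetime import date, timedelta

LOW_STOCK_THRESHOLD = 5  # assumed threshold
NEW_WINDOW_DAYS = 30     # assumed definition of "recently added"


def spotlights(product: dict, today: date) -> list[str]:
    """Compute spotlight labels from inventory, catalog age, and promotion data."""
    labels = []
    if 0 < product["stock"] < LOW_STOCK_THRESHOLD:
        labels.append("Limited Stock")
    if today - product["added_on"] <= timedelta(days=NEW_WINDOW_DAYS):
        labels.append("New")
    if product.get("on_promotion"):
        labels.append("More 4 Less")
    return labels


today = date(2026, 4, 10)
print(spotlights({"stock": 3, "added_on": date(2026, 4, 1), "on_promotion": True}, today))
```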

6.5 Null Search Handling

Null search — queries that return zero results — is a top health metric for every search system. A user who hits a null result does not retry; they leave.

FCC's null search handling re-engages user intent rather than ending the session:

Component	Description
Related category suggestions	Links to stores relevant to the failed query
Alternative query suggestions	"Did you mean X?" suggestions derived from similar successful queries
Partial match fallback	(Coming Soon) Progressive query relaxation to return near-matches when exact matches fail

Null search rate is measured and tracked continuously. The introduction of semantic search, spell correction improvements, and the planned Partial Match module each target null rate reduction as a primary outcome metric.

6.6 Grid and List View

The results page layout adapts to the Store context:

View	Default Context	Rationale
Grid	Fashion, Accessories, Home Décor	Visual-first categories where product images are the primary decision driver
List	Electronics, Appliances, B2B	Detail-first categories where specifications and price comparisons matter

Publishers can configure default view preferences per Store and allow users to toggle between views.

7. Semantic Search

Overview

Semantic Search is FCC's most significant search quality investment — a dense vector retrieval layer that matches queries by meaning, not just by token overlap. It is currently live and being progressively rolled out to full traffic.

Status: ✅ Live (progressive rollout, 30% → 100%)

How Semantic Search Works

Traditional lexical retrieval answers the question: "which products contain words that appear in the query?" Semantic retrieval answers a different question: "which products mean the same thing as what the user asked for?"

A user searching for "gym shoes" on a platform that lists products as "athletic footwear" or "training sneakers" gets no results from lexical retrieval — the words don't match. Semantic retrieval understands that these phrases describe the same thing and surfaces the right products.

Approach	Retrieval Logic	Strength	Weakness
Lexical	Token matching against index	High precision for exact matches. Fast.	Fails on synonyms, natural language, long-tail queries
Semantic	Dense vector similarity by meaning	Captures intent; handles natural language and synonyms	Requires model training; higher compute cost
Blended (Coming Soon)	Both signals merged into a single ranked result set	Best of both — precision + recall	Requires careful merging and calibration

The FK V5 Embedding Model

FCC's semantic retrieval is powered by the FK V5 embedding model, an LLM-augmented model, trained in two stages, built on Flipkart's corpus of 200M+ products and years of user interaction data. The model generates dense vector representations of both queries and products, enabling semantic similarity matching at scale.

Key properties of the V5 model:

Domain-specific training: Trained on e-commerce query and product data — not a general-purpose language model. Understanding of commerce-specific semantics (brands, categories, specifications) is built in.
LLM augmentation: A second training stage uses LLM-generated signals to improve concept-level matching for abstract and thematic queries.
Client adaptation: The model can be fine-tuned on client-specific catalog and query data during onboarding to improve performance for the publisher's category mix and market.
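
Similarity matching over dense vectors can be illustrated with cosine similarity on toy embeddings. The three-dimensional vectors below are invented for the example and are not outputs of the V5 model, and cosine similarity is an assumed choice of metric.

```python
import math

# Invented toy embeddings; real embeddings have far more dimensions.
PRODUCT_VECTORS = {
    "athletic footwear": [0.9, 0.1, 0.0],
    "training sneakers": [0.8, 0.2, 0.1],
    "espresso machine": [0.0, 0.1, 0.9],
}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vector: list[float], top_k: int = 2) -> list[str]:
    """Return the products whose vectors are closest in direction to the query vector."""
    ranked = sorted(PRODUCT_VECTORS, key=lambda name: cosine(query_vector, PRODUCT_VECTORS[name]), reverse=True)
    return ranked[:top_k]


gym_shoes_vector = [0.85, 0.15, 0.05]  # stand-in for the embedded query "gym shoes"
print(semantic_search(gym_shoes_vector))
```

No token is shared between "gym shoes" and either footwear product, yet both are retrieved because their vectors point in nearly the same direction.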

Current Rollout Status

Semantic search is currently live at 30% of traffic. The current implementation uses a position-split model — lexical retrieval fills positions 1–20 of the result set, and semantic retrieval contributes from position 21 onwards.
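
The position-split model can be sketched as a simple list merge. Deduplicating semantic results that already appear in the lexical head, and letting semantic results start earlier when fewer than 20 lexical results exist, are assumptions of this sketch.

```python
def position_split(lexical: list[str], semantic: list[str], lexical_slots: int = 20) -> list[str]:
    """Fill the first positions from lexical retrieval, then append unseen semantic results."""
    head = lexical[:lexical_slots]
    seen = set(head)
    tail = [product_id for product_id in semantic if product_id not in seen]
    return head + tail


lexical = [f"lex{i}" for i in range(25)]
semantic = ["lex3", "sem1", "sem2"]
merged = position_split(lexical, semantic)
print(merged[18:])
```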

Next milestone — Blended Retrieval (Coming Soon): The position-split model is being replaced with true blended retrieval, where lexical and semantic signals are merged into a single unified ranking across all positions. This eliminates the artificial boundary and allows the best signal to win at every position in the result set.
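
How the two signals will be merged is not specified in this document. One widely used technique for combining ranked lists is reciprocal rank fusion, sketched below purely as an illustration of what a unified ranking could look like; it is not presented as FCC's method, and the constant k = 60 is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Score each item by the sum of 1 / (k + rank) over every list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, product_id in enumerate(ranking, start=1):
            scores[product_id] = scores.get(product_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda product_id: (-scores[product_id], product_id))


lexical = ["a", "b", "c"]
semantic = ["c", "a", "d"]
print(reciprocal_rank_fusion([lexical, semantic]))
```

Items found by both retrievers rise to the top, and a semantic-only item can outrank nothing or everything depending on its rank, with no fixed position boundary.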

What Semantic Search Improves

Query Type	Without Semantic	With Semantic
Synonym queries	"hiking footwear" returns no results if catalog uses "trekking shoes"	Returns relevant products regardless of terminology used
Natural-language queries	"something to charge my laptop fast" returns null or unrelated results	Maps to fast chargers and USB-C adapters correctly
Long-tail queries	Multi-token descriptive queries frequently hit null	Meaning-based matching finds relevant products even for unusual phrasing
Thematic queries	"home office setup" fails lexical matching	Returns monitors, chairs, desks, and accessories through semantic clustering

Primary outcome metric: Null search rate reduction. Secondary metrics: result quality uplift for long-tail queries (measured by position-level click data) and conversion rate improvement for semantic-served result positions.

8. Capability Summary

Capability	Pillar	Status
Store Tree (Demand-Side Taxonomy)	Organize	✅ Live
Real-Time Price / Stock Indexing	Organize	✅ Live
Batch Attribute / Catalog Indexing	Organize	✅ Live
DNA — Do Not Augment Gate	Understand	✅ Live
Spell Correction	Understand	✅ Live
Store Classification (ML Model)	Understand	✅ Live
Intent Understanding	Understand	✅ Live
Lexical Retrieval	Understand	✅ Live
Semantic Retrieval (V5 Model)	Understand	✅ Live — Rolling (30% → 100%)
Query Rewrite / Query Expansion	Understand	🔜 Coming Soon
Quantifier Extraction (Numeric Constraints)	Understand	🔜 Coming Soon
Partial Match (Null Rescue)	Understand	🔜 Coming Soon
Blended Retrieval (Lexical + Semantic)	Understand	🔜 Coming Soon
L0 Coarse Sort	Rank	✅ Live
L1 Collapse Sort + Product Reranking	Rank	✅ Live
Configurable Ranking Signals	Rank	🔄 In Progress
Result Explainability (Client-Facing)	Rank	🔄 In Progress
L2 Personalisation	Rank	🔜 Coming Soon
Offline Ranking Calibration	Rank	🔜 Coming Soon
AutoSuggest	Experience	✅ Live
Sort and Filter	Experience	✅ Live
Snippets	Experience	✅ Live
Spotlights	Experience	✅ Live
Null Search Handling	Experience	✅ Live
Grid / List View	Experience	✅ Live
Natural Language / Conversational Search	Experience	🔜 Coming Soon
Audience Manager Signal Integration	Experience	🔜 Coming Soon
Multi-Tenant Sandbox (New Client Demo)	Platform	🔄 In Progress
Search Generalisation (Multi-Tenant)	Platform	🔄 In Progress

9. Market Context and Strategic Vision

The Opportunity

E-commerce search is an $8B global TAM, with a realistic addressable market of $800M across FCC's primary geographies (US, UAE, Singapore, ANZ). FCC's conservative 3-year SOM target is $45M across these four markets.

Geography	3-Year SOM Target	Key Context
United States	$26M	~1,200 ICP accounts · largest mid-market depth globally
ANZ	$9M	~320 accounts · English-first · underpenetrated by specialists
UAE / GCC	$7M	~180 accounts · Hybris EoL tailwind · strong SaaS appetite
Singapore	$3M	~90 accounts · SEA hub · gateway to regional marketplaces

Ideal Customer Profile (ICP)

FCC Search is optimised for mid-market, business-led retailers and marketplaces:

Attribute	Profile
Revenue scale	$100M – $2B GMV
Traffic	1M – 50M monthly sessions
Catalog size	10K – 5M SKUs
Team	3–15 person digital or merchandising team; small or no dedicated search engineering team
Categories	T1 (Electronics, Auto, B2B MRO, Pharmacy) and T2 (Fashion, Beauty, Home, Grocery)
Buyer posture	Business-led (CPO, Head of Digital, VP E-commerce) · wants conversion outcomes, not infrastructure knobs

What these clients want: Outcome-first dashboards, merchandising control without engineering tickets, fast onboarding, predictable pricing, and composable integration with their existing stack.

What FCC uniquely delivers: Baseline relevance from session one without requiring client data to cold-start; transparent merchandising controls; a free sandbox for evaluation before commitment; category-specific tuning; and agent-ready architecture.

Competitive Landscape

Competitor	Profile	FCC Differentiation
Algolia	Tech-led SMB, ~$15–20K ACV	FCC targets business-led buyers; Algolia's buyer is the developer. FCC wins on merchandising controls and commerce depth.
Constructor.io	Mid-large business-led, ~$250–400K ACV	Constructor is the lone serious specialist in FCC's segment. FCC undercuts on pricing and wins where Constructor has thin presence — UAE, Singapore, ANZ.
Bloomreach	Mid-large suite, ~$150–300K ACV	Heavy bundled product; slow deployments. FCC is composable — plug in what the client needs, not the whole suite.
SFCC / Adobe	Enterprise suite, $500K+ ACV	Bundled search add-ons of uneven quality. FCC wins when clients are actively re-platforming or dissatisfied with their suite's search.
Coveo	Enterprise B2B, ~$300–500K ACV	Enterprise-only; legacy orientation. FCC wins in mid-market where Coveo doesn't play.

Six Strategic Bets

Bet	Description
1 — Composable Stack + Baseline Relevance	One reusable search engine across clients; usable relevance from session one without client data cold-start
2 — Business Glassbox	Merchandising rules, A/B testing, revenue attribution, and result explainability — from black-box to client-legible
3 — Frictionless Sales Motion	From paid 6-week MVP to free 24-hour sandbox — cut pilot-to-close time dramatically
4 — Agentic Commerce Readiness	Conversational on-site search surface + MCP/AP2 gateway for off-platform AI agents. Defend the agent-era pipeline.
5 — Market Tuning for US / UAE / SG / ANZ	Region-specific language models, category packs, and compliance — beat Constructor where they barely play
6 — Winning Categories vs. Generalised Search	Deep category packs (electronics, fashion, horizontals) — depth beats breadth in mid-market

The Future of Search

Core keyword search share will shrink. Three forces are reshaping discovery:

On-platform AI discovery: Shoppers shifting from "type and scroll" to "ask and get answered." Amazon Rufus, Walmart Sparky, and Flipkart Flippi are early signals of a broad shift toward conversational on-site search.
Off-platform AI discovery: Discovery starting in ChatGPT, Perplexity, and Gemini rather than on the retailer's site. 500M+ ChatGPT weekly active users are a pool of buyers whose first product interaction may never touch a retailer's search bar.
Agentic commerce: Software agents that research, compare, and transact without a browser. Gartner projects ~25% of shopping happening via agents by 2028.

FCC Search's agentic commerce readiness work (Bet 4) positions clients to capture discovery across all three surfaces — not just traditional on-site search.

10. Roadmap

Q2 2026 — Active Priorities

Bucket	Item	Status
Semantic Search	Full rollout to 100% traffic	✅ Live → Rolling
Semantic Search	Blended retrieval (replace position-split model)	🔜 Planned
Ranking	Configurable ranking signals (self-serve for clients)	🔄 In Progress
Ranking	Offline → online ranking plan improvements	🔜 Planned
NL Search	Natural language / conversational query support	🔜 Planned
Personalisation	Audience Manager signal integration	🔜 Planned
Client Transparency	Client-facing ranking and result explainability	🔄 In Progress
Sales Enablement	Sandbox environment for new client evaluation	🔄 In Progress

Longer Arc — Search Generalisation

Beyond Q2, the strategic north star is making FCC Search a genuinely portable, multi-tenant product. This means abstracting every client-specific assumption — brand lists, store taxonomy, query patterns, ranking signal weights — into configurable, tenant-aware layers. A new client in a different country, with a different catalog and brand ecosystem, should be onboardable without rebuilding the system from scratch.

Search Generalisation is the foundational work that unlocks new client acquisition at scale.
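
As an illustration of what a configurable, tenant-aware layer might look like, the sketch below layers per-tenant overrides on top of shared defaults. It is not FCC's actual configuration model: the TenantConfig class, its field names, the default weight values, and the helper functions are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass
class TenantConfig:
    """Hypothetical per-tenant settings; field names are illustrative only."""
    tenant_id: str
    protected_brands: frozenset = frozenset()  # brand terms kept lowercase
    ranking_weights: dict = field(default_factory=dict)  # overrides only


# Shared defaults that every tenant inherits (values are made up).
DEFAULT_WEIGHTS = {"upi": 0.6, "pqs": 0.2, "pvs": 0.2}


def resolve_weights(config: TenantConfig) -> dict:
    """Apply tenant overrides on top of the shared defaults."""
    return {**DEFAULT_WEIGHTS, **config.ranking_weights}


def is_protected(config: TenantConfig, term: str) -> bool:
    """Check a term against the tenant's own brand list, not a global one."""
    return term.lower() in config.protected_brands
```

Under this design a new tenant supplies only what differs from the defaults, for example TenantConfig("acme", frozenset({"acme audio"}), {"pvs": 0.4}), and nothing in the shared code path names a specific client.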

11. Metrics and Measurement

Core Search Health Metrics

Metric	Definition	Target Direction
Null Search Rate	% of queries returning zero results	Minimise
Query Volume	Total search queries per day / per session	Track; a drop in volume signals a UX or platform issue
Click-Through Rate (CTR)	% of search sessions where a user clicks a result	Maximise
Conversion Rate (CVR)	% of search sessions that result in a purchase	Primary business outcome metric
Average Order Value (AOV)	Average revenue per order from search-originated sessions	Track for uplift from personalisation features
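
The definitions in the table above can be computed from session-level logs roughly as follows. This is a minimal sketch: the SearchSession record and its fields are assumptions rather than an FCC log schema, and it assumes at most one order per search session.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SearchSession:
    """Hypothetical session record; not an FCC log format."""
    queries: int         # search queries issued in the session
    null_queries: int    # queries that returned zero results
    clicked: bool        # user clicked at least one result
    order_value: float   # revenue from the session, 0.0 if no purchase


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when there is no data."""
    return numerator / denominator if denominator else 0.0


def core_metrics(sessions: Iterable[SearchSession]) -> dict:
    sessions = list(sessions)
    purchases = [s.order_value for s in sessions if s.order_value > 0]
    return {
        # Null rate is per query; CTR and CVR are per search session.
        "null_search_rate": _ratio(sum(s.null_queries for s in sessions),
                                   sum(s.queries for s in sessions)),
        "ctr": _ratio(sum(s.clicked for s in sessions), len(sessions)),
        "cvr": _ratio(len(purchases), len(sessions)),
        "aov": _ratio(sum(purchases), len(purchases)),
    }
```

Note the different denominators: Null Search Rate divides by queries, while CTR and CVR divide by sessions, so the three rates should not be compared directly.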

Relevance Metrics

Metric	Definition
NDCG (Normalised Discounted Cumulative Gain)	Ranking quality metric — measures how well the ranked result list matches an ideal ordering based on relevance judgements
Revenue per Search	Revenue attributed to search sessions ÷ number of search sessions
Position-Level CTR	Click-through rate at each result position (P1, P2, ... P10). Position 1 should receive ~40% of clicks in a well-tuned system.
Semantic vs. Lexical Attribution	% of clicks and conversions coming from semantic-served vs. lexical-served result positions
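
For readers unfamiliar with NDCG, the sketch below shows one common formulation, using linear gain and a log2 position discount, alongside a simple position-level CTR calculation. The document does not state which NDCG variant FCC uses, so treat the gain function, the cutoff k, and the input shapes as assumptions. The ideal ordering here is derived from the judged results in the list itself.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each grade is discounted by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=10):
    """NDCG@k for relevance grades listed in the order the results were shown."""
    ideal_dcg = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0


def position_ctr(impressions, clicks):
    """Clicks divided by impressions at each position (both keyed by position)."""
    return {pos: clicks.get(pos, 0) / shown
            for pos, shown in impressions.items() if shown}
```

A list already in ideal order scores 1.0, and any ordering that places a lower-graded result above a higher-graded one scores below 1.0.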

Null Search Diagnostic Metrics

Metric	Definition
Brand Query Null Rate	Null rate specifically for queries classified as brand-intent
Category Query Null Rate	Null rate for queries classified as category-intent
Tail Query Null Rate	Null rate for queries with low historical search volume (long-tail)
Store Classifier Fallback Rate	% of queries falling back to parent store due to low classification confidence
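
The diagnostic metrics above are the same null-rate calculation applied to a segment of queries. A minimal sketch, assuming a query log in which each record already carries a segment label, a result count, and a fallback flag (the record keys are hypothetical, not an FCC schema):

```python
from collections import defaultdict


def null_rate_by_segment(query_log):
    """Null rate per segment, e.g. 'brand', 'category', or 'tail'."""
    totals = defaultdict(int)
    nulls = defaultdict(int)
    for record in query_log:
        segment = record["segment"]
        totals[segment] += 1
        if record["result_count"] == 0:
            nulls[segment] += 1
    return {segment: nulls[segment] / totals[segment] for segment in totals}


def fallback_rate(query_log):
    """Share of queries that fell back to a parent store."""
    if not query_log:
        return 0.0
    fallbacks = sum(1 for record in query_log if record["fell_back_to_parent"])
    return fallbacks / len(query_log)
```

Splitting the null rate this way shows whether nulls are concentrated in brand queries, category queries, or the long tail, which a single aggregate null rate hides.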

Measurement Approach

Baseline definition. Before any new feature is rolled out, a measurement baseline — null rate, CTR, CVR — must be established. FCC works with publishers to define this baseline during onboarding.

Staged rollout. New ranking and retrieval features are rolled out in traffic tranches (e.g., 10% → 30% → 100%) with client sign-off at each stage. Metrics are reviewed at each gate before the next tranche is enabled.
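
One common way to implement traffic tranches is deterministic hash bucketing, sketched below. This is an illustration of the general technique, not a description of FCC's rollout mechanism. The salt, the function names, and the gate thresholds are all assumptions, and in practice client sign-off would sit alongside any automated check.

```python
import hashlib


def bucket(user_id: str, salt: str = "semantic-rollout") -> int:
    """Map a user to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(user_id: str, percent: int, salt: str = "semantic-rollout") -> bool:
    """True if the user falls inside the current tranche."""
    return bucket(user_id, salt) < percent


def gate_passes(baseline: dict, current: dict,
                null_rate_tolerance: float = 0.0,
                min_cvr_ratio: float = 1.0) -> bool:
    """Hypothetical gate: null rate must not rise, CVR must not fall."""
    null_ok = (current["null_search_rate"]
               <= baseline["null_search_rate"] + null_rate_tolerance)
    cvr_ok = current["cvr"] >= baseline["cvr"] * min_cvr_ratio
    return null_ok and cvr_ok
```

Because a user's bucket never changes, everyone exposed at 10% stays exposed at 30% and 100%, which keeps the experience stable for individual users and keeps metrics comparable across gates.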

Query-level analysis. For suspected quality issues, FCC provides query-level debugging access showing the full processing pipeline output for any specific query — which store was classified, which products were retrieved, and how they were ranked.

12. Onboarding Checklist

Overview

Standard FCC Search onboarding takes approximately 4 to 8 weeks from contract finalisation to first live traffic, depending on catalog size, taxonomy complexity, and how much training data is available for model adaptation.

Phase 1: Taxonomy and Index Setup

Task	Owner
Define Store Tree: L0 → L3 Leaf Stores for the publisher's category structure	Publisher Catalog Team + FCC
Map demand-side stores to supply-side verticals (which stores are products indexed into?)	Publisher Catalog Team + FCC
Configure filter sets per Leaf Store (which attributes to expose as facets)	Publisher Catalog Team + FCC
Set up real-time indexing pipeline for price and stock updates	Publisher Engineering
Set up batch indexing pipeline for catalog attributes and taxonomy changes	Publisher Engineering
Validate catalog attribute completeness for key categories (impacts snippet and filter quality)	Publisher Catalog Team + FCC
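
The attribute completeness check in the last task above can be approximated with a simple fill-rate calculation. The sketch assumes products are available as plain dictionaries and that a missing key, None, or an empty string all count as "not filled". The attribute names in the usage note are examples, not a required FCC schema.

```python
def attribute_completeness(products, required_attributes):
    """Fraction of products with a non-empty value for each required attribute."""
    if not products:
        return {attr: 0.0 for attr in required_attributes}
    return {
        attr: sum(1 for p in products if p.get(attr) not in (None, "")) / len(products)
        for attr in required_attributes
    }


def attributes_below(completeness, threshold):
    """Attributes whose fill rate falls under the chosen threshold."""
    return sorted(attr for attr, rate in completeness.items() if rate < threshold)
```

For example, calling attributes_below(attribute_completeness(products, ["brand", "color"]), 0.9) would list the attributes to fix before exposing them as filters or snippets. The 0.9 threshold is arbitrary and would be agreed per category.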

Phase 2: Model Adaptation and Query Processing

Task	Owner
Retrain Store Classifier on publisher's taxonomy and local query patterns	FCC Data Science
Configure DNA (brand/trademark) override lists for publisher's brand ecosystem	Publisher + FCC
Build spell correction model and initial brand override list	FCC Data Science + Publisher
Configure intent understanding for publisher's category and brand mix	FCC
Adapt semantic retrieval model (V5) for publisher's catalog distribution	FCC Data Science
Validate query processing pipeline end-to-end with sample query set	FCC + Publisher

Phase 3: Ranking Configuration

Task	Owner
Define ranking signal weights for the publisher's business context (speed, price, quality priorities)	Publisher + FCC
Configure Spotlight triggers (inventory thresholds, promotional signals)	Publisher + FCC
Configure sort options and default sort per Store	Publisher + FCC
Establish null search rate baseline before go-live	FCC + Publisher
Define measurement framework: which metrics, attribution windows, reporting cadence	Publisher + FCC

Phase 4: Go Live and Optimisation

Task	Owner
Launch at partial traffic (10–30%) for initial quality validation	Publisher + FCC
Monitor null search rate, CTR, and CVR for the first 2 weeks	Publisher + FCC
Review position-level click data to assess ranking quality	FCC
Expand semantic search rollout based on validated metrics	FCC + Publisher
Plan phased rollout of Coming Soon capabilities (Query Rewrite, Partial Match, L2 Personalisation)	Publisher + FCC
Establish quarterly search health review cadence	Publisher + FCC

Glossary

Term	Definition
AutoSuggest	Real-time query completion suggestions displayed as the user types, drawn from popular queries, trending searches, and session context
Blended Retrieval	A retrieval approach where lexical and semantic signals are merged into a single unified ranked result set, rather than being applied to separate position ranges. Coming Soon.
Buy Box	The single best seller listing selected to represent a product in search results when multiple sellers offer the same item
Cascaded Ranking	A multi-stage ranking architecture where each stage applies progressively richer signals to a progressively smaller candidate set (L0 → L1 → L2)
CTR	Click-Through Rate — the percentage of search result impressions that result in a user clicking a product
CVR	Conversion Rate — the percentage of search sessions that result in a purchase
DNA	Do Not Augment — the first gate in the query processing pipeline. Protects brand names, trademarks, and product codes from modification by downstream modules.
FCC	Flipkart Commerce Cloud
FK V5 Model	FCC's domain-specific, LLM-augmented semantic embedding model trained on Flipkart's e-commerce corpus. Powers semantic retrieval.
Headless Search	FCC's deployment model — a search API that returns ranked result data; the publisher's storefront controls rendering and display.
ICP	Ideal Customer Profile — mid-market ($100M–$2B GMV) business-led retailers and marketplaces in T1/T2 categories across US, UAE, Singapore, and ANZ
Intent Understanding	The pipeline step that classifies a query's semantic type — brand, category, feature, or thematic — to determine appropriate retrieval and ranking treatment
L0 / L1 / L2	The three stages of cascaded ranking: L0 (Coarse Sort), L1 (Collapse Sort + Product Reranking), L2 (Personalisation)
Leaf Store	The lowest-level node in the Store Tree taxonomy. Every product maps to a Leaf Store. The classified Leaf Store determines the result universe, available filters, and SRLP layout for a query.
Lexical Retrieval	Token-based matching of query terms against the product index. Precise for exact and near-exact matches.
MACH	Microservices, API-First, Cloud-Native SaaS, Headless — FCC Search's architectural standard
MCP Gateway	Model Context Protocol gateway — an interface that exposes client catalog, inventory, and pricing data to AI agents (ChatGPT, Perplexity, Gemini) for off-platform discovery. Part of the agentic commerce roadmap.
NDCG	Normalised Discounted Cumulative Gain — a standard ranking quality metric measuring how closely a ranked list matches an ideal relevance ordering
Null Search Rate	The percentage of search queries that return zero results. A primary search health metric.
Nymeria	FCC's planned offline simulation system for ranking calibration — runs ICU (Impressions, Clicks, Units) simulations to identify optimal signal weights. Coming Soon.
Partial Match	A null-rescue mechanism that progressively relaxes a query (drops tokens, loosens constraints) until a result set is found. Coming Soon.
PQS	Product Quality Score — a ranking signal based on product review count and average rating
PVS	Price Value Score — a ranking signal that boosts products priced at or below the category median
Quantifier Extraction	A pipeline step that extracts numeric constraints from queries (price limits, size specifications) and encodes them into the retrieval query. Coming Soon.
Query Rewrite	A pipeline step that expands queries with semantically equivalent terms to improve recall without losing precision. Coming Soon.
Sandbox	A demo-ready evaluation environment where prospective clients can plug in their catalog, run test queries, and see FCC Search results in real time — before any commercial commitment. In Progress.
Search Generalisation	The platform work to make FCC Search a portable, multi-tenant product by abstracting all client-specific assumptions into configurable tenant layers. In Progress.
Semantic Search	Dense vector retrieval that matches queries by meaning rather than token overlap. Powered by the FK V5 embedding model.
Snippets	Structured product information displayed on result cards (key attributes, price, delivery, rating) without requiring click-through to the product page
Spotlights	Dynamic callout labels on product cards (Limited Stock, New, More 4 Less) computed in real time from inventory and promotions data
SRLP	Search Results Listing Page — the page rendered by the publisher's storefront showing search results
Store Classifier	An ML model that predicts which Leaf Store a user's query belongs to. The most consequential single step in the query processing pipeline.
Store Tree	The demand-side taxonomy hierarchy (L0 → L3 Leaf) that organises the product index. Products are indexed into stores based on how buyers think, not just how sellers list.
UPI	Units Per Impression — the primary L0 and L1 ranking signal. Measures how frequently a product in a given position has historically converted to a purchase.
