Index Creation

Salesforce B2C Commerce search uses data from these indexes: Product, Spelling, Content, Synonym, Suggest, Availability, and Active Data.

Product Index

For a product to be indexed, it must be searchable, online, and assigned to at least one online category. If a product is assigned to both online and offline categories, it is still indexed. Only product attributes that are configured as searchable attributes are indexed.
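The eligibility rule above can be sketched as a simple predicate. This is an illustrative model only: the Product and Category classes and their field names are hypothetical and are not part of the B2C Commerce API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Category:
    # Hypothetical stand-in for a catalog category.
    name: str
    online: bool


@dataclass
class Product:
    # Hypothetical stand-in for a catalog product.
    sku: str
    searchable: bool
    online: bool
    categories: List[Category] = field(default_factory=list)


def is_indexable(product: Product) -> bool:
    """Return True if the product meets all three indexing conditions."""
    return (
        product.searchable
        and product.online
        # One online category is enough; offline assignments don't matter.
        and any(category.online for category in product.categories)
    )
```

A product assigned to one online and one offline category passes the check, while a product assigned only to offline categories does not.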

Index terms are derived from product data by processing data as follows:
  • Synonyms: synonyms are configured so that searching on any term in the group returns the search results for all terms. For example, if pants and trousers are synonyms, then searching on pants returns pants and trousers and searching on trousers returns pants and trousers. Synonyms are also added for words with German umlauts, for example, "küche" and "kueche".
  • Hypernyms: a hypernym returns all the results from a series of hyponyms, though the hyponyms only return their own results. For example, if winterwear is a hypernym for hats, gloves, and mittens, then searching on winterwear returns hats, gloves, and mittens, but searching on hats returns hats, gloves returns gloves, and mittens only returns mittens.
  • Stop words: words that are ignored by the search engine. For example: "an" or "the". This creates a more compact index and faster return of search results. Stop words are removed based on the stop word dictionary.
  • Compound words: the merchant specifies phrases that are searched both as separate words and as a single word. For example, if cheese cake is configured as a compound word and a customer enters cheese cake, then results for "cheese AND cake" and "cheesecake" are returned. Compound words are split into their parts based on the compound words dictionary.
  • Common phrases: the merchant specifies words that must be found together to count as a search hit, or that count as a search hit if either the full phrase or just the last word in the phrase is found.
  • Word stems: all words in searchable product attributes are stemmed based on the stemmer configured for each index type. Word stemming lets customers find products based on the basic part of the word.
  • Special characters: the characters ! ( ) : [ ] { } + ~ ^ ? ' are removed.
  • Product numbers: product numbers are split into their parts.
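A few of the processing rules above can be illustrated with a small sketch. It is not the platform's implementation. The dictionaries are hypothetical stand-ins for the dictionaries configured in Business Manager, and it assumes that special characters are deleted rather than replaced with spaces and that a hypernym search also matches the hypernym term itself.

```python
# Hypothetical dictionaries, hard-coded here for illustration only.
SYNONYMS = [{"pants", "trousers"}]
HYPERNYMS = {"winterwear": {"hats", "gloves", "mittens"}}
STOP_WORDS = {"an", "the"}
SPECIAL_CHARS = "!():[]{}+~^?'"


def normalize(text):
    """Delete special characters, lowercase, and drop stop words."""
    stripped = text.translate(str.maketrans("", "", SPECIAL_CHARS))
    return [word for word in stripped.lower().split() if word not in STOP_WORDS]


def expand(term):
    """Return the set of terms whose results a search on `term` returns."""
    terms = {term}
    # Synonyms work in both directions: every member maps to the whole group.
    for group in SYNONYMS:
        if term in group:
            terms |= group
    # Hypernyms work in one direction: only the hypernym pulls in its hyponyms.
    terms |= HYPERNYMS.get(term, set())
    return terms
```

With these dictionaries, expand("trousers") and expand("pants") return the same two terms, expand("winterwear") includes hats, gloves, and mittens, and expand("hats") returns only hats.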
Note: Only price books with the same currency as the site that is being indexed are indexed. For example, if the site's currency is USD, no EUR prices are included in the index for the site.
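The currency rule amounts to a filter on price books. In this minimal sketch, the dictionary keys ("id", "currency") are hypothetical and chosen only for illustration.

```python
def price_books_for_site(site_currency, price_books):
    """Keep only the price books whose currency matches the site's currency."""
    return [book for book in price_books if book["currency"] == site_currency]


# Hypothetical price books for a site whose currency is USD.
books = [
    {"id": "usd-list", "currency": "USD"},
    {"id": "eur-list", "currency": "EUR"},
    {"id": "usd-sale", "currency": "USD"},
]
indexed = price_books_for_site("USD", books)
```

Here indexed contains only the two USD price books.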

Spelling Index

If a customer searches on a term and no results are found, the spelling suggestion feature analyzes the search term to see if it's a misspelling and then presents similar terms found in the product set. For example, if a customer searches on "telewision", they are asked if they meant "television".
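The behavior can be approximated with Python's standard difflib module. This sketch only illustrates the idea. The algorithm that B2C Commerce actually uses isn't described here, and the function name, vocabulary, and similarity cutoff are assumptions.

```python
import difflib


def did_you_mean(term, result_count, vocabulary):
    """Suggest a similar known term, but only when the search found nothing."""
    if result_count > 0:
        return None
    # get_close_matches ranks candidates by similarity ratio, best first.
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None


# Hypothetical terms taken from the product set.
vocabulary = ["television", "telephone", "tablet"]
suggestion = did_you_mean("telewision", 0, vocabulary)
```

In this example, suggestion is "television". A search that already returns results gets no suggestion.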

The spelling index is compiled using an algorithm and doesn't need to be manually configured. However, you must enable the feature on the Search Preferences page in Business Manager. The spelling index uses the product and content indexes and must be rebuilt after them.

Content Index

The content index is built based on content assets, library folders, and libraries. For a content asset to be indexed, it must be searchable and online. Content attributes that are configured as searchable are indexed. The content index uses the same processing rules as the product index to derive index terms.

Synonym Index

The synonym index is built based on synonyms/synonym groups defined for a specific language. All synonyms are stemmed based on the configured stemmer.

Suggest Index

The suggest index is built based on category names, brand names, and terms added or excluded on the Search Suggestions page in Business Manager.

All suggestions are converted to appear in lower case, even if their original values include uppercase letters. Suggestions only appear if there is at least one result associated with the suggestion.
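These two display rules can be sketched as follows. The count_results callback is a hypothetical placeholder for whatever lookup supplies the number of results for a suggestion.

```python
def visible_suggestions(candidates, count_results):
    """Lowercase each suggestion and keep it only if it has at least one result."""
    visible = []
    for phrase in candidates:
        lowered = phrase.lower()
        if count_results(lowered) > 0:
            visible.append(lowered)
    return visible


# Hypothetical result counts, keyed by lowercased suggestion.
counts = {"nike": 3, "winter hats": 0}
shown = visible_suggestions(["Nike", "Winter Hats"], lambda p: counts.get(p, 0))
```

Only "nike" is shown, in lower case, because "winter hats" has no associated results.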

To determine search result counts, a search is executed for each suggestion during building, using the current product index. You must build the product index before building the suggest index to get accurate counts.
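The ordering constraints stated so far (spelling after product and content, suggest after product) can be expressed as a small dependency graph. This is only a sketch using the standard graphlib module (Python 3.9 or later). The index names are plain labels, not API identifiers, and indexes without a stated ordering constraint are omitted.

```python
from graphlib import TopologicalSorter

# Each index maps to the set of indexes that must be built before it.
DEPENDS_ON = {
    "spelling": {"product", "content"},
    "suggest": {"product"},
}


def build_order(dependencies=DEPENDS_ON):
    """Return one valid build order in which prerequisites come first."""
    return list(TopologicalSorter(dependencies).static_order())


order = build_order()
```

Any order returned places product before both spelling and suggest, and content before spelling. The relative order of independent indexes isn't significant.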

Availability Index

Inventory data is included in the availability index even if no product exists for the inventory record. This decouples availability updates and data replication so that they are not dependent on each other. You can update the availability index for a new product and when the new product is replicated from Staging to Production, it immediately appears in search results because the related availability data is already included in the availability index on Production.

The following specific rules apply:
  • The availability index processes inventory records for all existing products of the site (and of all sites that use the same inventory list).
  • The availability index processes an inventory record that has no product if the record was created or modified within the last 5 days.
  • The availability index skips an inventory record that has no product if the record was not created or modified within the last 5 days. This prevents the processing of inventory records that are considered outdated and obsolete.
  • If the new product is a master or product set, or is a variant or part of a product set, search result sorting might be incorrect if the applied sorting rule uses any aggregated availability attributes, such as SKUCoverage or TTOOS (Time to Out of Stock). This is because the search index can't aggregate data for complex products if it does not have any information about the product. When the product data has been replicated, the availability index has been rebuilt, and the page cache has been refreshed, the sorting is correct.
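The first three rules can be summarized as a single decision function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, last_changed stands for the later of the record's creation and modification times, and treating exactly 5 days as still current is a guess.

```python
from datetime import datetime, timedelta

# Records without a product are only considered current for this long.
MAX_ORPHAN_AGE = timedelta(days=5)


def should_process(product_exists, last_changed, now):
    """Decide whether an inventory record is included in the availability index."""
    if product_exists:
        return True
    # No product yet: keep the record only if it changed recently.
    return now - last_changed <= MAX_ORPHAN_AGE


now = datetime(2024, 1, 10, 12, 0)
recent = should_process(False, datetime(2024, 1, 8, 12, 0), now)
stale = should_process(False, datetime(2024, 1, 1, 12, 0), now)
```

Here recent is True and stale is False. A record whose product exists is processed regardless of its age.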

The availability index is built based on information about product availability that is used to sort and filter search results by inventory status. This index relies on a transaction that updates inventory amounts and therefore must be rebuilt regularly. B2C Commerce sites that are assigned to the same inventory list share the search availability index. Only one index is maintained per inventory list.
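To see which sites share an index, group the sites by their inventory list. The site and inventory list IDs in this sketch are made up for illustration.

```python
from collections import defaultdict


def sites_by_inventory_list(site_to_inventory_list):
    """Group sites by inventory list; each group shares one availability index."""
    shared = defaultdict(list)
    for site, inventory_list in site_to_inventory_list.items():
        shared[inventory_list].append(site)
    return dict(shared)


# Hypothetical assignment of sites to inventory lists.
groups = sites_by_inventory_list(
    {"SiteUS": "inventory-na", "SiteCA": "inventory-na", "SiteDE": "inventory-eu"}
)
```

In this example, SiteUS and SiteCA share one availability index, and SiteDE has its own.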

Note: Full or incremental updates of a search availability index by the user or by custom jobs are no longer required. You can still trigger a rebuild of the search availability index, but do so only in the unexpected case that the index is out of sync.

The search availability index is updated automatically and incrementally when the following events occur:

Note: There can be a delay of five or more minutes before changes are reflected in the shared search availability index.
Note: The Scripting API pipelet UpdateSearchIndexes does nothing if it is used for index type availability.

The availability index isn't replicated from the staging system to the production system. Instead, it's created and updated on the production system.

Active Data Index

The active data index enables you to include active data attributes collected from the storefront, imported from backend systems, or calculated in B2C Commerce, when presenting sorting options to customers or for sorting rules for category or keyword searches.

The active data index contains information about product active data, such as sales velocity. The daily active data feed automatically causes this index to rebuild once a day. The index only needs to be manually rebuilt after a custom feed import.