LinkedIn shares insights on the development of Search tab - Social Samosa

2022-09-02 | By Ms. Eva Gu

LinkedIn Post search saw a 35% year-over-year increase in organic user engagement in 2020. LinkedIn's insights into its development can help businesses and users on the platform learn how the LinkedIn Search engine works and gain a better understanding of this functionality.

LinkedIn Post Search intends to serve results that are relevant to a member’s query, from a wide variety of verticals such as jobs they may be interested in, people they may want to connect with, or posts that are trending within their industry.

The search midtier at LinkedIn follows the federated search pattern: it fans out calls to various search backends and blends the results together. The Post search stack, however, is different, as it was designed to be compatible with the Feed ecosystem at LinkedIn. For Post search, the fanout and blending logic depended on Feed services, including feed-mixer and interest-discovery.
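The fan-out-and-blend pattern described above can be sketched as follows. This is a minimal illustration, not LinkedIn's implementation; the backend names, scores, and score-based blending are all assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical vertical backends; names and scores are illustrative only.
def search_jobs(query):
    return [{"type": "job", "score": 0.8, "id": "j1"}]

def search_people(query):
    return [{"type": "person", "score": 0.9, "id": "p1"}]

def search_posts(query):
    return [{"type": "post", "score": 0.7, "id": "po1"}]

BACKENDS = [search_jobs, search_people, search_posts]

def federated_search(query):
    """Fan the query out to every vertical backend in parallel,
    then blend the per-vertical results into one ranked list."""
    with ThreadPoolExecutor(max_workers=len(BACKENDS)) as pool:
        result_lists = pool.map(lambda fn: fn(query), BACKENDS)
    blended = [hit for hits in result_lists for hit in hits]
    # Naive score-based blending; a real blender would use a
    # calibrated model so scores from different verticals are comparable.
    return sorted(blended, key=lambda h: h["score"], reverse=True)
```

In production, the blending step is the hard part: raw scores from independent backends are not directly comparable, which is one reason a dedicated federation layer exists.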

Apart from the different service architectures, Post search also uses an intermediate language called Interest Query Language (IQL) to translate a user query into index-specific queries used to serve search results.
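The translation step can be pictured as a two-stage lowering: user query → intermediate representation → index-specific query. The sketch below is purely illustrative; the real IQL grammar is internal to LinkedIn, and the toy syntax here is an assumption.

```python
# Toy intermediate representation standing in for IQL (assumption:
# the real language is internal and not described in the article).
def to_intermediate(user_query):
    """Parse a raw user query into an intermediate query tree."""
    terms = user_query.lower().split()
    return {"op": "AND", "terms": terms}

def lower_to_index_query(iql, index):
    """Lower the intermediate form into an index-specific query string."""
    return f" {iql['op']} ".join(f"{index}.text:{t}" for t in iql["terms"])

iql = to_intermediate("machine learning jobs")
print(lower_to_index_query(iql, "posts"))
# posts.text:machine AND posts.text:learning AND posts.text:jobs
```

The value of such an intermediate layer is that one parsed query can be lowered into different dialects for different backends, but it also adds a translation step that every relevance change must pass through.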

Due to the complex architecture, increasing the development and experimentation velocity proved to be difficult, as any search relevance or feature improvements required changes to multiple points throughout the stack. It was challenging to address many of the unique needs of Post search, such as balancing multiple aspects of relevance, ensuring diversity of results, and supporting other product requirements.

The platform set out to simplify the system architecture to improve productivity and facilitate faster relevance iterations. To achieve this, they decided to decouple the Feed and Post search services in two phases. The first phase removed feed-mixer from the call stack and moved fanout and blending into the search federator. The second phase removed interest-discovery. This allowed them to get rid of the cruft built up over the years and simplified the stack by removing additional layers of data manipulation.

As LinkedIn thought about ways to improve the relevance of results from Post search, they realized that the user’s perceived relevance of results is a delicate balance of several orthogonal aspects, such as:

In addition to those aspects, the platform wanted to easily implement other requirements from their product partners to satisfy searcher expectations (e.g., ensuring diversity of results, promoting serendipitous discovery, etc.). To meet these goals for post relevance, they implemented a new ML-powered system; the high-level architecture is shown in Figure.

As a single, unified model did not scale well for the platform's needs, they invested in modelling the First Pass Ranker (FPR) as a multi-aspect model, wherein each aspect is optimized through an independent ML model. The scores from all these aspect models are then combined in a separate layer to determine the final score for ranking. This approach enables them to:
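A minimal sketch of the multi-aspect idea: each aspect is scored by its own model, and a separate layer combines the scores. The aspect names, the constant scores, and the linear blend with fixed weights are all assumptions for illustration; the article does not describe the actual combination function.

```python
# Hypothetical aspect models; in practice each would be an
# independently trained ML model, not a constant.
def engagement_score(post, query):  # likelihood the member engages
    return 0.6

def topical_score(post, query):     # how well the post matches the query
    return 0.9

def freshness_score(post, query):   # how recent the post is
    return 0.4

# Aspect -> (model, weight). Weights could themselves be learned.
ASPECTS = {
    "engagement": (engagement_score, 0.5),
    "topical":    (topical_score,    0.3),
    "freshness":  (freshness_score,  0.2),
}

def final_score(post, query):
    """Separate combination layer: a linear blend of aspect scores.
    Keeping aspects independent lets each model be retrained or
    A/B-tested without touching the others."""
    return sum(weight * model(post, query)
               for model, weight in ASPECTS.values())
```

The design benefit is isolation: improving, say, the freshness model does not require retraining a monolithic ranker, and the combination layer is the single place where trade-offs between aspects are tuned.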

To iterate quickly on a multi-layered, complex ML stack, testing and validation was a foundational piece. They built a suite of internal tools to assess the quality of new candidate models and quantify how they differed from the current production model. This enabled them to have a principled approach to testing relevance changes and ensured they did not regress on the core functionality/user experience.
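One concrete way to quantify how a candidate model differs from production is to measure pairwise ranking disagreement on the same result set. The metric choice below (a Kendall-tau-style flip fraction) is an assumption; the article does not specify which comparisons their internal tools compute.

```python
from itertools import combinations

def pairwise_disagreement(prod_ranking, cand_ranking):
    """Fraction of document pairs that the candidate model orders
    differently from the production model (0.0 = identical order,
    1.0 = fully reversed)."""
    position = {doc: i for i, doc in enumerate(cand_ranking)}
    pairs = list(combinations(prod_ranking, 2))
    flipped = sum(1 for a, b in pairs if position[a] > position[b])
    return flipped / len(pairs)

# A fully reversed ranking disagrees on every pair:
print(pairwise_disagreement(["a", "b", "c"], ["c", "b", "a"]))  # 1.0
```

Tracking a statistic like this across model iterations gives a principled signal: a large disagreement flags that the candidate needs closer human review before it ships, even if offline metrics improved.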

These changes to the system architecture have helped them unlock several wins, such as:

In large complex systems, the existing state can be suboptimal due to incremental solutions to problems over time. By stepping back and taking a high-level view, it is possible to identify several areas of improvement. It takes an open mind and support from engineering, product, and leadership to entertain these improvements.