Late last week, the Facebook team published a blog post that took users “under the hood” to show how Graph Search indexes and ranks search query results on the social network.
In effect, the post illuminates the path that a particular query takes to reach the best possible search results, and knowing that path is half the battle.
It appears that Facebook uses a blend of data-driven techniques and “intuition” to rank the best results for a query. Each ranking change begins as an idea (based on information from both ranking engineers and user feedback), which is then implemented, tested and launched to a small fraction of users so that its impact can be measured. And it all starts with Unicorn.
Unicorn is an inverted index framework that can build indices and retrieve data from them, but more than that, it is “fundamentally an in-memory ‘database’ with a query language for retrieval.” However, before Facebook could use Unicorn for Graph Search, the company had to extend its search ranking capabilities with query rewrites, forward indexing, A/B testing and extended retrieval operations such as weak “and” and strong “or,” in addition to a number of other changes.
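To make the idea of an in-memory inverted index more concrete, here is a minimal Python sketch. It is not Unicorn's actual design or query language. The class name, the term strings and the `max_missing` parameter are all hypothetical, and the weak “and” shown here is a simplification that assumes the operator lets a result miss a small, fixed number of terms.

```python
from collections import defaultdict


class TinyInvertedIndex:
    """Toy in-memory inverted index: maps each term to a set of entity ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, entity_id, terms):
        """Index an entity under every term that describes it."""
        for term in terms:
            self.postings[term].add(entity_id)

    def strict_and(self, terms):
        """Entities that match every term."""
        sets = [self.postings.get(term, set()) for term in terms]
        return set.intersection(*sets) if sets else set()

    def weak_and(self, terms, max_missing=1):
        """Entities that match all but at most `max_missing` terms (assumed semantics)."""
        counts = defaultdict(int)
        for term in terms:
            for entity_id in self.postings.get(term, set()):
                counts[entity_id] += 1
        needed = len(terms) - max_missing
        return {entity_id for entity_id, hits in counts.items() if hits >= needed}


# Hypothetical data: who is a friend of the searcher, and who lives where.
index = TinyInvertedIndex()
index.add("alice", ["friend:me", "lives-in:colombo"])
index.add("bob", ["lives-in:colombo"])
index.add("carol", ["friend:me"])

query = ["friend:me", "lives-in:colombo"]
print(index.strict_and(query))
print(index.weak_and(query, max_missing=1))
```

With this toy data, the strict query returns only the entity that matches both terms, while the relaxed query also admits entities that match just one of them, which is the kind of looser retrieval the extended operators are meant to allow.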
The Graph Search Lifecycle
All of these modifications serve Graph Search at different points throughout a query’s lifecycle, which largely takes place in two distinct phases: the query suggestion phase and the search phase. In the query suggestion phase, the query that a user types into the search box is parsed against a grammar by a Natural Language Processing (NLP) module, which identifies potential entities in the query and sends them along to Unicorn to search for them.
This is perhaps the most important step in the whole process, as it defines what type of entity the searcher may be looking for and thus signals where Unicorn should look for the information, since Facebook altered the index framework to keep different entity types in separate verticals. In other words, it tells Unicorn what kind of results to present. For example, if a user types in “people who live in Sri,” the query will be sent with a strong bias toward cities and places, meaning location-based results will be more likely to appear because of the query’s grammar.
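The “people who live in Sri” example can be sketched as a prefix lookup whose matches are weighted by vertical. This is only an illustration under stated assumptions: the `suggest` function, the entity names and the bias values are all made up, and Facebook's post does not describe the real mechanism at this level of detail.

```python
def suggest(prefix, entities, vertical_bias):
    """Return entities whose name starts with `prefix`, strongest vertical first.

    `entities` is a list of (name, vertical) pairs and `vertical_bias` maps a
    vertical name to a weight derived from the parsed query (assumed inputs).
    """
    prefix = prefix.lower()
    matches = [
        (vertical_bias.get(vertical, 0.0), name, vertical)
        for name, vertical in entities
        if name.lower().startswith(prefix)
    ]
    # Sort by descending bias, then alphabetically to keep the order stable.
    matches.sort(key=lambda match: (-match[0], match[1]))
    return [(name, vertical) for _, name, vertical in matches]


# Hypothetical entities spread across three verticals.
entities = [
    ("Sri Lanka", "place"),
    ("Sridevi Rao", "user"),
    ("Srinagar", "place"),
    ("Sriracha Fans", "page"),
]

# "who live in ..." grammatically expects a location, so places get most weight.
bias_for_live_in = {"place": 0.8, "user": 0.15, "page": 0.05}

for name, vertical in suggest("Sri", entities, bias_for_live_in):
    print(vertical, name)
```

Changing the bias dictionary, for instance to favor the `user` vertical for a query such as “friends of Sri,” reorders the same matches without touching the index itself.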
The search phase begins after a user has made a selection from Graph Search’s suggestions. Again, all of this new information is sent back to Unicorn via the “Top Aggregator,” which takes in the raw query and user metadata and returns blended and rescored search results. This is where Graph Search is able to narrow down the original query to provide only relevant results unique to a user’s search. In addition, Graph Search is able to rewrite queries into a social context to bias the results socially, as every user is indexed with his or her friends.
Graph Search and You
Of all of the information released in the blog post (and there was a lot), perhaps the most important and useful is that Facebook is not only going to use Graph Search to find data, but also to build upon it and include new features that will “become inputs to the scoring function” that “quantify important properties of the entities and/or the query.” This includes the distance between a searcher and a place, how close a searcher is to user results in terms of friend connections and the amount of overlap a query may have with an entity name, among others. In other words, spelling things right and having many friends are both great ways to improve your Facebook page’s performance when it comes to Graph Search results.
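As a rough illustration of how such features might feed a scoring function, the sketch below turns the three signals named above into numbers and combines them with a weighted sum. The formulas, field names (`distance_km`, `friend_hops`) and weights are assumptions chosen for readability, not Facebook's actual scoring function.

```python
def name_overlap(query, name):
    """Fraction of query words that also appear in the entity name."""
    query_words = set(query.lower().split())
    name_words = set(name.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & name_words) / len(query_words)


def extract_features(query, entity):
    """Map an entity to feature values where larger means more relevant."""
    return {
        # Closer places score higher: 1.0 at 0 km, falling toward 0 with distance.
        "proximity": 1.0 / (1.0 + entity["distance_km"]),
        # 1 hop = friend, 2 hops = friend of a friend, and so on.
        "social": 1.0 / max(1, entity["friend_hops"]),
        "overlap": name_overlap(query, entity["name"]),
    }


def score(query, entity, weights):
    """Weighted sum of the features (a simple linear model, assumed here)."""
    features = extract_features(query, entity)
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical weights and entities.
weights = {"proximity": 0.3, "social": 0.3, "overlap": 0.4}
nearby_exact = {"name": "Blue Cafe", "distance_km": 0.5, "friend_hops": 1}
far_partial = {"name": "Red Cafe", "distance_km": 40.0, "friend_hops": 3}

print(score("blue cafe", nearby_exact, weights))
print(score("blue cafe", far_partial, weights))
```

Under these assumptions, a misspelled query would lower the overlap feature and a distant social connection would lower the social feature, which is the intuition behind the advice above.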
In the future, this could mean that an entity will have several different scores assigned to it, and Graph Search can select the top results based on each type of score. So, for instance, there could be three different scores for a “nearby” search that account for overall popularity, social bias and distance, thus allowing for greater diversity and more relevance in search results that are, ultimately, more useful for those searching as well as those being searched for.
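One way to picture the multi-score idea is to take the best entity under each score in turn and skip anything already chosen. The sketch below assumes each entity carries a dictionary of precomputed scores. The function name, the `per_score` parameter and the sample data are hypothetical.

```python
def diverse_top_results(entities, score_names, per_score=1):
    """Pick the top `per_score` entities for each score type, without duplicates.

    Returns a list of (entity name, score type that selected it) pairs.
    """
    selected = []
    seen = set()
    for score_name in score_names:
        ranked = sorted(
            entities, key=lambda entity: entity["scores"][score_name], reverse=True
        )
        taken = 0
        for entity in ranked:
            if taken == per_score:
                break
            if entity["name"] in seen:
                continue  # Already chosen under an earlier score type.
            seen.add(entity["name"])
            selected.append((entity["name"], score_name))
            taken += 1
    return selected


# Hypothetical results for a "nearby" search, each with three scores.
places = [
    {"name": "Stadium Grill", "scores": {"popularity": 0.9, "social": 0.2, "distance": 0.3}},
    {"name": "Corner Cafe", "scores": {"popularity": 0.4, "social": 0.8, "distance": 0.5}},
    {"name": "Next Door Deli", "scores": {"popularity": 0.3, "social": 0.1, "distance": 0.95}},
]

for name, reason in diverse_top_results(places, ["popularity", "social", "distance"]):
    print(name, "chosen for", reason)
```

With this data, a single blended score might surface only the most popular place, whereas selecting per score type surfaces one popular place, one that friends like and one that is close by.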