Reciprocal Rank Fusion (RRF): How LLMs Rank Answers (ChatGPT, Gemini)

RRF: what is Reciprocal Rank Fusion?

Reciprocal Rank Fusion (RRF) is an algorithmic method used by ChatGPT and other language models to merge multiple lists of ranked results into a single final ranking.

The objective of Reciprocal Rank Fusion

The objective of Reciprocal Rank Fusion is to offer the user the most relevant sources, that is, those that best answer the request.

How does Reciprocal Rank Fusion work?

Reciprocal Rank Fusion (RRF) calculates a score for each document: the sum of the inverses of its position (rank) in each list, each adjusted by a constant.

In practice, RRF works on a simple principle: documents that appear near the top of the results for several search methods are likely to be truly relevant.
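
The principle above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular engine: each document gets 1 / (k + rank) for every list it appears in, and the contributions are summed. The function names, the document names and the default k = 60 (the value used later in this article) are assumptions chosen for the example.

```python
from collections import defaultdict


def rrf_scores(rankings, k=60):
    """Return {document: RRF score} for several ranked lists.

    rankings: list of lists, each ordered from best to worst.
    k: constant added to every rank (assumed to be 60 here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1: the first result contributes 1 / (k + 1).
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return dict(scores)


def rrf_fuse(rankings, k=60):
    """Return the documents sorted by descending RRF score."""
    scores = rrf_scores(rankings, k)
    return sorted(scores, key=scores.get, reverse=True)


# Two hypothetical result lists from two search methods.
keyword_results = ["doc_a", "doc_b", "doc_c"]
semantic_results = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([keyword_results, semantic_results])
print(fused)
```

A document missing from a list simply receives no contribution from it, which is why incomplete lists are not a problem.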

The steps of Reciprocal Rank Fusion

The benefits of Reciprocal Rank Fusion

RRF is something of a Swiss Army knife for merging results.

Simple: no training data needed, no complicated normalization. You can implement it in 5 minutes, it computes quickly, and that's that.

Robust: whatever your scoring scales, it works. Incomplete results? No problem, RRF copes.

Effective: against all odds, this basic method regularly beats much more sophisticated techniques, especially when you combine different approaches.

Studies confirm it: for mixing keyword search and semantic search, RRF does the job. Full stop.

An example of RRF

We have a query: ‘how to choose trail shoes’ on Google.

The SERP is multimodal: blog articles, YouTube videos, featured snippets, images. Google mixes it all together.

ChatGPT, for its part, wants to understand which content really performs by combining several ranking signals. It does not settle for a single Google search. It combines:

Keyword search (Bing API or equivalent)
Semantic search (vector search on embeddings)
Search on reliable sources (verified databases: Wikipedia, academic sites, etc.)

Each system returns its own ranking of results. RRF merges all of this to decide which sources to include in the final answer.

The RRF score of the sources is (theoretically) calculated as follows:

Detailed RRF calculation (k=60)

Source	Keywords	Semantic	Reliability	RRF total
Tom’s Hardware	1/(60+1) = 0.0164	1/(60+2) = 0.0161	1/(60+3) = 0.0159	0.0484
LesNumeriques	1/(60+3) = 0.0159	1/(60+6) = 0.0152	1/(60+4) = 0.0156	0.0467
TechPowerUp	0	1/(60+1) = 0.0164	1/(60+2) = 0.0161	0.0325
AnandTech	0	1/(60+3) = 0.0159	1/(60+1) = 0.0164	0.0323
PCGamer	1/(60+2) = 0.0161	1/(60+4) = 0.0156	0	0.0317
Reddit	1/(60+4) = 0.0156	0	0	0.0156
Hardware Unboxed	0	1/(60+5) = 0.0154	0	0.0154
Amazon	1/(60+5) = 0.0154	0	0	0.0154

The result?

ChatGPT will prioritize the first 3-4 sources to build its response.
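
To check the table by hand, here is a small sketch that recomputes the totals from the ranks shown in each column. The ranks are read off the table (a 0 in the table is treated as "not returned by that system" and written as None); the variable names and the choice of keeping the top 4 are assumptions made for the illustration.

```python
K = 60  # constant used in the table

# Rank of each source per signal, as in the table; None = not returned.
ranks = {
    "Tom's Hardware":   {"keywords": 1, "semantic": 2, "reliability": 3},
    "LesNumeriques":    {"keywords": 3, "semantic": 6, "reliability": 4},
    "TechPowerUp":      {"keywords": None, "semantic": 1, "reliability": 2},
    "AnandTech":        {"keywords": None, "semantic": 3, "reliability": 1},
    "PCGamer":          {"keywords": 2, "semantic": 4, "reliability": None},
    "Reddit":           {"keywords": 4, "semantic": None, "reliability": None},
    "Hardware Unboxed": {"keywords": None, "semantic": 5, "reliability": None},
    "Amazon":           {"keywords": 5, "semantic": None, "reliability": None},
}


def total_rrf(source_ranks, k=K):
    """Sum 1 / (k + rank) over the signals where the source is ranked."""
    return sum(1.0 / (k + r) for r in source_ranks.values() if r is not None)


totals = {source: total_rrf(r) for source, r in ranks.items()}
ordered = sorted(totals, key=totals.get, reverse=True)

for source in ordered:
    print(f"{source:18s}{totals[source]:.4f}")

top_sources = ordered[:4]  # the "first 3-4 sources"
print(top_sources)
```

Totals computed from unrounded terms can differ from the table in the last digit, because the table adds values that were already rounded. The order of the sources is the same.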

What is the impact of RRF?

Without RRF, based only on a keyword search:

ChatGPT would pick Tom’s Hardware #1, PCGamer #2, and then… Reddit and Amazon. Not great for reliability.

Without RRF, based only on a semantic search:

ChatGPT would pick TechPowerUp #1, Tom’s Hardware #2, AnandTech #3. Lack of diversity, too technical.

With RRF:

ChatGPT mixes SEO relevance + semantics + reliability. Tom’s Hardware comes out on top because it is good everywhere. LesNumeriques follows closely. Reddit and Amazon drop out of the top (not reliable enough).

What is the relationship between Query Fan-Out and Reciprocal Rank Fusion?

Query Fan-Out and Reciprocal Rank Fusion (RRF) are two sides of the same coin in ChatGPT and other LLMs.

Query Fan-Out = the decomposition of the query
RRF = the recombination of the results

It is a two-stage pipeline:

User request
↓
Query Fan-Out (1 → N queries)
↓
N parallel searches = N rankings
↓
RRF (fusion of the N rankings)
↓
Single final ranking

Query Fan-Out minus Reciprocal Rank Fusion (RRF) = chaos

Using the query fan-out without Reciprocal Rank Fusion would be chaos for ChatGPT and other LLMs (including Google's).

Let me explain: ChatGPT decomposes a query into N sub-queries. It therefore gets N different lists of results. But how does it choose which sources to use?

Take the #1 of each list? You risk ending up with 5 redundant sources.
Take only the first list? You lose the richness of the other angles.
Average by hand? Impossible at scale.

Without RRF, the query fan-out has no robust fusion method.
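
To make the two-stage pipeline concrete, here is a toy sketch of fan-out followed by RRF. Everything in it is hypothetical: the `fan_out` function, the sub-queries it produces, the `FAKE_INDEX` results and the site names are invented for the illustration. A real system would generate sub-queries with a model and call real search backends.

```python
from collections import defaultdict


def fan_out(query):
    """Hypothetical decomposition of one query into N sub-queries."""
    return [query, f"{query} comparison", f"{query} buying guide"]


# Hypothetical canned results standing in for N parallel searches.
FAKE_INDEX = {
    "trail shoes": ["site_a", "site_b", "site_c"],
    "trail shoes comparison": ["site_b", "site_d", "site_a"],
    "trail shoes buying guide": ["site_b", "site_c", "site_e"],
}


def search(sub_query):
    """Return the ranked list for one sub-query (empty if unknown)."""
    return FAKE_INDEX.get(sub_query, [])


def answer_sources(query, k=60, top_n=3):
    """Fan out the query, then fuse the N rankings with RRF."""
    scores = defaultdict(float)
    for sub_query in fan_out(query):
        for rank, source in enumerate(search(sub_query), start=1):
            scores[source] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(answer_sources("trail shoes"))
```

In this toy data, site_b wins because it is well placed in all three lists, while a site that is #1 in a single list only (site_a) comes second.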

How does ChatGPT calculate the Reciprocal Rank Fusion score?

We have said that RRF rests on a mathematical basis. So what is the exact formula that ChatGPT uses?

Spoiler: we don't have it. That would be too easy.

OpenAI does not publish the details of its implementation. It is their secret sauce. But some people have dug.

Drostenolone dug into the backend of ChatGPT to try to recover the formula. By inspecting the implementation of ChatGPT's search, he found this piece of code:

rrf_alpha: 1,
rrf_input_threshold: 0,
ranking_model: null

This small piece of code confirms that ChatGPT uses standard RRF to combine search results. What is interesting is that this implementation reveals the importance of ranking across multiple query variations.

Drostenolone pushes a little further, highlighting the factors and criteria that are likely to influence the RRF scores of content.

Optimizing content for Reciprocal Rank Fusion (RRF)

Build real topical authority: produce content that covers a topic in depth. The better your site masters a topic, the more RRF favors you.
Create a semantic network: group your content by intent, not only by keyword. You must answer the whole constellation of questions around a topic.
Cover all the sub-queries: RRF works with a very wide query fan-out. Your content must anticipate side angles, variants, related questions… everything that can be triggered around the main query.
Enrich the vocabulary and semantics: use rich, precise vocabulary. The more context your text provides, the better it is understood and ranked.
Adopt a multimodal strategy: SERPs are becoming multi-format (text, image, video, audio). Integrate multiple media to increase your visibility and respond to different search modes.
Favor clarity and structure: RRF exploits the signals used by different models, such as clean segmentation, logical hierarchy and clear sections. Keep it simple, readable, immediately understandable.

How do you calculate the RRF score of your content?

If you hate code, scripts, and anything that looks even remotely like a line of Python… you are a bit out of luck.

There is simply no no-code tool to estimate RRF scores. Nada. A desert.

That said, no need to get out the violins: there are still simple, if simplistic, ways to make approximations. Not perfect, not academic, but enough to understand the trends and not stay completely in the dark.

Calculating your Reciprocal Rank Fusion score with an SQL script

If you are an SQL geek like me, you can play with this script, which I came across on the web on the same topic:

-- Full-text search, using ParadeDB's BM25 score (pdb.score, @@@ operator) for ranking
WITH fulltext AS (
  SELECT id, RANK() OVER (ORDER BY score DESC) AS rank
  FROM (
    SELECT id, pdb.score(id) AS score
    FROM mock_items
    WHERE description @@@ 'keyboard'
    ORDER BY pdb.score(id) DESC
    LIMIT 20
  ) AS bm25_hits  -- subquery alias, required by PostgreSQL before version 16
),
-- Semantic search, using pgvector and cosine distance for ranking
semantic AS (
  SELECT
    id,
    RANK() OVER (ORDER BY embedding <=> '[1,2,3]') AS rank
  FROM mock_items
  ORDER BY embedding <=> '[1,2,3]'
  LIMIT 20
),

-- Calculate RRF contributions from each ranker (k = 60)
rrf AS (
  SELECT id, 1.0 / (60 + rank) AS s FROM fulltext
  UNION ALL
  SELECT id, 1.0 / (60 + rank) AS s FROM semantic
)

-- Sum the RRF scores, order by them, and join back the original data
SELECT
  m.id,
  sum(s) AS rrf_score,
  m.description
FROM rrf
JOIN mock_items AS m USING (id)
GROUP BY m.id, m.description
ORDER BY rrf_score DESC
LIMIT 5;

Platforms with built-in RRF:

Several services offer RRF natively, with no coding required. Azure AI Search integrates RRF for hybrid search, as do OpenSearch, Elasticsearch and MariaDB. These platforms calculate the scores automatically during hybrid queries.

Google Sheets / Excel:

If you are an SEO, you are probably more of an Excel or Sheets person. Don't worry: you can build a simple calculation table with the RRF formula =1/(60+rank), where 60 is the standard constant k. For each document, add up the scores from the different ranking lists.

Then sort by score in descending order to obtain the merged ranking.
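
For those who would rather script the spreadsheet approach, here is a rough Python equivalent. It assumes a hypothetical CSV export with one row per (ranking list, document, rank). The column names `list`, `document` and `rank` and the sample pages are invented for the example. It is a sketch of the same =1/(60+rank) calculation, nothing more.

```python
import csv
import io
from collections import defaultdict

# Hypothetical spreadsheet export: one row per (list, document, rank).
SHEET = """list,document,rank
keywords,page_a,1
keywords,page_b,2
semantic,page_b,1
semantic,page_c,2
"""

K = 60  # standard constant k

scores = defaultdict(float)
for row in csv.DictReader(io.StringIO(SHEET)):
    # Same formula as the spreadsheet cell: =1/(60+rank)
    scores[row["document"]] += 1.0 / (K + int(row["rank"]))

# Sort by score in descending order to get the merged ranking.
merged = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for document, score in merged:
    print(f"{document}\t{score:.4f}")
```

To use it on real data, replace `io.StringIO(SHEET)` with an open file handle on your own export.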

Conclusion

Reciprocal Rank Fusion, combined with the Query Fan-Out of GPT-type models, is radically changing the way content is understood, evaluated, and assembled in modern search engines. You no longer optimize for an isolated query: you optimize for an ecosystem of micro-queries generated in cascade.

By building real topical authority, structuring solid semantic cocoons, enriching your lexical density and adopting a multimodal presence, you make your content not only readable, but interpretable by these hybrid systems.

The RRF + Fan-Out pairing does not reward superficial texts: it favors deep, consistent content that is rich in signals.

If you understand this logic, you create a lasting advantage in the SERPs as well as in AI environments.

If you want ChatGPT to cite your content:

Optimize for keywords (classic search)
Optimize for semantic intent (clear structure, direct answers)
Boost your reliability (backlinks from authoritative sites, recognized expertise)
Work on the authority (topical authority) of your site

Just one of these areas is not enough. RRF favors the sources that perform on several fronts.

That is the impact of Reciprocal Rank Fusion in the context of ChatGPT/SearchGPT and GEO: a system that merges multiple search engines to surface the best possible sources. Simple, robust, efficient.

SEO / GEO consultant

Aslane SAMAI