2024-05-03 15:47:57
RAG stands for “Retrieval Augmented Generation” and refers to a technique used to improve the performance of large language models (LLMs) in certain tasks by incorporating information retrieval capabilities.
The main idea behind RAG is to combine the powerful generation capabilities of LLMs with the ability to retrieve relevant information from external data sources such as websites, databases or document collections.
Here’s a high-level overview of how RAG works:
Retrieval: Given an input query or context, a retrieval module (often based on techniques such as TF-IDF or dense vector embeddings) identifies and fetches relevant documents or passages from external data sources.
Augmentation: The retrieved information is then concatenated with the original input to create an augmented input context.
Generation: This augmented context is fed into a standard language model, which then generates an output response that draws on both the original input and the retrieved external knowledge.
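The three steps above can be sketched as a toy pipeline. This is a minimal illustration, not a production implementation: the document collection, the simplified TF-IDF scoring helper, and the mock generation step (which just returns the augmented prompt instead of calling an LLM) are all assumptions for the sake of the example.

```python
import math
from collections import Counter

# A toy document collection standing in for an external knowledge source.
# (Illustrative data, not from the original post.)
DOCS = [
    "RAG combines retrieval with text generation.",
    "TF-IDF scores terms by frequency and rarity.",
    "Dense embeddings map text to vectors for similarity search.",
]

def tf_idf_scores(query, docs):
    """Score each document against the query with a simplified TF-IDF overlap."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(docs)

    def idf(term):
        # Smoothed inverse document frequency for a query term.
        df = sum(term in doc for doc in tokenized)
        return math.log((n + 1) / (df + 1)) + 1

    q_terms = query.lower().split()
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        scores.append(sum(tf[t] * idf(t) for t in q_terms))
    return scores

def rag_answer(query, docs, top_k=1):
    # 1) Retrieval: pick the best-matching document(s).
    scores = tf_idf_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    retrieved = [docs[i] for i in ranked[:top_k]]
    # 2) Augmentation: concatenate retrieved text with the original query.
    prompt = "Context:\n" + "\n".join(retrieved) + f"\n\nQuestion: {query}\nAnswer:"
    # 3) Generation: in practice this prompt would be sent to an LLM;
    #    here we simply return it to show what the model would see.
    return prompt

print(rag_answer("What does TF-IDF score?", DOCS))
```

In a real system the retrieval step would typically use a vector index over dense embeddings, and the final prompt would be passed to an LLM API rather than printed.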
The retrieval step allows the LLM to go beyond its pre-trained knowledge and incorporate up-to-date, task-specific information. This can improve performance on tasks that require external knowledge not covered by the LLM's original training data.
RAG models have shown promising results on tasks such as open-domain question answering, fact checking, and factual knowledge probing, compared to standard LLMs without retrieval augmentation.
David Matos