What are RAG and Agents?
Overview of RAG
Imagine you want to integrate a chatbot into your website, one that only answers questions regarding your portfolio and the details you've shared about yourself. There are several methods to achieve this goal, and Retrieval-Augmented Generation (RAG) stands out as a particularly effective approach.
In the current era, Large Language Models (LLMs) like ChatGPT (from OpenAI) and Llama (from Meta) have gained significant attention, emerging as the most trending topics. These models, which are integral to Generative AI, generate text-based content from provided instructions, assisting with tasks such as writing an email, coding, and text summarization. LLMs emerged after the release of the transformer neural network architecture.
However, the primary challenge with LLMs is their lack of real-time updates and their failure to refer to sources while generating answers, which particularly affects their reliability on fact-based questions. Here's an example where ChatGPT lacks updated information.
The groundbreaking Retrieval-Augmented Generation (RAG) paper emerged in 2020 when Patrick Lewis was completing his doctorate in Natural Language Processing (NLP) at University College London while also being employed at Meta. The team aimed to enhance the knowledge capacity of LLMs and devised a benchmark to gauge their progress. When Lewis integrated the retrieval system from another Meta team into their project, the initial outcomes surpassed expectations.
Retrieval-Augmented Generation (RAG) is a framework/pipeline designed to improve the accuracy and relevance of Large Language Models (LLMs) for specific use cases. RAG uses LLMs to generate content based on provided prompts. Rather than depending solely on the LLM's training data for content generation, the Retrieval-Augmented approach incorporates source data or context into our queries or prompts. When a user asks a question about a particular topic, the retrieval step fetches the relevant information from the provided data source, and the LLM then leverages this information to generate contextually appropriate answers.
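To make the flow concrete, here is a minimal sketch of the retrieval and augmentation steps. It assumes the sentence-transformers package for embeddings; the example chunks, the `retrieve` and `build_prompt` helpers, and the prompt wording are illustrative rather than part of any specific RAG library.

```python
# A minimal RAG sketch: embed the knowledge base, retrieve the most relevant
# chunks for a query, and build an augmented prompt for an LLM.
from sentence_transformers import SentenceTransformer
import numpy as np

# Knowledge base: small text chunks taken from the data source (e.g. your portfolio).
chunks = [
    "I built a weather dashboard in React that visualizes live forecast data.",
    "My master's thesis evaluated retrieval-augmented generation for legal QA.",
    "I have five years of experience writing backend services in Go.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, general-purpose embedding model
chunk_vectors = model.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (cosine similarity)."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vectors @ q
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

def build_prompt(query: str) -> str:
    """Augment the user question with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query))
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# The resulting prompt is sent to any LLM of your choice for the generation step.
print(build_prompt("What did you study in your thesis?"))
```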
Therefore, the issue of LLM hallucination, which originates from excessive dependence on its training data, is mitigated by instructing the LLM to pay attention to the source data before responding to each question. This approach also enhances the LLM's confidence in its answers, enabling it to acknowledge when it lacks relevant context and reply with "I don't know" instead of providing incorrect answers to the user.
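One common way to encourage this behaviour is a grounding prompt that forbids answering outside the retrieved context. The sketch below is illustrative; the exact system-prompt wording and the `grounded_messages` helper are assumptions, not a prescribed API.

```python
# A system prompt that keeps the model grounded in the retrieved context and
# gives it an explicit fallback when the answer is not in that context.
SYSTEM_PROMPT = (
    "You are an assistant that answers questions about the user's portfolio. "
    "Use ONLY the provided context. If the context does not contain the answer, "
    "reply exactly with: I don't know."
)

def grounded_messages(context: str, question: str) -> list[dict]:
    """Build a chat-style message list that pairs the retrieved context with the question."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
```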
What are the applications of the RAG pipeline/framework?
- Question answering: RAG performs extremely well in question-answering tasks that require factual information and verification. By paying attention to the data provided by the user, RAG frameworks can generate answers that are factually correct.
- Document summarization: When summarizing long documents, RAG can pull key pieces of information from the document, producing short yet comprehensive summaries.
- Language translation: By retrieving contextually relevant information, RAG can improve the quality of language translation, especially in cases involving specialized vocabulary.
- Speech-to-text conversion: Using audio-to-text conversion models, the RAG pipeline can extract details from transcribed audio content and produce correct responses aligned with the query asked.
Introduction to Agents
LLMs have capabilities beyond simply predicting the next word, and their use cases extend beyond question answering or chatting. LLMs can be tasked with complex activities like booking a flight ticket: they can make a plan by breaking the task into steps or sub-tasks, execute each step, monitor the outcomes, reason through successes or failures, and adapt the plan accordingly. They can also adjust their actions based on feedback. Such systems are known as Autonomous Agents.
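The plan-act-observe loop behind such an agent can be sketched in a few lines. In this illustration the tools are stubs and the `call_llm` step is scripted; a real agent would query an LLM at each turn to decide the next action.

```python
# A bare-bones autonomous-agent loop: plan, act, observe, adapt. The tools are
# stubs and call_llm is scripted here; a real agent would ask an LLM each turn.
from typing import Callable

def search_flights(origin: str, dest: str) -> str:
    return f"Found flight LH123 from {origin} to {dest}."   # stubbed tool

def book_flight(flight_id: str) -> str:
    return f"Flight {flight_id} booked."                     # stubbed tool

TOOLS: dict[str, Callable[..., str]] = {
    "search_flights": search_flights,
    "book_flight": book_flight,
}

_scripted = iter(["search_flights Delhi,Paris", "book_flight LH123", "FINISH"])

def call_llm(prompt: str) -> str:
    """Stand-in for the reasoning step: a real agent asks an LLM for the next action."""
    return next(_scripted)

def run_agent(goal: str, max_steps: int = 5) -> list[str]:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        # Reason: decide the next sub-task from the goal and the observations so far.
        decision = call_llm("\n".join(history) + "\nNext action?")
        if decision == "FINISH":
            break
        tool_name, _, args = decision.partition(" ")
        # Act, then observe: the result is fed back so the plan can adapt.
        observation = TOOLS[tool_name](*args.split(","))
        history.append(f"Action: {decision} -> Observation: {observation}")
    return history

print("\n".join(run_agent("Book a flight from Delhi to Paris")))
```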
These intelligent systems can think and act independently and are designed to execute specific tasks without constant human supervision. They use reasoning, which we can reinforce with prompts and instructions. Virtual assistants like Siri and Alexa are also types of agents that we control through voice commands.
What are the applications of AI Agents?
- Self-driving cars - AI agents help in real-time path planning and navigation, ensuring efficient and safe routes by analyzing traffic data, road conditions, and GPS signals. The agent can then automatically navigate the route and reach the destination.
- Virtual assistants - AI virtual assistants help manage calendars, set reminders, and schedule meetings, making personal and professional life more organized. They can also sense their environment to determine whether a task has been completed and manage the calendar accordingly.
- Healthcare, Finance, and Education - AI agents can analyze medical data and patient history to assist doctors in diagnosing diseases and suggesting treatment plans. They interpret medical images like X-rays, MRIs, and CT scans to detect abnormalities and assist radiologists in their assessments.
What are the potential drawbacks of RAG?
- By limiting answers to the information provided (which is finite and not as large as the internet), we restrict the capability of the LLM.
- This also calls for caution when providing the knowledge base to the RAG components: the LLM assumes the provided information is true and does not cross-verify it against potentially correct information available on the internet.
- Fine-tuning and Retrieval-Augmented Generation (RAG) have entirely different roles in language processing. However, in some cases, fine-tuning offers superior accuracy and consistency because the model grasps nuances better, and it also retains prior knowledge.
In the upcoming modules, we will explore RAG pipelines in more detail and examine their mechanics, while pointing out the fundamental differences between using a RAG pipeline and fine-tuning an LLM.