RAG for LLMs explained in 3 minutes

Apr 14, 2024
Large language models and AI assistants are taking the consumer world by storm, but what happens when you try to bring these large language models to business in the enterprise? Well, we have three problems to overcome, so let's analyze them. Problem number one is a lack of domain knowledge. Remember that these large language models have been trained on publicly available data sets, which means they do not have access to your operations, your standard operating procedures, your own IP, or your own logs, so they really can't answer a lot of questions and tailor the answers to your particular business, and you lose a lot of performance and effectiveness. Problem number two has to do with hallucinations.
These models will give you answers that look really credible but are way off, and if you run with them, you might have a problem. Then number three, which is becoming a little less of an issue with search, is that the training data has a cutoff, so for a while a model could be missing months of training data because it hadn't been updated, partly because it takes a lot of compute to train these models. So you have these three problems that prevent you from getting much performance out of your LLMs as you bring them in-house. Let's talk about a pattern that has emerged as particularly useful here, and that is retrieval-augmented generation.

You may have heard this term, but first let's talk about what's happening here; let me give you some context. This is what happens when you send a message to your standard AI assistant: your message goes into the AI assistant, it generates a response, and then it returns that response directly to you. In a RAG implementation, you're adding an extra step. Before that message goes into the AI assistant, we have a search that hits a corpus of data. This will be your data: your own documents and other relevant information that you want to make available to the AI assistant.
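To make that extra step concrete, here is a minimal sketch of a retrieval function in Python. The corpus contents, the keyword-overlap scoring, and the `retrieve` helper are all hypothetical stand-ins for illustration; a real deployment would typically use a search index or a vector store rather than keyword matching.

```python
# Toy corpus of "your own documents" that the assistant should draw on.
corpus = [
    "SOP-12: refunds over $500 require manager approval.",
    "The on-call rotation hands off every Monday at 09:00 UTC.",
    "Deployment logs are retained for 90 days in the ops bucket.",
]

def score(message: str, document: str) -> int:
    """Count how many words from the message appear in the document."""
    words = set(message.lower().split())
    return sum(1 for word in words if word in document.lower())

def retrieve(message: str, top_k: int = 2) -> list[str]:
    """Return the top_k documents most relevant to the message."""
    ranked = sorted(corpus, key=lambda doc: score(message, doc), reverse=True)
    return ranked[:top_k]

print(retrieve("Who approves refunds over $500?"))
```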
A retrieval will be performed and that context will be added on top of your original message, so the large language model will receive your message along with any relevant information that was found during this process, and then everything proceeds the same way: the AI assistant processes it and generates what is usually a better answer for you as a user. The retrieval here is the retrieval function where we fetch the information, the augmented part is that we are augmenting the original message with that context, and the generation is that the LLM generates the response the same way it normally would.
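Putting retrieval, augmentation, and generation together, a RAG pipeline can be sketched as below. This reuses the hypothetical `retrieve` helper from the sketch above, `generate` is a placeholder for whatever LLM API you actually call, and the prompt template is just one common way to do the augmentation.

```python
def generate(prompt: str) -> str:
    """Placeholder for a call to your LLM provider; swap in a real API call."""
    return "[model response grounded in: " + prompt.splitlines()[0] + "]"

def rag_answer(message: str) -> str:
    # Retrieval: fetch relevant documents from your own corpus.
    context = retrieve(message)
    # Augmentation: add the retrieved context on top of the original message.
    prompt = (
        "Answer using only the context below.\n\n"
        "Context:\n" + "\n".join(context) + "\n\n"
        "Question: " + message
    )
    # Generation: the model answers as usual, now with your data in view.
    return generate(prompt)

print(rag_answer("Who approves refunds over $500?"))
```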
It turns out that this is a solid and efficient way to address these problems that we are seeing with LLMs in industry and business, so I hope this helps explain the RAG framework. If you have any questions or comments, or if I missed anything, please leave them below. If you are a practitioner and would like to add something to this conversation, please leave that below as well; there are many people who will see this, and I am sure they will benefit from your experience and knowledge. Thank you, and we will talk to you soon.
