Productionizing RAGs: Challenges

Sivasathivel Kandasamy
4 min read · Jul 19, 2024

In a very short span, Generative AI has sparked widespread interest across industries. Many organizations are well on their way to adopting Retrieval Augmented Generation (RAG) in their portfolios. At the same time, they are discovering the many challenges of productionizing RAG systems.

In this blog, I try to address some of the challenges of deploying RAG solutions in production, drawing on my experience and some common solutions.

Pain Points in RAG

The figure above shows a naive implementation of a RAG system to illustrate the common pain points. The challenges of building a robust RAG system increase with its complexity.

  1. Unclear Query: While building toy applications or proofs of concept, it is common to provide a clear input to the system, for example “Do you serve coffee” or “What are the different coffee options”. In practice, however, the user may just write “Coffee”. In that case, the user might be asking whether coffee is served, what the coffee options are, or both. The most common reaction to this problem is to build a chat-bot on top of RAG. While building a chat-bot does make sense, it adds other complications. A better approach would be to expand and augment the query.
  2. Top-K & Missed Top Chunks: Retrieving the top-k results for…
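The query expansion mentioned under "Unclear Query" can be sketched as follows. This is a minimal illustration, not a production approach: the function name `expand_query`, the `INTENT_TEMPLATES` list, and the `min_words` threshold are all hypothetical, and a real system would more likely ask an LLM to generate the expanded queries than rely on fixed templates.

```python
from typing import List

# Hypothetical templates taken from the coffee example in the text.
# Assumption: in practice these candidates would likely come from an LLM.
INTENT_TEMPLATES = [
    "Do you serve {term}?",
    "What are the different {term} options?",
]


def expand_query(query: str, min_words: int = 3) -> List[str]:
    """Expand a terse query into several more explicit candidate queries.

    Queries with at least `min_words` words are treated as clear enough
    and returned unchanged (an arbitrary heuristic for this sketch).
    """
    cleaned = query.strip()
    if not cleaned:
        return []
    if len(cleaned.split()) >= min_words:
        return [cleaned]
    term = cleaned.lower()
    return [template.format(term=term) for template in INTENT_TEMPLATES]


if __name__ == "__main__":
    for candidate in expand_query("Coffee"):
        print(candidate)
```

Each expanded query could then be sent to the retriever separately and the results merged, so that both possible intents behind "Coffee" are covered.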
