How LlamaIndex is ushering in the future of RAG for businesses

Retrieval Augmented Generation (RAG) is an important technique that leverages external knowledge bases to improve the quality of large language model (LLM) outputs and provides transparency into model sources that can be cross-checked by humans.
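
To make the idea concrete, here is a minimal sketch of the retrieve-then-prompt pattern behind basic RAG. It is illustrative only and does not use LlamaIndex's API: the in-memory knowledge base, the file names, and the retrieve and build_prompt helpers are all hypothetical, and simple word overlap stands in for the embedding similarity a real system would likely use.

```python
import re
from collections import Counter

# Hypothetical in-memory knowledge base; a real system would use a vector store.
KNOWLEDGE_BASE = {
    "policy.txt": "Employees may work remotely up to three days per week.",
    "benefits.txt": "The company matches retirement contributions up to five percent.",
    "security.txt": "Laptops must use full disk encryption and a password manager.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, k=1):
    """Return up to k (source, text) pairs ranked by word overlap with the query.

    Word overlap is an assumption made for brevity; it stands in for
    embedding-based similarity search.
    """
    query_counts = Counter(tokenize(query))
    scored = []
    for source, text in KNOWLEDGE_BASE.items():
        overlap = sum((query_counts & Counter(tokenize(text))).values())
        scored.append((overlap, source, text))
    # Highest overlap first; ties broken by source name for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(source, text) for overlap, source, text in scored[:k] if overlap > 0]


def build_prompt(query, passages):
    """Attach retrieved passages, labeled by source, so answers can be cross-checked."""
    context = "\n".join(f"[{source}] {text}" for source, text in passages)
    return (
        "Answer using only the context below and cite sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


passages = retrieve("How many days can employees work remotely?")
prompt = build_prompt("How many days can employees work remotely?", passages)
```

The resulting prompt would then be sent to an LLM. Because each passage carries its source label, a human can check the model's answer against the original document.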

However, according to Jerry Liu, co-founder and CEO of LlamaIndex, basic RAG systems can have primitive interfaces and poor understanding and planning, lack function calls or tool usage, and are stateless (with no storage). Data silos only exacerbate this problem. Liu spoke yesterday during VB Transform in San Francisco.

This can make it difficult to bring LLM apps to production at scale, owing to issues with accuracy, scaling and too many required parameters (which demand deep technical expertise to tune).

As a result, there are many questions that basic RAG simply cannot answer.

“RAG was really just the beginning,” Liu said on stage at VB Transform this week. Many of the core concepts of naive RAG, he added, were “kind of stupid” and led to “very suboptimal decisions.”

LlamaIndex aims to overcome these challenges by providing a platform that helps developers quickly and easily build next-generation LLM-based apps. The framework offers data extraction, which transforms unstructured and semi-structured data into unified, programmatically accessible formats; RAG, which answers queries on internal data via question-answering systems and chatbots; and autonomous agents, Liu explained.

Synchronize data so it is always up to date

It is crucial to bring together all the different types of data within an organization, whether unstructured or structured, Liu noted. Multi-agent systems can then tap into “the wealth of heterogeneous data” that exists within companies.

“Any LLM application is only as good as your data,” Liu said. “If your data quality is not good, you will not get good results.”

LlamaCloud, now available via a waitlist, offers advanced ETL (extract, transform, load) capabilities, allowing developers to “synchronize data over time so it’s always up to date,” Liu explained. “When you ask a question, you’re guaranteed to have the relevant context, no matter how complex or challenging the question is.”

LlamaIndex’s interface can handle both simple and complex questions as well as sophisticated research tasks, and the results can include short answers, structured results or even research reports, he said.

The company’s LlamaParse is an advanced document parser specifically aimed at reducing LLM hallucinations. Liu said the tool has 500,000 downloads a month and 14,000 unique users, and has processed more than 13 million pages.

“LlamaParse is currently the best technology I’ve seen for parsing complex document structures for enterprise RAG pipelines,” said Dean Barr, head of applied AI at global investment firm The Carlyle Group. “The ability to maintain nested tables and extract sophisticated spatial layouts and images is key to maintaining data integrity when building advanced RAG and agent-based models.”

Liu explained that LlamaIndex’s platform has been used for financial analyst assistance, centralized web search, analytics dashboards for sensor data and internal LLM application development platforms, in industries including technology, consulting, financial services and healthcare.

From simple agents to advanced multi-agents

Importantly, LlamaIndex is built on agent-based thinking to enable better query understanding, planning and tool usage across different data interfaces, Liu explained. It also includes multiple agents that provide specialization and parallelization, helping to optimize costs and reduce latency.

The problem with single-agent systems is that “the more you try to cram into them, the less reliable they become, even if the overall theoretical complexity is higher,” Liu said. In addition, a single agent cannot handle an unlimited number of tasks. “If you try to give an agent 10,000 tools, it’s not going to perform very well.”

Multi-agent systems let each agent specialize in a specific task, he explained. This brings system-level benefits such as parallelization, lower costs and reduced latency.

“The idea is that through collaboration and communication, you can solve even more challenging tasks,” Liu said.
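
As a rough illustration of specialization and parallelization, the sketch below routes tasks to small specialized agents and runs them concurrently. It is not LlamaIndex's implementation: the agent functions, the AGENTS registry and the orchestrate helper are hypothetical names, and a short sleep stands in for a real LLM or tool call.

```python
import asyncio


async def finance_agent(task):
    """Hypothetical specialist; the sleep stands in for an LLM or tool call."""
    await asyncio.sleep(0.01)
    return f"finance: handled '{task}'"


async def search_agent(task):
    """Hypothetical specialist for web search tasks."""
    await asyncio.sleep(0.01)
    return f"search: handled '{task}'"


# Each agent owns one narrow specialty instead of one agent holding every tool.
AGENTS = {"finance": finance_agent, "search": search_agent}


async def orchestrate(tasks):
    """Dispatch (specialty, task) pairs concurrently; results keep input order."""
    coroutines = [AGENTS[specialty](task) for specialty, task in tasks]
    return await asyncio.gather(*coroutines)


def run(tasks):
    """Synchronous entry point for scripts."""
    return asyncio.run(orchestrate(tasks))


results = run([
    ("finance", "summarize Q2 revenue"),
    ("search", "find competitor news"),
])
```

Because the agents run concurrently, total wall-clock time is close to that of the slowest agent rather than the sum of all of them, which is where the latency benefit comes from.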

By Aurora