Engineering
November 18, 2024

How Ragie Outperformed the FinanceBench Test — Part 2

Mohammed Rafiq, Co-Founder and CTO

In our initial FinanceBench evaluation, Ragie demonstrated its ability to ingest and process over 50,000 pages of complex, multi-modal financial documents with remarkable speed and accuracy. Thanks to our advanced multi-step ingestion process, we outperformed the benchmarks for Shared Store retrieval by 42%. 

However, the FinanceBench test revealed a key area where our RAG pipeline could be improved: Ragie performed better on text data than on tables. Tables are a critical component of real-world use cases because they contain the precise data often required to generate accurate answers. Parsing these tables efficiently while maintaining data integrity during chunking and retrieval is a complex challenge.

After analyzing patterns and optimizing our table extraction strategy, we re-ran the FinanceBench test to see how Ragie would perform. This enhancement significantly boosted Ragie’s ability to handle structured data embedded within unstructured documents.

Ragie’s New Table Extraction and Chunking Pipeline

To improve our table extraction performance, we looked at both accuracy and speed, and made significant improvements to each.

Ragie’s new table extraction pipeline now includes:

  • Models to detect table structures
  • OCR to extract header, row, and column data
  • LLM vision models to describe and create context suitable for semantic chunking
  • Specialized table chunking to prepend table headers to each chunk
  • Specialized table chunking to ensure row data is never split mid-record
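The last two steps, prepending headers and keeping rows whole, can be illustrated with a minimal sketch. This is not Ragie's implementation: the function name `chunk_table`, the pipe-delimited text format, and the character budget `max_chars` are all assumptions made for illustration only.

```python
from typing import List


def chunk_table(header: List[str], rows: List[List[str]], max_chars: int = 500) -> List[str]:
    """Split a table into text chunks (hypothetical helper, for illustration).

    Every chunk starts with the header line, and a row is always emitted
    whole: a chunk boundary can only fall between rows, never inside one.
    """
    header_line = " | ".join(header)
    chunks: List[str] = []
    current: List[str] = []          # row lines collected for the chunk in progress
    current_len = len(header_line)   # characters used so far, header included

    for row in rows:
        row_line = " | ".join(row)
        # If adding this row (plus a newline) would exceed the budget,
        # close the current chunk first. A single oversized row still gets
        # its own chunk rather than being split.
        if current and current_len + 1 + len(row_line) > max_chars:
            chunks.append("\n".join([header_line] + current))
            current = []
            current_len = len(header_line)
        current.append(row_line)
        current_len += 1 + len(row_line)

    if current:
        chunks.append("\n".join([header_line] + current))
    return chunks


# Usage with made-up data and a deliberately small budget to force several chunks.
example_chunks = chunk_table(
    header=["Year", "Revenue"],
    rows=[["2021", "100"], ["2022", "120"], ["2023", "150"]],
    max_chars=30,
)
for chunk in example_chunks:
    print(chunk)
    print("---")
```

Because each chunk carries its own header, a retrieved chunk can be read on its own: a value such as "120" stays attached to the "Revenue" column even when the rest of the table is not retrieved.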

We also increased our table extraction speed by 25%. With these performance improvements, we were able to ingest the 50,000+ PDF pages in the FinanceBench dataset in high-resolution mode in about 3 hours, compared to 4 hours in our previous test.
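For a rough sense of scale, the ingestion times above can be converted into throughput. The sketch below treats the page count as exactly 50,000 and the durations as exactly 3 and 4 hours, which are rounding assumptions; the post gives these only as approximate figures.

```python
# Back-of-the-envelope throughput from the approximate figures in this post.
pages = 50_000        # assumed exact; the post says "50,000+"
hours_before = 4.0    # previous test
hours_after = 3.0     # new pipeline; the post says "~3hrs"

pages_per_hour_before = pages / hours_before
pages_per_hour_after = pages / hours_after

# Reduction in wall-clock ingestion time.
time_reduction_pct = (hours_before - hours_after) / hours_before * 100
# Equivalent increase in pages processed per hour.
throughput_gain_pct = (pages_per_hour_after / pages_per_hour_before - 1) * 100

print(f"before: {pages_per_hour_before:,.0f} pages/hour")
print(f"after:  {pages_per_hour_after:,.0f} pages/hour")
print(f"ingestion time reduced by {time_reduction_pct:.0f}%")
print(f"throughput increased by {throughput_gain_pct:.1f}%")
```

Under these assumptions, going from 4 hours to 3 hours is a 25% reduction in end-to-end ingestion time, or roughly a one-third increase in pages processed per hour.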

Ragie’s New Performance vs. FinanceBench Benchmarks

With its improved table extraction and chunking, Ragie outperformed the benchmark by 58% on the single store test with top_k=128. On the harder shared store test, also with top_k=128, Ragie outperformed the benchmark by 137%.

Conclusion

The FinanceBench test has driven our innovations further, especially in how we process structured data like tables. These insights allow Ragie to support developers with an even more robust and scalable solution for large-scale, multi-modal datasets. If you'd like to see Ragie in action, try our Free Developer Plan.

Feel free to reach out to us at support@ragie.ai if you're interested in running the FinanceBench test yourself.