Creating a Financial Chatbot in Minutes Using Gradient Accelerator Blocks

Feb 22, 2024

Gradient Team

A few months ago, we demonstrated in a tutorial how you can develop a financial chatbot using Gradient, LlamaIndex, and MongoDB. Today we'll walk you through how you can create that very same financial chatbot 10x faster using Gradient's Accelerator Blocks.

Accelerating Development by 10x

Back in November we partnered with MongoDB and LlamaIndex to showcase how businesses could create powerful, cost-effective solutions, like a financial chatbot powered by best-of-breed technologies. In our step-by-step tutorial, we walked through each component of the chatbot, leveraging:

  1. Gradient for private, state-of-the-art LLMs and embeddings.

  2. LlamaIndex for orchestration and use of their advanced RAG framework.

  3. MongoDB Atlas for storing, indexing, and retrieving high-dimensional vector data.

  4. Streamlit for the user interface.

Today we’re diving into how Gradient Accelerator Blocks can be used to replicate this same development process 10x faster, without sacrificing quality or performance.

Recognizing the Development Gap

While we’ve had an overwhelming amount of positive feedback on our tutorial, many teams we’ve encountered are still struggling with the following challenges around AI implementation:

  1. Resourcing: Lack of expertise in AI development and implementation.

  2. AI Knowledge: Lack of technical depth across each component required for the build.

  3. Time: Lack of time to incorporate each component into the chatbot.

An Easier and Faster Way to Develop AI

With the launch of Gradient Accelerator Blocks, we’ve made it even easier for developers and businesses to build AI. The time required to develop an application like a financial chatbot has now been reduced by more than 10x. Accelerator Blocks are comprehensive building blocks designed to accelerate AI development through a low-code, frictionless experience. Gradient offers a range of flavors including:

  1. LLM Development Blocks: Sophisticated optimization tools designed to simplify fine-tuning, RAG, and embeddings.

  2. Task Specific Blocks: Comprehensive building blocks used to accelerate task-specific use cases, powered by custom Gradient LLMs (e.g. document summarization, entity extraction, etc.).

  3. Domain Specific Blocks: Domain-specific AI, built to enable industry solutions (e.g. Albatross LLM, etc.)

Each category is designed to give developers and businesses the option to customize, or the choice to leave the details and complexities under the hood to Gradient. Data security and compliance are always top of mind: users can build within their preferred environment while maintaining the highest standards (e.g. SOC 2 Type 2, GDPR, etc.).

The Before and After

Initially, our financial chatbot required five major steps to bridge together four different technologies. This produced a sophisticated chatbot that leveraged a state-of-the-art LLM, optimized with retrieval augmented generation (RAG) to help reduce hallucinations. However, this process also meant allocating time for setup and for familiarizing yourself with each technology in order to bridge the necessary components together. You can find a full breakdown of our original tutorial here.

Using our Accelerator Blocks, you’ll be able to cut this process down to three simple steps while Gradient takes care of all the heavy lifting behind the scenes. This removes any need for infrastructure, setup, or in-depth knowledge of the technologies being used or how to bridge them together.

To do this, you’ll be leveraging two separate Accelerator Blocks: one sets up RAG, and the other enables your application to answer questions from documents stored in RAG or provided via text. Since each Accelerator Block is fully managed, all that’s left is to drag and drop your data into the UI and access your finished model via an API when you’re ready.

Step-by-Step Tutorial

Step 1: Set up RAG in Seconds Using Gradient’s Accelerator Block for RAG

Gradient has made it extremely easy to set up RAG. Simply drag and drop your documents into Gradient’s UI, and Gradient will generate a “RAG Collection ID” for you that can be used as a reference. With Gradient’s Accelerator Block for RAG, you get:

  • The Fastest Way to Production RAG: Gradient speeds up your development process by 10x by removing the need for infrastructure, setup, or in-depth AI knowledge.

  • Best-of-Breed Technologies: When you use Gradient’s Accelerator Block for RAG, you won’t be compromising on quality. Our RAG service is powered by best-of-breed technologies including: Gradient for state-of-the-art LLMs and embeddings, MongoDB Atlas for indexing high-dimensional vector data, and LlamaIndex for the use of their advanced RAG framework.

  • Optimized RAG Performance: We’ve already done the work for you to enhance your RAG performance and accuracy. Our Accelerator Block for RAG incorporates best practices and methods for optimization including the use of an optimal chunking strategy, rerankers, and advanced retrieval strategies.
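The chunking mentioned above is handled entirely by the block, but to illustrate the general idea, here is a minimal, purely illustrative sketch of one such technique, fixed-size chunking with overlap. This is not Gradient's actual strategy (which is managed internally); the function name and parameters are ours:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks.

    Overlap keeps sentences that straddle a chunk boundary
    retrievable from both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# 450 characters -> 3 chunks, with consecutive chunks sharing 50 characters.
text = "".join(str(i % 10) for i in range(450))
chunks = chunk_text(text, chunk_size=200, overlap=50)
```

A managed service layers rerankers and retrieval strategies on top of this kind of splitting, which is exactly the setup work the Accelerator Block removes.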


Step 2: Connect Your RAG Collection to Your Accelerator Block for Q&A

Now that you’ve set up your RAG collection via the Accelerator Block for RAG, you can connect it to Gradient’s Accelerator Block for Q&A. Our Q&A block accurately responds to questions about documents stored in RAG or provided via text input. If you want to verify that everything is up to par before connecting it to your final application, you can easily test it in Gradient’s playground. You’ll also be able to see the RAG context retrieved for each query, so you can quickly debug the query live.


To do this via the Python SDK, reference the following example. You can find your RAG collection ID using the instructions here.

from gradientai import Gradient

# Credentials are read from the environment
# (GRADIENT_ACCESS_TOKEN and GRADIENT_WORKSPACE_ID).
gradient = Gradient()

# Ask a question against the RAG collection created in Step 1.
question = "What are Meta's areas of investments?"
response = gradient.answer(
    question=question,
    source={
        "type": "rag",
        "collection_id": "YOUR_RAG_COLLECTION_ID",
    },
)

Step 3: Chat Over the Documents Using Streamlit

You can easily use the Gradient SDK in your Streamlit app and couple it with Streamlit’s built-in chat components to create your financial chatbot. After deploying the Streamlit app, you can start using the chat interface to ask questions about the financial reports from Meta, Alphabet, and Netflix.


Here is an example of using the Gradient Q&A Accelerator Block in your Streamlit code. Note that more Streamlit integration is necessary for the full chatbot interface to function.

import streamlit as st
from gradientai import Gradient
gradient = Gradient()
# Additional Streamlit logic necessary to handle full chat interface.
with st.chat_message("assistant"):
    with st.spinner("Thinking..."):
        # Using Gradient Accelerator Block
        response = gradient.answer(
            question=prompt,
            source={
                "type": "rag",
                "collection_id": "YOUR_RAG_COLLECTION_ID",
            },
        )
        
        # Write response to Streamlit UI and chat message history
        st.write(response)
        message = {"role": "assistant", "content": response}
        st.session_state.messages.append(message)
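The chat-history bookkeeping that `st.session_state.messages` performs above can also be sketched framework-free. This is a plain-Python illustration (not part of the Gradient or Streamlit APIs) of the role/content message list the chatbot maintains, with a simple cap on history length:

```python
def append_message(messages: list[dict], role: str, content: str,
                   max_messages: int = 20) -> list[dict]:
    """Append a chat turn and keep only the most recent max_messages."""
    messages.append({"role": role, "content": content})
    return messages[-max_messages:]

# Each turn is stored as a {"role": ..., "content": ...} dict,
# mirroring what the Streamlit code appends to st.session_state.messages.
history: list[dict] = []
history = append_message(history, "user", "What are Meta's areas of investments?")
history = append_message(history, "assistant", "(model answer would go here)")
```

Capping the history keeps the UI responsive on long conversations; in Streamlit, the same list simply lives in `st.session_state` so it survives reruns.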

Supercharge Your Financial Chatbot

At Gradient, we specialize in enabling domain-specific AI solutions for every industry, including financial services. For enterprise businesses looking to increase their competitive edge, Gradient offers a domain-specific LLM for finance. The Albatross LLM has been extensively trained on all aspects of financial services (banking, investments, insurance, etc.) and consistently outperforms similar models, overcoming deficiencies that general-purpose language models face when solving domain-specific tasks in finance.

As a benchmark, a couple of weeks ago we made Alphatross, an earlier version of Gradient's Albatross model with limited capabilities, available on Hugging Face. Despite those limitations, it was the highest-performing Llama2-70B variation on the H6 benchmark and substantially outperformed similar variations on GSM8K.