RAG (Retrieval-Augmented Generation) is a popular technique for enhancing the output of language models. RAG consists of two parts:
- "R" (Retrieval) – search for relevant context. Substrate comes with a built-in vector store that's performant and cost-effective.
- "AG" (Augmented Generation) – enrich a prompt with search results to "augment" the output.
In this guide, we'll show you how to search Hacker News comments for a topic, extract sentiment, and generate a research summary using Substrate.
This concise RAG implementation runs dozens of LLM calls in parallel and streams the markdown result.
First, we search Hacker News comments using the Algolia HN Search API.
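As a minimal sketch, the Algolia HN Search API can be queried over plain HTTPS: the `query` parameter holds the search terms and `tags=comment` restricts results to comments. The helper below only builds the request URL; the endpoint and parameter names come from Algolia's public HN API.

```python
from urllib.parse import urlencode

# Algolia's public Hacker News Search API endpoint.
HN_SEARCH_URL = "https://hn.algolia.com/api/v1/search"

def build_comment_search_url(topic: str, hits_per_page: int = 30) -> str:
    """Build a search URL for HN comments matching a topic."""
    params = urlencode({
        "query": topic,
        "tags": "comment",          # restrict results to comments
        "hitsPerPage": hits_per_page,
    })
    return f"{HN_SEARCH_URL}?{params}"

url = build_comment_search_url("rust")
# Fetching this URL (e.g. with urllib.request) returns JSON with a "hits"
# array; each hit includes fields like "comment_text" and "author".
```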
Next, we use ComputeJSON to extract a summary, sentiment, and other metadata from each comment. In TypeScript, we use zod and zod-to-json-schema to create the JSON schema. In Python, we use Pydantic.
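The extraction step is driven by a JSON schema. As an illustration, here is a plain-dict schema of the shape a ComputeJSON node would consume; the specific field names (`summary`, `sentiment`, `topics`) are our assumptions for this example, and a Pydantic model with the same fields would produce an equivalent schema via `model_json_schema()`.

```python
# Illustrative JSON schema for per-comment extraction. The field names are
# assumptions for this example, not Substrate requirements.
comment_schema = {
    "type": "object",
    "properties": {
        "summary": {
            "type": "string",
            "description": "One-sentence summary of the comment",
        },
        "sentiment": {
            "type": "string",
            "enum": ["positive", "neutral", "negative"],
        },
        "topics": {
            "type": "array",
            "items": {"type": "string"},
        },
    },
    "required": ["summary", "sentiment"],
}
```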
Finally, we use ComputeText to generate a markdown summary of all the extracted JSON, and stream the results of the markdown node.
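To make the final step concrete, here is a hedged sketch of how the extracted JSON could be assembled into the prompt for the summary node. In an actual Substrate program the extraction results flow in as node references rather than literal strings; this stands in for that wiring.

```python
import json

def build_summary_prompt(extracted: list) -> str:
    """Combine per-comment JSON extractions into one summary prompt.

    Stand-in for feeding ComputeJSON outputs into a ComputeText node.
    """
    blobs = "\n".join(json.dumps(item) for item in extracted)
    return (
        "Write a markdown research summary of community sentiment "
        "based on these analyzed comments:\n" + blobs
    )

prompt = build_summary_prompt([
    {"summary": "Praises the type system", "sentiment": "positive"},
    {"summary": "Complains about compile times", "sentiment": "negative"},
])
```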
The code we wrote was simple, but implicitly it was creating the graph below – and we didn't have to think about the graph at all. With Substrate, simply relating tasks to each other gives us automatic parallelization of dozens of LLM calls for free, with zero roundtrips.
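As an analogy for the scheduling Substrate infers from the graph, here is the same fan-out/fan-in pattern written by hand with `asyncio`: one extraction task per comment runs concurrently, and the summary task waits on all of them. The function bodies are stand-ins, not Substrate API calls.

```python
import asyncio

async def extract(comment: str) -> dict:
    # Stand-in for a per-comment ComputeJSON node; in Substrate each of
    # these would be an LLM call, scheduled in parallel automatically.
    await asyncio.sleep(0)
    return {"summary": comment[:20], "sentiment": "neutral"}

async def summarize(extractions: list) -> str:
    # Stand-in for the final ComputeText node, which depends on every
    # extraction having completed.
    return f"Summarized {len(extractions)} comments"

async def main() -> str:
    comments = [f"comment {i}" for i in range(30)]
    # Fan-out: all extractions run concurrently.
    extractions = await asyncio.gather(*(extract(c) for c in comments))
    # Fan-in: the summary depends on all extraction results.
    return await summarize(list(extractions))

result = asyncio.run(main())
# result == "Summarized 30 comments"
```

With Substrate, this dependency graph is derived from how nodes reference each other, so the fan-out/fan-in orchestration above never has to be written explicitly.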
To try this example yourself and fork it, check out our runnable example on val.town.