Substrate comes with built-in vector storage, which you can use to store and query generated embeddings. In this example, we'll embed a set of common phrases used to enhance image generation prompts. Then, we'll query the vector store with a given prompt and recommend the most similar phrases as enhancements.
First, we'll create a new vector store using FindOrCreateVectorStore, providing a name for the collection and the model we'll use to embed our data.
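As a sketch, assuming the Python SDK (the API key placeholder, the collection name, and the exact parameter names are assumptions, not confirmed by this page):

```python
# Sketch assuming the Substrate Python SDK; requires an API key to run.
from substrate import Substrate, FindOrCreateVectorStore

substrate = Substrate(api_key="YOUR_API_KEY")

# Find or create a named collection whose entries are embedded with jina-v2.
vector_store = FindOrCreateVectorStore(
    collection_name="image-prompt-enhancements",  # assumed name
    model="jina-v2",
)
```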
Next, we'll embed the enhancement phrases using EmbedText, providing the text to embed, the name of the collection, and the embedding model. We'll create an array of embedding nodes, and Substrate will automatically run the nodes in parallel.
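Continuing the sketch under the same assumptions (SDK parameter names and the collection name are assumed), the array of embedding nodes might look like this; Substrate runs independent nodes in parallel:

```python
# Sketch assuming the Substrate Python SDK; requires an API key to run.
from substrate import Substrate, EmbedText

substrate = Substrate(api_key="YOUR_API_KEY")

# Enhancement phrases drawn from the example output below.
phrases = ["cell shaded cartoon", "cinematic", "wide shot"]

# One EmbedText node per phrase, all stored in the same collection.
embeddings = [
    EmbedText(
        text=phrase,
        collection_name="image-prompt-enhancements",  # assumed name
        model="jina-v2",
    )
    for phrase in phrases
]
```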
Finally, we'll query the vector store with a given prompt using QueryVectorStore, providing the query string, the collection name, and the embedding model.
- We'll set `include_metadata` to `True` to include metadata in the response, as the metadata includes the embedded text in the `doc` field.
- We'll set `top_k` to 3 to retrieve only the top 3 most similar results.
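Putting the query step together, a minimal sketch (again assuming the Python SDK; the prompt and collection name are made up for illustration):

```python
# Sketch assuming the Substrate Python SDK; requires an API key to run.
from substrate import Substrate, QueryVectorStore

substrate = Substrate(api_key="YOUR_API_KEY")

query = QueryVectorStore(
    collection_name="image-prompt-enhancements",  # assumed name
    model="jina-v2",
    query_strings=["an illustration of a city street"],  # hypothetical prompt
    include_metadata=True,  # metadata carries the embedded text in "doc"
    top_k=3,                # only the 3 most similar phrases
)

response = substrate.run(query)
```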
The output of QueryVectorStore has query results in the `results` field, which is a list of lists. In this example, it contains a single list of results. If we instead provided two `query_strings`, it would contain two lists of results, one for each query string.
Output
```json
{
  "results": [
    [
      {
        "id": "079ee5765c8c4df98b50bdb7b5cbdd29",
        "distance": -0.723642945289612,
        "vector": null,
        "metadata": {
          "doc": "cell shaded cartoon",
          "doc_id": "079ee5765c8c4df98b50bdb7b5cbdd29"
        }
      },
      {
        "id": "98ec8bb1da1243d88721645fc0a8899b",
        "distance": -0.717301785945892,
        "vector": null,
        "metadata": {
          "doc": "cinematic",
          "doc_id": "98ec8bb1da1243d88721645fc0a8899b"
        }
      },
      {
        "id": "158f2fc695e648878d245fdf93fa2917",
        "distance": -0.715586066246033,
        "vector": null,
        "metadata": {
          "doc": "wide shot",
          "doc_id": "158f2fc695e648878d245fdf93fa2917"
        }
      }
    ]
  ],
  "collection_name": null,
  "model": "jina-v2",
  "metric": "inner"
}
```
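This shape can be unpacked with plain Python; a minimal sketch that pulls the recommended phrases out of the response above (the `phrases_for_query` helper is ours, not part of the SDK):

```python
import json

# The QueryVectorStore output shown above, as a JSON string.
raw = """
{
  "results": [[
    {"id": "079ee5765c8c4df98b50bdb7b5cbdd29", "distance": -0.723642945289612,
     "vector": null,
     "metadata": {"doc": "cell shaded cartoon",
                  "doc_id": "079ee5765c8c4df98b50bdb7b5cbdd29"}},
    {"id": "98ec8bb1da1243d88721645fc0a8899b", "distance": -0.717301785945892,
     "vector": null,
     "metadata": {"doc": "cinematic",
                  "doc_id": "98ec8bb1da1243d88721645fc0a8899b"}},
    {"id": "158f2fc695e648878d245fdf93fa2917", "distance": -0.715586066246033,
     "vector": null,
     "metadata": {"doc": "wide shot",
                  "doc_id": "158f2fc695e648878d245fdf93fa2917"}}
  ]],
  "collection_name": null,
  "model": "jina-v2",
  "metric": "inner"
}
"""
response = json.loads(raw)

def phrases_for_query(response: dict, query_index: int = 0) -> list[str]:
    """Return the embedded text ("doc" metadata) for one query's results."""
    return [hit["metadata"]["doc"] for hit in response["results"][query_index]]

# "results" is a list of lists: one inner list per query string.
assert len(response["results"]) == 1
print(phrases_for_query(response))  # ['cell shaded cartoon', 'cinematic', 'wide shot']
```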