Substrate comes with built-in vector storage, which you can use to store and query generated embeddings. It's performant, colocated with the rest of your workload, and much more cost-effective than alternative vector database providers, like Pinecone, Supabase, or Weaviate.
This guide embeds a set of phrases commonly used to enhance image generation prompts, then queries the vector store to recommend phrases for a given prompt.
To create a new vector store, use FindOrCreateVectorStore. Give your store a collection_name, and specify the embedding model.
Learn more: embedding models
- jina-v2 is a popular model for embedding text.
- clip is a popular model for embedding text and images.
To embed data, use EmbedText. Provide the data to embed, the collection_name, and the embedding model. Below, we create an array of embedding nodes – Substrate automatically runs these nodes in parallel because they have no upstream dependencies.
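As a rough sketch of the inputs involved, the snippet below builds one argument set for the store and one per phrase as plain Python data, without calling the Substrate SDK. The collection name and the `text` key are assumptions made for illustration (this guide only says to provide "the data to embed"); the phrases are taken from the sample output at the end of this guide.

```python
# Hypothetical collection name; pick whatever fits your project.
COLLECTION_NAME = "image_prompt_enhancements"
MODEL = "jina-v2"

# Arguments for FindOrCreateVectorStore, as described above.
store_args = {"collection_name": COLLECTION_NAME, "model": MODEL}

# A few of the phrases to embed.
phrases = ["cell shaded cartoon", "cinematic", "wide shot"]

# One argument set per EmbedText node. The "text" key is an assumed name
# for "the data to embed". No entry refers to another entry's output,
# which is why the corresponding nodes have no upstream dependencies.
embed_args = [
    {"text": phrase, "collection_name": COLLECTION_NAME, "model": MODEL}
    for phrase in phrases
]

for args in embed_args:
    print(args)
```

Reusing the same collection_name and model for the store and for every embedding keeps all vectors in one collection produced by one model.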
To query a vector store, use QueryVectorStore. Provide the query string, collection_name, and embedding model.
Learn more: QueryVectorStore parameters
- Set include_metadata to True to include metadata in the response. The metadata includes the embedded text in the doc field.
- Set top_k to 3 to retrieve only the top 3 most similar results.
- To query images using a multimodal embedding model like clip, provide query_image_uris.
- Multiple queries can be run in a batch – simply pass multiple query strings or images.
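To make top_k and batched queries concrete, here is a small local sketch that ranks hand-made toy vectors by inner product. It is not Substrate's implementation: the vectors, the `STORE` dictionary, and the `query` helper are all hypothetical, and reporting the negated inner product as the distance is an assumption based on the negative distances that accompany the "inner" metric in the sample output.

```python
from typing import Dict, List, Tuple

# Toy 3-dimensional "embeddings", made up for illustration only.
STORE: Dict[str, List[float]] = {
    "cinematic": [0.9, 0.1, 0.0],
    "wide shot": [0.7, 0.3, 0.1],
    "cell shaded cartoon": [0.1, 0.9, 0.2],
    "macro photo": [0.0, 0.2, 0.9],
}


def inner(a: List[float], b: List[float]) -> float:
    """Inner (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def query(query_vectors: List[List[float]], top_k: int = 3) -> List[List[Tuple[str, float]]]:
    """Return one result list per query vector (a list of lists)."""
    results = []
    for q in query_vectors:
        # Assumed convention: distance = negated inner product, lower is closer.
        scored = [(doc, -inner(q, vec)) for doc, vec in STORE.items()]
        scored.sort(key=lambda pair: pair[1])
        results.append(scored[:top_k])
    return results


# A batch of two queries yields two result lists, each capped at top_k hits.
batch = query([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], top_k=2)
for hits in batch:
    print([doc for doc, _ in hits])
```

Each query is ranked independently, so a batch of N queries always produces N result lists.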
The output of QueryVectorStore has query results in the results field, which is a list of lists. In this example, it contains a single list of results. If we instead provided two query_strings, it would contain two lists of results, one for each query string.
Output
{
  "results": [
    [
      {
        "id": "079ee5765c8c4df98b50bdb7b5cbdd29",
        "distance": -0.723642945289612,
        "vector": null,
        "metadata": {
          "doc": "cell shaded cartoon",
          "doc_id": "079ee5765c8c4df98b50bdb7b5cbdd29"
        }
      },
      {
        "id": "98ec8bb1da1243d88721645fc0a8899b",
        "distance": -0.717301785945892,
        "vector": null,
        "metadata": {
          "doc": "cinematic",
          "doc_id": "98ec8bb1da1243d88721645fc0a8899b"
        }
      },
      {
        "id": "158f2fc695e648878d245fdf93fa2917",
        "distance": -0.715586066246033,
        "vector": null,
        "metadata": {
          "doc": "wide shot",
          "doc_id": "158f2fc695e648878d245fdf93fa2917"
        }
      }
    ]
  ],
  "collection_name": null,
  "model": "jina-v2",
  "metric": "inner"
}
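The nested results shape can be unpacked with ordinary JSON handling. The sketch below parses a copy of the sample output and collects the doc text for each query; the `docs_per_query` helper is a hypothetical name, not part of the Substrate SDK.

```python
import json
from typing import Any, Dict, List

# A copy of the sample output above, as it might arrive as a JSON string.
raw = """
{
  "results": [[
    {"id": "079ee5765c8c4df98b50bdb7b5cbdd29", "distance": -0.723642945289612,
     "vector": null,
     "metadata": {"doc": "cell shaded cartoon",
                  "doc_id": "079ee5765c8c4df98b50bdb7b5cbdd29"}},
    {"id": "98ec8bb1da1243d88721645fc0a8899b", "distance": -0.717301785945892,
     "vector": null,
     "metadata": {"doc": "cinematic",
                  "doc_id": "98ec8bb1da1243d88721645fc0a8899b"}},
    {"id": "158f2fc695e648878d245fdf93fa2917", "distance": -0.715586066246033,
     "vector": null,
     "metadata": {"doc": "wide shot",
                  "doc_id": "158f2fc695e648878d245fdf93fa2917"}}
  ]],
  "collection_name": null,
  "model": "jina-v2",
  "metric": "inner"
}
"""


def docs_per_query(output: Dict[str, Any]) -> List[List[str]]:
    """Return the embedded text of each hit, grouped by query."""
    return [
        [hit["metadata"]["doc"] for hit in hits]
        for hits in output["results"]
    ]


output = json.loads(raw)
suggestions = docs_per_query(output)
print(suggestions)
```

The outer list has one entry per query, so `suggestions[0]` holds the recommended phrases for the single query in this example. The doc field is only present because include_metadata was set.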