Vector Search

The Vector Search module provides IVectorIndex<T> — a provider-agnostic interface for embedding-based similarity search. The Supabase implementation uses pgvector for storage and PostgREST RPC for queries.

Effects Module

The pre-built VectorModule<T> wraps IVectorIndex<T> as an effect, mirroring SearchModule<T>:

[EffectsModule(typeof(SearchModule<Document>))]
[EffectsModule(typeof(VectorModule<Document>))]
public partial class SearchEffects;

In development, VectorModule<T> registers InMemoryVectorIndex<T>. In production, register a provider:

// Supabase pgvector
options.AddSupabaseVectorIndex<AppRuntime, Document>();

Overview

services.AddDeepstaging(options => options
    .AddSupabaseVectorIndex<AppRuntime, Document>());

This registers IVectorIndex<Document> → SupabaseVectorIndex<Document>, backed by a pgvector table and match function in your Supabase database.

Database Setup

Create the pgvector extension, table, and match function:

Supabase SQL Editor

-- Enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;

-- Create the documents table
CREATE TABLE documents (
  id text PRIMARY KEY,
  content jsonb NOT NULL,
  embedding vector(1536)
);

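-- Create an approximate index for fast similarity search.
-- ivfflat computes its clusters at build time, so create (or rebuild) it
-- after the table holds representative data.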
CREATE INDEX ON documents
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

-- Create the match function (called via PostgREST RPC).
-- <=> is pgvector's cosine distance operator, so 1 - distance is cosine similarity.
CREATE OR REPLACE FUNCTION match_documents(
  query_embedding vector(1536),
  match_threshold float DEFAULT 0.0,
  match_count int DEFAULT 10
)
RETURNS TABLE (id text, content jsonb, similarity float)
LANGUAGE sql STABLE
AS $$
  SELECT id, content, 1 - (embedding <=> query_embedding) AS similarity
  FROM documents
  WHERE 1 - (embedding <=> query_embedding) > match_threshold
  ORDER BY embedding <=> query_embedding
  LIMIT match_count;
$$;

Configuration

appsettings.json

{
  "Deepstaging": {
    "Supabase": {
      "Vectors": {
        "TableName": "documents",
        "MatchFunctionName": "match_documents",
        "EmbeddingDimension": 1536
      }
    }
  }
}

Property	Default	Description
`TableName`	`"documents"`	Postgres table storing documents and embeddings
`MatchFunctionName`	`"match_documents"`	Postgres function for similarity search
`EmbeddingDimension`	`1536`	Vector dimensions (must match your embedding model)

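Common dimensions: 1536 (OpenAI text-embedding-3-small), 3072 (OpenAI text-embedding-3-large), 768 (many sentence-transformers models, such as all-mpnet-base-v2). The `vector(1536)` column and the `query_embedding` parameter in Database Setup must use the same value as `EmbeddingDimension`.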

Usage

// Generate embedding from your ML provider
float[] queryEmbedding = await embeddingService.EmbedAsync("search query");

// Search
var results = await vectorIndex.SearchAsync(queryEmbedding,
    new VectorSearchOptions(Limit: 5, MinSimilarity: 0.7));

foreach (var hit in results.Items)
    Console.WriteLine($"{hit.Id}: {hit.Similarity:P0} — {hit.Document}");

// Index a document
float[] docEmbedding = await embeddingService.EmbedAsync(document.Content);
await vectorIndex.UpsertAsync(document.Id, document, docEmbedding);

// Batch index (embeddings holds precomputed vectors keyed by document id)
var items = documents.Select(d => new VectorDocument<Document>(
    d.Id, d, embeddings[d.Id])).ToList();
await vectorIndex.UpsertManyAsync(items);

// Remove
await vectorIndex.RemoveAsync(documentId);

Interface

`IVectorIndex<T>`

Method	Returns	Description
`SearchAsync(embedding, options?)`	`Task<VectorSearchResult<T>>`	Find similar documents
`UpsertAsync(id, document, embedding)`	`Task`	Add or update a document
`UpsertManyAsync(items)`	`Task`	Batch add/update
`RemoveAsync(id)`	`Task`	Remove a document
`ClearAsync()`	`Task`	Remove all documents

`VectorSearchOptions`

Property	Type	Default	Description
`Limit`	`int`	`10`	Maximum results
`MinSimilarity`	`double`	`0.0`	Minimum similarity threshold
`Metric`	`DistanceMetric`	`Cosine`	Distance metric (`Cosine`, `Euclidean`, `InnerProduct`)
`IncludeEmbeddings`	`bool`	`false`	Return stored embeddings in results
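
The three metrics differ in how they score a pair of vectors. The block below is a minimal sketch of the underlying formulas, not the library's implementation; the names `MetricSketch`, `Dot`, `CosineSimilarity`, and `EuclideanDistance` are hypothetical. For orientation, the `<=>` operator used by `match_documents` in Database Setup returns cosine distance, which is 1 minus the cosine similarity computed here.

```csharp
using System;

// Illustrative only: hypothetical helper, not part of the Deepstaging API.
public static class MetricSketch
{
    // Inner product: sum of element-wise products (higher = more similar).
    public static double Dot(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += (double)a[i] * b[i];
        return sum;
    }

    // Cosine similarity: inner product divided by the product of the vector lengths.
    // Returns 0 when either vector has zero length, to avoid dividing by zero.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        double norms = Math.Sqrt(Dot(a, a)) * Math.Sqrt(Dot(b, b));
        return norms == 0 ? 0 : Dot(a, b) / norms;
    }

    // Euclidean distance: straight-line distance (lower = more similar).
    public static double EuclideanDistance(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = (double)a[i] - b[i];
            sum += d * d;
        }
        return Math.Sqrt(sum);
    }
}
```

Note that Euclidean distance runs in the opposite direction from the other two: smaller values mean closer vectors. How a given provider maps that distance onto `Similarity` is not specified here.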

`VectorSearchResult<T>`

Property	Type	Description
`Items`	`IReadOnlyList<VectorHit<T>>`	Matched documents ordered by similarity
`TotalCount`	`int`	Total matches

`VectorHit<T>`

Property	Type	Description
`Id`	`string`	Document identifier
`Document`	`T`	The matched document
`Similarity`	`double`	Similarity score (higher = more similar)
`Embedding`	`ReadOnlyMemory<float>?`	Stored embedding (when `IncludeEmbeddings` is set)

In-Memory Provider

For development and testing, use InMemoryVectorIndex<T> — a brute-force implementation that requires no infrastructure:

// Development
services.AddSingleton<IVectorIndex<Document>, InMemoryVectorIndex<Document>>();

// Production — Supabase pgvector
options.AddSupabaseVectorIndex<AppRuntime, Document>();

InMemoryVectorIndex<T> supports all three distance metrics and is marked with [DevelopmentOnly] to prevent accidental production use.
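
To make "brute-force" concrete: a search of this kind scores every stored vector against the query, filters by the threshold, sorts, and keeps the top results. The following is a rough sketch of that idea under assumed types, not the actual `InMemoryVectorIndex<T>` source; `BruteForceSketch`, its `Search` signature, and the dictionary-based store are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: hypothetical helper, not the InMemoryVectorIndex<T> implementation.
public static class BruteForceSketch
{
    // Scores every stored vector against the query, keeps entries whose score is
    // at least minSimilarity, and returns up to `limit` of them, highest score first.
    public static List<(string Id, double Similarity)> Search(
        IReadOnlyDictionary<string, float[]> store,
        float[] query,
        Func<float[], float[], double> similarity,
        int limit = 10,
        double minSimilarity = 0.0)
    {
        return store
            .Select(kv => (Id: kv.Key, Similarity: similarity(query, kv.Value)))
            .Where(hit => hit.Similarity >= minSimilarity)
            .OrderByDescending(hit => hit.Similarity)
            .Take(limit)
            .ToList();
    }
}
```

Each query touches every stored vector, so the cost grows linearly with both the number of documents and the embedding dimension. That is fine for tests and small fixtures, and it is the reason an indexed store such as pgvector is preferable for larger data sets.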

Hybrid Search

Combine vector search with full-text search by pairing IVectorIndex<T> with ISearchIndex<T>:

public class HybridSearchService(
    ISearchIndex<Article> textSearch,
    IVectorIndex<Article> vectorSearch)
{
    public async Task<IReadOnlyList<Article>> Search(
        string query, float[] queryEmbedding)
    {
        // Full-text search for keyword matches
        var textResults = await textSearch.SearchAsync(query);

        // Vector search for semantic matches
        var vectorResults = await vectorSearch.SearchAsync(queryEmbedding);

        // Merge and deduplicate. MergeResults is not shown here;
        // supply a merge strategy that suits your ranking needs.
        return MergeResults(textResults, vectorResults);
    }
}
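
`MergeResults` is left undefined above. One common way to combine a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF), sketched below. This is a swapped-in technique for illustration, not necessarily what Deepstaging does. It assumes each result set has already been reduced to a list of document ids in rank order, with each id appearing at most once per list; `RankFusionSketch`, `Fuse`, and the default `k = 60` (a conventional RRF constant) are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: hypothetical helper, not part of the Deepstaging API.
public static class RankFusionSketch
{
    // Reciprocal rank fusion: each list contributes 1 / (k + rank) to an id's score,
    // where rank is the 1-based position in that list. Ids present in both lists
    // accumulate both contributions, so the output contains each id once.
    public static List<string> Fuse(
        IReadOnlyList<string> textIds,
        IReadOnlyList<string> vectorIds,
        int k = 60)
    {
        var scores = new Dictionary<string, double>();

        foreach (var ranked in new[] { textIds, vectorIds })
        {
            for (int i = 0; i < ranked.Count; i++)
            {
                // current stays 0 when the id has not been seen yet.
                scores.TryGetValue(ranked[i], out var current);
                scores[ranked[i]] = current + 1.0 / (k + i + 1);
            }
        }

        // Highest fused score first; ties broken by id for a stable order.
        return scores
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => kv.Key)
            .ToList();
    }
}
```

RRF uses only rank positions, so it avoids comparing full-text relevance scores with vector similarities, which are on different scales. Documents that appear in both lists rise to the top; the fused ids can then be mapped back to `Article` instances.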