@radzor/rag-pipeline

End-to-end Retrieval-Augmented Generation pipeline. Ingests documents, chunks and embeds them into an in-memory vector index, then answers queries by retrieving relevant context and calling an LLM completion endpoint.

AI & ML · v0.1.0 · typescript · python · Server · Tags: rag, retrieval, augmented-generation, embeddings, vector-search, ai, llm, openai · by Radzor

Install


$ npx radzor@latest add rag-pipeline

⚠ Constraints: Requires an OpenAI-compatible API key for embeddings and completions. Uses native fetch (Node.js 18+). The vector index is in-memory — data is lost on restart.
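To make the "OpenAI-compatible" constraint concrete, here is a minimal sketch of an embeddings request over native fetch. It is not the package's actual code: `embedTexts`, `EmbeddingOptions`, and the injectable `fetchImpl` parameter are hypothetical names introduced for illustration, and the request and response shapes assume the standard OpenAI `POST /embeddings` format.

```typescript
// Hypothetical sketch, not the package's implementation.
// Assumes an OpenAI-compatible POST {baseUrl}/embeddings endpoint.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface EmbeddingOptions {
  apiKey: string;
  model?: string;        // defaults to the documented embeddingModel default
  baseUrl?: string;      // defaults to the documented embeddingBaseUrl default
  fetchImpl?: FetchLike; // injectable for testing; falls back to global fetch
}

async function embedTexts(
  texts: string[],
  opts: EmbeddingOptions
): Promise<number[][]> {
  const model = opts.model ?? "text-embedding-3-small";
  const baseUrl = (opts.baseUrl ?? "https://api.openai.com/v1").replace(/\/+$/, "");
  // Native fetch is available globally in Node.js 18+.
  const doFetch: FetchLike = opts.fetchImpl ?? (globalThis as any).fetch;

  const res = await doFetch(`${baseUrl}/embeddings`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${opts.apiKey}`,
    },
    body: JSON.stringify({ model, input: texts }),
  });
  if (!res.ok) {
    throw new Error(`Embeddings request failed with status ${res.status}`);
  }

  const body = (await res.json()) as {
    data: Array<{ index: number; embedding: number[] }>;
  };
  // Sort by index so the vectors line up with the input order.
  return [...body.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}
```

Because the resulting vectors live only in process memory, a restart would require re-running this step for every document.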

Inputs

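Name	Type	Default	Env var	Description
embeddingApiKey*	string	—	OPENAI_API_KEY	API key for the embeddings endpoint (OpenAI-compatible).
completionApiKey	string	—	OPENAI_API_KEY	API key for the completion endpoint (OpenAI-compatible). Defaults to embeddingApiKey if not set.
chunkSize	number	512		Maximum number of characters per text chunk during ingestion.
overlapSize	number	64		Number of overlapping characters between consecutive chunks.
embeddingModel	string	text-embedding-3-small		Model identifier for embeddings.
completionModel	string	gpt-4o-mini		Model identifier for the completion/chat call.
topK	number	5		Number of top matching chunks to retrieve per query.
embeddingBaseUrl	string	https://api.openai.com/v1		Override base URL for the embeddings API.
completionBaseUrl	string	https://api.openai.com/v1		Override base URL for the completion API.

* Required.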
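The interaction between chunkSize and overlapSize can be sketched as a sliding character window. This is an illustrative assumption about how the two inputs combine, not the package's actual chunker; `chunkText` is a hypothetical helper, and a real implementation may split on sentence or word boundaries instead.

```typescript
// Hypothetical sketch: fixed-size character windows with overlap.
// Defaults mirror the documented chunkSize (512) and overlapSize (64).
function chunkText(text: string, chunkSize = 512, overlapSize = 64): string[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (overlapSize < 0 || overlapSize >= chunkSize) {
    throw new RangeError("overlapSize must be in [0, chunkSize)");
  }

  const chunks: string[] = [];
  const step = chunkSize - overlapSize; // how far the window advances each time

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text, so the tail
    // is not emitted again as a chunk made purely of overlap.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Each chunk starts with the last `overlapSize` characters of the previous one.
console.log(chunkText("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

With the defaults, the window advances 448 characters at a time, so a larger overlapSize increases the number of chunks (and embedding calls) per document.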

radzor.manifest.json

{
  "llm": {
    "constraints": "Requires an OpenAI-compatible API key for embeddings and completions. Uses native fetch (Node.js 18+). The vector index is in-memory — data is lost on restart.",
    "usageExamples": "llm/examples.md",
    "integrationPrompt": "llm/integration.md"
  },
  "name": "@radzor/rag-pipeline",
  "tags": ["rag", "retrieval", "augmented-generation", "embeddings", "vector-search", "ai", "llm", "openai"],
  "events": [
    {
      "name": "onQueryComplete",
      "payload": { "query": "string", "answer": "string", "confidence": "number", "sourceCount": "number" },
      "description": "Fired when a RAG query completes successfully."
    },
    {
      "name": "onIngestComplete",
      "payload": { "chunkCount": "number", "documentId": "string" },
      "description": "Fired when a document finishes ingestion and indexing."
    }
  ],
  "inputs": [
    {
      "name": "embeddingApiKey",
      "type": "string",
      "envVar": "OPENAI_API_KEY",
      "required": true,
      "description": "API key for the embeddings endpoint (OpenAI-compatible)."
    },
    {
      "name": "completionApiKey",
      "type": "string",
      "envVar": "OPENAI_API_KEY",
      "required": false,
      "description": "API key for the completion endpoint (OpenAI-compatible). Defaults to embeddingApiKey if not set."
    },
    {
      "name": "chunkSize",
      "type": "number",
      "default": 512,
      "required": false,
      "description": "Maximum number of characters per text chunk during ingestion."
    },
    {
      "name": "overlapSize",
      "type": "number",
      "default": 64,
      "required": false,
      "description": "Number of overlapping characters between consecutive chunks."
    },
    {
      "name": "embeddingModel",
      "type": "string",
      "default": "text-embedding-3-small",
      "required": false,
      "description": "Model identifier for embeddings."
    },
    {
      "name": "completionModel",
      "type": "string",
      "default": "gpt-4o-mini",
      "required": false,
      "description": "Model identifier for the completion/chat call."
    },
    {
      "name": "topK",
      "type": "number",
      "default": 5,
      "required": false,
      "description": "Number of top matching chunks to retrieve per query."
    },
    {
      "name": "embeddingBaseUrl",
      "type": "string",
      "default": "https://api.openai.com/v1",
      "required": false,
      "description": "Override base URL for the embeddings API."
    },
    {
      "name": "completionBaseUrl",
      "type": "string",
      "default": "https://api.openai.com/v1",
      "required": false,
      "description": "Override base URL for the completion API."
    }
  ],
  "radzor": "1.0.0",
  "actions": [
    {
      "name": "ingest",
      "params": [
        { "name": "text", "type": "string", "description": "The raw text content to ingest." },
        {
          "name": "metadata",
          "type": "Record<string, string>",
          "required": false,
          "description": "Optional metadata to attach to every chunk from this document."
        }
      ],
      "returns": "Promise<{ documentId: string; chunkCount: number }>",
      "description": "Ingest a text document: chunk it, embed each chunk, and store in the vector index."
    },
    {
      "name": "query",
      "params": [
        { "name": "question", "type": "string", "description": "The user's question." },
        {
          "name": "systemPrompt",
          "type": "string",
          "required": false,
          "description": "Optional system prompt override for the completion call."
        }
      ],
      "returns": "Promise<RagResult>",
      "description": "Query the pipeline: embed the question, retrieve top-K chunks, call the LLM with context, and return the grounded answer."
    },
    {
      "name": "clearIndex",
      "params": [],
      "returns": "void",
      "description": "Remove all documents and embeddings from the in-memory index."
    }
  ],
  "outputs": [
    {
      "name": "ragResult",
      "type": "RagResult",
      "fields": [
        {
          "name": "answer",
          "type": "string",
          "description": "The LLM-generated answer grounded in retrieved context."
        },
        {
          "name": "sources",
          "type": "Array<{ text: string; score: number; metadata?: Record<string, string> }>",
          "description": "The retrieved chunks used as context, with similarity scores."
        },
        {
          "name": "confidence",
          "type": "number",
          "description": "Average similarity score of retrieved sources (0-1)."
        }
      ],
      "description": "The result of a RAG query including the generated answer, source chunks, and confidence score."
    }
  ],
  "runtime": "server",
  "version": "0.1.0",
  "category": "ai",
  "languages": ["typescript", "python"],
  "description": "End-to-end Retrieval-Augmented Generation pipeline. Ingests documents, chunks and embeds them into an in-memory vector index, then answers queries by retrieving relevant context and calling an LLM completion endpoint.",
  "dependencies": { "radzor": [], "packages": {} },
  "composability": {
    "connectsTo": [
      {
        "output": "ragResult",
        "mapField": "answer",
        "compatibleWith": [
          "@radzor/guardrails.action.validateOutput.text",
          "@radzor/structured-output.action.extract.text"
        ]
      },
      {
        "event": "onQueryComplete",
        "description": "Log RAG queries for analysis",
        "compatibleWith": [
          "@radzor/event-tracker.action.track.eventName",
          "@radzor/log-aggregator.action.info.message"
        ]
      }
    ]
  }
}
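The manifest describes `sources` as the retrieved chunks with similarity scores and `confidence` as their average. A minimal sketch of that retrieval step follows, assuming cosine similarity as the scoring function (the manifest does not name the metric). `IndexedChunk`, `cosineSimilarity`, and `retrieveTopK` are hypothetical names, not the package's API.

```typescript
// Hypothetical sketch of top-K retrieval over an in-memory index.
// Cosine similarity is an assumption; the manifest only says "similarity".
interface IndexedChunk {
  text: string;
  embedding: number[];
  metadata?: Record<string, string>;
}

interface Source {
  text: string;
  score: number;
  metadata?: Record<string, string>;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions do not match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

function retrieveTopK(
  queryEmbedding: number[],
  index: IndexedChunk[],
  topK = 5 // mirrors the documented topK default
): { sources: Source[]; confidence: number } {
  const sources: Source[] = index
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
      ...(chunk.metadata ? { metadata: chunk.metadata } : {}),
    }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, topK);

  // Confidence as the mean score of the retrieved sources, per the manifest.
  const confidence =
    sources.length === 0
      ? 0
      : sources.reduce((sum, s) => sum + s.score, 0) / sources.length;

  return { sources, confidence };
}
```

A linear scan like this is adequate for a small in-memory index; since every query compares against every stored chunk, cost grows with the number of ingested chunks.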
