RADZOR

The universal component registry for LLM-driven development. Empowering developers to build better apps, faster.


© 2026 Radzor Registry. All rights reserved.

Cookbook
Intermediate · data · ai · automation

Scrape → CSV (AI Workflow)

Scrape a website, extract structured data with an LLM, and export the results as a clean CSV. A three-step data pipeline with no manual parsing.

Prerequisites

Environment variables

OPENAI_KEY
Respect robots.txt and rate limits when scraping third-party sites.
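
Both prerequisites can be checked before the pipeline runs. The sketch below is one minimal way to do that, not part of Radzor: `requireEnv` and `isPathAllowed` are hypothetical helper names, and the robots.txt check is deliberately simplified (only the `User-agent: *` group, plain prefix matching, no `*` or `$` patterns).

```typescript
// Hypothetical helper: fail fast when a required variable is missing or blank.
function requireEnv(name: string, env: Record<string, string | undefined>): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Hypothetical helper: simplified robots.txt check for the "User-agent: *" group.
// The longest matching rule wins; on a tie, Allow wins.
function isPathAllowed(robotsTxt: string, path: string): boolean {
  let inWildcardGroup = false;
  let lastWasUserAgent = false;
  let best = { length: -1, allow: true }; // no matching rule means allowed

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const key = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (key === "user-agent") {
      // Consecutive User-agent lines share one group of rules.
      inWildcardGroup = (lastWasUserAgent && inWildcardGroup) || value === "*";
      lastWasUserAgent = true;
      continue;
    }
    lastWasUserAgent = false;

    if (!inWildcardGroup || (key !== "allow" && key !== "disallow")) continue;
    if (value === "" || !path.startsWith(value)) continue;

    const isAllow = key === "allow";
    if (value.length > best.length || (value.length === best.length && isAllow)) {
      best = { length: value.length, allow: isAllow };
    }
  }
  return best.allow;
}

// Usage (in Node): const apiKey = requireEnv("OPENAI_KEY", process.env)
```

A production crawler would more likely rely on a maintained robots.txt parser. The point here is only that both checks are cheap to run before the first request.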

Install

$ npx radzor@latest recipe add scrape-to-csv

AI Prompt

“Run `npx radzor@latest add web-scraper structured-output csv-export` to install 3 Radzor components. Then read components/radzor/web-scraper/radzor.manifest.json, components/radzor/structured-output/radzor.manifest.json, components/radzor/csv-export/radzor.manifest.json and each component's llm/integration.md. Wire them together to scrape a website, extract structured data with an LLM, and export the results as a clean CSV. A three-step data pipeline with no manual parsing. Use the manifest's inputs (check envVar for required environment variables), outputs (check fields for object shapes), composability (check mapField for field extraction), and actions — don't invent custom interfaces.”

Paste this into Claude Code, Cursor, Windsurf, or any AI coding agent.

Pipeline

WebScraper

Fetches raw HTML from target URLs

↓ HTML

StructuredOutput

Extracts structured data via LLM

↓ typed records

CsvExport

Generates the final CSV file
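
To make the data flow concrete, here is a rough sketch of the same three stages with in-memory stand-ins. Everything in it is an assumption for illustration: the `Stages` interface, `runPipeline`, `csvCell`, `toCsv`, and the fake pages are hypothetical and are not the Radzor component APIs. It shows only how HTML flows into records and records into CSV text.

```typescript
type Html = string;
type ProductRecord = Record<string, unknown>;

// Hypothetical stage signatures; the real components may differ.
interface Stages {
  fetchHtml: (url: string) => Promise<Html>;       // WebScraper role
  extract: (html: Html) => Promise<ProductRecord>; // StructuredOutput role
  toCsv: (rows: ProductRecord[]) => string;        // CsvExport role
}

// Run the stages in order: one fetch and one extraction per URL, one CSV at the end.
async function runPipeline(urls: string[], stages: Stages): Promise<string> {
  const rows: ProductRecord[] = [];
  for (const url of urls) {
    const html = await stages.fetchHtml(url);
    rows.push(await stages.extract(html));
  }
  return stages.toCsv(rows);
}

// Quote a cell only when it contains a comma, quote, or line break (RFC 4180 style).
function csvCell(value: unknown): string {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text, taking the header row from the first record's keys.
function toCsv(rows: ProductRecord[]): string {
  const first = rows[0];
  if (first === undefined) return "";
  const headers = Object.keys(first);
  const lines = [headers.map(csvCell).join(",")];
  for (const row of rows) {
    lines.push(headers.map((h) => csvCell(row[h])).join(","));
  }
  return lines.join("\n");
}

// Stand-in stages so the flow can be run without network or LLM calls.
const fakePages: Record<string, string> = {
  "https://example.com/products/1": "<h1>Mug, large</h1>",
  "https://example.com/products/2": "<h1>Plate</h1>",
};

const fakeStages: Stages = {
  fetchHtml: async (url) => fakePages[url] ?? "",
  extract: async (html) => ({
    name: html.replace(/<[^>]+>/g, ""), // strip tags instead of calling an LLM
    inStock: html.length > 0,
  }),
  toCsv,
};

// Usage: runPipeline(Object.keys(fakePages), fakeStages).then(console.log)
```

Keeping each stage behind a plain function signature means a stage can be swapped for a stub in tests, which is useful when the real extractor costs an LLM call per page.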

Scaffolded Code

scrape-to-csv-recipe.ts
// npx radzor@latest add web-scraper structured-output csv-export
import { WebScraper }       from "./components/radzor/web-scraper"
import { StructuredOutput } from "./components/radzor/structured-output"
import { CsvExport }        from "./components/radzor/csv-export"

// Configure the three components. OPENAI_KEY must be set in the environment.
const scraper = new WebScraper({ timeout: 15000, rateLimit: 2000 })
const extractor = new StructuredOutput({
  provider: "openai",
  apiKey: process.env.OPENAI_KEY!,
  model: "gpt-4o",
  temperature: 0,
})
const csv = new CsvExport({ delimiter: ",", includeHeaders: true })

// Pages to scrape
const urls = [
  "https://example.com/products/1",
  "https://example.com/products/2",
  "https://example.com/products/3",
]

// Fields the LLM should extract from each page
const schema = { name: "string", price: "number", inStock: "boolean" }
const rows: Record<string, unknown>[] = []

// Fetch each page, extract one record from it, and collect the rows
for (const url of urls) {
  const html = await scraper.fetchHtml(url)
  const product = await extractor.extract(html, schema)
  rows.push(product)
}

// Export to CSV file
await csv.toFile("./products.csv", rows)
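
The scaffold pushes each extracted record straight into `rows`. Whether StructuredOutput validates records itself is not stated here, so a cheap guard before `rows.push` may be worth adding. The sketch below assumes the schema format used above (field name mapped to a `typeof` name). `schemaErrors` is a hypothetical helper, not a Radzor API.

```typescript
type FieldType = "string" | "number" | "boolean";
type Schema = Record<string, FieldType>;

// Same shape as the schema in the scaffold above.
const productSchema: Schema = { name: "string", price: "number", inStock: "boolean" };

// Hypothetical helper: list every field that is missing or has the wrong type.
// An empty result means the record matches the schema.
function schemaErrors(record: Record<string, unknown>, schema: Schema): string[] {
  const errors: string[] = [];
  for (const field of Object.keys(schema)) {
    const expected = schema[field];
    const value = record[field];
    if (value === undefined) {
      errors.push(`${field}: missing`);
    } else if (typeof value !== expected) {
      errors.push(`${field}: expected ${expected}, got ${typeof value}`);
    }
  }
  return errors;
}

// Usage inside the loop (sketch):
//   const problems = schemaErrors(product, productSchema)
//   if (problems.length > 0) console.warn(url, problems)
//   else rows.push(product)
```

Skipping or logging bad records keeps a single malformed page from putting a wrongly typed value, such as a price returned as text, into the exported file.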

Components used

WebScraper: Fetches raw HTML from target URLs
StructuredOutput: Extracts structured data via LLM
CsvExport: Generates the final CSV file

LLM tip

Pass all 3 radzor.manifest.json files to your agent at once. It will read the outputs of each step and match them against the inputs of the next — wiring the full pipeline without any extra instructions.

web-scraper/manifest.json
structured-output/manifest.json
csv-export/manifest.json
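
If you assemble the agent context by script rather than by hand, the file list can be generated from the component names. The paths below follow the layout quoted in the AI Prompt section (`components/radzor/<name>/radzor.manifest.json` and `llm/integration.md`). The helper name `agentContextFiles` and the configurable root are assumptions for illustration.

```typescript
// Hypothetical helper: list the manifest and integration guide for each component,
// in pipeline order, so they can all be handed to the agent at once.
function agentContextFiles(components: string[], root = "components/radzor"): string[] {
  const files: string[] = [];
  for (const name of components) {
    files.push(`${root}/${name}/radzor.manifest.json`);
    files.push(`${root}/${name}/llm/integration.md`);
  }
  return files;
}

// Usage: agentContextFiles(["web-scraper", "structured-output", "csv-export"])
```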