This video tutorial shows how to build a complete web scraping workflow by combining Crawl4AI, n8n, DigitalOcean, and Supabase. It walks through checking site permissions (robots.txt, sitemap.xml), setting up a Docker-hosted Crawl4AI endpoint on DigitalOcean (with memory sizing, port configuration, and securing via API token), and then building an n8n flow to fetch URLs from a sitemap, loop through them, scrape each listing, and finally push structured data into Supabase for retrieval and use in a RAG (retrieval-augmented generation) system. The result: you can ask natural-language questions (e.g. "any four-bedroom listings?") and have your AI agent answer based on fresh scraped data.
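For readers who prefer code over a visual n8n walkthrough, here is a minimal sketch of the same loop in Python: pull URLs from a sitemap, send each one to a token-secured Crawl4AI endpoint, and store the result in Supabase. The host, port, `/crawl` payload shape, response format, and the `listings` table are all assumptions for illustration; check your Crawl4AI Docker version's API docs and your own Supabase schema before using it.

```python
# Hypothetical sketch of the scrape-and-store loop described in the video.
# Endpoint path, payload, and table/column names are assumptions -- adjust
# them to match your Crawl4AI deployment and Supabase schema.
import xml.etree.ElementTree as ET

import requests
from supabase import create_client

CRAWL4AI_URL = "http://your-droplet-ip:11235"      # assumed host/port of the Docker endpoint
CRAWL4AI_TOKEN = "your-api-token"                  # token set when securing the endpoint
SUPABASE_URL = "https://your-project.supabase.co"
SUPABASE_KEY = "your-service-role-key"

supabase = create_client(SUPABASE_URL, SUPABASE_KEY)

def sitemap_urls(sitemap_url: str) -> list[str]:
    """Extract every <loc> entry from a standard sitemap.xml."""
    xml = requests.get(sitemap_url, timeout=30).text
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    return [loc.text for loc in ET.fromstring(xml).findall(".//sm:loc", ns)]

def scrape(url: str) -> dict:
    """Ask the Crawl4AI server to crawl one URL (request/response shape is an assumption)."""
    resp = requests.post(
        f"{CRAWL4AI_URL}/crawl",
        headers={"Authorization": f"Bearer {CRAWL4AI_TOKEN}"},
        json={"urls": [url]},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()

for url in sitemap_urls("https://example.com/sitemap.xml"):
    result = scrape(url)
    # Store the scraped page so a RAG agent can retrieve it later; "content"
    # is a placeholder for whatever your endpoint returns (markdown, JSON, etc.).
    supabase.table("listings").insert({"url": url, "content": str(result)}).execute()
```

In n8n the same steps map onto an HTTP Request node for the sitemap, a loop over items, another HTTP Request node calling the Crawl4AI endpoint, and a Supabase node for the insert.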