deep-web-scraper-rag-automation-workflow

Deep web scraper + RAG: the automation recursively downloads each page of the target website and extracts links, emails, text, and PDF documents. All extracted data then goes into a RAG store, from which you can later query it via chat or any other interface.

Steps to follow:

1. Create a Supabase account and project.
2. Connect Supabase to n8n.
3. Connect the PostgreSQL database from your Supabase project to n8n.
4. Create the Supabase tables and functions (a sketch follows this list).
5. Run the automation.
6. If the automation times out, re-run it with a click-to-start workflow node connected to the 'Check Supabase' node.
7. Occasionally an HTTP request fails and the automation marks that URL as failed. After the run finishes, you can re-activate these URLs with another sub-flow and then simply re-run the main web-scraper automation.
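The workflow description does not spell out the exact schema for step 4, so the following is a minimal sketch in SQL, assuming a "pages" crawl queue and a pgvector-backed "documents" table with a match_documents search function. All table, column, and function names here are assumptions, not taken from the workflow file:

create extension if not exists vector;

-- Crawl queue: one row per discovered URL (hypothetical schema).
create table if not exists pages (
  id bigserial primary key,
  url text unique not null,
  status text not null default 'pending',  -- pending | done | failed
  created_at timestamptz default now()
);

-- Extracted text and PDF content with embeddings for RAG retrieval.
create table if not exists documents (
  id bigserial primary key,
  content text,
  metadata jsonb,
  embedding vector(1536)  -- 1536 assumes an OpenAI-style embedding model
);

-- Similarity search function in the style the Supabase RAG docs suggest.
create or replace function match_documents(
  query_embedding vector(1536),
  match_count int default 5
) returns table (id bigint, content text, metadata jsonb, similarity float)
language sql stable as $$
  select d.id, d.content, d.metadata,
         1 - (d.embedding <=> query_embedding) as similarity
  from documents d
  order by d.embedding <=> query_embedding
  limit match_count;
$$;

The <=> operator is pgvector's cosine distance, so 1 minus it gives cosine similarity; adjust the vector dimension to match whatever embedding model the workflow's RAG nodes are configured with.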

Workflow Files

Workflow file: deep-web-scraper-rag-automation-workflow.json (33.6 KB)

Installation Instructions

Step 1: Download the Workflow

Click the "Download Workflow" button above to save the workflow JSON file to your computer.

Step 2: Import to n8n

In your n8n instance, go to the workflows page and click "Import from File". Select the downloaded JSON file to import this workflow.

Step 3: Configure Credentials

Set up any required credentials and API connections for the nodes used in this workflow. Check each node's configuration to ensure proper setup.
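n8n credentials are configured in the UI rather than in code, but once the PostgreSQL credential points at your Supabase database you can sanity-check the setup with a couple of queries (table names match the hypothetical schema sketched above):

-- Confirm the pgvector extension is installed.
select extname from pg_extension where extname = 'vector';

-- Confirm the expected tables exist in the public schema.
select table_name
from information_schema.tables
where table_schema = 'public'
  and table_name in ('pages', 'documents');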

Step 4: Test and Activate

Test the workflow with sample data to ensure it works correctly, then activate it to start the automation.
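Step 7 of the workflow description notes that failed URLs can be re-activated with a dedicated sub-flow. If you would rather reset them by hand, a query along these lines would do it against the hypothetical pages table sketched earlier (status values are assumptions):

-- Re-queue URLs the scraper marked as failed so the main
-- workflow retries them on its next run.
update pages
set status = 'pending'
where status = 'failed';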

About This Workflow

Author

shadowDragons

Category

Data Processing

Created

about 2 months ago