scrape-website

High-performance web scraper that finishes in seconds what takes other tools many minutes; battle-tested on millions of websites.

Python · Emerging · ai · api-web-scraping · llm · llms
GitHub

  • Stars: 2
  • Forks: 2
  • Contributors: 1
  • Last push: 28d ago

Recent commits

Latest commits.

  • feat: output Markdown with metadata and fix cross-page dedup loss
    6b7a19d · Ventz Petkov · 28d ago
  • feat: add URL exclude patterns, tracking-param stripping, sitemap seeding
    f02183c · Ventz Petkov · 1mo ago
  • fix: prevent UTF-8 mojibake from chardet charset mis-detection
    f237155 · Ventz Petkov · 1mo ago
  • Initial commit: async website scraper with text extraction
    b57c21f · Ventz Petkov · 3mo ago
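
The commit list mentions URL exclude patterns and tracking-param stripping. This page does not show how scrape-website implements them, so the following is only a minimal sketch of the general technique using the Python standard library. The names `strip_tracking_params`, `is_excluded`, `TRACKING_KEYS`, `TRACKING_PREFIXES`, and `EXCLUDE_PATTERNS`, as well as the specific parameters and patterns listed, are hypothetical and not taken from the project.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical defaults for illustration; the project's real lists are not shown here.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"gclid", "fbclid"}
EXCLUDE_PATTERNS = [re.compile(r"/login\b"), re.compile(r"\.pdf$")]


def strip_tracking_params(url: str) -> str:
    """Return url without tracking query parameters and without the fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_KEYS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with the filtered query and an empty fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def is_excluded(url: str) -> bool:
    """Return True if the URL path matches any exclude pattern."""
    path = urlsplit(url).path
    return any(pattern.search(path) for pattern in EXCLUDE_PATTERNS)
```

Normalizing URLs this way before queuing them means two links that differ only in tracking parameters collapse to a single crawl target, which is one common way to avoid fetching the same page twice.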

Top contributors

Builders behind this project.

  • ventz: 4 commits