Playwright Web Scraper

Playwright is a Node.js library for automating web browsers, which makes it well suited to web scraping. It was developed by Microsoft and supports multiple browsers, including Chromium, Firefox, and WebKit. When scraping websites, always review and comply with the site's terms of service and policies so that your use of the data is ethical and legal.

Scrape One URL

  1. (Optional) Connect a Text Splitter.
  2. Enter the URL to be scraped.
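For readers curious about what scraping a single URL with Playwright can look like in code, here is a minimal sketch. It is not the node's actual implementation: the names `ScrapedPage`, `collapseWhitespace`, and `scrapeOneUrl` are hypothetical, and the choice to read the text of the `body` element is an assumption made for illustration. It also assumes the `playwright` package and its browsers are installed.

```typescript
// Hypothetical result shape, used only for this illustration.
interface ScrapedPage {
  url: string;
  text: string;
}

// Collapse runs of whitespace so the scraped text is easier to split later.
export function collapseWhitespace(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Scrape the rendered text of a single page with Playwright (Chromium).
export async function scrapeOneUrl(url: string): Promise<ScrapedPage> {
  // Playwright is loaded lazily here, so the helper above can be used
  // without the package being loaded. Assumes `playwright` is installed.
  const moduleName = "playwright";
  const { chromium } = await import(moduleName);

  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    // Wait until the DOM is parsed; heavier pages may need a different wait.
    await page.goto(url, { waitUntil: "domcontentloaded" });
    const text: string = await page.innerText("body");
    return { url, text: collapseWhitespace(text) };
  } finally {
    // Always release the browser, even if navigation fails.
    await browser.close();
  }
}
```

Calling `await scrapeOneUrl("https://example.com")` would return the page URL together with its whitespace-normalized text, provided the site permits scraping.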

Crawl & Scrape Multiple URLs

See the Web Crawl guide to enable scraping of multiple pages.

Output

Loads the URL content as a Document.
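As a rough illustration of the output, the sketch below shows how scraped text could be wrapped as Documents, and how an optional splitter might break it into overlapping chunks. The `LoadedDocument` shape (`pageContent` plus `metadata.source`) is an assumption modeled on LangChain-style documents, and `splitIntoDocuments` is a hypothetical fixed-size character splitter, not the Text Splitter node itself.

```typescript
// Assumed document shape: the text plus the URL it came from.
interface LoadedDocument {
  pageContent: string;
  metadata: { source: string };
}

// Split text into fixed-size character chunks that overlap by `chunkOverlap`.
export function splitIntoDocuments(
  text: string,
  source: string,
  chunkSize = 1000,
  chunkOverlap = 200
): LoadedDocument[] {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkSize must be positive and larger than chunkOverlap");
  }
  const docs: LoadedDocument[] = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += step) {
    docs.push({
      pageContent: text.slice(start, start + chunkSize),
      metadata: { source },
    });
    // Stop once the current chunk has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return docs;
}
```

Without a splitter, the equivalent would be a single Document holding the full page text. Overlap keeps some shared context between neighboring chunks, at the cost of storing a little duplicated text.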

Resources