Skip to content

Releases: amerkurev/scrapper

v0.17.0

16 Apr 09:42
Compare
Choose a tag to compare
  • fix: handle empty strings in levenshtein_similarity to avoid division by zero
  • feat(parser): include zero and one pixel elements in hidden checks
  • feat(parser): add comment removal functionality to article parser
  • feat(htmlutil): add text content improvement function
  • fix: remove 'media' from negative regex to prevent image removal in Readability.js

v0.16.0

16 Jan 09:03
Compare
Choose a tag to compare
  • fix: remove fixed width from code elements in custom.css
  • feat(htmlutil): expand tag checks in content improvement function

v0.15.0

15 Jan 07:53
Compare
Choose a tag to compare
  • add support for fetching any web page
  • add support for the device parameter to simulate specific device behaviors
  • deprecate explicit viewport-width, viewport-height, screen-width, and screen-height in favor of using the device parameter

BREAKING CHANGE: Default values for viewport-width, viewport-height, screen-width, and screen-height have been removed. Users should now use the device parameter to specify device-specific behaviors.

v0.14.0

12 Jan 12:42
Compare
Choose a tag to compare
  • add Docker health check features
  • fix parser for extra HTTP headers
  • display app version in API, documentation, and UI
  • rename user_data_dir to user_dir for clarity
  • add browser context limit
  • refactor test script for clarity
  • add Pylint for code linting
  • add HTTPS and authentication deployment instructions

v0.13.0

24 Dec 22:09
Compare
Choose a tag to compare
  • update dependencies: Playwright and Readability (0.5.0)
  • move backend from flask to fastapi (use async)
  • switching to uvicorn from gunicorn
  • fix some errors
  • add tests

v0.12.0

20 Sep 11:12
80c2d13
Compare
Choose a tag to compare
  • added --user-scripts-timeout parameter

v0.11.0

24 Apr 08:17
Compare
Choose a tag to compare
  • fix http404 error

v0.10.0

15 Apr 12:37
Compare
Choose a tag to compare
  • extract social meta tags (open graph, twitter)
  • new request parameter: headless
  • new request parameter: scroll-down