
[Bug] Python SDK Not Catching All scraping methods failed Error #851

Open
brian-carnot opened this issue Oct 31, 2024 · 0 comments
Labels: bug (Something isn't working)
Describe the Bug
When scraping a page fails on the server with the error All scraping methods failed, the SDK does not raise an exception; instead, the returned content is simply empty.

To Reproduce
Steps to reproduce the issue:

  1. Wait for a website scrape to raise the exception All scraping methods failed for url: on the dashboard
  2. Inspect the value returned by the .scrape_url(url) method
  3. Example: {'content': '', 'markdown': '', 'linksOnPage': [], 'metadata': {'sourceURL': 'https://ycombinator.com/people', 'pageStatusCode': 200}}

Expected Behavior
An exception should be thrown by the scrape_url method instead of returning empty content.
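As a stopgap until the SDK raises on failure, a caller can wrap the scrape and raise when the result comes back empty. This is a minimal sketch: the wrapper function and exception names are hypothetical (not part of the SDK), and the result shape is assumed to match the dict shown in the reproduction steps above.

```python
class ScrapeFailedError(Exception):
    """Hypothetical helper exception raised when a scrape returns no content."""

def scrape_or_raise(app, url: str) -> dict:
    # app is assumed to be a FirecrawlApp instance whose scrape_url(url)
    # returns a dict like:
    # {'content': '', 'markdown': '', 'linksOnPage': [], 'metadata': {...}}
    result = app.scrape_url(url)
    # Treat an empty 'content' and 'markdown' as a failed scrape, since the
    # SDK currently returns this shape instead of raising an exception.
    if not result.get("content") and not result.get("markdown"):
        raise ScrapeFailedError(f"All scraping methods failed for url: {url}")
    return result
```

This only detects the symptom (empty content); the proper fix would be for the SDK itself to surface the server-side failure.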


Environment (please complete the following information):

  • OS: Linux (python:3.12.3-bookworm image)
  • Firecrawl Version: ^0.0.20 Python SDK
  • Node.js Version: v23.1.0

Logs

{
    "url": "https://ycombinator.com/people",
    "type": "scrape",
    "method": "fetch",
    "result": {
        "error": null,
        "success": false,
        "time_taken": 591,
        "response_code": 200,
        "response_size": 55917
    },
    "createdAt": "2024-10-31T05:50:34.653524+00:00"
}
{
    "type": "error",
    "stack": "Error: All scraping methods failed for URL: https://ycombinator.com/people\n    at scrapSingleUrl (/app/dist/src/scraper/WebScraper/single_url.js:378:19)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async /app/dist/src/scraper/WebScraper/index.js:66:32\n    at async Promise.all (index 0)\n    at async WebScraperDataProvider.convertUrlsToDocuments (/app/dist/src/scraper/WebScraper/index.js:64:13)\n    at async Promise.all (index 0)\n    at async WebScraperDataProvider.processLinks (/app/dist/src/scraper/WebScraper/index.js:208:40)\n    at async WebScraperDataProvider.handleSingleUrlsMode (/app/dist/src/scraper/WebScraper/index.js:174:25)\n    at async runWebScraper (/app/dist/src/main/runWebScraper.js:77:23)\n    at async startWebScraperPipeline (/app/dist/src/main/runWebScraper.js:13:13)\n    at async processJob (/app/dist/src/services/queue-worker.js:236:44)\n    at async processJobInternal (/app/dist/src/services/queue-worker.js:72:24)\n    at async /app/dist/src/services/queue-worker.js:174:39\n    at async /app/dist/src/services/queue-worker.js:161:25",
    "message": "All scraping methods failed for URL: https://ycombinator.com/people",
    "createdAt": "2024-10-31T05:50:35.242789+00:00"
}

Additional Context

The error is not consistently reproducible, and I am not hitting the rate limit for my API key.
