Inquiry: Reasons for choosing XPath over CSS selectors #13

Open
DenDen047 opened this issue Nov 8, 2024 · 0 comments

I’ve been studying your paper “AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation” and noticed that you’ve chosen to use XPath for element selection rather than CSS selectors. As both methods are commonly used in web scraping, I’m curious about the reasoning behind this decision.

Could you please elaborate on why XPath was preferred for AutoCrawler/AutoScraper? Specifically, I’m interested in understanding:

  1. Were there specific advantages of XPath that made it more suitable for your progressive understanding approach?
  2. Did you encounter any limitations with CSS selectors that XPath addressed?
  3. How does the choice of XPath align with AutoCrawler’s goal of generating web crawlers through progressive understanding?
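To make question 2 concrete, here is a minimal sketch of one difference that is often cited: XPath can match an element by its text content, while standard CSS selectors cannot. The markup, the `value_for_label` helper, and the use of Python's built-in `xml.etree.ElementTree` (which supports only a subset of XPath) are assumptions for illustration; none of this is taken from the paper or the AutoCrawler code.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: label/value pairs with no distinguishing classes or ids.
PAGE = """
<div>
  <div class="row"><span>Price</span><span>$10</span></div>
  <div class="row"><span>Author</span><span>Jane</span></div>
</div>
"""


def value_for_label(root, label):
    """Return the value next to a given label, or None if the label is absent."""
    # XPath predicate: pick the row that has a <span> child whose text equals the label.
    row = root.find(f".//div[span='{label}']")
    if row is None:
        return None
    # The value is the second <span> inside the matched row.
    return row.findall("span")[1].text


root = ET.fromstring(PAGE)
print(value_for_label(root, "Price"))
```

A CSS selector such as `div.row > span` would match the spans in both rows here, so the label text would have to be checked in separate code. Whether this kind of case influenced the choice is exactly what I would like to understand.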

Your insights would be valuable for those of us working on similar projects and trying to make informed decisions about selector methods in web scraping applications.
Thank you for your time and for sharing your research with the community.
