Accurate search #73

somewordstoolate · 2024-12-19T12:42:03Z

Background and Problem
When using the query "Haagsma C, van Riel P, de Jong A, van de Putte L. Combination of sulphasalazine and methotrexate versus the single components in early rheumatoid arthritis: a randomized, controlled, double-blind, 52 week clinical trial. British Journal of Rheumatology. 1997;36(10):1082.", the PDF could not be downloaded even though an accurate search result is available on Google Scholar.

Debugging showed that when an accurate search (e.g., by paper title) makes Google Scholar return only one result, that result's div has the class_ attribute value gs_r gs_or gs_scl gs_fmar instead of gs_r gs_or gs_scl, so a lookup for the shorter value alone does not find it.

Modifications
Renamed HTMLparsers.schoolarParser to HTMLparsers.scholarParser (fixing the "schoolar" misspelling) and changed its soup.findAll call so that it also matches the div with the class_ value used when there is only one search result.


goghvan1113 · 2024-12-27T13:21:54Z

Modifications for accurate search in the newest branch, v1.4.1 (in HTMLparsers.py):
replacing
for element in soup.findAll("div", class_="gs_r gs_or gs_scl"):
with

for element in soup.findAll(
    "div", class_=["gs_r gs_or gs_scl", "gs_r gs_or gs_scl gs_fmar"]
):  # "gs_r gs_or gs_scl gs_fmar" for only one search result
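
To make the matching rule concrete, here is a minimal, self-contained sketch of the same idea using only the standard library's html.parser as a stand-in for BeautifulSoup. It is not the project's code: the helper names (ResultDivCollector, find_result_classes) are hypothetical, and the sample markup is made up rather than real Google Scholar output. Only the two class strings come from this PR.

```python
from html.parser import HTMLParser

# The two class strings discussed in this PR; everything else is illustrative.
RESULT_CLASSES = {"gs_r gs_or gs_scl", "gs_r gs_or gs_scl gs_fmar"}


class ResultDivCollector(HTMLParser):
    """Hypothetical helper: records the class string of each matching div."""

    def __init__(self):
        super().__init__()
        self.matched = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        # attrs is a list of (name, value) pairs; value can be None.
        class_value = dict(attrs).get("class") or ""
        # Collapse whitespace so the comparison is against the full class string.
        normalized = " ".join(class_value.split())
        if normalized in RESULT_CLASSES:
            self.matched.append(normalized)


def find_result_classes(html):
    """Return the class strings of all divs that look like result blocks."""
    parser = ResultDivCollector()
    parser.feed(html)
    parser.close()
    return parser.matched


# Made-up markup for the three cases of interest.
multi = '<div class="gs_r gs_or gs_scl">A</div><div class="gs_r gs_or gs_scl">B</div>'
single = '<div class="gs_r gs_or gs_scl gs_fmar">Only hit</div>'
other = '<div class="gs_r">Not a result block</div>'

print(find_result_classes(multi))
print(find_result_classes(single))
print(find_result_classes(other))
```

With only the first class string in RESULT_CLASSES, the single-result markup would yield no matches, which is the failure described in this PR.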

Yushuhuan added 2 commits December 19, 2024 19:51

change function schoolarParser for accurate search

24f3f61

change function scholarParser (previously named schoolarParser) for a…

cf04533

…ccurate search
