feat(community): Added Google Scholar Integration (#7278)
Co-authored-by: jacoblee93 <[email protected]>
1 parent 475288c, commit 8832af3
Showing 8 changed files with 366 additions and 0 deletions.
docs/core_docs/docs/integrations/tools/google_scholar.ipynb: 185 changes (185 additions & 0 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,185 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "raw", | ||
"id": "10238e62-3465-4973-9279-606cbb7ccf16", | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "raw" | ||
} | ||
}, | ||
"source": [ | ||
"---\n", | ||
"sidebar_label: Google Scholar\n", | ||
"---" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "a6f91f20", | ||
"metadata": {}, | ||
"source": [ | ||
"# Google Scholar Tool\n", | ||
"\n", | ||
"This notebook provides a quick overview for getting started with [`SERPGoogleScholarAPITool`](https://api.js.langchain.com/classes/_langchain_community.tools_google_scholar.SERPGoogleScholarAPITool.html). For detailed documentation of all `SERPGoogleScholarAPITool` features and configurations, head to the [API reference](https://api.js.langchain.com/classes/_langchain_community.tools_google_scholar.SERPGoogleScholarAPITool.html).\n", | ||
"\n", | ||
"## Overview\n", | ||
"\n", | ||
"### Integration details\n", | ||
"\n", | ||
"| Class | Package | [PY support](https://python.langchain.com/docs/integrations/tools/google_scholar/) | Package latest |\n", | ||
"| :--- | :--- | :---: | :---: |\n", | ||
"| [SERPGoogleScholarAPITool](https://api.js.langchain.com/classes/_langchain_community.tools_google_scholar.SERPGoogleScholarAPITool.html) | [@langchain/community](https://www.npmjs.com/package/@langchain/community) | ✅ | ![NPM - Version](https://img.shields.io/npm/v/@langchain/community?style=flat-square&label=%20&) |\n", | ||
"\n", | ||
"### Tool features\n", | ||
"\n", | ||
"- Search Google Scholar through SerpApi with a free-text query, such as a topic or an author name.\n", | ||
"- Return each result's title, link, snippet, publication info, authors, and total citation count.\n", | ||
"- Results are returned as a JSON-formatted string.\n", | ||
"\n", | ||
"## Setup\n", | ||
"\n", | ||
"The integration lives in the `@langchain/community` package.\n", | ||
"\n", | ||
"```bash\n", | ||
"npm install @langchain/community\n", | ||
"```\n", | ||
"\n", | ||
"### Credentials\n", | ||
"\n", | ||
"The tool queries Google Scholar through [SerpApi](https://serpapi.com/), so you need a SerpApi API key. Set it as the `SERPAPI_API_KEY` environment variable:\n", | ||
"\n", | ||
"```typescript\n", | ||
"process.env.SERPAPI_API_KEY=\"your-serp-api-key\"\n", | ||
"```\n", | ||
"\n", | ||
"It's also helpful to set up [LangSmith](https://smith.langchain.com/) for best-in-class observability:\n", | ||
"\n", | ||
"```typescript\n", | ||
"process.env.LANGCHAIN_TRACING_V2=\"true\"\n", | ||
"process.env.LANGCHAIN_API_KEY=\"your-langchain-api-key\"\n", | ||
"```" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "1c97218f-f366-479d-8bf7-fe9f2f6df73f", | ||
"metadata": {}, | ||
"source": [ | ||
"## Instantiation\n", | ||
"\n", | ||
"You can import and instantiate an instance of the `SERPGoogleScholarAPITool` tool like this:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"id": "8b3ddfe9-ca79-494c-a7ab-1f56d9407a64", | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "typescript" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import { SERPGoogleScholarAPITool } from \"@langchain/community/tools/google_scholar\";\n", | ||
"\n", | ||
"const tool = new SERPGoogleScholarAPITool({\n", | ||
" apiKey: process.env.SERPAPI_API_KEY,\n", | ||
"});" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "74147a1a", | ||
"metadata": {}, | ||
"source": [ | ||
"## Invocation\n", | ||
"\n", | ||
"### Invoke directly with args\n", | ||
"\n", | ||
"The tool takes a single search query string as input, so you can invoke it directly with that string:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "65310a8b-eb0c-4d9e-a618-4f4abe2414fc", | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "typescript" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"const results = await tool.invoke(\"neural networks\");\n", | ||
"\n", | ||
"console.log(results);" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "d6e73897", | ||
"metadata": {}, | ||
"source": [ | ||
"### Invoke with ToolCall\n", | ||
"\n", | ||
"We can also invoke the tool with a model-generated `ToolCall`:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "f90e33a7", | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "typescript" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"const modelGeneratedToolCall = {\n", | ||
" args: { input: \"machine learning\" },\n", | ||
" id: \"1\",\n", | ||
" name: tool.name,\n", | ||
" type: \"tool_call\",\n", | ||
"};\n", | ||
"await tool.invoke(modelGeneratedToolCall);" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "93848b02", | ||
"metadata": {}, | ||
"source": [ | ||
"## API reference\n", | ||
"\n", | ||
"For detailed documentation of all `SERPGoogleScholarAPITool` features and configurations, head to the [API reference](https://api.js.langchain.com/classes/_langchain_community.tools_google_scholar.SERPGoogleScholarAPITool.html)." | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "poetry-venv-311", | ||
"language": "python", | ||
"name": "poetry-venv-311" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.9" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
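The tool in this commit returns its results as a JSON string rather than as objects, so code that consumes the notebook's `tool.invoke(...)` output will usually need to parse it first. The following is a minimal sketch and not part of the commit: `ScholarEntry` and `parseScholarOutput` are hypothetical names, the field list is inferred from the `_call` implementation in this diff, and the sample strings are made up for illustration.

```typescript
// Hypothetical type describing one entry of the tool's JSON output,
// inferred from the fields built in `_call`.
interface ScholarEntry {
  title?: string;
  link?: string;
  snippet?: string;
  publication_info: string;
  authors: string;
  total_citations: number | "";
}

// Parse the string returned by the tool. When there are no results the tool
// serializes a plain message string instead of an array, so anything that is
// not an array is treated as "no entries".
function parseScholarOutput(raw: string): ScholarEntry[] {
  const parsed: unknown = JSON.parse(raw);
  return Array.isArray(parsed) ? (parsed as ScholarEntry[]) : [];
}

// Made-up outputs standing in for real tool responses.
const exampleOutput = JSON.stringify([
  {
    title: "An Example Paper",
    link: "https://example.com/paper",
    snippet: "A short description.",
    publication_info: "Example Journal, 2020 - example.com",
    authors: "A Author, B Author",
    total_citations: 12,
  },
]);
const noResultsOutput = JSON.stringify(
  "No results found for some query on Google Scholar."
);

for (const entry of parseScholarOutput(exampleOutput)) {
  console.log(`${entry.title} (${entry.authors})`);
}
console.log(parseScholarOutput(noResultsOutput).length);
```

Treating the non-array case as an empty list is one design choice among several; a caller that needs to surface the "no results" message could return it separately instead.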
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
import { Tool } from "@langchain/core/tools"; | ||
import { getEnvironmentVariable } from "@langchain/core/utils/env"; | ||
|
||
/** | ||
* Interface for parameters required by the SERPGoogleScholarAPITool class. | ||
*/ | ||
export interface GoogleScholarAPIParams { | ||
/** | ||
* Optional API key for accessing the SerpApi service. | ||
*/ | ||
apiKey?: string; | ||
} | ||
|
||
/** | ||
* Tool for querying Google Scholar using the SerpApi service. | ||
*/ | ||
export class SERPGoogleScholarAPITool extends Tool { | ||
/** | ||
* Specifies the name of the tool, used internally by LangChain. | ||
*/ | ||
static lc_name() { | ||
return "SERPGoogleScholarAPITool"; | ||
} | ||
|
||
/** | ||
* Returns a mapping of secret environment variable names to their usage in the tool. | ||
* @returns {object} Mapping of secret names to their environment variable counterparts. | ||
*/ | ||
get lc_secrets(): { [key: string]: string } | undefined { | ||
return { | ||
apiKey: "SERPAPI_API_KEY", | ||
}; | ||
} | ||
|
||
// Name of the tool, used for logging or identification within LangChain. | ||
name = "serp_google_scholar"; | ||
|
||
// The API key used for making requests to SerpApi. | ||
protected apiKey: string; | ||
|
||
/** | ||
* Description of the tool for usage documentation. | ||
*/ | ||
description = `A wrapper around Google Scholar API via SerpApi. Useful for querying academic | ||
articles and papers by keywords or authors. Input should be a search query string.`; | ||
|
||
/** | ||
* Constructs a new instance of SERPGoogleScholarAPITool. | ||
* @param fields - Optional parameters including an API key. | ||
*/ | ||
constructor(fields?: GoogleScholarAPIParams) { | ||
super(...arguments); | ||
|
||
// Retrieve API key from fields or environment variables. | ||
const apiKey = fields?.apiKey ?? getEnvironmentVariable("SERPAPI_API_KEY"); | ||
|
||
// Throw an error if no API key is found. | ||
if (!apiKey) { | ||
throw new Error( | ||
`SerpApi key not set. You can set it as "SERPAPI_API_KEY" in your environment variables.` | ||
); | ||
} | ||
this.apiKey = apiKey; | ||
} | ||
|
||
/** | ||
* Makes a request to SerpApi for Google Scholar results. | ||
* @param input - Search query string. | ||
* @returns A JSON string containing the search results. | ||
* @throws Error if the API request fails or returns an error. | ||
*/ | ||
async _call(input: string): Promise<string> { | ||
// Construct the URL for the API request. | ||
const url = `https://serpapi.com/search.json?q=${encodeURIComponent( | ||
input | ||
)}&engine=google_scholar&api_key=${this.apiKey}`; | ||
|
||
// Make an HTTP GET request to the SerpApi service. | ||
const response = await fetch(url); | ||
|
||
// Handle non-OK responses by extracting the error message. | ||
if (!response.ok) { | ||
let message; | ||
try { | ||
const json = await response.json(); // Attempt to parse the error response. | ||
message = json.error; // Extract the error message from the response. | ||
} catch (error) { | ||
// Handle cases where the response isn't valid JSON. | ||
message = | ||
"Unable to parse error message: SerpApi did not return a JSON response."; | ||
} | ||
// Throw an error with detailed information about the failure. | ||
throw new Error( | ||
`Got ${response.status}: ${response.statusText} error from SerpApi: ${message}` | ||
); | ||
} | ||
|
||
// Parse the JSON response from SerpApi. | ||
const json = await response.json(); | ||
|
||
// Transform the raw response into a structured format. | ||
const results = | ||
json.organic_results?.map((item: any) => ({ | ||
title: item.title, // Title of the article or paper. | ||
link: item.link, // Direct link to the article or paper. | ||
snippet: item.snippet, // Brief snippet or description. | ||
publication_info: | ||
item.publication_info?.summary | ||
?.split(" - ") // Split the summary at hyphens. | ||
.slice(1) // Remove the authors from the start of the string. | ||
.join(" - ") ?? "", // Rejoin remaining parts as publication info. | ||
authors: | ||
item.publication_info?.authors | ||
?.map((author: any) => author.name) // Extract the list of author names. | ||
.join(", ") ?? "", // Join author names with a comma. | ||
total_citations: item.inline_links?.cited_by?.total ?? "", // Total number of citations. | ||
})) ?? `No results found for ${input} on Google Scholar.`; | ||
|
||
// Return the results as a formatted JSON string. | ||
return JSON.stringify(results, null, 2); | ||
} | ||
} |
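The result-mapping step in `_call` above can be tried in isolation, without a SerpApi key or a network request. The sketch below mirrors that mapping on a single record. It is an illustration only: `OrganicResult` and `summarizeResult` are hypothetical names that do not appear in the commit, the interface lists only the fields the tool reads, and the sample record is made up rather than taken from a real SerpApi response.

```typescript
// Hypothetical shape of one `organic_results` item, limited to the fields
// that the tool actually reads.
interface OrganicResult {
  title?: string;
  link?: string;
  snippet?: string;
  publication_info?: {
    summary?: string;
    authors?: { name: string }[];
  };
  inline_links?: { cited_by?: { total?: number } };
}

// Same per-item mapping as in `_call`, extracted into a standalone function.
function summarizeResult(item: OrganicResult) {
  return {
    title: item.title,
    link: item.link,
    snippet: item.snippet,
    // The summary is split on " - "; the first segment is dropped because
    // the tool treats it as the author list.
    publication_info:
      item.publication_info?.summary?.split(" - ").slice(1).join(" - ") ?? "",
    authors:
      item.publication_info?.authors?.map((a) => a.name).join(", ") ?? "",
    total_citations: item.inline_links?.cited_by?.total ?? "",
  };
}

// Made-up sample record, for illustration only.
const sample: OrganicResult = {
  title: "An Example Paper",
  link: "https://example.com/paper",
  snippet: "A short description.",
  publication_info: {
    summary: "A Author, B Author - Example Journal, 2020 - example.com",
    authors: [{ name: "A Author" }, { name: "B Author" }],
  },
  inline_links: { cited_by: { total: 12 } },
};

const summarized = summarizeResult(sample);
console.log(summarized.publication_info); // prints "Example Journal, 2020 - example.com"
console.log(summarized.authors); // prints "A Author, B Author"
```

Missing fields fall back to empty strings, so `summarizeResult({})` yields empty `publication_info`, `authors`, and `total_citations` values instead of throwing.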