From 12d74d5bef144bbdd4c76c08a6e9c286909eb8a7 Mon Sep 17 00:00:00 2001
From: Erick Friis
Date: Wed, 4 Dec 2024 10:15:34 -0800
Subject: [PATCH] docs: single security doc (#28515)

---
 SECURITY.md           | 37 +++++++++++++++++++++++++++++++------
 docs/Makefile         |  1 +
 docs/docs/security.md | 30 ------------------------------
 3 files changed, 32 insertions(+), 36 deletions(-)
 delete mode 100644 docs/docs/security.md

diff --git a/SECURITY.md b/SECURITY.md
index 50e0632582c68..15e44be0b4314 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -1,5 +1,30 @@
 # Security Policy
 
+LangChain has a large ecosystem of integrations with various external resources like local and remote file systems, APIs and databases. These integrations allow developers to create versatile applications that combine the power of LLMs with the ability to access, interact with and manipulate external resources.
+
+## Best practices
+
+When building such applications developers should remember to follow good security practices:
+
+* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc. as appropriate for your application.
+* **Anticipate Potential Misuse**: Just as humans can err, so can Large Language Models (LLMs). Always assume that any system access or credentials may be used in any way allowed by the permissions they are assigned. For example, if a pair of database credentials allows deleting data, it’s safest to assume that any LLM able to use those credentials may in fact delete data.
+* [**Defense in Depth**](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)): No security technique is perfect. Fine-tuning and good chain design can reduce, but not eliminate, the odds that a Large Language Model (LLM) may make a mistake. It’s best to combine multiple layered security approaches rather than relying on any single layer of defense to ensure security. For example: use both read-only permissions and sandboxing to ensure that LLMs are only able to access data that is explicitly meant for them to use.
+
+Risks of not doing so include, but are not limited to:
+* Data corruption or loss.
+* Unauthorized access to confidential information.
+* Compromised performance or availability of critical resources.
+
+Example scenarios with mitigation strategies:
+
+* A user may ask an agent with access to the file system to delete files that should not be deleted or read the content of files that contain sensitive information. To mitigate, limit the agent to only use a specific directory and only allow it to read or write files that are safe to read or write. Consider further sandboxing the agent by running it in a container.
+* A user may ask an agent with write access to an external API to write malicious data to the API, or delete data from that API. To mitigate, give the agent read-only API keys, or limit it to only use endpoints that are already resistant to such misuse.
+* A user may ask an agent with access to a database to drop a table or mutate the schema. To mitigate, scope the credentials to only the tables that the agent needs to access and consider issuing READ-ONLY credentials.
+
+If you're building applications that access external resources like file systems, APIs
+or databases, consider speaking with your company's security team to determine how to best
+design and secure your applications.
+
 ## Reporting OSS Vulnerabilities
 
 LangChain is partnered with [huntr by Protect AI](https://huntr.com/) to provide
@@ -14,7 +39,7 @@ Before reporting a vulnerability, please review:
 
 1) In-Scope Targets and Out-of-Scope Targets below.
 2) The [langchain-ai/langchain](https://python.langchain.com/docs/contributing/repo_structure) monorepo structure.
-3) LangChain [security guidelines](https://python.langchain.com/docs/security) to
+3) The [Best practices](#best-practices) above to
    understand what we consider to be a security vulnerability vs. developer
    responsibility.
 
@@ -33,13 +58,13 @@ The following packages and repositories are eligible for bug bounties:
 All out of scope targets defined by huntr as well as:
 
 - **langchain-experimental**: This repository is for experimental code and is not
-  eligible for bug bounties, bug reports to it will be marked as interesting or waste of
+  eligible for bug bounties (see [package warning](https://pypi.org/project/langchain-experimental/)), bug reports to it will be marked as interesting or waste of
   time and published with no bounty attached.
 - **tools**: Tools in either langchain or langchain-community are not eligible for bug
   bounties. This includes the following directories
-  - langchain/tools
-  - langchain-community/tools
-  - Please review our [security guidelines](https://python.langchain.com/docs/security)
+  - libs/langchain/langchain/tools
+  - libs/community/langchain_community/tools
+  - Please review the [best practices](#best-practices)
     for more details, but generally tools interact with the real world. Developers are
     expected to understand the security implications of their code and are responsible
     for the security of their tools.
@@ -47,7 +72,7 @@ All out of scope targets defined by huntr as well as:
   case basis, but likely will not be eligible for a bounty as the code is already
   documented with guidelines for developers that should be followed for making their
   application secure.
-- Any LangSmith related repositories or APIs see below.
+- Any LangSmith related repositories or APIs (see [Reporting LangSmith Vulnerabilities](#reporting-langsmith-vulnerabilities)).
 
 ## Reporting LangSmith Vulnerabilities
 
diff --git a/docs/Makefile b/docs/Makefile
index a3c41260e3dd0..f8d5d96714dd9 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -47,6 +47,7 @@ generate-files:
 	$(PYTHON) scripts/partner_pkg_table.py $(INTERMEDIATE_DIR)
 
 	curl https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md | sed 's/<=/\&lt;=/g' > $(INTERMEDIATE_DIR)/langserve.md
+	cp ../SECURITY.md $(INTERMEDIATE_DIR)/security.md
 	$(PYTHON) scripts/resolve_local_links.py $(INTERMEDIATE_DIR)/langserve.md https://github.com/langchain-ai/langserve/tree/main/
 
 copy-infra:
diff --git a/docs/docs/security.md b/docs/docs/security.md
deleted file mode 100644
index 08e841c89a25d..0000000000000
--- a/docs/docs/security.md
+++ /dev/null
@@ -1,30 +0,0 @@
-# Security
-
-LangChain has a large ecosystem of integrations with various external resources like local and remote file systems, APIs and databases. These integrations allow developers to create versatile applications that combine the power of LLMs with the ability to access, interact with and manipulate external resources.
-
-## Best practices
-
-When building such applications developers should remember to follow good security practices:
-
-* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc. as appropriate for your application.
-* **Anticipate Potential Misuse**: Just as humans can err, so can Large Language Models (LLMs). Always assume that any system access or credentials may be used in any way allowed by the permissions they are assigned. For example, if a pair of database credentials allows deleting data, it’s safest to assume that any LLM able to use those credentials may in fact delete data.
-* [**Defense in Depth**](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)): No security technique is perfect. Fine-tuning and good chain design can reduce, but not eliminate, the odds that a Large Language Model (LLM) may make a mistake. It’s best to combine multiple layered security approaches rather than relying on any single layer of defense to ensure security. For example: use both read-only permissions and sandboxing to ensure that LLMs are only able to access data that is explicitly meant for them to use.
-
-Risks of not doing so include, but are not limited to:
-* Data corruption or loss.
-* Unauthorized access to confidential information.
-* Compromised performance or availability of critical resources.
-
-Example scenarios with mitigation strategies:
-
-* A user may ask an agent with access to the file system to delete files that should not be deleted or read the content of files that contain sensitive information. To mitigate, limit the agent to only use a specific directory and only allow it to read or write files that are safe to read or write. Consider further sandboxing the agent by running it in a container.
-* A user may ask an agent with write access to an external API to write malicious data to the API, or delete data from that API. To mitigate, give the agent read-only API keys, or limit it to only use endpoints that are already resistant to such misuse.
-* A user may ask an agent with access to a database to drop a table or mutate the schema. To mitigate, scope the credentials to only the tables that the agent needs to access and consider issuing READ-ONLY credentials.
-
-If you're building applications that access external resources like file systems, APIs
-or databases, consider speaking with your company's security team to determine how to best
-design and secure your applications.
-
-## Reporting a vulnerability
-
-Please report security vulnerabilities by email to security@langchain.dev. This will ensure the issue is promptly triaged and acted upon as needed.
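
The file-system mitigation in the best practices (limit the agent to a specific directory) could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a LangChain API: `resolve_inside` is a hypothetical helper name, and a check like this in application code is one layer to combine with container sandboxing and OS-level permissions, not a replacement for them.

```python
from pathlib import Path


def resolve_inside(base_dir: str, requested: str) -> Path:
    """Return the real path for `requested`, or raise if it escapes `base_dir`.

    Hypothetical helper for illustration only; not part of LangChain.
    """
    base = Path(base_dir).resolve()
    # Joining and then resolving collapses ".." segments and follows symlinks,
    # so the containment check below is made against the real target path.
    # An absolute `requested` replaces `base` in the join and is rejected
    # by the same check unless it already lies inside `base`.
    candidate = (base / requested).resolve()
    if candidate != base and base not in candidate.parents:
        raise PermissionError(f"{requested!r} is outside the allowed directory")
    return candidate
```

A file tool would call `resolve_inside(allowed_dir, user_supplied_path)` before every read or write and treat `PermissionError` as a refusal. Comparing against `candidate.parents` instead of using a string prefix avoids accepting a sibling directory whose name merely starts with the allowed directory's name.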