Construct a Repo-Specific Semantic Graph of a project. For simplicity, the Python package is named dependency_graph.
- Multi-language support: Python, Java, C#, JavaScript, TypeScript.
- Retrieval of cross-file context from the Repo-Specific Semantic Graph.
- Serialization/Deserialization: the graph can be exported as an edge list in JSON format and deserialized from JSON or a dictionary.
- Graph visualizations: Pyvis and ipysigma interactive visualizations are supported.
- Analysis on a virtual file system: the graph can be constructed from a virtual file system, which is useful for real-time analysis when the codebase originates from a dataset and the original file structure is either unavailable or cannot be restored.
- Topological sorting: the graph can be ordered according to a chosen relation, which is particularly useful for sorting project files by import dependencies.
Currently, the following Repo-Specific Semantic Graph generator types are supported, with the corresponding languages and dependency relations:
Graph Generator Type | Supported Languages | Supported Dependency Relations |
---|---|---|
Jedi | Python | ParentOf/ChildOf, Imports/ImportedBy, BaseClassOf/DerivedClassOf, Overrides/OverriddenBy, Calls/CalledBy, Instantiates/InstantiatedBy, Uses/UsedBy, Defines/DefinedBy |
Tree-sitter | Python, Java, C, C++, Kotlin, C#, TypeScript, JavaScript, Go, PHP, Rust, Swift, Ruby, Lua, Bash, R | Imports/ImportedBy |
First, install Poetry by following https://python-poetry.org/docs/#installation, then run the following commands to create a Python virtual environment and install the dependencies:
poetry install
# Activate the virtual environment
poetry shell
Run python -m dependency_graph -h for help:
~ python -m dependency_graph -h
usage: __main__.py [-h] -r REPO -l LANG [-g GRAPH_GENERATOR] [-f {edgelist,pyvis,ipysigma}] [-o OUTPUT_FILE]
Construct Repo-Specific Semantic Graph for a given project.
options:
-h, --help show this help message and exit
-r REPO, --repo REPO The path to a local repository.
-l LANG, --lang LANG The language of the parsed file.
-g GRAPH_GENERATOR, --graph-generator GRAPH_GENERATOR
The code agent type to use. Should be one of the ['jedi', 'tree_sitter']. Defaults to jedi.
-f {edgelist,pyvis,ipysigma}, --output-format {edgelist,pyvis,ipysigma}
The format of the output.
-o OUTPUT_FILE, --output-file OUTPUT_FILE
The path to the output file. If not specified, will print to stdout.
Example:
python -m dependency_graph -r </path/to/repo> -l python -g jedi -f edgelist -o edgelist.json
from dependency_graph import construct_dependency_graph
from dependency_graph.models.language import Language
from dependency_graph.graph_generator import GraphGeneratorType
from dependency_graph.models.repository import Repository
dependency_graph_generator = GraphGeneratorType.JEDI
repo = Repository('/path/to/repo', Language.Python)
graph = construct_dependency_graph(repo, dependency_graph_generator)
from dependency_graph import output_dependency_graph
output_dependency_graph(graph, "edgelist", "graph.json")
# Or
output_dependency_graph(graph, "pyvis", "graph.html")
# Or
output_dependency_graph(graph, "ipysigma", "graph.html")
Get cross-file context of a specific file path:
graph.as_retriever().get_cross_file_context("/path/into/repo")
Get cross-file definition of a file by line:
graph.as_retriever().get_cross_file_definition_by_line("/path/into/repo")
Edges can be retrieved by relation. For example, to get the call graph of the project:
from dependency_graph.models.graph_data import EdgeRelation
graph.get_related_edges(EdgeRelation.Calls)
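To make the idea of selecting edges by relation concrete, here is a minimal, self-contained sketch. The Relation enum, the Edge tuple, and the related_edges helper are hypothetical stand-ins written for this illustration; they are not the package's EdgeRelation or get_related_edges, and the node names are made up.

```python
from enum import Enum
from typing import NamedTuple


class Relation(Enum):
    # Hypothetical subset of the relation pairs listed in the table above.
    Imports = "Imports"
    ImportedBy = "ImportedBy"
    Calls = "Calls"
    CalledBy = "CalledBy"


class Edge(NamedTuple):
    source: str
    relation: Relation
    target: str


def related_edges(edges: list[Edge], relation: Relation) -> list[Edge]:
    """Return only the edges that carry the requested relation, in input order."""
    return [edge for edge in edges if edge.relation is relation]


# Each dependency is stored in both directions, mirroring the paired relations.
edges = [
    Edge("src/a.py", Relation.Imports, "src/b.py"),
    Edge("src/b.py", Relation.ImportedBy, "src/a.py"),
    Edge("a.bar", Relation.Calls, "b.foo"),
    Edge("b.foo", Relation.CalledBy, "a.bar"),
]

call_graph = related_edges(edges, Relation.Calls)
```

Here call_graph holds the single Calls edge from a.bar to b.foo. Filtering on one relation of a pair is enough to recover a directed view such as a call graph.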
The graph can be constructed from a virtual file system, making it useful for real-time analysis. This approach is particularly beneficial when the codebase originates from a dataset and the original file structure is either unavailable or cannot be restored.
from textwrap import dedent
from dependency_graph import construct_dependency_graph
from dependency_graph.models.language import Language
from dependency_graph.graph_generator import GraphGeneratorType
from dependency_graph.models.virtual_fs.virtual_repository import VirtualRepository
dependency_graph_generator = GraphGeneratorType.JEDI
repo = VirtualRepository(
    "repo",
    Language.Python,
    [
        (
            "src/a.py",
            dedent(
                """
                from b import foo
                def bar():
                    print("hello")
                    foo()
                """
            ),
        ),
        (
            "src/b.py",
            dedent(
                """
                def foo():
                    print("hello")
                """
            ),
        ),
        (
            "src/c.py",
            dedent(
                """
                from pkg_not_exist import module
                """
            ),
        ),
    ],
)
graph = construct_dependency_graph(repo, dependency_graph_generator)
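As a rough illustration of why in-memory analysis works without a file tree on disk, the sketch below derives import edges from the same kind of (path, content) pairs using only the standard library's ast module. It is not how the package or its Jedi and Tree-sitter generators work internally. The import_edges helper and its module-name resolution rule are assumptions made for this example.

```python
import ast
from textwrap import dedent

# Same shape as the (path, content) pairs passed to VirtualRepository above.
files = [
    ("src/a.py", dedent("""
        from b import foo
        def bar():
            print("hello")
            foo()
    """)),
    ("src/b.py", dedent("""
        def foo():
            print("hello")
    """)),
    ("src/c.py", "from pkg_not_exist import module\n"),
]


def import_edges(files):
    """Return (importer_path, imported_path) pairs resolved within the file set."""
    # Assumption: a module is identified by its file name without ".py".
    module_to_path = {
        path.rsplit("/", 1)[-1].removesuffix(".py"): path for path, _ in files
    }
    edges = []
    for path, source in files:
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                names = [node.module]
            else:
                continue
            for name in names:
                target = module_to_path.get(name.split(".")[0])
                if target is not None:  # imports outside the file set are skipped
                    edges.append((path, target))
    return edges


edges = import_edges(files)
print(edges)  # [('src/a.py', 'src/b.py')]
```

In this sketch the import in src/c.py produces no edge because pkg_not_exist is not part of the file set. No file is ever written to or read from disk.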
The graph supports topological sorting, which orders its nodes according to a chosen relation. This is particularly useful for sorting project files by import dependencies. Sorting still works when the import graph is cyclic, and the resulting order is unique, stable, and consistent.
from dependency_graph import (
    construct_dependency_graph,
    GraphGeneratorType,
)
from dependency_graph.models.language import Language
from dependency_graph.models.graph_data import EdgeRelation
repo_path = "/path/to/repo"
graph = construct_dependency_graph(
    repo_path,
    GraphGeneratorType.TREE_SITTER,
    Language.Python,
)
sorted_nodes = list(graph.get_topological_sorting(EdgeRelation.ImportedBy))
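For intuition on how an ordering can stay deterministic even when imports form a cycle, here is a minimal sketch of a depth-first, dependencies-first ordering that skips back edges. This is a generic technique, not necessarily the algorithm used by get_topological_sorting. The function name, the dictionary input format, and the file names are assumptions made for this example.

```python
def topological_order(nodes, dependencies):
    """Order nodes so that each node follows its dependencies where possible.

    `dependencies` maps a node to the nodes it depends on (e.g. its imports).
    Nodes and dependencies are visited in sorted order, so the same input
    always yields the same output. An edge that closes a cycle is ignored.
    Recursive for brevity; very deep graphs would need an iterative version.
    """
    order = []
    seen = set()  # nodes that are finished or currently being visited

    def visit(node):
        if node in seen:
            return  # already placed, or a back edge into the current path
        seen.add(node)
        for dependency in sorted(dependencies.get(node, ())):
            visit(dependency)
        order.append(node)  # emitted only after all reachable dependencies

    for node in sorted(nodes):
        visit(node)
    return order


# a.py imports b.py; b.py and c.py import each other (a cycle); d.py is isolated.
dependencies = {
    "a.py": ["b.py"],
    "b.py": ["c.py"],
    "c.py": ["b.py"],
    "d.py": [],
}
order = topological_order(["d.py", "c.py", "b.py", "a.py"], dependencies)
print(order)  # ['c.py', 'b.py', 'a.py', 'd.py']
```

Because traversal always proceeds in sorted order, shuffling the input list does not change the result. Within a cycle, the node reached first by the traversal is placed after the other members of that cycle.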