Semantic flow graphs for data science

epatters/semanticflowgraph

Semantic flow graphs

Create semantic dataflow graphs of data science code.

Using this package, you can convert data science code to dataflow graphs with semantic content. The package works in tandem with the Data Science Ontology and our language-specific program analysis tools. Currently Python and R are supported.

For more information, please see our research paper on "Teaching machines to understand data science code by semantic enrichment of dataflow graphs".

Command-line interface

We provide a CLI that supports the recording, semantic enrichment, and visualization of flow graphs. To set up the CLI, install this package and add the bin directory to your PATH. Invoke the CLI by running flowgraphs.jl in your terminal.
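As a minimal sketch of the PATH setup described above: the checkout location below (`$HOME/src/semanticflowgraph`) is a hypothetical placeholder, not a path defined by this package, so substitute wherever the package is actually installed on your machine.

```shell
# Hypothetical location of the package; override SFG_DIR to match your install.
SFG_DIR="${SFG_DIR:-$HOME/src/semanticflowgraph}"

# Prepend the package's bin directory so the shell can find flowgraphs.jl.
export PATH="$SFG_DIR/bin:$PATH"

# Report where the CLI was found, or warn if the directory guess was wrong.
command -v flowgraphs.jl || echo "flowgraphs.jl not found on PATH; check SFG_DIR"
```

To make the change permanent, the `export` line would typically go in your shell's startup file.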

The CLI includes the following commands:

  • record: Record a raw flow graph by running a script.
    Requirements: To record a Python script, you must install the Julia package PyCall.jl and the Python package flowgraph. Likewise, to record an R script, you must install the Julia package RCall.jl and the R package flowgraph.
  • enrich: Convert a raw flow graph to a semantic flow graph.
  • visualize: Visualize a flow graph using Graphviz.
    Requirements: To output an image, using the --to switch, you must install Graphviz.

Every command takes as its primary argument either a directory, whose files are filtered by extension, or a single file, which may have any name.

CLI examples

Record all Python/R scripts in the current directory, yielding raw flow graphs:

flowgraphs.jl record .

Convert a raw flow graph to a semantic flow graph:

flowgraphs.jl enrich my_script.py.graphml --out my_script.graphml

Visualize a semantic flow graph, creating and opening an SVG file:

flowgraphs.jl visualize my_script.graphml --to svg --open
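
The three commands can also be chained for a single script. The following is a sketch only. It assumes that recording `my_script.py` writes the raw graph to `my_script.py.graphml`, which is inferred from the enrich example above rather than stated explicitly, and it uses only the switches shown in this README.

```shell
# Sketch: record, enrich, and visualize one script in sequence.
script="my_script.py"
raw="${script}.graphml"           # assumed output name of the record step
semantic="${script%.*}.graphml"   # drop the .py extension: my_script.graphml

if command -v flowgraphs.jl >/dev/null 2>&1; then
  # Each step runs only if the previous one succeeded.
  flowgraphs.jl record "$script" &&
    flowgraphs.jl enrich "$raw" --out "$semantic" &&
    flowgraphs.jl visualize "$semantic" --to svg
else
  echo "flowgraphs.jl is not on PATH" >&2
fi
```

If the record step names its output differently on your system, adjust `raw` accordingly.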
