To report a bug or issue with Daft, please include the following details:
- Operating system
- Daft version
- Python version
- Runner that your code is using
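The details above can be gathered with a short script. This is a minimal sketch, not an official Daft tool: the function name `collect_bug_report_details` is hypothetical, it assumes the installed `daft` module exposes a `__version__` attribute, and it only inspects the `DAFT_RUNNER` environment variable, so a runner selected in code is not detected.

```python
import os
import platform
import sys


def collect_bug_report_details():
    """Gather the four details requested for a bug report (hypothetical helper)."""
    try:
        import daft

        # Assumption: the module exposes __version__; fall back if it does not.
        daft_version = str(getattr(daft, "__version__", "unknown"))
    except ImportError:
        daft_version = "not installed"

    return {
        "Operating system": platform.platform(),
        "Daft version": daft_version,
        "Python version": sys.version.split()[0],
        # Only reflects the environment variable; a runner set in code is not seen.
        "Runner": os.environ.get("DAFT_RUNNER", "py"),
    }


for name, value in collect_bug_report_details().items():
    print(f"{name}: {value}")
```

The printed lines can be pasted directly into the issue description.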
To propose a new feature, please start a GitHub Discussion in our Ideas channel. Once the feature has been clarified, fleshed out, and approved, the corresponding issue(s) will be created from the GitHub discussion.
When proposing features, please include:
- Feature Summary (no more than 3 sentences)
- Example usage (pseudo-code to show how it is used)
- Corner-case behavior (how should this code behave in various corner-case scenarios)
To set up your development environment:
- Ensure that your system has a suitable Python version installed (>=3.7, <=3.11)
- Install the Rust compilation toolchain
- Clone the Daft repo: git clone git@github.com:Eventual-Inc/Daft.git
- Run make .venv from your newly cloned Daft repository to create a new virtual environment with all of Daft's development dependencies installed
- Run make hooks to install pre-commit hooks: these will run tooling on every commit to ensure that your code meets Daft development standards
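Before creating the virtual environment, it can help to confirm that the interpreter falls inside the supported range. The following is a minimal sketch; the helper name `is_supported` and the constants are hypothetical and simply encode the ">=3.7, <=3.11" range stated above.

```python
import sys

# Supported range taken from the setup instructions (>=3.7, <=3.11).
MIN_VERSION = (3, 7)
MAX_VERSION = (3, 11)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if not is_supported():
    print("Warning: this Python version is outside the range listed for Daft development")
```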
Once your environment is set up, the following commands are available:
- make build: recompile your code after modifying any Rust code in src/
- make test: run tests
- DAFT_RUNNER=ray make test: set the runner to the Ray runner and run tests (DAFT_RUNNER defaults to py)
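When driving the test suite from a script, the runner can be selected by setting `DAFT_RUNNER` in the child process environment. The sketch below is illustrative: `env_for_runner` is a hypothetical helper, and restricting the value to `py` and `ray` is an assumption based on the two runners named above.

```python
import os

# Assumption: only the two runners mentioned in this guide are accepted here.
KNOWN_RUNNERS = ("py", "ray")


def env_for_runner(runner="py", base_env=None):
    """Return a copy of the environment with DAFT_RUNNER set to `runner`."""
    if runner not in KNOWN_RUNNERS:
        raise ValueError(f"unknown runner: {runner!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["DAFT_RUNNER"] = runner
    return env
```

For example, `subprocess.run(["make", "test"], env=env_for_runner("ray"), check=True)` would run the tests with the Ray runner from the repository root.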
Running a development version of Daft on a local Ray cluster is as simple as including daft.context.set_runner_ray() in your Python script and then building and executing it as usual.
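A script following this pattern might look like the sketch below. It assumes a development build of Daft and Ray are installed in the active environment; `run_on_local_ray` is a hypothetical name, and the use of `daft.from_pydict` and `show()` to produce some output is an assumption about the DataFrame API rather than something this guide specifies.

```python
def run_on_local_ray():
    """Build a tiny DataFrame using the Ray runner (illustrative only)."""
    # Imported inside the function so the module can be loaded without Daft installed.
    import daft

    # Select the Ray runner before creating any DataFrames.
    daft.context.set_runner_ray()

    # Assumption: from_pydict and show are available in the installed version.
    df = daft.from_pydict({"x": [1, 2, 3]})
    df.show()
```

Calling `run_on_local_ray()` after `make build` exercises the freshly compiled code on the local Ray cluster.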
To use a remote Ray cluster, run the following steps on the same operating system version as your Ray nodes, in order to ensure that your binaries are executable on Ray.
- mkdir wd: this is the working directory, it will hold all the files to be submitted to Ray for a job
- ln -s daft wd/daft: create a symbolic link from the Python module to the working directory
- make build-release: an optimized build to ensure that the module is small enough to be successfully uploaded to Ray. Run this after modifying any Rust code in src/
- ray job submit --working-dir wd --address "http://<head_node_host>:8265" -- python script.py: submit wd/script.py to be run on Ray
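If the submission step is scripted, the command can be assembled as an argument list so that the host name is substituted in one place. This is a sketch only: `ray_submit_command` is a hypothetical helper, and `head.example` in the usage note is a placeholder host, not a real address.

```python
def ray_submit_command(head_node_host, script="script.py", working_dir="wd", port=8265):
    """Return the `ray job submit` invocation from the steps above as an argument list."""
    return [
        "ray", "job", "submit",
        "--working-dir", working_dir,
        "--address", f"http://{head_node_host}:{port}",
        "--",
        "python", script,
    ]
```

Passing the result to `subprocess.run(ray_submit_command("head.example"), check=True)` submits `wd/script.py`, assuming the `ray` CLI is on the PATH and the working directory has been prepared as described.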
Benchmark tests are located in tests/benchmarks. If you would like to run benchmarks, make sure to first do make build-release instead of make build in order to compile an optimized build of Daft.
- pytest tests/benchmarks/[test_file.py] -m benchmark: Run all benchmarks in a file
- pytest tests/benchmarks/[test_file.py] -k [test_name] -m benchmark: Run a specific benchmark in a file
More information about writing and using benchmarks can be found on the pytest-benchmark docs.