Package review PR #1

Open: wants to merge 57 commits into base: test
Showing changes from 46 commits

Commits (57)
5237a10
Initial commit
KarthikSubbarao Apr 26, 2024
a96c08a
Add support for BF.ADD, BF.EXISTS, BF.CARD
KarthikSubbarao Apr 26, 2024
be97a25
Update README
KarthikSubbarao Apr 26, 2024
5ebc1cb
Optimize bloom operations to use a reference to already created Bloom…
KarthikSubbarao Apr 27, 2024
8557b55
Remove older Bloom APIs using prev serialization apporach
KarthikSubbarao Apr 27, 2024
1729520
Add Dev Profile in Cargo.toml for debugging
KarthikSubbarao Apr 27, 2024
be80d9d
Create Module Datatype and support RDB load, save, free
KarthikSubbarao Apr 27, 2024
ed4d0f9
Update README.md
KarthikSubbarao Apr 27, 2024
2db0e4b
Add support for BF.RESERVE and expansion config
KarthikSubbarao Apr 27, 2024
778c219
Add support for BF.INFO
KarthikSubbarao Apr 27, 2024
b11970e
Add support for BF.MEXISTS
KarthikSubbarao Apr 27, 2024
bdf5f0d
Add support for BF.MADD
KarthikSubbarao Apr 27, 2024
8f5e3fc
Update error handling / messages and update expansion logic
KarthikSubbarao Apr 28, 2024
5545680
Add auto scaling support for bloom filters
KarthikSubbarao Apr 28, 2024
42d015c
Fix RDB Save for scaled filters
KarthikSubbarao Apr 28, 2024
481ecac
Refactoring
KarthikSubbarao Apr 28, 2024
ff60817
Update TODOs
KarthikSubbarao Apr 28, 2024
1c1b6ab
Fix mem_usage calculation
KarthikSubbarao Apr 28, 2024
d00fa99
Update Cargo.toml
KarthikSubbarao Apr 28, 2024
053d5e9
minor refactoring
KarthikSubbarao Apr 28, 2024
fbdfb5b
Add support for BF.INSERT and fix multi add logic
KarthikSubbarao Apr 29, 2024
a2343b7
Update error messages
KarthikSubbarao Apr 29, 2024
753575a
Type conversions + struct updates to use less memory
KarthikSubbarao Apr 29, 2024
3f85cae
fixed few clippy warnings
KarthikSubbarao Apr 29, 2024
159c938
fix clippy warnings
KarthikSubbarao Apr 29, 2024
bebf72a
Update README and address all clippy warnings
KarthikSubbarao Apr 30, 2024
ef4e215
Specify package license
KarthikSubbarao Apr 30, 2024
6ca68c5
Cargo fmt
KarthikSubbarao Apr 30, 2024
b7504e9
Add support for COPY commands on BloomFilter datatypes
KarthikSubbarao May 1, 2024
3ebb15b
Add more documentation
KarthikSubbarao Jul 2, 2024
ec72262
Migrate to valkeymodule-rs
KarthikSubbarao Jul 2, 2024
2816474
Implement free_effort datatype callback, Make aux_load cb ignore invo…
KarthikSubbarao Jul 16, 2024
3c285d5
Defrag Bloom Object callback
KarthikSubbarao Jul 17, 2024
ffb722f
Minor Refactoring + Handle Non Scaling Filters which are filled to ca…
KarthikSubbarao Aug 21, 2024
5b0df15
Replication support + Update Module/Datatype name + Refactor
KarthikSubbarao Aug 23, 2024
c66243d
Update data type name and use static str for errors
KarthikSubbarao Aug 23, 2024
0d2ad8a
Support keyspace notifications for write operations
KarthikSubbarao Aug 26, 2024
85e20fd
Add Python testing framework to support Integration testing of the bl…
KarthikSubbarao Sep 10, 2024
0b9a90d
Types, Ranges, limit updates and overflow handling
KarthikSubbarao Sep 13, 2024
a180f42
Add unit testing support using the valkeymodule-rs enable-system-allo…
KarthikSubbarao Sep 16, 2024
736a48a
Add unit testing for scaling & non scaling filters for behavior and f…
KarthikSubbarao Sep 17, 2024
6d0a4b6
Merge pull request #3 from KarthikSubbarao/unstable
KarthikSubbarao Sep 18, 2024
dee8a7e
Adding github workflow for building, running format checks, unit test…
zackcam Sep 19, 2024
145f9b2
Merge pull request #5 from zackcam/unstable
KarthikSubbarao Sep 19, 2024
282599d
RDB format optimization: Using a fixed seed for bloom filters (#2)
YueTang-Vanessa Sep 19, 2024
687bec7
Update build.sh and fix import in save/restore pytest
KarthikSubbarao Sep 19, 2024
f299216
Adding replication ability to valkey test case, changing waiter funct…
zackcam Oct 2, 2024
4d38649
Integration tests for commands, delete operations, expiry, compatibil…
YueTang-Vanessa Oct 9, 2024
fde4d37
Updating _ to -in lib.rs. Also updating loading from rdb to use a tra…
zackcam Oct 10, 2024
f5aa39e
Add Integration Testing for correctness of scaling and non scaling fi…
KarthikSubbarao Oct 15, 2024
57a7901
Add integration tests for maxmemory scenarios and replication correct…
KarthikSubbarao Oct 17, 2024
12031b2
Handle bloom object max allowed size limit, switch fp rate to f64 (#18)
KarthikSubbarao Oct 29, 2024
31d21c0
Adding info handler, that contains three fields, num_objects, num_fil…
zackcam Nov 5, 2024
b301baa
add bf.load to support bloom filter aofrewrite. (#17)
wuranxx Nov 15, 2024
a33e0e3
Add new metrics to show capacity and items across objects (#20)
YueTang-Vanessa Nov 19, 2024
7c25468
Support for DEBUG DIGEST module data type callback (#21)
nnmehta Nov 30, 2024
34d6fb0
Updating how we create BloomFilter from rdb loads and upgrading bloom…
zackcam Dec 4, 2024
103 changes: 103 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,103 @@
name: ci
Review comment (Member):
If I am a new developer looking at this file, how do I understand the intent of this file without documentation? Are these configurations obvious? What documentation did you read to write this file?


on:
  push:
  pull_request:

env:
  CARGO_TERM_COLOR: always
  VALKEY_REPO_URL: https://github.com/valkey-io/valkey.git

jobs:
  build-ubuntu-latest:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        server_version: ['unstable', '8.0.0']
    steps:
      - uses: actions/checkout@v4
      - name: Set the server version for python integration tests
        run: echo "SERVER_VERSION=${{ matrix.server_version }}" >> $GITHUB_ENV
      - name: Run cargo and clippy format check
        run: |
          cargo fmt --check
          cargo clippy --profile release --all-targets -- -D clippy::all
      - name: Release Build
        run: cargo build --all --all-targets --release
      - name: Run unit tests
        run: cargo test --features enable-system-alloc
      - name: Make valkey-server binary
        run: |
          mkdir -p "tests/.build/binaries/${{ matrix.server_version }}"
          cd tests/.build
          git clone "${{ env.VALKEY_REPO_URL }}"
          cd valkey
          git checkout ${{ matrix.server_version }}
          make
          cp src/valkey-server ../binaries/${{ matrix.server_version }}/
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.8'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Update module path
        run: echo "MODULE_PATH=$(realpath target/release/libvalkey_bloom.so)" >> $GITHUB_ENV
      - name: Run integration tests
        run: python -m pytest --cache-clear -v "tests/"

  build-macos-latest:
    runs-on: macos-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run cargo and clippy format check
        run: |
          cargo fmt --check
          cargo clippy --profile release --all-targets -- -D clippy::all
      - name: Release Build
        run: cargo build --all --all-targets --release
      - name: Run unit tests
        run: cargo test --features enable-system-alloc

  asan-build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        server_version: ['unstable', '8.0.0']
    steps:
      - uses: actions/checkout@v4
      - name: Set the server version for python integration tests
        run: echo "SERVER_VERSION=${{ matrix.server_version }}" >> $GITHUB_ENV
      - name: Run cargo and clippy format check
        run: |
          cargo fmt --check
          cargo clippy --profile release --all-targets -- -D clippy::all
      - name: Release Build
        run: cargo build --all --all-targets --release
      - name: Run unit tests
        run: cargo test --features enable-system-alloc
      - name: Make valkey-server binary with asan
        run: |
          mkdir -p "tests/.build/binaries/${{ matrix.server_version }}"
          cd tests/.build
          git clone "${{ env.VALKEY_REPO_URL }}"
          cd valkey
          git checkout ${{ matrix.server_version }}
          make SANITIZER=address SERVER_CFLAGS='-Werror' BUILD_TLS=module
          cp src/valkey-server ../binaries/${{ matrix.server_version }}/
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.8'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Update module path
        run: echo "MODULE_PATH=$(realpath target/release/libvalkey_bloom.so)" >> $GITHUB_ENV
      - name: Run integration tests
        run: python -m pytest --cache-clear -v "tests/"
6 changes: 6 additions & 0 deletions .gitignore
@@ -0,0 +1,6 @@
Cargo.lock
target
tests/.build
__pycache__
test-data
.attach_pid*
33 changes: 33 additions & 0 deletions Cargo.toml
@@ -0,0 +1,33 @@
[package]
name = "valkey-bloom"
authors = ["Karthik Subbarao"]
version = "0.1.0"
edition = "2021"
license = "BSD-3-Clause"
repository = "https://github.com/valkey-io/valkey-bloom"
readme = "README.md"
description = "A bloom filter module for Valkey"
homepage = "https://github.com/valkey-io/valkey-bloom"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
valkey-module = "0.1.2"
bloomfilter = "1.0.13"
lazy_static = "1.4.0"
libc = "0.2"
Review comment (Member):
Why do we need libc?
Why do we not depend on the "valkey-module-macros" package (from the valkeymodule-rs GitHub repo)?


[dev-dependencies]
rand = "0.8"

[lib]
crate-type = ["cdylib"]
name = "valkey_bloom"

[profile.dev]
opt-level = 0
debug = 2
debug-assertions = true

[features]
enable-system-alloc = ["valkey-module/enable-system-alloc"]
Review comment (Member):
What is the intention with this configuration?

117 changes: 116 additions & 1 deletion README.md
@@ -1 +1,116 @@
# valkey-bloom
# valkey-bloom

Valkey-Bloom (BSD-3-Clause) is a Rust Valkey module that adds a native, space-efficient probabilistic data type (a Bloom filter) to Valkey. Users can create filters, add elements, run a “check” operation to test whether an element exists, let filters scale automatically, and save and load filters through RDB.
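
To make the add and “check” operations concrete, here is a minimal Bloom filter sketch in Python. It is only an illustration of the general technique, not how valkey-bloom is implemented (the module builds on the Rust bloomfilter crate). The class name `TinyBloom`, the SHA-256 based hashing, and the default sizes are assumptions chosen for this example.

```python
import hashlib


class TinyBloom:
    """Illustrative Bloom filter: a bit array plus k derived bit positions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def exists(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bloom = TinyBloom()
bloom.add("item")
print(bloom.exists("item"))  # an added item is always reported as present
```

The property the sketch demonstrates is that a Bloom filter never reports a false negative: once an item is added, every one of its bits is set, so the check always succeeds for it.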

Valkey-Bloom is built using bloomfilter::Bloom (https://crates.io/crates/bloomfilter, which is licensed under BSD-2-Clause).

It is compatible with the BloomFilter (BF.*) command APIs of the ReBloom Module from Redis Ltd.

The following commands are supported.
```
BF.EXISTS
BF.ADD
BF.MEXISTS
BF.MADD
BF.CARD
BF.RESERVE
BF.INFO
BF.INSERT
```

Build instructions for Linux.
```
curl https://sh.rustup.rs -sSf | sh
sudo yum install clang
git clone https://github.com/KarthikSubbarao/valkey-bloom.git
cd valkey-bloom
cargo build --all --all-targets --release
valkey-server --loadmodule ./target/release/libvalkey_bloom.so
```

The local development script (build.sh) runs format checks, builds the release target, and runs the unit and integration tests:
```
# Builds the valkey-server (unstable) for integration testing.
SERVER_VERSION=unstable ./build.sh
# Builds the valkey-server (8.0.0) for integration testing.
SERVER_VERSION=8.0.0 ./build.sh
```

Client Usage
```
<redacted> % ./valkey-cli
127.0.0.1:6379> module list
1) 1) "name"
2) "bloom"
3) "ver"
4) (integer) 1
5) "path"
6) "./target/release/libvalkey_bloom.so"
7) "args"
8) (empty array)
127.0.0.1:6379> bf.add key item
(integer) 1
127.0.0.1:6379> bf.exists key item
(integer) 1
127.0.0.1:6379> bf.exists key item2
(integer) 0
127.0.0.1:6379> bf.card key
(integer) 1
127.0.0.1:6379> bf.reserve key 0.01 10000
(error) ERR item exists
127.0.0.1:6379> bf.reserve key1 0.01 10000
OK
127.0.0.1:6379> bf.card key1
(integer) 0
127.0.0.1:6379> bf.add key1 item
(integer) 1
127.0.0.1:6379> bf.card key1
(integer) 1
```
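
The same command sequence can be issued from Python. The sketch below assumes a client object that exposes an `execute_command` method, as the `valkey` package listed in requirements.txt does. The helper name `bloom_roundtrip` and the keys of the returned dictionary are hypothetical and exist only for this example.

```python
def bloom_roundtrip(client, key: str, item: str) -> dict:
    """Send the BF.ADD / BF.EXISTS / BF.CARD sequence shown above to `client`.

    `client` is any object with an `execute_command(*args)` method
    (an assumption of this sketch); raw replies are returned unmodified.
    """
    added = client.execute_command("BF.ADD", key, item)
    present = client.execute_command("BF.EXISTS", key, item)
    # A different item is usually reported as absent, but a Bloom filter
    # may return a false positive, so no particular reply is assumed here.
    other = client.execute_command("BF.EXISTS", key, item + "2")
    card = client.execute_command("BF.CARD", key)
    return {"added": added, "present": present, "other": other, "card": card}
```

Against a running server with the module loaded, a call such as `bloom_roundtrip(valkey.Valkey(host="127.0.0.1", port=6379), "key", "item")` should mirror the CLI session above. This usage is a sketch and has not been verified against a live server here.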

```
127.0.0.1:6379> bf.reserve key1 0.01 10000
OK
127.0.0.1:6379> bf.info key3
(empty array)
127.0.0.1:6379> bf.info key1
1) Capacity
2) (integer) 10000
3) Size
4) (integer) 12198
5) Number of filters
6) (integer) 1
7) Number of items inserted
8) (integer) 0
9) Expansion rate
10) (integer) 2
```
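
The reported Size can be sanity-checked against the textbook Bloom filter sizing formula, m = -n ln(p) / (ln 2)^2 bits for capacity n and false positive rate p. This is a sketch of the standard formula only. How valkey-bloom actually computes Size, and what per-object overhead it includes, is an assumption not confirmed by this README; `optimal_bloom_bits` is a hypothetical helper name.

```python
import math


def optimal_bloom_bits(capacity: int, fp_rate: float) -> int:
    """Textbook optimal bit count m = -n * ln(p) / (ln 2)^2, rounded up."""
    return math.ceil(-capacity * math.log(fp_rate) / (math.log(2) ** 2))


# Parameters from the example above: bf.reserve key1 0.01 10000
bits = optimal_bloom_bits(10000, 0.01)
size_bytes = math.ceil(bits / 8)
print(bits, size_bytes)  # bit count and the same size in whole bytes
```

For these parameters the formula gives a bitmap of roughly 12 KB, the same order of magnitude as the Size value reported by BF.INFO. The remaining difference is presumably bookkeeping overhead, which this sketch does not model.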

RDB Load, Save and flushall validation
```
127.0.0.1:6379> info keyspace
# Keyspace
127.0.0.1:6379> bf.add key item
(integer) 1
127.0.0.1:6379> info keyspace
# Keyspace
db0:keys=1,expires=0,avg_ttl=0
127.0.0.1:6379> flushall
OK
127.0.0.1:6379> info keyspace
# Keyspace
127.0.0.1:6379> bf.add key item
(integer) 1
127.0.0.1:6379> bgsave
Background saving started
127.0.0.1:6379> shutdown
not connected> info keyspace // Started up
# Keyspace
db0:keys=1,expires=0,avg_ttl=0
127.0.0.1:6379> keys *
1) "key"
127.0.0.1:6379> bf.exists key item
(integer) 1
```
69 changes: 69 additions & 0 deletions build.sh
@@ -0,0 +1,69 @@
#!/usr/bin/env sh

# Runs format checks on the valkey-bloom module, builds it to generate the .so file, and runs the unit and integration tests.

# Exit the script if any command fails
set -e

# Assumes the script is invoked from the repository root.
SCRIPT_DIR=$(pwd)
echo "Script Directory: $SCRIPT_DIR"

echo "Running cargo and clippy format checks..."
cargo fmt --check
cargo clippy --profile release --all-targets -- -D clippy::all

echo "Running cargo build release..."
cargo build --all --all-targets --release

echo "Running unit tests..."
cargo test --features enable-system-alloc

# Default SERVER_VERSION to unstable when the environment variable is not set
if [ -z "$SERVER_VERSION" ]; then
    echo "WARNING: SERVER_VERSION environment variable is not set. Defaulting to unstable."
    export SERVER_VERSION="unstable"
fi

if [ "$SERVER_VERSION" != "unstable" ] && [ "$SERVER_VERSION" != "8.0.0" ] ; then
    echo "ERROR: Unsupported version - $SERVER_VERSION"
    exit 1
fi

REPO_URL="https://github.com/valkey-io/valkey.git"
BINARY_PATH="tests/.build/binaries/$SERVER_VERSION/valkey-server"

if [ -f "$BINARY_PATH" ] && [ -x "$BINARY_PATH" ]; then
    echo "valkey-server binary '$BINARY_PATH' found."
else
    echo "valkey-server binary '$BINARY_PATH' not found."
    mkdir -p "tests/.build/binaries/$SERVER_VERSION"
    cd tests/.build
    rm -rf valkey
    git clone "$REPO_URL"
    cd valkey
    git checkout "$SERVER_VERSION"
    make
    cp src/valkey-server "../binaries/$SERVER_VERSION/"
    cd "$SCRIPT_DIR"
fi

REQUIREMENTS_FILE="requirements.txt"

# Check if pip is available
if command -v pip > /dev/null 2>&1; then
    echo "Using pip to install packages..."
    pip install -r "$SCRIPT_DIR/$REQUIREMENTS_FILE"
# Check if pip3 is available
elif command -v pip3 > /dev/null 2>&1; then
    echo "Using pip3 to install packages..."
    pip3 install -r "$SCRIPT_DIR/$REQUIREMENTS_FILE"
else
    echo "Error: Neither pip nor pip3 is available. Please install a Python package installer."
    exit 1
fi

export MODULE_PATH="$SCRIPT_DIR/target/release/libvalkey_bloom.so"

echo "Running the integration tests..."
python3 -m pytest --cache-clear -v "$SCRIPT_DIR/tests/"

echo "Build and Integration Tests succeeded"
2 changes: 2 additions & 0 deletions requirements.txt
@@ -0,0 +1,2 @@
valkey
Review comment (Member):
What does this file do?

pytest==4