Updated README and refactored crate into 3 subcrates
LucaCappelletti94 committed Apr 15, 2024
1 parent 5b10e62 commit 682a5e1
Showing 27 changed files with 445 additions and 279 deletions.
6 changes: 3 additions & 3 deletions Cargo.toml
@@ -10,7 +10,7 @@ keywords = ["sql", "minifier"]
categories = ["filesystem", "database"]
authors = ["Marco Visani"]

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
regex = "1.10.4"
minify_sql = { path = "minify_sql" }
load_sql_proc = { path = "load_sql_proc" }
minify_sql_proc = { path = "minify_sql_proc" }
82 changes: 46 additions & 36 deletions README.md
@@ -4,8 +4,8 @@
[![Documentation](https://docs.rs/sql_minifier/badge.svg)](https://docs.rs/sql_minifier)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

This crate provides a simple SQL minifier. It removes comments and unnecessary
whitespace from SQL files, and shortens the keywords that can be shortened.
This crate provides the methods and procedural macros to minify SQL code, optionally at compile time.
It removes comments and unnecessary whitespace, and shortens SQL keywords such as `INTEGER` to `INT`.

## Installation
Add the following to your `Cargo.toml` file:
@@ -19,36 +19,37 @@ or use the following command:
cargo add sql_minifier
```

## Usage
The crate provides two main functions:
- `minifiy_sql_to_string` which reads an SQL file and returns a `String` of the
minified SQL.
- `minifiy_sql_to_file` which reads an SQL file and writes the minified SQL
to a new file specified by the user.
## Examples
Suppose you have an SQL string and you want to minify it. You can use the `minify_sql` function:

Additionally, the crate provides a macro `minify_sql_files!` that can be used
to minify SQL files at compile time. The macro accepts file paths as input.
```rust
use sql_minifier::minify_sql;

It's important to note that the macro will write the minified SQL to a new file
with the same name as the input file, but with the suffix `_minified`.
Additionally, it will append the `_minified` suffix just before the last `.` in the
file name. For instance, if the input file is `test_data/test_file_1.sql`, the
minified file will be named `test_data/test_file_1_minified.sql`.
let minified: String = minify_sql(
"-- Your SQL goes here
CREATE TABLE IF NOT EXISTS taxa (
-- The unique identifier for the taxon
id UUID PRIMARY KEY,
-- The scientific name of the taxon
name TEXT NOT NULL,
-- The NCBI Taxon ID is a unique identifier for a taxon in the NCBI Taxonomy database
-- which may be NULL when this taxon is not present in the NCBI Taxonomy database.
ncbi_taxon_id INTEGER
);"
);

The macro can be utilized as follows:
```rust
use sql_minifier::prelude::*;
minify_sql_files!(
"test_data/test_file_1.sql",
"test_data/test_file_2.sql",
"test_data/test_file_3.sql"
);
assert_eq!(
minified,
"CREATE TABLE IF NOT EXISTS taxa(id UUID PRIMARY KEY,name TEXT NOT NULL,ncbi_taxon_id INT)"
);
```

## Example
The following SQL file:
```sql
-- Your SQL goes here
If you want this to be done at compile time, you can use the `minify_sql` macro:
```rust
use sql_minifier::macros::minify_sql;

const SQL_CONTENT: &str = minify_sql!(
"-- Your SQL goes here
CREATE TABLE IF NOT EXISTS taxa (
-- The unique identifier for the taxon
id UUID PRIMARY KEY,
@@ -57,15 +58,16 @@ CREATE TABLE IF NOT EXISTS taxa (
-- The NCBI Taxon ID is a unique identifier for a taxon in the NCBI Taxonomy database
-- which may be NULL when this taxon is not present in the NCBI Taxonomy database.
ncbi_taxon_id INTEGER
);"
);
```

will be minified to:
```sql
CREATE TABLE IF NOT EXISTS taxa ( id UUID PRIMARY KEY, name TEXT NOT NULL, ncbi_taxon_id INT);
assert_eq!(
SQL_CONTENT,
"CREATE TABLE IF NOT EXISTS taxa(id UUID PRIMARY KEY,name TEXT NOT NULL,ncbi_taxon_id INT)"
);
```

A more complex SQL file:
A more complex [SQL file](tests/test_file_3.sql) such as:
```sql
-- SQL defining the container_horizontal_rules table.
-- The container horizontal rules define whether an item type can be placed next to another item type.
@@ -111,7 +113,15 @@ CREATE TABLE container_horizontal_rules (
/* and other multiline comment */
```

will be minified to:
```sql
CREATE TABLE container_horizontal_rules ( id UUID PRIMARY KEY REFERENCES describables(id) ON DELETE CASCADE, item_type_id UUID REFERENCES item_categories(id) ON DELETE CASCADE, other_item_type_id UUID REFERENCES item_categories(id) ON DELETE CASCADE, minimum_temperature FLOAT DEFAULT NULL, maximum_temperature FLOAT DEFAULT NULL, minimum_humidity FLOAT DEFAULT NULL, maximum_humidity FLOAT DEFAULT NULL, minimum_pressure FLOAT DEFAULT NULL, maximum_pressure FLOAT DEFAULT NULL, CHECK ( minimum_temperature IS NULL OR maximum_temperature IS NULL OR minimum_temperature <= maximum_temperature ), CHECK ( minimum_humidity IS NULL OR maximum_humidity IS NULL OR minimum_humidity <= maximum_humidity ), CHECK ( minimum_pressure IS NULL OR maximum_pressure IS NULL OR minimum_pressure <= maximum_pressure ));
```
We can load it and minify it at compile time using the `load_sql` macro:
```rust
use sql_minifier::macros::load_sql;

const SQL_CONTENT: &str = load_sql!("tests/test_file_3.sql");

assert_eq!(
SQL_CONTENT,
"CREATE TABLE container_horizontal_rules(id UUID PRIMARY KEY REFERENCES describables(id)ON DELETE CASCADE,item_type_id UUID REFERENCES item_categories(id)ON DELETE CASCADE,other_item_type_id UUID REFERENCES item_categories(id)ON DELETE CASCADE,minimum_temperature FLOAT DEFAULT NULL,maximum_temperature FLOAT DEFAULT NULL,minimum_humidity FLOAT DEFAULT NULL,maximum_humidity FLOAT DEFAULT NULL,minimum_pressure FLOAT DEFAULT NULL,maximum_pressure FLOAT DEFAULT NULL,CHECK(minimum_temperature IS NULL OR maximum_temperature IS NULL OR minimum_temperature<=maximum_temperature),CHECK(minimum_humidity IS NULL OR maximum_humidity IS NULL OR minimum_humidity<=maximum_humidity),CHECK(minimum_pressure IS NULL OR maximum_pressure IS NULL OR minimum_pressure<=maximum_pressure))"
);
```

13 changes: 13 additions & 0 deletions load_sql_proc/Cargo.toml
@@ -0,0 +1,13 @@
[package]
name = "load_sql_proc"
version = "0.1.0"
edition = "2021"

[lib]
proc-macro = true

[dependencies]
quote = "1.0.36"
regex = "1.10.4"
syn = "2.0.59"
minify_sql = { path = "../minify_sql" }
43 changes: 43 additions & 0 deletions load_sql_proc/src/lib.rs
@@ -0,0 +1,43 @@
//! This crate provides a procedural macro to minify SQL queries at compile time.
#![deny(missing_docs)]
extern crate proc_macro;

use proc_macro::TokenStream;
use quote::quote;

/// This macro will load and minify the provided SQL document at compile time
///
/// # Arguments
/// * `path` - A string slice that holds the path to the SQL file
///
/// # Examples
///
/// ```rust
/// use load_sql_proc::load_sql;
///
/// const SQL_CONTENT: &str = load_sql!("tests/test_file_1.sql");
///
/// assert_eq!(
/// SQL_CONTENT,
/// "CREATE TABLE IF NOT EXISTS taxa(id UUID PRIMARY KEY,name TEXT NOT NULL,ncbi_taxon_id INT)"
/// );
/// ```
#[proc_macro]
pub fn load_sql(input: TokenStream) -> TokenStream {
// Parse the input token stream
let path = syn::parse_macro_input!(input as syn::LitStr).value();

// We prepend CARGO_MANIFEST_DIR to the path, as the path is relative to the crate root
let path = format!("{}/{}", std::env::var("CARGO_MANIFEST_DIR").unwrap(), path);

// Read the content of the file
let document = std::fs::read_to_string(path).expect("Could not read SQL file to minify");

// Minify the SQL content
let minified_document: String = minify_sql::minify_sql(&document);

// Return the minified SQL content
TokenStream::from(quote! {
#minified_document
})
}
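
The path handling in `load_sql` concatenates the manifest directory and the relative path with `format!`. A minimal sketch of the same step using `std::path::Path::join` follows; the helper name `resolve_sql_path` is hypothetical and not part of this commit.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not in this commit): resolves `relative` against
/// the crate root given by `manifest_dir`, which the macro would obtain
/// from the `CARGO_MANIFEST_DIR` environment variable.
fn resolve_sql_path(manifest_dir: &str, relative: &str) -> PathBuf {
    // `join` inserts the platform separator; if `relative` is already an
    // absolute path, `join` returns it unchanged instead of concatenating.
    Path::new(manifest_dir).join(relative)
}
```

With this shape, `resolve_sql_path("/project", "tests/test_file_1.sql")` yields `/project/tests/test_file_1.sql`, and an absolute argument is passed through rather than being glued onto the manifest directory.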
8 changes: 8 additions & 0 deletions minify_sql/Cargo.toml
@@ -0,0 +1,8 @@
[package]
name = "minify_sql"
version = "0.1.0"
edition = "2021"

[dependencies]
quote = "1.0.36"
regex = "1.10.4"
161 changes: 161 additions & 0 deletions minify_sql/src/lib.rs
@@ -0,0 +1,161 @@
//! Crate providing a function to minify SQL content.
#![deny(missing_docs)]

use regex::Regex;

/// Returns the provided SQL content minified.
///
/// # Arguments
/// * `document` - A string slice that holds the SQL content to minify
///
/// # Examples
///
/// ```rust
/// use minify_sql::minify_sql;
///
/// let minified: String = minify_sql(
/// "-- Your SQL goes here
/// CREATE TABLE IF NOT EXISTS taxa (
/// -- The unique identifier for the taxon
/// id UUID PRIMARY KEY,
/// -- The scientific name of the taxon
/// name TEXT NOT NULL,
/// -- The NCBI Taxon ID is a unique identifier for a taxon in the NCBI Taxonomy database
/// -- which may be NULL when this taxon is not present in the NCBI Taxonomy database.
/// ncbi_taxon_id INTEGER
/// );
/// ",
/// );
///
/// assert_eq!(
/// minified,
/// "CREATE TABLE IF NOT EXISTS taxa(id UUID PRIMARY KEY,name TEXT NOT NULL,ncbi_taxon_id INT)"
/// );
/// ```
pub fn minify_sql(document: &str) -> String {
// First, we remove all multiline comments from the document.
// We need to do this first, as multiline comments can span multiple lines
// and if we removed the line breaks first, we might accidentally create new
// combinations of characters that look like an opening or closing comment.
let document_without_multiline_comments = remove_multiline_comments(document);

// We remove in all lines of the file the single line comments
let mut document_without_comments =
remove_single_line_comments(&document_without_multiline_comments);

// We apply the minifications relative to the SQL types, such as replacing
// "INTEGER" by "INT", "BOOLEAN" by "BOOL", "CHARACTER" by "CHAR", and
// "DECIMAL" by "DEC", while handling the case where table names or column
// names contain these words.

for (long, short) in LONG_FORMAT_TYPES {
let re = Regex::new(&format!(r"\b{}\b", long)).unwrap();
document_without_comments = re
.replace_all(&document_without_comments, short)
.to_string();
}

// Remove all excess whitespace: any run of more than one whitespace
// character is replaced by a single space
let mut output = document_without_comments
.split_whitespace()
.collect::<Vec<&str>>()
.join(" ");

// Remove all whitespace before and after commas, semicolons, and parentheses (either
// opening or closing), as well as before or after operators
for symbols in [
",", ";", "(", ")", ">", "<", ">=", "<=", "!=", "<>", "=", "+", "-", "*", "/",
] {
output = output.replace(&format!(" {}", symbols), symbols);
output = output.replace(&format!("{} ", symbols), symbols);
}

// If the last character is a semi-colon, remove it, as it is not needed when executing
// the SQL statement. It would be solely needed when executing multiple statements in a row
if output.ends_with(';') {
output.pop();
}

output
}
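
The two whitespace steps at the end of `minify_sql` can be looked at in isolation. The sketch below restates them as two standalone functions; the names `collapse_whitespace` and `tighten_symbols` are hypothetical and only serve the illustration.

```rust
/// Hypothetical helper: collapses every run of whitespace (spaces, tabs,
/// line breaks) into a single space and trims both ends.
fn collapse_whitespace(sql: &str) -> String {
    sql.split_whitespace().collect::<Vec<&str>>().join(" ")
}

/// Hypothetical helper: removes the single space before and after each
/// symbol. It assumes the input went through `collapse_whitespace` first,
/// so there is at most one space on either side.
fn tighten_symbols(sql: &str, symbols: &[&str]) -> String {
    let mut output = sql.to_string();
    for &symbol in symbols {
        output = output.replace(&format!(" {symbol}"), symbol);
        output = output.replace(&format!("{symbol} "), symbol);
    }
    output
}
```

For example, `tighten_symbols(&collapse_whitespace("a  <=  b ,\n c ( d )"), &[",", "(", ")", "<", "<="])` gives `a<=b,c(d)`. Since this is plain text replacement, the content of string literals is rewritten as well: `SELECT 'a , b'` becomes `SELECT 'a,b'`, which may matter for SQL that embeds formatted text.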

/// List of long format data types and their corresponding short format
const LONG_FORMAT_TYPES: [(&str, &str); 4] = [
("INTEGER", "INT"),
("BOOLEAN", "BOOL"),
("CHARACTER", "CHAR"),
("DECIMAL", "DEC"),
];
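
The `\b` word boundaries in the regex are what keep identifiers such as `INTEGERS_COUNT` intact while the type `INTEGER` is shortened. As a rough illustration of that rule without the `regex` crate, here is a standard-library sketch; `is_word_char` and `replace_whole_word` are hypothetical names, and the sketch assumes `long` is a non-empty keyword made of word characters only.

```rust
/// Approximates the regex `\w` class: letters, digits and underscore.
fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Hypothetical stand-in for the `\bLONG\b` replacement: swaps `long` for
/// `short` only where the match is not glued to another word character.
fn replace_whole_word(text: &str, long: &str, short: &str) -> String {
    if long.is_empty() {
        return text.to_string();
    }
    let mut output = String::with_capacity(text.len());
    let mut rest = text;
    // Character immediately before `rest` in the original text, if any.
    let mut previous: Option<char> = None;

    while let Some(position) = rest.find(long) {
        let before = &rest[..position];
        let after = &rest[position + long.len()..];
        let left = before.chars().last().or(previous);
        let left_ok = left.map_or(true, |c| !is_word_char(c));
        let right_ok = after.chars().next().map_or(true, |c| !is_word_char(c));

        output.push_str(before);
        output.push_str(if left_ok && right_ok { short } else { long });
        previous = long.chars().last();
        rest = after;
    }
    output.push_str(rest);
    output
}
```

Calling `replace_whole_word("id INTEGER, INTEGERS_COUNT INTEGER", "INTEGER", "INT")` returns `id INT, INTEGERS_COUNT INT`, and `my_INTEGER` is left alone because the underscore counts as a word character.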

/// Remove all multiline comments from the SQL content
///
/// # Arguments
/// * `sql_content` - A string slice that holds the content of the SQL file
fn remove_multiline_comments(sql_content: &str) -> String {
// A multiline comment is a classical example of balanced parentheses.
// We can use this to our advantage to remove them from the SQL content,
// where the delimiters in question are /* and */. Each delimiter is two characters
// and not one, so we need to keep track of the last two characters we've seen
// to determine whether we're in a comment or not.
let mut output = String::new();

let mut last_char = char::default();
let mut number_of_open_comments: u32 = 0;

for mut c in sql_content.chars() {
if number_of_open_comments > 0 && last_char == '*' && c == '/' {
// We're closing a comment
number_of_open_comments -= 1;
c = char::default();
} else if last_char == '/' && c == '*' {
// We're opening a comment
number_of_open_comments += 1;
c = char::default();
} else if number_of_open_comments == 0 {
// Maybe we are not in a comment
if c != '/' {
// We're not in a comment
if last_char == '/' {
output.push('/');
}
output.push(c);
}
}
last_char = c;
}

output
}
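
The same depth-counting idea can also be written with a peekable iterator, which looks one character ahead instead of remembering the previous one. This is an alternative sketch, not the code from this commit, and `strip_block_comments` is a hypothetical name. Tracing the loop above by hand suggests it drops a `/` that ends the input and one of two consecutive slashes; the sketch below keeps both.

```rust
/// Hypothetical alternative: removes `/* ... */` comments, including
/// nested ones, by counting how many comments are currently open.
fn strip_block_comments(sql: &str) -> String {
    let mut output = String::with_capacity(sql.len());
    let mut depth: u32 = 0;
    let mut chars = sql.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '/' && chars.peek() == Some(&'*') {
            // Opening delimiter: consume the '*' and go one level deeper.
            chars.next();
            depth += 1;
        } else if depth > 0 && c == '*' && chars.peek() == Some(&'/') {
            // Closing delimiter: consume the '/' and go one level up.
            chars.next();
            depth -= 1;
        } else if depth == 0 {
            // Outside any comment, every character is kept as-is.
            output.push(c);
        }
    }
    output
}
```

For instance, `strip_block_comments("SELECT 1 /* a /* nested */ b */ / 2")` returns `SELECT 1  / 2`; the doubled space is collapsed later by the whitespace step. Like the original, the sketch does not look inside string literals, so a `/*` within quotes is still treated as a comment start.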

/// Remove all single line comments from the SQL content
///
/// # Arguments
/// * `document` - A string slice that holds the content of the SQL file
fn remove_single_line_comments(document: &str) -> String {
let mut output = String::new();

// Once we detect a single line comment, we can ignore the rest of the line
// and continue with the next line. In SQL, a single line comment is denoted
// by two dashes "--" and goes until the end of the line.
for line in document.lines() {
let mut last_char = char::default();

for c in line.chars() {
if last_char == '-' && c == '-' {
// We're starting a comment
output.pop();
break;
}

output.push(c);

last_char = c;
}

// Add a space to separate the lines
output.push(' ');
}

output
}
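
The loop above treats every `--` as the start of a comment, including one that appears inside a quoted string such as `'--not a comment'`. If that case matters, a quote-aware variant could look like the following sketch. The function name `strip_line_comment` is hypothetical, it works on one line at a time, and it assumes string literals use single quotes and do not span lines.

```rust
/// Hypothetical quote-aware variant: returns the part of `line` that
/// precedes a `--` comment, ignoring dashes inside single-quoted strings.
fn strip_line_comment(line: &str) -> &str {
    let mut in_string = false;
    let mut previous = '\0';

    for (index, c) in line.char_indices() {
        if c == '\'' {
            // An escaped quote ('') toggles twice, so the state stays correct.
            in_string = !in_string;
        } else if !in_string && previous == '-' && c == '-' {
            // `index` is the second dash; '-' is one byte, so the comment
            // starts at `index - 1`.
            return &line[..index - 1];
        }
        previous = c;
    }
    line
}
```

With this, `strip_line_comment("SELECT '--not a comment' -- real")` keeps `SELECT '--not a comment' ` and drops only the trailing comment, while a plain `id UUID, -- the id` is cut to `id UUID, ` as before.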
13 changes: 13 additions & 0 deletions minify_sql_proc/Cargo.toml
@@ -0,0 +1,13 @@
[package]
name = "minify_sql_proc"
version = "0.1.0"
edition = "2021"

[lib]
proc-macro = true

[dependencies]
quote = "1.0.36"
regex = "1.10.4"
syn = "2.0.59"
minify_sql = { path = "../minify_sql" }