Autoformat #4

Merged
merged 1 commit into from
Feb 15, 2024
15 changes: 15 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,15 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
    -   id: mixed-line-ending
    -   id: check-added-large-files
-   repo: https://github.com/doublify/pre-commit-rust
    rev: v1.0
    hooks:
    -   id: fmt
        args: ['--check', '--quiet', '--']
12 changes: 11 additions & 1 deletion README.md
@@ -53,7 +53,7 @@ The chosen output CSV format aims to ease the import of the data by 3rd party so

# Why This Tool

Most stock data providers offer derived data rather than the official values from the stock market. For example, if we aim to analyse the trend for **AENA**, we usually get data from CFDs rather than from the regular stock market. Data coming from CFDs sometimes doesn't fully match the official data provided by the exchange. I've found this to happen quite often with volumes: the price difference between derived data and regular stock data is usually very small, but volumes sometimes differ considerably, and it is a struggle to define strategies using wrong volume data. Also, the granularity of the collected data depends only on you. It's difficult to find data sources that allow downloading stock data for the Spanish market with a time interval shorter than 1 day.

The tool is designed to parse the official stock data coming from [BME's](https://www.bolsasymercados.es/bme-exchange/es/Mercados-y-Cotizaciones/Acciones/Mercado-Continuo/Precios/ibex-35-ES0SI0000005) web page. Though delayed, this page offers the most accurate stock data for all the components of the **Ibex 35** index. And for testing algorithms or custom indicators, I don't need real-time but accurate data.

@@ -71,3 +71,13 @@ Data collection could be automated using some piece of code that connects to the
# Output File Format

As of today, the output format is fixed. Each parsed entry is written to the console in CSV format, i.e. each value is separated from the next using the character ";". Decimals are marked using "," and thousands with ".". Prices are in €. There's no logic that orders the input data, so the output is shown in the same order as it was parsed. This makes it important to name the input files with indexes that reflect the order in which you expect them to be processed.
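
As an illustration of consuming this format, here is a minimal sketch that splits one output line on ";" and converts a number written with "." for thousands and "," for decimals. The helper names `split_fields` and `parse_european_number` are hypothetical and not part of this tool, and the sample values in the usage note are made up rather than real output.

```rust
/// Split one output line into its `;`-separated fields.
/// Hypothetical helper: it is not part of the tool itself.
fn split_fields(line: &str) -> Vec<&str> {
    line.trim_end().split(';').collect()
}

/// Parse a number that uses "." as thousands separator and "," as decimal mark.
/// Returns `None` when the cleaned-up text is not a valid number.
fn parse_european_number(raw: &str) -> Option<f64> {
    let normalized: String = raw
        .trim()
        .chars()
        // Drop the thousands separators.
        .filter(|c| *c != '.')
        // Turn the decimal comma into the dot that `f64::from_str` expects.
        .map(|c| if c == ',' { '.' } else { c })
        .collect();
    normalized.parse::<f64>().ok()
}
```

For example, `parse_european_number("1.234,56")` yields `Some(1234.56)`, and `split_fields("AENA;170,35")` yields the two fields `"AENA"` and `"170,35"`.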

# Development

This repository follows these development rules/policies:
- Use this commit hook: [pre-commit](https://gist.github.com/felipet/c1455d3eae316d0c077e8e2b7385e5fc) to check the content of a git commit before it's actually committed.
- Rust source code shall be autoformatted using **[rustfmt](https://github.com/rust-lang/rustfmt)**. The previous rule checks this before you are allowed to commit source code.

## Maintainers

- Felipe Torres González([email protected])
18 changes: 8 additions & 10 deletions src/lib.rs
@@ -2,10 +2,7 @@

pub mod parser_ibex;

use std::path::{
Path,
PathBuf
};
use std::path::{Path, PathBuf};

/// Discover files that contain raw data for the stock prices of the Ibex 35.
///
@@ -101,11 +98,12 @@ pub fn discover(path: &Path, filter: Option<&str>, format: Option<&str>) -> Vec<
"_"
};

if extension == file_format &&
filter == cur_file.file_stem().unwrap().to_str().unwrap()[..filter.len()] {
files.push(
String::from(cur_file.file_name().unwrap().to_str().unwrap())
);
if extension == file_format
&& filter == cur_file.file_stem().unwrap().to_str().unwrap()[..filter.len()]
{
files.push(String::from(
cur_file.file_name().unwrap().to_str().unwrap(),
));
} else {
continue;
}
@@ -114,4 +112,4 @@ pub fn discover(path: &Path, filter: Option<&str>, format: Option<&str>) -> Vec<
}

files
}
}
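
A side note on the prefix comparison in `discover` above: slicing the file stem with `[..filter.len()]` can panic when the stem is shorter than the filter, or when the cut falls inside a multi-byte character. The sketch below shows one possible alternative based on `str::starts_with`. The function `matches_filter` is hypothetical and is not part of this change, which only reformats the existing code.

```rust
use std::path::Path;

/// Return true when `path` has the extension `ext` and its file stem starts
/// with `prefix`. Hypothetical helper, not taken from the repository.
fn matches_filter(path: &Path, prefix: &str, ext: &str) -> bool {
    // `starts_with` returns false for a stem shorter than the prefix
    // instead of panicking like an out-of-range slice would.
    let stem_ok = path
        .file_stem()
        .and_then(|s| s.to_str())
        .map_or(false, |s| s.starts_with(prefix));
    // Names without an extension, or with a non UTF-8 one, never match.
    let ext_ok = path
        .extension()
        .and_then(|e| e.to_str())
        .map_or(false, |e| e == ext);
    stem_ok && ext_ok
}
```

With this shape, a file such as `d.txt` checked against the prefix `data` is simply rejected.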
17 changes: 10 additions & 7 deletions src/main.rs
@@ -1,24 +1,27 @@
// Copyright 2024 Felipe Torres González

use clap::Parser;
use ibex_parser::discover;
use ibex_parser::parser_ibex::IbexParser;
use std::path::Path;
use clap::Parser;

// The minimum size of a text file that might contain stock data. Files smaller than this are omitted.
const MIN_BYTES_X_FILE: u64 = 560;

#[derive(Parser, Debug)]
#[command(name = "IbexParser")]
#[command(version = "0.1.0")]
#[command(about = "Parser for Ibex35 stock data", long_about = r#"
#[command(
about = "Parser for Ibex35 stock data",
long_about = r#"
Ibex35 Data Parsing Tool: This tool parses stock prices and other data as is offered by BME's web.
Data is parsed from raw text files and output in CSV format for a later import into some analysis tool
or graph tool.

Raw text files shall keep the same data organization as BME's web does. For example, select all the content
of the page and paste it into a text file. That file is ready to be used by this parser.
"#)]
"#
)]
struct Args {
/// Directory to search for text data files.
path: String,
@@ -39,22 +42,22 @@ fn main() {
let parser = IbexParser::new();

for file in files {
let file_string = format!("{}/{}",&args.path,file.as_str());
let file_string = format!("{}/{}", &args.path, file.as_str());
let path = Path::new(&file_string);

// Avoid passing empty files to the parser.
if path.metadata().unwrap().len() < MIN_BYTES_X_FILE {
continue;
}
let data = parser.filter_file(path , &filter);
let data = parser.filter_file(path, &filter);

match data {
Some(x) => {
for line in x {
println!("{}", line);
}
},
None => println!("File {file} doesn't contain valid data.")
}
None => println!("File {file} doesn't contain valid data."),
}
}
}
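
The loop in `main` above builds each path with `format!("{}/{}", ...)` and calls `unwrap()` on the file metadata. As a sketch of an alternative, and not something this pull request does, the path can be built with `Path::join` and a failed metadata lookup can be treated as "skip this file". The helper names `full_path` and `is_large_enough` are hypothetical. The constant is copied from `src/main.rs`.

```rust
use std::path::{Path, PathBuf};

/// Same threshold as in `src/main.rs`.
const MIN_BYTES_X_FILE: u64 = 560;

/// Build the path of `file` inside `dir` with the platform's separator.
/// Hypothetical helper, not taken from the repository.
fn full_path(dir: &str, file: &str) -> PathBuf {
    Path::new(dir).join(file)
}

/// True when the metadata can be read and the file holds at least
/// `MIN_BYTES_X_FILE` bytes. A missing or unreadable file yields false
/// instead of a panic.
fn is_large_enough(path: &Path) -> bool {
    path.metadata()
        .map_or(false, |m| m.len() >= MIN_BYTES_X_FILE)
}
```

In the loop, `if !is_large_enough(&path) { continue; }` would then replace the `unwrap()`-based size check.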
24 changes: 11 additions & 13 deletions src/parser_ibex.rs
@@ -1,7 +1,7 @@
// Copyright 2024 Felipe Torres González

use std::path::Path;
use std::fs::read_to_string;
use std::path::Path;

/// How many stock prices are included in a raw text file.
const N_STOCKS_IN_RAW_FILE: usize = 36;
@@ -77,8 +77,8 @@ impl IbexParser {
skip_n_lines_beg: 11,
ibex_line: 6,
skip_n_lines_end: 5,
cols_to_keep_main: vec![0,5,6,1],
cols_to_keep_stock: vec![0,7,8,1,5,6],
cols_to_keep_main: vec![0, 5, 6, 1],
cols_to_keep_stock: vec![0, 7, 8, 1, 5, 6],
}
}

@@ -121,7 +121,7 @@ impl IbexParser {
idxl: usize,
endl: usize,
colsidx: Vec<usize>,
colsstock: Vec<usize>
colsstock: Vec<usize>,
) -> IbexParser {
IbexParser {
skip_n_lines_beg: inil,
@@ -189,7 +189,6 @@ impl IbexParser {
if lines.len() < N_LINES_PER_RAW_FILE {
None
} else {

for line in lines {
if counter == self.ibex_line {
counter += 1;
@@ -285,8 +284,8 @@ impl IbexParser {
#[cfg(test)]
mod tests {
use super::*;
use rstest::*;
use pretty_assertions::assert_eq;
use rstest::*;
use std::path::Path;

#[fixture]
@@ -347,10 +346,7 @@ mod tests {

#[rstest]
fn test_ibexparser_parse_customfile(valid_data: Box<&'static Path>) {
let parser = IbexParser::with_custom_values(
11, 6, 5,
vec![0,1], vec![0,1]
);
let parser = IbexParser::with_custom_values(11, 6, 5, vec![0, 1], vec![0, 1]);
let path = *valid_data;

let parsed_data = parser.parse_file(path).unwrap();
@@ -379,7 +375,10 @@ mod tests {
// `filter_file` with an empty filter yields the same result as `parse_file`.
filter = Vec::new();
parsed_data = parser.filter_file(path, &filter);
assert_eq!(parsed_data.unwrap().len(), N_STOCKS_IN_RAW_FILE-filter.len());
assert_eq!(
parsed_data.unwrap().len(),
N_STOCKS_IN_RAW_FILE - filter.len()
);
}

#[rstest]
@@ -391,5 +390,4 @@ mod tests {
let parsed_data = parser.filter_file(path, &filter);
assert_eq!(parsed_data, None);
}

}
}