add check option to confirm the file is clean
solaoi committed Mar 26, 2022
1 parent 49a890a commit d72b0da
Showing 2 changed files with 48 additions and 4 deletions.
17 changes: 14 additions & 3 deletions README.md
@@ -13,6 +13,9 @@ colc [column] [file.csv|tsv|txt]
## Option

```
# check the column is valid
-c,--check
# show frequency table and histogram
-b,--binsize <number>
@@ -45,7 +48,7 @@ you can download a binary release
```sh
# Install with wget or curl
## set the latest version on releases.
-VERSION=v1.0.11
+VERSION=v1.0.12
## case you use wget
wget https://github.com/solaoi/colc/releases/download/$VERSION/colc_linux_amd64.tar.gz
## case you use curl
@@ -76,22 +79,30 @@ colc 2 some.csv

<img width="381" alt="スクリーンショット 2022-03-26 23 53 17" src="https://user-images.githubusercontent.com/46414076/160244923-bedc63d3-a516-473f-9cb8-c8c926884c10.png">

-Of course Binsize option works well:)
+Of course `-b,--binsize` works well:)

```
colc 2 some.csv -b 10
```

<img width="742" alt="スクリーンショット 2022-03-26 23 54 04" src="https://user-images.githubusercontent.com/46414076/160244950-d543ed29-4709-465d-8b7d-63be530cc29a.png">

-There are noises, then filter necessaries(>=1%)
+There are noises, then filter necessaries(>=1%) with `-f,--filter`

```
colc 2 some.csv -b 10 -f 1
```

<img width="742" alt="スクリーンショット 2022-03-26 23 54 44" src="https://user-images.githubusercontent.com/46414076/160244980-3e938d6a-766b-4f63-865c-8e5caef18739.png">

If you want to check in advance whether the file is valid,

`-c,--check` reports whether the file is dirty or clean.

```
colc 2 some.csv -c
```

## Development

```
35 changes: 34 additions & 1 deletion colc.ts
@@ -7,7 +7,7 @@ import {
import { runner } from "./lib/common.ts";
import { parse } from "https://deno.land/[email protected]/flags/mod.ts";

-const { _, binsize, b, filter, f } = parse(Deno.args);
+const { _, binsize, b, filter, f, check, c } = parse(Deno.args);
const [column, filename] = _;
if (typeof column !== "number" || typeof filename !== "string") {
  console.log("Usage:\n colc [column] [file.csv|tsv|txt]");
@@ -23,13 +23,46 @@ const filterRank: number = (() => {
  if (typeof f === "number" && f > 0) return f;
  return 0;
})();
const hasCheck: boolean = (() => {
  if (typeof check === "boolean") return check;
  if (typeof c === "boolean") return c;
  return false;
})();

const isCsv = filename.endsWith(".csv");
const headerName = await runner.run(
  `head -n 1 ${filename}| cut -f${column} ${isCsv ? "-d, " : ""}| tr -d 0-9.-`,
);
const hasHeader = headerName !== "";

if (hasCheck) {
  const checkCommand = (() => {
    const bash = [];
    if (hasHeader) {
      bash.push(
        `tail -n +2 ${filename} | cut -f${column} ${isCsv ? "-d, " : ""}`,
      );
    } else {
      bash.push(
        `cut -f${column} ${isCsv ? "-d, " : ""}${filename} `,
      );
    }
    bash.push(
      "| awk '{clean=($1 ~/^[0-9.-]+$/);if(!clean)exit}END{print clean}'",
    );
    return bash.join(" ");
  })();
  const result = await runner
    .run(checkCommand);
  const hasError = result === "0";
  if (hasError) {
    console.log("dirty:<");
    Deno.exit(1);
  }
  console.log("clean:D");
  Deno.exit(0);
}

if (binSize === null) {
  const statsCommand = (() => {
    const bash = [];

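For readers who want the semantics of the new `-c,--check` path without running the shell pipeline, the awk filter added in colc.ts can be approximated in plain TypeScript. This is only a sketch and not code from the commit: `isColumnClean` and `firstField` are hypothetical names, and it assumes the column values have already been extracted (the `cut` step) into an array of lines.

```typescript
// Hypothetical helpers, not part of colc: they mirror the awk check
//   {clean=($1 ~/^[0-9.-]+$/);if(!clean)exit}END{print clean}
// on values that were already cut out of the file.

// Same character class as the awk pattern: digits, dots and minus signs only.
const NUMERIC_LIKE = /^[0-9.-]+$/;

// Approximates awk's default $1: skip leading blanks, then take everything
// up to the next run of spaces, tabs or newlines.
function firstField(line: string): string {
  return line.replace(/^[ \t\n]+/, "").split(/[ \t\n]+/)[0] ?? "";
}

// True when every value passes the pattern. An empty input is also true,
// matching the commit's `result === "0"` test, which only flags an explicit 0.
function isColumnClean(values: string[]): boolean {
  return values.every((value) => NUMERIC_LIKE.test(firstField(value)));
}

console.log(isColumnClean(["1", "-2.5", "30"]) ? "clean:D" : "dirty:<");
console.log(isColumnClean(["1", "n/a", "30"]) ? "clean:D" : "dirty:<");
```

The pattern is a character-class test rather than a number parser, so values such as `1.2.3` or `--` count as clean while `1e5` counts as dirty. Because awk tests `$1`, a value like `12 abc` also passes, since only its first whitespace-separated field is examined.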