update README and license
JasonGellis committed Mar 15, 2024
1 parent e3d9bd8 commit f6a49c3
Showing 3 changed files with 34 additions and 4 deletions.
5 changes: 4 additions & 1 deletion .vscode/settings.json
@@ -1,5 +1,8 @@
{
"cSpell.words": [
"imshow",
"pytesseract"
]
],

"python.pythonPath": "/Users/jasongellis/miniconda3/envs/table_reader/bin/python"
}
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

-Copyright (c) 2024 Jason Gellis
+Copyright (c) 2024 Jason Jacob Gellis

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
31 changes: 29 additions & 2 deletions README.md
@@ -1,2 +1,29 @@
-# table_reader
-table_reader
# Table Reader

Table Reader is a Python command-line interface (CLI) application designed to extract data values from tables in research publications and field notes. Leveraging image processing and optical character recognition (OCR) techniques, Table Reader can efficiently extract tabular data from images, enabling researchers to digitize and analyze information from various sources.

## Key Features

- Image Import: Table Reader allows users to import images containing tables from a specified directory.
- Optical character recognition (OCR) Processing: Utilizing the powerful Tesseract OCR engine, Table Reader accurately extracts text from images, including tables and tabular data.
- Data Extraction: The application processes extracted text to identify and extract tabular data, preserving the structure of tables found in the input images.

- Data Cleaning: Table Reader includes functionality to clean and pre-process extracted data, removing special characters and ensuring consistent formatting.

- Data Export: Once the data is extracted and cleaned, Table Reader enables users to export the data to a structured format, such as CSV files, for further analysis in statistical software or spreadsheet applications.
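
As an illustration only (this is not the project's code), the extraction, cleaning, and export steps listed above might look roughly like the sketch below. It assumes the OCR text has already been produced, for example by `pytesseract.image_to_string`, and that table columns are separated by tabs or runs of two or more spaces. The function names `clean_cell`, `text_to_rows`, and `rows_to_csv` are hypothetical and do not come from the repository.

```python
import csv
import io
import re


def clean_cell(cell):
    """Drop characters outside a small allow-list and collapse whitespace."""
    # Assumption: letters, digits, whitespace and . , % ( ) + - are worth keeping.
    cell = re.sub(r"[^\w\s.,%()+-]", "", cell)
    return re.sub(r"\s+", " ", cell).strip()


def text_to_rows(ocr_text):
    """Turn OCR text into rows: one row per non-empty line,
    one cell per tab or run of two or more whitespace characters."""
    rows = []
    for line in ocr_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between table rows
        cells = re.split(r"\t|\s{2,}", line.strip())
        rows.append([clean_cell(c) for c in cells])
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text (written to a string here for simplicity)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical OCR output with stray characters ("|" and "~").
sample = "Site   Depth (cm)   Count\nA1     10|          42\n\nB2     25           ~17\n"
rows = text_to_rows(sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Splitting on whitespace runs is a simplification: real OCR output often needs bounding-box information to recover columns reliably, so treat the column rule here as a placeholder.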

## Why Use Table Reader?

- Efficiency: Table Reader streamlines the process of extracting tabular data from imported images, saving researchers valuable time compared to manual transcription.
- Accuracy: By leveraging OCR technology, Table Reader improves the accuracy of extracted data values and reduces the risk of errors introduced during manual data entry.
- Versatility: Researchers across various fields, including science, engineering, and social sciences, can benefit from Table Reader's ability to digitize and analyze tabular data from diverse sources, such as research publications and field notes.
- Automation: With its command-line interface, Table Reader supports automation and integration into existing data processing pipelines, facilitating seamless data extraction and analysis workflows.

## Future updates

- Webapp interface
- Upload multiple images
- Ability to select/deselect image and OCR processing

## How to cite
