PDF Text Summarizer is a Streamlit-based application that allows users to extract and summarize text from PDF documents or input text directly. It's designed to simplify the process of understanding large documents by providing concise summaries.
- PDF Text Extraction: Upload PDF documents to extract text.
- Text Summarization: Summarize extracted or input text for quick comprehension.
- User-Friendly Interface: Easy-to-use sidebar for method selection and interactive elements for a better user experience.
- Streamlit Application (`app.py`): The frontend interface where users interact with the application.
- Text Extraction Module (`src/pytesseract_ocr.py`): Extracts text from uploaded PDF files.
- Text Summarization Module (`src/summarizer.py`): Summarizes the extracted or input text.
- Start: User chooses to upload a PDF or input text directly.
- Processing:
  - If a PDF is uploaded, the `PDFToTextConverter` extracts text from the PDF.
  - If text is input directly, it is taken as is for summarization.
- Summarization: The `TextSummarizer` generates a concise summary of the provided text.
- Display: The original text (if extracted) and the summarized text are displayed to the user.
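The flow above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function `summarize_input` and the stand-in callables `fake_extract` and `first_sentence` are hypothetical, and the real `PDFToTextConverter` and `TextSummarizer` classes may expose different interfaces. The extraction and summarization steps are passed in as callables so the routing logic can be read (and run) without any OCR or model dependencies.

```python
from typing import Callable, Optional


def summarize_input(
    pdf_bytes: Optional[bytes],
    typed_text: Optional[str],
    extract_text: Callable[[bytes], str],
    summarize: Callable[[str], str],
) -> dict:
    """Route the input through extraction (PDF only), then summarization."""
    if pdf_bytes is not None:
        # PDF branch: extract the text first and keep it for display.
        original = extract_text(pdf_bytes)
        extracted = True
    elif typed_text and typed_text.strip():
        # Direct-input branch: the text is used as is.
        original = typed_text
        extracted = False
    else:
        raise ValueError("Provide either a PDF or some text to summarize.")

    return {
        # The original text is only shown when it came from a PDF.
        "original_text": original if extracted else None,
        "summary": summarize(original),
    }


# Hypothetical stand-ins for the extraction and summarization modules.
def fake_extract(data: bytes) -> str:
    return data.decode("utf-8")


def first_sentence(text: str) -> str:
    return text.split(".")[0].strip() + "."


result = summarize_input(
    None,
    "Streamlit renders the page. The summary follows.",
    fake_extract,
    first_sentence,
)
print(result["summary"])  # Streamlit renders the page.
```

In the real application, the two callables would presumably be backed by the extraction and summarization modules, and the returned dictionary would feed the display step.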
- Start the Application: Run `streamlit run app.py` in the terminal.
- Choose Input Method: Use the sidebar to select between uploading a PDF or entering text.
- Upload or Enter Text: Either upload a PDF file or type text into the provided text area.
- Summarize: Click the 'Summarize' button to process and view the summary.
Here are some screenshots showing different stages of the PDF Text Summarizer application:
Follow these steps to get the application up and running:
- Clone the repository: `git clone https://github.com/olawale0254/IntellectSummarizer.git`
- Navigate to the project directory: `cd IntellectSummarizer`
- Install dependencies: `pip install -r requirements.txt`
- Run the application: `streamlit run app.py`