changes
Frooodle committed May 29, 2024
1 parent fb36639 commit 71d16fa
Showing 2 changed files with 139 additions and 19 deletions.
78 changes: 61 additions & 17 deletions docs/Advanced Configuration/How to add configurations.md
@@ -5,52 +5,96 @@ sidebar_position: 1

Stirling PDF allows easy customization of the app.
This includes things like:
- Custom application name
- Custom slogans, icons, images, and even custom HTML (via file overrides)

- Custom application name
- Custom slogans, icons, HTML, images, CSS, etc. (via file overrides)

For customization via variables there are two options: use the settings file ``settings.yml``, or set environment variables directly.
The settings file is located in the ``/configs`` directory and follows standard YAML formatting.
There are two options for this. The first is the generated settings file ``settings.yml``.
This file is located in the ``/configs`` directory and follows standard YAML formatting.

Environment variables override their settings file equivalents.
Environment variables are also supported and override the settings file.
For example, suppose ``settings.yml`` contains:

```
system:
  defaultLocale: 'en-US'
  enableLogin: 'true'
```

To set this via an environment variable, join the names of each subsection to form the parameter name.
In this case, joining ``system`` and ``defaultLocale`` in all caps gives the variable ``SYSTEM_DEFAULTLOCALE`` or ``SYSTEM_DEFAULT_LOCALE``.
To set this via an environment variable, use ``SYSTEM_ENABLELOGIN``.
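
The naming rule can be sketched as a small shell helper. This is only an illustration under the assumption that each YAML level is joined with an underscore and upper-cased, as in the example; `to_env_name` is a hypothetical name and is not part of Stirling-PDF.

```shell
# Convert a dotted settings.yml path into the matching environment variable name.
# Illustrative helper only; it is not shipped with Stirling-PDF.
to_env_name() {
  printf '%s\n' "$1" | tr '.' '_' | tr '[:lower:]' '[:upper:]'
}

to_env_name system.enableLogin   # prints SYSTEM_ENABLELOGIN
```

The resulting name can then be exported in the shell or passed to the container as an environment variable.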

The current list of settings is:

```
security:
  enableLogin: false # set to 'true' to enable login
  csrfDisabled: true
  csrfDisabled: true # Set to 'true' to disable CSRF protection (not recommended for production)
  loginAttemptCount: 5 # lock user account after 5 tries
  loginResetTimeMinutes: 120 # lock account for 2 hours after x attempts
#  initialLogin:
#    username: "admin" # Initial username for the first login
#    password: "stirling" # Initial password for the first login
#  oauth2:
#    enabled: false # set to 'true' to enable login (Note: enableLogin must also be 'true' for this to work)
#    issuer: "" # set to any provider that supports OpenID Connect Discovery (/.well-known/openid-configuration) end-point
#    clientId: "" # Client ID from your provider
#    clientSecret: "" # Client Secret from your provider
#    autoCreateUser: false # set to 'true' to allow auto-creation of non-existing users
#    useAsUsername: "email" # Default is 'email'; custom fields can be used as the username
#    scopes: "openid, profile, email" # Specify the scopes for which the application will request permissions
#    provider: "google" # Set this to your OAuth provider's name, e.g., 'google' or 'keycloak'
#    client:
#      google:
#        clientId: "" # Client ID for Google OAuth2
#        clientSecret: "" # Client Secret for Google OAuth2
#        scopes: "https://www.googleapis.com/auth/userinfo.email, https://www.googleapis.com/auth/userinfo.profile" # Scopes for Google OAuth2
#        useAsUsername: "email" # Field to use as the username for Google OAuth2
#      github:
#        clientId: "" # Client ID for GitHub OAuth2
#        clientSecret: "" # Client Secret for GitHub OAuth2
#        scopes: "read:user" # Scope for GitHub OAuth2
#        useAsUsername: "login" # Field to use as the username for GitHub OAuth2
#      keycloak:
#        issuer: "http://192.168.0.123:8888/realms/stirling-pdf" # URL of the Keycloak realm's OpenID Connect Discovery endpoint
#        clientId: "stirling-pdf" # Client ID for Keycloak OAuth2
#        clientSecret: "" # Client Secret for Keycloak OAuth2
#        scopes: "openid, profile, email" # Scopes for Keycloak OAuth2
#        useAsUsername: "email" # Field to use as the username for Keycloak OAuth2
system:
  defaultLocale: 'en-US' # Set the default language (e.g. 'de-DE', 'fr-FR', etc)
  googlevisibility: false # 'true' to allow Google visibility (via robots.txt), 'false' to disallow
  enableAlphaFunctionality: false # Set to enable functionality which might need more testing before it fully goes live (This feature might make no changes)
  showUpdate: true # see when a new update is available
  showUpdateOnlyAdmin: false # Only admins can see when a new update is available, depending on showUpdate it must be set to 'true'
  customHTMLFiles: false # enable to have files placed in /customFiles/templates override the existing template html files
#ui:
#  appName: exampleAppName # Application's visible name
#  homeDescription: I am a description # Short description or tagline shown on homepage.
#  appNameNavbar: navbarName # Name displayed on the navigation bar
ui:
  appName: null # Application's visible name
  homeDescription: null # Short description or tagline shown on homepage.
  appNameNavbar: null # Name displayed on the navigation bar
endpoints:
  toRemove: [] # List endpoints to disable (e.g. ['img-to-pdf', 'remove-pages'])
  groupsToRemove: [] # List groups to disable (e.g. ['LibreOffice'])
metrics:
  enabled: true # 'true' to enable Info APIs endpoints (view http://localhost:8080/swagger-ui/index.html#/API to learn more), 'false' to disable
  enabled: true # 'true' to enable Info APIs (`/api/*`) endpoints, 'false' to disable
```

For more info on the individual entries, please see their separate pages.
There is an additional config file, ``/configs/custom_settings.yml``, where users familiar with Java and Spring ``application.properties`` can add their own settings on top of Stirling-PDF's existing ones.


#### Extra notes
- Endpoints. Currently, the variables ENDPOINTS_TO_REMOVE and GROUPS_TO_REMOVE accept comma-separated lists of endpoints and groups to disable. For example, ENDPOINTS_TO_REMOVE=img-to-pdf,remove-pages would disable both image-to-pdf and remove-pages, and GROUPS_TO_REMOVE=LibreOffice would disable everything that uses LibreOffice. You can see a list of all endpoints and groups [here](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/Endpoint-groups.md)
- customStaticFilePath. Customise static files such as the app logo by placing files in the /customFiles/static/ directory. For example, to customise the app logo, place a /customFiles/static/favicon.svg file to override the current SVG. This can be used to change any images, icons, CSS, fonts, JS, etc. in Stirling-PDF
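
As a minimal sketch of the endpoint note, the two variables can be exported before starting the app. The values are the ones from the note; `split_list` is a hypothetical helper used only to show how the comma-separated value breaks into entries, and is not part of Stirling-PDF.

```shell
# Values taken from the note above; names must match the linked endpoint list.
export ENDPOINTS_TO_REMOVE="img-to-pdf,remove-pages"
export GROUPS_TO_REMOVE="LibreOffice"

# Print one entry per line from a comma-separated list (illustrative helper).
split_list() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Shows each endpoint that would be disabled, one per line.
split_list "$ENDPOINTS_TO_REMOVE"
```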

### Environment only parameters
- ``SYSTEM_ROOT_URI_PATH`` changes the website's root path, i.e. if set to ``pdf-app`` the application will be viewable at ``localhost:8080/pdf-app`` instead of ``localhost:8080/``

- ``SYSTEM_ROOTURIPATH`` e.g. set to ``/pdf-app`` to set the application's root URI to ``localhost:8080/pdf-app``
- ``SYSTEM_CONNECTIONTIMEOUTMINUTES`` to set a custom connection timeout value
- ``DOCKER_ENABLE_SECURITY`` to tell Docker to download the security jar (must be ``true`` for authentication and login functionality)
- ``DOCKER_ENABLE_SECURITY`` to tell Docker to download the security jar (must be ``true`` for auth login)
- ``INSTALL_BOOK_AND_ADVANCED_HTML_OPS`` to download Calibre onto Stirling-PDF, enabling PDF to/from book and advanced HTML conversion
- ``LANGS`` to define custom font libraries to install for use in document conversions

### Local
If running Java directly outside of Docker, you can set these environment variables before starting the app.
80 changes: 78 additions & 2 deletions docs/Advanced Configuration/OCR.md
@@ -3,6 +3,82 @@ sidebar_position: 1
id: OCR
title: OCR (Optical Character Recognition)
---
# OCR (Optical Character Recognition)
# OCR Language Packs and Setup

TODO OCR HERE
This document provides instructions on how to add additional language packs for the OCR tab in Stirling-PDF, both inside and outside of Docker.

## My OCR used to work and now doesn't!
The paths for the tessdata location have changed on new Docker images. Please use ``/usr/share/tessdata`` (other paths should still work for backwards compatibility, but may not).

## How does the OCR work?
Stirling-PDF uses [OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF), which in turn uses Tesseract for its text recognition.
All credit goes to them for this awesome work!

## Language Packs

Tesseract OCR supports a variety of languages. You can find additional language packs in the Tesseract GitHub repositories:

- [tessdata_fast](https://github.com/tesseract-ocr/tessdata_fast): These language packs are smaller and faster to load, but may provide lower recognition accuracy.
- [tessdata](https://github.com/tesseract-ocr/tessdata): These language packs are larger and provide better recognition accuracy, but may take longer to load.

Depending on your requirements, you can choose the appropriate language pack for your use case. By default, Stirling-PDF uses the tessdata_fast `eng` pack, but this can be replaced.

### Installing Language Packs

1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tessdata`

# DO NOT REMOVE EXISTING ENG.TRAINEDDATA, IT'S REQUIRED.
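
The two steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the language packs are fetched from the `main` branch of the tessdata_fast repository, `TESSDATA_DIR` is a hypothetical host directory that is later mounted at `/usr/share/tessdata`, and `traineddata_url` and `fetch_language` are illustrative helper names.

```shell
# Hypothetical host directory holding the .traineddata files.
TESSDATA_DIR="${TESSDATA_DIR:-./tessdata}"

# Build the download URL for one language pack.
# Assumption: files are served from the repository's "main" branch.
traineddata_url() {
  printf 'https://github.com/tesseract-ocr/tessdata_fast/raw/main/%s.traineddata\n' "$1"
}

# Download one language pack into TESSDATA_DIR.
# Other files in the directory, such as eng.traineddata, are left in place.
fetch_language() {
  mkdir -p "$TESSDATA_DIR"
  curl -fL -o "$TESSDATA_DIR/$1.traineddata" "$(traineddata_url "$1")"
}

# Example (requires network access):
# fetch_language deu
```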

#### Docker

If you are using Docker, you need to expose the Tesseract tessdata directory as a volume in order to use the additional language packs.
#### Docker Compose
Modify your `docker-compose.yml` file to include the following volume configuration:


```
services:
  your_service_name:
    image: your_docker_image_name
    volumes:
      - /location/of/trainingData:/usr/share/tessdata
```


#### Docker run
Add the following to your existing `docker run` command:
```
-v /location/of/trainingData:/usr/share/tessdata
```
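
Because a bind mount replaces the container's `/usr/share/tessdata` contents with the host directory, it may be worth checking the host directory before starting the container. The sketch below lists the language packs in a directory and checks that `eng.traineddata` is present; `list_languages` and `has_english` are hypothetical helper names, not part of Stirling-PDF.

```shell
# Print the language code of every .traineddata file in the given directory.
list_languages() {
  for f in "$1"/*.traineddata; do
    [ -e "$f" ] || continue          # skip the unexpanded glob when no files match
    basename "$f" .traineddata
  done
}

# Succeed only if the required English pack is present.
has_english() {
  [ -f "$1/eng.traineddata" ]
}

# Example:
# list_languages /location/of/trainingData
# has_english /location/of/trainingData || echo "eng.traineddata is missing"
```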

#### Non-Docker
If you are not using Docker, you need to install the OCR components, including the ocrmypdf app.
See the [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html).

On Debian-based systems, install languages with these commands:

```
sudo apt update
# All languages
# sudo apt install -y 'tesseract-ocr-*'
# Find languages:
apt search tesseract-ocr-
# View installed languages (the glob is quoted so dpkg-query, not the shell, expands it):
dpkg-query -W 'tesseract-ocr-*' | sed 's/tesseract-ocr-//g'
```

Fedora:

```
# All languages
# sudo dnf install -y tesseract-langpack-*
# Find languages:
dnf search -C tesseract-langpack-
# View installed languages:
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
```
