Modification of the definition of networks #18

Open
wants to merge 63 commits into
base: doc_refactor
63 commits
28ad980
Update installation packages and installation instructions
SeguinBe Oct 6, 2018
957cd58
Revamp of the network description and architecture in a more flexible…
SeguinBe Oct 12, 2018
ff1edd8
Removing useless files
SeguinBe Oct 12, 2018
a9e0ed7
dh_segment_train as a script
SeguinBe Oct 12, 2018
e0d6c5d
Correcting the deletion of the main script, oops...
SeguinBe Oct 12, 2018
cb1d8fc
Nicer labels for the progress bars
SeguinBe Oct 12, 2018
da1258a
Nicer handling of number of threads
SeguinBe Oct 12, 2018
4de57fe
Removing code which has been made useless
SeguinBe Oct 12, 2018
7e5ccb4
mainly docstring formatting
solivr Oct 22, 2018
ce214c2
changed :param: by :ivar:
solivr Oct 22, 2018
62ec71d
Updating batchnorm training
SeguinBe Oct 26, 2018
cace550
Added MobileNetV2
SeguinBe Oct 26, 2018
ea11126
Documentation of exported model
SeguinBe Oct 29, 2018
82a5f22
Fixed refactoring
Oct 30, 2018
91540f2
Merge pull request #19 from sriak/master
solivr Oct 30, 2018
9889c7d
updated demo
solivr Nov 1, 2018
4e00913
pip install
solivr Nov 2, 2018
4f177b1
typo in attribute
solivr Nov 14, 2018
932fa3c
corrected non exported segment_ids field
solivr Nov 14, 2018
c5a1965
sorting of TextLines in a TextRegion
solivr Nov 15, 2018
346e2fb
force type to be int (for JSON export compatibility)
solivr Nov 20, 2018
7c25b56
specific to int32 and int64 type
solivr Nov 20, 2018
3eefba8
input csv file
solivr Dec 4, 2018
455a8e9
via annotation processing
e-maud Dec 11, 2018
811af9c
via annotation processing - typo
e-maud Dec 11, 2018
48efe87
type correction
solivr Dec 11, 2018
f736aaa
added doc
solivr Dec 12, 2018
7f65ad4
updated doc
solivr Jan 17, 2019
4509bc5
updated installation doc
solivr Jan 18, 2019
e61079f
packages versions
solivr Jan 18, 2019
db46c35
detected contour should have at least 3 points
solivr Jan 21, 2019
7c53e27
LatestExporter if no eval data is provided
solivr Jan 24, 2019
e07f996
update
solivr Dec 14, 2018
b090906
contour option in mask creation
solivr Jan 24, 2019
ba92f50
export regions coordinates to VIA compatible format
solivr Jan 30, 2019
fbb9350
doc and typos
solivr Feb 5, 2019
600acaa
simlified via.py and updated doc
solivr Feb 11, 2019
665af99
doc formatting
solivr Feb 11, 2019
84ec4dd
parse attributes of TextRegion and TextLines 'custom' and 'type'
solivr Dec 4, 2018
77bb4f3
remove git repo dependency
solivr Feb 11, 2019
532131a
merging
solivr Feb 11, 2019
909e8b1
corrected wrong argument names
solivr Feb 13, 2019
6717332
wrong variable name
solivr Feb 13, 2019
704087a
via example and doc formatting
solivr Feb 12, 2019
04ce8b6
Correcting typo masks creation script
alix-tz Feb 20, 2019
2264cf1
Merge pull request #26 from alix-tz/patch-1
solivr Feb 21, 2019
1262b59
Fixing instruction
alix-tz Feb 26, 2019
12d2759
Merge pull request #27 from alix-tz/patch-2
solivr Feb 28, 2019
6fdfcbd
do not export attribute 'type' if it's empty
solivr Mar 7, 2019
8fbd882
array to list of Point method
solivr Feb 25, 2019
2af56f2
update parsing + get list of tags from xml
solivr Mar 12, 2019
7100855
merge from master
SeguinBe Mar 22, 2019
8deae44
miou metric
solivr Mar 8, 2019
540eb36
to_json method for Page class
solivr Apr 4, 2019
605a930
updated via helpers
solivr Apr 9, 2019
6456a69
update packages version
solivr Apr 9, 2019
a072442
update to opencv 4.0
solivr Apr 9, 2019
fbad361
changelog
solivr Apr 9, 2019
9de5ca7
fix tensorflow-gpu version
solivr Apr 10, 2019
875c547
fixes #37
solivr May 15, 2019
7f2a348
merge
SeguinBe May 22, 2019
de461a7
working version corrected
SeguinBe May 22, 2019
1b36fca
formatting
solivr Jul 26, 2019
11 changes: 7 additions & 4 deletions demo.py
@@ -10,7 +10,7 @@
 from tqdm import tqdm
 
 from dh_segment.io import PAGE
-from dh_segment.network import LoadedModel
+from dh_segment.inference import LoadedModel
 from dh_segment.post_processing import boxes_detection, binarization
 
 # To output results in PAGE XML format (http://www.primaresearch.org/schema/PAGE/gts/pagecontent/2013-07-15/)
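The hunk above moves `LoadedModel` from `dh_segment.network` to `dh_segment.inference`. For scripts that may need to run against both the pre- and post-refactoring layouts, one option is to try the new module path first and fall back to the old one. The following is only a sketch of that idea: `import_first` is a hypothetical helper, not part of dhSegment, and the commented usage assumes both module paths expose `LoadedModel`.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first(candidates: Sequence[str]) -> ModuleType:
    """Return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Module (or one of its parent packages) is missing: try the next one.
            continue
    raise ImportError('None of these modules could be imported: {}'.format(', '.join(candidates)))


# Hypothetical usage: prefer the new location, fall back to the old one.
# module = import_first(['dh_segment.inference', 'dh_segment.network'])
# LoadedModel = module.LoadedModel
```

Listing the new path first means an installation that still ships the old module is only consulted when the new one is absent.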
@@ -89,14 +89,17 @@ def format_quad_to_string(quad):
     cv2.polylines(original_img, [pred_page_coords[:, None, :]], True, (0, 0, 255), thickness=5)
     # Write corners points into a .txt file
     txt_coordinates += '{},{}\n'.format(filename, format_quad_to_string(pred_page_coords))
+
+    # Create page region and XML file
+    page_border = PAGE.Border(coords=PAGE.Point.cv2_to_point_list(pred_page_coords[:, None, :]))
 else:
     print('No box found in {}'.format(filename))
+    page_border = PAGE.Border()
+
 basename = os.path.basename(filename).split('.')[0]
 imsave(os.path.join(output_dir, '{}_boxes.jpg'.format(basename)), original_img)
 
-# Create page region and XML file
-page_border = PAGE.Border(coords=PAGE.Point.cv2_to_point_list(pred_page_coords[:, None, :]))
-page_xml = PAGE.Page(filename, image_width=original_shape[1], image_height=original_shape[0],
+page_xml = PAGE.Page(image_filename=filename, image_width=original_shape[1], image_height=original_shape[0],
                      page_border=page_border)
 xml_filename = os.path.join(output_pagexml_dir, '{}.xml'.format(basename))
 page_xml.write_to_file(xml_filename, creator_name='PageExtractor')
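This hunk builds `page_border` inside the `if`/`else`, so an image in which no box is detected still gets an (empty) border instead of the code indexing into `None`. The control flow can be sketched in isolation as follows. `Border` and `make_border` are hypothetical stand-ins written for this illustration; they are not the `dh_segment.io.PAGE` classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Point = Tuple[int, int]


@dataclass
class Border:
    """Hypothetical stand-in for PAGE.Border: holds a (possibly empty) polygon."""
    coords: List[Point] = field(default_factory=list)


def make_border(pred_page_coords: Optional[Sequence[Point]]) -> Border:
    """Build a border from detected corners, or an empty one if nothing was found."""
    if pred_page_coords is not None:
        return Border(coords=[(int(x), int(y)) for x, y in pred_page_coords])
    # No box detected: fall back to an empty border instead of indexing into None.
    return Border()


print(make_border([(0, 0), (10, 0), (10, 20), (0, 20)]))
print(make_border(None))
```

Either branch returns an object of the same type, so the code that later writes the XML file does not need its own `None` check.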
347 changes: 347 additions & 0 deletions demo/interactive_demo.ipynb
@@ -0,0 +1,347 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Interactive demo to load a trained model for page extraction and apply it to a randomly selected file"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 1. Get the annotated sample dataset, which already contains the folders images and labels. Unzip it into `demo/pages_sample`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! wget https://github.com/dhlab-epfl/dhSegment/releases/download/untagged-b55f9aa4fff5efd4b1b8/pages_sample.zip\n",
"! unzip pages_sample.zip"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2. Download the provided model and unzip it into `demo/model`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! wget https://github.com/dhlab-epfl/dhSegment/releases/download/v0.2/model.zip\n",
"! unzip model.zip"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3. Run the code step by step"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import cv2\n",
"from glob import glob\n",
"import numpy as np\n",
"import random\n",
"import tensorflow as tf\n",
"from imageio import imread, imsave"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dh_segment.io import PAGE\n",
"from dh_segment.inference import LoadedModel\n",
"from dh_segment.post_processing import boxes_detection, binarization"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def page_make_binary_mask(probs: np.ndarray, threshold: float = -1) -> np.ndarray:\n",
"    \"\"\"\n",
"    Computes the binary mask of the detected page from the probabilities output by the network\n",
"    :param probs: array with values in range [0, 1]\n",
"    :param threshold: threshold in [0, 1]; if negative, Otsu's adaptive threshold is used\n",
"    :return: binary mask\n",
"    \"\"\"\n",
"\n",
"    mask = binarization.thresholding(probs, threshold)\n",
"    mask = binarization.cleaning_binary(mask, kernel_size=5)\n",
"    return mask"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define input and output directories / files"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model_dir = 'page_model/export'\n",
"if not os.path.exists(model_dir):\n",
"    model_dir = 'model/'\n",
"assert os.path.exists(model_dir), 'Model directory not found: {}'.format(model_dir)\n",
"\n",
"input_files = glob(os.path.join('pages_sample', 'images/*'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"output_dir = './processed_images'\n",
"os.makedirs(output_dir, exist_ok=True)\n",
"# PAGE XML format output\n",
"output_pagexml_dir = os.path.join(output_dir, 'page_xml')\n",
"os.makedirs(output_pagexml_dir, exist_ok=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Start a tensorflow session"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"session = tf.InteractiveSession()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Select a random image"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"file_to_process = random.choice(input_files)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Load the model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"m = LoadedModel(model_dir, predict_mode='filename')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Predict each pixel's label"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Predict each pixel's label for the selected image\n",
"prediction_outputs = m.predict(file_to_process)\n",
"probs = prediction_outputs['probs'][0]\n",
"original_shape = prediction_outputs['original_shape']\n",
"\n",
"probs = probs[:, :, 1] # Take only class '1' (class 0 is the background, class 1 is the page)\n",
"probs = probs / np.max(probs) # Normalize to be in [0, 1]\n",
"\n",
"# Binarize the predictions\n",
"page_bin = page_make_binary_mask(probs)\n",
"\n",
"# Upscale to have full resolution image (cv2 uses (w,h) and not (h,w) for giving shapes)\n",
"bin_upscaled = cv2.resize(page_bin.astype(np.uint8, copy=False),\n",
"                          tuple(original_shape[::-1]), interpolation=cv2.INTER_NEAREST)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Show the probability map and binarized mask"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.figure(figsize=(10,10))\n",
"plt.subplot(1,2,1)\n",
"plt.imshow(probs, cmap='gray')\n",
"plt.axis('off')\n",
"plt.title('Probability map')\n",
"plt.subplot(1,2,2)\n",
"plt.imshow(page_bin, cmap='gray')\n",
"plt.axis('off')\n",
"plt.title('Binary mask')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Find quadrilateral enclosing the page"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pred_page_coords = boxes_detection.find_boxes(bin_upscaled.astype(np.uint8, copy=False),\n",
" mode='min_rectangle', n_max_boxes=1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Draw the page box on the original image\n",
"original_img = imread(file_to_process, pilmode='RGB')\n",
"if pred_page_coords is not None:\n",
"    cv2.polylines(original_img, [pred_page_coords[:, None, :]], True, (0, 0, 255), thickness=5)\n",
"else:\n",
"    print('No box found in {}'.format(file_to_process))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.figure(figsize=(10,10))\n",
"plt.imshow(original_img)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Export image and create page region and XML file"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"basename = os.path.basename(file_to_process).split('.')[0]\n",
"imsave(os.path.join(output_dir, '{}_boxes.jpg'.format(basename)), original_img)\n",
"\n",
"# Fall back to an empty border when no box was found (as demo.py does)\n",
"if pred_page_coords is not None:\n",
"    page_border = PAGE.Border(coords=PAGE.Point.cv2_to_point_list(pred_page_coords[:, None, :]))\n",
"else:\n",
"    page_border = PAGE.Border()\n",
"page_xml = PAGE.Page(image_filename=file_to_process, image_width=original_shape[1], image_height=original_shape[0], page_border=page_border)\n",
"xml_filename = os.path.join(output_pagexml_dir, '{}.xml'.format(basename))\n",
"page_xml.write_to_file(xml_filename, creator_name='PageExtractor')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 4. Have a look at the results in `demo/processed_images`"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [conda env:dhsegment]",
"language": "python",
"name": "conda-env-dhsegment-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
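The notebook's `page_make_binary_mask` passes `threshold=-1` by default, which its docstring says selects Otsu's threshold. As a rough illustration of what that choice means, the sketch below implements Otsu's method in pure Python on a flat list of probabilities. It is not the code behind `binarization.thresholding`: the names `otsu_threshold` and `binarize`, the 256-bin quantisation, and the use of `>=` for the comparison are all assumptions made for this example.

```python
from typing import List, Sequence


def otsu_threshold(probs: Sequence[float], n_bins: int = 256) -> float:
    """Return the threshold in [0, 1] that maximises the between-class variance."""
    # Histogram of the probabilities, quantised into n_bins levels.
    hist = [0] * n_bins
    for p in probs:
        hist[min(int(p * n_bins), n_bins - 1)] += 1

    total = len(probs)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for i, h in enumerate(hist):
        weight_bg += h
        if weight_bg == 0:
            continue  # no background samples yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # everything is background from here on
        sum_bg += i * h
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, i
    # Bins up to and including best_bin are treated as background.
    return (best_bin + 1) / n_bins


def binarize(probs: Sequence[float], threshold: float = -1.0) -> List[int]:
    """Binarize with a fixed threshold, or with Otsu's threshold if it is negative."""
    if threshold < 0:
        threshold = otsu_threshold(probs)
    return [1 if p >= threshold else 0 for p in probs]


values = [0.05, 0.1, 0.08, 0.9, 0.95, 0.85]
print(otsu_threshold(values))
print(binarize(values))
```

On a clearly bimodal input such as `values`, the chosen threshold falls between the two clusters, so no hand-tuned cut-off is needed. A real probability map would be a 2-D array and would normally be handled with NumPy or OpenCV rather than Python lists.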