HangulDB-Image

Korean handwriting dataset parsed from the HangulDB.

Samples

Images vary in width and height. For consistency with the original data, I intentionally preserve this property.
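
Because the sizes vary, a training pipeline usually has to bring samples to a common shape. The sketch below is one possible approach, not part of this repo: it computes how to scale an image into a square canvas while keeping its aspect ratio. The function name `letterbox_geometry` and the default `target=64` are hypothetical choices for illustration.

```python
def letterbox_geometry(width, height, target=64):
    """Return (new_w, new_h, pad_left, pad_top) for fitting a
    width x height image into a target x target square without distortion."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Scale so that the longer side exactly fills the target size.
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Center the resized image; the remaining border is padding.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


print(letterbox_geometry(100, 50))  # (64, 32, 0, 16)
```

The returned values can be passed to whatever image library you use for the actual resize and paste.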

b0a1

bad0

ba88
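
The sample names above look like two-byte character codes written in hex. Assuming they are Korean code-page values (this README does not say so explicitly), a label can be turned into the character it stands for with Python's built-in `cp949` codec. The helper name `decode_label` is hypothetical.

```python
def decode_label(hex_label):
    """Decode a hex label such as 'b0a1' into a character.

    Assumption: the label is a two-byte CP949 / KS X 1001 code in hex.
    """
    return bytes.fromhex(hex_label).decode("cp949")


print(decode_label("b0a1"))  # 가
```

`cp949` is used here rather than `euc_kr` because it is a superset and accepts more byte pairs.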

Datasets

This repo contains PE92, SERI95, and HanDB.

PE92 contains 2350 classes, each with 100 samples.
SERI95 contains 520 classes, each with 1000 samples.
HanDB merges SERI95 and PE92. That is, 520 classes have 1100 samples each and the remaining 1830 classes (2350 − 520) have 100 samples each.
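
The counts can be cross-checked with a few lines of arithmetic. This sketch assumes that every SERI95 class is also one of the PE92 classes, which the merge description implies but does not state outright.

```python
PE92_CLASSES, PE92_PER_CLASS = 2350, 100
SERI95_CLASSES, SERI95_PER_CLASS = 520, 1000

pe92_total = PE92_CLASSES * PE92_PER_CLASS        # 235000
seri95_total = SERI95_CLASSES * SERI95_PER_CLASS  # 520000

# Assumption: the 520 SERI95 classes are a subset of the 2350 PE92 classes.
shared_classes = SERI95_CLASSES
pe92_only_classes = PE92_CLASSES - shared_classes  # 1830

handb_total = (
    shared_classes * (PE92_PER_CLASS + SERI95_PER_CLASS)  # 520 * 1100
    + pe92_only_classes * PE92_PER_CLASS                  # 1830 * 100
)

print(pe92_only_classes, handb_total)  # 1830 755000
```

Under that assumption the merged total equals the sum of the two source datasets.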

Architecture

All three datasets share the same directory structure:

<dataset_name>/<label>/<sample_index>.jpg
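
Given that layout, a dataset folder can be indexed with the standard library alone. This is a minimal sketch, not code from this repo; the function name `index_dataset` is hypothetical.

```python
from pathlib import Path


def index_dataset(root):
    """Map each label folder name to a sorted list of its .jpg sample paths.

    Expects the layout <dataset_name>/<label>/<sample_index>.jpg,
    where root is the <dataset_name> directory.
    """
    root = Path(root)
    index = {}
    for label_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        samples = sorted(label_dir.glob("*.jpg"))
        if samples:  # skip folders that contain no images
            index[label_dir.name] = samples
    return index
```

For example, `index_dataset("PE92_train")` would return a dictionary keyed by label, assuming that folder has been downloaded to the working directory.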

Warning

In PE92, some of the last few samples in each class are mislabeled.
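
One way to work around this is to leave out the highest-indexed samples of each class. The sketch below assumes that `<sample_index>` is an integer file name; the number of samples to drop (`n_tail`) is a hypothetical parameter, since this README does not say how many samples are affected.

```python
from pathlib import Path


def drop_tail(sample_paths, n_tail):
    """Return the samples of one class without the n_tail highest-indexed ones.

    Assumption: each file stem is an integer sample index, e.g. '97.jpg'.
    """
    if n_tail < 0:
        raise ValueError("n_tail must be non-negative")
    # Sort numerically so that '10.jpg' comes after '2.jpg'.
    ordered = sorted(sample_paths, key=lambda p: int(Path(p).stem))
    keep = max(0, len(ordered) - n_tail)
    return ordered[:keep]


print(drop_tail(["10.jpg", "2.jpg", "1.jpg"], 1))  # ['1.jpg', '2.jpg']
```

Inspect a few classes by eye before choosing `n_tail`.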

Verification

parser.ipynb parses an hgu1 file into individual jpg files. You can use it to check that the original dataset is parsed correctly.

Folders and files:

HanDB_test
HanDB_train
PE92_test
PE92_train
SERI_Test
SERI_Train
README.md
parser.ipynb

hslyu/HangulDB-Image
