DI Dataset

DI dataset is the e-commerce images that proposed in "An End-to-End OCR Text Re-organization Sequence Learning for Rich-text Detail Image Comprehension", the original version contains 10k images. For some business reason, they only release ~8k images for reaserch. The dataset can be downloaded from TianChi.

Annoation

We transfer the original annoation into the uniform Davar format as follows,

{
	"O1CN01020zzO1pMyIRdTHd2_!!4117975347.jpg": 
	{
		"height": 1176, 
		"width": 790, 
		"content_ann": {
			"bboxes": [[316, 68, 468, 68, 468, 97, 316, 97],             # text boxes
			          [332, 132, 458, 132, 458, 148, 332, 148], 
					  [286, 230, 343, 230, 343, 247, 286, 247], 
					  [369, 230, 589, 230, 589, 248, 369, 248], 
					  ...], 
			"texts": ["产品信息",                                        # text contents                        
			          "ABOUTPRODUCTS", 
					  "【产品】", 
					  "：康绮墨丽盈润清爽洗发乳",
					  ...], 
			"labels": [[1], [2], [3], [4],...]                           # Reader order, 1,2,3...
		}
	},
	...
},

The formatted datalist can be downloaded from here (Access Code：i22d)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

readme.md

readme.md

DI Dataset

Annoation

Files

readme.md

Latest commit

History

readme.md

File metadata and controls

DI Dataset

Annoation