forked from Eric3911/Related-works-ch
Engineering Notes
181108
U-Net in Keras
https://blog.csdn.net/u012931582/article/details/70215756
Deskewing a tilted book page: perspective transform + minimum-area rectangle detection
https://blog.csdn.net/mao_hui_fei/article/details/79729956
Transmitting image data over the network
https://blog.csdn.net/huhuang/article/details/54911982
Image histograms in Python
https://blog.csdn.net/yuyangyg/article/details/70857438
DeepLabv3+ training notes
https://blog.csdn.net/ncloveqy/article/details/82285106
https://python.ctolib.com/bonlime-keras-deeplab-v3-plus.html
Deep learning paper roadmap: study notes
https://github.com/SnailTyan/deep-learning-papers-translation
keras-yolo3-text
https://github.com/chineseocr/keras-yolo3-text
ICPR 2018: OCR notes
https://blog.csdn.net/qq_14845119/article/details/82219246
SVHN
Text Recognition
https://bartzi.de/research/see
STN-CNN-LSTM-CTC in TensorFlow
https://github.com/Eric3911/STN_CNN_LSTM_CTC_TensorFlow
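Article source: Tencent Penguin Account (企鹅号), CSIG Technical Committee on Document Image Analysis and Recognition
AAAI 2018: selected papers on document image analysis and recognition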
https://cloud.tencent.com/developer/news/154431
R2CNN
https://blog.csdn.net/u010183397/article/details/76473071
OCR system papers
https://blog.csdn.net/jiachen0212/article/details/79498047
Text detection
https://zhuanlan.zhihu.com/p/37781277
https://github.com/bgshih/seglink
http://www.cnblogs.com/skyfsm/p/9776611.html
Variable-length natural scene text detection in TensorFlow
https://blog.csdn.net/p312011150/article/details/82660072
Text detection
https://weibo.com/1560015614/GkLz1a4BH?type=comment
Natural scene text detection
https://blog.csdn.net/u011956004/article/details/79073282
Hypernet
https://segmentfault.com/a/1190000009030250
Focal loss + RetinaNet in Keras
https://blog.csdn.net/u012426298/article/details/80450537
Training SegLink on your own data
https://blog.csdn.net/weixin_43122521/article/details/82558346
https://blog.csdn.net/u011440558/article/details/78564615
https://blog.csdn.net/jiachen0212/article/details/79471823?utm_source=blogxgwz1
DenseNet
https://blog.csdn.net/qq_14845119/article/details/79272082
YOLT for remote sensing image recognition
https://blog.csdn.net/bryant_meng/article/details/81284915
https://blog.csdn.net/jacke121/article/details/80531278
NASNet image recognition
https://blog.csdn.net/sparkexpert/article/details/79834704
Remote sensing image segmentation
Semantic segmentation with SegNet and U-Net
https://www.sohu.com/a/218292615_642762
Training your own SegNet model
https://blog.csdn.net/weixin_43122521/article/details/82558346
Text detection and OCR
https://weibo.com/1560015614/GkLz1a4BH?type=comment
https://zhuanlan.zhihu.com/p/37781277
https://zhuanlan.zhihu.com/p/37363942
TensorFlow training log
https://www.urlteam.org/2017/09/%E7%9B%AE%E6%A0%87%E6%A3%80%E6%B5%8B%E7%AC%94%E8%AE%B0%E4%BA%8C%EF%BC%9Atensorflow%E5%B0%8F%E7%99%BD%E5%AE%9E%E8%B7%B5/
https://blog.csdn.net/asukasmallriver/article/details/78752178
GRU in Keras
https://blog.csdn.net/dcrmg/article/details/79306402
https://download.csdn.net/download/dcrmg/10248818
https://www.cnblogs.com/skyfsm/p/8029668.html
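General-purpose Chinese text recognition model based on YOLOv3 + CRNN
https://swift.ctolib.com/article/comments/94721
Practice code for "Dive into Deep Learning" (动手学深度学习)
https://github.com/SnailTyan/gluon-practice-code
Keypoint detection with OpenCV
https://www.learnopencv.com/hand-keypoint-detection-using-deep-learning-and-opencv/
Region-based fully convolutional networks (R-FCN)
http://www.cnblogs.com/llfctt/p/9071889.html
Illustrated walkthrough of CNN computation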
https://blog.csdn.net/v_JULY_v/article/details/79434745?tdsourcetag=s_pcqq_aiomsg
Image rotation correction (rectifying a document against a template)
import cv2
import numpy as np
from skimage import transform


def calc_length(point1, point2):
    """Euclidean distance between two 2-D points."""
    x = (point1[0] - point2[0]) ** 2
    y = (point1[1] - point2[1]) ** 2
    return (x + y) ** 0.5


MIN_MATCH_COUNT = 10
# Reference template of the document (read as grayscale)
dst_img = cv2.imread('C:/Users/Administrator/Desktop/1010test/1234567.jpg', 0)
# Image of the document to be rectified (read as color)
ori_img_ori = cv2.imread('C:/Users/Administrator/Desktop/1010test/123456.jpg', 1)
# Detect keypoints with SIFT
# (OpenCV >= 4.4; older contrib builds use cv2.xfeatures2d.SIFT_create())
sift = cv2.SIFT_create()
# Compute keypoints and descriptors
kp1, des1 = sift.detectAndCompute(dst_img, None)
kp2, des2 = sift.detectAndCompute(ori_img_ori, None)
# Set up the FLANN matcher (algorithm=1 selects the KD-tree index)
index_params = dict(algorithm=1, trees=5)
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)
# k-nearest-neighbour matching with k=2
matches = flann.knnMatch(des1, des2, k=2)
# Discard ambiguous matches with the ratio test
good = []
for m, n in matches:
    if m.distance < 0.9 * n.distance:
        good.append(m)
# Homography
print(len(good))
if len(good) > MIN_MATCH_COUNT:
    # Reshape only; the data are the coordinates of each matched keypoint
    src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # findHomography estimates the transformation matrix.
    # cv2.RANSAC selects RANSAC to find the best homography H (returned here as M);
    # mask marks which matches were kept as inliers.
    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    # Flatten the mask and convert it to a list
    matchesMask = mask.ravel().tolist()
    # Size of the template image
    h, w = dst_img.shape
    # Four corners of the template: top-left, bottom-left, bottom-right, top-right
    pts = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
    # Project the corners into the input image
    ori = cv2.perspectiveTransform(pts, M)
    dst = [(ele[0][0], ele[0][1]) for ele in ori]
    length_width = int(max(calc_length(dst[3], dst[0]), calc_length(dst[1], dst[2])))
    length_height = int(max(calc_length(dst[0], dst[1]), calc_length(dst[2], dst[3])))
    # Target corners; this ordering produces a transposed image, which is undone below
    tar = np.float32([[0, 0], [length_height, 0], [length_height, length_width], [0, length_width]])
    warp_matrix = cv2.getPerspectiveTransform(ori.reshape(4, 2), tar)
    res = cv2.warpPerspective(ori_img_ori, warp_matrix, (length_height, length_width))
    # Flip vertically and rotate by -90 degrees to get the upright image
    rot_img = transform.rotate(res[::-1, :], -90, resize=True)
    # img_new = cv2.resize(rot_img, (880, 600), interpolation=cv2.INTER_CUBIC)
    cv2.imshow("WarpImg", rot_img)
else:
    print("Not enough matches are found - %d/%d" % (len(good), MIN_MATCH_COUNT))
    matchesMask = None
cv2.waitKey(0)
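The match-filtering step in the script above keeps a match only when its best distance is clearly smaller than the second-best one. Below is a minimal pure-Python sketch of that ratio test. It assumes each match is given as a hypothetical `(best_distance, second_best_distance)` tuple instead of the OpenCV `DMatch` pairs returned by `knnMatch`; the function name `ratio_test` is made up for illustration.

```python
def ratio_test(distance_pairs, ratio=0.9):
    """Return the indices of pairs that pass the ratio test.

    distance_pairs: iterable of (best, second_best) distances, a hypothetical
    stand-in for the (m, n) pairs produced by knnMatch with k=2.
    ratio: the script above uses 0.9; smaller values are stricter.
    """
    kept = []
    for idx, (best, second) in enumerate(distance_pairs):
        # Keep the match only if the best candidate is clearly better
        if best < ratio * second:
            kept.append(idx)
    return kept


pairs = [(10.0, 50.0), (45.0, 48.0), (30.0, 40.0)]
print(ratio_test(pairs, ratio=0.9))  # lenient threshold
print(ratio_test(pairs, ratio=0.7))  # stricter threshold
```

A lower ratio discards more ambiguous matches, at the cost of possibly falling below MIN_MATCH_COUNT on difficult images.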
# -*- coding: utf-8 -*-
"""
Created on Fri Nov 2 11:39:31 2018
@author: Administrator
"""
# Light-spot (glare) detection
import cv2 as cv
import numpy as np


# Global threshold
def threshold_demo(image):
    gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)  # convert to grayscale (imread returns BGR)
    # A global threshold segments the single-channel image pixel by pixel;
    # THRESH_TRIANGLE chooses the threshold value automatically.
    ret, binary = cv.threshold(gray, 0, 255, cv.THRESH_BINARY | cv.THRESH_TRIANGLE)
    print("threshold value %s" % ret)
    cv.namedWindow("binary0", cv.WINDOW_NORMAL)
    cv.imshow("binary0", binary)


# Local (adaptive) threshold
def local_threshold(image):
    gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)  # convert to grayscale
    # Adaptive thresholding varies the threshold with the local brightness of each region
    binary = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C, cv.THRESH_BINARY, 25, 10)
    cv.namedWindow("binary1", cv.WINDOW_NORMAL)
    cv.imshow("binary1", binary)


# Threshold computed by the user (mean gray level)
def custom_threshold(image):
    gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)  # convert to grayscale
    h, w = gray.shape[:2]
    m = np.reshape(gray, [1, w * h])
    mean = m.sum() / (w * h)
    print("mean:", mean)
    ret, binary = cv.threshold(gray, mean, 255, cv.THRESH_BINARY)
    cv.namedWindow("binary2", cv.WINDOW_NORMAL)
    cv.imshow("binary2", binary)


src = cv.imread('C:/Users/Administrator/Desktop/1010test/1102/光斑检测/1.jpg')
cv.namedWindow('input_image', cv.WINDOW_NORMAL)  # WINDOW_NORMAL makes the window freely resizable
cv.imshow('input_image', src)
threshold_demo(src)
local_threshold(src)
custom_threshold(src)
cv.waitKey(0)
cv.destroyAllWindows()
Cropping out the insect
import cv2
import numpy as np


def get_image(path):
    # Load the image and a grayscale copy
    img = cv2.imread(path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return img, gray


def Gaussian_Blur(gray):
    # Gaussian denoising
    blurred = cv2.GaussianBlur(gray, (3, 3), 0)
    return blurred


def Sobel_gradient(blurred):
    # Sobel operator for the gradients in the x and y directions
    gradX = cv2.Sobel(blurred, ddepth=cv2.CV_32F, dx=1, dy=0)
    gradY = cv2.Sobel(blurred, ddepth=cv2.CV_32F, dx=0, dy=1)
    gradient = cv2.subtract(gradX, gradY)
    gradient = cv2.convertScaleAbs(gradient)
    return gradX, gradY, gradient


def Thresh_and_blur(gradient):
    blurred = cv2.GaussianBlur(gradient, (9, 9), 0)
    (_, thresh) = cv2.threshold(blurred, 90, 255, cv2.THRESH_BINARY)
    return thresh


def image_morphology(thresh):
    # Elliptical structuring element
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (25, 25))
    # Morphological closing, then erosion and dilation (see the OpenCV docs for details)
    closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
    closed = cv2.erode(closed, None, iterations=4)
    closed = cv2.dilate(closed, None, iterations=4)
    return closed


def findcnts_and_box_point(closed):
    # findContours returns three values in OpenCV 3 and two in OpenCV 4;
    # the contour list is the second-to-last item in both cases
    cnts = cv2.findContours(closed.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
    c = sorted(cnts, key=cv2.contourArea, reverse=True)[0]
    # Compute the rotated bounding box of the largest contour
    rect = cv2.minAreaRect(c)
    box = cv2.boxPoints(rect).astype(np.int32)
    return box


def drawcnts_and_cut(original_img, box):
    # drawContours modifies its input in place, so draw on a copy
    # Draw the bounding box around the detected object
    draw_img = cv2.drawContours(original_img.copy(), [box], -1, (0, 0, 255), 3)
    Xs = [i[0] for i in box]
    Ys = [i[1] for i in box]
    # Clamp to 0: a rotated box can extend past the image border
    x1 = max(min(Xs), 0)
    x2 = max(Xs)
    y1 = max(min(Ys), 0)
    y2 = max(Ys)
    height = y2 - y1
    width = x2 - x1
    crop_img = original_img[y1:y1 + height, x1:x1 + width]
    return draw_img, crop_img


def walk():
    img_path = r'C:/Users/Administrator/Desktop/1010test/123456.jpg'
    save_path = r'C:/Users/Administrator/Desktop/1010test/1678_save.png'
    original_img, gray = get_image(img_path)
    blurred = Gaussian_Blur(gray)
    gradX, gradY, gradient = Sobel_gradient(blurred)
    thresh = Thresh_and_blur(gradient)
    closed = image_morphology(thresh)
    box = findcnts_and_box_point(closed)
    draw_img, crop_img = drawcnts_and_cut(original_img, box)
    # Brute force: show every intermediate result
    cv2.imshow('original_img', original_img)
    cv2.imshow('blurred', blurred)
    cv2.imshow('gradX', gradX)
    cv2.imshow('gradY', gradY)
    cv2.imshow('final', gradient)
    cv2.imshow('thresh', thresh)
    cv2.imshow('closed', closed)
    cv2.imshow('draw_img', draw_img)
    # imwrite needs a file extension to pick an encoder
    cv2.imwrite('draw_img.png', draw_img)
    cv2.imshow('crop_img', crop_img)
    cv2.waitKey(0)
    cv2.imwrite(save_path, crop_img)


walk()
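A rotated rectangle from `cv2.minAreaRect` can extend past the image border, so its corner coordinates may be negative or larger than the image size, and a negative slice start would wrap around in NumPy indexing. The sketch below shows one way to turn the four box points into a crop window that stays inside the image. The helper name `crop_bounds` and the sample coordinates are made up for illustration.

```python
def crop_bounds(box, img_width, img_height):
    """Axis-aligned crop window (x1, y1, x2, y2) around four box points,
    clamped so that it never leaves the image."""
    xs = [p[0] for p in box]
    ys = [p[1] for p in box]
    x1 = max(min(xs), 0)
    y1 = max(min(ys), 0)
    x2 = min(max(xs), img_width)
    y2 = min(max(ys), img_height)
    return x1, y1, x2, y2


# Hypothetical box whose corners stick out of a 128 x 100 image
box = [(-5, 20), (120, -3), (130, 90), (4, 110)]
print(crop_bounds(box, img_width=128, img_height=100))
```

The returned values can be used directly as `img[y1:y2, x1:x2]`.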
# 1. Batch-edit XML nodes in VOC annotations
# coding=utf-8
import os
import os.path
import xml.dom.minidom

# Source annotation folder and output folder
FindPath = '/home/ubuntu/Desktop/myvoc2007/Annotations/'
FileNames = os.listdir(FindPath)
xml_path = '/home/ubuntu/Desktop/new/'
for file_name in FileNames:
    # Skip sub-directories; only open files
    if not os.path.isdir(os.path.join(FindPath, file_name)):
        print(file_name)
        # Parse the XML file
        dom = xml.dom.minidom.parse(os.path.join(FindPath, file_name))
        root = dom.documentElement
        # Collect every <name> element
        name = root.getElementsByTagName('name')
        for i in range(len(name)):
            print(name[i].firstChild.data)
            if name[i].firstChild.data == 'screw cap':
                name[i].firstChild.data = 'screwnut'
                print('name after the change:')
                print(name[i].firstChild.data)
        # Save the modified XML file
        with open(os.path.join(xml_path, file_name), 'w', encoding='utf-8') as fh:
            dom.writexml(fh)
            print('Wrote name/pose OK!')
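The same label replacement can also be written with `xml.etree.ElementTree` from the standard library. The sketch below works on an XML string so it can be tried without any files; the helper name `rename_label` and the sample annotation are assumptions for illustration, not part of the original script.

```python
import xml.etree.ElementTree as ET


def rename_label(xml_text, old, new):
    """Replace the text of every <name> element equal to `old` with `new`.

    Returns the updated XML as a string and the number of elements changed.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for name in root.iter('name'):
        if name.text == old:
            name.text = new
            changed += 1
    return ET.tostring(root, encoding='unicode'), changed


# Hypothetical, heavily shortened VOC-style annotation
sample = (
    "<annotation>"
    "<object><name>screw cap</name></object>"
    "<object><name>bolt</name></object>"
    "</annotation>"
)
new_xml, n = rename_label(sample, 'screw cap', 'screwnut')
print(n)
print(new_xml)
```

Returning the change count makes it easy to log which files were actually modified.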
# 2. Batch-rename image files
import os


class ImageRename():
    def __init__(self):
        self.path = 'C:/yawning/'

    def rename(self):
        filelist = os.listdir(self.path)
        total_num = len(filelist)
        i = 0
        for item in filelist:
            if item.endswith('.jpg'):
                src = os.path.join(os.path.abspath(self.path), item)
                # Zero-padded three-digit index: 000.jpg, 001.jpg, ...
                dst = os.path.join(os.path.abspath(self.path), '%03d.jpg' % i)
                os.rename(src, dst)
                print('converting %s to %s ...' % (src, dst))
                i = i + 1
        print('total %d to rename & converted %d jpgs' % (total_num, i))


if __name__ == '__main__':
    newname = ImageRename()
    newname.rename()
# 3. Crawl images (try crawling videos later)
import urllib.request
import re


def getHtml(url):
    # url = urllib.parse.quote(url)
    page = urllib.request.urlopen(url)
    html = page.read()
    return html


def getImg(html):
    # Capture the .jpg URL from each src="..." attribute that is followed by alt=
    reg = r'src\s*=\s*"(.+?\.jpg)" alt='
    imgre = re.compile(reg)
    html = html.decode('utf-8')  # Python 3: bytes -> str
    imglist = re.findall(imgre, html)
    x = 0
    for imgurl in imglist:
        urllib.request.urlretrieve(imgurl, '%s.jpg' % x)
        x += 1
    return imglist


html = getHtml("http://www.123.com/13.html")
print(getImg(html))
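The extraction step of the crawler can be tried offline on a small HTML string before pointing it at a real page. The sketch below is a slightly more tolerant variant: it does not require a following `alt=` attribute, ignores case in the extension, and never matches across a closing quote. The pattern name `IMG_SRC`, the helper `find_jpg_urls`, and the sample markup are assumptions for illustration.

```python
import re

# src = "....jpg" with optional spaces around '=', case-insensitive extension
IMG_SRC = re.compile(r'src\s*=\s*"([^"]+?\.jpg)"', re.IGNORECASE)


def find_jpg_urls(html_text):
    """Return every quoted src value that ends in .jpg."""
    return IMG_SRC.findall(html_text)


page = '<img src="http://example.com/a.jpg" alt="a"><img src = "b.JPG"><img src="c.png">'
print(find_jpg_urls(page))
```

For anything beyond quick experiments, an HTML parser is usually more robust than a regular expression.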
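# 4. Convert a video to images frame by frame (original note: 12 images per frame)
import cv2

vc = cv2.VideoCapture("initialD.mp4")
c = 1
if vc.isOpened():
    rval, frame = vc.read()
else:
    rval = False
while rval:
    # Save the current frame, then read the next one
    cv2.imwrite('F://selffakeedataset//' + str(c) + '.jpg', frame)
    c = c + 1
    rval, frame = vc.read()
    cv2.waitKey(1)
vc.release()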
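# 5. Data augmentation (expand one image into 50 via random rotation, shift, shear, zoom, and flip)
# https://github.com/aleju/imgaug
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

datagen = ImageDataGenerator(
    rotation_range=30,       # integer: range in degrees for random rotations
    width_shift_range=0.2,   # float: horizontal shift as a fraction of the width
    height_shift_range=0.2,  # float: vertical shift as a fraction of the height
    shear_range=0.2,         # float: shear intensity (counter-clockwise shear angle)
    zoom_range=0.2,          # float: range for random zoom
    horizontal_flip=True,    # boolean: randomly flip images horizontally
    fill_mode='nearest')     # constant/nearest/reflect/wrap: how points outside the boundary are filled
img = load_img('C:/users/train/000012.jpg')  # a PIL image
x = img_to_array(img)          # NumPy array, shape (height, width, 3) with the default channels_last format
x = x.reshape((1,) + x.shape)  # add a batch axis: shape (1, height, width, 3)
# Generate the augmented images
i = 0
for batch in datagen.flow(x, batch_size=1, save_to_dir='C:/PIC/', save_prefix='smoking', save_format='jpeg'):
    i += 1
    if i >= 50:
        break  # the generator loops forever, so stop explicitly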
# 6. Augment data by rotating images together with their XML annotations
import cv2
import math
import numpy as np
import xml.etree.ElementTree as ET
import os


def rotate_image(src, angle, scale=1):
    w = src.shape[1]
    h = src.shape[0]
    # Convert degrees to radians
    rangle = np.deg2rad(angle)
    # Width and height of the rotated image
    nw = (abs(np.sin(rangle) * h) + abs(np.cos(rangle) * w)) * scale
    nh = (abs(np.cos(rangle) * h) + abs(np.sin(rangle) * w)) * scale
    # Rotation matrix from OpenCV, about the centre of the new canvas
    rot_mat = cv2.getRotationMatrix2D((nw * 0.5, nh * 0.5), angle, scale)
    # Offset from the old centre to the new centre under the rotation
    rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
    # Only the translation column of the matrix is updated
    rot_mat[0, 2] += rot_move[0]
    rot_mat[1, 2] += rot_move[1]
    # Affine warp
    dst = cv2.warpAffine(src, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv2.INTER_LANCZOS4)
    return dst


def rotate_xml(src, xmin, ymin, xmax, ymax, angle, scale=1.):
    w = src.shape[1]
    h = src.shape[0]
    # Convert degrees to radians, then get the width and height of the rotated image
    rangle = np.deg2rad(angle)
    nw = (abs(np.sin(rangle) * h) + abs(np.cos(rangle) * w)) * scale
    nh = (abs(np.cos(rangle) * h) + abs(np.sin(rangle) * w)) * scale
    # Rotation matrix from OpenCV
    rot_mat = cv2.getRotationMatrix2D((nw * 0.5, nh * 0.5), angle, scale)
    # Offset from the old centre to the new centre under the rotation
    rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
    rot_mat[0, 2] += rot_move[0]
    rot_mat[1, 2] += rot_move[1]
    # Transform the midpoints of the four box edges
    point1 = np.dot(rot_mat, np.array([(xmin + xmax) / 2, ymin, 1]))
    point2 = np.dot(rot_mat, np.array([xmax, (ymin + ymax) / 2, 1]))
    point3 = np.dot(rot_mat, np.array([(xmin + xmax) / 2, ymax, 1]))
    point4 = np.dot(rot_mat, np.array([xmin, (ymin + ymax) / 2, 1]))
    concat = np.vstack((point1, point2, point3, point4))
    # boundingRect needs an integer array
    concat = concat.astype(np.int32)
    rx, ry, rw, rh = cv2.boundingRect(concat)
    return rx, ry, rw, rh


# Source images
imgpath = 'c:/data/1/'
# XML annotations of the source images
xmlpath = 'C:/data/2/'
# Output folder for rotated images
rotated_imgpath = 'c:/data/3/'
# Output folder for rotated XML files
rotated_xmlpath = 'c:/data/4/'
for angle in (180, 360):
    for i in os.listdir(imgpath):
        a, b = os.path.splitext(i)
        img = cv2.imread(imgpath + a + '.jpg')
        rotated_img = rotate_image(img, angle)
        cv2.imwrite(rotated_imgpath + a + '_' + str(angle) + 'd.jpg', rotated_img)
        print(str(i) + ' has been rotated by ' + str(angle) + ' degrees.')
        tree = ET.parse(xmlpath + a + '.xml')
        root = tree.getroot()
        for box in root.iter('bndbox'):
            xmin = float(box.find('xmin').text)
            ymin = float(box.find('ymin').text)
            xmax = float(box.find('xmax').text)
            ymax = float(box.find('ymax').text)
            # Inspect these values to check that the converted box is correct
            x, y, w, h = rotate_xml(img, xmin, ymin, xmax, ymax, angle)
            box.find('xmin').text = str(x)
            box.find('ymin').text = str(y)
            box.find('xmax').text = str(x + w)
            box.find('ymax').text = str(y + h)
        tree.write(rotated_xmlpath + a + '_' + str(angle) + 'd.xml')
        print(str(a) + '.xml has been rotated by ' + str(angle) + ' degrees.')
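Both rotation helpers above start from the same canvas-size formula: for a w x h image rotated by an angle θ with scale s, the new width is (|sin θ|·h + |cos θ|·w)·s and the new height is (|cos θ|·h + |sin θ|·w)·s. The sketch below checks that formula with the standard library only; the function name `rotated_size` and the 640 x 480 sample size are assumptions for illustration.

```python
import math


def rotated_size(w, h, angle_deg, scale=1.0):
    """Width and height of the smallest canvas that holds a w x h image
    after rotating it by angle_deg degrees and scaling it by `scale`."""
    rad = math.radians(angle_deg)
    nw = (abs(math.sin(rad)) * h + abs(math.cos(rad)) * w) * scale
    nh = (abs(math.cos(rad)) * h + abs(math.sin(rad)) * w) * scale
    return nw, nh


for angle in (0, 90, 180):
    nw, nh = rotated_size(640, 480, angle)
    # Round to hide floating-point noise in sin/cos
    print(angle, round(nw), round(nh))
```

At 90 degrees the width and height swap, while 180 and 360 degrees leave the canvas size unchanged, which is why the loop above never changes the image dimensions.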
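# 7. Extract the regions given by four corner points and mask the rest of the background black
# On scaling: if the image width/height is scaled by some factor, scale the boxes by the same factor
import numpy as np
import cv2

# Read the original image
img = cv2.imread('./sample.jpg')
# Show the image
cv2.imshow("original", img)
# ROI list; every ROI is stored as (xmin, xmax, ymin, ymax)
roi_list = list()
# ROIs as an ndarray
# Four corner points, in order: top-left, top-right, bottom-right, bottom-left
# Cropping only needs xmin, xmax, ymin, ymax, i.e. the top-left and bottom-right points
rois = np.array([
    [[14, 29], [499, 29], [499, 44], [14, 44]],
    [[66, 63], [275, 63], [275, 105], [66, 105]]
])
# Walk through the ROIs and store them in roi_list in the format above
# Option 1: take the 1st and the 3rd point
# Limitation: only valid for clockwise ordering
# for roi in rois:
#     roi_list.append((roi[0][0], roi[2][0], roi[0][1], roi[2][1]))
# Option 2: use the minimum and maximum values directly
# Advantage: the point order (clockwise, counter-clockwise, or unordered) does not matter
# Limitation: the points must be the four corners of a rectangle
for roi in rois:
    xmin = np.min([coordinates[0] for coordinates in roi])
    xmax = np.max([coordinates[0] for coordinates in roi])
    ymin = np.min([coordinates[1] for coordinates in roi])
    ymax = np.max([coordinates[1] for coordinates in roi])
    roi_list.append((xmin, xmax, ymin, ymax))
# Create a new all-black result image
result = np.zeros_like(img)
# Copy each ROI from the original image into the result; note that y comes before x
for roi in roi_list:
    result[roi[2]:roi[3], roi[0]:roi[1]] = img[roi[2]:roi[3], roi[0]:roi[1]]
# Show the result image
cv2.imshow("result", result)
cv2.waitKey(0)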
# 8. Conversion between XML and txt (here: txt to XML)
#!/usr/bin/env python
# coding:utf-8
from lxml.etree import Element, SubElement, tostring
import glob
import os
from PIL import Image
from tqdm import tqdm


def txtToXml(image_path, txt_path):
    for txt_file in tqdm(glob.glob(os.path.join(txt_path, '*.txt'))):
        # File name without directory and extension
        txt_name_ = os.path.splitext(os.path.basename(txt_file))[0]
        im = Image.open(os.path.join(image_path, txt_name_ + '.jpg'))
        width = im.size[0]
        height = im.size[1]
        node_root = Element('annotation')
        node_folder = SubElement(node_root, 'folder')
        node_folder.text = 'ICPR'
        node_filename = SubElement(node_root, 'filename')
        node_filename.text = txt_name_ + '.jpg'
        node_size = SubElement(node_root, 'size')
        node_width = SubElement(node_size, 'width')
        node_width.text = str(width)
        node_height = SubElement(node_size, 'height')
        node_height.text = str(height)
        node_depth = SubElement(node_size, 'depth')
        node_depth.text = '3'
        with open(txt_file, 'r', encoding='UTF-8') as tree:
            root = tree.readlines()
        for i, line in enumerate(root):
            column = line.split(',')
            node_object = SubElement(node_root, 'object')
            node_name = SubElement(node_object, 'name')
            node_name.text = 'text'  # this was for the second task, so every label is unified as 'text'
            node_difficult = SubElement(node_object, 'difficult')
            node_difficult.text = '0'
            node_bndbox = SubElement(node_object, 'bndbox')
            # Four corner points (x0, y0) ... (x3, y3) from the first eight columns
            for idx, tag in enumerate(['x0', 'y0', 'x1', 'y1', 'x2', 'y2', 'x3', 'y3']):
                SubElement(node_bndbox, tag).text = column[idx]
        # pretty_print adds line breaks and indentation
        xml = tostring(node_root, pretty_print=True, encoding='utf-8', xml_declaration=True)
        with open(txt_name_ + '.xml', 'wb') as f:
            f.write(xml)


if __name__ == "__main__":
    data_path = os.path.join(os.getcwd(), 'txt_1000')
    pic_path = os.path.join(os.getcwd(), 'image_1000')
    txtToXml(pic_path, data_path)
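The per-line conversion at the core of the script above can be isolated and tested without images or lxml. The sketch below uses `xml.etree.ElementTree` from the standard library and assumes each annotation line holds eight comma-separated corner coordinates followed by the transcription; that line layout, the helper name `line_to_object`, and the sample line are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

CORNER_TAGS = ['x0', 'y0', 'x1', 'y1', 'x2', 'y2', 'x3', 'y3']


def line_to_object(line, label='text'):
    """Build an <object> element from one 'x0,y0,...,x3,y3,transcription' line."""
    fields = line.strip().split(',')
    if len(fields) < 8:
        raise ValueError('expected at least 8 comma-separated values')
    obj = ET.Element('object')
    ET.SubElement(obj, 'name').text = label
    ET.SubElement(obj, 'difficult').text = '0'
    bndbox = ET.SubElement(obj, 'bndbox')
    # Only the first eight fields are coordinates; the rest is the transcription
    for tag, value in zip(CORNER_TAGS, fields[:8]):
        ET.SubElement(bndbox, tag).text = value.strip()
    return obj


obj = line_to_object('10,20,110,20,110,60,10,60,some text\n')
print(ET.tostring(obj, encoding='unicode'))
```

Checking the field count up front turns a malformed line into a clear error instead of an IndexError deep inside the loop.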
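8. (continued) Fill a convex polygon
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('C:/Users/Administrator/Desktop/15.jpg', 1)
# Each point is (x, y): the horizontal coordinate comes first, then the vertical one
# fillConvexPoly expects 32-bit integer points
points = np.array([[50, 1], [1, 50], [98, 98], [98, 1]], dtype=np.int32)
cv2.fillConvexPoly(img, points, 1)
# OpenCV stores images as BGR; convert to RGB for matplotlib
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()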
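1. Template matching in Python (this method requires very high similarity and does not handle arbitrary rotation)
import cv2 as cv
import numpy as np


def template_demo():
    # Read the template image
    tpl = cv.imread("D:/1110/9.png")
    # Read the target image to search in
    target = cv.imread("D:/1110/7.png")
    # cv.imshow("template image", tpl)
    # cv.imshow("target image", target)
    # Matching methods: normalized squared difference, normalized cross-correlation,
    # normalized correlation coefficient (for the correlation methods, closer to 1 is better;
    # for squared difference, smaller is better)
    methods = [cv.TM_SQDIFF_NORMED, cv.TM_CCORR_NORMED, cv.TM_CCOEFF_NORMED]
    th, tw = tpl.shape[:2]
    for md in methods: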