YOLOv5 (v6.1) Data Augmentation Explained

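Author: 迪菲赫尔曼. First published 2022-05-15 22:50:13; last modified 2022-08-17 16:24:03. Column: YOLOv5改进实战 (YOLOv5 Improvement in Practice). Tags: computer vision, deep learning, Python.

YOLOv5 offers many kinds of data augmentation. I covered the basic ones, such as scaling, cropping and rotation, in earlier posts, so this post focuses on Mosaic data augmentation.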



The Concept of Mosaic Data Augmentation

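A neural network that performs well usually needs a large amount of training data, but collecting new data costs a lot of time and labour. Data augmentation lets the computer generate extra samples from existing ones through scaling, translation, rotation, colour transforms and similar operations. This increases the number of training samples, and adding a suitable amount of noise improves the model's ability to generalise.

Besides these basic methods, YOLOv5 uses Mosaic data augmentation. The main idea is to randomly crop and scale 4 images and then stitch them together in a random arrangement to form a single image. This enriches the dataset, increases the number of small objects, and can speed up training. Because every training image now carries the content of 4 source images, the normalisation step computes its statistics over 4 images at once, so a smaller batch size is enough and the memory requirement drops. The Mosaic augmentation workflow is shown in the figure.

YOLOv5 keeps its augmentation parameters in data/hyps/hyp.scratch-med.yaml. To turn Mosaic off, simply set the mosaic parameter to 0: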

    hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
    hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
    hsv_v: 0.4  # image HSV-Value augmentation (fraction)
    degrees: 0.0  # image rotation (+/- deg)
    translate: 0.1  # image translation (+/- fraction)
    scale: 0.5  # image scale (+/- gain)
    shear: 0.0  # image shear (+/- deg)
    perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
    flipud: 0.0  # image flip up-down (probability)
    fliplr: 0.5  # image flip left-right (probability)
    mosaic: 1.0  # image mosaic (probability)
    mixup: 0.0  # image mixup (probability)
    copy_paste: 0.0  # segment copy-paste (probability)
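The entries marked (probability) are per-sample chances, not strengths. Below is a minimal sketch of how such a value could be applied, assuming the loader compares a uniform random draw with the setting. The names `hyp` and `use_mosaic` are hypothetical, and the dict is written inline instead of being read from the YAML file; this is not a copy of the YOLOv5 dataloader.

```python
import random

# Hypothetical subset of the hyperparameters, written inline for illustration.
hyp = {"mosaic": 1.0, "mixup": 0.0, "fliplr": 0.5}


def use_mosaic(hyp, rng):
    """Return True when a sample should be built as a mosaic.

    The `mosaic` value is treated as a probability:
    0.0 never fires and 1.0 always fires, because rng.random() lies in [0, 1).
    """
    return rng.random() < hyp["mosaic"]


rng = random.Random(0)  # seeded so the run is repeatable
always = sum(use_mosaic(hyp, rng) for _ in range(1000))

hyp_off = dict(hyp, mosaic=0.0)  # Mosaic switched off, as described above
never = sum(use_mosaic(hyp_off, rng) for _ in range(1000))

print(always, never)
```

With mosaic set to 1.0 every sample goes through the mosaic path; with 0 none does, which is why setting the value to 0 disables the augmentation entirely.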

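However, I found two Mosaic implementations in the source code: a 4-image mosaic and a 9-image mosaic. To switch to the 9-image version, rename load_mosaic9() to load_mosaic() and comment out the original load_mosaic(), or simply swap the two names.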
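Renaming works, but the choice could also be made at run time. The following is only a sketch: `MosaicDataset`, `mosaic_tiles` and `load` are hypothetical names, and the two loaders are placeholders that return a string. Only the names `load_mosaic` and `load_mosaic9` come from the YOLOv5 source.

```python
class MosaicDataset:
    """Hypothetical stand-in for the dataset class; only the dispatch is of interest."""

    def __init__(self, mosaic_tiles=4):
        if mosaic_tiles not in (4, 9):
            raise ValueError("mosaic_tiles must be 4 or 9")
        self.mosaic_tiles = mosaic_tiles  # assumed setting, not a YOLOv5 option

    def load_mosaic(self, index):
        # Placeholder for the real 4-mosaic loader
        return f"4-mosaic for sample {index}"

    def load_mosaic9(self, index):
        # Placeholder for the real 9-mosaic loader
        return f"9-mosaic for sample {index}"

    def load(self, index):
        # Pick the loader from the setting instead of renaming functions
        loader = self.load_mosaic9 if self.mosaic_tiles == 9 else self.load_mosaic
        return loader(index)


print(MosaicDataset(4).load(3))
print(MosaicDataset(9).load(3))
```

This keeps both loaders intact, so switching back does not require editing the function names again.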

    import random

    import numpy as np
    from utils.augmentations import copy_paste, random_perspective  # YOLOv5 repository modules
    from utils.general import xyn2xy, xywhn2xyxy

    # Both functions below are methods of the YOLOv5 dataset class (note the `self` argument).

    def load_mosaic(self, index):
        # YOLOv5 4-mosaic loader. Loads 1 image + 3 random images into a 4-image mosaic
        labels4, segments4 = [], []
        s = self.img_size
        yc, xc = (int(random.uniform(-x, 2 * s + x)) for x in self.mosaic_border)  # mosaic center x, y
        indices = [index] + random.choices(self.indices, k=3)  # 3 additional image indices
        random.shuffle(indices)
        for i, index in enumerate(indices):
            # Load image
            img, _, (h, w) = self.load_image(index)

            # place img in img4
            if i == 0:  # top left
                img4 = np.full((s * 2, s * 2, img.shape[2]), 114, dtype=np.uint8)  # base image with 4 tiles
                x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc  # xmin, ymin, xmax, ymax (large image)
                x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h  # xmin, ymin, xmax, ymax (small image)
            elif i == 1:  # top right
                x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, s * 2), yc
                x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
            elif i == 2:  # bottom left
                x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(s * 2, yc + h)
                x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
            elif i == 3:  # bottom right
                x1a, y1a, x2a, y2a = xc, yc, min(xc + w, s * 2), min(s * 2, yc + h)
                x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)

            img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b]  # img4[ymin:ymax, xmin:xmax]
            padw = x1a - x1b
            padh = y1a - y1b

            # Labels
            labels, segments = self.labels[index].copy(), self.segments[index].copy()
            if labels.size:
                labels[:, 1:] = xywhn2xyxy(labels[:, 1:], w, h, padw, padh)  # normalized xywh to pixel xyxy format
                segments = [xyn2xy(x, w, h, padw, padh) for x in segments]
            labels4.append(labels)
            segments4.extend(segments)

        # Concat/clip labels
        labels4 = np.concatenate(labels4, 0)
        for x in (labels4[:, 1:], *segments4):
            np.clip(x, 0, 2 * s, out=x)  # clip when using random_perspective()
        # img4, labels4 = replicate(img4, labels4)  # replicate

        # Augment
        img4, labels4, segments4 = copy_paste(img4, labels4, segments4, p=self.hyp['copy_paste'])
        img4, labels4 = random_perspective(img4, labels4, segments4,
                                           degrees=self.hyp['degrees'],
                                           translate=self.hyp['translate'],
                                           scale=self.hyp['scale'],
                                           shear=self.hyp['shear'],
                                           perspective=self.hyp['perspective'],
                                           border=self.mosaic_border)  # border to remove

        return img4, labels4

    def load_mosaic9(self, index):
        # YOLOv5 9-mosaic loader. Loads 1 image + 8 random images into a 9-image mosaic
        labels9, segments9 = [], []
        s = self.img_size
        indices = [index] + random.choices(self.indices, k=8)  # 8 additional image indices
        random.shuffle(indices)
        hp, wp = -1, -1  # height, width previous
        for i, index in enumerate(indices):
            # Load image
            img, _, (h, w) = self.load_image(index)

            # place img in img9
            if i == 0:  # center
                img9 = np.full((s * 3, s * 3, img.shape[2]), 114, dtype=np.uint8)  # base image with 9 tiles
                h0, w0 = h, w
                c = s, s, s + w, s + h  # xmin, ymin, xmax, ymax (base) coordinates
            elif i == 1:  # top
                c = s, s - h, s + w, s
            elif i == 2:  # top right
                c = s + wp, s - h, s + wp + w, s
            elif i == 3:  # right
                c = s + w0, s, s + w0 + w, s + h
            elif i == 4:  # bottom right
                c = s + w0, s + hp, s + w0 + w, s + hp + h
            elif i == 5:  # bottom
                c = s + w0 - w, s + h0, s + w0, s + h0 + h
            elif i == 6:  # bottom left
                c = s + w0 - wp - w, s + h0, s + w0 - wp, s + h0 + h
            elif i == 7:  # left
                c = s - w, s + h0 - h, s, s + h0
            elif i == 8:  # top left
                c = s - w, s + h0 - hp - h, s, s + h0 - hp

            padx, pady = c[:2]
            x1, y1, x2, y2 = (max(x, 0) for x in c)  # allocate coords

            # Labels
            labels, segments = self.labels[index].copy(), self.segments[index].copy()
            if labels.size:
                labels[:, 1:] = xywhn2xyxy(labels[:, 1:], w, h, padx, pady)  # normalized xywh to pixel xyxy format
                segments = [xyn2xy(x, w, h, padx, pady) for x in segments]
            labels9.append(labels)
            segments9.extend(segments)

            # Image
            img9[y1:y2, x1:x2] = img[y1 - pady:, x1 - padx:]  # img9[ymin:ymax, xmin:xmax]
            hp, wp = h, w  # height, width previous

        # Offset
        yc, xc = (int(random.uniform(0, s)) for _ in self.mosaic_border)  # mosaic center x, y
        img9 = img9[yc:yc + 2 * s, xc:xc + 2 * s]

        # Concat/clip labels
        labels9 = np.concatenate(labels9, 0)
        labels9[:, [1, 3]] -= xc
        labels9[:, [2, 4]] -= yc
        c = np.array([xc, yc])  # centers
        segments9 = [x - c for x in segments9]

        for x in (labels9[:, 1:], *segments9):
            np.clip(x, 0, 2 * s, out=x)  # clip when using random_perspective()
        # img9, labels9 = replicate(img9, labels9)  # replicate

        # Augment
        img9, labels9 = random_perspective(img9, labels9, segments9,
                                           degrees=self.hyp['degrees'],
                                           translate=self.hyp['translate'],
                                           scale=self.hyp['scale'],
                                           shear=self.hyp['shear'],
                                           perspective=self.hyp['perspective'],
                                           border=self.mosaic_border)  # border to remove

        return img9, labels9
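To make the tile placement in the 4-mosaic loader easier to follow, here is a stripped-down sketch that runs on synthetic images with NumPy only. It reuses the same corner arithmetic, but several things are assumptions of mine: the function name `simple_mosaic4`, boxes given directly in pixel xyxy format, and a mosaic centre drawn from the middle half of the canvas (which corresponds to a mosaic_border of -s/2). The copy_paste and random_perspective steps are left out, so the result stays at 2s x 2s.

```python
import random

import numpy as np


def simple_mosaic4(images, boxes_per_image, s, rng):
    """Paste four images around a random centre on a (2s, 2s) grey canvas.

    images: four HxWx3 uint8 arrays with H, W <= s
    boxes_per_image: four (n, 4) float arrays of pixel xyxy boxes
    Returns the canvas and all boxes shifted into canvas coordinates.
    """
    canvas = np.full((2 * s, 2 * s, 3), 114, dtype=np.uint8)
    # Assumed border of -s/2: the centre lies between s/2 and 3s/2 on both axes
    xc = int(rng.uniform(s // 2, 2 * s - s // 2))
    yc = int(rng.uniform(s // 2, 2 * s - s // 2))
    shifted_boxes = []
    for i, (img, boxes) in enumerate(zip(images, boxes_per_image)):
        h, w = img.shape[:2]
        # (x1a, y1a, x2a, y2a): target window on the canvas
        # (x1b, y1b, x2b, y2b): matching window inside the source image
        if i == 0:  # top left
            x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc
            x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h
        elif i == 1:  # top right
            x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, 2 * s), yc
            x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
        elif i == 2:  # bottom left
            x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(2 * s, yc + h)
            x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
        else:  # bottom right
            x1a, y1a, x2a, y2a = xc, yc, min(xc + w, 2 * s), min(2 * s, yc + h)
            x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)
        canvas[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b]
        # Offset of the source image origin on the canvas; boxes move by the same amount
        padw, padh = x1a - x1b, y1a - y1b
        moved = boxes + np.array([padw, padh, padw, padh], dtype=boxes.dtype)
        shifted_boxes.append(np.clip(moved, 0, 2 * s))
    return canvas, np.concatenate(shifted_boxes, 0)


rng = random.Random(0)
s = 64
# Four flat-coloured images so each tile is easy to tell apart
images = [np.full((s, s, 3), 40 * (k + 1), dtype=np.uint8) for k in range(4)]
boxes = [np.array([[8.0, 8.0, 24.0, 24.0]]) for _ in range(4)]
mosaic, labels = simple_mosaic4(images, boxes, s, rng)
print(mosaic.shape, labels.shape)
```

Unlike the full loader, this sketch only clips boxes to the canvas and does not discard boxes that end up with zero area after clipping.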

Finally, here are some images from my own training runs after Mosaic augmentation. These are from the 4-mosaic version:


These are from the 9-mosaic version:


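Series Contents

1. Hands-on hyperparameter tuning for YOLOv5 (v6.1), part 1 🌟 (highly recommended)
2. Hands-on hyperparameter tuning for YOLOv5 (v6.1), part 2 🚀
3. How to quickly train a YOLOv5 model on your own dataset
4. Hands-on: adding attention mechanisms to YOLOv5 (v6.1), part 1 (with diagrams of 30+ attention mechanisms from top conferences) 🌟
5. Hands-on: adding attention mechanisms to YOLOv5 (v6.1), part 2 (adding attention inside the C3 module)
6. How to change the activation function in YOLOv5
7. YOLOv5 (v6.1) data augmentation explained
8. Changing the upsampling method in YOLOv5 (nearest neighbour / bilinear / bicubic / trilinear / transposed convolution)
9. How to switch YOLOv5 to EIoU / alpha-IoU / SIoU
10. Replacing the YOLOv5 backbone with ShuffleNetV2, Megvii's lightweight convolutional network 🍀
11. Applying the lightweight general-purpose upsampling operator CARAFE in YOLOv5 🍀
12. Improving spatial pyramid pooling: SPP / SPPF / ASPP / RFB / SPPCSPC 🍀
13. More to come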

