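Deploying YOLOv11 to train a click CAPTCHA detector

1 Local environment setup

Note: some of the images are screenshots taken on my laptop, so they may be a little blurry.

1.1 Checking the CUDA version

1.1.1 Check your graphics card

(An NVIDIA card is best; with an AMD card it seems you can only train on the CPU.)

1.1.2 Check the CUDA version

The CUDA Version field in the header of the output is the highest CUDA version the installed driver supports.

nvidia-smi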
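To read the CUDA version from the `nvidia-smi` output in a script instead of by eye, a minimal sketch along these lines could work. It assumes the header contains the text `CUDA Version: <major>.<minor>`; the function names `parse_cuda_version` and `detect_cuda_version` are made up for this illustration.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_cuda_version(smi_output: str) -> Optional[str]:
    """Return the 'CUDA Version' field found in nvidia-smi text, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def detect_cuda_version() -> Optional[str]:
    """Run nvidia-smi if it is on PATH and parse its output (hypothetical helper)."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools found on this machine
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_cuda_version(result.stdout)
```

Calling `detect_cuda_version()` returns a string such as `"11.0"` on a machine with an NVIDIA driver, and `None` otherwise.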

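1.2 Installing Miniconda

1.2.1 Download link

https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/

1.2.2 Choosing a version

Pick the installer that matches the Python version you plan to use later. Any version also works, since each conda environment can carry its own Python version.

1.2.3 Installation

Remember to leave the third option unchecked and check all the others.

1.2.4 Check that the installation succeeded

conda --version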

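1.3 Switching the conda channel source

1.3.1 Common commands

# Show the configured channels
conda config --show channels

# Remove the mirror channels and restore the defaults
conda config --remove-key channels

1.3.2 Switching to the Tsinghua mirror

See the Anaconda help page of the Tsinghua Open Source Mirror (anaconda | 镜像站使用帮助 | 清华大学开源软件镜像站).
Replace the contents of C:\Users\<YourUserName>\.condarc with the following: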

channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
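If you would rather not edit the file by hand, the same configuration can be written from Python. This is only a sketch: `write_condarc` is a hypothetical helper, and it assumes you want any existing file copied to `.condarc.bak` before it is overwritten.

```python
import shutil
from pathlib import Path
from typing import Optional

# Same content as the .condarc shown above.
CONDARC_TEXT = """\
channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
"""


def write_condarc(target: Path, text: str = CONDARC_TEXT) -> Optional[Path]:
    """Write `text` to `target`; return the backup path if a file was replaced."""
    backup = None
    if target.exists():
        # Keep the previous configuration next to the new one.
        backup = target.with_name(target.name + ".bak")
        shutil.copy2(target, backup)
    target.write_text(text, encoding="utf-8")
    return backup
```

A typical call would be `write_condarc(Path.home() / ".condarc")`; on Windows `Path.home()` resolves to the `C:\Users\<YourUserName>` folder.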

1.3.3 Clearing the index cache

conda clean -i

1.3.4 Checking the configuration

conda info

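1.4 Creating a virtual environment

1.4.1 Common commands

# List the existing virtual environments
conda env list

# Activate a virtual environment
conda activate name

# Leave the current virtual environment
conda deactivate

# Delete a virtual environment
conda remove -n name --all

1.4.2 Create the virtual environment

conda create -n yolov11 python=3.8

1.4.3 Activate the virtual environment

conda activate yolov11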

1.5 Installing the PyTorch framework

1.5.1 Partial version compatibility table

torch	torchvision	Python
`main` / `nightly`	`main` / `nightly`	`>=3.8`, `<=3.12`
`2.3`	`0.18`	`>=3.8`, `<=3.12`
`2.2`	`0.17`	`>=3.8`, `<=3.11`
`2.1`	`0.16`	`>=3.8`, `<=3.11`
`2.0`	`0.15`	`>=3.8`, `<=3.11`
`1.13`	`0.14`	`>=3.7.2`, `<=3.10`
`1.12`	`0.13`	`>=3.7`, `<=3.10`
`1.11`	`0.12`	`>=3.7`, `<=3.10`
`1.10`	`0.11`	`>=3.6`, `<=3.9`
`1.9`	`0.10`	`>=3.6`, `<=3.9`
`1.8`	`0.9`	`>=3.6`, `<=3.9`
`1.7`	`0.8`	`>=3.6`, `<=3.9`
`1.6`	`0.7`	`>=3.6`, `<=3.8`
`1.5`	`0.6`	`>=3.5`, `<=3.8`
`1.4`	`0.5`	`==2.7`, `>=3.5`, `<=3.8`
`1.3`	`0.4.2` `0.4.3`	`==2.7`, `>=3.5`, `<=3.7`
`1.2`	`0.4.1`	`==2.7`, `>=3.5`, `<=3.7`
`1.1`	`0.3`	`==2.7`, `>=3.5`, `<=3.7`
`<=1.0`	`0.2`	`==2.7`, `>=3.5`, `<=3.7`
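The table can also be used as a small lookup when deciding which Python version to create the environment with. The sketch below copies only a few rows; the dictionary and the function name `python_supported` are illustrative, and patch-level bounds such as 3.7.2 are ignored.

```python
from typing import Dict, Tuple

Version = Tuple[int, int]

# (lowest, highest) supported Python per torch release, copied from the table.
TORCH_PYTHON_RANGE: Dict[str, Tuple[Version, Version]] = {
    "2.3": ((3, 8), (3, 12)),
    "2.0": ((3, 8), (3, 11)),
    "1.13": ((3, 7), (3, 10)),  # table says >=3.7.2; patch level ignored here
    "1.7": ((3, 6), (3, 9)),
    "1.6": ((3, 6), (3, 8)),
}


def python_supported(torch_version: str, python_version: str) -> bool:
    """Check a Python version against the range listed for a torch release."""
    key = ".".join(torch_version.split(".")[:2])  # "1.7.1" -> "1.7"
    low, high = TORCH_PYTHON_RANGE[key]
    py = tuple(int(part) for part in python_version.split(".")[:2])
    return low <= py <= high


print(python_supported("1.7.1", "3.8"))   # True
print(python_supported("1.7.1", "3.10"))  # False
```

Versions are compared as integer tuples rather than strings, so that 3.10 correctly sorts after 3.9.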

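1.5.2 Installing PyTorch

1.5.2.1 Notes

My CUDA version is 11.0, so the matching PyTorch versions are 1.7.1 and 1.7.0.
PyTorch 1.7 supports Python 3.6 to 3.9, so I chose Python 3.8.
If you need a different Python version, recreate the conda virtual environment with that version and make sure it falls within the range PyTorch supports.
Check the Python version inside the conda environment (for example with python --version).

1.5.2.2 PyTorch download page

https://pytorch.org/get-started/previous-versions/

1.5.2.3 Install PyTorch

# Command as listed on the page above
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch

# Remember to drop the trailing -c pytorch: it selects the official channel, which makes the download very slow.
# Use the following command instead.

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0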

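1.5.3 Verifying the installation

# Start the Python interpreter, then enter the statements below
python
import torch
# torch version
torch.__version__
# CUDA version torch was built with
torch.version.cuda
# name of the first GPU
torch.cuda.get_device_name(0)
# whether CUDA is available
torch.cuda.is_available()
# cuDNN version
torch.backends.cudnn.version()
# whether cuDNN is available
torch.backends.cudnn.is_available()
# number of GPUs
torch.cuda.device_count()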

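2 Deploying on a cloud platform

2.1 Configuring VS Code

Install the VS Code extensions Remote-SSH and Chinese (Simplified).

2.2 Renting a GPU

Pay attention to the environment versions when choosing an instance.

2.3 Logging in to the cloud server

2.4 Creating a virtual environment

2.4.1 Initialize conda

conda init

2.4.2 Create a new environment

conda create --name yolov11 python=3.10 -y

2.4.3 Activate the environment

(Remember to reopen the terminal first.)

conda activate yolov11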

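3 Training

3.1 Preparation

3.1.1 Download the code

git clone https://github.com/ultralytics/ultralytics

3.1.2 Install the dependencies

Run this from inside the cloned ultralytics directory:

pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple

3.1.3 Download the weights

# Alternatively, continue training from your own weights file
wget https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt

3.1.4 Upload files

Upload the dataset you prepared.

After uploading, move the files from the autodl-fs directory to the autodl-tmp directory.

Pay attention to the directory layout of the training set:

shujuji
	- images
		- train
		- val
	- labels
		- train
		- val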
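If your images and label files currently sit in two flat folders, a script can produce this layout. The following is a rough sketch, not part of the original workflow: `split_dataset`, the 20% validation ratio and the accepted image extensions are all assumptions chosen for illustration.

```python
import random
import shutil
from pathlib import Path
from typing import Tuple

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def split_dataset(src_images: Path, src_labels: Path, root: Path,
                  val_ratio: float = 0.2, seed: int = 0) -> Tuple[int, int]:
    """Copy image/label pairs into root/{images,labels}/{train,val}.

    Returns (number of training pairs, number of validation pairs).
    """
    images = sorted(p for p in src_images.iterdir()
                    if p.suffix.lower() in IMAGE_SUFFIXES)
    # Keep only images that have a label file with the same stem.
    pairs = [(img, src_labels / f"{img.stem}.txt") for img in images]
    pairs = [(img, lbl) for img, lbl in pairs if lbl.exists()]

    random.Random(seed).shuffle(pairs)  # reproducible shuffle
    n_val = int(len(pairs) * val_ratio)

    for i, (img, lbl) in enumerate(pairs):
        split = "val" if i < n_val else "train"
        for src, kind in ((img, "images"), (lbl, "labels")):
            dest_dir = root / kind / split
            dest_dir.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest_dir / src.name)
    return len(pairs) - n_val, n_val
```

For example, `split_dataset(Path("raw/images"), Path("raw/labels"), Path("shujuji"))` would fill the `shujuji` tree shown above; the `raw/...` paths are placeholders.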

3.1.5 Script and configuration files

Inference script `detect.py`

from ultralytics import YOLO


if __name__ == '__main__':
    # Load the model
    model = YOLO(r'/root/autodl-tmp/best.pt')  # trained weights
    model.predict(
        source=r'/root/autodl-tmp/img/big',
        save=True,  # save the prediction results
        project='runs/predict',  # project name (optional)
        name='exp',  # experiment name; results are saved under 'project/name' (optional)
    )

Training script `train.py`

from ultralytics import YOLO


if __name__ == '__main__':
    # Load the model
    model = YOLO('yolo11n.yaml', task="detect")  # train without pretrained weights | task: detect, segment, classify, pose, obb
    # model = YOLO('yolo11n.yaml').load("yolo11n.pt")  # train starting from the pretrained weights
    # Training arguments ------------------------------------------------------------------------------------
    model.train(
        data='coco128.yaml',
        epochs=100,  # (int) number of training epochs
        patience=50,  # (int) epochs to wait without noticeable improvement before early stopping
        batch=32,  # (int) images per batch (-1 for auto-batch)
        imgsz=640,  # (int) input image size, an int or w,h
        save=True,  # (bool) save training checkpoints and prediction results
        save_period=-1,  # (int) save a checkpoint every x epochs (disabled if < 1)
        cache=True,  # (bool | str) True/ram, disk or False; cache images for data loading
        device='',  # (int | str | list, optional) device to run on, e.g. device=0, device=0,1,2,3 or device=cpu
        workers=8,  # (int) data-loading worker threads (per DDP process)
        project='runs/train',  # (str, optional) project name
        name='exp',  # (str, optional) experiment name; results are saved under 'project/name'
        pretrained=True,  # (bool | str) whether to use a pretrained model (bool), or a model to load weights from (str)
        optimizer='SGD',  # (str) optimizer, choices=[SGD, Adam, Adamax, AdamW, NAdam, RAdam, RMSProp, auto]
        verbose=True,  # (bool) print verbose output
        seed=0,  # (int) random seed for reproducibility
        close_mosaic=0,  # (int) disable mosaic augmentation for the final epochs (0 turns this behaviour off)
        resume=False,  # (bool) resume training from the last checkpoint
        amp=False,  # (bool) automatic mixed precision (AMP) training, choices=[True, False]; True runs the AMP check
    )

Evaluation script `val.py`

from ultralytics import YOLO


if __name__ == '__main__':
    # Load the model to evaluate (here the trained weights also used in detect.py)
    model = YOLO(r'/root/autodl-tmp/best.pt')
    # Validate the model
    metrics = model.val(
        data=r'coco128.yaml',
        split='val',  # (str) dataset split used for validation, e.g. 'val', 'test' or 'train'
        batch=1,  # (int) images per batch
        imgsz=640,  # input image size, an int or w,h
        device=0,  # device to run on, e.g. device=0, device=0,1,2,3 or device=cpu
        workers=8,  # data-loading worker threads (per DDP process)
        iou=0.6,  # IoU threshold for non-maximum suppression (NMS)
        project='runs/val',  # project name (optional)
        name='exp',  # experiment name; results are saved under 'project/name' (optional)
    )
    print(f"mAP50-95: {metrics.box.map}")  # mAP at IoU 0.50:0.95
    print(f"mAP50: {metrics.box.map50}")  # mAP at IoU 0.50
    print(f"mAP75: {metrics.box.map75}")  # mAP at IoU 0.75
    speed_metrics = metrics.speed  # per-image times in milliseconds
    total_time = sum(speed_metrics.values())
    fps = 1000 / total_time
    print(f"FPS: {fps}")  # frames per second
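The FPS figure at the end of `val.py` is plain arithmetic on the per-image stage times, which are reported in milliseconds. A tiny standalone sketch of that calculation follows; `fps_from_speed` is a made-up helper and the numbers are illustrative, not measured results.

```python
from typing import Mapping


def fps_from_speed(speed_ms: Mapping[str, float]) -> float:
    """Convert per-image stage times in milliseconds into frames per second."""
    total_ms = sum(speed_ms.values())
    if total_ms <= 0:
        raise ValueError("total time must be positive")
    return 1000.0 / total_ms


# Illustrative numbers only: 0.5 + 3.0 + 0.0 + 1.5 = 5.0 ms per image.
example_speed = {"preprocess": 0.5, "inference": 3.0, "loss": 0.0, "postprocess": 1.5}
print(fps_from_speed(example_speed))  # 200.0
```

Note that this counts preprocessing and postprocessing as well as inference, so it is an end-to-end estimate for a single image.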

Dataset configuration file `coco128.yaml`

File location: /root/autodl-tmp/ultralytics/ultralytics/cfg/datasets/coco128.yaml

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# COCO128 dataset https://www.kaggle.com/datasets/ultralytics/coco128 (first 128 images from COCO train2017) by Ultralytics
# Documentation: https://docs.ultralytics.com/datasets/detect/coco/
# Example usage: yolo train data=coco128.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco128  ← downloads here (7 MB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
# path: autodl-tmp/img # dataset root dir (unused here; absolute paths are given below)
train: /root/autodl-tmp/img/images/train # train images (absolute path)
val: /root/autodl-tmp/img/images/val # val images (absolute path)
test: # test images (optional)

# Classes
names:
  0: text

# Download script/URL (optional)
download: https://github.com/ultralytics/assets/releases/download/v0.0.0/coco128.zip
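Only the image folders are listed in this file. As far as I understand it, Ultralytics locates each label file by swapping the last `images` directory in the image path for `labels` and changing the extension to `.txt`. The sketch below mirrors that rule so you can spot images without labels before training; `label_path_for` and `find_unlabelled` are hypothetical helpers, not library functions.

```python
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def label_path_for(image_path: Path) -> Path:
    """Map .../images/<split>/x.jpg to .../labels/<split>/x.txt.

    Raises ValueError if no path component is named 'images'.
    """
    parts = list(image_path.parts)
    # Index of the last component that is exactly 'images'.
    idx = len(parts) - 1 - parts[::-1].index("images")
    parts[idx] = "labels"
    return Path(*parts).with_suffix(".txt")


def find_unlabelled(images_dir: Path) -> List[Path]:
    """Return the images in images_dir that have no matching label file."""
    return sorted(
        p for p in images_dir.iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES and not label_path_for(p).exists()
    )
```

Running `find_unlabelled(Path("/root/autodl-tmp/img/images/train"))` on the server should return an empty list when every training image has a label file.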

3.2 Training

3.2.1 Test the model

python detect.py

(Screenshot: folder containing the test results)

(Screenshot: test output)

3.2.2 Train the model

python train.py
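With the `project` and `name` values used in `train.py`, the weights should end up under `runs/train/<name>/weights/`, and repeated runs get a numbered folder each. A small sketch for picking the most recently written `best.pt` is shown below; `latest_best_weights` is a hypothetical helper and the folder layout is my assumption about the default Ultralytics output.

```python
from pathlib import Path
from typing import Optional


def latest_best_weights(project_dir: Path) -> Optional[Path]:
    """Return the newest <run>/weights/best.pt under project_dir, or None."""
    candidates = list(project_dir.glob("*/weights/best.pt"))
    if not candidates:
        return None
    # Pick the file with the most recent modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For instance, `latest_best_weights(Path("runs/train"))` gives the path you can copy to `/root/autodl-tmp/best.pt` for the inference script.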

4 Export and usage

4.1 Converting the .pt file to an ONNX file

from ultralytics import YOLO

# Load the YOLO11 model
model = YOLO("yolo11n.pt")

# Export the model to ONNX format
model.export(format="onnx")  # creates 'yolo11n.onnx'

4.2 Usage

import time

import cv2
import numpy as np
import onnxruntime as ort


class ONNXObjectDetector:
    def __init__(self, image_path):
        """
        Initialize the ONNX object detector.
        - Load the ONNX model
        - Set the class names, input size, confidence threshold and other parameters
        """
        self.image_path = image_path
        self.model_path = '../best3.onnx'  # path to the ONNX model
        self.class_names = ["text"]  # names of the detected classes
        self.input_size = (512, 512)  # model input size; must match the size the model was exported with
        self.conf_threshold = 0.5  # confidence threshold
        self.nms_threshold = 0.45  # non-maximum suppression (IoU) threshold
        self.colors = np.random.uniform(0, 255, size=(len(self.class_names), 3))  # random color per class
        self.session = ort.InferenceSession(self.model_path)  # load the ONNX model

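    def preprocess_image(self):
        """
        Preprocess the input image so that it fits the model input.
        :return: the preprocessed blob, the original image and the scale factor
        """
        original_image = cv2.imread(self.image_path)  # read the image (BGR)
        if original_image is None:
            raise FileNotFoundError(f"Cannot read image: {self.image_path}")
        height, width, _ = original_image.shape  # image size
        length = max(height, width)  # longest side of the original image
        image = np.zeros((length, length, 3), np.uint8)  # square black canvas
        image[0:height, 0:width] = original_image  # paste the original into the top-left corner

        scale = length / self.input_size[0]  # maps model coordinates back to the original image
        blob = cv2.dnn.blobFromImage(image, scalefactor=1 / 255.0, size=self.input_size, swapRB=True)
        return blob, original_image, scale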

    def detect_objects(self):
        """
        Run object detection.
        :return: the list of detections and the original image
        """
        blob, original_image, scale = self.preprocess_image()  # preprocess the image
        input_name = self.session.get_inputs()[0].name  # name of the model input
        outputs = self.session.run(None, {input_name: blob})  # run inference
        outputs = np.array([cv2.transpose(outputs[0][0])])  # transpose so that each row is one candidate box

        rows = outputs.shape[1]  # number of candidate boxes
        boxes, scores, class_ids = [], [], []

        # Parse the raw predictions
        for i in range(rows):
            class_scores = outputs[0][i][4:]  # scores of all classes
            _, max_score, _, (x, max_class_idx) = cv2.minMaxLoc(class_scores)  # best class and its score
            if max_score >= self.conf_threshold:  # keep the box if the score reaches the threshold
                box = [
                    outputs[0][i][0] - (0.5 * outputs[0][i][2]),  # top-left x
                    outputs[0][i][1] - (0.5 * outputs[0][i][3]),  # top-left y
                    outputs[0][i][2],  # width
                    outputs[0][i][3],  # height
                ]
                boxes.append(box)
                scores.append(max_score)
                class_ids.append(max_class_idx)
        # print(boxes)
        # Apply non-maximum suppression (NMS)
        result_indices = cv2.dnn.NMSBoxes(boxes, scores, self.conf_threshold, self.nms_threshold)
        detections = []
        if len(result_indices) > 0:
            for i in np.array(result_indices).flatten():
                box = boxes[i]
                detection = {
                    "class_id": class_ids[i],
                    "class_name": self.class_names[class_ids[i]],
                    "confidence": scores[i],
                    "box": [  # x1, y1, x2, y2 in original-image pixels
                        round(box[0] * scale),
                        round(box[1] * scale),
                        round((box[0] + box[2]) * scale),
                        round((box[1] + box[3]) * scale),
                    ],
                }
                detections.append(detection)

        return detections, original_image

    def draw_detections(self, image, detections):
        """
        Draw the detections on the image.
        :param image: the original image
        :param detections: the list of detections
        """
        for detection in detections:
            x1, y1, x2, y2 = detection["box"]
            class_id = detection["class_id"]
            color = tuple(int(c) for c in self.colors[class_id])  # color assigned to this class
            label = f"{detection['class_name']} ({detection['confidence']:.2f})"  # label text

            cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)  # draw the bounding box
            cv2.putText(image, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)  # draw the class label

        cv2.imshow("Detections", image)  # show the result
        cv2.waitKey(0)
        cv2.destroyAllWindows()  # close the window


# Usage example
if __name__ == "__main__":
    start = time.time()
    image_path = r"test.jpg"

    detector = ONNXObjectDetector(image_path)  # create the detector (this loads the model)
    detections, processed_image = detector.detect_objects()  # run detection
    end = time.time()
    print(end - start)  # elapsed time in seconds, including model loading
    if detections:
        print("Detections:")
        for det in detections:
            print(f"class: {det['class_name']}, confidence: {det['confidence']:.2f}, box: {det['box']}")
        detector.draw_detections(processed_image, detections)
    else:
        print("Nothing detected: the training data may be insufficient, or the image contains no target")
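The detector above leaves the overlap filtering to `cv2.dnn.NMSBoxes`. To make that step less of a black box, here is a pure-Python sketch of greedy non-maximum suppression on boxes in the same `[x, y, width, height]` format. It illustrates the general technique and is not OpenCV's actual implementation; `iou_xywh` and `greedy_nms` are names made up for this example.

```python
from typing import List, Sequence


def iou_xywh(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two [x, y, w, h] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Sequence[float]], scores: Sequence[float],
               score_threshold: float, iou_threshold: float) -> List[int]:
    """Return indices of the boxes kept, highest score first."""
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou_xywh(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [50, 50, 10, 10]]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, 0.5, 0.45))  # [0, 2]
```

The first two boxes overlap with an IoU of 81/119, about 0.68, so the lower-scoring one is dropped, while the third box does not overlap anything and is kept.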

5 Supplement

5.1 Converting XML annotations to TXT (batch)

import os
import xml.etree.ElementTree as ET

import cv2


def save_yolo_labels(yolo_labels, txt_file):
    """
    Save YOLO-format annotations to a txt file.

    :param yolo_labels: annotation lines in YOLO format
    :param txt_file: path of the txt file to write
    """
    with open(txt_file, 'w') as f:
        for label in yolo_labels:
            f.write(label + '\n')


def convert_to_yolo_format(xml_file, image_width, image_height):
    """Convert the bounding boxes of one XML annotation file into YOLO label lines."""
    tree = ET.parse(xml_file)
    root = tree.getroot()
    yolo_labels = []

    for obj in root.findall('object'):
        class_id = 0  # single class ("text"), so the class id is always 0

        # Bounding-box corners in pixels
        xmin = int(obj.find('.//bndbox/xmin').text)
        ymin = int(obj.find('.//bndbox/ymin').text)
        xmax = int(obj.find('.//bndbox/xmax').text)
        ymax = int(obj.find('.//bndbox/ymax').text)

        # YOLO format: box center and size, normalized by the image size
        x_center = (xmin + xmax) / 2 / image_width
        y_center = (ymin + ymax) / 2 / image_height
        width = (xmax - xmin) / image_width
        height = (ymax - ymin) / image_height

        # Collect one label line per object
        yolo_labels.append(f"{class_id} {x_center} {y_center} {width} {height}")

    return yolo_labels


def convert_labels_to_yolo(xml_dir, image_dir, output_dir):
    """
    Convert every XML file in a directory into a YOLO-format txt file.

    :param xml_dir: directory containing the XML files
    :param image_dir: directory containing the image files
    :param output_dir: directory where the txt files are written
    """
    # Make sure the output directory exists
    os.makedirs(output_dir, exist_ok=True)

    # Iterate over the XML files
    for xml_file in os.listdir(xml_dir):
        if not xml_file.endswith('.xml'):
            continue

        # Read the image to get its width and height
        image_name = xml_file.replace('.xml', '.jpg')  # assumes the images are jpg files
        image_path = os.path.join(image_dir, image_name)
        image = cv2.imread(image_path)
        if image is None:
            print(f"Skipping {xml_file}: cannot read {image_path}")
            continue
        image_height, image_width, _ = image.shape

        # Parse the XML and convert it to YOLO format
        xml_path = os.path.join(xml_dir, xml_file)
        yolo_labels = convert_to_yolo_format(xml_path, image_width, image_height)

        # Save to a txt file
        txt_file = os.path.join(output_dir, xml_file.replace('.xml', '.txt'))
        save_yolo_labels(yolo_labels, txt_file)


if __name__ == '__main__':
    # Usage example
    xml_dir = r'D:\tools\labelimg\shuju\labels\train'  # folder containing the XML annotation files
    image_dir = r'D:\tools\labelimg\shuju\images\train'  # folder containing the image files
    output_dir = r'D:\tools\labelimg\shuju\labels\a'  # folder where the txt files are written
    convert_labels_to_yolo(xml_dir, image_dir, output_dir)
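As a quick sanity check of the normalization used in `convert_to_yolo_format`, the following sketch applies the same formulas to one box and then inverts them. The helper names `voc_to_yolo` and `yolo_to_voc` and the example numbers are made up for illustration.

```python
from typing import Tuple


def voc_to_yolo(xmin: int, ymin: int, xmax: int, ymax: int,
                img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Pixel corners -> normalized (x_center, y_center, width, height)."""
    return (
        (xmin + xmax) / 2 / img_w,
        (ymin + ymax) / 2 / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    )


def yolo_to_voc(xc: float, yc: float, w: float, h: float,
                img_w: int, img_h: int) -> Tuple[int, int, int, int]:
    """Normalized center/size -> pixel corners (rounded to integers)."""
    return (
        round((xc - w / 2) * img_w),
        round((yc - h / 2) * img_h),
        round((xc + w / 2) * img_w),
        round((yc + h / 2) * img_h),
    )


# A 200x100 px box with corners (100, 50) and (300, 150) in a 400x200 image.
yolo_box = voc_to_yolo(100, 50, 300, 150, 400, 200)
print(yolo_box)                          # (0.5, 0.5, 0.5, 0.5)
print(yolo_to_voc(*yolo_box, 400, 200))  # (100, 50, 300, 150)
```

All four YOLO values should fall between 0 and 1; a value outside that range usually means the image size and the annotation do not belong together.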

