Experiment Overview
ResNet (Residual Network) is a classic deep-learning architecture proposed in 2015; its residual connections address the vanishing-gradient problem in deep networks. This experiment trains ResNet-18 on the CIFAR-10 dataset and compares the results against a plain CNN baseline.
Experiment Environment
Hardware Environment
GPU: NVIDIA GeForce RTX 5090 Laptop GPU (as reported in the training log below).
Software Environment
PyTorch with torchvision, running on CUDA (exact versions were not recorded in the log).
Hyperparameter Configuration
The main hyperparameters, taken from the training script below: optimizer Adam (lr=0.001), loss CrossEntropyLoss, batch size 128 (train) / 256 (test), 10 epochs.
Dataset Description
CIFAR-10 Dataset
CIFAR-10 contains 60,000 32x32 RGB images in 10 classes: 50,000 training images and 10,000 test images.
Class Label Table
0 Airplane, 1 Automobile, 2 Bird, 3 Cat, 4 Deer, 5 Dog, 6 Frog, 7 Horse, 8 Ship, 9 Truck.
Data Preprocessing
# Training-set preprocessing (with data augmentation)
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),    # random crop
    transforms.RandomHorizontalFlip(),       # random horizontal flip
    transforms.ToTensor(),                   # to tensor in [0, 1]
    transforms.Normalize(                    # normalize
        mean=[0.4914, 0.4822, 0.4465],       # per-channel RGB means
        std=[0.2470, 0.2435, 0.2616]         # per-channel RGB stds
    )
])

# Test-set preprocessing (no augmentation)
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.4914, 0.4822, 0.4465],
        std=[0.2470, 0.2435, 0.2616]
    )
])
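The normalization statistics above are the commonly cited per-channel values for the CIFAR-10 training set. For reference, here is a minimal sketch of how one might recompute them (an illustration, assuming torchvision is available and the dataset lives in ./data; it loads all 50,000 images into memory at once):

import torch
from torchvision import transforms
from torchvision.datasets import CIFAR10

# Load the training set as raw [0, 1] tensors, without augmentation
dataset = CIFAR10(root='./data', train=True, download=True, transform=transforms.ToTensor())
data = torch.stack([img for img, _ in dataset])  # shape (50000, 3, 32, 32), roughly 600 MB

print(data.mean(dim=(0, 2, 3)))  # per-channel means ≈ [0.4914, 0.4822, 0.4465]
print(data.std(dim=(0, 2, 3)))   # per-channel stds  ≈ [0.2470, 0.2435, 0.2616]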
Model Architecture
ResNet-18 Network Structure
ResNet-18 is a deep convolutional neural network with residual (skip) connections:
ResNet-18 (adapted for CIFAR-10)
├── Stem: Conv3x3(3→64, stride=1) + BN + ReLU   (MaxPool removed)
│
├── Stage 1 (layer1): 2 × BasicBlock(64→64)
│     each block: Conv3x3 + BN + ReLU → Conv3x3 + BN → Add(x + F(x)) → ReLU
│
├── Stage 2 (layer2): 2 × BasicBlock(64→128), first block stride=2
│     shortcut: 1x1 Conv(64→128, stride=2) + BN, then Add(x + F(x))
│
├── Stage 3 (layer3): 2 × BasicBlock(128→256), first block stride=2
│     shortcut: 1x1 Conv(128→256, stride=2) + BN, then Add(x + F(x))
│
├── Stage 4 (layer4): 2 × BasicBlock(256→512), first block stride=2
│     shortcut: 1x1 Conv(256→512, stride=2) + BN, then Add(x + F(x))
│
├── AdaptiveAvgPool2d: 4x4 → 1x1   (with 32x32 inputs; the ImageNet version pools 7x7)
└── FC Layer: 512 → 10
Residual Connection Principle
The core of a residual network is the residual block:

- Plain network: output F(x)
- Residual network: output F(x) + x

When the optimal mapping is close to the identity, the block only has to drive F(x) → 0; the identity shortcut also lets gradients flow directly back to shallow layers.
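To make the residual form concrete, here is a minimal PyTorch sketch of a basic residual block in the style used by ResNet-18 (the class and attribute names below are illustrative, not the exact torchvision source):

import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Minimal residual block sketch: out = ReLU(F(x) + shortcut(x))."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # 1x1 projection when the shape changes, otherwise the identity
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))  # the residual addition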
CIFAR-10 Adaptation
The stock torchvision ResNet-18 targets 224x224 ImageNet inputs; for 32x32 CIFAR-10 images this experiment makes three changes (implemented in the model code below):
- conv1 is replaced: 7x7 stride-2 → 3x3 stride-1, to avoid premature downsampling
- the first MaxPool layer is replaced with Identity, preserving spatial detail
- the final fully connected layer outputs 10 classes instead of 1000
Model Parameter Count
Total parameters: 11,173,962 (about 11.17M), all of them trainable (both counts are logged by the script below).
Full Code
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
ResNet-18 CIFAR-10 彩色图像分类器
基于PyTorch的深度学习残差网络实验
类别标签:
0: 飞机 (Airplane)
1: 汽车 (Automobile)
2: 鸟 (Bird)
3: 猫 (Cat)
4: 鹿 (Deer)
5: 狗 (Dog)
6: 青蛙 (Frog)
7: 马 (Horse)
8: 船 (Ship)
9: 卡车 (Truck)
作者: AI Assistant
日期: 2026-03-31
"""
import os
import time
import logging
from datetime import datetime
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import CIFAR10
import torchvision.models as models
# ============================================================
# 1. GPU device configuration and selection
# ============================================================
def get_device():
    """Automatically detect and select the best available compute device."""
    if torch.cuda.is_available():
        device = torch.device("cuda")
        gpu_name = torch.cuda.get_device_name(0)
        gpu_count = torch.cuda.device_count()
        print(f"[GPU] Using CUDA device: {gpu_name}")
        print(f"[GPU] GPU count: {gpu_count}")
        print(f"[GPU] CUDA version: {torch.version.cuda}")
        mem_allocated = torch.cuda.memory_allocated(0) / 1024**2
        mem_reserved = torch.cuda.memory_reserved(0) / 1024**2
        print(f"[GPU] Memory allocated: {mem_allocated:.2f} MB, reserved: {mem_reserved:.2f} MB")
    elif torch.backends.mps.is_available():
        device = torch.device("mps")
        print("[GPU] Using Apple MPS device")
    else:
        device = torch.device("cpu")
        print("[CPU] Using CPU device")
    return device
def setup_logger(log_file):
"""配置日志:同时输出到文件和控制台"""
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
logger.handlers.clear()
file_handler = logging.FileHandler(log_file, encoding='utf-8')
file_handler.setLevel(logging.INFO)
file_formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
file_handler.setFormatter(file_formatter)
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.INFO)
console_formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
console_handler.setFormatter(console_formatter)
logger.addHandler(file_handler)
logger.addHandler(console_handler)
return logger
# ============================================================
# 2. ResNet-18 model definition (adapted for CIFAR-10 32x32 images)
# ============================================================
class CIFAR10ResNet(nn.Module):
    """
    ResNet-18 adapter, tuned for CIFAR-10 32x32 images.

    The stock ResNet targets 224x224 ImageNet inputs; this model makes
    the following changes:
      1. The initial convolution uses stride=1 instead of 2 (avoids
         premature downsampling).
      2. The first MaxPool layer is removed to preserve spatial detail.
      3. The final fully connected layer outputs 10 classes.

    Structure: stem conv + 4 residual stages + avg-pool + FC
    Input:  32x32 RGB images
    Output: 10 class logits
    """
    def __init__(self, num_classes=10):
        super().__init__()
        # Load ImageNet-pretrained ResNet-18
        self.resnet = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        # Adapt the stem for 32x32 inputs:
        # original conv1: 7x7, stride=2, padding=3
        # replaced with:  3x3, stride=1, padding=1 (keeps more detail)
        self.resnet.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
        # Remove the original MaxPool layer
        self.resnet.maxpool = nn.Identity()
        # Replace the final fully connected layer
        self.resnet.fc = nn.Linear(512, num_classes)
        # Re-initialize the new head
        nn.init.xavier_uniform_(self.resnet.fc.weight)
        nn.init.zeros_(self.resnet.fc.bias)
def forward(self, x):
return self.resnet(x)
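# Quick shape sanity check (illustrative only, not part of the training run):
#   _model = CIFAR10ResNet()
#   _logits = _model(torch.randn(2, 3, 32, 32))
#   assert _logits.shape == (2, 10)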
# ============================================================
# 3. Training functions
# ============================================================
def train_one_epoch(model, train_loader, criterion, optimizer, device, epoch):
    """Train the model for one epoch."""
model.train()
running_loss = 0.0
correct = 0
total = 0
for batch_idx, (data, target) in enumerate(train_loader):
data, target = data.to(device), target.to(device)
optimizer.zero_grad()
outputs = model(data)
loss = criterion(outputs, target)
loss.backward()
optimizer.step()
running_loss += loss.item()
_, predicted = outputs.max(1)
total += target.size(0)
correct += predicted.eq(target).sum().item()
if (batch_idx + 1) % 100 == 0:
print(f' Epoch {epoch} - Batch {batch_idx + 1}/{len(train_loader)}: '
f'Loss={loss.item():.4f}, Acc={100.*correct/total:.2f}%')
epoch_loss = running_loss / len(train_loader)
epoch_acc = 100. * correct / total
return epoch_loss, epoch_acc
def evaluate(model, test_loader, criterion, device):
"""评估函数"""
model.eval()
test_loss = 0.0
correct = 0
total = 0
with torch.no_grad():
for data, target in test_loader:
data, target = data.to(device), target.to(device)
outputs = model(data)
test_loss += criterion(outputs, target).item()
_, predicted = outputs.max(1)
total += target.size(0)
correct += predicted.eq(target).sum().item()
test_loss = test_loss / len(test_loader)
test_acc = 100. * correct / total
return test_loss, test_acc
# ============================================================
# 4. Main training pipeline
# ============================================================
def main():
os.makedirs('./data', exist_ok=True)
os.makedirs('./logs', exist_ok=True)
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
log_file = f'./logs/training_log_resnet18_cifar10_{timestamp}.txt'
logger = setup_logger(log_file)
logger.info("=" * 60)
logger.info("ResNet-18 CIFAR-10 彩色图像分类器训练开始")
logger.info("=" * 60)
device = get_device()
logger.info(f"使用设备: {device}")
# 数据增强
train_transform = transforms.Compose([
transforms.RandomCrop(32, padding=4),
transforms.RandomHorizontalFlip(),
transforms.ToTensor(),
transforms.Normalize(mean=(0.4914, 0.4822, 0.4465), std=(0.2470, 0.2435, 0.2616))
])
test_transform = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize(mean=(0.4914, 0.4822, 0.4465), std=(0.2470, 0.2435, 0.2616))
])
    # Load the dataset
    logger.info("Loading the CIFAR-10 dataset...")
    train_dataset = CIFAR10(root='./data', train=True, transform=train_transform, download=True)
    test_dataset = CIFAR10(root='./data', train=False, transform=test_transform, download=True)
    logger.info(f"Training set size: {len(train_dataset)}")
    logger.info(f"Test set size: {len(test_dataset)}")
    logger.info("Number of classes: 10")
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True, num_workers=0, pin_memory=True)
test_loader = DataLoader(test_dataset, batch_size=256, shuffle=False, num_workers=0, pin_memory=True)
    # Initialize the model
model = CIFAR10ResNet(num_classes=10).to(device)
logger.info(f"\n模型结构:\n{model}")
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
logger.info(f"\n总参数量: {total_params:,}")
logger.info(f"可训练参数量: {trainable_params:,}")
# 损失函数和优化器
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
    # Training loop
num_epochs = 10
best_acc = 0.0
logger.info(f"\n开始训练: {num_epochs} epochs, batch_size=128")
logger.info("-" * 60)
for epoch in range(1, num_epochs + 1):
epoch_start_time = time.time()
train_loss, train_acc = train_one_epoch(model, train_loader, criterion, optimizer, device, epoch)
test_loss, test_acc = evaluate(model, test_loader, criterion, device)
epoch_time = time.time() - epoch_start_time
logger.info(
f"Epoch {epoch}/{num_epochs} - "
f"Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}% - "
f"Test Loss: {test_loss:.4f}, Test Acc: {test_acc:.2f}% - "
f"Time: {epoch_time:.1f}s"
)
if test_acc > best_acc:
best_acc = test_acc
model_path = f'./logs/resnet18_cifar10_best_model_{timestamp}.pth'
torch.save(model.state_dict(), model_path)
logger.info(f"*** 新最佳模型已保存! Test Acc: {best_acc:.2f}% ***")
logger.info("-" * 60)
logger.info(f"训练完成! 最佳测试准确率: {best_acc:.2f}%")
logger.info("=" * 60)
if torch.cuda.is_available():
logger.info(f"\nGPU内存统计:")
logger.info(f" - 已分配: {torch.cuda.memory_allocated(0) / 1024**2:.2f} MB")
logger.info(f" - 已预留: {torch.cuda.memory_reserved(0) / 1024**2:.2f} MB")
logger.info(f" - 最大分配: {torch.cuda.max_memory_allocated(0) / 1024**2:.2f} MB")
if __name__ == '__main__':
    main()

Experimental Results
Training Environment
Device: cuda
GPU model: NVIDIA GeForce RTX 5090 Laptop GPU
Total parameters: 11,173,962
Training set size: 50,000
Test set size: 10,000

Per-Epoch Training Results
Note: * marks an epoch that set a new best test accuracy.
Final Results
ResNet vs. Plain CNN Comparison
Architecture Comparison
Performance Comparison
Advantages of Residual Connections
- Mitigates vanishing gradients: the skip connection gives gradients a direct path to shallow layers (see the sketch after this list)
- Feature reuse: features learned in shallow layers can be used directly by deeper layers
- Training stability: even much deeper networks converge normally
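A tiny autograd check makes the first point concrete. This is an illustrative sketch (the variable names are mine, not from the experiment): with y = F(x) + x, the gradient through the skip path is exactly 1, so it cannot vanish even when the gradient through F does.

import torch

x = torch.randn(4, requires_grad=True)
w = torch.zeros(4, requires_grad=True)   # F(x) = w * x, so F ≈ 0 at this init

y_plain = (w * x).sum()                  # plain branch only
y_residual = (w * x + x).sum()           # residual form: F(x) + x

print(torch.autograd.grad(y_plain, x)[0])     # zeros: the gradient vanished with F
print(torch.autograd.grad(y_residual, x)[0])  # ones: the skip path still carries gradient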
Conclusion
Experiment Summary
Key Findings
- Residual connections: ResNet's skip connections allow deeper networks to train normally
- Transfer learning: ImageNet-pretrained weights speed up convergence
- Feature reuse: the edge and texture features learned by shallow convolution kernels are reused effectively by deeper layers
Suggested Improvements
- Learning-rate scheduling: use CosineAnnealing or ReduceLROnPlateau (see the sketch after this list)
- Data augmentation: add AutoAugment or RandAugment
- Deeper networks: try ResNet-34 or ResNet-50
- Knowledge distillation: use a large model to guide a small one
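As one way to apply the first suggestion, here is a minimal sketch of cosine annealing wired into the training loop, reusing the names from the full script above. The SGD hyperparameters shown (lr=0.1, momentum=0.9, weight_decay=5e-4) are common CIFAR-10 defaults, not values tuned in this experiment:

import torch.optim as optim

# SGD + cosine annealing, a common CIFAR-10 recipe (replaces the Adam optimizer above)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)

for epoch in range(1, num_epochs + 1):
    train_one_epoch(model, train_loader, criterion, optimizer, device, epoch)
    scheduler.step()  # decay the learning rate once per epoch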
Output File List
Experiment date: 2026-03-31