All of these are well written. If the first one is hard to follow, continue to the second, which includes code that reproduces the results; the third is a survey, better suited to the advanced stage.
Gradient Descent Explained in Detail: https://blog.csdn.net/JaysonWong/article/details/119818497
Linear Regression Models: the Gradient Descent Algorithm: https://blog.csdn.net/m0_37940048/article/details/120923023
An Overview of Gradient Descent Optimization Algorithms: https://blog.csdn.net/google19890102/article/details/69942970
Machine Learning: Stochastic Gradient Descent: https://blog.csdn.net/qq_38150441/article/details/80533891
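The core idea behind all of the posts above is the same update rule: step each parameter opposite its gradient. A minimal sketch (my own toy example, not taken from any of the linked posts), minimizing f(w) = (w - 3)^2:

```python
# Minimize f(w) = (w - 3)^2 by gradient descent.
# The gradient is f'(w) = 2 * (w - 3).

def grad(w):
    return 2.0 * (w - 3.0)

w = 0.0    # initial guess
lr = 0.1   # learning rate (step size)
for _ in range(100):
    w -= lr * grad(w)  # step in the direction opposite the gradient

# w has converged very close to the minimizer w = 3
```

Stochastic gradient descent (the fourth link) follows the same rule, but estimates the gradient from one sample or a mini-batch instead of the full dataset.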
The Backpropagation Algorithm: Process and Derivation (a very intuitive walkthrough): https://blog.csdn.net/ft_sunshine/article/details/90221691
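Backpropagation is just the chain rule applied from the loss back toward the inputs. A hedged sketch on a two-weight "network" y = w2 * (w1 * x) with squared-error loss (numbers are made up for illustration):

```python
# Manual backpropagation through y = w2 * (w1 * x), loss L = (y - t)^2.

x, t = 2.0, 10.0    # input and target (illustrative values)
w1, w2 = 1.0, 1.0   # weights

# forward pass
h = w1 * x          # intermediate value
y = w2 * h          # output
L = (y - t) ** 2    # squared-error loss

# backward pass: apply the chain rule from the output back to the input
dL_dy = 2.0 * (y - t)   # dL/dy
dL_dw2 = dL_dy * h      # dL/dw2 = dL/dy * dy/dw2
dL_dh = dL_dy * w2      # propagate the gradient into h
dL_dw1 = dL_dh * x      # dL/dw1 = dL/dh * dh/dw1
```

Each layer only needs the incoming gradient and its local derivative, which is why the algorithm scales to deep networks.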
Chapter 3: Loss Functions: https://www.cnblogs.com/woodyh5/p/12067215.html
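For reference, here are two of the most common loss functions written from scratch (the formulas are standard; this is my own sketch, not code from the linked chapter):

```python
import math

def mse(preds, targets):
    """Mean squared error: the average of (prediction - target)^2."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class, given softmax probabilities."""
    return -math.log(probs[label])

mse_val = mse([1.0, 2.0], [0.0, 2.0])          # (1 + 0) / 2 = 0.5
ce_val = cross_entropy([0.7, 0.2, 0.1], 0)     # -ln(0.7)
```

MSE is the usual choice for regression; cross-entropy for classification, where it heavily penalizes confident wrong predictions.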
What is Batch Normalization: https://zhuanlan.zhihu.com/p/24810318
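The forward pass of batch normalization, for a single feature, can be sketched as follows (a simplified illustration, assuming training mode; real implementations also track running statistics for inference):

```python
def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch to zero mean / unit variance, then scale and shift.

    gamma and beta are the learnable scale and shift parameters;
    eps guards against division by zero when the batch variance is tiny.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return [gamma * (x - mean) / (var + eps) ** 0.5 + beta for x in xs]

out = batch_norm([1.0, 2.0, 3.0, 4.0])
# out has (approximately) zero mean and unit variance
```

Normalizing each layer's inputs this way keeps activation distributions stable across layers, which is what lets deeper networks train with larger learning rates.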
Understanding optimizer.zero_grad(), loss.backward(), and optimizer.step(): roles and principles: https://blog.csdn.net/PanYHHH/article/details/107361827
optimizer.zero_grad(): https://blog.csdn.net/weixin_36670529/article/details/108630740
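The three calls discussed in those posts form the standard PyTorch training step. A minimal sketch (toy loss, made-up learning rate): zero_grad() clears previously accumulated gradients, backward() fills each parameter's .grad via autograd, and step() applies the update.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

for _ in range(3):
    opt.zero_grad()                   # gradients accumulate by default; clear them
    loss = (w - 3.0).pow(2).sum()     # toy loss with minimum at w = 3
    loss.backward()                   # sets w.grad = 2 * (w - 3)
    opt.step()                        # w <- w - lr * w.grad
```

Forgetting zero_grad() makes each backward() add onto the previous gradients, which is occasionally done on purpose (gradient accumulation for large effective batch sizes) but is otherwise a classic bug.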
The Most Complete Guide Ever to lr_scheduler Learning-Rate Strategies: https://blog.csdn.net/weiman1/article/details/125647517
A Study of torch.optim.lr_scheduler.MultiStepLR(): Step/Staircase Learning Rates: https://blog.csdn.net/jiongta9473/article/details/112388516
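MultiStepLR is the "staircase" schedule from the second link: the learning rate is multiplied by gamma each time training reaches a milestone epoch. A hedged sketch (milestones and gamma are made-up values for illustration):

```python
import torch

p = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.SGD([p], lr=0.1)
sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[2, 4], gamma=0.1)

lrs = []
for epoch in range(5):
    lrs.append(opt.param_groups[0]["lr"])  # lr in effect this epoch
    opt.step()                             # (training work would go here)
    sched.step()                           # advance the schedule by one epoch
# lrs is approximately [0.1, 0.1, 0.01, 0.01, 0.001]: decays at epochs 2 and 4
```

Note that sched.step() is called once per epoch, after opt.step(); calling it per batch would hit the milestones far too early.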