[PyTorch Basics] The Backpropagation Algorithm

Review

So far we have only discussed how to update the weight of the linear model $\hat{y} = w \cdot x$. In fact, this model can be viewed as the simplest possible neural network:

Because it has only a single node, the prediction error can be fed straight back to the weight to adjust it. But what about a far more complex model? For example:

The figure above shows a neural network with 4 hidden layers; the number of connecting lines between layers is the number of weights, i.e.

In practice there may be even more. Clearly, the simple update rule we used for the linear model will not work here.

Computational Graphs

First, view the model as a graph and consider how data propagates through it.


The model shown above is a two-layer (single hidden layer) neural network. Suppose its input $X$ has dimension $n$, the hidden-layer output $H$ has dimension $m$, and the output-layer output $O$ has dimension $n$. Its structure is as follows:

The two corresponding weight matrices are then $W_1 \in \mathbb{R}^{m \times n}$ and $W_2 \in \mathbb{R}^{n \times m}$;
$b_1, b_2$ are called biases.
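As a sketch of this structure (the concrete sizes $n=4$, $m=8$ are assumptions for illustration), the forward pass of the two-layer network can be written as:

```python
import torch

n, m = 4, 8                 # assumed input/output dim n and hidden dim m
x = torch.randn(n)          # input X, dimension n

W1 = torch.randn(m, n)      # first weight matrix, shape (m, n)
b1 = torch.randn(m)         # hidden-layer bias
W2 = torch.randn(n, m)      # second weight matrix, shape (n, m)
b2 = torch.randn(n)         # output-layer bias

H = W1 @ x + b1             # hidden-layer output H, dimension m
O = W2 @ H + b2             # output O, dimension n
```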

The Problem

If we expand the model above, we find that no matter how many layers it has, it ultimately collapses to a single layer.

This means that adding layers (and weights) contributes nothing to the model's expressive power. We therefore apply a nonlinear activation function (such as $\mathrm{Sigmoid}(x_i)=\frac{1}{1+e^{-x_i}}$) to the output of each layer. Once this is done, the model can no longer be simplified.
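This collapse can be checked numerically; a minimal sketch (the shapes here are assumptions for illustration):

```python
import torch

torch.manual_seed(0)
x = torch.randn(4)
W1, b1 = torch.randn(8, 4), torch.randn(8)
W2, b2 = torch.randn(4, 8), torch.randn(4)

# Two stacked linear layers...
two_layer = W2 @ (W1 @ x + b1) + b2

# ...equal a single linear layer with merged parameters
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b
print(torch.allclose(two_layer, one_layer, atol=1e-5))  # True

# With a nonlinearity in between, the collapse no longer holds
nonlinear = W2 @ torch.sigmoid(W1 @ x + b1) + b2
```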

How to Compute (first compute the loss, then backpropagate to adjust the weights)

1. Computing the loss (Forward pass):

The forward pass is straightforward: substitute $x, w$ into the model $\hat{y} = f(x, w)$, pass $\hat{y}$ through a nonlinear function to obtain the result $Z$, and compare it with the ground truth to compute the loss $Loss$.

2. Backpropagating the error (Backpropagation):

Once the forward pass has produced the loss $Loss$, it must be distributed back, step by step, to its sources. This is where the chain rule comes in: to measure how $x$ and $w$ affect $Loss$, first compute how $Z$ affects $Loss$, then multiply by how $x$ and $w$ affect $Z$, yielding $\frac{\partial L}{\partial x}$ and $\frac{\partial L}{\partial w}$. Here $\frac{\partial L}{\partial w}$ is used to adjust the current layer's weights, while $\frac{\partial L}{\partial x}$ is treated as the error of the preceding layer and continues to propagate backward.
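A tiny worked example of this chain rule, checked against PyTorch's autograd (the concrete numbers are chosen for illustration): with $z = w \cdot x$ and $L = (z - y)^2$, we have $\frac{\partial L}{\partial w} = \frac{\partial L}{\partial z} \cdot \frac{\partial z}{\partial w} = 2(z - y) \cdot x$.

```python
import torch

x, y = torch.tensor(2.0), torch.tensor(4.0)
w = torch.tensor(1.0, requires_grad=True)

z = w * x                # forward: z depends on w
loss = (z - y) ** 2      # loss depends on z
loss.backward()          # chain rule: dL/dw = dL/dz * dz/dw

# Manual chain rule: dL/dz = 2(z - y), dz/dw = x
manual = 2 * (z.item() - y.item()) * x.item()
print(w.grad.item(), manual)  # -8.0 -8.0
```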

Illustration of the computation on the computational graph:

In practice, the intermediate gradients are computed during the forward pass, so they can be used directly during backpropagation.

Forward and Backward Computation in PyTorch

In PyTorch, the Tensor class is a key building block for constructing dynamic computational graphs. It can store scalars, vectors, and matrices (2-D or higher), and every value arising in a computation can be held in a Tensor. It has two important members, data and grad (itself a tensor), which store the values used in the computation (e.g., the weights) and the gradient values (e.g., $\frac{\partial loss}{\partial w}$), respectively.
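A quick look at these two members (the loss expression here is an illustrative assumption): after backward(), data still holds the weight's value while grad holds $\frac{\partial loss}{\partial w}$.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
loss = (w * 3 - 6) ** 2   # illustrative scalar loss built from w
loss.backward()

print(w.data)   # the stored value: tensor([1.])
print(w.grad)   # d(loss)/dw = 2*(3w - 6)*3 = -18: tensor([-18.])
```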

Building a Computational Graph in PyTorch

import torch as th
import matplotlib.pyplot as plt

# Prepare the dataset
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w_list = []
loss_list = []

a = 0.01  # learning rate

# Initialize the weight tensor
w = th.Tensor([1.0])    # the linear model has a single weight; pass a multi-dimensional matrix for more
w.requires_grad = True  # mark the weight as requiring gradients; otherwise torch will not compute them

# Define the model
def forward(x):
    # Since w is a Tensor, * is overloaded and x is automatically converted to a tensor
    # Because w requires gradients, the returned Tensor requires gradients too
    return x * w

# Define the loss function; calling it effectively builds a computational graph
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) ** 2

print("Predict (Before training)", 4, forward(4).item())  # item() turns a one-element tensor into a Python scalar

# Train the network
for epoch in range(100):
    # Stochastic gradient descent
    for x, y in zip(x_data, y_data):
        l = loss(x, y)  # forward pass: builds a computational graph
        l.backward()    # backward pass: computes gradients of every variable in the graph that requires them, then frees the graph
        print('\t grad:', x, y, w.grad.item())
        w.data = w.data - a * w.grad.data  # update the weight; grad is also a tensor, so use its data for a plain update
        w.grad.data.zero_()  # zero the gradient, since backward() accumulates gradients by default

    w_list.append(w.data.item())
    loss_list.append(l.item())
    print("progress:", epoch, l.item())

print("predict (after training)", 4, forward(4).item())

# Plot (weight vs. loss)
plt.plot(w_list, loss_list)
plt.ylabel('loss')
plt.xlabel('W')
plt.xlim(1.0, 2)
plt.show()
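As a side note, the manual weight update and gradient zeroing above are exactly what `torch.optim` automates. A minimal equivalent sketch (not from the original post, same model and learning rate assumed):

```python
import torch

x_data = torch.tensor([1.0, 2.0, 3.0])
y_data = torch.tensor([2.0, 4.0, 6.0])

w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.01)  # replaces the hand-written update rule

for epoch in range(100):
    for x, y in zip(x_data, y_data):
        loss = (x * w - y) ** 2
        optimizer.zero_grad()  # replaces w.grad.data.zero_()
        loss.backward()
        optimizer.step()       # replaces w.data = w.data - a * w.grad.data

print(w.item())  # converges to roughly 2.0
```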

Output:

Predict (Before training) 4 4.0
grad: 1.0 2.0 -2.0
grad: 2.0 4.0 -7.840000152587891
grad: 3.0 6.0 -16.228801727294922
progress: 0 7.315943717956543
grad: 1.0 2.0 -1.478623867034912
grad: 2.0 4.0 -5.796205520629883
grad: 3.0 6.0 -11.998146057128906
progress: 1 3.9987640380859375
grad: 1.0 2.0 -1.0931644439697266
grad: 2.0 4.0 -4.285204887390137
grad: 3.0 6.0 -8.870372772216797
progress: 2 2.1856532096862793
grad: 1.0 2.0 -0.8081896305084229
grad: 2.0 4.0 -3.1681032180786133
grad: 3.0 6.0 -6.557973861694336
progress: 3 1.1946394443511963
grad: 1.0 2.0 -0.5975041389465332
grad: 2.0 4.0 -2.3422164916992188
grad: 3.0 6.0 -4.848389625549316
progress: 4 0.6529689431190491
grad: 1.0 2.0 -0.4417421817779541
grad: 2.0 4.0 -1.7316293716430664
grad: 3.0 6.0 -3.58447265625
progress: 5 0.35690122842788696
grad: 1.0 2.0 -0.3265852928161621
grad: 2.0 4.0 -1.2802143096923828
grad: 3.0 6.0 -2.650045394897461
progress: 6 0.195076122879982
grad: 1.0 2.0 -0.24144840240478516
grad: 2.0 4.0 -0.9464778900146484
grad: 3.0 6.0 -1.9592113494873047
progress: 7 0.10662525147199631
grad: 1.0 2.0 -0.17850565910339355
grad: 2.0 4.0 -0.699742317199707
grad: 3.0 6.0 -1.4484672546386719
progress: 8 0.0582793727517128
grad: 1.0 2.0 -0.1319713592529297
grad: 2.0 4.0 -0.5173273086547852
grad: 3.0 6.0 -1.070866584777832
progress: 9 0.03185431286692619
grad: 1.0 2.0 -0.09756779670715332
grad: 2.0 4.0 -0.3824653625488281
grad: 3.0 6.0 -0.7917022705078125
progress: 10 0.017410902306437492
grad: 1.0 2.0 -0.07213282585144043
grad: 2.0 4.0 -0.2827606201171875
grad: 3.0 6.0 -0.5853137969970703
progress: 11 0.009516451507806778
grad: 1.0 2.0 -0.053328514099121094
grad: 2.0 4.0 -0.2090473175048828
grad: 3.0 6.0 -0.43272972106933594
progress: 12 0.005201528314501047
grad: 1.0 2.0 -0.039426326751708984
grad: 2.0 4.0 -0.15455150604248047
grad: 3.0 6.0 -0.3199195861816406
progress: 13 0.0028430151287466288
grad: 1.0 2.0 -0.029148340225219727
grad: 2.0 4.0 -0.11426162719726562
grad: 3.0 6.0 -0.23652076721191406
progress: 14 0.0015539465239271522
grad: 1.0 2.0 -0.021549701690673828
grad: 2.0 4.0 -0.08447456359863281
grad: 3.0 6.0 -0.17486286163330078
progress: 15 0.0008493617060594261
grad: 1.0 2.0 -0.01593184471130371
grad: 2.0 4.0 -0.062453269958496094
grad: 3.0 6.0 -0.12927818298339844
progress: 16 0.00046424579340964556
grad: 1.0 2.0 -0.011778593063354492
grad: 2.0 4.0 -0.046172142028808594
grad: 3.0 6.0 -0.09557533264160156
progress: 17 0.0002537401160225272
grad: 1.0 2.0 -0.00870823860168457
grad: 2.0 4.0 -0.03413581848144531
grad: 3.0 6.0 -0.07066154479980469
progress: 18 0.00013869594840798527
grad: 1.0 2.0 -0.006437778472900391
grad: 2.0 4.0 -0.025236129760742188
grad: 3.0 6.0 -0.052239418029785156
progress: 19 7.580435340059921e-05
grad: 1.0 2.0 -0.004759550094604492
grad: 2.0 4.0 -0.018657684326171875
grad: 3.0 6.0 -0.038620948791503906
progress: 20 4.143271507928148e-05
grad: 1.0 2.0 -0.003518819808959961
grad: 2.0 4.0 -0.0137939453125
grad: 3.0 6.0 -0.028553009033203125
progress: 21 2.264650902361609e-05
grad: 1.0 2.0 -0.00260162353515625
grad: 2.0 4.0 -0.010198593139648438
grad: 3.0 6.0 -0.021108627319335938
progress: 22 1.2377059647405986e-05
grad: 1.0 2.0 -0.0019233226776123047
grad: 2.0 4.0 -0.0075397491455078125
grad: 3.0 6.0 -0.0156097412109375
progress: 23 6.768445018678904e-06
grad: 1.0 2.0 -0.0014221668243408203
grad: 2.0 4.0 -0.0055751800537109375
grad: 3.0 6.0 -0.011541366577148438
progress: 24 3.7000872907810844e-06
grad: 1.0 2.0 -0.0010514259338378906
grad: 2.0 4.0 -0.0041217803955078125
grad: 3.0 6.0 -0.008531570434570312
progress: 25 2.021880391112063e-06
grad: 1.0 2.0 -0.0007772445678710938
grad: 2.0 4.0 -0.0030469894409179688
grad: 3.0 6.0 -0.006305694580078125
progress: 26 1.1044940038118511e-06
grad: 1.0 2.0 -0.0005745887756347656
grad: 2.0 4.0 -0.0022525787353515625
grad: 3.0 6.0 -0.0046634674072265625
progress: 27 6.041091182851233e-07
grad: 1.0 2.0 -0.0004248619079589844
grad: 2.0 4.0 -0.0016651153564453125
grad: 3.0 6.0 -0.003444671630859375
progress: 28 3.296045179013163e-07
grad: 1.0 2.0 -0.0003139972686767578
grad: 2.0 4.0 -0.0012311935424804688
grad: 3.0 6.0 -0.0025491714477539062
progress: 29 1.805076408345485e-07
grad: 1.0 2.0 -0.00023221969604492188
grad: 2.0 4.0 -0.0009107589721679688
grad: 3.0 6.0 -0.0018854141235351562
progress: 30 9.874406714516226e-08
grad: 1.0 2.0 -0.00017189979553222656
grad: 2.0 4.0 -0.0006742477416992188
grad: 3.0 6.0 -0.00139617919921875
progress: 31 5.4147676564753056e-08
grad: 1.0 2.0 -0.0001270771026611328
grad: 2.0 4.0 -0.0004978179931640625
grad: 3.0 6.0 -0.00102996826171875
progress: 32 2.9467628337442875e-08
grad: 1.0 2.0 -9.393692016601562e-05
grad: 2.0 4.0 -0.0003681182861328125
grad: 3.0 6.0 -0.0007610321044921875
progress: 33 1.6088051779661328e-08
grad: 1.0 2.0 -6.937980651855469e-05
grad: 2.0 4.0 -0.00027179718017578125
grad: 3.0 6.0 -0.000560760498046875
progress: 34 8.734787115827203e-09
grad: 1.0 2.0 -5.125999450683594e-05
grad: 2.0 4.0 -0.00020122528076171875
grad: 3.0 6.0 -0.0004177093505859375
progress: 35 4.8466972657479346e-09
grad: 1.0 2.0 -3.790855407714844e-05
grad: 2.0 4.0 -0.000148773193359375
grad: 3.0 6.0 -0.000308990478515625
progress: 36 2.6520865503698587e-09
grad: 1.0 2.0 -2.8133392333984375e-05
grad: 2.0 4.0 -0.000110626220703125
grad: 3.0 6.0 -0.0002288818359375
progress: 37 1.4551915228366852e-09
grad: 1.0 2.0 -2.09808349609375e-05
grad: 2.0 4.0 -8.20159912109375e-05
grad: 3.0 6.0 -0.00016880035400390625
progress: 38 7.914877642178908e-10
grad: 1.0 2.0 -1.5497207641601562e-05
grad: 2.0 4.0 -6.103515625e-05
grad: 3.0 6.0 -0.000125885009765625
progress: 39 4.4019543565809727e-10
grad: 1.0 2.0 -1.1444091796875e-05
grad: 2.0 4.0 -4.482269287109375e-05
grad: 3.0 6.0 -9.1552734375e-05
progress: 40 2.3283064365386963e-10
grad: 1.0 2.0 -8.344650268554688e-06
grad: 2.0 4.0 -3.24249267578125e-05
grad: 3.0 6.0 -6.580352783203125e-05
progress: 41 1.2028067430946976e-10
grad: 1.0 2.0 -5.9604644775390625e-06
grad: 2.0 4.0 -2.288818359375e-05
grad: 3.0 6.0 -4.57763671875e-05
progress: 42 5.820766091346741e-11
grad: 1.0 2.0 -4.291534423828125e-06
grad: 2.0 4.0 -1.71661376953125e-05
grad: 3.0 6.0 -3.719329833984375e-05
progress: 43 3.842615114990622e-11
grad: 1.0 2.0 -3.337860107421875e-06
grad: 2.0 4.0 -1.33514404296875e-05
grad: 3.0 6.0 -2.86102294921875e-05
progress: 44 2.2737367544323206e-11
grad: 1.0 2.0 -2.6226043701171875e-06
grad: 2.0 4.0 -1.049041748046875e-05
grad: 3.0 6.0 -2.288818359375e-05
progress: 45 1.4551915228366852e-11
grad: 1.0 2.0 -1.9073486328125e-06
grad: 2.0 4.0 -7.62939453125e-06
grad: 3.0 6.0 -1.430511474609375e-05
progress: 46 5.6843418860808015e-12
grad: 1.0 2.0 -1.430511474609375e-06
grad: 2.0 4.0 -5.7220458984375e-06
grad: 3.0 6.0 -1.1444091796875e-05
progress: 47 3.637978807091713e-12
grad: 1.0 2.0 -1.1920928955078125e-06
grad: 2.0 4.0 -4.76837158203125e-06
grad: 3.0 6.0 -1.1444091796875e-05
progress: 48 3.637978807091713e-12
grad: 1.0 2.0 -9.5367431640625e-07
grad: 2.0 4.0 -3.814697265625e-06
grad: 3.0 6.0 -8.58306884765625e-06
progress: 49 2.0463630789890885e-12
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 50 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 51 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 52 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 53 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 54 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 55 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 56 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 57 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 58 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 59 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 60 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 61 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 62 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 63 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 64 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 65 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 66 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 67 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 68 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 69 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 70 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 71 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 72 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 73 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 74 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 75 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 76 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 77 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 78 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 79 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 80 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 81 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 82 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 83 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 84 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 85 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 86 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 87 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 88 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 89 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 90 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 91 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 92 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 93 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 94 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 95 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 96 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 97 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 98 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 99 9.094947017729282e-13
predict (after training) 4 7.999998569488525

Convergence plot: