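- Gradient descent in matrix form
- Prediction with artificial neural networks
- Gradient descent on a loss function
- Gradient-based optimization of a neural network with three inputs and one output
- Visualizing a propagation network with parallel coordinates
- Learning features of forest-fire scenarios
- Feedforward neural networks with stochastic gradient descent, stochastic gradient descent with momentum, and adaptive moment estimation (Adam)
- A rover model using multiple regression and automatic differentiation
- Classifying the color of a given point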
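- The nodes $h_{(1,1)}$ through $h_{(2,3)}$ form the hidden layers, and $O$ is the output layer

Assume the neurons use a sigmoid activation function, and perform a forward pass and a backward pass on the network. Assume also that the target output $y$ is 0.5 and the learning rate is 1. We now apply the backpropagation algorithm.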
Before computing the forward pass, we need two formulas. The first is the weighted sum:

$$a_j = \sum_i \left(w_{i,j} \cdot x_i\right)$$

- $a_j$ is the weighted sum of all inputs to node $j$,
- $w_{i,j}$ is the weight on the connection from the $i^{\text{th}}$ input to the $j^{\text{th}}$ neuron,
- $x_i$ is the value of the $i^{\text{th}}$ input.

The second is the activation:

$$y_j = F\left(a_j\right) = \frac{1}{1 + e^{-a_j}}$$

Here $y_j$ is the output value and $F$ is the activation function (the sigmoid, in this example), which converts the weighted sum into the output value. To compute the forward pass, we need the outputs $y_3$, $y_4$, and $y_5$.
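The two formulas above can be sketched in a few lines of plain Python. The function names `weighted_sum` and `sigmoid` and the example values are illustrative assumptions, not part of the original example; the values are chosen so that the weighted sum is exactly zero.

```python
import math


def weighted_sum(weights, inputs):
    """Return a_j = sum_i (w_ij * x_i) for a single neuron."""
    return sum(w * x for w, x in zip(weights, inputs))


def sigmoid(a):
    """Return F(a) = 1 / (1 + e^(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


# Hypothetical weights and inputs, chosen so that a_j is exactly 0.
example_weights = [0.5, -0.25]
example_inputs = [1.0, 2.0]

a = weighted_sum(example_weights, example_inputs)
y = sigmoid(a)
print(a, y)  # prints: 0.0 0.5
```

A weighted sum of zero sits at the midpoint of the sigmoid, which is why the output is exactly 0.5.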
In the figure above, $y_3$ is the output of $h_1$, $y_4$ is the output of $h_2$, and $y_5$ is the output of $O_3$.

To find $y_3$ from $a_j = \sum_i \left(w_{i,j} \cdot x_i\right)$, we consider the edges entering $h_1$ together with their weights and inputs. Here the incoming edges come from $X_1$ and $X_2$.
At node $h_1$:

$$\begin{aligned} a_1 &= \left(w_{1,1} x_1\right) + \left(w_{2,1} x_2\right) \\ &= (0.2 \times 0.35) + (0.2 \times 0.7) \\ &= 0.21 \end{aligned}$$

With $a_1$ computed, we can now find $y_3$:
$$\begin{aligned} y_j &= F\left(a_j\right) = \frac{1}{1 + e^{-a_j}} \\ y_3 &= F(0.21) = \frac{1}{1 + e^{-0.21}} \\ y_3 &\approx 0.552 \end{aligned}$$
Similarly, we find $y_4$ at $h_2$ and $y_5$ at $O_3$. The hidden outputs are carried to three decimals, $y_3 = F(0.21) \approx 0.552$ and $y_4 \approx 0.578$, when computing $a_3$:

$$\begin{aligned} a_2 &= \left(w_{1,2} x_1\right) + \left(w_{2,2} x_2\right) = (0.3 \times 0.35) + (0.3 \times 0.7) = 0.315 \\ y_4 &= F(0.315) = \frac{1}{1 + e^{-0.315}} \approx 0.578 \\ a_3 &= \left(w_{1,3} y_3\right) + \left(w_{2,3} y_4\right) = (0.3 \times 0.552) + (0.9 \times 0.578) \approx 0.686 \\ y_5 &= F(0.686) = \frac{1}{1 + e^{-0.686}} \approx 0.67 \end{aligned}$$
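The same forward pass can be written in matrix form with NumPy. This is a minimal sketch under stated assumptions: the array names `W_hidden` and `w_out` are hypothetical, and the weights are the ones used in the hand calculation ($w_{1,1} = w_{2,1} = 0.2$, $w_{1,2} = w_{2,2} = 0.3$, $w_{1,3} = 0.3$, $w_{2,3} = 0.9$). The values are computed at full precision, so they may differ in the last digit from figures rounded by hand.

```python
import numpy as np


def sigmoid(a):
    """Element-wise sigmoid F(a) = 1 / (1 + e^(-a))."""
    return 1.0 / (1.0 + np.exp(-a))


x = np.array([0.35, 0.7])  # inputs x_1, x_2

# W_hidden[i, j] is the weight from input i to hidden node j (h_1, h_2).
W_hidden = np.array([[0.2, 0.3],
                     [0.2, 0.3]])
# w_out[i] is the weight from hidden node i to the output node O_3.
w_out = np.array([0.3, 0.9])

a_hidden = x @ W_hidden       # [a_1, a_2]
y_hidden = sigmoid(a_hidden)  # [y_3, y_4]
a_out = y_hidden @ w_out      # a_3
y_out = sigmoid(a_out)        # y_5

print("a_1, a_2:", np.round(a_hidden, 3))
print("y_3, y_4:", np.round(y_hidden, 3))
print("a_3:", round(float(a_out), 3))
print("y_5:", round(float(y_out), 3))
```

At full precision the hidden outputs come to roughly 0.552 and 0.578, and the network output to roughly 0.665, which rounds to the 0.67 used in the error calculation.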
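Note that the target output is 0.5, but the network produced 0.67. To compute the error, we can use the following formula:

$$\text{Error}_j = y_{\text{target}} - y_5$$

so the error is $0.5 - 0.67 = -0.17$. We use this error value to backpropagate.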
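As a sketch of the first backward step, the error can be scaled by the sigmoid derivative at the output node to get the output delta. This follows the delta form used by the `backward` method of the `NeuralNetwork` class in this article (`output_error * sigmoid_derivative(predicted_output)`); the helper name `sigmoid_derivative_from_output` is an assumption introduced here, and `y_5` is the rounded value from the forward pass.

```python
def sigmoid_derivative_from_output(y):
    """Sigmoid derivative expressed through the sigmoid output: y * (1 - y)."""
    return y * (1.0 - y)


y_target = 0.5  # target output
y_5 = 0.67      # network output from the forward pass, rounded to two decimals

error = y_target - y_5
delta_out = error * sigmoid_derivative_from_output(y_5)

print("error:", round(error, 2))
print("output delta:", round(delta_out, 4))
```

The delta is negative because the network overshot the target, so an update of the form `weight += learning_rate * delta * input` lowers the weights feeding the output node.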
    import numpy as np


    class NeuralNetwork:
        def __init__(self, input_size, hidden_size, output_size):
            self.input_size = input_size
            self.hidden_size = hidden_size
            self.output_size = output_size

            # Initialize weights
            self.weights_input_hidden = np.random.randn(self.input_size, self.hidden_size)
            self.weights_hidden_output = np.random.randn(self.hidden_size, self.output_size)

            # Initialize the biases
            self.bias_hidden = np.zeros((1, self.hidden_size))
            self.bias_output = np.zeros((1, self.output_size))

        def sigmoid(self, x):
            return 1 / (1 + np.exp(-x))

        def sigmoid_derivative(self, x):
            # x is expected to be a sigmoid output, not a pre-activation value
            return x * (1 - x)

        def feedforward(self, X):
            # Input to hidden
            self.hidden_activation = np.dot(X, self.weights_input_hidden) + self.bias_hidden
            self.hidden_output = self.sigmoid(self.hidden_activation)

            # Hidden to output
            self.output_activation = np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_output
            self.predicted_output = self.sigmoid(self.output_activation)
            return self.predicted_output

        def backward(self, X, y, learning_rate):
            # Compute the output layer error
            output_error = y - self.predicted_output
            output_delta = output_error * self.sigmoid_derivative(self.predicted_output)

            # Compute the hidden layer error
            hidden_error = np.dot(output_delta, self.weights_hidden_output.T)
            hidden_delta = hidden_error * self.sigmoid_derivative(self.hidden_output)

            # Update weights and biases
            self.weights_hidden_output += np.dot(self.hidden_output.T, output_delta) * learning_rate
            self.bias_output += np.sum(output_delta, axis=0, keepdims=True) * learning_rate
            self.weights_input_hidden += np.dot(X.T, hidden_delta) * learning_rate
            self.bias_hidden += np.sum(hidden_delta, axis=0, keepdims=True) * learning_rate

        def train(self, X, y, epochs, learning_rate):
            for epoch in range(epochs):
                output = self.feedforward(X)
                self.backward(X, y, learning_rate)
                if epoch % 4000 == 0:
                    loss = np.mean(np.square(y - output))
                    print(f"Epoch {epoch}, Loss:{loss}")


    # XOR inputs and targets
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([[0], [1], [1], [0]])

    nn = NeuralNetwork(input_size=2, hidden_size=4, output_size=1)
    nn.train(X, y, epochs=10000, learning_rate=0.1)

    # Test the trained model
    output = nn.feedforward(X)
    print("Predictions after training:")
    print(output)

Output:

    Epoch 0, Loss:0.36270360966344145
    Epoch 4000, Loss:0.005546947165311874
    Epoch 8000, Loss:0.00202378766386817
    Predictions after training: