Backpropagation Intuition for Neural Networks (CS229)

Recall the cost function

$$J(\Theta)=-\frac{1}{m}\left[\sum_{i=1}^m\sum_{k=1}^K y_k^{(i)}\log\left(h_{\Theta}(x^{(i)})\right)_k+\left(1-y_k^{(i)}\right)\log\left(1-\left(h_{\Theta}(x^{(i)})\right)_k\right)\right]+\frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\left(\Theta_{ji}^{(l)}\right)^2\tag{1}$$
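
For concreteness, $(1)$ can be evaluated directly after a forward pass. Below is a minimal NumPy sketch assuming a single hidden layer with sigmoid activations; the names `Theta1`, `Theta2`, `X`, `Y`, and `lam` are illustrative placeholders, not identifiers from the course materials.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(Theta1, Theta2, X, Y, lam):
    """Regularized cross-entropy cost J(Theta) as in (1).

    X: (m, n) inputs; Y: (m, K) one-hot labels;
    Theta1: (s2, n + 1) and Theta2: (K, s2 + 1) weight matrices,
    each with its bias weights in the first column.
    """
    m = X.shape[0]

    # Forward propagation, prepending a bias unit at each layer.
    a1 = np.hstack([np.ones((m, 1)), X])
    a2 = np.hstack([np.ones((m, 1)), sigmoid(a1 @ Theta1.T)])
    h = sigmoid(a2 @ Theta2.T)            # (m, K) hypothesis h_Theta(x)

    # Unregularized cross-entropy term: the double sum over i and k.
    J = -np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h)) / m

    # Regularization term, skipping the bias columns.
    reg = np.sum(Theta1[:, 1:] ** 2) + np.sum(Theta2[:, 1:] ** 2)
    return J + lam * reg / (2 * m)
```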

Taking $K=1$, i.e. assuming there is only one output unit, $(1)$ can be rewritten as
$$J(\Theta)=-\frac{1}{m}\left[\sum_{i=1}^{m}y^{(i)}\log\left(h_{\Theta}(x^{(i)})\right)+\left(1-y^{(i)}\right)\log\left(1-h_{\Theta}(x^{(i)})\right)\right]+\frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\left(\Theta_{ji}^{(l)}\right)^2\tag{2}$$

Focusing on a single example $\left(x^{(i)},y^{(i)}\right)$ and setting $\lambda=0$ to ignore regularization, the cost of the $i^{\text{th}}$ training example becomes
$$Cost(i)=y^{(i)}\log{h_{\Theta}\left(x^{(i)}\right)}+\left(1-y^{(i)}\right)\log{\left(1-h_{\Theta}\left(x^{(i)}\right)\right)}$$


(source: https://www.coursera.org/learn/machine-learning)

Note that the last term in the slide above is incorrect.
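
As a quick sanity check of the per-example term, here is a hypothetical one-line helper, assuming the scalar prediction $h_{\Theta}(x^{(i)})$ has already been computed by a forward pass:

```python
import numpy as np

def cost_i(h_i, y_i):
    """Per-example cost for K = 1 with lambda = 0, as written above.

    h_i: scalar prediction h_Theta(x^(i)) in (0, 1); y_i: label in {0, 1}.
    The leading minus sign of J is omitted here, matching the text.
    """
    return y_i * np.log(h_i) + (1 - y_i) * np.log(1 - h_i)
```

For $y^{(i)}=1$ this reduces to $\log h_{\Theta}(x^{(i)})$, which is near $0$ when the prediction is close to $1$ and becomes very negative as the prediction approaches $0$; the symmetric behavior holds for $y^{(i)}=0$.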