Cost Function of Logistic Regression
Recall the cost function of logistic regression:
$$J(\theta)=-\frac{1}{m}\left[\sum_{i=1}^my^{(i)}\log\left(h_{\theta}(x^{(i)})\right)+(1-y^{(i)})\log\left(1-h_{\theta}(x^{(i)})\right)\right]\tag{1}$$
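As a quick illustration of equation (1), here is a minimal NumPy sketch; the names `X`, `y`, `theta`, and the `sigmoid` helper are assumptions for this example rather than anything defined in the post.

```python
import numpy as np

def sigmoid(z):
    # Logistic function g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y):
    # Equation (1): average cross-entropy over the m training examples,
    # where X is an (m, n) design matrix and y holds 0/1 labels.
    m = y.shape[0]
    h = sigmoid(X @ theta)  # h_theta(x^(i)) for every example at once
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))
```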
For linear regression, we have learned two learning algorithms: one based on gradient descent, and another based on the normal equation.
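As a reminder of the second approach, here is a minimal sketch of the normal equation; the design matrix `X` (with a leading column of ones) and target vector `y` are assumed names for this illustration.

```python
import numpy as np

def normal_equation(X, y):
    # Closed-form least-squares solution: theta = (X^T X)^(-1) X^T y.
    # pinv is used so the computation still works when X^T X is singular.
    return np.linalg.pinv(X.T @ X) @ X.T @ y
```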
In the context of regularization, both underfitting (high bias) and overfitting (high variance) are undesirable.
To compute $J(\theta)$ and $\frac{\partial}{\partial\theta_j}J(\theta)$ more efficiently for a given $\theta$, there are several algorithms available.
Recall from the previous post that
$$\begin{aligned}J(\theta)&=\frac{1}{m}\sum_{i=1}^m\frac{1}{2}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^2\\&=\frac{1}{m}\sum_{i=1}^m\mathrm{Cost}\left(h_{\theta}(x^{(i)}),y^{(i)}\right)\end{aligned}$$
$$\lbrace(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\dots,(x^{(m)},y^{(m)})\rbrace$$
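As a quick sketch of this squared-error cost over the $m$ training examples above, reusing the assumed names `X`, `y`, and `theta` from the earlier snippet:

```python
import numpy as np

def linear_cost(theta, X, y):
    # Mean of the per-example term Cost(h_theta(x), y) = (1/2) * (h - y)^2.
    m = y.shape[0]
    h = X @ theta  # linear hypothesis h_theta(x^(i)) for every example
    return (1.0 / m) * np.sum(0.5 * (h - y) ** 2)
```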
Assume $h_{\theta}(x)=g(\theta_0+\theta_1x_1+\theta_2x_2)$, and
$$\boldsymbol{\theta}=
\begin{bmatrix}
\theta_0 \\
\theta_1 \\
\theta_2 \\
\end{bmatrix}=
\begin{bmatrix}
-3 \\
1 \\
1 \\
\end{bmatrix}$$
$$h_{\theta}(x)=g(\boldsymbol{\theta}^\top\mathbf{x})$$
where $g(z)$ is the sigmoid function (i.e., the logistic function):
$$g(z)=\frac{1}{1+e^{-z}}$$
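Putting the pieces together, here is a small sketch that evaluates $h_{\theta}(x)=g(\boldsymbol{\theta}^\top\mathbf{x})$ with the parameter vector $\theta=[-3,1,1]$ assumed above; the two test points are made up purely for illustration.

```python
import numpy as np

theta = np.array([-3.0, 1.0, 1.0])  # [theta_0, theta_1, theta_2] from above

def h(x1, x2):
    # h_theta(x) = g(theta^T x) with x = [1, x1, x2]
    x = np.array([1.0, x1, x2])
    return 1.0 / (1.0 + np.exp(-(theta @ x)))

print(h(1.0, 1.0))  # theta^T x = -1 -> g(-1) ~ 0.27, so predict y = 0
print(h(3.0, 3.0))  # theta^T x =  3 -> g(3)  ~ 0.95, so predict y = 1
```

With these parameters the model predicts $y=1$ exactly when $\boldsymbol{\theta}^\top\mathbf{x}\ge0$, i.e. when $x_1+x_2\ge3$, so the decision boundary is the line $x_1+x_2=3$.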