-
Cost Function with Regularization
Which parameters should you penalize, and which should you leave alone? It may be hard to tell: you may not know which features are the most important and which ones to penalize. So, the way regularization is usually implemented is to penalize all of the features, i.e. you penalize all the wj parameters and shrink all of them by adding the regularization term. The […]
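A minimal sketch (my own illustration, not the post's code) of a squared-error cost with an L2 regularization term that shrinks every wj; lambda_ and the function name are assumptions for the example:

```python
import numpy as np

def regularized_cost(X, y, w, b, lambda_=1.0):
    """Squared-error cost plus an L2 penalty on every w_j (b is not penalized)."""
    m = X.shape[0]
    predictions = X @ w + b                          # f(x) = w.x + b for all examples
    squared_error = np.sum((predictions - y) ** 2) / (2 * m)
    reg_term = (lambda_ / (2 * m)) * np.sum(w ** 2)  # shrinks all w_j toward zero
    return squared_error + reg_term
```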
-
The Problem of Overfitting
The algorithm can run into a problem called overfitting, which can cause it to perform poorly. To solve this problem, regularization will help you minimize the overfitting and get the algorithm to work much better. What is overfitting? A model that overfits fits the training set extremely well: the algorithm is trying to fit every single training example. NOT […]
-
Cost function of Logistic Regression
When the cost function is convex, gradient descent can be guaranteed to converge to the global minimum. If we use the mean squared error for logistic regression, the cost function is non-convex, so it is more difficult for gradient descent to find an optimal value for w and b. Linear Regression => squared error cost -> convex. Logistic Regression ==> if we […]
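For reference, a small sketch (an assumed implementation, not taken from the post) of the convex log-loss cost that logistic regression uses instead of squared error:

```python
import numpy as np

def logistic_cost(X, y, w, b):
    """Binary cross-entropy (log loss): convex in w and b, unlike squared error on the sigmoid."""
    m = X.shape[0]
    z = X @ w + b
    f = 1 / (1 + np.exp(-z))                          # sigmoid predictions in (0, 1)
    loss = -y * np.log(f) - (1 - y) * np.log(1 - f)   # per-example log loss
    return np.sum(loss) / m
```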
-
Logistic Regression
Logistic regression is for binary classification problems: y can only be one of two values/two classes/two categories, e.g. yes or no, true or false, 1 or 0. Logistic regression fits an S-shaped curve to the dataset; it uses the sigmoid function. The sigmoid function outputs a value between 0 and 1: g(z) = 1 / (1 + e^-z). 0 < […]
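A one-line sketch of the sigmoid g(z) = 1 / (1 + e^-z) described above (illustrative only):

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + e^-z); the output is always strictly between 0 and 1."""
    return 1 / (1 + np.exp(-z))

# Large negative z -> close to 0, large positive z -> close to 1
print(sigmoid(np.array([-10, 0, 10])))  # approx [0.0000454, 0.5, 0.9999546]
```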
-
Feature Engineering and Polynomial Regression
How can linear regression model complex, even highly non-linear functions using feature engineering? The choice of features can have a huge impact on your learning algorithm’s performance. Feature engineering: using domain knowledge to design new features, by transforming a feature or combining original features. f(x) = w1*x1 + w2*x2 + b, where x1 is the frontage and x2 is the depth; area = […]
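A small sketch following the frontage/depth example (the numbers are made up for illustration), showing an engineered area feature built from the two originals:

```python
import numpy as np

# Hypothetical raw features: x1 = frontage, x2 = depth
frontage = np.array([50.0, 60.0, 80.0])
depth = np.array([100.0, 90.0, 120.0])

# Engineered feature: area = frontage * depth, which may predict price better
area = frontage * depth

# Stack into a design matrix so the model can use x1, x2, and the new x3 = area
X = np.column_stack([frontage, depth, area])
print(X.shape)  # (3, 3)
```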
-
Feature Scaling
A good model is more likely to learn a small parameter value, like 0.1, when the possible range of values of a feature is large, like 2000. In other words, when the possible values of the feature are small, a reasonable value for its parameter will be large. Prediction price […]
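A sketch of z-score normalization, one common way to do feature scaling (an assumption here, since the excerpt is truncated before naming a method):

```python
import numpy as np

def zscore_normalize(X):
    """Rescale each feature (column) to zero mean and unit standard deviation."""
    mu = np.mean(X, axis=0)      # per-feature mean
    sigma = np.std(X, axis=0)    # per-feature standard deviation
    return (X - mu) / sigma, mu, sigma

# Example: one feature in the thousands, another in the 0-5 range
X = np.array([[2104, 5], [1416, 3], [852, 2]], dtype=float)
X_norm, mu, sigma = zscore_normalize(X)
print(X_norm)  # both columns now on a comparable scale
```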
-
Linear Regression with Multiple Features
The screenshot below is taken from https://www.coursera.org/learn/machine-learning/home/week/2. Feature scaling enables gradient descent to run much faster.
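A minimal sketch (not from the course materials; the weights are hypothetical) of a prediction with multiple features, f(x) = w·x + b:

```python
import numpy as np

# Hypothetical weights for features such as size, bedrooms, floors, and age
w = np.array([0.39, 18.75, -53.36, -26.42])
b = 785.18
x = np.array([2104, 5, 1, 45], dtype=float)  # one example's feature vector

price = np.dot(w, x) + b  # f(x) = w.x + b
print(price)
```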
-
Vectorization
Why vectorization? Vector dot product: the dot product multiplies the values in two vectors element-wise and then sums the result. The vector dot product requires the two vectors to have the same dimensions. Dot product: a.w = aT@w = np.matmul(aT, w), where matmul = matrix multiplication. Loop version duration: 0.0162 ms; one step at a […]
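A sketch (my own reconstruction, with arbitrary vector sizes) of the loop version versus the vectorized dot product that the timing comparison refers to:

```python
import numpy as np
import time

a = np.random.rand(1_000_000)
w = np.random.rand(1_000_000)

# Loop version: one multiply-and-add step at a time
start = time.time()
total = 0.0
for i in range(a.shape[0]):
    total += a[i] * w[i]
loop_ms = (time.time() - start) * 1000

# Vectorized version: NumPy computes the whole dot product in optimized C code
start = time.time()
total_vec = np.dot(a, w)
vec_ms = (time.time() - start) * 1000

print(f"loop: {loop_ms:.4f} ms, vectorized: {vec_ms:.4f} ms")
```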