Feature Engineering
Feature Engineering in Linear Models One-Hot Encoding Definition One-hot encoding is a widely used method to represent categorical variables as numerical vectors. A categorical variable with...
Bias, Variance, and Mean Squared Error (MSE) 1. Definitions Bias Bias is the systematic error of an estimator. \[\mathrm{Bias}(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta\] Variance V...
Statistical Properties Model Assumption Assume the linear model: \[Y = X\beta + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma^2 I)\] Then: \[Y \mid X \sim \mathcal{N}(X\beta, \sigma^2 I...
MLE for Linear Regression - Equivalence to Least Squares Model Assume a linear model with Gaussian noise: \[y_i = \beta^T x_i + \epsilon_i, \quad \epsilon_i \sim \mathcal{N}(0,\sigma^2)\] The...
Maximum Likelihood Estimation (MLE) - Bernoulli & Gaussian MLE Example: Bernoulli Distribution Bernoulli Model For a Bernoulli random variable \(x \in \{0,1\}\): \[p_\theta(x) = \theta^x...
Likelihood, Log-Likelihood, and Maximum Likelihood Estimation (MLE) Likelihood Definition Likelihood is the joint probability (or probability density) of the observed data, viewed as a functi...
Linear Regression, Bias-Variance, and Model Interpretation Bias-Variance Decomposition \[\underbrace{\mathbb{E}\big[\, y_0 - \hat{f}(x_0) \,\big]^2}_{\text{Expected squared prediction error}...
Supervised Learning - Model, Error, and Trade-offs Supervised Learning Workflow Goal: Learn a mapping from input \(x\) to output \(y\) using labeled data. Step-by-step Step 1. Model Desig...
Machine Learning & AI Overview Intelligence Ability to perceive or infer information and apply it to adaptive behavior. Statistical Machine Learning Machine Learning is a fiel...
How to check Model Evaluation Choosing the Optimal Model How to Choose a Model Question Which model is the best one? Common answers: Smallest RSS Largest \(R^2\) Observation Th...