I am confused about the bias and variance of a function approximator. How can one tell whether a function is overfitting or underfitting, and how can you write formulas that express that? In machine learning, if an approximator gets good results on the training data but worse results on unseen data, it overfits (it learns the individual points of the data); if it does terribly even on the training set and its loss (MSE) is high, it underfits. Can you give me formulas or concepts that would help me understand bias and variance in statistics?
difference between bias vs variance

104 Views · Asked by user66906 (https://math.techqa.club/user/user66906/detail) at 2026-03-31 14:31:19
The simple explanation is that you are trying to predict some out-of-sample value $\theta$ with an estimator $\hat \Theta$, and your loss will be proportional to $(\hat \Theta - \theta)^2$, so you should aim to minimise $\mathbb E\left[\left(\hat \Theta - \theta\right)^2\right]$.
You can rewrite this as a sum: $\mathbb E\left[\left(\hat \Theta - \theta\right)^2\right] = \left(\mathbb E\left[\hat \Theta - \theta\right]\right)^2 +\mathbb E\left[\left(\hat \Theta - \mathbb E\left[\hat \Theta\right] \right)^2\right]$, where the first term is the square of the bias of $\hat \Theta$ and the second term is its variance.
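This decomposition is easy to check numerically. Here is a minimal simulation sketch (the true $\theta$, the Gaussian sample, and the deliberately shrunk sample mean used as the estimator are all assumptions chosen for illustration): the empirical MSE over many repeated samples equals the squared empirical bias plus the empirical variance, exactly, by the algebra above.

```python
import random

random.seed(0)

theta = 2.0    # true parameter (assumed for this simulation)
n = 20         # sample size per trial
shrink = 0.8   # deliberately biased estimator: shrink * sample mean
trials = 20000

estimates = []
for _ in range(trials):
    sample = [random.gauss(theta, 1.0) for _ in range(n)]
    estimates.append(shrink * sum(sample) / n)

mean_est = sum(estimates) / trials
mse = sum((e - theta) ** 2 for e in estimates) / trials
bias_sq = (mean_est - theta) ** 2
variance = sum((e - mean_est) ** 2 for e in estimates) / trials

print(f"MSE          = {mse:.4f}")
print(f"bias^2 + var = {bias_sq + variance:.4f}")  # identical to MSE up to float rounding
```

Note that the cross term vanishes identically when you centre on the empirical mean, so the two printed numbers agree to machine precision, not just approximately.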
This illustrates that you should not just try to make your estimator unbiased, nor just try to minimise its variance, but take both into account at the same time. Your machine learning model can be tuned to attempt this by methods such as cross-validation on your training set, and can do so without considering either the bias or the variance explicitly, by concentrating directly on $\mathbb E\left[\left(\hat \Theta - \theta\right)^2\right]$.
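A sketch of what that tuning looks like in practice, under assumed synthetic data (a quadratic truth plus noise) and using polynomial degree as the complexity knob: k-fold cross-validation estimates out-of-sample squared error directly, and the selected degree trades off underfitting (low degree, high bias) against overfitting (high degree, high variance) without ever computing either term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: quadratic truth plus Gaussian noise
x = rng.uniform(-3, 3, size=60)
y = 1.0 + 0.5 * x - 0.7 * x**2 + rng.normal(0.0, 1.0, size=60)

def cv_mse(degree, k=5):
    """Estimate out-of-sample MSE of a degree-`degree` polynomial via k-fold CV."""
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train], y[train], degree)   # fit on k-1 folds
        pred = np.polyval(coeffs, x[test])                # score on held-out fold
        errs.append(np.mean((pred - y[test]) ** 2))
    return float(np.mean(errs))

scores = {d: cv_mse(d) for d in range(1, 9)}
best = min(scores, key=scores.get)
print("CV MSE by degree:", {d: round(s, 3) for d, s in scores.items()})
print("selected degree:", best)  # degree 1 underfits; high degrees overfit mildly
```

The degree-1 model's CV error is dominated by bias (it cannot represent the $x^2$ term), while very high degrees pay in variance; the minimiser sits in between.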
As an illustration both of taking bias and variance into account at the same time, and of the fact that this is a broader question than over- or under-fitting: if you are trying to estimate the variance of a normally distributed random variable of unknown mean and variance, the estimator $\hat \sigma^2_{n-1} = \frac1{n-1} \sum (x_i-\bar x)^2$ has the merit of being unbiased and of having the smallest variance among all unbiased estimators. But it does not minimise $\mathbb E[(\hat \sigma^2 - \sigma^2)^2]$; on that criterion the best estimator is $\hat \sigma^2_{n+1} = \frac1{n+1} \sum (x_i-\bar x)^2$, even though this is biased downwards with $\mathbb E[\hat \sigma^2_{n+1} - \sigma^2] = -\frac{2\sigma^2}{n+1}$. Most machine learning problems are more complicated than this and do not lend themselves to such simple analysis, but the idea of finding the model that minimises out-of-sample error is the same.
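This claim about the two variance estimators can also be checked by simulation. A minimal sketch, assuming a standard normal population and a small sample size chosen for illustration: the $\frac1{n+1}$ estimator accepts a little downward bias in exchange for a larger reduction in variance, and comes out ahead on mean squared error.

```python
import random

random.seed(1)

mu, sigma2 = 0.0, 1.0   # true mean and variance (assumed)
n = 10                  # sample size per trial
trials = 50000

err_unbiased, err_shrunk = 0.0, 0.0
for _ in range(trials):
    xs = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    err_unbiased += (ss / (n - 1) - sigma2) ** 2  # classic unbiased estimator
    err_shrunk += (ss / (n + 1) - sigma2) ** 2    # biased, but lower MSE

print("MSE of 1/(n-1) estimator:", err_unbiased / trials)
print("MSE of 1/(n+1) estimator:", err_shrunk / trials)
```

The theoretical values here are $\frac{2\sigma^4}{n-1}$ and $\frac{2\sigma^4}{n+1}$ respectively, and the simulated averages land close to them.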