When we validate a statistical model, should we only consider the traditional Goodness-Of-Fit measures? Are there other ways in which the model should be validated? Should the validation be application specific or purely statistical? Do we have any examples that we can share with the class?
2026-04-08 05:47:51.1775627271
GOF measures in a model
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
Your question is not very specific. Here are three very different situations in which goodness-of-fit plays a central role. Maybe one or more of them are suitable for the level of your class.
One-way ANOVA with three levels of the factor. The model is
$$Y_{ij} = \mu + \alpha_i + e_{ij},$$
where the group effects $\alpha_i$ have $\sum_i \alpha_i = 0$ and $e_{ij} \stackrel{iid}{\sim} \mathsf{Norm}(0, \sigma),$ so that all groups have the same population SD $\sigma.$
It is customary to check for equal variances and normality. The check for normality must be done on the residuals, or on the three levels separately. (The mixture distribution of the three levels will not be normal unless all group population means are equal.) Analysis from Minitab 17 statistical software.
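To see why the normality check belongs on the residuals rather than on the pooled raw data, here is a minimal Python sketch. It is not the Minitab analysis from this answer: the group means, the common SD, the seed, and the group size of 10 are all made-up values for illustration.

```python
import random
import statistics

random.seed(2024)  # arbitrary seed, only for reproducibility

# Hypothetical setup (not the answer's data): 3 groups of 10 observations,
# different population means, common population SD sigma = 2.
group_means = [10.0, 15.0, 20.0]
sigma = 2.0
groups = [[random.gauss(m, sigma) for _ in range(10)] for m in group_means]

# Residual = observation minus the sample mean of its own group.
residuals = [y - statistics.mean(g) for g in groups for y in g]

# Pooling the raw data mixes three differently centered distributions,
# so its spread reflects the group differences as well as the noise.
pooled_raw = [y for g in groups for y in g]
print("SD of pooled raw data:", round(statistics.stdev(pooled_raw), 2))
print("SD of residuals:      ", round(statistics.stdev(residuals), 2))
```

The residuals have the group effects removed, so they are the quantities that the model says should look like a single normal sample.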
[Figure: Minitab residual plots in four panels, (a) through (d).]
(a) The normal probability plot of residuals is roughly linear, suggesting approximately normal residuals. (b) Plot of residuals vs. fits shows about the same spread for each group, suggesting equal variances. (c) The histogram of residuals is of limited use for only 30 observations. (d) The plot of residuals in time order shows no trend or 'clumpiness' by group (1-10, 11-20, 21-30), suggesting independence of individual data values; this plot would be useless if data were sorted, which is often the case for textbook displays of data.
A formal (Bartlett) test for equal variances among the groups shows no evidence of unequal group variances.
(Levene's test is best for clearly nonnormal data; not relevant here.)
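For readers without Minitab, Bartlett's statistic can be computed by hand. The sketch below uses only the Python standard library and simulated data (the means, SD, and seed are assumptions, not the data analyzed above). The function name `bartlett_statistic` is a hypothetical helper, not a library call. With three groups the reference distribution is chi-squared with 2 degrees of freedom, whose right tail is simply $e^{-T/2}$.

```python
import math
import random
import statistics

def bartlett_statistic(groups):
    """Bartlett's statistic for testing equal variances across groups."""
    k = len(groups)
    n = [len(g) for g in groups]
    N = sum(n)
    s2 = [statistics.variance(g) for g in groups]  # sample variances
    # Pooled variance, weighted by each group's degrees of freedom.
    sp2 = sum((ni - 1) * vi for ni, vi in zip(n, s2)) / (N - k)
    numerator = (N - k) * math.log(sp2) - sum(
        (ni - 1) * math.log(vi) for ni, vi in zip(n, s2)
    )
    correction = 1 + (sum(1 / (ni - 1) for ni in n) - 1 / (N - k)) / (3 * (k - 1))
    return numerator / correction

random.seed(7)  # arbitrary seed
# Hypothetical data: three groups of 10 with a common SD of 2.
groups = [[random.gauss(mu, 2.0) for _ in range(10)] for mu in (10.0, 15.0, 20.0)]

T = bartlett_statistic(groups)
# Under H0, T is approximately chi-squared with k - 1 = 2 df;
# for 2 df the right-tail probability is exp(-T / 2).
p_value = math.exp(-T / 2)
print(f"Bartlett T = {T:.3f}, approximate P-value = {p_value:.3f}")
```

The closed-form P-value applies only to the three-group case; other numbers of groups need a general chi-squared tail routine.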
Confession: Repeating the normal probability plot for residuals below, with a formal Anderson-Darling test for normality, we see a P-value < 0.05, indicating some departure from normality. This has to be a Type I Error because the data are fake data generated to be normal.
Fairness of a Die. A die is rolled 600 times with the following results.
Note that faces 1, 2, 3 appeared relatively more often than did 4, 5, 6. Is this evidence of unfairness?
Under the null hypothesis that the die is fair, we expect counts $E = 100$ for each face. Observed counts are $X = (145, 142, 128, 61, 58, 66).$
The chi-squared goodness-of-fit (GOF) test has test statistic
$$Q = \sum_{i=1}^6 \frac{(X_i - E)^2}{E} = 90.14.$$
Under the null hypothesis, $Q \stackrel{aprx}{\sim} \mathsf{Chisq}(df=5).$ The P-value of the test (essentially 0) is the probability in the right-hand tail of this distribution beyond 90.14. So there is strong evidence the die is unfair. This is not surprising because the data were simulated for a die for which faces 1, 2, 3 are twice as likely (probability 2/9 each) as faces 4, 5, 6 (1/9 each).
By contrast, if we had rolled the die only 60 times with face counts $X = (14,14,13,6,6,7),$ proportionately about the same as before, we would have $Q = 8.2,$ P-value 0.146 (> 0.05), and so no solid evidence of unfairness. There is simply not enough information in 60 rolls of the die to detect its considerable degree of unfairness.
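Both computations can be reproduced with a short Python sketch using the counts given above. The helper names are hypothetical, and the tail-probability formula is a closed form that holds only for 5 degrees of freedom; it is used here to avoid depending on a statistics library.

```python
import math

def chisq_gof_stat(observed, expected):
    """Chi-squared goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chisq_sf_df5(q):
    """Right-tail probability P(Q > q) for chi-squared with 5 df.

    Closed form valid for df = 5 only.
    """
    return (math.erfc(math.sqrt(q / 2))
            + math.sqrt(2 * q / math.pi) * math.exp(-q / 2) * (1 + q / 3))

# Observed face counts from the text: 600 rolls, then 60 rolls.
for counts in [(145, 142, 128, 61, 58, 66), (14, 14, 13, 6, 6, 7)]:
    n = sum(counts)
    expected = [n / 6] * 6  # fair die: equal expected count per face
    q = chisq_gof_stat(counts, expected)
    print(f"n = {n}: Q = {q:.2f}, P-value = {chisq_sf_df5(q):.3g}")
```

The two data sets have nearly the same proportions, yet $Q$ scales with the number of rolls, which is why only the larger experiment rejects fairness.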
Note that respective bar charts of the data for 600 and 60 rolls would look almost identical. So bar charts alone are hardly a guide to goodness-of-fit; formal GOF tests are required before drawing conclusions from bar charts.
Item Matching Problem. There are 12 letters and 12 properly matching envelopes on a desk. A weary administrative assistant thinks these are left-over mass mail and stuffs letters into envelopes at random. If $X$ is the number of letters randomly put into their proper envelopes, what are $E(X)$ and $SD(X)?$ It is easy to show that $E(X) = 1$ and possible to show that $SD(X) = 1.$ The equality $E(X) = V(X) = 1$ raises the possibility that $X \stackrel {aprx}{\sim} \mathsf{Pois}(\lambda = 1).$ [Poisson is one of the few distribution families for which mean and variance are numerically equal.]
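One standard way to get both moments is with indicator variables. The notation $I_i$ and $n$ is introduced here for the sketch and is not in the original answer. Let $n = 12$ and let $I_i = 1$ if letter $i$ lands in its proper envelope, and $0$ otherwise, so that $X = \sum_{i=1}^n I_i.$ Then $E(I_i) = 1/n$ and, for $i \ne j,$ $E(I_i I_j) = \frac{1}{n(n-1)},$ which gives

$$E(X) = \sum_{i=1}^n E(I_i) = n \cdot \frac{1}{n} = 1,$$

$$V(X) = \sum_{i=1}^n V(I_i) + \sum_{i \ne j} \mathrm{Cov}(I_i, I_j) = n \cdot \frac{1}{n}\left(1 - \frac{1}{n}\right) + n(n-1)\left[\frac{1}{n(n-1)} - \frac{1}{n^2}\right] = \left(1 - \frac{1}{n}\right) + \frac{1}{n} = 1.$$

Neither result depends on $n$ (for $n \ge 2$), so the same mean and variance hold for any number of letters.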
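The fit to Poisson cannot be perfect: it is clear that $P(X = 11) = 0$ (if 11 letters are in their proper envelopes, the twelfth must be too) and that $P(X > 12) = 0,$ whereas a Poisson distribution puts positive probability on every nonnegative integer. But for some practical purposes $\mathsf{Pois}(1)$ is a useful approximate fit to the random envelope-matching model.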
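The quality of the fit can also be checked exactly, without simulation. The Python sketch below is an addition, not part of the original answer. It counts permutations with exactly $k$ fixed points using derangement numbers and compares the result with $\mathsf{Pois}(1)$ probabilities. The helper names are hypothetical.

```python
import math

def derangements(m):
    """Number of permutations of m items with no fixed point."""
    if m == 0:
        return 1
    d_prev, d = 1, 0  # D(0) = 1, D(1) = 0
    for i in range(2, m + 1):
        # Recurrence: D(i) = (i - 1) * (D(i - 1) + D(i - 2))
        d_prev, d = d, (i - 1) * (d + d_prev)
    return d

def match_pmf(k, n=12):
    """Exact P(X = k): choose the k matched letters, derange the other n - k."""
    return math.comb(n, k) * derangements(n - k) / math.factorial(n)

def poisson1_pmf(k):
    """Poisson probability P(Y = k) for rate lambda = 1."""
    return math.exp(-1) / math.factorial(k)

print(" k   exact      Pois(1)")
for k in range(13):
    print(f"{k:2d}   {match_pmf(k):.6f}   {poisson1_pmf(k):.6f}")
```

The two columns agree closely for small $k$ and differ in the far right tail, where the exact probability of 11 matches is zero.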
The R code below simulates a million of these 12-letter experiments and makes a histogram of the approximate probabilities for each value of $X.$ The dots show exact Poisson probabilities.