I am struggling with the problem of computing the global minima of neural networks in natural language processing. The first method I tried was to find the global minimum using convexity properties. However, as you know, the categorical cross-entropy loss of a neural network is nonconvex in the weights, so I could not get any further. I am now looking for another method. Any ideas would be very much appreciated.
2026-03-27 15:55:05
Deep Learning Algorithm Global Minimum
40 views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
There is 1 best solution below
The loss function of a neural network is non-convex in the weights, so finding a global minimum is very hard. In fact, even verifying that a minimum you have reached is the global one is hard. This is one of the major problems in deep learning. None of the following methods guarantees a global minimum, but you can try them and see whether you get lucky.
Try several different initialisations and keep the run with the lowest loss.
Use adaptive optimisation methods such as Adam or Adagrad.
Use autoencoders or RBMs (restricted Boltzmann machines) to pre-train the weights as an initialisation.
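
To make the first two suggestions concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the one-dimensional function `loss` is a toy stand-in for a real network loss (it has one global and one local minimum), the Adam update is written out by hand with the usual default-style hyperparameters, and the names `adam_minimize` and `multi_start` are made up for this example. It shows that individual runs can get stuck in the worse minimum, and that keeping the best of several random starts helps without guaranteeing anything.

```python
import math
import random


def loss(x):
    # Toy non-convex "loss" (assumption: stands in for a network loss).
    # It has a global minimum near x = -1.30 and a local one near x = 1.13.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of the toy loss.
    return 4 * x**3 - 6 * x + 1


def adam_minimize(x0, steps=2000, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    # Hand-written Adam update for a single scalar parameter.
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1**t)             # bias correction
        v_hat = v / (1 - beta2**t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def multi_start(n_starts=20, seed=0):
    # Run Adam from several random initial points and keep the best result.
    rng = random.Random(seed)
    results = [adam_minimize(rng.uniform(-2.5, 2.5)) for _ in range(n_starts)]
    return min(results, key=loss), results


best_x, all_x = multi_start()
print("best x:", best_x, "loss:", loss(best_x))
print("distinct end points:", sorted({round(x, 1) for x in all_x}))
```

For an actual network the same pattern applies with the framework's own optimiser and initialisers: train from several seeds and compare the final losses. Even then, the best run is only the best minimum found, not a certified global one.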