This wiki page says
The empirical distribution function is an estimate of the cumulative distribution function that generated the points in the sample.
and gives this formula
${\displaystyle {\widehat {F}}_{n}(t)={\frac {{\mbox{number of elements in the sample}}\leq t}{n}}={\frac {1}{n}}\sum _{i=1}^{n}\mathbf {1} _{X_{i}\leq t},}$
Another wiki page says
kernel density estimation (KDE) is a non-parametric way to estimate the probability density function of a random variable
and gives this formula
${\displaystyle {\widehat {f}}_{h}(x)={\frac {1}{n}}\sum _{i=1}^{n}K_{h}(x-x_{i})={\frac {1}{nh}}\sum _{i=1}^{n}K{\Big (}{\frac {x-x_{i}}{h}}{\Big )},}$
This post says
the pdf is the first derivative of the cdf for a continuous random variable
question
Is there a connection between kernel density estimation and the empirical distribution function, for example that the former is the derivative of the latter for a continuous random variable? If so, what is the derivation?
Not precisely.
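One way to see what does and does not hold, offered as a sketch. The symbols $\mathcal{K}$ and $\widetilde{F}_h$ are introduced here and are not in the quoted formulas. The ECDF $\widehat{F}_n$ is a step function, so its ordinary derivative is $0$ between data points and undefined at them; it cannot equal a smooth density estimate. If instead each jump is smoothed with the kernel's own CDF, $\mathcal{K}(u)=\int_{-\infty}^{u}K(v)\,dv$, then differentiating term by term gives exactly the KDE:

$$\widetilde{F}_h(x)=\frac{1}{n}\sum_{i=1}^{n}\mathcal{K}\Big(\frac{x-x_i}{h}\Big),\qquad \frac{d}{dx}\widetilde{F}_h(x)=\frac{1}{nh}\sum_{i=1}^{n}K\Big(\frac{x-x_i}{h}\Big)=\widehat{f}_h(x).$$

As $h\to 0$, $\widetilde{F}_h(x)\to\widehat{F}_n(x)$ at every $x$ that is not a data point. So the KDE is the derivative of a smoothed version of the ECDF, not of the ECDF itself.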
About histograms, KDEs and ECDFs.
(1) Roughly speaking, a histogram (on a density scale, so that the areas of the bars sum to unity) can be viewed as an estimate of the density function. A KDE is a more sophisticated method of density estimation. Generally speaking, one cannot reconstruct the exact values of the data from either a histogram or a KDE.
(2) By contrast an empirical CDF (ECDF) retains exact information about all of the data. An ECDF is made as follows: (a) sort the data from smallest to largest, (b) make a stair-step function that begins at 0 below the minimum and increases by $1/n$ at each data value, where $n$ is the sample size. If $k$ values are tied then the increase is $k/n$ at the tied value.
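The construction in (a) and (b) can be sketched in a few lines of Python (used here only for illustration; the function name `ecdf` and the small data list are made up for the example). Counting the sorted values that are $\le t$ gives the jump of $k/n$ at tied values automatically.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return the step function t -> (number of sample values <= t) / n."""
    xs = sorted(sample)              # step (a): sort from smallest to largest
    n = len(xs)

    def F(t):
        # bisect_right counts the values <= t, so k tied values
        # produce a single jump of size k/n at that value (step (b)).
        return bisect_right(xs, t) / n

    return F

data = [3.1, 1.4, 2.7, 1.4, 5.0]     # illustrative values, with one tie at 1.4
F = ecdf(data)
print([F(t) for t in (1.0, 1.4, 2.7, 3.0, 5.0)])
# prints [0.0, 0.4, 0.6, 0.6, 1.0]
```

The jump from 0 to 0.4 at 1.4 is the tied case with $k = 2$ and $n = 5$.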
Thus the ECDF approximates the CDF of the distribution, with increasingly accurate approximations for samples of increasing size. Generally speaking an ECDF gives a better approximation to the population CDF than a histogram gives for the density function. (Information is lost in binning data to make a histogram.)
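To see the improvement with sample size numerically, here is a rough sketch that measures the largest vertical gap between the ECDF and the population CDF for the Gamma distribution used in the illustrations below. The helper names and the seed are arbitrary choices for this example, and the closed-form CDF relies on the shape parameter being an integer.

```python
import math
import random

def gamma_cdf_int_shape(x, shape, rate):
    """CDF of Gamma(shape, rate) when shape is a positive integer (Erlang form)."""
    if x <= 0:
        return 0.0
    z = rate * x
    term, total = 1.0, 1.0           # k = 0 term of sum_{k<shape} z^k / k!
    for k in range(1, shape):
        term *= z / k
        total += term
    return 1.0 - math.exp(-z) * total

def max_ecdf_error(sample, cdf):
    """Largest vertical gap between the ECDF of sample and cdf (assumes no ties)."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        Fx = cdf(x)
        # The ECDF equals i/n just below x and (i + 1)/n at x.
        gap = max(gap, abs(Fx - i / n), abs((i + 1) / n - Fx))
    return gap

rng = random.Random(2024)            # arbitrary seed, for reproducibility only
for n in (20, 100, 1000, 10000):
    # random.gammavariate takes shape and SCALE, and scale = 1/rate = 6.
    sample = [rng.gammavariate(5, 6) for _ in range(n)]
    d = max_ecdf_error(sample, lambda x: gamma_cdf_int_shape(x, 5, 1 / 6))
    print(n, round(d, 4))
```

The printed gaps vary from run to run with the seed, but they should tend to shrink as $n$ grows.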
[By suitable manipulation (a kind of numerical integration), information in a KDE could be used to make a function that imitates the population CDF, but it does not use the actual data values. In my experience, this is rarely done.]
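As a sketch of what such a manipulation might look like, the code below builds a Gaussian KDE by hand and integrates it with the trapezoid rule to get a smooth, CDF-like curve. The data values and the bandwidth `h = 4.0` are made up for illustration, and the helper names are hypothetical.

```python
import math

def gaussian_kde(sample, h):
    """Return the KDE x -> (1/(n h)) * sum K((x - x_i)/h) with a Gaussian kernel K."""
    n = len(sample)
    c = 1.0 / (n * h * math.sqrt(2 * math.pi))

    def f(x):
        return c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return f

def cumulative_trapezoid(f, lo, hi, steps):
    """Return grid points and the running trapezoid integral of f starting at lo."""
    dx = (hi - lo) / steps
    xs = [lo + j * dx for j in range(steps + 1)]
    ys = [f(x) for x in xs]
    running = [0.0]
    for j in range(1, steps + 1):
        running.append(running[-1] + 0.5 * (ys[j - 1] + ys[j]) * dx)
    return xs, running

data = [12.0, 18.5, 22.1, 25.3, 30.7, 41.2]   # illustrative values only
h = 4.0                                        # bandwidth picked by hand
f_hat = gaussian_kde(data, h)

# Integrate from well below the minimum to well above the maximum.
xs, cdf = cumulative_trapezoid(f_hat, min(data) - 5 * h, max(data) + 5 * h, 2000)
print(round(cdf[0], 4), round(cdf[len(cdf) // 2], 4), round(cdf[-1], 4))
```

The resulting curve rises smoothly from about 0 to about 1 instead of jumping at the data values, which is the main visible difference from an ECDF.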
Graphical illustrations.
(1) A sample of size $n = 100$ from $$\mathsf{Gamma}(\text{shape} = \alpha = 5,\,\text{rate} = \lambda = 1/6)$$ is simulated. The figure shows a density histogram (blue bars), the default KDE from R statistical software (red curve), and the population density function (black).
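The figure itself was made in R. The following is only a rough Python sketch of the same three ingredients (density histogram, Gaussian KDE, population density), without plotting. The bandwidth function is meant to mirror the rule of thumb behind R's default (`bw.nrd0`), but treat that correspondence as an assumption; the seed, the bin count, and the helper names are arbitrary.

```python
import math
import random
import statistics

ALPHA, RATE = 5, 1 / 6

def gamma_pdf(x, shape=ALPHA, rate=RATE):
    """Population density of Gamma(shape, rate)."""
    if x <= 0:
        return 0.0
    return rate ** shape * x ** (shape - 1) * math.exp(-rate * x) / math.gamma(shape)

def density_histogram(sample, n_bins):
    """Return (left edge, bin width, bar heights) with total bar area equal to 1."""
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in sample:
        j = min(int((x - lo) / width), n_bins - 1)   # the maximum goes in the last bin
        counts[j] += 1
    n = len(sample)
    return lo, width, [c / (n * width) for c in counts]

def rule_of_thumb_bandwidth(sample):
    """0.9 * min(sd, IQR / 1.34) * n^(-1/5); assumed to match R's bw.nrd0."""
    n = len(sample)
    sd = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return 0.9 * min(sd, (q3 - q1) / 1.34) * n ** (-0.2)

def gaussian_kde(sample, h):
    """Return the Gaussian-kernel KDE with bandwidth h."""
    c = 1.0 / (len(sample) * h * math.sqrt(2 * math.pi))
    return lambda x: c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

rng = random.Random(1)                               # arbitrary seed
# random.gammavariate takes shape and SCALE, so pass 1 / RATE.
sample = [rng.gammavariate(ALPHA, 1 / RATE) for _ in range(100)]

lo, width, heights = density_histogram(sample, 10)
h = rule_of_thumb_bandwidth(sample)
f_hat = gaussian_kde(sample, h)

# Compare the KDE with the population density at a few points.
for x in (10, 20, 30, 40, 50, 60):
    print(x, round(f_hat(x), 4), round(gamma_pdf(x), 4))
```

Feeding `heights`, `f_hat`, and `gamma_pdf` to any plotting library should give a picture similar in spirit to the one described, though not identical, since the sample differs.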
(2) Sampling from the same distribution, we show the ECDF for a sample of size $n = 20,$ so that the steps are easy to see.