Suppose I have an Erdős–Rényi graph ${\cal G}(n,p)$, where $n$ is the total number of nodes and $p$ is the probability of an edge between any pair of nodes (edges are added independently). I subsample the graph by sampling $m$ nodes from the graph with probability proportional to the node degrees. Specifically, the probability of sampling node $i$ is $q_i = \frac{d_i}{\sum_{j=1}^n d_j}$, where $d_i, i=1, \ldots, n$ is the node degree for node $i$. We obtain a subgraph ${\cal G}_s$ consisting of the sampled nodes and whatever edges exist between them in the original graph. Can one say something concrete about the underlying distribution of such subgraphs, i.e., subgraphs obtained from by subsampling Erdős–Rényi graphs through the described subsampling process? I presume one may be able to write down an expression for $p_s$ (the probability that there exists an edge between a pair of nodes in the subgraph), but are the subgraphs still Erdős–Rényi? I believe the independence of edges between the nodes may be affected. If they are no longer Erdős–Rényi, then what is the underlying probabilistic model of such graphs? This seems like a natural question, but has anyone investigated similar questions before? Any pointers or references would help.
2026-02-23 10:23:38.1771842218
Probabilistic subsampling of an Erdős–Rényi graph
296 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
1
There are 1 best solutions below
Related Questions in PROBABILITY
- How to prove $\lim_{n \rightarrow\infty} e^{-n}\sum_{k=0}^{n}\frac{n^k}{k!} = \frac{1}{2}$?
- Is this a commonly known paradox?
- What's $P(A_1\cap A_2\cap A_3\cap A_4) $?
- Prove or disprove the following inequality
- Another application of the Central Limit Theorem
- Given is $2$ dimensional random variable $(X,Y)$ with table. Determine the correlation between $X$ and $Y$
- A random point $(a,b)$ is uniformly distributed in a unit square $K=[(u,v):0<u<1,0<v<1]$
- proving Kochen-Stone lemma...
- Solution Check. (Probability)
- Interpreting stationary distribution $P_{\infty}(X,V)$ of a random process
Related Questions in SAMPLING
- Defintion Ideally sampled image
- What is expected value of a sample mean?
- Why does k-fold cross validation generate an MSE estimator that has higher bias, but lower variance then leave-one-out cross-validation?
- Sampling Question
- Limit of chi square distribution.
- Sampling Distribution of Difference of Normal Random Variables
- Sampling Distribution and Chi Squared Random Variables
- Variance of $S^2$ taken from Normal Distribution
- Sample Variance Definition Disparity
- [data generating process]-[sampling from an infinite population]-[i.i.d.]: some clarifications
Related Questions in RANDOM-GRAPHS
- Bound degrees of sparse random graphs
- Connectivity of random graphs - proof $\frac{logn}{n}$ is threshold
- In weighed random graph where the edge weight is restricted to $[0,1]$, what are the usual assumptions of edge weight distribution?
- Upper Bound on Vertices in SCC Graph of Directed Random Graph
- The degree of a vertex in $G(n,m)$ is approx. Poisson
- What is the expected length of the diameter of a special random graph?
- Clique numbers and Theorem 4.5.1 in "The Probabilistic Method" by Alon and Spencer
- Expected global clustering coefficient for Erdős–Rényi graph
- Probability of having a path of a given length in a random graph?
- Correlation for random graph (Erdos-Renyi)
Related Questions in BOOTSTRAP-SAMPLING
- Bootstrap estimate of correlation
- Nonparametric Boostrap Confidence Interval For $\text{Var }(\overline X)$
- Probabilistic subsampling of an Erdős–Rényi graph
- Why does bootstrapping approach the distribution of estimator, not mean of the estimator with normal distribution?
- Expected number of tries to choose x unique values
- Sampling distribution of a functional T
- Is it possible to cluster particles and then resample each cluster?
- Consider why does bootstrap sampling distribution need to compute the estimate, If we knew the true parameters $\theta^∗$?
- Apply formula on random sampling without replacement, but with replacement between each iterations
- Apply function on two set to create a new (+ process validation: bootstrap iterations)
Trending Questions
- Induction on the number of equations
- How to convince a math teacher of this simple and obvious fact?
- Find $E[XY|Y+Z=1 ]$
- Refuting the Anti-Cantor Cranks
- What are imaginary numbers?
- Determine the adjoint of $\tilde Q(x)$ for $\tilde Q(x)u:=(Qu)(x)$ where $Q:U→L^2(Ω,ℝ^d$ is a Hilbert-Schmidt operator and $U$ is a Hilbert space
- Why does this innovative method of subtraction from a third grader always work?
- How do we know that the number $1$ is not equal to the number $-1$?
- What are the Implications of having VΩ as a model for a theory?
- Defining a Galois Field based on primitive element versus polynomial?
- Can't find the relationship between two columns of numbers. Please Help
- Is computer science a branch of mathematics?
- Is there a bijection of $\mathbb{R}^n$ with itself such that the forward map is connected but the inverse is not?
- Identification of a quadrilateral as a trapezoid, rectangle, or square
- Generator of inertia group in function field extension
Popular # Hahtags
second-order-logic
numerical-methods
puzzle
logic
probability
number-theory
winding-number
real-analysis
integration
calculus
complex-analysis
sequences-and-series
proof-writing
set-theory
functions
homotopy-theory
elementary-number-theory
ordinary-differential-equations
circles
derivatives
game-theory
definite-integrals
elementary-set-theory
limits
multivariable-calculus
geometry
algebraic-number-theory
proof-verification
partial-derivative
algebra-precalculus
Popular Questions
- What is the integral of 1/x?
- How many squares actually ARE in this picture? Is this a trick question with no right answer?
- Is a matrix multiplied with its transpose something special?
- What is the difference between independent and mutually exclusive events?
- Visually stunning math concepts which are easy to explain
- taylor series of $\ln(1+x)$?
- How to tell if a set of vectors spans a space?
- Calculus question taking derivative to find horizontal tangent line
- How to determine if a function is one-to-one?
- Determine if vectors are linearly independent
- What does it mean to have a determinant equal to zero?
- Is this Batman equation for real?
- How to find perpendicular vector to another vector?
- How to find mean and median from histogram
- How many sides does a circle have?
The distribution of the subgraphs is different from $\mathcal G(n,p)$. One major issue is that you're much more likely to get "clone" vertices.
Consider $p=\frac12$, say. Here, for any $\epsilon>0$, with probability tending to $1$, all degrees are between $(1-\epsilon)\frac n2$ and $(1+\epsilon)\frac n2$. This means that $q_i$ stays between $(1-\epsilon)\frac1n$ and $(1+\epsilon)\frac1n$. So for any pair of vertices in the sampled graph $\mathcal G_s$, the pair of vertices they represent from the original random graph is almost equally likely (within a factor of $(1+\epsilon)^2$) to be any pair. In particular, $p_s$ is between $(1-2\epsilon)\frac12$ and $(1+2\epsilon)\frac12$.
In an $m$-vertex Erdős–Rényi graph with edge probability around $\frac12$, the probability that two vertices $v_i$ and $v_j$ have exactly the same set of adjacent vertices is around $2^{-m}$, and there are almost never any pairs with this property. In $\mathcal G_s$, this probability is at least $\frac1n$: a lower bound on the chance that the two vertices came from the same vertex of the original random graph. By the birthday paradox, if $m \gg \sqrt n$, you are going to get many such pairs.
In the case of sampling without replacement, the graph is very similar to an Erdős–Rényi graph when $p$ is large, by the same argument: we are not too much more likely to choose any vertex than any other. If we chose a random set of vertices $S$, we'd get an Erdős–Rényi graph on $S$, and our actual choice of $S$ is only slightly biased.
When $p$ is small, things look very different. For example, if $p = \frac1n$, then the average degree is $1$, but the highest degree is around $\log n$, and there are about $e^{-1}n$ vertices with degree $0$. So when sampling from this graph, we are going to pick up all or almost all of the vertices with high degrees, and none of the isolated vertices. (Of course, their degrees in $\mathcal G_s$ will be different.)
For an extreme case where this makes a huge difference, suppose we take $m = (1-e^{-1}-\epsilon)n$ for some $\epsilon>0$. Then $\mathcal G_s$ picks up almost all of the edges of the original graph, so it should be similar to a $\mathcal G(m, \frac cm)$ Erdős–Rényi graph, for $c = \frac{1}{1-e^{-1}}$. But compared to that graph, $\mathcal G_s$ has almost no isolated vertices: only $O(\epsilon n)$ of them.
I don't think there is a simple description of this random graph beyond the way you're constructing it (by taking a random graph and subsampling).