I am trying to set up a linear programming problem that induces a subgraph of a complete graph G while trying to minimize the following objective function along with a certain set of constraints:
$$\sum_{\forall j \in V(G)} \sum_{\forall i \in V(G)} w_{i,j}x_{i,j}$$
Where $w_{i,j}$ is the weight of the edge between $i$ and $j$ in G and $x_{i,j}$ is the presence of an edge between them.
Is there a constraint I can add which will make the outputted graph be a connected component? My current crude solution is to find an MST and set those edges to equal 1, but I was wondering if there was a better way of going about this.
Setting the edges in a spanning tree equal to $1$ is not a true solution: it eliminates some valid solutions which are connected, but don't contain the specific spanning tree you found.
A general solution is to add the following constraints: for all $S \subset V(G)$ excluding $\varnothing$ and $V(G)$ itself, require $$ \sum_{i \in S} \sum_{j \notin S} x_{ij} \ge 1. $$ This says "$S$ cannot be a connected component: there is at least one edge between $S$ and the rest of the graph", so when this constraint exists for all $S$, it makes sure that the graph cannot be split into multiple connected components.
The downside is that this is exponentially many constraints. Taking into account that $S$ and $V(G)-S$ give the same constraint, we get $2^{n-1}-1$ constraints for an $n$-vertex graph.
The way to deal with this downside is row generation. First, don't include any constraints that enforce connectivity, and solve the linear program. If the graph you get is not connected (which can be checked efficiently), find a connectivity constraint it violates, and add it to the LP. Then, use the dual simplex method to solve the LP again from the previous optimal solution. Repeat until you get a connected solution.