A spanning tree of a graph is a subgraph that is obtained by deleting only edges of the graph (so it keeps every vertex) and that is also a tree.
Why does one study spanning trees in graph theory?
What are the real-life applications of spanning trees?
We know that the number of spanning trees of a graph is equal to any cofactor of the Laplacian matrix of the graph. Why is one interested in knowing the number of spanning trees of a given graph?
I would like to know enough to motivate an undergraduate or graduate student.
It may be easier to go one step further and talk about minimum weight spanning trees. The first and simplest example is the reason Borůvka invented his algorithm for building a minimum weight spanning tree: he needed to minimize the length of cable used to supply power to a given set of cities. (Here we do not allow any branching points other than the cities themselves; otherwise we get the Steiner tree problem, which is much harder.) The same reasoning applies to other communication lines.
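To make the cable example concrete, here is a minimal sketch of Borůvka's idea in Python: in each round every component picks its cheapest outgoing edge, and all picked edges are added. The function name `boruvka_mst` and the `(weight, u, v)` edge format are my own choices for illustration, and I assume the graph is connected with distinct edge weights.

```python
def boruvka_mst(n, edges):
    """Return (total_weight, tree_edges) of a minimum weight spanning tree.

    n: number of vertices, labelled 0..n-1
    edges: list of (weight, u, v); weights are assumed to be distinct
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, components = [], n
    while components > 1:
        # Find the cheapest edge leaving each current component.
        cheapest = {}
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                if r not in cheapest or w < cheapest[r][0]:
                    cheapest[r] = (w, u, v)
        if not cheapest:
            raise ValueError("graph is not connected")
        # Add the selected edges, skipping any that became internal meanwhile.
        for w, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((w, u, v))
                components -= 1
    return sum(w for w, _, _ in tree), tree


# Four "cities" with five candidate cable routes (made-up lengths).
routes = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
total, cables = boruvka_mst(4, routes)
```

Each round at least halves the number of components, so there are at most about $\log_2 n$ rounds, each scanning all $m$ edges.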
Another application of minimum weight spanning trees is clustering: one possible approach is to build a minimum weight spanning tree and then delete its heaviest edges, so that the remaining components form the clusters.
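One way this might look in code (a sketch, not the only MST-based clustering method): run Kruskal's algorithm on the complete graph of pairwise distances and stop as soon as $k$ components remain, which gives the same result as building the whole tree and removing its $k-1$ heaviest edges. The name `mst_clusters` and the sample points are hypothetical, and I assume $1 \le k \le$ the number of points.

```python
from itertools import combinations
from math import dist


def mst_clusters(points, k):
    """Split points into k clusters by running Kruskal until k components remain."""
    n = len(points)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All pairwise distances, shortest first (complete graph on the points).
    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    components = n
    for _, i, j in edges:
        if components == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1

    # Group point indices by the representative of their component.
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
groups = mst_clusters(points, 2)  # indices of the points in each cluster
```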
You can find more applications of MST here.
As for unweighted spanning trees, they can be viewed as an example of graph sparsification that preserves connectivity. Once you have a spanning tree, finding a path between two vertices becomes faster: it takes $O(n)$ time instead of $O(m)$ in the original graph. A spanning tree is (implicitly) created by both BFS and DFS, so you may count their applications as well. In particular, checking for bipartiteness can be done using a spanning tree.
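Here is a small sketch of both points, assuming a connected graph given as an adjacency dictionary (the function names are mine). BFS yields a spanning tree as parent pointers; the graph is bipartite exactly when no edge joins two vertices whose tree depths have the same parity; and a path between two vertices is found by climbing the tree from both ends.

```python
from collections import deque


def bfs_spanning_tree(adj, root=0):
    """Return (parent, depth) dicts describing a BFS spanning tree."""
    parent, depth = {root: None}, {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return parent, depth


def is_bipartite(adj, root=0):
    """True iff every edge joins vertices of different depth parity."""
    _, depth = bfs_spanning_tree(adj, root)
    return all(depth[u] % 2 != depth[v] % 2 for u in adj for v in adj[u])


def tree_path(parent, depth, u, v):
    """Path from u to v along tree edges, via their lowest common ancestor."""
    left, right = [u], [v]
    while u != v:
        # Move the deeper endpoint (or u on ties) one step towards the root.
        if depth[u] >= depth[v]:
            u = parent[u]
            left.append(u)
        else:
            v = parent[v]
            right.append(v)
    # Both lists end at the common ancestor; keep it only once.
    return left + right[-2::-1]


square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
parent, depth = bfs_spanning_tree(square)
path = tree_path(parent, depth, 2, 3)
```

The path search only follows parent pointers, so it touches at most $n$ vertices regardless of how many edges the original graph has.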
The weighted number of spanning trees is closely connected to electrical networks, and the unweighted number of spanning trees appears when all resistances are equal.
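For experimenting with this, the cofactor formula from the question extends to weights: a cofactor of the weighted Laplacian equals the sum, over all spanning trees, of the product of their edge weights. In the electrical setting I am assuming the weights play the role of conductances (reciprocals of resistances). Below is a rough sketch with exact rational arithmetic; the function name and edge format are my own.

```python
from fractions import Fraction


def weighted_tree_count(n, edges):
    """Sum over spanning trees of the product of edge weights.

    edges: list of (u, v, weight) with vertices labelled 0..n-1.
    With all weights equal to 1 this is the number of spanning trees.
    """
    # Weighted Laplacian: diagonal holds incident weight, off-diagonal -weight.
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in edges:
        w = Fraction(w)
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w

    # Delete the last row and column, then compute the determinant of the
    # remaining minor by Gaussian elimination.
    M = [row[:-1] for row in L[:-1]]
    det = Fraction(1)
    for c in range(n - 1):
        pivot = next((r for r in range(c, n - 1) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # singular minor: no spanning tree exists
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, n - 1):
            factor = M[r][c] / M[c][c]
            for k in range(c, n - 1):
                M[r][k] -= factor * M[c][k]
    return det


# Complete graph K4 with unit weights, and a triangle with weights 1, 2, 3.
k4 = [(u, v, 1) for u in range(4) for v in range(u + 1, 4)]
count_k4 = weighted_tree_count(4, k4)
weighted_triangle = weighted_tree_count(3, [(0, 1, 1), (1, 2, 2), (0, 2, 3)])
```

For $K_4$ this agrees with Cayley's formula $4^{4-2} = 16$, and for the triangle the three spanning trees contribute $1\cdot 2 + 1\cdot 3 + 2\cdot 3 = 11$.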