- Published on
Introduction to Graph Theory - Part 5
- Authors

- Name
- Till Heller
In Part 3, we briefly introduced trees as connected graphs without cycles and observed that they have exactly edges. We promised a rigorous proof — and now it's time to deliver. But trees deserve much more than just that one result. They are among the most fundamental structures in graph theory and computer science, appearing everywhere from network design to data structures to phylogenetics. In this part, we'll explore their properties in depth.
Leaves
Let's start with a question: in a tree, can every vertex have many neighbors, or are some vertices forced to sit at the "boundary"? Think of a real tree — it has branches that eventually end in leaves. The same is true for trees in graph theory.
Definition 21. A leaf of a tree is a vertex with degree .
In the star graph from Part 3, the vertices , , and are all leaves — each connected to vertex by a single edge. In the path graph , the two endpoints are leaves.
It turns out that leaves are not just common in trees — they are guaranteed to exist.

Lemma 2. Every tree with at least two vertices has at least two leaves.
Proof. Let be a tree with vertices. Consider a longest path in , say . Since has at least two vertices and is connected, this path has at least one edge, so .
We claim that is a leaf. Suppose for contradiction that . Then has a neighbor different from . If is already on the path, say for some , then the vertices would form a cycle — contradicting the fact that is a tree. If is not on the path, then is a path longer than — contradicting our choice of a longest path. In both cases, we reach a contradiction, so and is a leaf. By the same argument applied to , we get a second leaf.
This result is more powerful than it might first appear. It gives us a way to "peel off" a tree layer by layer: remove a leaf, and what remains is still a tree (with one fewer vertex). This idea is the key to proving the next theorem.
The Edge Count
In Part 3, we gave an informal argument for why a tree on vertices has edges. Now we can make this precise using the leaf removal technique.
Theorem 2. Every tree on vertices has exactly edges.
Proof. We prove this by induction on .
Base case: A tree on vertex has no edges, and . The statement holds.
Inductive step: Assume the statement holds for all trees with fewer than vertices, where . Let be a tree on vertices. By Lemma 2, has a leaf . Let be the graph obtained by removing and its single edge from .
We need to check that is still a tree. It is connected: since was a leaf, every path in that didn't use is still present in , and no path between two vertices in needed to go through (because is a dead end). It is acyclic: removing a vertex from a graph without cycles cannot create a cycle. So is a tree on vertices.
By the induction hypothesis, has edges. Since we removed exactly one edge to get from to , the tree has edges.
Equivalent Characterizations
One of the beautiful things about trees is that there are many different ways to define them, and they all turn out to be equivalent. We've been working with the definition "connected and acyclic," but here are several other characterizations.
Theorem 3. For a graph on vertices, the following statements are equivalent:
(i) is connected and acyclic (i.e., is a tree).
(ii) is connected and has exactly edges.
(iii) is acyclic and has exactly edges.
(iv) Between any two vertices in , there is exactly one path.
(v) is connected, but removing any edge disconnects it.
(vi) is acyclic, but adding any edge creates exactly one cycle.
We won't prove all implications here, but let's verify the most instructive ones to build intuition.
(i) (iv): Let be a tree, and let be two vertices. Since is connected, at least one path from to exists. Suppose there are two distinct paths and from to . Since and are different, they must diverge at some vertex and rejoin at a later vertex — but this creates a cycle, contradicting the fact that is acyclic. So the path is unique.
(iv) (v): If there is exactly one path between any two vertices, then the graph is connected. Now take any edge . This edge is the unique path between and . Removing it means there is no longer any path from to , so the graph becomes disconnected.
(v) (i): The graph is connected by assumption. If it contained a cycle, we could remove any edge on that cycle without disconnecting the graph — contradicting the assumption that removing any edge disconnects it. So the graph is acyclic, and therefore a tree.
The remaining implications follow similar reasoning. The key takeaway is that trees live at a kind of "sweet spot" — they have exactly enough edges to be connected, but not a single one more1.
Forests
What happens if we relax the requirement that a tree be connected? We get a graph that might consist of several separate trees — like a forest made up of individual trees.
Definition 22. A forest is a graph that contains no cycle as a subgraph.
In other words, a forest is an acyclic graph. Each connected component of a forest is a tree. A tree is simply a forest with exactly one connected component.
Observation 5. A forest on vertices with connected components has exactly edges.
This follows directly from Theorem 2: each connected component is a tree, and a tree on vertices has edges. Summing over all components, the total number of edges is .
Rooted Trees
So far, our trees have been "unrooted" — no vertex plays a special role. But in many applications, especially in computer science, we want to single out one vertex as the starting point. This gives the tree a natural hierarchy.
Definition 23. A rooted tree is a tree with one designated vertex called the root.
Once we fix a root , the tree gets a natural top-down structure. Every other vertex has a unique path to the root (by Theorem 3), and this path defines a hierarchy.
Definition 24. In a rooted tree with root , let be a vertex different from . The parent of is the first vertex after on the unique path from to . The children of a vertex are all vertices whose parent is . A vertex with no children is called a leaf2.
The depth of a vertex is the length of the unique path from to . The root has depth . The height of the tree is the maximum depth among all its vertices.

Think of a company's organizational chart: the CEO is the root, department heads are children of the CEO, team leads are children of department heads, and so on. The depth tells you how many levels of management sit above a person, and the height of the tree tells you how many levels the organization has.
Spanning Trees
Let's end with a concept that connects trees back to general graphs. Given a connected graph , we can always find a tree "hiding" inside it that still connects all the vertices.
Definition 25. A spanning tree of a connected graph is a subgraph that is a tree and contains all vertices of .
In other words, a spanning tree keeps every vertex but uses only of the edges — just enough to stay connected without any cycles.

Theorem 4. Every connected graph has a spanning tree.
Proof. Let be a connected graph. If has no cycle, it is already a tree, and we are done. Otherwise, contains a cycle. Pick any edge on that cycle and remove it. The resulting graph is still connected: any path that used can be rerouted along the rest of the cycle. Now repeat: if the graph still has a cycle, remove an edge from it. Each removal decreases the number of edges by one while preserving connectivity. Since the graph is finite, this process must terminate — and it terminates when no cycle remains, giving us a connected, acyclic subgraph that contains all vertices. That is a spanning tree.
Spanning trees are incredibly useful in practice. When designing a network — say, connecting cities with roads or computers with cables — a spanning tree gives you the cheapest way to keep everything connected without redundant links. Finding spanning trees with minimum total edge weight is the minimum spanning tree problem, one of the most classical problems in combinatorial optimization3.
Looking Ahead
Trees gave us our first taste of deeper structural results and rigorous proofs. In the next part, we will introduce directed graphs — graphs where edges have a direction, like one-way streets. This opens the door to modeling asymmetric relationships such as web links, task dependencies, and flows through networks.
Footnotes
This is why trees are sometimes called minimally connected graphs: they are connected, but removing any edge breaks the connection. ↩
This is consistent with our earlier definition: in a rooted tree, a leaf is a vertex with no children, which means its only neighbor is its parent, so it has degree — unless it is the root, which has degree only when the tree consists of a single vertex. ↩
Efficient algorithms for finding minimum spanning trees include Kruskal's algorithm and Prim's algorithm, both of which run in time polynomial in the size of the graph. ↩