Logo
Published on

Introduction to Graph Theory - Part 6

Authors
  • avatar
    Name
    Till Heller
    Twitter

Back in Part 2, we noted that the friendship between Alice and Bob is mutual — if Alice considers Bob a friend, then Bob considers Alice a friend too. But not all relationships work this way. On social media, Alice might follow Bob without Bob following her back. A one-way street lets you drive from intersection AA to intersection BB, but not the other way around. A webpage can link to another page without being linked back. In all these cases, the direction of the connection matters.

Until now, all our graphs have been undirected — edges had no direction. It's time to change that.

Directed Graphs

In Part 2, we mentioned in a footnote that (1,2)(1,2) and (2,1)(2,1) could be treated as different edges if we care about direction. Let's make this idea precise.

Definition 26. A directed graph (or digraph) is a pair D=(V,A)D = (V, A), where VV is a finite set of vertices and AV×VA \subseteq V \times V is a set of arcs (or directed edges). An arc (u,v)(u, v) goes from uu to vv.

The key difference from undirected graphs is that order matters: the arc (u,v)(u, v) goes from uu to vv, while (v,u)(v, u) goes in the opposite direction. These are two different arcs. When we draw a digraph, we use arrows instead of plain lines to indicate direction.

Consider the digraph D=({1,2,3},{(1,2),(2,3),(3,1)})D = (\{1,2,3\}, \{(1,2), (2,3), (3,1)\}). This describes a directed cycle: from 11 to 22, from 22 to 33, and from 33 back to 11. Notice that you cannot go from 22 to 11 directly — the arc only goes the other way.

directed-cycle

Let's revisit our social media example. Suppose Alice follows Bob and Charlie, Bob follows Charlie, and Charlie follows Alice. We can model this as a digraph with vertices {A,B,C}\{A, B, C\} and arcs {(A,B),(A,C),(B,C),(C,A)}\{(A,B), (A,C), (B,C), (C,A)\}. Notice that Alice follows Bob but Bob does not follow Alice — the relationship is not symmetric. This is exactly the kind of situation that undirected graphs cannot capture1.

In-Degree and Out-Degree

In undirected graphs, the degree of a vertex counts all its neighbors. In digraphs, we need to distinguish between incoming and outgoing connections.

Definition 27. The in-degree of a vertex vv, denoted d(v)d^-(v), is the number of arcs pointing to vv. The out-degree of a vertex vv, denoted d+(v)d^+(v), is the number of arcs pointing from vv.

In our social media digraph, Alice has in-degree 11 (only Charlie follows her) and out-degree 22 (she follows Bob and Charlie). Bob has in-degree 11 and out-degree 11. Charlie has in-degree 22 and out-degree 11.

social-digraph

Just like the Handshaking Lemma from Part 4, there is a directed version.

Lemma 3 (Directed Handshaking Lemma). For any digraph D=(V,A)D = (V, A), we have vVd(v)=vVd+(v)=A\sum_{v \in V} d^-(v) = \sum_{v \in V} d^+(v) = |A|.

Proof. Every arc (u,v)(u, v) contributes exactly 11 to the out-degree of uu and exactly 11 to the in-degree of vv. Summing over all arcs gives both equalities.

This makes intuitive sense: every arc that leaves some vertex must arrive at some vertex, so the total number of departures equals the total number of arrivals.

Directed Walks and Paths

The concepts of walks and paths from Part 4 carry over naturally to digraphs — we just have to follow the arrows.

Definition 28. A directed walk in a digraph D=(V,A)D = (V, A) is a sequence of vertices v0,v1,,vkv_0, v_1, \ldots, v_k such that (vi,vi+1)A(v_i, v_{i+1}) \in A for all 0i<k0 \leq i < k. A directed path is a directed walk with no repeated vertices.

The crucial difference from the undirected case is that we can only move in the direction of the arcs. In the digraph D=({1,2,3},{(1,2),(2,3),(3,1)})D = (\{1,2,3\}, \{(1,2), (2,3), (3,1)\}), the sequence 1,2,31, 2, 3 is a valid directed path, but 3,2,13, 2, 1 is not — there is no arc from 33 to 22.

Weak and Strong Connectivity

In undirected graphs, connectivity was straightforward: either you can reach every vertex from every other vertex, or you can't. In digraphs, things are more subtle. Due to the direction of arcs, you might be able to reach vv from uu but not uu from vv. This gives rise to two different notions of connectivity.

Definition 29. A digraph DD is weakly connected if the undirected graph obtained by ignoring all arc directions is connected.

Definition 30. A digraph DD is strongly connected if for every pair of vertices u,vVu, v \in V, there exists a directed path from uu to vv and a directed path from vv to uu.

Strong connectivity is a much stronger requirement. Our directed cycle D=({1,2,3},{(1,2),(2,3),(3,1)})D = (\{1,2,3\}, \{(1,2), (2,3), (3,1)\}) is strongly connected: from any vertex, you can reach any other by following the arcs around the cycle. But the digraph D=({1,2,3},{(1,2),(1,3)})D' = (\{1,2,3\}, \{(1,2), (1,3)\}) is only weakly connected — you can reach 22 and 33 from 11, but you cannot reach 11 from 22 or 33.

Just as undirected graphs decompose into connected components, digraphs decompose into strongly connected components.

Definition 31. A strongly connected component (SCC) of a digraph DD is a maximal subset SVS \subseteq V such that for every pair u,vSu, v \in S, there is a directed path from uu to vv and from vv to uu.

Every vertex belongs to exactly one SCC (even if that component contains only the vertex itself). The strongly connected components of a digraph reveal its large-scale structure: within each component, everything can reach everything else; between components, the flow goes only one way.

scc

Observation 6. If we contract each strongly connected component of a digraph into a single vertex, the resulting graph has no directed cycle.

This is worth a moment of thought. If two components were connected by directed paths in both directions, we could combine them into a single, larger strongly connected component — contradicting the fact that they are maximal. So the "condensed" digraph is always acyclic, which brings us to our next topic.

Directed Acyclic Graphs

Some of the most useful digraphs are those that contain no directed cycles at all. Think of prerequisite structures: to take Advanced Calculus, you first need Calculus I and II. There is a clear ordering — you can't have circular prerequisites. This is modeled by a directed acyclic graph.

Definition 32. A directed acyclic graph (or DAG) is a digraph that contains no directed cycle.

DAGs show up everywhere: task scheduling (some tasks must finish before others can start), dependency graphs in software builds, version histories in Git, and even citation networks among academic papers.

Two types of vertices play a special role in DAGs.

Definition 33. In a digraph, a source is a vertex with in-degree 00. A sink is a vertex with out-degree 00.

A source has no incoming arcs — nothing points to it. A sink has no outgoing arcs — it doesn't point to anything. In our prerequisite example, a course with no prerequisites is a source, and a course that is not a prerequisite for anything else is a sink.

Lemma 4. Every DAG has at least one source and at least one sink.

Proof. Suppose a DAG DD has no source. Then every vertex has at least one incoming arc. Starting at any vertex v0v_0, we can follow an incoming arc backwards to find a vertex v1v_1 with (v1,v0)A(v_1, v_0) \in A. From v1v_1, we can again follow an incoming arc to find v2v_2, and so on. Since the graph is finite, this process must eventually revisit a vertex — but that means we have found a directed cycle, contradicting the assumption that DD is a DAG. So every DAG has at least one source. The argument for sinks is symmetric: follow outgoing arcs forward instead of incoming arcs backward.

Topological Ordering

Since DAGs have no directed cycles, their vertices can be lined up in an order that respects all the arcs — a topological ordering.

Definition 34. A topological ordering of a digraph D=(V,A)D = (V, A) is a linear ordering v1,v2,,vnv_1, v_2, \ldots, v_n of all vertices such that for every arc (vi,vj)A(v_i, v_j) \in A, we have i<ji < j.

In other words, every arc points "forward" in the ordering. In our prerequisite example, a topological ordering tells you a valid sequence in which to take all your courses so that you never take a course before its prerequisites.

Theorem 5. A digraph has a topological ordering if and only if it is a DAG.

Proof. For the "only if" direction: suppose DD has a topological ordering and contains a directed cycle vi1,vi2,,vik,vi1v_{i_1}, v_{i_2}, \ldots, v_{i_k}, v_{i_1}. Then the arc (vi1,vi2)(v_{i_1}, v_{i_2}) requires i1<i2i_1 < i_2, the arc (vi2,vi3)(v_{i_2}, v_{i_3}) requires i2<i3i_2 < i_3, and so on, until the arc (vik,vi1)(v_{i_k}, v_{i_1}) requires ik<i1i_k < i_1. Chaining these inequalities gives i1<i2<<ik<i1i_1 < i_2 < \cdots < i_k < i_1, which is a contradiction. So DD must be a DAG.

For the "if" direction, we use Lemma 4 to build the ordering step by step. Let DD be a DAG. By Lemma 4, DD has a source ss. Place ss first in the ordering. Now remove ss and all its outgoing arcs from DD. The resulting digraph is still a DAG (removing vertices and arcs cannot create cycles), so it has a new source. Place that source next in the ordering. Continue until all vertices have been placed.

This constructive proof also gives us an algorithm: repeatedly find a source, output it, remove it, and repeat. This is known as Kahn's algorithm for topological sorting and runs efficiently even on large graphs2.

dag-topological

Looking Ahead

With directed graphs, we've expanded our toolkit considerably. Many real-world systems — from supply chains to web links — are naturally directed. If you're curious about what happens when we add capacities to arcs and ask how much "flow" we can push through a network, take a look at the series on Network Flows, which builds directly on the ideas from this part.

In the next part of this introduction, we will change direction and explore graph coloring — the art of assigning colors to vertices so that no two adjacent vertices share the same color. This seemingly simple idea leads to some of the most famous results and open problems in all of mathematics.

Footnotes

  1. If a relationship is symmetric — every arc (u,v)(u,v) has a corresponding arc (v,u)(v,u) — we can replace each pair with a single undirected edge and work with an undirected graph instead.

  2. Kahn's algorithm runs in O(n+m)O(n + m) time, where nn is the number of vertices and mm is the number of arcs. An alternative approach based on depth-first search achieves the same time complexity.