Aspects of OR - Traveling Salesperson Problem - Heuristics (Part 2)
Till Heller
In Part 1 we formulated the Traveling Salesperson Problem as an integer linear program and solved it exactly with Gurobi. Exact methods guarantee an optimal solution, but they hit a wall as instance sizes grow — the ILP formulation has O(n²) binary variables plus subtour elimination constraints, and solve times can grow exponentially.
For large instances (hundreds or thousands of cities), we need heuristics: algorithms that trade optimality for speed. A good heuristic finds a solution that is close to optimal in a fraction of the time an exact method would need. In this post we cover two families of heuristics:
- Construction heuristics — build a tour from scratch (nearest neighbor, greedy).
- Improvement heuristics — start from an existing tour and iteratively improve it (2-opt, 3-opt, Or-opt).
In Part 3 we go further with metaheuristics that combine these building blocks in more sophisticated search frameworks.
Setup
Throughout this post, we assume a complete graph with n cities and a symmetric distance matrix D. We represent a tour as a list of city indices. All Python implementations use NumPy and a precomputed distance matrix:
import numpy as np

def make_distance_matrix(coords):
    """Compute Euclidean distance matrix from (x, y) coordinates."""
    n = len(coords)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            D[i, j] = D[j, i] = np.sqrt(dx * dx + dy * dy)
    return D
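As an aside, the double loop above can be replaced by NumPy broadcasting for larger instances; a vectorized sketch (the name `make_distance_matrix_fast` is ours and is not used later in the post):

```python
import numpy as np

def make_distance_matrix_fast(coords):
    """Vectorized Euclidean distance matrix via broadcasting."""
    P = np.asarray(coords, dtype=float)       # shape (n, 2)
    diff = P[:, None, :] - P[None, :, :]      # shape (n, n, 2)
    return np.sqrt((diff ** 2).sum(axis=-1))  # shape (n, n)
```

This produces the same matrix as the loop version, at the cost of O(n²) temporary memory for the difference array.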
def tour_length(tour, D):
    """Total length of a tour."""
    return sum(D[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))
We will use a random 50-city instance for demonstrations:
np.random.seed(42)
coords = np.random.rand(50, 2) * 100
D = make_distance_matrix(coords)
Nearest neighbor
The nearest neighbor heuristic is the simplest construction method. Start at some city, then repeatedly visit the closest unvisited city until all cities have been visited. Finally, return to the starting city.
Algorithm
- Choose a starting city s.
- Set the current city to s and mark it as visited.
- While there are unvisited cities:
- Find the nearest unvisited city j to the current city.
- Move to j and mark it as visited.
- Return to s.
Analysis
Nearest neighbor runs in O(n²) time — at each of the n − 1 steps, we scan up to n cities to find the nearest one. It is fast and easy to implement, but the solution quality can be poor. The heuristic tends to make good local choices early on but may be forced into long edges at the end when few unvisited cities remain.
The approximation ratio of nearest neighbor is O(log n) — on worst-case instances it can produce tours a factor proportional to log n longer than optimal. In practice, the typical overhead is 20–30% above optimal.
A simple improvement: run nearest neighbor from every city as the starting point and keep the best tour. This costs O(n³) but often yields noticeably better results.
Implementation
def nearest_neighbor(D, start=0):
    """Nearest neighbor heuristic for TSP."""
    n = len(D)
    visited = [False] * n
    tour = [start]
    visited[start] = True
    for _ in range(n - 1):
        current = tour[-1]
        best_next = -1
        best_dist = np.inf
        for j in range(n):
            if not visited[j] and D[current, j] < best_dist:
                best_dist = D[current, j]
                best_next = j
        tour.append(best_next)
        visited[best_next] = True
    return tour
tour_nn = nearest_neighbor(D)
print(f'Nearest neighbor tour length: {tour_length(tour_nn, D):.2f}')
Greedy heuristic
The greedy heuristic takes a different approach: instead of building the tour city by city, it builds it edge by edge. Sort all edges by distance and greedily add the shortest edge that does not violate tour constraints.
Algorithm
- Sort all edges by distance.
- For each edge (shortest first), add it to the tour if it does not:
- Give any city degree > 2 (at most two edges per city).
- Create a cycle of length < n (premature subtour).
- The result is a Hamiltonian cycle.
Analysis
The greedy heuristic runs in O(n² log n) time due to the sorting step (there are O(n²) edges to sort). It tends to produce better solutions than nearest neighbor because it considers the globally shortest edges first, rather than making locally greedy decisions from a single starting point.
Implementation
def greedy_tsp(D):
    """Greedy edge-insertion heuristic for TSP."""
    n = len(D)
    # Generate and sort all edges
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            edges.append((D[i, j], i, j))
    edges.sort()
    degree = [0] * n
    adj = [[] for _ in range(n)]
    # Union-Find for cycle detection
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    edges_added = 0
    for dist, i, j in edges:
        if edges_added == n:
            break
        if degree[i] >= 2 or degree[j] >= 2:
            continue
        # Avoid premature cycle (unless it would complete the tour)
        if find(i) == find(j) and edges_added < n - 1:
            continue
        adj[i].append(j)
        adj[j].append(i)
        degree[i] += 1
        degree[j] += 1
        union(i, j)
        edges_added += 1
    # Extract tour from adjacency list
    tour = [0]
    prev = -1
    current = 0
    for _ in range(n - 1):
        for neighbor in adj[current]:
            if neighbor != prev:
                tour.append(neighbor)
                prev = current
                current = neighbor
                break
    return tour
tour_greedy = greedy_tsp(D)
print(f'Greedy tour length: {tour_length(tour_greedy, D):.2f}')
2-opt
The construction heuristics above give us a decent starting tour, but we can almost always improve it with local search. The most widely used improvement heuristic for the TSP is 2-opt.
The idea: take two edges in the tour and check whether reconnecting them (reversing the segment between them) would shorten the tour. If so, make the swap. Repeat until no improving 2-opt move exists.
The 2-opt move
A 2-opt move removes two edges (a, b) and (c, d) from the tour and reconnects them as (a, c) and (b, d). This reverses the segment of the tour between b and c:
Before: ... → a → b → ... → c → d → ...
After: ... → a → c → ... → b → d → ...
(segment b...c is reversed)
The move improves the tour if the two new edges are together shorter than the two removed ones: D[a, c] + D[b, d] < D[a, b] + D[c, d].
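This improvement condition can be checked numerically. The sketch below builds a small random instance (hypothetical data, independent of the 50-city instance above, with its own tiny tour-length helper) and compares the delta formula against a full recomputation after reversing the segment:

```python
import numpy as np

# Hypothetical 10-city instance; D is symmetric by construction.
rng = np.random.default_rng(0)
pts = rng.random((10, 2))
D = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))

def tour_len(t):
    return sum(D[t[p], t[(p + 1) % len(t)]] for p in range(len(t)))

tour = list(range(10))
i, j = 2, 6  # remove edges (tour[i], tour[i+1]) and (tour[j], tour[j+1])
a, b = tour[i], tour[i + 1]
c, d = tour[j], tour[(j + 1) % len(tour)]
delta = D[a, c] + D[b, d] - D[a, b] - D[c, d]

# Apply the move: reverse the segment between positions i+1 and j.
new_tour = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
assert abs((tour_len(new_tour) - tour_len(tour)) - delta) < 1e-9
```

Because D is symmetric, the reversed interior edges keep their lengths, so only the two boundary edges contribute to the change.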
Algorithm
- Start with an initial tour.
- For each pair of edges (tour[i], tour[i+1]) and (tour[j], tour[j+1]) with j > i + 1:
- Compute the change in tour length from the 2-opt move.
- If the change is negative (improvement), apply the move.
- Repeat from Step 2 until a full pass finds no improvement (locally optimal).
Analysis
Each pass takes O(n²) time (checking all pairs of edges). The number of passes is typically small in practice — often O(n) or fewer. The resulting tour is 2-optimal: no single 2-opt move can improve it.
2-opt typically improves a nearest neighbor tour by 15–25% and brings the solution within a few percent of optimal for moderate-sized instances.
Implementation
def two_opt(tour, D):
    """Improve a tour using 2-opt local search."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if j == n - 1 and i == 0:
                    continue  # skip: same as no change
                # Cost of current edges
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # Cost change
                delta = D[a, c] + D[b, d] - D[a, b] - D[c, d]
                if delta < -1e-10:
                    # Reverse the segment between i+1 and j
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour
tour_2opt = two_opt(tour_nn.copy(), D)
print(f'2-opt tour length: {tour_length(tour_2opt, D):.2f}')
3-opt
3-opt generalizes 2-opt by removing three edges and reconnecting the three resulting segments. This allows moves that 2-opt cannot make — for example, rearranging the order of segments rather than just reversing one.
With three edges removed, there are 8 possible ways to reconnect the segments (including the original tour). After excluding the original and the moves that are equivalent to one or two 2-opt moves, there are a few genuinely new reconnection patterns.
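One of these genuinely new patterns is the pure segment exchange: the two inner segments swap places without being reversed, something no single 2-opt move can achieve. On a concrete 8-city tour (indices only; the positions i, j, k are chosen purely for illustration):

```python
tour = list(range(8))   # tour: 0 1 2 3 4 5 6 7 (back to 0)
i, j, k = 1, 4, 7       # remove edges (1, 2), (4, 5) and (7, 0)
S1 = tour[i + 1:j + 1]  # first inner segment: [2, 3, 4]
S2 = tour[j + 1:k + 1]  # second inner segment: [5, 6, 7]
new_tour = tour[:i + 1] + S2 + S1 + tour[k + 1:]
print(new_tour)         # [0, 1, 5, 6, 7, 2, 3, 4]
```

Both segments keep their internal orientation; only the three removed edges are replaced.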
Analysis
Each pass of 3-opt takes O(n³) time — significantly more expensive than 2-opt. For this reason, full 3-opt is rarely used in practice. Instead, solvers often use restricted 3-opt or the closely related Lin-Kernighan heuristic, which uses a variable number of edge swaps guided by a clever search strategy.
The additional improvement from 3-opt over 2-opt is typically 1–3% — meaningful for high-quality solutions but at a much higher computational cost.
Implementation (simplified)
A full 3-opt implementation is quite involved due to the many reconnection cases. Here is a simplified version that considers the most common non-sequential 3-opt moves:
def three_opt_pass(tour, D):
    """One pass of 3-opt (segment-exchange moves only).
    Returns True if an improvement was found."""
    n = len(tour)
    for i in range(n - 2):
        for j in range(i + 2, n - 1):
            for k in range(j + 2, n):
                # Removed edges: (a, b), (c, d), (e, f)
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[j + 1]
                e, f = tour[k], tour[(k + 1) % n]
                d0 = D[a, b] + D[c, d] + D[e, f]
                # Reconnection: a-d, e-b, c-f (exchange the two inner segments)
                d1 = D[a, d] + D[e, b] + D[c, f]
                if d1 < d0 - 1e-10:
                    # Apply the move: swap segments b..c and d..e
                    seg1 = tour[i + 1:j + 1]
                    seg2 = tour[j + 1:k + 1]
                    tour[i + 1:k + 1] = seg2 + seg1
                    return True
    return False
In practice, most implementations skip full 3-opt and use the Or-opt move instead, which is a special case that is much faster.
Or-opt
Or-opt is a fast improvement heuristic that relocates short segments of the tour. Specifically, it takes a segment of 1, 2, or 3 consecutive cities and moves it to a different position in the tour.
The Or-opt move
Remove a segment of k consecutive cities (k ∈ {1, 2, 3}) from the tour and insert it between two other consecutive cities. For k = 1, this is simply relocating a single city.
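As with 2-opt, the net change of an Or-opt move can be verified against a full recomputation. A self-contained sketch for k = 1 on a small random instance (hypothetical data, with its own tour-length helper):

```python
import numpy as np

# Hypothetical 8-city instance; D is symmetric by construction.
rng = np.random.default_rng(1)
pts = rng.random((8, 2))
D = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))

def tour_len(t):
    return sum(D[t[p], t[(p + 1) % len(t)]] for p in range(len(t)))

tour = list(range(8))
i, j = 2, 5  # relocate city tour[i] to just after tour[j]
removal_gain = (D[tour[i - 1], tour[i]] + D[tour[i], tour[i + 1]]
                - D[tour[i - 1], tour[i + 1]])
insertion_cost = (D[tour[j], tour[i]] + D[tour[i], tour[j + 1]]
                  - D[tour[j], tour[j + 1]])
delta = insertion_cost - removal_gain

new_tour = [c for c in tour if c != tour[i]]
new_tour.insert(new_tour.index(tour[j]) + 1, tour[i])
assert abs((tour_len(new_tour) - tour_len(tour)) - delta) < 1e-9
```

Only three edges change: the two around the removed city collapse into one, and one edge at the insertion point splits into two.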
Analysis
Or-opt runs in O(n²) per pass — the same as 2-opt — but considers moves that 2-opt cannot make (relocating a city or segment without reversing). It is an excellent complement to 2-opt: applying 2-opt followed by Or-opt (and iterating) often reaches solutions close to what full 3-opt would achieve, at much lower cost.
Implementation
def or_opt(tour, D, segment_sizes=(1, 2, 3)):
    """Improve a tour using Or-opt (relocate segments of size 1, 2, 3)."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for size in segment_sizes:
            for i in range(n - size + 1):
                # Segment to relocate: tour[i], ..., tour[i + size - 1]
                prev_i = (i - 1) % n
                next_seg = (i + size) % n
                # Gain from removing the segment
                removal_gain = (
                    D[tour[prev_i], tour[i]]
                    + D[tour[i + size - 1], tour[next_seg]]
                    - D[tour[prev_i], tour[next_seg]]
                )
                # Try inserting the segment after every city outside it
                for j in range(n):
                    if j == prev_i or i <= j < i + size:
                        continue
                    next_j = (j + 1) % n
                    insertion_cost = (
                        D[tour[j], tour[i]]
                        + D[tour[i + size - 1], tour[next_j]]
                        - D[tour[j], tour[next_j]]
                    )
                    if insertion_cost < removal_gain - 1e-10:
                        # Rebuild the tour with the segment after tour[j]
                        segment = tour[i:i + size]
                        new_tour = []
                        for t in range(n):
                            if i <= t < i + size:
                                continue
                            new_tour.append(tour[t])
                            if t == j:
                                new_tour.extend(segment)
                        tour = new_tour
                        improved = True
                        break
                if improved:
                    break
            if improved:
                break
    return tour
Combining heuristics
The best results come from combining construction and improvement heuristics. A typical pipeline is:
- Construct an initial tour with nearest neighbor (from multiple starting cities).
- Improve with 2-opt until locally optimal.
- Improve further with Or-opt.
- Repeat steps 2–3 until no improvement.
def solve_tsp_heuristic(D):
    """Combined heuristic: best nearest neighbor + 2-opt + Or-opt."""
    n = len(D)
    best_tour = None
    best_length = np.inf
    for start in range(n):
        tour = nearest_neighbor(D, start=start)
        tour = two_opt(tour, D)
        tour = or_opt(tour, D)
        length = tour_length(tour, D)
        if length < best_length:
            best_length = length
            best_tour = tour[:]
    return best_tour, best_length
tour_best, length_best = solve_tsp_heuristic(D)
print(f'Best heuristic tour length: {length_best:.2f}')
Comparison
Let us compare the heuristics on our 50-city instance:
# Run all methods
tour_nn = nearest_neighbor(D, start=0)
tour_gr = greedy_tsp(D)
tour_nn_2opt = two_opt(nearest_neighbor(D, start=0), D)
tour_best, _ = solve_tsp_heuristic(D)
print(f'{"Method":<30} {"Tour length":>12}')
print('-' * 44)
print(f'{"Nearest neighbor":<30} {tour_length(tour_nn, D):>12.2f}')
print(f'{"Greedy":<30} {tour_length(tour_gr, D):>12.2f}')
print(f'{"NN + 2-opt":<30} {tour_length(tour_nn_2opt, D):>12.2f}')
print(f'{"Best NN + 2-opt (all starts)":<30} {tour_length(tour_best, D):>12.2f}')
Typical results for random Euclidean instances: nearest neighbor produces tours 20–30% above optimal, greedy is slightly better, 2-opt brings it down to 5–10% above optimal, and the multi-start combination reaches 2–5% above optimal.
Limitations
Improvement heuristics like 2-opt and Or-opt are local search methods: they stop at the first local optimum they encounter. If the initial tour happens to lead to a poor local optimum, no amount of 2-opt moves will escape it.
To overcome this, we need strategies that can escape local optima — metaheuristics. In Part 3, we explore simulated annealing, genetic algorithms, and ant colony optimization, all of which use randomization and population-based search to explore the solution space more broadly.