Aspects of OR - Integer Programming (Part 2)

In Part 1 we saw that integer programs are much harder than LPs, and that naive rounding of the LP relaxation solution does not work. We need a systematic way to search the space of integer solutions while avoiding exhaustive enumeration. The answer is branch and bound — an elegant algorithm that combines divide-and-conquer with LP relaxation bounds to explore only a fraction of the solution space.

Branch and bound (B&B) was introduced by Land and Doig in 1960 and remains the backbone of every modern IP solver. In this post, we explain the algorithm in detail, walk through a complete example on the knapsack problem, and discuss the key design choices that make B&B efficient.

The idea

Branch and bound works by building a search tree. The root of the tree is the original IP. At each node, we solve the LP relaxation. If the LP solution happens to be integer, we have found a feasible IP solution. If not, we pick a fractional variable and branch: create two child nodes by adding constraints that force this variable to be either rounded down or rounded up. We then solve the LP relaxation at each child node.

The key insight is that we do not need to explore every node. The LP relaxation at each node provides a bound on the best IP solution achievable in that subtree. If this bound is worse than the best integer solution found so far (the incumbent), we can prune the entire subtree — no integer solution in it can beat the incumbent.

The algorithm

More precisely, branch and bound for a maximization IP works as follows:

Initialize: Set the global upper bound to $+\infty$ and the global lower bound (incumbent value) to $-\infty$ . Create a list of active nodes containing only the root node.

Step 1 — Select. Choose an active node from the list.

Step 2 — Relax. Solve the LP relaxation at this node.

Step 3 — Prune. Check if the node can be pruned by one of three criteria:

Pruning by bound: The LP optimal value $\leq$ incumbent value. No integer solution in this subtree can improve on the best known solution.
Pruning by infeasibility: The LP relaxation is infeasible. The added branching constraints made the subproblem impossible.
Pruning by integrality: The LP solution is integer. Update the incumbent if this solution is better than the current best.

Step 4 — Branch. If the node is not pruned, select a fractional variable $x_j$ with LP value $x_j^* = f$ (where $f \notin \mathbb{Z}$ ). Create two child nodes:

Left child: add the constraint $x_j \leq \lfloor f \rfloor$ .
Right child: add the constraint $x_j \geq \lceil f \rceil$ .

Add both children to the active node list.

Step 5 — Terminate. If the active node list is empty, the incumbent is optimal. Otherwise, go to Step 1.

Worked example: knapsack

Let us apply branch and bound to the knapsack instance from Part 1:

$\max \; 60 x_1 + 100 x_2 + 120 x_3 \quad \text{s.t.} \quad 10 x_1 + 20 x_2 + 30 x_3 \leq 50, \quad x_1, x_2, x_3 \in \{0, 1\}.$

Node 0 (root)

Solve the LP relaxation (allow $0 \leq x_j \leq 1$ ). The greedy LP solution sorts by value-to-weight ratio: item 1 (ratio 6), item 2 (ratio 5), item 3 (ratio 4). Take $x_1 = 1$ (weight 10), $x_2 = 1$ (weight 20), $x_3 = 2/3$ (weight 20). Total weight 50, LP value $= 60 + 100 + 80 = 240$ .

Fractional variable: $x_3 = 2/3$ . Branch on $x_3$ .

Incumbent: none yet. Upper bound: 240.

Node 1: $x_3 \leq 0$ (i.e. $x_3 = 0$ )

Remaining problem: $\max \; 60 x_1 + 100 x_2$ s.t. $10 x_1 + 20 x_2 \leq 50$ , $x_1, x_2 \in [0,1]$ .

LP solution: $x_1 = 1$ (weight 10), $x_2 = 1$ (weight 20). Remaining capacity 20, but $x_2$ is already at 1. Total weight 30 $\leq$ 50. LP value $= 160$ .

This solution is integer! Update incumbent to $160$ .

Pruned by integrality. Incumbent = 160.

Node 2: $x_3 \geq 1$ (i.e. $x_3 = 1$ )

Remaining problem: $\max \; 60 x_1 + 100 x_2 + 120$ s.t. $10 x_1 + 20 x_2 \leq 20$ , $x_1, x_2 \in [0,1]$ .

LP solution: $x_1 = 1$ (weight 10), $x_2 = 0.5$ (weight 10). LP value $= 60 + 50 + 120 = 230$ .

Fractional variable: $x_2 = 0.5$ . LP bound 230 > incumbent 160, so we cannot prune. Branch on $x_2$ .

Node 3: $x_3 = 1, x_2 \leq 0$ (i.e. $x_2 = 0$ )

Remaining: $\max \; 60 x_1 + 120$ s.t. $10 x_1 \leq 20$ , $x_1 \in [0,1]$ .

LP solution: $x_1 = 1$ . Value $= 60 + 120 = 180$ . Integer solution!

Update incumbent to $180$ .

Pruned by integrality. Incumbent = 180.

Node 4: $x_3 = 1, x_2 \geq 1$ (i.e. $x_2 = 1$ )

Remaining: $\max \; 60 x_1 + 100 + 120$ s.t. $10 x_1 \leq 0$ , $x_1 \in [0,1]$ .

LP solution: $x_1 = 0$ . Value $= 0 + 100 + 120 = 220$ . Integer solution!

Update incumbent to $220$ .

Pruned by integrality. Incumbent = 220.

Termination

All nodes are pruned. The optimal solution is $x_1 = 0, x_2 = 1, x_3 = 1$ with value 220.

The search tree had 5 nodes — much less than the $2^3 = 8$ possible solutions. For larger problems, the savings from pruning are dramatic.

Here is a summary of the search tree:

                Node 0: LP=240
               /              \
      x₃=0 /                  \ x₃=1
          /                    \
    Node 1: LP=160          Node 2: LP=230
    INTEGER ✓                /            \
    (incumbent=160)   x₂=0 /              \ x₂=1
                          /                \
                   Node 3: LP=180     Node 4: LP=220
                   INTEGER ✓          INTEGER ✓
                   (incumbent=180)    (incumbent=220)

Node selection strategies

The order in which we explore nodes significantly affects performance. Common strategies:

Best-first (best bound): Always explore the node with the best LP bound. This strategy tends to improve the global upper bound quickly, which is useful for proving optimality. However, it may take long to find feasible integer solutions.

Depth-first: Explore the most recently created node. This strategy dives deep into the tree quickly, finding integer solutions early (which helps pruning). It also uses less memory since it only keeps one path in the tree. The downside is that it may explore many suboptimal nodes before reaching the optimal solution.

Best-first with diving: A hybrid. Mostly use best-first, but periodically dive deep from a promising node to find feasible solutions. This is close to what modern solvers do.

Most solvers use sophisticated combinations of these strategies, adapting dynamically based on the problem's characteristics.

Variable selection (branching rules)

Which fractional variable should we branch on? This choice has a major impact on the tree size:

Most fractional: Branch on the variable closest to 0.5. The idea is that this variable is the "most undecided". Simple but often not the most effective.

Strong branching: For each candidate variable, tentatively solve the LP relaxations of both children. Pick the variable that gives the biggest improvement in bound. This produces small trees but is expensive per node.

Reliability branching: A compromise — use strong branching for the first few occurrences of each variable, then switch to a cheaper heuristic based on the observed improvements. This is the default in most modern solvers.

Pseudocost branching: Track the historical improvement per unit change for each variable and use these "pseudocosts" to estimate branching quality. Cheap per node once the pseudocosts are initialized.

Bounding and the integrality gap

The effectiveness of branch and bound depends directly on the quality of the LP bounds. The tighter the LP relaxation, the more nodes can be pruned, and the smaller the search tree.

The integrality gap is the ratio between the LP optimal value and the IP optimal value. For the knapsack example above: LP bound $= 240$ , IP optimum $= 220$ , gap $= (240 - 220)/220 \approx 9\%$ .

In practice, solvers report the MIP gap: the relative difference between the best known integer solution (lower bound for maximization) and the best LP bound (upper bound for maximization). The solve is complete when the MIP gap reaches zero (or a user-specified tolerance).

Implementation in Python

Here is a basic branch-and-bound implementation for binary programs using SciPy for LP relaxations:

from scipy.optimize import linprog
import numpy as np

def branch_and_bound(c, A_ub, b_ub, n):
    """
    Solve max c^T x s.t. A_ub x <= b_ub, x in {0,1}^n.
    Returns (optimal_value, optimal_x).
    """
    best_val = -np.inf
    best_x = None

    # Stack of subproblems: each is (lower_bounds, upper_bounds)
    stack = [(np.zeros(n), np.ones(n))]

    while stack:
        lb, ub = stack.pop()  # depth-first

        # Solve LP relaxation (linprog minimizes, so negate c)
        bounds = list(zip(lb, ub))
        res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method='highs')

        if not res.success:
            continue  # infeasible, prune

        lp_val = -res.fun  # negate back to maximization

        if lp_val <= best_val:
            continue  # bound pruning

        x = res.x

        # Check integrality
        frac = np.abs(x - np.round(x))
        if np.all(frac < 1e-6):
            # Integer solution found
            if lp_val > best_val:
                best_val = lp_val
                best_x = np.round(x)
            continue

        # Branch on most fractional variable
        frac_idx = np.argmax(np.minimum(x - np.floor(x), np.ceil(x) - x))
        # But we want the variable *most* fractional (closest to 0.5)
        frac_vals = np.minimum(x - np.floor(x), np.ceil(x) - x)
        frac_idx = np.argmax(frac_vals)

        # Left child: x_j <= floor(x_j) = 0
        new_ub = ub.copy()
        new_ub[frac_idx] = np.floor(x[frac_idx])
        stack.append((lb.copy(), new_ub))

        # Right child: x_j >= ceil(x_j) = 1
        new_lb = lb.copy()
        new_lb[frac_idx] = np.ceil(x[frac_idx])
        stack.append((new_lb, ub.copy()))

    return best_val, best_x

Applying this to our knapsack example:

c = np.array([60.0, 100.0, 120.0])
A_ub = np.array([[10.0, 20.0, 30.0]])
b_ub = np.array([50.0])

val, x = branch_and_bound(c, A_ub, b_ub, n=3)
print(f'Optimal value: {val:.0f}')
print(f'Solution: {x}')
# Optimal value: 220
# Solution: [0. 1. 1.]

This implementation is purely educational. Production solvers add hundreds of enhancements: warm-starting LP solves, sophisticated branching rules, cutting planes (Part 3), heuristics, parallelism, and more.

Looking ahead

Branch and bound alone can solve many IPs, but the LP relaxation bounds are sometimes too loose, leading to enormous search trees. In Part 3, we introduce cutting planes — valid inequalities that tighten the LP relaxation by cutting off fractional solutions. The combination of branch and bound with cutting planes, called branch-and-cut, is the state of the art for solving IPs and is used by all major solvers.