Maximum Sum Subarray (Leetcode 53)

The Problem

Suppose we have an array of integers¹. How can we find the subarray whose elements sum to the greatest amount? For simplicity we’re going to just return the sum of that array. But it wouldn’t be too difficult to modify these algorithms to return the array itself.

A common interview problem, you can also find the problem on LeetCode:

https://leetcode.com/problems/maximum-subarray/

The Obvious Solution

A pretty obvious brute-force algorithm suggest itself: take each pair of indices (i, j with i <= j), iterate over the array from i to j, sum up the elements, and if the total of these elements is greater than what we’ve seen before, record the new maximum.

What we’re doing is just looking at every distinct subarray, summing it, and recording the maximum we find. This clearly works. It’s almost a direct transformation of our problem statement into an algorithm.

Below is a Ruby implementation of the brute-force algorithm:

def max_sub_array(nums)
  max_sum = -1.0/0 # negative infinity
  nums.each_with_index do |num, i|
    (i..nums.length - 1).each do |j|
      max_sum = [max_sum, nums[i..j].sum].max
    end
  end
  max_sum
end

But what’s the run time complexity of our solution?

We iterate over (n choose 2) + n = n*(n-1)/2 + n = O(n^2)² pairs of indices. On average these subarrays will have ~n/2 elements. This means on average the algorithm will require((n choose 2) + n) * n/2 = O(N^3) operations. That’s pretty slow!

Can we do better?

Optimization: Saving Our Sums

Let’s make an observation about the previous algorithm.

Suppose we’re iterating through the array and we’re looking at the subarray starting at i and ending at j. We sum this up and for our next iteration we increment j and now consider the subarray starting at i and ending at j+1. When we go to sum these values, observe that we’ve already calculated the sum from i to j on our previous iteration. If we had saved that value, we could just add the value at j+1 to that and we’d have the sum from i to j+1.

Below is a Ruby implementation of the brute force solution with caching:

def max_sub_array(nums)
  max_sum = -1.0/0 # negative infinity
  nums.each_with_index do |num, i|
    sums = {}
    (i..nums.length - 1).each do |j|
      sums[j] = sums[j-1] ? sums[j-1] + nums[j] : nums[j]
      max_sum = [max_sum, sums[j]].max
    end
  end
  max_sum
end

If we save the previous sum values, we turn the sum operation from O(n) complexity into O(1). This changes our algorithm’s running time as a whole from O(n^3) into O(n^2). It’s still a little bit slow, but a significant improvement.

Can we do better still?

A Dynamic Programming Approach

Our previous optimization of recording the sums turned the problem into something that’s starting to resemble a dynamic programming problem; the essence of dynamic programming is storing solutions to previous subproblems and then using those solutions to speed up the later subproblems. Maybe we can elaborate on this dynamic programming approach to improve our algorithm even further.

But let’s step back from our current problem for a moment.

A lot of interview problems out there have a structure similar to the maximum sum subarray problem. You’re given an array and you’re asked to find the longest __, the maximum __, the minimum __, etc. Often you can transform these into a dynamic programming solution. You transform a question about the entire array into subproblems of the form: what is the maximum (or longest, or minimum, etc) __ ending at index i? In our case, we can transform this problem into subproblems of the form “what is the maximum sum subarray ending at index i?”.

The magic in this approach is that it transforms the problem into a bunch of distinct but simpler cases. The maximum sum subarray ends and some index, so if we find the maximum sum subarray ending at each index, we’ll have found the maximum sum subarray for the entire array.

Kadane’s Algorithm

Okay, but how do we find the maximum subarray ending at a specific index? Well, if we start at index 0, that’s an easy one. We just take the array consisting of the element at index 0, that’s the only nonempty array ending there so it’s the maximum. Now suppose we’re at index i with i > 0. Also let’s suppose we know the maximum sum subarray ending at i-1. How can we use that information to find speed up finding the maximum sum subarray ending at i?

Consider the maximum sum subarray ending at i. There are two distinct possibilities:

It consists only of the element at index i
It consists of the element at index i and another subarray ending at index i-1

In the first case, the sum of the array is just the value at index i. In the second case, the subarray ending at index i-1 has to be the maximum sum subarray ending at index i-1. Why is that? If we take any other subarray ending at i-1 that isn’t the maximum, its sum is lower than the maximum by definition. So if we add the value at i to the lesser array ending at i-1 it will be less than if we add it the maximum array ending at i-1.

This means that we only need to check two values at each index i of the array:

The value of the current element
The value of the current element plus the value of the maximum sum subarray ending at i-1

And we simply take whichever is greater. This is Kadane’s algorithm.

Below is a Ruby implementation of Kadane’s algorithm:

def max_sub_array(nums)
  max_sum = -1.0/0 # negative infinity
  dp = []
  nums.each_with_index do |num, i|
    dp[i] = dp[i-1] ? [num, num + dp[i-1]].max : num
    max_sum = [max_sum, dp[i]].max
  end
  max_sum
end

What’s the running time of Kadane’s algorithm? Because we’re looping over the array a single time and at each step we’re only doing a constant number of operations, Kadane’s algorithm has a running time of O(n).

Despite being the fastest algorithm we’ve discussed, it’s about as simple as any of the others. The difficult part is in figuring out the proper approach to using the previously solved subproblems.

Stay tuned for my upcoming article on how to use Kadane’s algorithm to efficiently solve the analogous problem for 2D arrays!

Footnotes

When we say integers this implies that the numbers can be positive or negative. If it were only non-negative numbers we would say the natural numbers instead. ↩
How did we get the value (n choose 2) + n? We get the n choose 2 because we “choose” all the combinations of two distinct indices i and j. This gives us all the subarrays that start and stop at different points. Then we add + n because there’s also the subarrays that start and end at the same index. ↩