Lesson C: Subset Sum
Copyright (c) 1998, C. Heeren, T. Magliery, and L. Pitt. This lesson and supporting materials may be copied for nonprofit educational use only.

Motivation:

There are many problems that are quite easily stated, yet are very difficult to solve because no method is known that is significantly better than searching through an enormous number of possible solutions. In such problems, the number of possibilities grows so rapidly that even supercomputers cannot effectively solve them, even after allowing for centuries of computation time. This lesson investigates one such problem while at the same time demonstrates that an exponential growth rate (of the number of possibilities to be searched) is far too fast to be manageable. Connections are made to simple combinatorics.

Lesson Plan:

  1. Tell the students the following story: The picnic committee at Karpsville Elementary school is planning on having fun games at the school picnic. One such game is the tug-of-war. In order to ensure fairness, they have bought the deluxe version of the Acme Pull-o-meter, which measures each student's pulling strength. At the picnic, each student is tested on the Pull-o-meter, and a number reflecting their pulling ability is recorded. The committee would now like to break the picnicgoers into two groups of (not necessarily the same number of) students, so that the total amount of pulling power in one group (added up) is as close as possible to the amount of pulling power in the remaining group.
  2. Give a simple example, where there are only 3 people, and the pull-o-meter records strengths of 1, 2, and 3 tugpower. Let the students conclude the obvious; that one team should consist of the people with strengths 1 and 2, and the other team should have only the person of tugpower 3.
  3. After that warmup, give them this still simple additional problem: 5 people (give them fun names), with strengths 1, 2, 3, 4, and 6.
  4. Here is a set of strengths of 6 different people that makes a slightly harder (but still pretty easy) problem:
    24, 26, 33, 87, 90, 94
  5. All of the problems so far have had fairly easy solutions. We'll see now that the problem will get much harder very quickly. Give the students the following list of strengths of 12 different people, and see if they can find the best way to break them up into two sides:
    11, 23, 24, 24, 34, 37, 37, 41, 44, 44, 47, 50

    By the way, there is a unique "best" solution to the problem (with the two sides evenly matched). Divide the numbers into two sets as follows:

    {11, 24, 24, 34, 37, 37, 41} and {23, 44, 44, 47, 50}
  6. After some time, they may solve it. In any case, they should all agree that finding the solution seems much more difficult with 12 people than with just 5 or 6. Let's see why. How many different ways are there to break a collection of people into two teams? It depends on the number of people. The table below shows that if the number of people to be split into teams is just 1, then there are two ways to do it, if there are 2 people, then there are 4 ways, if there are 3 people, then there are 8 ways, ..., and if there are n people, then there are 2n ways. (Note that we're counting "silly" teams, such as A and B on team 1, with nobody on team 2. Also, we really should divide these in half, since there really is no difference between having Team 1 contain only A and team 2 only B, or having Team 1 contain only B and Team 2 contain only A.)

    Number of people Names Team 1 Team 2 TOTAL NUM WAYS
    1 A A Nobody
    Nobody A 2
    2 A, B A,B Nobody
    A B
    B A
    Nobody A,B 4
    3 A, B, C A, B, C Nobody
    A, B C
    A, C B
    B, C A
    A B, C
    B A, C
    C A, B
    Nobody A, B, C 8

  7. Do they see the pattern? These are just the powers of two. There are a couple of explanations/intuitions for why this is the case.
    • Explanation 1: For 3 people A, B, and C, there are two possibilities for person A: on team 1 or on team 2. For each of those cases, there are two possibilities for where to put person B (on team 1 or team 2), thus making 4 ways to put A and B on either team 1 or team 2. Finally, for each of those 4 ways to place persons A and B, there are 2 ways to place person C.

      Recall the rule of product: If there are x possible ways that event X could occur, and y possible ways for event Y to occur, then as long as the events do not influence each other, the number of combinations of ways that both X and Y will occur is x times y. Refer back to problems involving how many ways to get dressed (choices of 3 pants, 4 shirts, 2 pair shoes, gives at least 3 x 4 x 2 = 24 different clothes combinations).

    • Explanation 2: Consider the case of three people again, A, B, and C. Break them into two teams, and write the numbers 1 or O below the letters A, B, and C as follows: If A is in team 1, then write "1" below A. Otherwise, write "0" below A. So if A and B are on team 1, with C on team 2, the bits beneath A, B, and C would read 110. If everybody were on team 1, then we'd have written "111", and if everybody were on team 2 we'd have written "000". Point out that each assignment of A, B, and C to team 1 or team 2 corresponds to a 3-bit sequence in the manner just described. How many such sequences are there? This is asking how many different numbers can you write in binary using only 3 bits. The answer is 8: {000, 001, 010, 011, 100, 101, 110, 111}. In general, with n bits, you can count up to 2n, so there will be 2n ways to place n people.on 2 teams.
  8. Now let's investigate just how fast 2n grows - how big is this number? If the number of people n = 5 (as in our earlier example) then there are 25, or 32 different ways. But, for n=12, there are 212=4096 different ways! No wonder the problem with 12 people was more difficult - the number of possibilities was much much greater.
  9. Now for a fun calculation: What if there were 300 people? The number of ways to assign them to two teams is 2300, which is greater than the estimated total number of electrons, protons, and neutrons, in the entire universe!!
  10. Recall the "rule of halving" from the lesson on binary search. This said that we can search efficiently among many sorted items because the number of times we need to divide a large number in half before it reaches 1 is very small. We can find an item among 2300 sorted items in only 300 questions, since it takes only 300 divisions by 2 to reduce this number to 1. The reverse of the rule of halving is the "rule of doubling": When you double a number repeatedly, it gets large VERY QUICKLY. This means that any strategy that attempts to search among all possible solutions, when there are 2n possibilities, will have a tough job even if n is only of moderately large size.
  11. But who says that the search for the best tug-of-war division has to occur by enumerating all of the possibilities? Clearly we can eliminate many without even considering them (e.g., all on one team, the strongest half on one team, etc.) Moreover, reasonable strategies that sound like they might work come immediately to mind: Put the strongest on team 1, the next two strongest on team 2, then continue alternating between team 1 and team 2. Perhaps shuffle the people a bit at the end to fine-tune the balance. As it turns out, while this method, or other methods like this may work reasonably well in some cases, in others they perform terribly. And nobody has ever found any method that is guaranteed to divide the teams as equally as possible, by doing anything that is significantly more efficient than an exhaustive search through a truly astronomical number of possible solutions.
  12. You may want to point out that a number like 2300 is so large that even a computer counting at a speed of over a million or billion each second, would not reach 2300 until long after our sun had burned out.

Conclusion:

Some problems have a structure that hides solutions in an incredibly large sea of possibilities. For some of these, no methods are known that are significantly better than a brute force and exhaustive search. In some cases, algorithms that sometimes work are employed, although there is never any guarantee that what is found is anywhere near optimal.

Evaluation:

Because the problem to be solved is by nature a difficult one, students should not be penalized for failing to find the best division into two teams. The point of the lesson is that there is no logical or methodical way that is always guaranteed to perform optimally, as was the case for the minimum spanning tree problem (Muddy City). Instead of solutions, look for any creative methods that students may have used in attempting to solve the problems, whether they were logical and intuitive, and whether they attempted to verify or discredit their hypothesis that their method does the best possible.

Extensions:

There are countless extensions dealing with counting problems and elementary combinatorics.

A particularly fun project is to read "The Token Gift" (which presents the fable involving a grain of rice on the first square of a chessboard on day 1, twice as many on the second square on day 2, twice as many as that on the 3rd square on day 3, etc. Now as a class project, bring in a large bag of rice, and start with one grain in a paper cup on day 1. Double each day the amount from the previous day's cup, and put it into a new cup. At some point (well before 64 days), the students will have to revert to measuring by weight or volume, rather than counting the number of grains of rice, since the amount will exceed many thousands of pounds. Have fun and estimate just how much rice would have been needed, and do some research to determine whether that much is available.

Standards:

  • NCTM K-4: 1, 2, 3, 4, 5, 8.
  • NCTM 5-8: 1, 2, 3, 4, 5, 7, 9.
  • NCTM 9-12: 1, 2, 3, 4, 5, 12, 14.

Skills:

  • Problem solving / reasoning / communication / connections
  • Addition of any type of number
  • Exponential growth intuition (not computation)

Copyright (c) 1998, C. Heeren, T. Magliery, and L. Pitt. This lesson and supporting materials may be copied for nonprofit educational use only.