|
The subset sum problem is an important problem in complexity theory and cryptography. The problem is this: given a set of integers, does some non-empty subset sum to exactly zero? For example, given the set {−7, −3, −2, 5, 8}, the answer is YES because the subset {−3, −2, 5} sums to zero. The problem is NP-complete, and is perhaps the simplest such problem to describe.
An equivalent problem is this: given a set of integers and an integer s, does any non-empty subset sum to s? Subset sum can also be thought of as a special case of the knapsack problem. One interesting special case of subset sum is the partition problem, in which s is half of the sum of all elements in the set.
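As an illustration of these definitions, here is a minimal Python sketch of an exhaustive search for the target-sum form of the problem, with the partition problem expressed as a special case. The function names `subset_with_sum` and `can_partition` are hypothetical and chosen only for this example, and `can_partition` assumes positive integers.

```python
from itertools import combinations


def subset_with_sum(values, s):
    """Return some non-empty subset of values that sums to s, or None.

    Exhaustive search: tries subsets in order of increasing size.
    """
    for size in range(1, len(values) + 1):
        for combo in combinations(values, size):
            if sum(combo) == s:
                return combo
    return None


def can_partition(values):
    """Partition problem (positive integers assumed): can values be split
    into two parts with equal sums? This is subset sum with s = total / 2."""
    total = sum(values)
    if total % 2 != 0:
        return False  # an odd total cannot be split evenly
    return subset_with_sum(values, total // 2) is not None
```

For the set from the introduction, `subset_with_sum([-7, -3, -2, 5, 8], 0)` returns `(-3, -2, 5)`. The search inspects up to 2^N subsets, so it is only practical for small inputs.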
General discussion
The subset sum problem is a good introduction to the NP-complete class of problems. There are two reasons for this: it is a decision problem and not an optimization problem, and it has a very simple formal definition and problem statement. A solution that has a ±1% precision is good enough for many physical problems, so being asked to solve a subset sum problem for 100-digit numbers with a precision of ±10^−100 might seem silly and irrelevant. There are two reasons why this is not the case.
First, the number of place values in the problem is essentially equivalent to the number of simultaneous constraints that need to be solved. A numerical precision of 1% means solving the problem to just the first 7 base-two place values (any numerical error after that is less than 1/128 of the first digit). However, if there are 100 base-two place values in the problem, solving just 7 of them amounts to solving only 7% of the constraints. Moreover, given that the volume of the solution space in this case would be 2^100, and you have only covered a volume of 2^7, a factor of 2^93 of the solution space is still left uncovered. In this way a solution with a 1% numerical precision has covered essentially none of the real problem. The only way that a solution to the subset sum problem can be used as a solution to other NP problems is to solve all of the problem (and all of the constraints) exactly.
Second, in at least one context, it is actually important to solve real subset sum problems exactly. In cryptography, the subset sum problem comes up when a codebreaker attempts, given a message and ciphertext, to deduce the secret key. A key that is not equal to but within ±1% of the real key is essentially useless for the codebreaker.
Although the subset sum problem is a decision problem, the cases when an approximate solution is sufficient have also been studied, in the field of approximation algorithms. One algorithm for the approximate version of the subset sum problem is given below.
The complexity of subset sum
The complexity (difficulty of solution) of subset sum can be viewed as depending on two parameters: N, the number of decision variables, and P, the precision of the problem (stated as the number of binary place values that it takes to state the problem). (Note: here the letters N and P mean something different from what they mean in the NP class of problems.)
The complexity of the best known algorithms is exponential in the smaller of the two parameters N and P. Thus, the problem is most difficult if N and P are of the same order. It only becomes easy if either N or P becomes very small. If N (the number of variables) is small, then an exhaustive search for the solution is practical. If P (the number of place values) is a small fixed number, then there are dynamic programming algorithms that can solve it exactly.
What is happening is that the problem becomes seemingly non-exponential when it is practical to count the entire solution space. There are two ways to count the solution space in the subset sum problem. One is to count the number of ways the variables can be combined. There are 2^N possible ways to combine the variables; with N = 10, there are only 1024 possible combinations to check, and these can be enumerated easily with a branching search. The other way is to count all possible numerical values that the combinations can take. There are at most N × 2^P possible numerical sums; with P = 5, there are only 32 × N possible values, and these can be enumerated easily with dynamic programming. When N = P and both are large, there is no aspect of the solution space that can be counted easily. We give efficient algorithms for both small N and small P cases below.
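The two ways of counting can be compared on a concrete instance. The sketch below is only illustrative: the function name `solution_space_sizes` is hypothetical, and it assumes positive integers, taking P to be the bit length of the largest value.

```python
def solution_space_sizes(values):
    """Return (2^N, N * 2^P, number of distinct subset sums) for positive integers."""
    n = len(values)
    p = max(values).bit_length()  # P: binary place values of the largest number
    sums = {0}                    # sums reachable so far (0 is the empty subset)
    for v in values:
        sums |= {t + v for t in sums}
    return 2 ** n, n * 2 ** p, len(sums)


# N = 20 numbers, each needing at most P = 5 bits.
print(solution_space_sizes(list(range(1, 21))))
```

For the numbers 1 to 20 this prints `(1048576, 640, 211)`: there are over a million subsets, but only 211 distinct sums, which is within the N × 2^P = 640 bound.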
Exponential time algorithm
There are several ways to solve subset sum in time exponential in N. The most naïve algorithm cycles through all subsets of the N numbers and, for each of them, checks whether the subset sums to the right number. The running time is of order O(2^N · N), since there are 2^N subsets and, to check each subset, we need to sum at most N elements.
A better exponential time algorithm is known, which runs in time O(2^(N/2) · N). The algorithm splits the N elements into two sets of N/2 each. For each of these two sets, it calculates the sums of all 2^(N/2) possible subsets of its elements and stores them in an array of length 2^(N/2). It then sorts each of these two arrays, which can be done in time O(2^(N/2) · N). Once the arrays are sorted, the algorithm can check whether an element of the first array and an element of the second array sum to s in time O(2^(N/2)). To do that, the algorithm passes through the first array in decreasing order (starting at the largest element) and the second array in increasing order (starting at the smallest element). Whenever the sum of the current element in the first array and the current element in the second array is more than s, the algorithm moves to the next element in the first array. If it is less than s, the algorithm moves to the next element in the second array. If two elements with sum s are found, it stops.
This may be the best running time possible: no algorithm with a smaller worst-case exponent has been found since Horowitz and Sahni first published this algorithm in 1974, though its optimality has not been proved. [1]
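The split-and-sort algorithm described above can be sketched in Python as follows. This is a minimal illustration rather than the published formulation: the names `subset_sums` and `has_subset_sum` are hypothetical, and the separate handling of the empty subset is an assumption added here so that only non-empty subsets count.

```python
def subset_sums(values):
    """Return the sums of all 2**len(values) subsets; index 0 is the empty subset."""
    sums = [0]
    for v in values:
        sums += [t + v for t in sums]
    return sums


def has_subset_sum(values, s):
    """Decide whether some non-empty subset of values sums to s."""
    half = len(values) // 2
    left = subset_sums(values[:half])
    right = subset_sums(values[half:])

    # Case 1: a non-empty subset drawn from the left half only.
    if s in left[1:]:
        return True

    # Case 2: the subset uses at least one element of the right half.
    a = sorted(left)       # may include the empty subset of the left half
    b = sorted(right[1:])  # non-empty subsets of the right half only
    i, j = len(a) - 1, 0   # walk a from its largest, b from its smallest element
    while i >= 0 and j < len(b):
        total = a[i] + b[j]
        if total == s:
            return True
        if total > s:
            i -= 1         # sum too large: move to a smaller element of a
        else:
            j += 1         # sum too small: move to a larger element of b
    return False
```

For example, `has_subset_sum([-7, -3, -2, 5, 8], 0)` returns `True`, while `has_subset_sum([1, 2, 3], 0)` returns `False`.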
Pseudo-polynomial time dynamic programming solution
The problem can be solved as follows using dynamic programming. Suppose the sequence is x1, ..., xn and we wish to find a non-empty subset which sums to zero. Let N be the sum of the negative values and P the sum of the positive values (in this section these letters do not denote the number of variables and the precision). Define the function Q(i,s) to be 0 if there is no subset of x1, ..., xi which sums to s; 1 if there is a non-empty such subset; or 2 if only the empty subset sums to s (which can happen only when s is zero). (Thus, the question we really want answered is whether Q(n,0) equals 1.) Clearly, Q(i,s) = 0 for s < N or s > P.
Create an array to hold the values Q(i,s) for 1 ≤ i ≤ n and N ≤ s ≤ P. The array can now be filled in using a simple recursion. Initialize all Q(1,s) to 0. Let Q(1,0) be 2. Then let Q(1,x1) be 1. For i > 1, if Q(i−1, s−xi) is nonzero, let Q(i,s) be 1; otherwise let it be the value of Q(i−1,s). (Note that Q(i,s) can be made a boolean-valued function if we are interested in a subset which sums to something other than zero.) The total number of arithmetic operations is O(n(P − N)).
For example, if all the values are O(n^k) for some k, then P − N is O(n^(k+1)) and the time required is O(n^(k+2)).
This solution does not count as polynomial time in complexity theory because P-N is not polynomial in the size of the problem, which is the number of bits used to represent it.
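The recursion for Q(i,s) can be written out as a short Python sketch. The function name `has_zero_subset` and the variables `lo` and `hi` (standing for N and P in the text) are hypothetical, and keeping only one row of the table at a time is a space-saving choice made for this example.

```python
def has_zero_subset(xs):
    """Decide whether some non-empty subset of the integers xs sums to zero."""
    if not xs:
        return False
    lo = sum(x for x in xs if x < 0)  # N in the text: sum of the negative values
    hi = sum(x for x in xs if x > 0)  # P in the text: sum of the positive values

    # row[s - lo] holds Q(i, s): 0 = no subset, 1 = a non-empty subset,
    # 2 = only the empty subset (possible only for s == 0).
    row = [0] * (hi - lo + 1)
    row[0 - lo] = 2
    row[xs[0] - lo] = 1  # written second, so x1 == 0 correctly yields 1

    for x in xs[1:]:
        new = row[:]  # default: Q(i, s) = Q(i-1, s)
        for s in range(lo, hi + 1):
            prev = s - x
            # Out-of-range entries count as 0 (no subset).
            if lo <= prev <= hi and row[prev - lo] != 0:
                new[s - lo] = 1
        row = new
    return row[0 - lo] == 1
```

The two nested loops perform about n(P − N) steps, matching the operation count above. For example, `has_zero_subset([-7, -3, -2, 5, 8])` returns `True`.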
Polynomial time approximate algorithm
An approximate version of the subset sum would be: given a set of N numbers x1, x2, ..., xN and a number s, output
- yes, if there is a subset that sums up to s;
- no, if there is no subset summing up to a number between (1-c)s and s for some small c>0;
- any answer, if there is a subset summing up to a number between (1-c)s and s but no subset summing up to s.
If all numbers are non-negative, the approximate subset sum is solvable in time polynomial in N and 1/c.
The solution for approximate subset sum also provides the solution for the original subset sum problem in the case where the numbers are small (again, for non-negative numbers). If any sum of the numbers can be specified with at most P bits, then solving the problem approximately with c = 2^−P is equivalent to solving it exactly. Then, the polynomial time algorithm for approximate subset sum becomes an exact algorithm with running time polynomial in N and 2^P (i.e., exponential in P).
The algorithm for the approximate subset sum problem is as follows:
    initialize a list S to contain one element 0.
    for each i from 1 to N do
        let T be a list consisting of xi + y, for all y in S
        let U be the union of T and S
        sort U
        make S empty
        let y be the smallest element of U
        add y to S
        for each element z of U in increasing order do
            // trim the list by eliminating numbers close to one another
            if y < (1 − c/N)z, set y = z and add z to S
    if S contains a number between (1 − c)s and s, output yes, otherwise no
The algorithm is polynomial time because the lists S, T and U always remain of size polynomial in N and 1/c and, as long as they are of polynomial size, all operations on them can be done in polynomial time. The size of the lists is kept polynomial by the trimming step, in which we only include a number z into S if the previous y is smaller than (1 − c/N)z. This step ensures that each element in S is smaller than the next one by at least a factor of (1 − c/N), and any list with that property is of at most polynomial size.
The algorithm is correct because each trimming step replaces a sum by a value that is at least (1 − c/N) times as large, so after N steps every subset sum t is represented in S by a value of at least (1 − c/N)^N t ≥ (1 − c)t (by Bernoulli's inequality).
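The trimming procedure can be sketched in Python as follows. The function name `approx_subset_sum` is hypothetical, the inputs are assumed to be non-negative, and, as in the pseudocode, the empty subset (sum 0) is kept in the list. Floating-point arithmetic is used for the trimming threshold, which is an implementation shortcut assumed to be adequate for small examples.

```python
def approx_subset_sum(xs, s, c):
    """Approximate decision for non-negative xs and 0 < c < 1.

    Returns True if the trimmed list contains a subset sum in [(1 - c) * s, s].
    """
    n = max(len(xs), 1)  # avoid division by zero for an empty input
    kept = [0]           # the list S: trimmed subset sums, in increasing order
    for x in xs:
        # U: union of S and T = {x + y for y in S}, sorted.
        merged = sorted(set(kept) | {x + y for y in kept})
        y = merged[0]
        kept = [y]
        for z in merged[1:]:
            # Keep z only if it is not too close to the last kept value y.
            if y < (1 - c / n) * z:
                y = z
                kept.append(z)
    return any((1 - c) * s <= v <= s for v in kept)
```

For example, `approx_subset_sum([3, 34, 4, 12, 5, 2], 9, 0.1)` returns `True` because 4 + 5 = 9, while the same list with s = 30 returns `False` because no subset sum lies between 27 and 30.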
Generalizations
There are other formulations of the subset sum problem which are very similar to the one given above in terms of integers and addition. For example, the problem could be restated: given an integer n and a set of integers in the range [0, n−1], does any subset sum to zero modulo n?[citation needed] This form of the problem has been used as a basis for several public key cryptography systems. However, most of them have been broken, reducing confidence in those still unbroken. In the cryptography field, the subset sum problem is traditionally known as the "knapsack problem".
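For small moduli, the modular formulation can be decided by tracking which residues are reachable. The sketch below is an illustration only: the function name `has_zero_subset_mod` is hypothetical, and it assumes that only non-empty subsets count, since the empty subset trivially sums to zero.

```python
def has_zero_subset_mod(values, n):
    """Decide whether some non-empty subset of values sums to 0 modulo n."""
    reachable = set()  # residues of the sums of non-empty subsets seen so far
    for v in values:
        r = v % n
        # Either extend an earlier subset with v, or start a new subset {v}.
        reachable |= {(t + r) % n for t in reachable} | {r}
    return 0 in reachable
```

For example, `has_zero_subset_mod([1, 2, 3], 6)` returns `True` because 1 + 2 + 3 = 6. The set never holds more than n residues, so the running time is proportional to n times the number of values, which is again exponential in the number of bits of n.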
No algorithm is known for which we can prove that it solves subset sum in polynomial time. If any such algorithm exists, however, then some algorithms with that property are already known, even though their running time cannot currently be proved to be polynomial. See the end of the article on the complexity classes P and NP for one such algorithm.
References
- T. Cormen, C. Leiserson, R. Rivest. Introduction to Algorithms. MIT Press, 2001. Chapter 35.5, The subset-sum problem.
- Michael R. Garey and David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman. ISBN 0-7167-1045-5. A3.2: SP13, pg.223.
External links
- An algorithm (exponential time) for solving the subset sum problem
|