To have a “wormhole”
you need a worm to dig it. If no worm, no hole either…

In short this is the
solution of “P vs. NP” problem. The answer is “No”. P is not equal to NP, as
there is no “worm” to dig a shortcut and this way bypass the exponential
algorithm tragedy. There is no shortcut and such one is impossible. The purpose
of this article is to prove this.

To achieve this the
concept of this article is to prove that a specific NP-complete problem cannot
be solved non-exponentially. This will prove that P≠NP, because otherwise
contradiction will appear. If a NP-complete problem is proved as un-solvable in
polynomic time and another NP-complete is solved then the first will also be
solved as it could be converted into second one. Obviously not-possible.

To prove this assertion
we can use the “Subset sum problem”. If we prove that it cannot be solved
non-exponentially, this will be the end of “P vs. NP” discussion.

And the answer of
“Subset sum problem” is very short and simple:

**You can not avoid exponents (x**

^{n}) in a problem where combinatorics is involved. No such possibility. And that is in short the solution of “P vs. NP”. P is not equal to NP.
You don’t need to read
any more this article unless you are keen on brain gymnastics and developing
intellect by thinking on complex issues. If you are interested just in the
answer, that is all – exponents are inevitable with combinatorics involved.

And you can not solve
the subset sum problem without combinatorics. Theoretically if such a decision
was possible, so it will mean existence of functional dependency between the
numbers. Functional dependency and calculating by formula is the only way to
find the right answer among many other wrong answers. But this will contradict
the initial condition of the problem – finding a subset in a set of arbitrary
integers. The problem is defined as universal – the algorithm must work on ANY
set of integers. And this excludes functional dependency. As no dependency
exists, combinatorics is the way. And combinatorics means exponents.

You can do whatever you
want with the initial set (below are some thoughts on this), but never the less
what you do, some subset will remain for combinatorial checkup. Therefore the
only result of your effort will be a hypothetic reduction of the exponent size
(x

^{n}, x^{n/2}…x^{n/1000}…). But in all these cases some exponential element will remain. Generally the most used approach in this area is splitting the initial set to subsets, because this way the exponential power goes down. But anyway there remains an exponential component. In addition, splitting has some limits. You can not split too much, because you will start missing some combinations containing possible solution. To avoid this you will have to combine again subsets each other and this will bring you back to a previous level of splitting. But this is in the next part of analysis. The real solution of “P vs. NP” is as follows:**“Subset sum problem” is a complete NP-problem. “Subset sum problem” can not be solved without using combinatorics at all. The existence of combinatorics means unavoidable exponents. And this means that no pure polynomic solution is possible. As “Subset sum problem” is proven unsolvable in polynomic time, and as it is a NP-complete, therefore:**

**P≠NP**

__2.Some thoughts on what is theoretically lowest needed complexity for “Subset sum problem”__
Now let’s return to the
initial epigram…

To have a “wormhole”
you need a worm to dig it. If no worm, no hole either…

So let’s start
searching for “worms”. I.e. for something that could help us reduce this
horrifying check-all-combinations calculation burden. What could help us avoid
this? Is there a “worm”, a “privilege”, any concept that could help us?

Unfortunately No.

All we have is a set of
numbers. And nothing more. No number is marked as preferable. No any arithmetic
operation on numbers can help us avoid some of the calculations. There is
nothing that can help us. No “worm” to dig. No “filtering privilege” to send in
trash these unneeded and exhausting combinations. No such savior among numbers.

Anyway we have
something that can help. But not enough. We have 3 important things that can be
used for filtering:

-We have the initial
set itself and can somehow manipulate it. Generally we can split it

-We have a target
number that the subsets must equal - for simplification we could accept this as
zero(0).

-We have the value of
the current calculated combination

That is all we have. In
fact these elements can be used for some filtering. But not enoung to make any potential
algorithm non-exponential. In fact in one way or another exactly these 3
elements are used in all algorithms for solving the problem, including the most
effective ones. These elements can be used in different ways – directly or
indirectly, by summing, sorting, merging, etc. But all ways descend visibly or
not, from these 3 – manipulating the set, target number and value of current
calculated combination.

So if based only on
these 3 factors (worms, privilege filters) we can not have a non-exponential
solution, therefore we can not have such a solution at all.

So the purpose of the
second part of this article is to find the filtering limit, based on the set,
current calculated value and target number.

Let’s start with the
set, because this is used together with the rest 2 factors.

Splitting the set is a
powerful tool of lowering the exponential power. Working with smaller subsets
you can reduce the number of combinations to be checked. Using it along with
other filters the overall job done will go down. Anyway splitting does not
remove the exponential nightmare for the computer. It just makes it smaller.

In addition there are
limits of splitting. If you split only in 2, then you will lower the exponent
power to the level of the bigger subset. If 2 subsets are equal, you will have
exponential power of n/2. So why not then spilt in 4,5…1000? Why not split
until we reach a subset of 1 number and solve the problem instantly? The answer
is obvious. If you split on more than 2 subsets, you risk starting to miss some
of the combinations needed for check. To avoid this you will have to combine
again subsets each other. And this will bring you back to a previous level of splitting.
At last you will be back to the initial split in 2 subsets. In this case with
any subset having just one another subset as a possible “partner” there is no
risk of missing combinations. So the theoretical limit of filtering power of
splitting is n/2.

Now let’s go to the
target number. It is the weaker filter. The main thing it can be used for is to
conveniently divide the initial set into 2 subsets (for instance positive and
negative after sorting). So it is an interesting “split point” that can be
useful in some cases. In fact you can divide it at any number, but if it is not
the target number you will lose some of the filtering power of this number. In
fact some of algorithms are doing exactly this, but just because the “profit”
from the other split is bigger than the loss from this. Anyway this is not so
important issue, now we are on the matter of target value.

As we said above dividing
the set to subsets is very useful because this way we can lower the complexity
to calculate all possible combinations in each of the 2 subsets. For instance,
if 2 subsets are equal, then we will have complexity lowered from O(2

^{n}) to (2*2^{n/2}). If they are not equal we will have different reductions in each of 2 subsets.
Why the target number
is interesting? The answer is because it splits a hypothetically sorted set of
numbers to 2 subsets that does not need to calculate the combinations of
numbers that are only from one of the sides of the target number. I.e. if the
split is on positive and negative numbers and the target is 0, then you don’t
need to check all combinations containing only positive or only negative
numbers. You will save much of the time of your processor.

Anyway the target
number as filtering factor is not so important. Sometimes it can be “more
profitable” to split at any other number.

Once we have 2 subsets,
theoretically we can go back to initial version by making each-with-each
combination and then the complexity will go back to O(2

^{n}).
And here it comes the
second filter – the value of the current combination. Using it we can prevent
part of exactly this “going back”. In fact, we will really go back as we will
have to check many combinations that contain members of both 2 subsets. But
using the “current value” we can avoid part of the job.

So what is the
“filtering power” of current value?

Generally we can
benefit in 2 ways – by filtering final calculated values of other combinations,
or by filtering the initial numbers that start creating combinations and also
already calculated combinations from which next level of combinations descend.

These are 2 different
approaches that are used in different algorithms. In this article I will not
talk about any algorithm, nor focus on inventing algorithms. I am only
analyzing the limits of filters that are one or another way implemented in
these algorithms.

In filtering against
final calculated values of other combinations, using the current value of
current combination can be very powerful. If the subsets are sorted the current
value can filter almost all useless combinations. Imagine a set of 1000
calculated positive values that are sorted. You have a negative current value.
So you can theoretically filter (by some type of genius algorithm) all, or
almost all, the values that are not equal to your current value. So the
theoretical filtering power of current value is close to 100% when used against
other calculated values that are sorted.

Anyway in this case the
computer power will have to go to calculating other values and sorting them. In
the case of a set divided in 2 subsets, we will have O(2

^{n/2}) for creating each subset plus some logarithmic cost of sorting. So if we want to use this extreme filtering power we will have to “invest” in initial calculations for creating subsets. If they are equal the complexity will be O(2^{n/2}). But if we have unequal as size subsets then we risk increasing the complexity to the one of the bigger of the 2 subsets (for instance O(2^{n/4})- small and O(2^{3n/4})-big).
So we come to the
second approach. We can use the filtering power of current calculated value to
filter combinations before they are calculated. For instance with current
calculated value of -5 and a target of 0, we can filter all combinations that
are based on positive numbers larger than 6. This means all 1-number
possibilities and all 2,3,4…n combinations that descend from 6 and upper
numbers. In addition we can on next step filter all 2-number combinations that
exceed value of 5. For instance we will check the number 2, but will filter the
combination “2+4” and all consecutive combinations. And so on.

All this can be done if
we are using some type of tree-building algorithm with any new combination
coming from the previous one. Combined with initially sorted set (or subset),
this filtering can be very impressive as a size. And we don’t forget that in
this case we have not still invested computer power in calculating
all-possible-combinations.

But what is the limit
of this filtering before calculating? In fact the calculation is not complex.
In this case the dependency is something like “linear”. Not exactly linear
because the function is based on discrete numbers and is not uninterrupted. But
generally the calculations are the same. The number of calculations that can be
filtered and avoided of calculating is depending on the absolute size of the
current calculated value. If it is very low and has a low absolute value of the
proportion of this value and maximal theoretic value for the subset, then we
will be able to filter almost all of the combinations. Imagine current value of
-2 and 1000 positive numbers in subset of positives with 999 numbers higher
than 2. You will be able to filter 99,9% of all these combinations.

But look at the other
side. Imagine current calculated value of -10000 and 1000 positive numbers with
999 of them lower than 10000. You will avoid just 0,01% of combinations. Of
course you will have the ability to filter among 2,3…n number combinations. But
theoretically if you have a current value that is close to the maximal possible
calculated value (as absolute value, it can be negative) then you will be able
to filter almost nothing.

So this dependency is
something like linear. There are math rules here that lead us to filtering
about half of the possible combinations. Тhe function is something like:

y(x)=(x/constant1)*constant2.

Where

z – final number of
calculated combinations

y - number of checked
combinations for a given value of x

x – current calculated
value

constant1 – theoretical
max calculated value

constant2 – average
density of combinations in set

The sum of all
members(y) is something like half of the all possible combinations with
linear-type function like this.

So is that good and
enough?

If you don’t have any
initial divide this economy is not so impressive. You will go to O(2

^{n}/2). If we try a mixed approach we can calculate all and sort the smaller subset, and then try filtering with the bigger subset. In our previous cate - O(2^{n/4})- small and O(2^{3n/4})-big). So the optimization will be to a level of O(2^{3n/4}/2). Obviously much worse than O(2^{n/2}) with 2 equal as size subsets.
And that is the end.
That is all. All possible possibilities are checked. The power to filter the
combinations is limited to O(2

^{n/2}). And no algorithm based on these three “worms” can do better. In fact this is the answer of the question why after 4 decades of searching it is not yet discovered an algorithm better than the one of Horowitz and Sahni*. In fact this algorithm is very close to the theoretically possible lowest level complexity O(2^{n/2}) and any other better algorithm will not bring enormous improvement.
So finally – if the
“Subset sum problem” is proven unsolvable in non-exponential time, and as it is
a NP-complete problem, therefore:

__P≠NP__
Jan-04-2016 Author: Dobri Bozhilov

------------------

__References:__*Horowitz, Ellis; Sahni, Sartaj (1974), "Computing partitions with applications to the knapsack problem", Journal of the Association for Computing Machinery 21: 277–292, doi:10.1145/321812.321823, MR 0354006*

## No comments:

## Post a Comment