A quantum approach to statistics
Imagine you are playing the finale of a quiz which has three doors. Behind one door is a new Model X Tesla which you greatly desire. Behind the two other doors there are goats, and you do not particularly like goats. The quizmaster only speaks the truth and asks you to pick one of the doors.
You pick one door, and the quizmaster subsequently points to one of the two remaining doors saying, “It is not that door”. The question is: do you change your initial choice?
The right answer is at the end of this blog. But we can immediately recognize that this problem is just 3 qubits in a certain superposition. So why don’t we translate the statistics directly into a language we can use on quantum computers (QCs), rather than make a classical representation first and then try to solve it with QCs? It might be weird for us to deal with maybes in statistics, but for the QC it is fair game either way.
A quantum wildcard
Let’s look at a practical example, a game of clustering. Let’s say we have a database of criminals, where each criminal is represented by a binary vector. Each dimension of the vector is a behavioral property that the criminal either has or doesn’t have. We set ourselves the challenge to see if we can cluster types of behavior into groups, so that we can profile these criminals more efficiently and prevent more crimes.
We set out to find the (distance) correlation between each pair of vectors, make a map of the vectors and their distances to one another (see picture), and subsequently try to cut this map up into regions of high correlation.
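To make the distance map concrete, here is a minimal classical sketch. It assumes the Hamming distance (the number of properties on which two criminals differ) as the distance measure, which is my choice for illustration; the function names and the example profiles are made up as well.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions where two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_map(vectors):
    """Map every pair of indices (i, j), i < j, to the distance between them."""
    return {
        (i, j): hamming(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }


# Hypothetical profiles: each position is one behavioral property (1 = present).
profiles = [
    (1, 0, 1, 1),
    (1, 0, 1, 0),
    (0, 1, 0, 0),
]
distances = distance_map(profiles)
```

Pairs with a small distance are the ones that end up in the same region of the map.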
There are classical and quantum methods to do this, and provided the quantum algorithms converge quickly enough, the quantum approach may even be faster.
But what if we started off our representation not as a binary vector, but as quantum states? What if we added a quantum wildcard that is a superposition of 1 and 0? In other words: what if we add an additional dimension, a behavioral type that we do not yet know to exist, set it as both present and not present at the same time, and try to cluster again? Our initial classical vectors cannot represent this, but, some challenges aside, the representation we use to put them into the quantum computer is well capable of handling a superposition of having and not having this mystery behavior.
We can then run the algorithm again, but this time with one of the states in superposition from the start rather than a definite yes or no.
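As a rough picture of what such a wildcard looks like, the sketch below writes out the amplitudes of one criminal's vector with an extra qubit in an equal superposition of 0 and 1. It assumes the simplest possible encoding, one qubit per property, which is only one of several ways to load data into a QC; the function name and the example bits are hypothetical.

```python
import math


def wildcard_state(bits):
    """Amplitudes of |bits> combined with a wildcard qubit (|0> + |1>) / sqrt(2).

    Returns a dict from basis-state bitstring to amplitude. Only the two
    basis states that differ in the last (wildcard) position are non-zero.
    """
    amplitude = 1 / math.sqrt(2)
    prefix = "".join(str(b) for b in bits)
    return {prefix + "0": amplitude, prefix + "1": amplitude}


# A criminal with three known properties plus one unknown one.
state = wildcard_state((1, 0, 1))

# Probabilities are squared amplitudes and must add up to 1.
total_probability = sum(a * a for a in state.values())
```

The known properties stay definite, while the unknown one carries equal weight for "present" and "not present".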
This has 3 possible outcomes:
- The algorithm cannot handle this much randomness and no convergence is found,
- We get a slightly different clustering than before which is worth checking out,
- We get the same clustering as we did without the wildcard.
The third outcome is of course the most interesting: we now know that the clustering is ‘good’, because it stays the same even if we add another unknown dimension with all possible outcomes in it. Our clustering is ‘complete’. This would not have been possible to uncover had we stuck with our original classical binary representation, because quantum computers allow us, to a certain extent, to calculate with things we know we do not know, rather than only with things we know.
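To pin down what ‘stays the same’ means, here is a purely classical stand-in, not a quantum algorithm: it tries every possible yes/no assignment of the extra property by brute force and checks whether the clusters ever change. The simple threshold-based clustering, the Hamming distance, and the example data are all assumptions for illustration. Note that the number of assignments doubles with every criminal, which is exactly the cost a superposition is meant to avoid.

```python
from itertools import product


def hamming(a, b):
    """Count the positions where two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster(vectors, threshold):
    """Group indices linked by chains of pairs with distance <= threshold."""
    n = len(vectors)
    labels = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if hamming(vectors[i], vectors[j]) <= threshold:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    return frozenset(
        frozenset(k for k in range(n) if labels[k] == lab) for lab in set(labels)
    )


def is_stable(vectors, threshold):
    """True if no 0/1 assignment of one extra property changes the clusters."""
    baseline = cluster(vectors, threshold)
    for extra in product((0, 1), repeat=len(vectors)):
        extended = [v + (e,) for v, e in zip(vectors, extra)]
        if cluster(extended, threshold) != baseline:
            return False
    return True


# Hypothetical profiles forming two well-separated groups.
profiles = [(1, 1, 1, 1, 0), (1, 1, 1, 1, 1), (0, 0, 0, 0, 0), (0, 0, 0, 0, 1)]
stable_loose = is_stable(profiles, threshold=2)
stable_tight = is_stable(profiles, threshold=1)
```

With the looser threshold the two groups survive every assignment of the unknown property; with the tighter one, a single disagreement on the new property is enough to split a group.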
What will happen is still under active research, but with QCs emerging on the commercial market, I’d invite everyone who has one to try this experiment for themselves. As for the goats: switching gives you a 2/3 chance of winning, and staying a 1/3 chance. You’d be wise to switch doors. How we introduce the concept of ‘choice’ in quantum statistics, well, that is something for a different time.
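If you do not trust the 2/3, you can check it classically by listing every case. The sketch below assumes the standard rules: the car and your first pick are uniformly random, the quizmaster always points at a goat door you did not pick, and picks at random when there are two such doors. The function name is made up.

```python
from fractions import Fraction
from itertools import product


def win_probabilities():
    """Exact chance of winning the car when staying and when switching."""
    stay = Fraction(0)
    switch = Fraction(0)
    cases = list(product(range(3), repeat=2))  # (car door, first pick)
    for car, pick in cases:
        weight = Fraction(1, len(cases))
        # Doors the quizmaster may point at: not your pick, not the car.
        openable = [d for d in range(3) if d != pick and d != car]
        for opened in openable:
            w = weight / len(openable)
            # The only door left after your pick and the opened door.
            other = next(d for d in range(3) if d not in (pick, opened))
            if pick == car:
                stay += w
            if other == car:
                switch += w
    return stay, switch


stay, switch = win_probabilities()
```

Staying wins only when your first pick was right, which happens one time in three; in every other case the quizmaster has no choice and the remaining door hides the car.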