In last week’s article, ‘Cupid with Computers‘, we engaged in a matchmaking activity with a group of 19 international OR bloggers. Since then, we have had some interest (both internal and external) on what technique we used, so we thought we would lift some of the mystery on how we not only recommended a best match for individual bloggers, but how we came up with a matching for the entire community, ensuring that everyone had a good friend.
Our analysis of available data gave us a set of pair-wise compatibility scores, indicating for any pair of bloggers how compatible we thought they were. Building from this, we sought to match pairs of bloggers together such that the total compatibility score across all pairs was as high as possible. We solved it mathematically using a greedy algorithm that was quick and easy to implement.
A ‘greedy’ algorithm always selects the best option currently available (i.e. it selects the most compatible available pair) and does not consider trade-offs against future choices. However, as we all probably know, the course of true love never did run smooth and we should really have used an alternative technique, such as integer programming to guarantee an optimal solution.
But what is love after all? How do you know when you have found your ‘optimal’ match? Although many people do keep trying more possibilities (and enjoy themselves in the process) shouldn’t you just be ‘greedy’ and grab a good opportunity when you see one…
Our Greedy Lust
Consider what we did with our final five bloggers (John, Patricia, Phillip, Shiva and Hakan) with the compatibility matrix as below:
Our greedy method matches John with Patricia to score 4.4 and then chooses the best remaining option, in this case Hakan with Phillip to get a combined score of 6.6. But could we have done better? The fiery red 4.4 certainly is tempting, but it prevents us from later choosing the 3.7 and we could have selected this, together with the 4.2 for a combined score of 7.9.
But is 7.9 the best possible score? How can we be 100% certain? We could look at all possible solutions and their scores and show that none is better. That is 15 different possible answers for five bloggers so not too difficult. But if we let the number of bloggers grow to 19 (as in last week’s problem) then the number of possible pairings rapidly grows to 654,729,075 and checking all solutions is a lot more difficult…
Fortunately, love looks not with the eyes, but with the mind and a technique called Integer Programming comes to our rescue. With the pressure of the INFORMS Blog Challenge deadline behind us, we brought out our copy of XpressMP optimisation software to properly formulate the problem.
For the interested, and we welcome you to skip ahead if you aren’t, the problem can be expressed mathematically as an Integer Programming problem as follows (see also the attached Excel formulation or XpressMP model file)
At first it can seem like we’ve made the problem harder rather than easier. There are 171 decision variables which each can take a value of 0 or 1. This yields 2171 = 3 x 10^51 (3 followed by 51 zeros) possible solutions of which, we know, a mere 654,729,075 are valid and possibly only one is optimal. Searching for a needle in a hay universe to say the least.
However, using a method called branch and bound (for which, you’ll be pleased to know, we’ll leave out the details) the optimisation software can find the optimal solution for all 19 bloggers. This solution has a global compatibility score of 51.9 compared to our score last week of 50.6. So, for our sins, our greedy lust wasn’t optimal but, at 97.5% of the optimal solution, it wasn’t too bad at all.
Applications of Integer Programming
Back in the real world, the Capgemini OR team solves Optimisation problems such as this for our clients. For example, we have used Xpress-MP to:
- optimise stock control and production plans for a steel manufacturer
- optimise warehouse location for a European pharmaceutical (pictured) and
- minimise the total collection and delivery costs for a UK distribution company.
This last project demonstrated that the optimal allocation of zones to depots and depots to hubs would save the client more than £1,000,000 per annum. Thus, optimisation techniques allow us to deliver optimal business value and find solutions to problems where perhaps no solution, let alone a poor one existed previously.
Also, who knew love was so algorithmically complicated?! Come to think of it… it could explain quite a lot!
Attachments: Integer-Programming.xls contains the compatibility matrix and the formulation that can be copied and pasted into XpressMP