Set Cover Problem (P =?= NP) and the Archetype technique for classification

I have published a heuristic solver for SCP on GitHub and PyPI (you can install it via pip):

https://github.com/guangtunbenzhu/SetCoverPy 

The paper can be found here: http://arxiv.org/abs/1606.07156

The set cover problem (SCP) is one of the open problems in operations research. It has many real-life applications, such as crew-assigning for trains and airlines, fire station/school-location selection, nurse-scheduling etc. As an astronomical application of SCP, I have developed a new Archetype technique for classification purposes.

The SCP can formulated as an integer linear programming problem as follows. There are M rows to be covered and there are N columns to select from (to cover the rows). Let a binary MxN matrix A represent the relation between columns and rows. If a column j covers row i, then a_ij = 1, while a_ij = 0 if otherwise. If the cost of a column j is c_j, the problem is to find a set of columns S with minimal total cost, subject to that all rows must be covered. Formally, the SCP is to find

The problem is notoriously difficult. It is one of Karp's 21 NP-complete problems and therefore has important implications for P =?= NP, one of the seven millennium prize problems.

 

I have developed a heuristic solver based on Lagrangian relaxation,

https://github.com/guangtunbenzhu/SetCoverPy

You can install it through pip:

>> pip install SetCoverPy

For the standard tests (4,5,6, A-G instances from Beasley's OR Library), the code yields a solution that is on average 99% optimal (in terms of the final cost). The code can still be improved and any feedback will be greatly appreciated. 

★ This work has been funded by #HST-HF2-51351

Johns Hopkins University

© 2016 Guangtun Ben Zhu

NASA Hubble Fellowship Program