- Introduction: This is an implementation of the maximal biclique enumeration algorithm of Alexe et al. (http://citeseer.nj.nec.com/alexe02consensus.html)
- Files and usages:
- max_biclique.zip: The archive containing the necessary source files and some additional test data files.
- biclique2.h: The implementation of the algorithm.
- biclique2_MS.h: The same class implementation as above, modified for the Visual Studio environment. Thanks to Thomas Sharpless for the contribution.
- sbtest_ii.cpp: The main program, which handles file input/output and demonstrates how to use the bigraph class.
To compile it (under Linux or Mac OS X), use the command:
$ g++ -o sbtest sbtest_ii.cpp
To run it, use the following command line (all file names are mandatory):
$ sbtest [input bigraph file] [output biclique file] [output size file]
For Visual Studio, please switch to the simple_biclique2_MS.h header file.
- sbtest.linux, sbtest.osx, sbtest.sun: Pre-compiled binaries for the indicated platforms.
- n400-x150-d80.bigraph: An example of a bipartite graph with 400 nodes in the first set, 150 nodes in the second set, and around 12000 edges.
A bipartite graph has two node sets, say X and Y.
Each edge is represented by a node x from X and a node y from Y.
The input file consists of a list of edges
where x is in the first column and y is in the second column.
For example, a bipartite graph with the six edges below is represented in a text file as:
2 2
2 3
2 4
3 1
3 2
3 3
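The edge-list format described above can be read with a few lines of Python. This is only a minimal sketch and not part of the distributed sources: the function name read_bigraph is hypothetical, node labels are kept as strings, and blank or malformed lines are assumed to be skippable.

```python
from collections import defaultdict


def read_bigraph(lines):
    """Parse "x y" edge lines into a dict mapping each x to its set of y neighbours."""
    adjacency = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # assumption: blank or malformed lines are ignored
        x, y = fields[0], fields[1]  # first column is from X, second from Y
        adjacency[x].add(y)
    return dict(adjacency)


# The six-edge example from the text above.
example = """2 2
2 3
2 4
3 1
3 2
3 3
"""
graph = read_bigraph(example.splitlines())
```

Here graph maps node 2 of the first set to {2, 3, 4} and node 3 to {1, 2, 3}. For a real input file, pass an open file object instead of example.splitlines().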
- n400-x150-d80.biclique, n400-x150-d80.size: The output files generated from the input bipartite graph file above, provided for verification. Please note that the generated n400-x150-d80.biclique file is 18 MB in size and is therefore not included in the archive.
In the output file, each biclique is represented by 3 lines:
[list of nodes from 1st node set]
[list of nodes from 2nd node set]
[empty line]
For the example above, the output biclique file will contain:
3
1 2 3

2 3
2 3

2
2 3 4

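Since the sample output is provided for verification, a small checker for the three-line biclique format may be useful. The sketch below is not part of the distribution: parse_bicliques and is_biclique are hypothetical names, and it assumes node labels are separated by whitespace. It only checks that every listed pair is an edge of the input graph, not that each biclique is maximal.

```python
def parse_bicliques(text):
    """Split output text into (first_set, second_set) pairs of node-label lists."""
    lines = [line.strip() for line in text.splitlines()]
    bicliques = []
    i = 0
    while i + 1 < len(lines):
        if not lines[i]:
            i += 1  # skip the empty separator line
            continue
        # Two consecutive lines: nodes from the 1st set, then nodes from the 2nd set.
        bicliques.append((lines[i].split(), lines[i + 1].split()))
        i += 2
    return bicliques


def is_biclique(first, second, edges):
    """True if every node in `first` is connected to every node in `second`."""
    return all((x, y) in edges for x in first for y in second)


# Edges and output of the small example in the text.
edges = {("2", "2"), ("2", "3"), ("2", "4"), ("3", "1"), ("3", "2"), ("3", "3")}
output = "3\n1 2 3\n\n2 3\n2 3\n\n2\n2 3 4\n\n"
found = parse_bicliques(output)
all_valid = all(is_biclique(a, b, edges) for a, b in found)
```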
- bicli_break.py: A Python 2 script that splits the generated bicliques into separate files, one biclique per file.
Optional cut-off values m and n can be applied to filter out unnecessarily small bicliques, where m is the minimal size of the first node set and n that of the second.
$ python bicli_break.py [biclique file] [m-cutoff] [n-cutoff]
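The effect of the cut-off values can be illustrated as follows. This is a sketch of the idea only, not the code of bicli_break.py: the function names and the output file naming (biclique_1.txt, ...) are hypothetical, and the cut-offs are assumed to be inclusive, i.e. a biclique is kept when its first set has at least m nodes and its second set has at least n nodes.

```python
import os


def filter_bicliques(bicliques, m=0, n=0):
    """Keep bicliques with at least m nodes in the 1st set and n in the 2nd (assumed inclusive)."""
    return [(a, b) for a, b in bicliques if len(a) >= m and len(b) >= n]


def write_separately(bicliques, out_dir, prefix="biclique"):
    """Write each biclique to its own two-line file; the naming scheme is hypothetical."""
    paths = []
    for index, (first, second) in enumerate(bicliques, start=1):
        path = os.path.join(out_dir, "%s_%d.txt" % (prefix, index))
        with open(path, "w") as handle:
            handle.write(" ".join(first) + "\n" + " ".join(second) + "\n")
        paths.append(path)
    return paths


# The three bicliques of the small example in the text.
sample = [(["3"], ["1", "2", "3"]), (["2", "3"], ["2", "3"]), (["2"], ["2", "3", "4"])]
kept = filter_bicliques(sample, m=2, n=2)
```

With m = 2 and n = 2, only the biclique with node sets {2, 3} and {2, 3} remains; with the default cut-offs of 0, nothing is filtered out.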
- Performance: The theoretical running time of the algorithm is O(Bn^3), where B is the number of maximal bicliques. On a P4 3.0 GHz Linux machine, it takes about 18 minutes to enumerate all 621948 bicliques from the example file (n400-x150-d80.bigraph).
- Known problems: The program has been tested on Linux with gcc 2.96 and 3.0.4 without any problem. Although there are warnings with gcc 3.2, the compiled program works correctly.
- Acknowledgements: This work was supported in part by NSF award #EF-0334832.
- Distribution: This work is distributed under the GPL (http://www.gnu.org/copyleft/gpl.html).
- Contact: The content is maintained by Wen-Chieh Chang (wcchang@iastate.edu).
- Last update: 2004/12/09