hadisfr/biclique

Maximal Biclique Enumeration

  • Introduction: This is an implementation of the maximal biclique enumeration algorithm of Alexe et al. (http://citeseer.nj.nec.com/alexe02consensus.html).
  • Files and usages:

    • max_biclique.zip: The archive containing the necessary source files and some additional test data files.
    • biclique2.h: The implementation of the algorithm.
    • biclique2_MS.h: The same class implementation as above, modified for the Visual Studio environment. Thanks to Thomas Sharpless for the contribution.
    • sbtest_ii.cpp: The main program, which handles file input/output and demonstrates how to use the bigraph class.
      To compile it (under Linux or Mac OSX), simply use the command:
      $ g++ -o sbtest sbtest_ii.cpp
      
      To run it, use the command line (all the file names are mandatory):
      $ sbtest [input bigraph file] [output biclique file] [output size file]
      
      For Visual Studio, please switch to the simple_biclique2_MS.h header file.
    • sbtest.linux, sbtest.osx, sbtest.sun: These are pre-compiled binaries for indicated platforms.
    • n400-x150-d80.bigraph: An example of a bipartite graph with 400 nodes in the first set, 150 nodes in the second set, and around 12000 edges. A bipartite graph has two node sets, say X and Y. Each edge is represented by a node x from X and a node y from Y. The input file consists of a list of edges, one per line, where x is in the first column and y is in the second column. For example, the following bipartite graph is represented as:
      [Figure: example graph]
       2 2
       2 3
       2 4
       3 1
       3 2
       3 3
      
      in a text file.
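
The edge-list format above can be read in a few lines. The following Python 3 sketch is not part of the package: `read_bigraph` is a hypothetical helper name, and it assumes whitespace-separated integer node IDs with one edge per line.

```python
from collections import defaultdict


def read_bigraph(lines):
    """Parse 'x y' edge lines into an adjacency map {x: set of y}."""
    adj = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # assumption: blank or malformed lines are skipped
        x, y = int(fields[0]), int(fields[1])
        adj[x].add(y)
    return dict(adj)


# The six edges of the example graph, as they would appear in the input file.
example = """\
2 2
2 3
2 4
3 1
3 2
3 3
"""

adj = read_bigraph(example.splitlines())
for x in sorted(adj):
    print(x, sorted(adj[x]))
# prints:
# 2 [2, 3, 4]
# 3 [1, 2, 3]
```

Keying the map by nodes of the first set keeps the two node sets separate, which matters because the same integer (for example 2) can name a node in X and a different node in Y.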
    • n400-x150-d80.biclique, n400-x150-d80.size: The output for the above input bipartite graph file. These files are provided for verification. Note that the generated n400-x150-d80.biclique file is 18 MB in size and is therefore not included in the archive.
      In the output file, each biclique is represented by 3 lines: 
      [list of nodes from 1st node set]
      [list of nodes from 2nd node set]
      [empty line]
      For the example graph above, the bicliques in the output file will be:
      3
      1 2 3

      2 3
      2 3

      2
      2 3 4
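
For verification on small inputs, the three-line record format can be parsed back and each record checked against the edge list. The sketch below is Python 3 and is not part of the package; `read_bicliques` and `is_maximal_biclique` are hypothetical names. The check is a direct definition test (every x-y pair is an edge and neither side can be extended), not the enumeration algorithm of Alexe et al.

```python
def read_bicliques(lines):
    """Parse the 3-lines-per-biclique format into a list of (xs, ys) sets."""
    # Assumption: node IDs are whitespace-separated integers and neither node
    # list of a record is empty, so blank lines act purely as separators.
    rows = [set(map(int, ln.split())) for ln in lines if ln.strip()]
    return list(zip(rows[0::2], rows[1::2]))


def is_maximal_biclique(edges, xs, ys):
    """True if every x-y pair is an edge and neither side can be extended."""
    if not xs or not ys:
        return False
    all_x = {x for x, _ in edges}
    all_y = {y for _, y in edges}
    # Nodes on each side that are adjacent to every node of the other side.
    common_y = {y for y in all_y if all((x, y) in edges for x in xs)}
    common_x = {x for x in all_x if all((x, y) in edges for y in ys)}
    return common_y == ys and common_x == xs


# Edges of the example graph and the expected output records.
edges = {(2, 2), (2, 3), (2, 4), (3, 1), (3, 2), (3, 3)}
output = """\
3
1 2 3

2 3
2 3

2
2 3 4

"""

bicliques = read_bicliques(output.splitlines())
for xs, ys in bicliques:
    print(sorted(xs), sorted(ys), is_maximal_biclique(edges, xs, ys))
# prints:
# [3] [1, 2, 3] True
# [2, 3] [2, 3] True
# [2] [2, 3, 4] True
```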

    • bicli_break.py: A Python 2 script that splits the generated bicliques into separate files, one biclique per file. Optional cut-off values m and n can be applied to filter out small bicliques, where m is the minimal size of the first node set and n the minimal size of the second.
      $ python bicli_break.py [biclique file] [m-cutoff] [n-cutoff]
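
The effect of the cut-offs can be illustrated with a short Python 3 sketch. This is not the bicli_break.py script itself: `split_bicliques` and the `biclique_N.txt` naming scheme are hypothetical, and the cut-offs are assumed to be inclusive (a biclique is kept when its first set has at least m nodes and its second set at least n).

```python
import os
import tempfile


def split_bicliques(bicliques, out_dir, m=1, n=1):
    """Write each biclique with |X| >= m and |Y| >= n to its own file."""
    paths = []
    for xs, ys in bicliques:
        if len(xs) < m or len(ys) < n:
            continue  # below a cut-off: skip this biclique
        # Hypothetical naming scheme; the real script may name files differently.
        path = os.path.join(out_dir, "biclique_%d.txt" % (len(paths) + 1))
        with open(path, "w") as fh:
            fh.write(" ".join(map(str, xs)) + "\n")
            fh.write(" ".join(map(str, ys)) + "\n")
        paths.append(path)
    return paths


# The three bicliques of the small example graph.
bicliques = [([3], [1, 2, 3]), ([2, 3], [2, 3]), ([2], [2, 3, 4])]

with tempfile.TemporaryDirectory() as tmp:
    kept = split_bicliques(bicliques, tmp, m=2, n=2)
    names = [os.path.basename(p) for p in kept]
    contents = []
    for p in kept:
        with open(p) as fh:
            contents.append(fh.read())

print(names)
# prints: ['biclique_1.txt']
```

With m = 2 and n = 2, only the biclique with node sets {2, 3} and {2, 3} survives, since the other two have a single node in the first set.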
      
  • Performance: The theoretical running time of the algorithm is O(Bn^3), where B is the number of maximal bicliques. On a P4 3.0 GHz Linux machine, it takes about 18 minutes to enumerate all 621948 bicliques from the example file (n400-x150-d80.bigraph).
  • Known problems: The program has been tested on Linux with gcc 2.96 and 3.0.4 without any problem. Although there are warnings with gcc 3.2, the compiled program works correctly.
  • Acknowledgements: This work was supported in part by NSF award #EF-0334832.
  • Distribution: This work is distributed under the GPL (http://www.gnu.org/copyleft/gpl.html).
  • Contact: The content is maintained by Wen-Chieh Chang (wcchang@iastate.edu).

-Last Update: 2004/12/09