Skip to content

Used Python to implement the Soft Regularized Markov Clustering (SR-MCL) algorithm and tested it on an extensive real-world dataset, weighted yeast proteins interactome, for time complexity and performance check beside a small random graph for showing the nodes' predicted clusters.

License

Notifications You must be signed in to change notification settings

arabporr/SR-MCL_Graph_Clustering

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Identifying overlapping functional modules of protein interactions graph

Introduction

In this project, I tried implementing one of the most efficient algorithms for large graphs clustering, the SR-MCL, in python and checked how it works on both big and small graphs. Due to this, the implemented code was then run on a real-world dataset, weighted yeast proteins interactome, to check how long it takes to reach an end and converges with a graph of size 1e3 nodes and 1e4 edges!

Methodology

The SR-MCL algorithm is an improvement to the MCL (Markov Clustering Algorithm). The MCL is a practical algorithm for clustering biological networks, for instance, clustering protein-protein interaction (PPI) networks to identify functional modules. But it has some limitations and problems with bridge nodes.

A few years later, the R-MCL (Regularized MCL) introduced by Yu-Keng Shih and Srinivasan Parthasarathy in 2010 solved some of the MCL problems and improved execution time. However, still, the problem with bridge nodes remains. Two years later, the same people came up with the new idea of SR-MCL. The main idea behind it was to make the R-MCL algorithm not focus on bridges by softening the weights of the canonical flow matrix after each step.

Result on a small random graph

  • After testing its performance to see how it works, I created a random graph with 50 nodes and some edges between them.

RandomGraph

  • Then by running the algorithm on them, I colored each cluster a different color, But the bridge nodes colored yellow to be identified!

ClusteredRandomGraph

Reference

About

Used Python to implement the Soft Regularized Markov Clustering (SR-MCL) algorithm and tested it on an extensive real-world dataset, weighted yeast proteins interactome, for time complexity and performance check beside a small random graph for showing the nodes' predicted clusters.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published