MapReduce: How to find Online Communities by removing nodes(vertices) from a social graph?

I want to carry out graph clustering on a huge undirected graph with millions of edges and nodes. The graph is almost clustered already: the different clusters are joined together only by a few nodes (ambiguous nodes that can belong to multiple clusters). There will be very few or almost no edges between two clusters. This problem is similar to finding a vertex cut set of a graph, with one exception: the graph needs to be partitioned into many components, and their number is unknown. (Refer to this picture: https://docs.google.com/file/d/0B7_3zLD0XdtAd3ZwMFAwWDZuU00/edit?pli=1)

It's almost like different tightly connected components sharing a couple of nodes between them, and I am supposed to remove those nodes to separate the components. Edges are weighted, but this problem is more about finding structure in a graph, so edge weights won't be relevant. (Another way to think about the problem is to visualize solid spheres touching each other at some points, with the spheres being the tightly connected components and the touching points being the ambiguous nodes.)

I am prototyping something, so I am quite short of time to study graph clustering algorithms myself and select the best possible solution. I also need a solution that cuts nodes and not edges, since in my case different clusters share nodes and not edges. Is there any research paper or blog that addresses this or a related problem? Or can anyone come up with a solution to this problem, however dirty?

Since millions of nodes and edges are involved, I would need a MapReduce implementation of the solution. Any inputs or links for that too? Is there an existing open source MapReduce implementation that I can use directly? I think this problem is analogous to finding communities in online social network graphs, where the communities need to be discovered by removing nodes (vertices).
Answer:

To address your algorithmic challenge: the metric you are describing is a "betweenness" metric. If you construct shortest paths between all pairs of nodes in the graph (using your weighting mechanism), the nodes and edges that bridge clusters will be reused in a large number of those paths, because traffic between clusters is forced through them. These edges and nodes have high "betweenness centrality". I recommend looking into approximation methods for identifying central elements, since exact computation is costly on a graph of this size. Note also that after each removal of a highly central graph element, recalculation is potentially expensive. This recalculation is necessary to avoid removing elements that are no longer central after earlier removals (e.g., if you had two clusters linked by two nodes). Read about the Girvan-Newman betweenness centrality clustering algorithm. The JUNG Java library has an implementation, although it does not support hierarchical clustering (e.g., saving the clusters after each clustering step).
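To make the idea concrete, here is a minimal single-machine sketch in Python. It is not the Girvan-Newman algorithm itself (which removes edges) and not JUNG's implementation: it is a node-removal variant that repeatedly deletes the node with the highest betweenness and recomputes the scores after every removal. It assumes an unweighted graph with no self-loops, stored as a dict of neighbour sets. The function names and the `target_parts` stopping rule are hypothetical choices for illustration; with an unknown number of clusters you would need a different stopping criterion.

```python
from collections import deque


def node_betweenness(adj):
    """Exact node betweenness (Brandes' algorithm), unweighted and undirected.

    adj maps each node to the set of its neighbours.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        part, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    part.add(w)
                    stack.append(w)
        parts.append(part)
    return parts


def split_by_node_removal(adj, target_parts):
    """Remove highest-betweenness nodes until target_parts components exist."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    removed = []
    while adj and len(components(adj)) < target_parts:
        scores = node_betweenness(adj)  # recomputed after every removal
        hub = max(scores, key=scores.get)
        for w in adj.pop(hub):
            if w in adj:
                adj[w].discard(hub)
        removed.append(hub)
    return removed, components(adj)


# Two triangles that share only the node "x".
graph = {
    "a": {"b", "x"}, "b": {"a", "x"},
    "c": {"d", "x"}, "d": {"c", "x"},
    "x": {"a", "b", "c", "d"},
}
removed, parts = split_by_node_removal(graph, target_parts=2)
```

In the example graph, every shortest path between the two triangles passes through "x", so it is the only node with a non-zero score and is the one removed. The exact computation costs roughly one breadth-first search per node per removal, which is why approximation (for instance, sampling source nodes) matters at the scale described in the question.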

Jacob Ouellette at Quora
