Jul 8, 2011

My Circles

Google+ seems well received so far.  Most of the people I've talked to really enjoy the "Circles" feature and feel that it fits their mental model well.  I thought it would be fun to do a little data dive on my Facebook graph to see how my relationships were actually interconnected and whether the data matched my perception of how I was planning my circles.  First the graph, then the explanation:
Gregable's Facebook Social Graph - click through to see a larger version.

Each vertex is a friend of mine on Facebook.  There is an edge between two friends if they are friends with each other.  I'm not actually in this graph; otherwise there would be a vertex with an edge to every other vertex.

I called out my wife, Cristin, who bridges several regions of my graph.  Even without me calling out that node, it would have been easy to guess.

The green region is high school.  The blue region includes Google employees, who are naturally over-represented.  College gets broken up a bit, and I wasn't really using Facebook much for keeping in touch there.  You can see some other dense clusters in here as well, but I didn't bother labeling them.

To me, this reinforces the Circles concept pretty well!

Technical Details:
I installed a Firefox plugin that saved a copy of every page I visited. I then turned off JavaScript and clicked through to all of my friends to extract the data.  I'm sure there is an easier way to do this, but I would have spent just as long figuring it out for this one-off.

Parsing the pages was some hacky Python and regex.  Some of the templates weren't parsed as easily as others, so a few friends just got dropped due to laziness.  Any friend who didn't have any mutual friends also got dropped.  However, I didn't require a connected graph.  That just happened on its own, to my surprise.
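
The filtering described above could be sketched roughly as follows. This is not the original script: it assumes the scraped pages have already been reduced to a mapping from each friend to the set of mutual friends found on their page, and the names `build_graph` and `is_connected`, along with the sample data, are made up for illustration.

```python
from collections import deque


def build_graph(mutual):
    """Build an undirected adjacency map from {friend: set of mutual friends}.

    Friends with no mutual friends never get an entry, so they are dropped.
    """
    adjacency = {}
    for friend, others in mutual.items():
        for other in others:
            if other == friend:
                continue  # ignore accidental self-references
            adjacency.setdefault(friend, set()).add(other)
            adjacency.setdefault(other, set()).add(friend)
    return adjacency


def is_connected(adjacency):
    """Return True if every vertex is reachable from an arbitrary start vertex."""
    if not adjacency:
        return True
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(adjacency)


# Hypothetical sample data: "dave" has no mutual friends and is dropped.
sample = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
    "dave": set(),
}
graph = build_graph(sample)
```

Calling `is_connected(graph)` is the kind of after-the-fact check that would reveal whether the graph happened to come out in one piece.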

The graph was laid out using neato (part of Graphviz), and then I added the colored boxes on top by hand using GIMP.

With neato, I can change a line and get the nodes as text boxes with names instead, which is fascinating to look through, but not something I feel comfortable sharing publicly.
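
As a rough illustration of that one-line switch, here is a sketch that writes Graphviz DOT text for neato. It assumes the graph is held as an adjacency map of names; the function `to_dot`, the `show_names` flag, and the example data are hypothetical, not taken from the original code. Names containing double quotes would need escaping, which this sketch skips.

```python
def to_dot(adjacency, show_names=False):
    """Render an undirected adjacency map as Graphviz DOT text."""
    # The single line that changes: anonymous points versus labeled boxes.
    node_style = "shape=box" if show_names else 'shape=point, label=""'
    lines = ["graph friends {", f"  node [{node_style}];"]
    for name in sorted(adjacency):
        lines.append(f'  "{name}";')
    emitted = set()
    for a in sorted(adjacency):
        for b in sorted(adjacency[a]):
            edge = tuple(sorted((a, b)))
            if edge not in emitted:  # write each undirected edge only once
                emitted.add(edge)
                lines.append(f'  "{edge[0]}" -- "{edge[1]}";')
    lines.append("}")
    return "\n".join(lines)


# Made-up example data.
example = {"alice": {"bob", "carol"}, "bob": {"alice"}, "carol": {"alice"}}
dot_text = to_dot(example)
```

Saving `dot_text` to a file such as `friends.dot` and running `neato -Tpng friends.dot -o friends.png` would then produce the image.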

Also note that I don't do a good job of maintaining my Facebook friends list, and I have gone through purge cycles many times in the past.  A larger graph with a bigger picture would be even more fascinating, I think.


Andrew Steele said...

Greg, I would be interested in repeating your experiment. Do you have your code available?

Greg said...

Unfortunately I don't have anything that could really be reused all that easily.

harry said...

Does the distance between any two nodes mean something?

Greg said...

No. The edges are modeled as springs that all have the same optimal length, with a force that pushes each one towards that length. The layout algorithm then tries to find the layout that minimizes the overall difference between each spring's actual length and its optimal length.