Social Circles

An early look at my 'social graph'

July 08, 2011

Google+ seems well received so far. Most of the people I've talked to really enjoy the "Circles" feature and feel that it fits their mental model well. I thought it would be fun to do a little data dive on my Facebook graph to see how my relationships were actually interconnected and whether the data matched my perceptions of how I was planning my circles. First the graph, then the explanation:

Greg's Facebook 'social graph'.

Each vertex is a friend of mine on Facebook. There is an edge between two friends if they are friends with each other. I'm not actually in this graph, otherwise there would be a vertex with an edge to every other vertex.

I called out my wife, Cristin, who bridges several regions of my graph. Even without me calling out that node, it would have been easy to guess.

The green region is high school. The blue region includes Google employees, who are naturally over-represented. College gets broken up a bit, since I wasn't really using Facebook much for keeping in touch there. You can see some other dense clusters in here as well, but I didn't bother labeling them.

To me, this reinforces the Circles concept pretty well!

Technical Details

I installed a Firefox plugin that saved a copy of every page I visited. I then turned off JavaScript and clicked through to all of my friends to extract the data. I'm sure there is an easier way to do this, but I would have spent just as long figuring it out for this one-off.

Parsing the pages was some hacky Python and regex. Some of the templates weren't parsed as easily as others, so a few friends just got dropped due to laziness. Any friend who didn't have any mutual friends also got dropped. However, I didn't require a connected graph. That just happened on its own, to my surprise.
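For illustration, here is a minimal sketch of the graph-building step described above. It is not the original script. It assumes the parsing has already produced a dictionary mapping each friend's name to the mutual friends found on their page. The names `build_graph` and `is_connected` and the input shape are hypothetical.

```python
from collections import deque


def build_graph(mutual_friends):
    """Build an undirected adjacency map from {friend: iterable of mutual friends}.

    The input shape is an assumption made for this sketch.
    """
    adjacency = {}
    for friend, mutuals in mutual_friends.items():
        for other in mutuals:
            if other == friend:
                continue  # ignore accidental self-references
            adjacency.setdefault(friend, set()).add(other)
            adjacency.setdefault(other, set()).add(friend)
    # Friends with no mutual friends never get an entry, so they are dropped.
    return adjacency


def is_connected(adjacency):
    """Return True if every vertex is reachable from an arbitrary start vertex."""
    if not adjacency:
        return True
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(adjacency)
```

Calling `is_connected(build_graph(data))` is one way to check whether the graph came out connected without requiring it up front.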

The graph was laid out using neato, and then I added the colored boxes on top by hand using GIMP. With neato, I can change a line and get the nodes as text boxes with names instead, which is fascinating to look through, but not something I feel comfortable sharing publicly.
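As a rough sketch of how such a graph can be handed to neato, the function below writes an adjacency map out in Graphviz's DOT format. The function name `to_dot`, the graph name `friends`, and the `show_names` flag are made up for this example. The one-line switch between anonymous points and named text boxes is modeled here with the standard `shape=point` and `shape=box` node attributes, which is an assumption about what that line looked like.

```python
def to_dot(adjacency, show_names=False):
    """Render an undirected adjacency map ({name: set of names}) as DOT text."""

    def quote(name):
        # DOT string literals need backslashes and double quotes escaped.
        return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'

    # The single line that switches between anonymous dots and named boxes.
    node_style = "shape=box" if show_names else 'shape=point, label=""'

    lines = ["graph friends {", f"  node [{node_style}];"]
    for node in sorted(adjacency):
        lines.append(f"  {quote(node)};")

    # Each undirected edge is emitted once, whichever side lists it.
    edges = sorted(
        {tuple(sorted((a, b))) for a in adjacency for b in adjacency[a] if a != b}
    )
    for a, b in edges:
        lines.append(f"  {quote(a)} -- {quote(b)};")

    lines.append("}")
    return "\n".join(lines) + "\n"
```

Saving the result to a file such as `friends.dot` and running `neato -Tpng friends.dot -o friends.png` should produce an image of the layout.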

The distance between vertices in the graph has no particular meaning. Neato models the layout problem as a system of springs, treating each edge as a spring with an optimal length and pushing the connected vertices toward that length. The algorithm tries to find the layout that minimizes the overall difference between the springs' actual and optimal lengths.
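The spring idea can be made concrete with a small sketch. This is a simplification for illustration, not neato's actual implementation. It assumes the quantity being minimized is the sum of squared differences between each spring's actual and optimal length, and the name `spring_energy` and the data shapes are hypothetical.

```python
import math


def spring_energy(positions, springs):
    """Sum of squared differences between actual and optimal spring lengths.

    positions: {name: (x, y)}
    springs:   {(name_a, name_b): optimal_length}
    Both shapes are assumptions made for this sketch.
    """
    total = 0.0
    for (a, b), optimal in springs.items():
        (ax, ay), (bx, by) = positions[a], positions[b]
        actual = math.hypot(ax - bx, ay - by)
        total += (actual - optimal) ** 2
    return total
```

A layout algorithm in this style moves the vertices around to drive this number down, so the final picture reflects which layout had low energy rather than any meaningful coordinates.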