How to collect lots of random Facebook profile photos in a spreadsheet
I’ve been thinking about Facebook profile photos. For instance, have you noticed how many people have kids in their profile photos? Some people use a photo of themselves with one of their kids, some use a photo of themselves as a child, some use a photo of their child instead of themselves. The non-parent demographic often shows photos of themselves with friends or with their lover. Some people use cartoons or landscapes, some people use close-ups, and some people show themselves at such a distance you could never recognize them.
I’m actually most interested in the ways we use children in our profile photos (are they there to anonymise us? generalize us as parents? are children seen as extensions of us? probably all of the above and much more) but I’ve been thinking more generally about how a pool of data like this could be fascinating for teaching. But I’m not really a programmer and to do something interesting you’d want lots of photos, right? Turns out you really don’t have to be a programmer to get hold of lots of profile photos. Mind you, the method I used takes a bit of time and only about 10% of the URLs I came up with actually gave me real profile pictures.
Facebook’s Graph API gives every object on Facebook a unique ID: people, events, photos, pages and so on, and the connections between them too. So all of these can be accessed by a URL. The URL to any Facebook profile picture is:
https://graph.facebook.com/USER-ID/picture?type=large
where USER-ID is the user ID number. Mine is 337800042 (I never chose a name so still have a number), so https://graph.facebook.com/337800042/picture?type=large shows my Facebook profile picture. Now, a lot of the objects on Facebook are private, and you would have to be logged in to view them. Profile pictures are always public, though.
If you want to download a lot of Facebook profile pictures, you can use Excel to generate a long list of consecutive user ID numbers. If you write out three consecutive numbers beneath each other, you can drag down from there as far as you like and Excel will fill out the rest for you. Make as many as you like. I started from my own user ID, but pick any you like.
Now copy your list of possible user IDs to TextWrangler or another text editor which allows you to do interesting search and replace. In TextWrangler, you can add prefixes and suffixes to each line very easily. Simply pick “Prefix/Suffix Lines” from the “Text” menu and paste in the start and end of the URLs to create a long list of URLs to possible profile pictures.
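If you would rather skip the Excel and TextWrangler steps, the same list can be produced with a few lines of Python. This is only a rough sketch of the idea: the function name candidate_urls is my own invention, and it assumes the URL pattern described above still works the way it did when I tried it. It only builds the URLs as text and does not contact Facebook.

```python
# Prefix and suffix that wrap each user ID, as in the URL pattern above.
BASE = "https://graph.facebook.com/"
SUFFIX = "/picture?type=large"


def candidate_urls(start_id, count):
    """Return picture URLs for `count` consecutive user IDs from `start_id`.

    Hypothetical helper: it mirrors dragging down a column of numbers in
    Excel and then adding a prefix and suffix to every line.
    """
    return [
        f"{BASE}{user_id}{SUFFIX}"
        for user_id in range(start_id, start_id + count)
    ]


if __name__ == "__main__":
    # Start from any ID you like; this one is the ID used in the post.
    for url in candidate_urls(337800042, 5):
        print(url)
```

Run it, then copy the printed lines straight into column A of the spreadsheet.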
Now copy your list of URLs into a Google Docs spreadsheet. You can insert images into spreadsheet cells in Google Docs by entering the formula
=image("URL")
or, more efficiently, you click on the cell with the URL in it instead of typing “URL”. So if you have all your URLs in column A, type
=image(
in cell B1, click the adjacent URL in A1, and then type the final bracket. Then copy cell B1 and paste it all the way down column B.
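The URL column and the formula column can also be generated together. The sketch below pairs each URL with an image formula that points at the cell beside it, and writes the result as CSV text. The helper names are made up for illustration, and I am assuming the spreadsheet treats pasted or imported text that starts with = as a formula, which is worth checking on a couple of rows first.

```python
import csv
import io


def image_formula_rows(urls):
    """Pair each URL with an image formula pointing at its own row in column A.

    Hypothetical helper: row 1 gets =image(A1), row 2 gets =image(A2), etc.
    """
    return [
        (url, f"=image(A{row_number})")
        for row_number, url in enumerate(urls, start=1)
    ]


def to_csv(rows):
    """Render the rows as CSV text: URL in column A, formula in column B."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        "https://graph.facebook.com/337800042/picture?type=large",
        "https://graph.facebook.com/337800043/picture?type=large",
    ]
    print(to_csv(image_formula_rows(sample)), end="")
```

The formulas only line up if the data starts in cell A1, so adjust the start value if you keep a header row.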
Google Docs will get a bit fussed going to find all those images, but slowly the column will fill up. You’ll get some images, some #N/A errors suggesting that there was no image at the URL (I’m not sure why this happens; there are lots of these, and when I paste the URLs into my address bar they all show profile photos), and lots of those default Facebook profile pictures that new users have. Presumably not all the user IDs are actually in use.
To get rid of the #N/As and the default profile pics, select column B, and then choose “Sort Sheet by Column B”. Now you can delete the boring pics all at once.
When you’re happy with your selection, resize the rows by selecting the top row, then pressing Ctrl + Shift + Down, then right-clicking (or control-clicking) in the selection and choosing “Resize Rows” from the contextual menu. I found that 100 pixels let me see the images well enough.
Now all you have to do is figure out how you’d like to analyze your photos. That’s what I plan to have my students do tomorrow. I’m thinking marking the list up with columns specifying the number of people, whether the photo is a closeup or a distant shot, children, etc. We’ll see what the students come up with.
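Once the list is marked up, the counting could be done in Python as well. This is a very small sketch of one way to tally a coded column. The column names and the three rows are invented placeholders to show the shape of the data, not results from my spreadsheet.

```python
from collections import Counter


def tally(rows, column):
    """Count how often each value appears in one coded column."""
    return Counter(row[column] for row in rows)


if __name__ == "__main__":
    # Placeholder rows with hypothetical column names, for illustration only.
    coded = [
        {"people": "1", "shot": "closeup", "children": "no"},
        {"people": "2", "shot": "distant", "children": "yes"},
        {"people": "1", "shot": "closeup", "children": "yes"},
    ]
    for column in ("people", "shot", "children"):
        print(column, dict(tally(coded, column)))
```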
We’ll also discuss whether or not this is actually within Facebook’s Terms of Service. I think it’s fine for doing in class, but would it be OK for a research project? The user is supposed to own the data and to have the right (sort of) to opt out of its being used in an app. We’re not using this to build an app, but we are certainly using their data.
The final discussion is research ethics. Would it be OK to use this data in a research project? It’s published, in the sense that it’s public and doesn’t require a login to access. But you could argue that people don’t really get to make that choice, as Facebook requires public profile photos and you pretty much need to upload a profile photo to have a functional Facebook account. I’m pretty sure you’d need to report your research project to Norsk samfunnsvitenskapelig datatjeneste in Norway (Human Research Board or similar in other countries). You’d want to discuss general research ethical guidelines too, probably taking the AoIR guidelines as a starting point.