How to collect lots of random Facebook profile photos in a spreadsheet
I’ve been thinking about Facebook profile photos. For instance, have you noticed what a lot of people have kids in their profile photos? Some people use a photo of themselves with one of their kids, some use a photo of themselves as a child, some use a photo of their child instead of themselves. The non-parent demographic often shows photos of themselves with friends or with their lover. Some people use cartoons or landscapes, some people use close-ups, some people show themselves at such a distance you could never recognize them.
I’m actually most interested in the ways we use children in our profile photos (are they there to anonymise us? generalize us as parents? are children seen as extensions of us? probably all of the above and much more) but I’ve been thinking more generally about how a pool of data like this could be fascinating for teaching. But I’m not really a programmer and to do something interesting you’d want lots of photos, right? Turns out you really don’t have to be a programmer to get hold of lots of profile photos. Mind you, the method I used takes a bit of time and only about 10% of the URLs I came up with actually gave me real profile pictures.
Facebook’s Graph API gives every object on Facebook a unique ID: people, events, photos, pages and so on, and the connections between them too. So all of these can be accessed by a URL. The URL to any Facebook profile picture is:
https://graph.facebook.com/USER-ID/picture?type=large
where USER-ID is the user ID number. Mine is 337800042 (I never chose a name so still have a number) so you can see my Facebook profile picture at https://graph.facebook.com/337800042/picture?type=large. Now, a lot of the objects on Facebook are private, and you would have to be logged in to view them. Profile pictures are always public, though.
If you want to download a lot of Facebook profile pictures, you can use Excel to generate a long list of consecutive user ID numbers. If you write out three consecutive numbers beneath each other, you can drag down from there as far as you like and Excel will fill out the rest for you. Make as many as you like. I started from my own user ID, but pick any you like.
Now copy your list of possible user IDs to TextWrangler or another text editor which allows you to do interesting search and replace. In TextWrangler, you can add prefixes and suffixes to each line very easily. Simply pick “Prefix/Suffix Lines” from the “Text” menu and paste in the start and end of the URLs to create a long list of URLs to possible profile pictures.
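If you'd rather skip Excel and the text editor, the same list can be produced in a few lines of Python. This is only a minimal sketch of the two steps above (consecutive numbers, then a prefix and suffix on each line); the function name `build_picture_urls` and its parameters are my own invention for illustration, not anything Facebook provides.

```python
def build_picture_urls(start_id, count, size="large"):
    """Return `count` candidate profile-picture URLs for consecutive user IDs.

    Many of these IDs will not belong to a real account; this only builds
    the text of the URLs and does not contact Facebook.
    """
    prefix = "https://graph.facebook.com/"   # the "prefix" pasted before each ID
    suffix = f"/picture?type={size}"         # the "suffix" pasted after each ID
    return [
        f"{prefix}{user_id}{suffix}"
        for user_id in range(start_id, start_id + count)
    ]


if __name__ == "__main__":
    # Start from any ID you like; this one is the ID used earlier in the post.
    for url in build_picture_urls(337800042, 5):
        print(url)
```

Running it prints one URL per line, ready to paste into column A of a spreadsheet.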
Now copy your list of URLs into a Google Docs spreadsheet. You can insert images into spreadsheet cells in Google Docs by entering the formula
=image("URL")
or, more efficiently, you click on the cell with the URL in it instead of typing “URL”. So if you have all your URLs in column A, type
=image(
in cell B1, click the adjacent URL in A1 and then type the final bracket. Then copy and paste from cell B1 all the way down the column. You can do that simply by clicking and dragging the bottom right corner of the B1 cell all the way down the column.
Google Docs will get a bit fussed going to find all those images, but slowly the column will fill up. You’ll get some images, some #N/A to show that there was no image at the URL (I’m not sure why this happens: there are lots of these, and when I paste the URL into my address bar they all show profile photos), and lots of those default Facebook profile pictures that new users have. Presumably not all the user IDs are actually in use.
To get rid of the #N/As and the default profile pics, select column B, and then choose “Sort Sheet by Column B”. Now you can delete the boring pics all at once.
When you’re happy with your selection, resize the rows by selecting the top row, then pressing Ctrl + Shift + Down, then right-clicking (or control-clicking) in the selection and choosing “Resize Rows” from the contextual menu. I found that 100 pixels let me see the images well enough.
Now all you have to do is figure out how you’d like to analyze your photos. That’s what I plan to have my students do tomorrow. I’m thinking of marking the list up with columns specifying the number of people, whether the photo is a closeup or a distant shot, whether there are children, etc. We’ll see what the students come up with.
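Once the list is marked up, counting the categories is easy to automate. Here is a rough sketch, assuming you export the sheet as CSV with one row per photo; the column names (`people`, `distance`, `children`) and the four sample rows are made up purely for illustration and are not real data.

```python
import csv
import io
from collections import Counter

# Invented example of what an exported, hand-coded sheet might look like.
SAMPLE_CODING = """photo,people,distance,children
photo-1,1,closeup,no
photo-2,2,closeup,yes
photo-3,1,distant,no
photo-4,3,closeup,yes
"""


def tally_column(csv_text, column):
    """Count how often each value appears in one coding column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column].strip().lower() for row in reader)


if __name__ == "__main__":
    for column in ("people", "distance", "children"):
        print(column, dict(tally_column(SAMPLE_CODING, column)))
```

With a real export you would read the file's contents into a string and pass that in place of `SAMPLE_CODING`.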
We’ll also discuss whether or not this is actually within Facebook’s Terms of Service. I think it’s fine for doing in class, but would it be OK for a research project? The user is supposed to own the data, and have the right to opt out of its being used in an app (sort of) and while we’re not using this to build an app, we are certainly using their data.
The final discussion is research ethics. Would it be OK to use this data in a research project? It’s published, in the sense that it’s public and doesn’t require a login to access. But you could argue that people don’t really get to make that choice, as Facebook requires public profile photos and you pretty much need to upload a profile photo to have a functional Facebook account. I’m pretty sure you’d need to report your research project to Norsk samfunnsvitenskapelig datatjeneste in Norway (Human Research Board or similar in other countries). You’d want to discuss general research ethics guidelines too, probably taking the AoIR guidelines as a starting point.
8 thoughts on “How to collect lots of random Facebook profile photos in a spreadsheet”
Interesting idea… by creating a random profile ID you should theoretically be able to get a random sample. Unfortunately it’s not that easy to generate a valid ID, and Facebook deliberately hides the way it generates these numbers. It’s of course possible to generate a random number, see whether it returns something, and if not move on to the next random number until you find a valid profile ID with a photo. This method is discussed at http://stackoverflow.com/questions/7758090/use-facebook-api-with-objective-c-to-find-random-facebook-user-image. I don’t think this method complies with Facebook’s terms of service.
I had an UG student do a dissertation on Facebook profile photos a year or so ago and rather than use a randomised approach she used snowball sampling to be invited into people’s profiles and grow her data set (she set up a new Facebook profile to facilitate the research). One of her interesting findings was the gendered nature of the profiles, esp. the fact that males tended to use alternatives to their photographs, preferring avatars and so forth. I was expecting more of the old ‘MySpace Angles’ type images, but in fact in her dataset this wasn’t as pronounced as I was expecting.
The snowball approach would probably be better in terms of getting informed consent, but on the other hand you’d obviously only be getting one part of the network – people likely to be similar to each other. That might be useful in some ways. Interesting that males used more avatars than females. One of my students is planning on doing her MA on this in some way, so it will be interesting to see what she comes up with.
Alternatively, you can use this website, http://www.imagecrashers.com, which generates random Facebook profiles and lets you filter by gender. If you use consecutive numbers, chances are that a lot of them aren’t actual IDs, since there are many gaps in between.
How are you avoiding the random ids and differentiating between male and female?
You could also qualify each result with a PowerShell script or a similar scripting language. It’s a bit harder to do but more efficient.
Hello, my name is Jose, I am from Spain. I created an application to get photos of Facebook profiles randomly. It doesn’t use the ID number but a previously generated list of names: 50,000 lines with a first name and a surname on each. It randomly takes the first name from one line and the surname from another, puts them together, and if a profile exists, downloads the photo. You can download it here: http://adf.ly/jIR9E Greetings from Spain!
Hi Jill, the question of children in (profile) pictures on social media is what I am working on at the moment. In Basel, Switzerland, we just started a new research project titled “Family Photography in the Social Web”, funded by the Swiss National Science Foundation. Our website is http://www.netzbilder.net, and I will put more information and content online as soon as possible. All the best from Basel!