Espen Andersen noted the new O’Reilly book Programming Collective Intelligence, by Toby Segaran, which looks really interesting. In an excellent blog post discussing the book, Tim O’Reilly writes about the importance of what users implicitly contribute to the web, rather than just looking at the photos, videos, blog posts and Facebook profiles that are explicitly contributed.

No one would characterize Google as a “user generated content” company, yet they are clearly at the very heart of Web 2.0. That’s why I prefer the phrase “harnessing collective intelligence” as the touchstone of the revolution. A link is user-generated content, but PageRank is a technique for extracting intelligence from that content. So is Flickr’s “interestingness” algorithm, or Amazon’s “people who bought this product also bought…”, Last.Fm’s algorithms for “similar artist radio”, ebay’s reputation system, and Google’s AdSense.

This is a book explaining the practical side of actually using this information: it “teaches algorithms and techniques for extracting meaning from data, including user data”, O’Reilly writes. For instance, it explains that you might be able “to determine if there are groups of blogs that frequently write about similar subjects or write in similar styles” by “clustering blogs based on word frequencies”, and that this “could be very useful in searching, cataloging, and discovering the huge number of blogs that are currently online.” It then proceeds to tell you exactly how to do this by “downloading the [RSS] feeds from a set of blogs, extracting the text from the entries, and creating a table of word frequencies.”
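
To get a feel for what a “table of word frequencies” might look like, here is a minimal Python sketch of my own. It is not the book's code. It skips the feed-downloading step and assumes the entry text is already in hand; the three tiny sample “blogs”, the function names, and the choice of cosine similarity as the comparison measure are all my assumptions for illustration.

```python
import re
from collections import Counter
from math import sqrt


def word_counts(text):
    """Lowercase the text and count runs of the letters a-z as words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def frequency_table(blogs):
    """Build one row of word counts per blog over a shared vocabulary.

    `blogs` maps a blog name to its already-extracted entry text.
    Returns (vocab, table): a sorted word list and {name: [counts]}.
    """
    counts = {name: word_counts(text) for name, text in blogs.items()}
    vocab = sorted(set().union(*counts.values()))
    table = {name: [c[word] for word in vocab] for name, c in counts.items()}
    return vocab, table


def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample data standing in for text extracted from RSS entries.
blogs = {
    "blog_a": "python code and more python code",
    "blog_b": "python code for data",
    "blog_c": "knitting patterns and yarn",
}

vocab, table = frequency_table(blogs)
for name in ("blog_b", "blog_c"):
    score = cosine_similarity(table["blog_a"], table[name])
    print(f"blog_a vs {name}: {score:.2f}")
```

In this toy data the two blogs that write about Python score as more similar to each other than either does to the knitting blog. A clustering algorithm would then group blogs using a measure of this kind, though a real version would need far more text and some filtering of very common words.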

And the way they’ve set up the online table of contents, with extracts from each subchapter, is a thing of beauty. The bit about finding word clusters in blogs is from Chapter 3, in the sub-section “Word Vectors”.



1 Comment

  1. William Patrick Wend

    Thank you for posting about this book, Jill. I think this might be useful for my MA thesis.
