Relationships between nodes in the RoSE project and the ELMCIP Knowledge Base
Today we met with Alan Liu, Rita Raley, Dana Solomon and Lindsay Thomas of the RoSE project at UCSB. RoSE stands for “research-oriented social environment” and, according to the project description, allows “tracking and integrating relations between authors and documents in a combined ‘social-document graph.’” So authors and documents are equal nodes in a system, and there are different kinds of relationships between them. It’s “Facebook for the dead”, as Alan quipped, although the nodes in the system range from the long-dead (Aristotle) to the very much alive and kicking (William Gibson or Alan Liu himself). RoSE uses network graphs both as a visualization of the connections between nodes and simply as a means of navigating the database.
Unlike Facebook, the “edges” or connections between nodes are more nuanced than just “friends”. For instance, someone could be described as a “scholar of” Aristotle, or, more complicatedly, as a “lover of” or a “cousin of”.
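To make the idea of typed edges concrete, here is a minimal sketch in Python. It is not RoSE's actual data model (RoSE is a Ruby on Rails application); the class names, the method names and the “Reader A” node are all hypothetical, chosen only to illustrate authors and documents sitting as equal nodes with labelled relationships between them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A directed, labelled relationship between two nodes (hypothetical model)."""
    source: str
    relation: str
    target: str


class SocialDocumentGraph:
    """Authors and documents are equal nodes; edges carry a relation type."""

    def __init__(self):
        self.nodes = {}     # node name -> kind ("person" or "document")
        self.edges = set()  # set of Edge objects

    def add_node(self, name, kind):
        self.nodes[name] = kind

    def relate(self, source, relation, target):
        # Both ends must already exist as nodes in the graph.
        for name in (source, target):
            if name not in self.nodes:
                raise KeyError(f"unknown node: {name}")
        self.edges.add(Edge(source, relation, target))

    def relations_of(self, name):
        """Return (relation, target) pairs for edges starting at `name`."""
        return sorted((e.relation, e.target) for e in self.edges if e.source == name)


graph = SocialDocumentGraph()
graph.add_node("Aristotle", "person")
graph.add_node("Poetics", "document")
graph.add_node("Reader A", "person")  # hypothetical user, for illustration only

graph.relate("Aristotle", "author of", "Poetics")
graph.relate("Reader A", "scholar of", "Aristotle")
graph.relate("Reader A", "lover of", "Poetics")

print(graph.relations_of("Reader A"))
```

Storing the relation as a free-text label on the edge, rather than as a fixed set of fields, is one simple way to let the vocabulary of relationships grow over time.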
The list of possible relationships is long. Scrolling down beyond the menu you see on the screen reveals further options, such as “enemy of” and many more. Alan and Rita noted that perhaps it was actually too simplistic to view a node as a single entity. Relationships evolve over time – even after the death of the author.
(Btw, that’s Dana Solomon lurking beside the projection in that photo – he’s just started working on a PhD on data visualization in the digital humanities, and I noticed lots of interesting links in his Twitter feed.)
One of the challenges of the project is the time involved in manually entering relationships. The database is seeded with metadata harvested from Project Gutenberg, YAGO, and the SNAC project, but this only provides a very thin framework, because few relationships can be read automatically out of library metadata other than “X is the author of Y”. The result is a large number of nodes with only sparse relationships between them: many nodes are singletons or have just one relationship.
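The sparseness of a graph seeded only with “author of” metadata can be checked by counting how many relationships each node takes part in. The sketch below assumes edges are stored as simple (source, relation, target) tuples; the function name and the sample data are illustrative and not taken from the RoSE database.

```python
from collections import Counter


def degree_summary(nodes, edges):
    """Split nodes into singletons (no edges) and nodes with exactly one edge."""
    degree = Counter()
    for source, _relation, target in edges:
        degree[source] += 1
        degree[target] += 1
    # Counter returns 0 for nodes that never appear in an edge.
    singletons = sorted(n for n in nodes if degree[n] == 0)
    single_link = sorted(n for n in nodes if degree[n] == 1)
    return singletons, single_link


# Illustrative seed data: harvested metadata yields only "author of" edges.
nodes = ["Aristotle", "Poetics", "William Gibson", "Neuromancer"]
edges = [("Aristotle", "author of", "Poetics")]

singletons, single_link = degree_summary(nodes, edges)
print("singletons:", singletons)
print("one relationship:", single_link)
```

Every manually entered relationship moves nodes out of these two lists, which is roughly what adding a thicker description on top of the harvested framework amounts to.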
RoSE also lets groups layer a thick description over this thin framework, because people can manually add relationships. So there’s a top-down controlled vocabulary as well as a bottom-up crowdsourced vocabulary.
There are a few examples of projects visualizing relationships between humanities data, such as KNALIJ (pronounced “knowledge” in an American accent), which visualizes relationships between scientific medical publications. This works because the data is well-behaved and in a standardized format, unlike, say, electronic literature, or people. Projects like RoSE and the ELMCIP Knowledge Base have had to manually enter the kinds of relationships that KNALIJ is able to extract automatically. It’s hard to see how such relationships could be automatically harvested, beyond the very thin framework that projects like SNAC can provide.
RoSE is built on Ruby on Rails with Flash for the visualizations, and was coded in house by a collaboration between the English department and Media Arts and Technology at UCSB. The ELMCIP Knowledge Base is built in Drupal using modules that support extensive node-referencing.
A clear connection between the RoSE project and the ELMCIP Knowledge Base is our shared deep interest in how creative communities form, take shape and evolve, and in how the connections and relationships between people, texts and activities are crucial for documenting and understanding that process.