I spent three months this year on sabbatical at Culture Lab, Newcastle University (UK). It was a privilege to spend time in such a vibrant research lab, as well as to get to know the city of Newcastle. One of the projects to come out of my visit is Succession, an experiment in generative digital heritage that uses Newcastle and its history to think about industrialisation, global capital, our shared pasts and potential futures. Personally, it brings together two strands of my work that have been separate until now - on generative systems and digital cultural collections. Hence you'll also find this cross-posted over on The Teeming Void. Here, some notes and documentation on the work, some musings on generative and computational heritage.

Much of my recent work with digital cultural collections has worked to create rich representations of these ever-expanding datasets. A key thread has been an interest in the complexity of these collections; the multitudes they contain, their wealth of potential meaning as complex, interrelated wholes, rather than simply repositories of individual resources. Visualisation can provide a macroscopic view of this complexity, but it can be just as vivid when sampled at a micro scale. Tim Sherratt's Trove News Bot tweets digitised newspaper articles in response to the day's news headlines, creating little juxtapositions, timely sparks of meaning that can be pithy, funny, or provocative. Trove News Bot appropriates the twitter bot - the joking-but-deadly-serious computational voice of our age - and adapts it to work with the digital archive. We could call this generative heritage; using computational processes to create new artefacts (and meanings) from historical material.

Succession applies this generative approach to the digital heritage of Newcastle Upon Tyne. Newcastle has a rich industrial heritage; it played a major role in the Industrial Revolution that began in Britain and went on to remake global civilisation. Today Newcastle is a post-industrial or de-industrialised city: coal, steel and shipbuilding have given way to service industries: education, retail, entertainment and tourism. As an outsider exploring the city I was struck by the mixture of pre-modern, industrial and post-industrial eras in the fabric of the city. Different (often inconsistent) patterns of life, work and economy are accreted in layers as the city continues the everyday process of adaptation, experimentation with the possible; working out what comes next.



The city, like the digital archive, is a multitude; an unthinkably complex matrix of people, things, systems, narratives. Newcastle - more than many other cities - also speaks to the expansive dynamics of industrialisation, globalisation, extractive industry, fossil fuels; the whole modern trajectory that has brought us to our current predicament. This seems to be both urgent and unthinkable - or perhaps, unsayable. How can we speak back to this complexity; how can we make in a way that responds to this tangled, expansive mess? Here generative techniques offer a way to synthesise complexity and create multitudes, formations that might portray the city as it was, or hint at what it could be. Automatic juxtaposition and remix create nonsense but also, occasionally, glimmers of a new sense, or at least a texture or sensation that emerges from a random constellation of images, sources and contexts. Succession requires us to piece together fragments of history; and this is a work of imagination, as Ross Gibson writes, framing his own work of generative heritage (with Kate Richards, Life After Wartime):

Our parlous states need imagination. We need to propose “what if” scenarios that help us account for what has happened in our habitat so that we can then better envisage what might happen. We need to apprehend the past. Otherwise, we won’t be able to align ourselves to historical momentum. Without doing this we won’t be able to divine the continuous tendencies that are making us as they persist out of the past into the present.

In practical terms, the work is based on a corpus of around two thousand images sourced from the Flickr Commons. Most come from the (wonderful) Tyne and Wear Archives and Museums collection; many more from the Internet Archive Books collection, with a smattering of others from UK and international institutions. Succession uses these ingredients to generate new digital "fossils"; composite images assembled in the browser using HTML Canvas. This generative process is extremely simple: pick five sources at random, and place them in the frame using some semi-random rules for positioning, compositing and repetition. Opacity is kept low, so that the sources blend and merge. The visual process often obscures the source images - they end up buried, cropped or indistinguishable, squashed like fossil strata. But at the same time the source items are preserved and presented in context, so each composite retains references to its sources and their attendant contexts. Composites can be saved, acquiring an ID and permalink; the images in this post show some of my favourites, but there are over a hundred to sift through already.
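The "pick five and layer them" process described above could be sketched roughly as follows. This is not the code used in Succession: the function names, the scale and opacity ranges, and the shape of the corpus records (objects with an `id`) are all assumptions made for illustration, and the real rules for positioning and repetition are not documented here.

```javascript
// Hypothetical sketch of a five-layer composite; names and ranges are assumptions.

// Pick `count` distinct items at random (partial Fisher-Yates shuffle on a copy).
function pickRandom(corpus, count, rng = Math.random) {
  const pool = corpus.slice();
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rng() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Turn the picked sources into layer descriptions: position, scale, low opacity.
function planComposite(corpus, width, height, rng = Math.random) {
  return pickRandom(corpus, 5, rng).map((source) => ({
    source,                        // keep the reference so the composite can cite it
    x: Math.floor(rng() * width),  // semi-random placement within the frame
    y: Math.floor(rng() * height),
    scale: 0.5 + rng() * 1.5,      // assumed range: 0.5x to 2x
    opacity: 0.15 + rng() * 0.25,  // assumed range: kept low so layers blend
  }));
}

// Draw a plan onto a 2D canvas context; `images` maps source id -> loaded image.
function drawComposite(ctx, layers, images) {
  for (const layer of layers) {
    const img = images.get(layer.source.id);
    ctx.globalAlpha = layer.opacity;
    ctx.drawImage(img, layer.x, layer.y, img.width * layer.scale, img.height * layer.scale);
  }
  ctx.globalAlpha = 1; // restore the default for later drawing
}
```

Keeping the plan separate from the drawing means the list of layers, with its source references, can be saved and replayed, which is one plausible way a composite could keep its links back to the original items.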



As a generative system this is, in formal terms, incredibly simple. It's essentially a combinatorial process, in that each composite consists of five elements from a set of around two thousand. Yet already this adds up to 2.5 x 10^14 unique combinations - it would take eight million years to see them all, at one per second. Compositing and layout parameters are random within constraints - so this simple machine can produce an immense variety of unique results; I'm still surprised and delighted by the fossils people discover (or generate). But this computational variety is also strongly shaped by the human creative choices involved in making the work. This is what Bill Seaman (combinatorial media artist par excellence) calls "authored space" - a domain of potential that is expansive but never arbitrary. The corpus reflects a handful of coherent themes, seasoned with generous sprinklings of the lateral and miscellaneous; the aim is, in Seaman's words, a kind of "resonant unfixity." Also the corpus and the compositing process work in tandem; for example the compositor treats the largely monochrome line-art and engravings of the Internet Archive material differently to other (largely photographic) sources. The generative machine is programmed in part by the textures and qualities of its material.
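The combinatorial arithmetic is easy to check. The sketch below counts the ways of choosing five items from a corpus of exactly 2000 (an assumption; the text only says "around two thousand") and converts that to viewing time at one composite per second. For 2000 items the count is a little over 2.6 x 10^14, which works out to roughly eight million years.

```javascript
// Number of ways to choose k items from n, using BigInt to avoid precision loss.
function combinations(n, k) {
  let result = 1n;
  for (let i = 1n; i <= BigInt(k); i++) {
    // Multiply then divide; the running product is always divisible by i.
    result = (result * (BigInt(n) - BigInt(k) + i)) / i;
  }
  return result;
}

const total = combinations(2000, 5);          // assumes exactly 2000 images
const secondsPerYear = 365.25 * 24 * 60 * 60; // Julian year
const years = Number(total) / secondsPerYear;

console.log(total.toString() + " combinations");
console.log("about " + Math.round(years / 1e6) + " million years at one per second");
```

This counts only which five sources are chosen; the random layout and compositing parameters multiply the variety further.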



The Internet Archive book images are interesting on several fronts; for one, they are an amazing demonstration of the power of computational processes for generating and describing large collections (like 2.6 million items large). Given the right kind of source material, this computational leverage changes the logic of collections completely. When adding and describing items is expensive, it makes sense to be selective, and publish only what is most "significant". Automation makes it possible to simply publish everything - for who's to say (really) what is significant, or how it might one day be significant? In Succession the Internet Archive material plays a crucial role. The line art and diagrams - many from obscure publications like the Transactions of the North of England Institute of Mining and Mechanical Engineers - offer evocative fragments of the machinery of mid-nineteenth century industrialisation.

As for generative digital heritage, it's a fairly open-ended proposal. What happens when we turn algorithms loose on our digital culture with makerly, synthetic, speculative or poetic intention? There are some pretty solid precedents in the digital humanities for these approaches; Schnapp and Presner call for a "generative" DH in their 2009 manifesto. Before that Drucker and Nowviskie outlined a "speculative computing" with a strongly generative flavour. Gibson and Richards' Life After Wartime is an early exemplar of generative heritage in the digital arts. More recently we've seen the rise of massive online collections, web-scale computing, and a proliferation of cultural, critical and creative bots, not to mention projects like #NaNoGenMo. If there is such a thing as generative digital heritage, then now's the time.







In the previous post I introduced our Discover the Queenslander project for the SLQ, and mentioned that we used the AngularJS web framework. That process has got me thinking about some of the technical challenges in creating rich collection interfaces, and the different approaches in play, and I'll report on these in the next two posts. In this one I'll focus on AngularJS, and in the next, some broader questions on working with collection data on the client side.

AngularJS is a JavaScript-based framework that focuses on extending HTML to deal with dynamic content. Angular "binds" data to HTML elements; so change the data, and the HTML updates. Even better, the bindings are two-way: interacting with an HTML element can also change its bound data. Angular implements an MVC (Model View Controller) architecture, where the data structure is the Model, the HTML document is the View, and a JavaScript Controller links the two together.

Our previous web-based collections projects (TroveMosaic, Manly Images, Prints and Printmaking) were built in plain JS and jQuery. The general approach is pretty straightforward: load and manipulate some collection metadata (either from an API or a static JSON file), then build the HTML dynamically (adding and styling elements according to the data). jQuery makes handling interactions with the HTML pretty straightforward. It also (in my experience) makes for a verbose mess. Because all the HTML is built dynamically there's a lot of code devoted to creating elements, setting attributes, then stuffing them into the DOM. Code that loads and munges data gets tangled with code that builds the document and code handling interactions. Some elements get styled with static CSS, others are styled with hard-coded attributes. It all works fine - jQuery is very robust - but under the surface, it's bad code.

AngularJS tidies this process up quite a bit. Here's a quick example showing how straightforward it is to bind some collection data to some HTML. Say we have a JSON array items where each item looks something like:
{ "id":"702692-19340823-s002b",
 "title":"Illustrated advertisement from The Queenslander, 23 August, 1934",
 "description":["Caption: Practical garments","An Advertisement for women's clothing sewing patterns acquired through mail order from The Queenslander Pattern Service."],
 "subjects":["women's clothing & accessories","advertisements"],
 "thumbURL":"702692-19340823-s002.jpg",
 "year":"1934"
}
To create an HTML list where each item appears as a list element:
<ul>
 <li ng-repeat="i in items"> 
  <h1>{{i.title}}</h1>
  <img ng-src="{{i.thumbURL}}"/>
 </li> 
</ul>

Angular lets us iterate over a list of elements with the ng-repeat directive; it will simply generate a <li> for each element in the items array. Attributes of each item i are easily bound to the HTML using the {{moustache}} notation - so the item title will appear inside the h1. Apart from the compact, HTML-based rendering syntax, the killer feature here is that the HTML stays bound to the data: in order to change the display, we simply change the contents of items. No jQuery-style DOM manipulation; the data drives the document.


So rendering items in a list is trivially easy; but what about more complex displays? It's a matter of creating the data structures you need, then binding them to HTML in the same way. The Queenslander grid interface (above) includes a histogram showing items per year. In HTML this is simply another list, where each column is a list element. To create the data structure we sort the items into a new array where each element contains both the year, and a count of its items. Then as in the example, we run through the array with Angular building an element (this time a column) for each year. Angular's ng-style directive lets us create a custom height for each element, based on the number of items in the year list. With an array yearTable, where each year y has a totalCount:
<ul>
     <li ng-repeat="y in yearTable">
           <div ng-style="{height: y.totalCount+'px'}"
           ng-click="setYearFilter(y.year);"></div>
     </li>
</ul>
Here Angular is doing some rudimentary data vis, linking variables in the data to the dimensions of each HTML element. Note also that each column element has an ng-click directive, calling a function that filters the items displayed. The term clouds for subjects and creators work the same way.
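The yearTable structure itself is plain data work. A minimal sketch of how it might be built from items shaped like the JSON example is shown here; the function names are hypothetical, and in an AngularJS controller the result would typically be assigned to something like $scope.yearTable so the template can bind to it.

```javascript
// Build a per-year summary from an items array shaped like the JSON example.
function buildYearTable(items) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item.year, (counts.get(item.year) || 0) + 1);
  }
  // One entry per year, in chronological order, ready for ng-repeat.
  return Array.from(counts, ([year, totalCount]) => ({ year, totalCount }))
    .sort((a, b) => Number(a.year) - Number(b.year));
}

// The kind of filtering a handler such as setYearFilter might apply (assumed behaviour).
function filterByYear(items, year) {
  return items.filter((item) => item.year === year);
}
```

Because the template is bound to the data, replacing the displayed items with the filtered array is all a click handler needs to do; no element is created or removed by hand.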

Hopefully this gives a hint of how AngularJS can be applied to cultural collection interfaces. From a developer's perspective, there are a number of big advantages. Compared to our previous jQuery process, Angular simplifies the page-building process immensely; the templating approach encourages a separation of concerns and more organised, maintainable code. Angular's data-centric binding also provides some big wins. Data structures (models) become more important; Angular requires that you get your data organised before binding it to the DOM. Coming from the free-wheeling procedural world of jQuery, this data-centric approach was the biggest conceptual challenge. The bottom line is: manipulate the data, not the HTML. The payoff is that the work of keeping the HTML and the data coordinated just disappears. Angular's modular architecture and active developer community also bring benefits: in the Queenslander project for example we used ngStorage, a module that made the favourites incredibly easy to build.

Compared to standard web interfaces, the big difference here is that all the collection data (in this case some 1000 items' worth) is in the browser, on the client side. No server calls, pulling down a few items at a time - instead we load the whole set up front, and build the interface dynamically based on that data. The biggest payoff for this approach is responsiveness - filtering and exploration are lightning fast - but there are problems too; search engines can't index this dynamic content, and it requires modern browsers with fast JS engines. Some would argue that this approach is just plain wrong; abusing the client/server architecture of the web. I'm more of a pragmatist, but there are certainly some technical issues to consider, and in the next post I'll go a bit deeper into this notion of client-side data for digital cultural collections.

Discover the Queenslander

Discover the Queenslander is our latest generous interface project, commissioned by the State Library of Queensland to showcase their collection of covers and pages from The Queenslander newspaper. Published 1866-1939, The Queenslander was the illustrated weekend supplement to the Brisbane Courier Mail. This collection includes around 1000 covers, advertisements and illustrations - a beautiful slice of Australian pre-WWII visual culture. Geoff Hinchcliffe and I developed a web-based interface that builds on our previous approaches - rich overview, browsing and visual exploration - and adds some new techniques. Here I'll provide a quick outline; in the following post I will focus on the web framework we used - AngularJS - which I think has some interesting applications for digital collections.


The Mosaic view provides a chronological overview of the collection - each tile represents items from a single year. Like the Manly Images mosaic, the tiles gradually reveal their contents - in this case they are also directly navigable. The Grid view is a more general-purpose explorer for browsing subjects, creators and years as well as colours. Both Grid and Mosaic interfaces link to a detailed item view. There's nothing radically new here - though there are a few new elements that extend on our generous interfaces repertoire.

Inspired by the qualities of the collection images and the related work happening at Cooper Hewitt, Geoff and I were keen to experiment with using colour to explore the collection. The process was (surprise!) more complex than we expected, but ultimately rewarding. Using some palette extraction code that Geoff developed, we first pre-built a palette for each item. These colours are stored in the collection metadata, and act much like any other metadata field. The interface then dynamically builds an "overview" palette revealing the colours in the current set of items, and both the item palette and the overview palette act in turn as filters; rinse and repeat for open-ended colour-browsing. Note also how the filters and facets in the grid view interact; selecting a colour will also reveal corresponding dates, creators and subjects (and vice versa).
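To make the palette-as-metadata idea concrete, here is a small sketch under some loud assumptions: each item is taken to carry a `palette` array of hex colour strings, and colours are matched exactly. The actual field name, palette format and matching logic in the project are not described here, and a real interface would probably group similar colours rather than compare strings.

```javascript
// Hypothetical: each item carries a pre-built `palette` array of hex colour strings.
function buildOverviewPalette(items) {
  const counts = new Map();
  for (const item of items) {
    // A Set ensures an item counts once per colour even if its palette repeats it.
    for (const colour of new Set(item.palette || [])) {
      counts.set(colour, (counts.get(colour) || 0) + 1);
    }
  }
  // Most widely shared colours first.
  return Array.from(counts, ([colour, itemCount]) => ({ colour, itemCount }))
    .sort((a, b) => b.itemCount - a.itemCount);
}

// Selecting a colour narrows the current set; the overview can then be rebuilt from it.
function filterByColour(items, colour) {
  return items.filter((item) => (item.palette || []).includes(colour));
}
```

Filtering and then rebuilding the overview from the filtered set gives the "rinse and repeat" loop: each selection produces a new, narrower palette to choose from.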


This project also introduces some simple personalisation, with the ability to curate and share a collection of favourite items. We opted for a lightweight, no-login approach using HTML5 web storage (essentially fancy cookies) to simply track item IDs. Sharing a collection is as simple as sharing a URL with a list of IDs baked in; and because collections operate within the standard grid view they get filters and facets too.
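A rough sketch of the no-login approach follows. It uses the plain Web Storage interface and the standard URL API rather than the module used in the project, and the storage key, the `ids` query parameter and the example base URL are all made up for illustration.

```javascript
// Assumed storage key and query parameter name; neither comes from the project.
const FAVOURITES_KEY = "favourites";

// `storage` is any object with getItem/setItem, e.g. window.localStorage in a browser.
function saveFavourites(storage, ids) {
  storage.setItem(FAVOURITES_KEY, JSON.stringify(ids));
}

function loadFavourites(storage) {
  const raw = storage.getItem(FAVOURITES_KEY);
  return raw ? JSON.parse(raw) : [];
}

// Bake the list of IDs into a shareable URL.
function buildShareUrl(baseUrl, ids) {
  const url = new URL(baseUrl);
  url.searchParams.set("ids", ids.join(","));
  return url.toString();
}

// Recover the list of IDs from a shared URL.
function readSharedIds(sharedUrl) {
  const raw = new URL(sharedUrl).searchParams.get("ids");
  return raw ? raw.split(",") : [];
}
```

Since only IDs are stored or shared, the receiving page can look the items up in the collection data it already holds and treat the result like any other filtered set.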

Finally a little feature that I am particularly fond of is the Trove link on the item page; a simple demonstration of how we might start to link up collections across institutional boundaries. In this case, the State Library of Queensland has high-res images of covers and illustrations, while the NLA's Trove publishes the full contents of The Queenslander (albeit with low-res scans). Using the Trove API we simply harvested the full list of issue dates and corresponding Trove IDs, then matched them against the SLQ items. So each Queenslander item also provides a link to its source issue, providing additional context as well as opening onto further exploration.
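The matching step might look something like the sketch below. It assumes that the middle segment of an SLQ item id is the issue date in YYYYMMDD form (the example id 702692-19340823-s002b for an item dated 23 August 1934 suggests this, but it is an inference), and that the harvested Trove data has been reduced to a list of { date, troveId } pairs. Both the shape of that list and the function names are hypothetical, and no Trove API calls are shown.

```javascript
// Assumption: the middle segment of an SLQ id is the issue date as YYYYMMDD.
function issueDateFromId(id) {
  const match = /^\d+-(\d{8})-/.exec(id);
  return match ? match[1] : null;
}

// `troveIssues` is a hypothetical harvested list: [{ date: "19340823", troveId: "..." }].
function linkItemsToTrove(items, troveIssues) {
  const byDate = new Map(troveIssues.map((issue) => [issue.date, issue.troveId]));
  return items.map((item) => {
    const date = issueDateFromId(item.id);
    // Items with no matching issue keep a null link rather than a guessed one.
    return { ...item, troveId: date && byDate.has(date) ? byDate.get(date) : null };
  });
}
```

Doing this once, ahead of time, and storing the result alongside the other metadata would keep the item page free of live lookups.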



Over the past twelve months we have been developing some new approaches to the challenge of providing rich, revealing interfaces to cultural collections. The key idea here is the notion of generous interfaces - an argument that we can (and should) show more of these collections than the search box normally allows; and that there's a zone between conventional web design and interactive data visualisation, where generous interfaces might happen. There's more on this concept in my NDF 2011 presentation, or (in a more formal mode) in the paper I presented at the recent ICA conference.

Here I want to introduce an experimental "generous interface" prototype. Manly Images is an explorer for the Manly Local Studies Image Library - a collection hosted by the Manly Library. This is a collection of around 7000 images, documenting the history of the Manly region from the 1800s to the 1990s. The aim here was to develop a "generous," exploratory, non-search interface to the collection, delivered in HTML.


The original intention here was simply to adapt our CommonsExplorer work into HTML - CommonsExplorer uses a linked combination of thumbnails and title words to provide a dense overview of an image collection. But to "show everything" would mean 7000 elements, a stretch even for modern browsers; and I wanted to experiment with some new approaches to overview, which remains the key problem here - a really juicy one. Given 7000 images with titles and little else, how can we provide a compact but revealing representation of the whole collection?

Here, the strategy was to break the collection into smaller segments based on either terms in the title, or date, and to draw each segment as a simple HTML div, where the size of the box reflects the number of items in that segment.  These segments also act as navigational elements, opening a "slider" type display for browsing through specific records, and finally a lightbox for larger images, with links to canonical URLs on both Trove and the Manly site.
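As an illustration of the segmenting idea (not the prototype's actual code), the sketch below groups records by decade and sizes a square box for each group so that its area is proportional to its share of the collection. The numeric `year` field, the choice of decades, the square boxes and the `totalArea` budget are all assumptions.

```javascript
// Group records into decade segments and size each box by its share of the collection.
function buildDecadeSegments(records, totalArea) {
  const groups = new Map();
  for (const record of records) {
    const decade = Math.floor(record.year / 10) * 10; // e.g. 1934 -> 1930
    if (!groups.has(decade)) groups.set(decade, []);
    groups.get(decade).push(record);
  }
  return Array.from(groups, ([decade, members]) => {
    // Area proportional to item count; side length in pixels for a square div.
    const area = totalArea * (members.length / records.length);
    return { decade, count: members.length, side: Math.round(Math.sqrt(area)) };
  }).sort((a, b) => a.decade - b.decade);
}
```

Scaling by area rather than by side length keeps a segment with four times the items looking four times as big, rather than sixteen times.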

As a visualisation, it's a bit like a treemap (without the hierarchy), or a reconfigured histogram. But a collection like this is more than a list of quantities; the texture and character of the images is crucial. So as well as showing quantity, the segments become windows revealing (fragments of) the images inside them in a rolling slideshow. We get a visual core-sample of each segment, revealing the character of that group; and across the collection as a whole, a shifting mosaic that reveals diversity (and consistency), and invites further exploration. An interesting side effect is that it becomes possible to surf through the whole collection without doing a thing; it will (eventually) just roll past. This might not be realistic in a traditional browser context, but that traditional, "sit-forward" user model is not what it used to be - as Marian Dörk argues, the leisurely drift of the information flaneur might be more apt.


So, a rich exploratory interface to 7000 images, without search, and delivered entirely in HTML; we have shown that it's possible, but is it any good? I'll write up my own evaluation with some technical documentation shortly; meantime, feedback on the prototype is very welcome - and if you are interested in building on it, or adapting it for other collections, the source is up on GitHub.

Finally some acknowledgements: this project was funded by the State Library of New South Wales and supported by Cameron Morley and Ellen Forsyth; thanks to John Taggart of Manly Library for permission to use the image collection. The collection data is harvested from the excellent Trove API, developed by the National Library of Australia.

I recently gave this presentation at the National Digital Forum 2011 in Wellington. It proposes a way to think about collection interfaces through the concept of generosity - "sharing abundantly". The presentation argues that collection interfaces dominated by search are stingy, or ungenerous: they don't provide adequate context, and they demand the user make the first move. By contrast, there seems to be a move towards more open, exploratory and generous ways of presenting collections, building on familiar web conventions and extending them. This presentation features "generous interfaces" by developers including Icelab, Tim Sherratt and Paul Hagon, and it includes a preview of some work I am currently doing with the National Gallery of Australia's Prints and Printmaking collection, in collaboration with Ben Ennis Butler.
