Tuesday, 27 September 2011

Pelagios usability testing results

In my earlier posts, Evaluating Pelagios' usability and Evaluating usability: what happens in a user testing session?, I promised I'd share some preliminary results. Because my last posts were getting rather long, I'll keep this short and sweet.

I had two main questions going into the user testing: could a user discover all the functionality of Pelagios, and could they make sense of what they were shown?  In short, the answer to the first question is 'no', and the answer to the second is 'yes'.

I'm a big fan of testing the usability of a site with real users.  Test participants not only give you incredibly useful insights into your site or application, but they help you clarify your own thoughts about the design. It was exciting to see test participants realise the potential of the site - particularly the map and data views, which is a cue to make them more prominent when the site first loads - but it was clear that the graph interface needs improvements to make the full range of actions available for selecting, linking and exploring datasets more visible to the user.  The test participants also used search heavily when looking for particular resources, so this would be a key area for future work.

If you've been working on a project, user testing is wonderful and painful in equal measures.  It's definitely easier to test someone else's project, not least because it's easier to prioritise tasks from the users' point of view when you don't have to deal with the details of implementing the changes.

The overall goal of the usability testing was to produce a prioritised list of design and development tasks to improve the usability of the Pelagios visualisation for a defined target audience (non/semi-specialist adults with an interest in the ancient world), and this user testing was really successful in giving the team a clear list of future tasks.

Friday, 23 September 2011

Evaluating usability: what happens in a user testing session?

In my last post I talked about the test plan for assessing the usability of the Pelagios 'graph explorer' for the project's (deep breath) 'non/semi-specialist adults with an interest in the ancient world' audience. Before I get into the details of what happens in a usability test session, I thought I'd introduce you to our design persona, Johanna.
Image credit: @ANDYwithCAMERA
Johanna is 21, and is a third year History student. She moved from her native Germany to the UK three years ago for university. Her goal is to get a First so she has more options for future academic work, perhaps in the Classics. She's slightly swotty, and is always organised and methodical, but finds that she's easily distracted by Facebook and chat when she's working on the computer. She can often be found having coffee or in the pub with friends, at her part-time job in a clothing store, or in the library (her shared house is often noisy when she's trying to work). She dislikes distractions when she's trying to study, and hates rude customers at work. She likes her bike, RomComs and catching up with friends. Her favourite brands are Facebook, MacBook, Topshop, Spiegel Online and The Body Shop. Her most important personal belongings are her laptop, her mobile phone, and photos of friends and family from Hamburg and college.
Johanna is technically competent, and prefers to learn through trial and error rather than reading manuals or instructions. But she also has limited patience and will give up on interfaces that are too difficult. Johanna is a heavy user of social networks and also uses online research databases and library catalogues.
Johanna has an assignment on inscriptions due in a month. She hates the emphasis on big battles and big men in the subject, and finds inscriptions dry, but has been told they can also convey interesting social history and cultural values. She's not convinced (and she's not sure whether she'll be able to make much of the language of the inscriptions) so she wants to find an ancient place that also has other historical material about it to make the assignment more relevant to her own interests.
To create our persona and design the test tasks, I quizzed Elton on the types of questions people ask when they find out he's a Classicist to get a sense of common (mis)perceptions and interests, and about the types of students he's encountered.

So, onto the usability tests themselves. The time and venue for each test was organised directly with the participant, with the restriction that we had to be able to get online, be in an environment where it was ok to talk aloud, and ideally we'd meet somewhere the participant would feel comfortable.

In my last post I mentioned writing and testing some set tasks for the usability test, a short semi-structured interview, and an introductory script. Once the participant had arrived, and was settled with a cup of tea or whatever, I'd introduce myself and explain how I came to be working with the project. I've included the basic introductory script below so you can get a sense of how a test session starts:
Thank you for agreeing to help us test the usability of the current interface for Pelagios.
We'll be using these tests to produce a prioritised list of design and development tasks to improve the Pelagios visualisation for people like you.
The session will take up to an hour and will start with a short interview, then your initial impressions of the site, and finally we'll go through some typical tasks on the site. I'll ask you to 'think aloud' as you use the site - a running stream of thoughts about what you're seeing and how you think it works. I might also ask you questions to clarify or explore interesting things that come up during the session.
I want you to know that you're not being tested! We're testing the interface - anything that goes wrong is almost definitely its fault, not yours! Also, I haven't been involved in the project design, so you don't need to worry about hurting my feelings - be as direct as you like about what you're seeing.
I won't be recording this, but I will be taking notes as we go, and summarising them to pass them on to the project team.
You can stop for a break or questions at any time.
Do you have any questions before we begin?
The next phase of the test session was the short interview. Again, I've included the questions below:
  • Demographic data: what is your age, gender, educational level, nationality/cultural background?
  • What websites do you use regularly (on a daily/weekly basis)?
  • What's your favourite website, and why?
  • What websites do you use in your research/daily work?
  • Have you seen sites like [Guardian, Gapminder, etc] that feature interactive visualisations?
  • How would you describe your level of experience with the classics? (e.g. a lot, a little). Do you focus on any particular area?
  • What is your definition of the classics? (Geographical, chronological scope)
Once the questionnaire was over, and any questions that had arisen had been discussed, the test began. The first part of the test covered first impressions of the 'look and feel' of the site, what they thought the site might be about and what content it would include, and what they thought the 'blobs' that form the first view of the graph visualisation represented. I was also observing the kinds of interactions participants tried with the visualisation, whether single or double mouse-clicks, dragging, right-clicking, etc, because I wanted to know how much of the functionality of the site was intuitively discoverable.

The first formal task was: "find all the resources related to Cyrene" [or a place related to their own interests]. I'd note the actions the participants took along with their comments as they 'thought aloud'. Sometimes I'd ask for more information about why they were doing certain things, or remind them to tell me about the options they were considering. I also noted the points where the participant expressed confusion or frustration, or gave up on a task, though I didn't time the tasks or keep a formal count of errors.

After the task, I'd ask (if it hadn't already come up):
  • What do you think these resources are?
  • How do you think they relate to your actions?
  • What contextual information might you need to make sense of these resources?
These questions were based on the team's review of the site and were aimed at making sure we understood the participant's 'mental model' of the site. If there's a mismatch between the users' mental model and what your site actually does, you need to help users develop a more appropriate mental model.

The second task, "Are there links between [Place 1, Place 2]? If so, what are they and how many are there?" was more open-ended and designed to see how participants managed small result sets on the site. Again, I had questions prepared as prompts in case they hadn't already been answered during the task:
  • What do you think you're looking at here?
  • What does the screen tell you?
  • What do you think the links mean/are?
  • What do you think the movements on the screen mean?
  • How do you interpret the results?
  • How do you think they're selected?
Finally, I asked some questions aimed at giving the project some metrics to measure improvement in the usability of the site after design updates: 'would you use the site again?', 'how likely are you to recommend it to a friend?'.  The final questions were: 'what would you suggest as first priority?' and 'any final comments?'.

After running each test, I'd tidy up my notes and summarise the key points for the team so they could prioritise the next items of design or development work. Which leads me onto my next post, which will include some preliminary results...

Tuesday, 20 September 2011

Evaluating Pelagios' usability

Hello!  I'm Mia, and I was drafted into the Pelagios project to run some usability testing on the 'graph explorer'. (These days I'm working on a PhD in Digital Humanities in the department of History at the Open University, but until quite recently I worked as an analyst/programmer and user experience designer, mostly in museums, and in early 2011 I completed City University London's MSc in Human-Computer Interaction).

There's a range of usability methods we could have used to evaluate Pelagios' usability, but the 'gold standard' is user testing (basically, showing the site to typical users and gathering their feedback as they complete set tasks). It requires more resources to set up and run the user tests than other usability techniques, but it's particularly useful for 'novel' interfaces like the Pelagios visualisation.  Other common methods are testing with paper prototypes (e.g. if you haven't got a working site), card sorting, or having experts review the site according to usability checklists (AKA 'heuristic evaluation').

Quite a bit of work goes into preparing and piloting user testing. Once I'd written a usability testing plan, I worked with Elton, Leif and Rainer to define the key audiences ('subject specialists' and 'non-specialists with an interest in the classics/ancient world' - specialist is a relative term in this context) and create a persona to represent the specific target audience we were going to test for, the snappily-titled 'non-specialist adults with an interest in the classics/ancient world'.  In addition to focusing our minds on the usability requirements of this audience and typical tasks they might undertake on the site, this persona would be used in future design processes to ensure the project delivers user-centred designs.

We also reviewed the available functionality and interfaces to design test tasks that would help us understand where improvements were needed to make the graph visualisation more useful to its audiences.  The trick is to write tasks that make sense to the test participants and that will also lead them to use key areas of the site.  I also included an initial open-ended question to elicit overall impressions of the site, as it's a good way to gather feedback on the overall design and get a sense of what the participant thinks the scope of the site might be.

I also designed a short semi-structured interview - a set list of questions to ensure consistent data collection, with the flexibility to explore interesting issues that will provide insight into user requirements, expectations and mental models as they arise.  I included questions on other sites the participants use regularly for research or leisure, as these will give some idea of their expectations of other sites. It's helpful to order your questions so the easy ones are first, as it gives the participant a chance to relax and get used to the situation.  User tests use the 'think aloud' protocol, where the participant, well, thinks aloud, sharing the thoughts and questions that are running through their minds as they use the site and go through their tasks.

Meanwhile, Elton was recruiting participants - we were aiming to include people interested (but not yet specialist) in the Classics or ancient world, to match our target audience as closely as possible. Once I had the test tasks and interview in order, I wrote a short introductory script to read at the start of each testing session.  This script helps the tester remember to give everyone the same information so the tests are consistent and the participant has a positive experience.  I then ran a pilot test with a volunteer participant, including the interview and intro script - this is one of the most important stages, because it helps you refine your language so it's clear to people new to the project, check the timing of tasks and make sure everything works as expected.

In my next post I'll explain what happens in a usability test session, and share our design persona with you. In the post after that, I'll share some of the results... Post below if you have any questions or comments!

Friday, 16 September 2011

The *Child of 10* standard

While Pelagios has been largely about building an alliance of leading ancient world research groups with the aim of linking their data in an open and transparent way, the 'front end' of our product has never been far from our minds. After all, many of the partners are also users of the data that they gather, or, if not the actual users, they have their own user groups to think about and appeal to. As a classicist myself - that is, as someone who spends most of the time reading and analysing ancient Greek texts - I want to be able to access sources easily and trust the data that I get: in other words, I want to be able to turn on the tap and find that the water runs (either hot or cold, depending on what I'm doing); I'm not interested in the plumbing that brings the water to me.

So it was timely that JISC brought to our attention a fellow jiscGEO project, called G3. In an earlier post, they had talked about a useful benchmark in user interface design being the Child of 10 standard, meaning that a child of 10 should be able to learn to do something useful with the system within 10 minutes. This indicates whether a system is “easy to use” or not.

Will our tool, the Pelagios Graph Explorer, fit the bill, I wonder? While our natural target audience is university researchers (lecturers and undergrads), given the seemingly never-ending appeal of Classics in popular culture, we would be mad not to take seriously the point that a 10 year old should be able to use our tool to find out interesting stuff about the ancient world. Indeed, the technical skills of the average Classicist researcher - not least this one - make it imperative that we address this question. At the time of writing, we are engaged in user testing of the Graph Explorer with a representative sample audience, the results from which will help inform our delivery of the product at the end of October (though it's already clear that this will be a work-in-progress...). All next week Mia Ridge, who has been conducting the user testing, will blog about it, setting out the methods (why we chose them, what prep is done), what actually happens in a session, and then some initial results.

But I can give a sneak preview here of the answer to that question: does the Pelagios Graph Explorer pass the *Child of 10* test? On current performance, that would be a 'no'. Which is not to say that things haven't gone well! On the contrary, the very fact that issues are being raised about what you can do now that stuff is linked shows how successful we've been in linking our data: when we started out, it simply wasn't possible to imagine an ancient world of linked data, let alone think seriously about traversing it. But now that we have linked stuff together, the bar has been raised and people - rightly - want to do more with it. This presents a challenge to all the Pelagios partners to provide as much detail as possible in their metadata, in order to allow the kind of free play that a 10 year old - or a classicist - might want.

Perhaps we could start with the name: the Pelagios Graph Explorer isn't very sexy. Suggestions on the back of a postcard, or, ideally, on this blog, welcome.

Thursday, 8 September 2011

(Re-)Using the Graph Explorer Pt. 3: Getting Your Data Inside

With a little delay, I'd like to conclude our 3-part introduction to the PELAGIOS Graph Explorer (see here for part 1 and part 2). This time we're looking at data importing.

Data Preparation - the Basics

Getting our initial batch of data from the PELAGIOS partners into the Graph Explorer was both easy and a bit of a challenge at the same time. As for the easy part: the two 'PELAGIOS principles' of...

  • aligning place references with PLEIADES and
  • using the OAC vocabulary to express them in RDF

make the 'baseline' import almost effortless. We can simply parse the RDF, pick out the OAC annotations, verify whether they point to a valid Pleiades URI - job done. Therefore, if you want to make your own data PELAGIOS-ready, complying with these two principles is really all you need to do. We've included some RDF samples in our code repository, which you can use as a reference for the exact RDF syntax. We are also working on a (still unfinished) online application that generates maps from properly-formatted data dumps, thereby providing online validation for your data.
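To make the 'baseline' check above concrete, here is a minimal sketch in Python (not the project's actual importer, which is written in Java/Groovy). It assumes the data dump is serialised as N-Triples; a real importer would use a proper RDF library such as rdflib or Jena rather than a line-based regex. The OAC predicate URI follows the 2011 OAC draft vocabulary, and the Pleiades URI pattern is my assumption about well-formed place URIs.

```python
# Sketch only: validate that OAC annotation bodies point to Pleiades
# place URIs, for a dump serialised as N-Triples (one triple per line).
import re

# Matches a simple "<s> <p> <o> ." triple where all terms are URIs.
TRIPLE = re.compile(r'<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.')

# Predicate from the 2011 OAC draft vocabulary.
OAC_BODY = "http://www.openannotation.org/ns/hasBody"

# Assumed shape of a well-formed Pleiades place URI.
PLEIADES = re.compile(r'^https?://pleiades\.stoa\.org/places/\d+')

def check_pleiades_bodies(ntriples_text):
    """Return (valid, invalid) lists of (annotation, body) pairs.

    A body is 'valid' if it is a well-formed Pleiades place URI."""
    valid, invalid = [], []
    for line in ntriples_text.splitlines():
        m = TRIPLE.match(line.strip())
        if not m:
            continue  # skip blank lines, comments, literal objects
        subj, pred, obj = m.groups()
        if pred == OAC_BODY:
            (valid if PLEIADES.match(obj) else invalid).append((subj, obj))
    return valid, invalid
```

Run against a partner dump, the `invalid` list gives you exactly the annotations that need fixing before the data is PELAGIOS-ready.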

Structuring the Data

Now on to the advanced part... Once you have produced your OAC-formatted list of PLEIADES URIs, it's really just that: a long, flat list of places. Already that's useful for building basic visualizations - such as maps showing a dataset's geographic extent, or Google-Map-mashups where pushpin-markers link to source texts. But for the Graph Explorer, we wanted to show a more fine-grained picture of the connections within the data.

Usually, a dataset will have some sort of internal hierarchy: a subdivision of an archaeological collection into different sub-collections, perhaps; or a structuring of a text corpus into books, subdivided into volumes, chapters, paragraphs and so on. In terms of the Graph Explorer, this means that when we search for, say, Memphis and Delos, it can tell us that both are mentioned in Herodotus, on page 125, rather than giving us the (somewhat less useful) information that both are referenced in GAP's Google Books dataset.

Unfortunately, the 'PELAGIOS principles' don't define an explicit mechanism for expressing such structural information at the moment. Nonetheless our partners' datasets often reflect hierarchy in the design of their resources' URIs: for example, GAP's Google Book URIs carry book IDs and page numbers; annotations provided by Perseus include subdivisions into individual chapters, sections, poems, etc.

To make the Graph Explorer's output more useful, I therefore exploited this implicit information to build the hierarchy in the import script. The import scripts also generate human-readable labels for the hierarchical dataset units, based either on consultation with partners (e.g. we simply agreed on how we would name SPQR's sub-collections and coded that into the import script), or additional metadata in the data dumps (e.g. GAP has rdfs:labels included in the data dump to define the labels).
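The approach of mining implicit hierarchy from URI structure can be sketched as follows. This is an illustrative Python example, not the project's actual Groovy importer: the URI pattern is hypothetical, loosely modelled on the 'book ID plus page number' shape described above, and a real importer would match each partner's actual URI scheme.

```python
# Sketch only: derive a (dataset, book, page) hierarchy path from the
# structure of an annotation target URI. The pattern below is a
# hypothetical Google-Books-style URI carrying a book ID and page number.
import re

GAP_PAGE = re.compile(
    r'books\.google\.com/books\?id=(?P<book>[\w-]+)&pg=PA(?P<page>\d+)')

def hierarchy_path(uri):
    """Map an annotation target URI to a (dataset, book, page) tuple,
    or None if the URI doesn't match the expected pattern."""
    m = GAP_PAGE.search(uri)
    if not m:
        return None
    # Human-readable labels for each level of the hierarchy; in the
    # real importers these came from agreed naming conventions or
    # rdfs:labels in the partners' data dumps.
    return ("GAP Google Books",
            "Book " + m.group('book'),
            "Page " + m.group('page'))
```

With one such function per partner scheme, the import script can file every annotation under a labelled hierarchy node instead of a single flat dataset.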

In hindsight, it may have made sense to think about an additional 'principle' to cover this (e.g. by including an RDF vocabulary like VoID). But then again, at the start of the project the discussion revolved very much around the groundwork of getting datasets aligned at all, and the Graph Explorer was still a vague idea. (Not to mention that the sheer diversity of the datasets would make the development of a consistent, reasonably fine-grained description scheme a project in its own right...)

Importing your Data

The bottom line of all this is: getting a hierarchical, custom-labeled dataset into the Graph Explorer will still require some manual tweaking (read 'coding effort') at the moment. With a little Java development skill, the process should be fairly straightforward, however: the essential importer classes are well documented, and there are a number of code examples in the repository.

By the way: we've implemented most of the importers in Groovy, a Java-based scripting language, which worked really well for us and helped keep the import scripts noticeably shorter than plain old Java would have been. In particular, I'd recommend taking a look at the GAP and Perseus importer source code to get started.

P.S.: The online demo of the PELAGIOS Graph Explorer is available here. Screencasts explaining the basic usage are in this blogpost: The PELAGIOS Graph Explorer: A First Look