Yeller Gets a Redesign

This is a blog about the development of Yeller, The Exception Tracker with Answers

Read more about Yeller here

I’ve been hard at work recently to get a redesign of Yeller’s UI out of the door. I had three big priorities with the new UI design:

  1. Speed
  2. Understanding
  3. Density

Here’s a little about what I mean by those, and how Yeller’s new UI works to accomplish these goals:

Speed

Whilst Yeller has a bunch of heavy performance requirements on the backend, good performance of the website is key to helping you learn about your errors faster. Great UI has to be fast in my book - there’s no getting around it. Sites that are slow don’t get used as much, or at the very least make their users very angry.

I’ve done a whole pile of tuning on Yeller’s web UI, both on the frontend, and on the server. The server was relatively well tuned before (to the tune of 100ms or so per page render), but that’s dropped down to around 60ms with this release. Here are the server latency histograms as they stand right now (numbers are all milliseconds):

the dashboard is on the left, the individual error page is on the right

On the frontend side of things, I’ve switched over to an upgraded nginx to take advantage of some SSL performance improvements, and also moved over to using Google’s nginx PageSpeed plugin, which gives you 40+ frontend performance tuning settings that rewrite your pages to e.g. concatenate JavaScript files, concatenate and minify CSS, compress images and so on.

This was a big improvement for Yeller’s performance: the whole thing took about 10 minutes to do, and gave a huge performance boost for the end user.
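
For a rough idea of what turning on those rewrites can look like, here is a minimal sketch of an nginx server block, assuming the ngx_pagespeed module is compiled into nginx. This is not Yeller’s actual configuration: the server name, the cache path and the particular set of filters are illustrative assumptions.

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical host, not Yeller's

    # Turn the PageSpeed module on for this server.
    pagespeed on;

    # PageSpeed needs a writable directory for its rewritten resources.
    # The path is an assumption; any directory writable by nginx works.
    pagespeed FileCachePath /var/cache/ngx_pagespeed;

    # Enable a handful of the available filters:
    #   combine_javascript - concatenate JavaScript files
    #   combine_css        - concatenate CSS files
    #   rewrite_css        - minify CSS
    #   rewrite_javascript - minify JavaScript
    #   rewrite_images     - recompress and resize images
    pagespeed EnableFilters combine_javascript,combine_css,rewrite_css,rewrite_javascript,rewrite_images;
}
```

Which filters are safe to enable depends on the site, so it’s worth checking each one against your own pages before shipping it.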

Understanding (the job to be done)

The Jobs to Be Done way of thinking was a cutting set of revelations for me. Understanding the jobs your customers are trying to do not only helps you decide what the product should do, but also helps with UI design on each individual page.

I think of jobs to be done for UI design as figuring out the situations and stories for when people use your product. Not just “what do people want to do on this page”, but “when and why do people come to this page, and in those situations what are they trying to accomplish”.

Yeller has basically two pages that are looked at a lot more than the rest of the site, so that’s where I’ve been focussing my efforts. Here’s how the analysis breaks down:

page 1: the dashboard

This interface has changed the most in this redesign. Here’s what’s changed:

  • the obvious change: Yeller now uses a list, rather than cards. Lists make the exceptions much easier to scan: the human eye is well optimized for picking up patterns whilst looking down a vertical list. Looking for patterns and comparing data across cards, on the other hand, means you have to scan in two directions at once, which is much more difficult.

This new design means that you can pick out what errors are important to fix right now, and leave those that aren’t important to your business right now.

  • deploys are now surfaced inline in the dashboard, so you can see how the deploys relate to the exceptions in time.
  • the search filters are now detailed as a sentence at the top of the list rather than being grouped in the top navigation bar
  • lastly (and this is UI wide), the in-project navigation (between errors, deploys, and settings) has moved into the top bar. This let the content in the page move up, and removed a bunch of blank, non-useful space.

This page has one primary job to be done:

What Exceptions are bad?

This job covers a bunch of different situations - in all of them, people come to this page to answer this question.

Here are some situations this page is used in:

  • just after shipping a new deploy, to see if things are busted
  • in larger, more mature organizations, by higher up folk, to assign exceptions to people
  • I’ve got 500 exceptions open right now, how do I prioritize which one to fix?
  • By developers at irregular intervals, just to find new work to do

Secondary jobs are quite numerous for this page, for example:

  • Is this exception related to some other exceptions?
  • I’m going to clear out all the unhandled exceptions we have in production. Where should I start?

page 2: the diagnosis page

This page has two primary jobs to be done:

Should I be fixing this right now?

This is a followup from the list of exceptions - you think you should be fixing this right now, but is there something that you see in more detail that means you don’t have to do that?

How do I fix this exception?

This job is the core of Yeller. How do you design a UI that lets users sift through potentially huge amounts of information? This job took a bunch of research, and more learning about UI design to fulfill. That leads us nicely into my last goal for this redesign:

Density

Density is a key principle espoused by Edward Tufte through his books. He neatly summarizes it like this:

Remove non-data ink

(he was talking about printed design there, but the phrase applies just as neatly to on-screen design)

Lately this idea has had a resurgence in UI design circles, but expressed as the (I think) less deep phrase:

Remove navigation

Yeller’s old exception diagnosis page and list of exceptions both wasted quite a bit of space on navigation, and had quite a bit of plain blank space that prevented you from seeing the data. Tufte goes on to say at a different point:

Show the data

Requiring users to scroll or navigate to find the information they need is a waste of their time.

The last inspiration for this bout of redesign was Bret Victor’s older Magic Ink paper. This is one of my favorite articles on UI design ever, arguing (persuasively!) for software to be designed with less input, and a deep focus on presentation. Yeller fits nicely into this - once you’ve set up the client libraries to send it exceptions, you don’t need to manipulate anything else, so it’s purely a tool that helps you understand information.

Inspired by that paper, I directly stole Bret (and Tufte’s!) technique of using sentences to present abstract information, in two key places:

  • on the dashboard, to explain the current search
  • on the error page, to explain the surrounding data from the current error

I really like this technique: it explains abstract information in an ancient and very human form.

So that’s Yeller’s new UI design. Three big improvements to help you diagnose your exceptions faster, and figure out what’s important.

You should try out Yeller today. It will save you a significant amount of time when it comes to debugging exceptions that you have hit in production. Setup takes 30 seconds for Ruby, Rails, Java, Clojure and Go - there are native client libraries for each of those.

