Walking Talking SVG

Speaking Calculator

This is an example of the accessibility features available using SVG along with a Text-To-Speech engine. This calculator was designed to scale larger as the window is expanded, so that it is easy for the user to hit the right button. Each button, and the text area, can be accessed by “tabbing” (using the q button) and selected with the space or enter keys. As a button is selected via mouse or keyboard, the value is read to the user, as is the total of the operation, and the button is highlighted with colors chosen to contrast well for users with a variety of color vision deficiencies.

This sample requires Windows IE, enhanced with the Adobe SVG viewer (available at http://www.adobe.com/svg/viewer/install/main.html) and the Microsoft TTS plug-ins, which are available as two downloads weighing in at under 2 MB together.

Instructions and Explanation

This example uses the upcoming “focusable” and “nav-index” attributes from SVG 1.2 on the buttons, along with a small script that makes them work in ASV3. In a nutshell, “nav-index” is an attribute applied to any element that the user should be able to select with the keyboard; its value is a positive integer (a “counting” number). “Tabbing” from one element to another takes you to the element with the next nav-index value in sequence, wrapping around to the starting point (number 1) or the ending point (the highest number) when you have cycled through the whole set in either direction. Because ASV3 does not give access to the tab key [note that this will change in UAs that are compliant with the SVG 1.2 Specification], I have had to make custom key mappings. Use q and shift+q to tab between buttons, and space or enter to select the current button.
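
The wrap-around tabbing described above can be sketched roughly as follows. This is a minimal TypeScript illustration, not the script used in the calculator; the function name nextNavIndex and its parameters are hypothetical, and it assumes the nav-index values have already been collected from the focusable elements.

```typescript
// Hypothetical helper: given the nav-index values present in the document and
// the currently focused value, return the value that "tabbing" moves to.
function nextNavIndex(indices: number[], current: number, backward = false): number {
  if (indices.length === 0) {
    throw new Error("no focusable elements");
  }
  // Sort a copy so the document order of the elements does not matter.
  const sorted = [...indices].sort((a, b) => a - b);
  const pos = sorted.indexOf(current);
  if (pos === -1) {
    // Nothing focused yet: start at the first (or last) element.
    return backward ? sorted[sorted.length - 1] : sorted[0];
  }
  const step = backward ? -1 : 1;
  // Modular arithmetic provides the wrap-around at both ends.
  return sorted[(pos + step + sorted.length) % sorted.length];
}
```

Under these assumptions, a q keypress would call nextNavIndex(indices, current) and a shift+q keypress would call nextNavIndex(indices, current, true).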

Directional Navigation

One challenge to navigation using nav-index values is the tedium of cycling through each button to get to the desired choice. To facilitate navigation, I am using navigation properties that have not yet been defined for SVG: the “nav-up”, “nav-down”, “nav-left”, and “nav-right” attributes from CSS3's Directional Navigation, mapped to the arrow keys and to the w, s, a, and d keys, respectively. Since the arrow keys are often used for other purposes (scrollbars, incrementing or decrementing values, moving target elements around the screen), a better directional navigation mapping might be shift+arrow keys. Please note that to use the keyboard, you must click on the SVG image area and keep the mouse pointer over it so that it retains focus.

This example assumes a grid arrangement of the items to be navigated, a fairly conventional and reasonable practice. The particular mechanism that I used is slightly different from that specified in CSS3: rather than using a fully-specified URI, I chose to use the nav-index of the target element, since this is more consistent with the nav-index system, is less verbose, does not require that the focus target have an “id” attribute, and would be easier to assign and track programmatically. I manually assigned to each button's directional focus attributes the nav-index values of its neighbors to the north, south, east, and west. In a simple case such as this calculator, this works fine; my particular implementation, however, is further predicated on a static relationship between elements, with no movement relative to neighbor elements and with no disabled or disappearing intervening elements. In the case of static but disabled or hidden elements, this approach can still work: a check can be made as to the status of the target element, and its neighbor in the appropriate direction can then be targeted in its stead. I think that for most cases, my current approach is workable, is often more elegant than using a URI, and is more consistent with the current nav-index concept; thus, I would like to see it find its way into the SVG, CSS3, or DOM Specifications. (I am not sure why navigation details are under the CSS umbrella, rather than XLink or DOM, but I suppose it could be considered a type of presentation, and whatever gets the job done works for me.)
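
As a rough illustration of the static scheme, including the check for disabled elements mentioned above, here is a minimal TypeScript sketch. It is not the calculator's script; the NavButton shape, the enabled flag, and the function name directionalTarget are all hypothetical names chosen for the example.

```typescript
type Direction = "up" | "down" | "left" | "right";

// Hypothetical record for one focusable button.
interface NavButton {
  navIndex: number;
  enabled: boolean;
  // nav-index of the neighbor in each direction; omitted at the grid edge.
  neighbors: Partial<Record<Direction, number>>;
}

// Follow the neighbor reference in one direction, skipping disabled buttons.
// Returns the nav-index that should receive focus (the origin if none is found).
function directionalTarget(
  buttons: Map<number, NavButton>,
  from: number,
  dir: Direction,
): number {
  const visited = new Set<number>([from]); // guards against circular references
  let next = buttons.get(from)?.neighbors[dir];
  while (next !== undefined && !visited.has(next)) {
    const candidate = buttons.get(next);
    if (candidate === undefined) break; // dangling reference: stay put
    if (candidate.enabled) return candidate.navIndex;
    visited.add(next);
    next = candidate.neighbors[dir]; // step over the disabled element
  }
  return from; // no usable neighbor in that direction
}
```

Referring to neighbors by nav-index keeps each record small and means no button needs an “id”, which is the trade-off argued for above.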

As I said, static content is less challenging and more predictable. On the other hand, when the focusable elements can move around, as might be the case for the “auto” nav value, basing directional navigation on the nav-index is not an option. The best approach would probably be to find the focusable element that is closest to the last focused element and nearest to the axis of the navigation direction; some compromises may have to be made when deciding whether directionality or proximity is more important.

Below is an example of directional navigation that does not use set values for the nav-[direction] attributes, but calculates the proper focus target instead. Focus can be changed by mouse-clicking, by “tabbing”, or by using the direction keys. As above, you can tab through the elements with the q and shift+q keys or choose a direction with the w, s, a, and d keys. Unlike the calculator, you can move any focusable element around with the mouse or with the arrow keys (note: since this SVG is embedded in an HTML page with a scrollbar, the arrow keys may not work; to see this in action, just right-click on the SVG and choose “View SVG”). Since the location of the focusable items is dynamic, this demonstrates how the “auto” value might work. As in the calculator above, I generate an array of all the focusable elements, which in this case are the <circle> elements; the rounded <rect> elements do not have the “focusable” attribute, nor does the background canvas. From this list, I find each one that fits the directional criterion (i.e. for “nav-left” the “x” value of the potential focus targets must be less than that of the current focus item, and for “nav-down” the “y” value of the targets must be greater), then use the Pythagorean distance formula to find the closest of these. Occasionally this leads to somewhat unintuitive (but technically correct) results, such as jumping to an element that is closer to the centerpoint of the origin element, but further than other elements from the axis perpendicular to the navigation direction; but it works well on the whole, and might well be suitable to be included as a basic feature of a UA.
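
The filter-then-measure approach described above can be sketched as follows. This is a minimal TypeScript illustration under stated assumptions: each focusable item is reduced to a hypothetical FocusPoint holding its center coordinates, and the function name nearestInDirection is invented for the example.

```typescript
type NavDirection = "up" | "down" | "left" | "right";

// Hypothetical stand-in for a focusable element: an id plus its center point.
interface FocusPoint {
  id: string;
  x: number;
  y: number;
}

// Keep only the candidates lying in the requested direction, then pick the one
// with the smallest straight-line (Pythagorean) distance from the current item.
function nearestInDirection(
  current: FocusPoint,
  candidates: FocusPoint[],
  dir: NavDirection,
): FocusPoint | null {
  let best: FocusPoint | null = null;
  let bestDistance = Infinity;
  for (const p of candidates) {
    if (p.id === current.id) continue;
    const dx = p.x - current.x;
    const dy = p.y - current.y;
    // SVG's y axis grows downward, so "down" means a larger y value.
    const qualifies =
      dir === "left" ? dx < 0 :
      dir === "right" ? dx > 0 :
      dir === "up" ? dy < 0 :
      dy > 0;
    if (!qualifies) continue;
    const distance = Math.hypot(dx, dy);
    if (distance < bestDistance) {
      best = p;
      bestDistance = distance;
    }
  }
  return best; // null when nothing lies in that direction
}
```

Because only the raw distance is compared, a candidate that is well off to one side can win over one that is more directly in line but slightly farther away, which is the unintuitive result noted above. Weighting the off-axis component more heavily would be one way to favor directionality over proximity.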

The following is a tiny bit more complicated, using <circle> and <text> elements nested in an <svg> element as the focusable items. An additional navigation option is included: pressing the 1 - 9 keys will focus the relevant item. For double-digit indices, the UI might check for a rapid succession of number keys, or test for a specific control key being pressed down before the number sequence and then released to mark the end (alt-down_1_6_alt-up equals nav-index 16). Also, in this example, the rectangles can be moved with the mouse, but cannot receive focus; they show that only focusable elements can be navigated to via the keyboard, regardless of other manipulations and positions. The technique I used of nesting elements within a container also points out another potential flaw of the “focusable” attribute: where the attribute is applied. In this instance, in order to get the focusable element when I clicked on it, I simply had to reference the event target's parentNode, but had the focus item been more complex, I would have had to walk up the DOM tree, testing each node along the way for the “focusable” attribute, until I reached the proper node. This would not be too onerous a task, but it demands that something be done in script, and focusability is one of those things that should simply work in the UA. For a complex widget created with XBL, multiple levels of nesting would not be unusual. I'm not certain how dire a problem this could turn out to be, or what could be done about it, but one possible solution is to have the mouse event bubble up to its focusable parent. This would be analogous to the currentTarget attribute of an event. At the present time, actually, it is not clear what the inheritability of “focusable” is, although it is explicitly applicable to container elements; this is a property unique to SVG, so it should definitely allow for objects made up of multiple elements.
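
The walk up the DOM tree mentioned above might look something like the following TypeScript sketch. The ElementLike interface is a hypothetical minimal stand-in for a DOM element (so the example runs outside a browser), and findFocusableAncestor is an invented name; in a real page the starting node would be the event target, and non-element ancestors would need to be skipped.

```typescript
// Hypothetical minimal stand-in for a DOM element.
interface ElementLike {
  getAttribute(name: string): string | null;
  parentNode: ElementLike | null;
}

// Starting at the clicked node, climb through the ancestors until one carries
// focusable="true". Returns null if no such node exists.
function findFocusableAncestor(target: ElementLike | null): ElementLike | null {
  for (let node = target; node !== null; node = node.parentNode) {
    if (node.getAttribute("focusable") === "true") {
      return node;
    }
  }
  return null;
}
```

The loop itself is short, but it still has to live in author script, which is the objection raised above.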

Hierarchical Navigation

Another approach, in most ways compatible with this cardinal direction scheme, is a hierarchical focus concept demonstrated by Jan-Klaas Kollhof. This concept rather handily addresses the need for navigation within dialogs. In this scheme, tabbing only shifts you within a certain level of nesting (say, within a single modal dialog). To shift to another hierarchical level of nesting, the user presses a specific up- or down- key sequence (shift+tab springs to mind).

I would extend this in a manner that does not rely on the actual nesting level of the elements in question, since this unnecessarily restricts the author; instead, I would like to simply add the hierarchy information to the nav-index, using a parameterized list syntax rather than a simple integer. For example:

  • nav-index='1'
    • nav-index='1 1'
    • nav-index='1 2'
      • nav-index='1 2 1'
      • nav-index='1 2 2'
      • nav-index='1 2 3'
      • nav-index='1 2 4'
    • nav-index='1 3'
      • nav-index='1 3 1'
      • nav-index='1 3 2'
  • nav-index='2'
    • nav-index='2 1'
    • nav-index='2 2'
    • nav-index='2 3'
  • nav-index='3'

This would also allow for easier expansion of individual document fragment “dialogs”, and in the case of automatic direction selection, would allow the navigation to be constrained within a certain nested level. Further, I would propose that a style cue be given when hierarchical options exist. The first thing that comes to mind as a visual cue for <text> elements is an underline when the element has child items, an overline when it has a parent, and both when applicable; another might be color-coding (as with visited versus unvisited links) applied either to the element itself or as a shape outline; for mouseovers, a special cursor might be used; for aural styles, a rising or falling tone might work.

Below is an example implementation inspired by Jan-Klaas's, but using a flat structure (that is, without true nesting of the child elements) with hierarchy enforced solely by the parameterized nav-index. Pragmatically, I would recommend using a nested structure, of course, for both semantic and practical reasons (moving around a dialog with multiple controls, and suchlike), but this demo shows that it is not necessary; this scheme allows for flexibility in terms of how relatively deeply nested individual sibling items are, and how the document is organized. To step through each focusable item in the same group and level, press q for forward and shift+q for backward; to go down a level, press ctrl+q, and to go upwards, press shift+ctrl+q. Style coding of the focused element options is as follows: solid black stroke if no parents or children are available; gold stroke if a parent is available; and dashed stroke if children are available.
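
To make the parameterized nav-index concrete, here is a minimal TypeScript sketch of how sibling, child, and parent lookups could be derived from a flat list of values such as those listed above. It is not the demo's script; all function names are hypothetical, and it assumes that going down a level lands on the lowest-numbered child, which the text above does not specify.

```typescript
// Hypothetical helpers for a parameterized nav-index such as "1 2 3".
type NavPath = number[];

// "1 2 3" -> [1, 2, 3]
function parseNavPath(value: string): NavPath {
  return value.trim().split(/\s+/).map(Number);
}

// Two paths are equal when they hold the same components in the same order.
function samePath(a: NavPath, b: NavPath): boolean {
  return a.length === b.length && a.every((n, i) => n === b[i]);
}

// Sort key: the last component orders items within their own group.
function lastComponent(value: string): number {
  const path = parseNavPath(value);
  return path[path.length - 1];
}

// q / shift+q: move among items that share the focused item's parent, wrapping.
function stepSibling(all: string[], current: string, backward = false): string {
  const here = parseNavPath(current);
  const group = all
    .filter((v) => {
      const p = parseNavPath(v);
      return p.length === here.length && samePath(p.slice(0, -1), here.slice(0, -1));
    })
    .sort((a, b) => lastComponent(a) - lastComponent(b));
  const pos = group.findIndex((v) => samePath(parseNavPath(v), here));
  if (pos === -1) return current; // focused item is not in the list
  const step = backward ? -1 : 1;
  return group[(pos + step + group.length) % group.length];
}

// ctrl+q: go down a level (assumption: to the lowest-numbered child).
function firstChild(all: string[], current: string): string | null {
  const here = parseNavPath(current);
  const children = all
    .filter((v) => {
      const p = parseNavPath(v);
      return p.length === here.length + 1 && samePath(p.slice(0, -1), here);
    })
    .sort((a, b) => lastComponent(a) - lastComponent(b));
  return children.length > 0 ? children[0] : null;
}

// shift+ctrl+q: go up a level, or null when already at the top level.
function parentOf(all: string[], current: string): string | null {
  const here = parseNavPath(current);
  if (here.length < 2) return null;
  const wanted = here.slice(0, -1);
  return all.find((v) => samePath(parseNavPath(v), wanted)) ?? null;
}
```

Since the hierarchy lives entirely in the attribute value, none of these lookups depend on how the elements are nested in the document, which is the flexibility argued for above. The results of firstChild and parentOf could also drive the style cue: a non-null parent or child tells the script which stroke to apply.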

Future Directions and Critiques

While as always I applaud SVG's versatility and accessibility, there is room for improvement. If directional navigation is included in the Spec (as of the March 18, 2004 Working Draft, there is no indication of this), and there is some consideration for hierarchical navigation, I think it will make a very robust accessible application architecture. More details need to come out about such things as inheritance and how focus can be gained at a parent level, and about what sort of default behavior is to be expected when an element has focusable=“true” and a nav-index value, but also has display=“none”, visibility=“hidden”, or opacity=“0”, or when the focusable element is outside the current viewport. I anticipate that an element with display=“none” would simply not take focus, since the element isn't rendered, but the other three cases are trickier. It would be confusing to a user to tab onto one or more invisible elements (as on a hidden dialog). While the author could simply take care to set “focusable” to “false”, I think this should be taken care of by the UA in most cases. This implies that there might be an explicit refocusing on the next available element, as I described above. When the focusable element is simply offscreen, I could see two reasonable behaviors: pan to the element, or zoom out to a point where it's visible. I believe that a link in SVG does the former, so that would be fine, though it would be nice if the author or user could specify which behavior they would prefer.
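
One possible policy for the refocusing behavior discussed above is sketched below in TypeScript. This is only an illustration of one choice a UA or script could make, not specified behavior: it assumes that elements with display “none”, visibility “hidden”, or zero opacity are all skipped, and the FocusCandidate shape and function names are hypothetical.

```typescript
// Hypothetical snapshot of the properties relevant to taking focus.
interface FocusCandidate {
  navIndex: number;
  focusable: boolean;
  display: string; // e.g. "inline" or "none"
  visibility: string; // e.g. "visible" or "hidden"
  opacity: number;
}

// Assumed policy: anything the user cannot see is not allowed to take focus.
function canTakeFocus(c: FocusCandidate): boolean {
  return c.focusable && c.display !== "none" && c.visibility !== "hidden" && c.opacity > 0;
}

// Tab forward from the current nav-index, skipping ineligible elements and
// wrapping around. Returns null if nothing at all can take focus.
function nextVisibleNavIndex(candidates: FocusCandidate[], current: number): number | null {
  const sorted = [...candidates].sort((a, b) => a.navIndex - b.navIndex);
  const start = sorted.findIndex((c) => c.navIndex === current);
  for (let offset = 1; offset <= sorted.length; offset++) {
    const c = sorted[(start + offset) % sorted.length];
    if (canTakeFocus(c)) return c.navIndex;
  }
  return null;
}
```

Elements outside the viewport are deliberately not excluded here, since panning or zooming to them is the behavior suggested above.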

But man does not perceive by graphics alone. Since, as far as I can find, only Microsoft's Internet Explorer on the Windows platform offers this speech synthesis API for free, there is a severe limitation on how accessible such examples as this can be cross-platform and cross-browser. Even this example is flawed from the perspective of ease of implementation or use, since it requires the SVG to be embedded in an HTML document with the ActiveX speech plug-in, and cannot stand alone. The Opera browser now has speech-recognition control, but does not seem to have speech synthesis. As far as I can find, there is no free implementation of a standardized way to actively send text strings to be voiced via a browser speech synthesizer; screen readers do not allow for this fine degree of content control. This may change now that VoiceXML and SSML are W3C Recommendations. I have been talking with several people who share my concerns, and we hope to have a hand in the creation of a universal, basic, standards-based speech-synthesis browser plug-in. Of course, not everyone would want to have content voiced at them (I, for one, find my calculator above rather annoying), and those who do may not want to be inundated with it all the time; thus, there should be a simple way to activate or deactivate this functionality to any user's preference, either in the viewer or the voicing plug-in, or in the application itself.

All of the examples above are predicated on the notion that accessibility for a set of people with limited abilities means greater accessibility for everyone else as well; all of the keyboard navigation could just as easily be used by someone with a preference for keyboard input as by someone who has difficulty using a mouse. And, of course, it is equally friendly to people who do use a mouse. I think it is crucial to convince people that it is as easy (or easier) to use accessible techniques to make a good product as it is to leave accessibility out. And a hallmark of a good product is that it is intuitive and easy for any given user.


Many thanks to Jim Ley and Dave Pawson, who first experimented with SVG+TTS, and from whose code I learned to use the speech synthesis engine. Please see Jim's great examples. Thanks also to Jonathan Chetwynd and Jan-Klaas Kollhof, who got me thinking about these issues.
