The goal of this research is to study the effectiveness of methods that can be used to assess the accessibility of web sites and applications.
In particular, a method called "Barrier Walkthrough" was devised, taught to a group of students, and experimentally evaluated in order to compare its effectiveness with that of the more common method called "Conformance Testing".
The barrier walkthrough method aims at determining the level of accessibility of a web site, defined as:
web sites are accessible when individuals with impairments can access and use them as effectively and securely as people who are not impaired (Slatin and Rush, 2003)
One underlying hypothesis is that web accessibility is currently achieved poorly partly because it can be tested and measured only poorly. With more knowledge about the different methods, and their strengths and weaknesses, this can change.
In general, comparison of methods should be aimed at understanding their validity, usefulness, reliability and efficiency, which are defined as:
More details are available from the paper A Comparative Test of Web Accessibility Evaluation Methods.
The goal is to formally compare the effectiveness and reliability of Barrier Walkthrough (BW) against Conformance Review (CR), using the technical requirements entailed by the Italian Accessibility Law.
An experiment was set up whereby 12 of my students (who had attended a series of lectures on web accessibility, including on BW) each evaluated two pages using BW and another two pages using CR. Websites, methods and order were randomized to counterbalance fatigue and learning effects.
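The assignment procedure described above can be sketched as follows. This is a hypothetical illustration, not the actual experimental script: the page labels, participant names and seed are invented, and the real study also randomized which websites were used.

```python
import random

# Illustrative page labels; the real experiment used actual web pages.
PAGES = ["page_A", "page_B", "page_C", "page_D"]

def assign(participants, seed=0):
    """For each participant, randomly decide which two pages are
    evaluated with BW and which two with CR, and randomize the order
    of the four tasks (to counterbalance fatigue/learning effects)."""
    rng = random.Random(seed)
    plan = {}
    for p in participants:
        pages = PAGES[:]
        rng.shuffle(pages)                   # randomize page order
        methods = ["BW", "BW", "CR", "CR"]
        rng.shuffle(methods)                 # randomize method-to-page pairing
        plan[p] = list(zip(pages, methods))
    return plan

plan = assign([f"S{i}" for i in range(1, 13)])
```

Each participant thus receives four tasks, exactly two per method, in a randomized order.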
Data from the experiment are available as a compressed tar file (for other researchers who might be interested in further studies).
We found significant differences in (please refer to the paper mentioned above for details):
More details are available from the paper Web Accessibility Testing: When the Method is the Culprit.
Nineteen accessibility reports produced by student teams were analyzed; 11 were based on barrier walkthrough. A judge (myself) went through the reports, identified the reported problems and classified them as true problems or false ones (i.e. mistakes made by the students). The judge also rated the severity of each problem on a 1-2-3 scale (3 being the worst).
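The classification step above can be summarized per report as a correctness ratio (share of reported problems judged true) and a breakdown of true problems by severity. This is an illustrative sketch under assumed data shapes, not the actual analysis code; the example report is invented.

```python
# Each reported problem is modeled as a (is_true, severity) pair,
# with severity on the 1-2-3 scale used by the judge (3 = worst).
def summarize(report):
    true_severities = [sev for is_true, sev in report if is_true]
    # Correctness: fraction of reported problems that are true problems.
    correctness = len(true_severities) / len(report) if report else 0.0
    # Count of true problems at each severity level.
    by_severity = {s: true_severities.count(s) for s in (1, 2, 3)}
    return correctness, by_severity

# Hypothetical report: four reported problems, one a student mistake.
example = [(True, 3), (True, 1), (False, 2), (True, 2)]
correctness, by_severity = summarize(example)
# correctness == 0.75; by_severity == {1: 1, 2: 1, 3: 1}
```

Aggregating such per-report summaries across the 19 reports is what allows the two methods to be compared on the variables defined below.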
Data were analyzed by identifying:
These variables are defined as:
With respect to novice evaluators, and given the limited sample: