The VERDI System

Here you can find an experimental evaluation of the Web verification system VERDI (VErification and Rewriting for Debugging Internet Sites), which is a system prototype being able to detect missing and incomplete Web pages w.r.t. a given formal specification.

The system implements a rewriting-based, formal specification language which allows us to define syntactic as well as semantic properties of Web sites, which are given as finite sets of semistructured expressions, and a verification technique that obtains the requirements not fulfilled by the Web site, and helps to repair the errors by finding out incomplete information and/or missing pages. The methodology is based on a novel rewriting-based technique, called partial rewriting, in which the traditional pattern matching mechanism is replaced by tree simulation, a suitable technique for recognizing patterns inside semistructured documents.

VERDI is written in DrScheme v205 , the implementation consists of about 80 function definitions (approximately 1000 lines of source code). It includes a parser for semistructured expressions and Web specifications, and several modules implementing the user interface, the partial rewriting mechanism and the verification technique. The system allows the user to load a Web site consisting of a finite set of semistructured expressions together with a Web specification. Additionally, he/she can inspect the loaded data and finally check the Web pages w.r.t the Web site specification. The user interface is guided by textual menus, which are (hopefully) self-explaining.

You can find the VERDI distribution containing the source code and some examples here

A new release of the system VERDI, which includes a full parser for XHMTL pages and some examples for the verification of real Web sites, is available here .

In the following we provide some examples in order to show how to use it.

Example 1

Consider a Web site containing information about a research group such as group member affiliation, scientific publications, personal data. A Web site W containing such information is provided here .
We are interested to first model and, then, check the following properties in W.

A VERDI Web specification I which is suitable for modeling these properties follows.

                  member(name(X),surname(Y)) -> #hpage(name(X),#surname(#Y),status())
                  hpage(status(professor)) -> #hpage(#status(#professor),teaching())
                  pubs(pub(name(X),surname(Y))) -> #member(name(X),surname(Y))

By executing the VERDI system on Web site W and Web specification I, we are able to find the following errors.
Warning: page  #hpage(name ('Woody'),#surname(#'Allen'),status()) is missing!

Warning: page  hpage(name('John'),surname('Ford'), 
                     status('Professor'),phone ('3333'), 
                     hobbies(hobby('gardening'),hobby('reading')))
         
         is incomplete w.r.t. requirement 
                
               #hpage(#status(#'Professor'),teaching())

Warning: page  members(member(name('John'),surname('Ford'),status('Professor')) 
                       member(name('Boris'),surname('Karloff'),status('Technician')) 
                       member(name('Woody'),surname('Allen'),status('Student'))) 
         
         is incomplete w.r.t. requirement 

               #member(name('Gwyneth'),surname('Paltrow'))


Example 2

Consider the Web site W that you can download here . We want to model the following properties. A VERDI Web specification I which models these requirements could be the following.

                 html(head()) -> #html(#head(title()))
                 link(X) -> #html(#head(#title(#X))) 
                 list() -> #list(listelement())

VERDI is able to detect the following errors in W w.r.t. I.
Warning: page  #html(#head(#title(#'link1'))) is missing!

Warning: page  #html(#head (#title(#'link2'))) is missing!

Warning: page  html(head('errorintitle'),body(link('link2'))) 

         is incomplete w.r.t. requirement 

               #html (#head(title()))

Warning: page  #html(#head(#title(#'link3'))) is missing!

Warning: page  html(head(title('example5')),
                    body(image('image1'),
                         link('link3'), 
                         list())) 

         is incomplete w.r.t. requirement 

               #list(listelement())


Example 3

Consider a Web site W of an electronic store which contains a Web page for each article to be sold. Article Web pages display information about the name, the features and the price of the product. Moreover, the Web site also includes a Web page which summarizes the prices of all the articles (the price list). Here you can find Web site W. Now, we are interested in checking the properties below. A VERDI specification for W might be the following.

                 article(name(X),price(Y)) -> #pricelist(#articledata(price(Y),#name(#X)))
                 article() -> #article(features(feature()))

By running VERDI on Web site W and the specification system, we get the following error messages.
Warning: page  #pricelist(#articledata(price('30'),#name(#'C'))) is missing!

Warning: page  article(price('200'),name('D')) 

         is incomplete w.r.t. requirement 

               #article(features(feature()))

Warning: page  article(name('C'),price('30'),features('NotAvailable')) 
         
         is incomplete w.r.t. requirement 
         
               #article(features(feature())))