Tree Distance module for Headliner

Package Description

This package provides implementations of various algorithms for determining the 'distance' between two trees.

Last Modified: Nov 4, 2007

Installation and Dependencies

You need to have Java 1.5 installed.

Usage

You can run the tool directly from the command line by executing:
java -classpath treedistance.jar headliner.treedistance.TestZhangShasha "<tree1>" "<tree2>"

Trees are defined as a series of ordered edges of the form nodeLabel-nodeLabel. Edges are separated by semicolons. The first nodeLabel mentioned is assumed to be the root of the tree. A nodeLabel can be any string that does not contain the semicolon character. The order in which you list the edges is the order that siblings will have in the tree; this matters because Zhang and Shasha's algorithm is for ordered trees. In the first example below, the node b precedes c.
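
The edge format described above can be illustrated with a small, hypothetical parser. This is only a sketch and not the package's own code: the class name EdgeListSketch, its methods, and the sample string "a-b;a-c;b-d" are invented for illustration. It also assumes that each edge reads parent-child and that the first hyphen in an edge separates the two labels; the description above does not state either point explicitly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of treedistance.jar.
public class EdgeListSketch {

    // Splits one edge into {first label, second label}.
    // Assumption: the first hyphen is the separator.
    public static String[] splitEdge(String edge) {
        int dash = edge.indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException("Not an edge: " + edge);
        }
        return new String[] {
            edge.substring(0, dash).trim(),
            edge.substring(dash + 1).trim()
        };
    }

    // The first node label mentioned is taken to be the root.
    public static String root(String spec) {
        return splitEdge(spec.split(";")[0])[0];
    }

    // Maps each parent (assumed to be the first label of an edge) to its
    // children, preserving the order in which the edges were listed.
    public static Map<String, List<String>> children(String spec) {
        Map<String, List<String>> result = new LinkedHashMap<String, List<String>>();
        for (String edge : spec.split(";")) {
            if (edge.trim().length() == 0) {
                continue; // tolerate a trailing semicolon
            }
            String[] labels = splitEdge(edge);
            List<String> kids = result.get(labels[0]);
            if (kids == null) {
                kids = new ArrayList<String>();
                result.put(labels[0], kids);
            }
            kids.add(labels[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        String spec = "a-b;a-c;b-d"; // illustrative input only
        System.out.println(root(spec));     // prints a
        System.out.println(children(spec)); // prints {a=[b, c], b=[d]}
    }
}
```

Because a LinkedHashMap and ArrayList keep insertion order, b is listed before c under a, which mirrors the sibling ordering rule.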
Examples:

The onus is on the user to ensure that the edges specify a tree (that is, no cycles). It may be the case that you wish to have multiple distinct nodes with the same "label" (for the purposes of aligning nodes). For example, a phrase structure parse tree may have multiple Determiner Part-of-Speech nodes. Similarly, a dependency tree might have multiple determiners and other closed-class words.
You can specify distinct nodes that share a label by appending a ":" and a suffix to the label, which turns the node label into a unique identifier. However, when it comes to matching, only the string before the (last) colon is treated as the node label.
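
The matching rule above (only the text before the last colon counts as the label) can be sketched as follows. The class NodeLabelSketch and the identifiers such as "det:1" are hypothetical and chosen only for illustration; the package's own implementation may differ.

```java
// Hypothetical helper, not part of treedistance.jar.
public class NodeLabelSketch {

    // Returns the part of the identifier before the last colon,
    // or the whole identifier when it contains no colon.
    public static String label(String nodeId) {
        int colon = nodeId.lastIndexOf(':');
        return colon < 0 ? nodeId : nodeId.substring(0, colon);
    }

    // Two distinct nodes are treated as matching when their labels are equal.
    public static boolean sameLabel(String first, String second) {
        return label(first).equals(label(second));
    }

    public static void main(String[] args) {
        System.out.println(label("det:1"));              // prints det
        System.out.println(label("a:b:2"));              // prints a:b
        System.out.println(sameLabel("det:1", "det:2")); // prints true
    }
}
```

Using lastIndexOf rather than indexOf means a label may itself contain colons; only the final one starts the suffix.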
Example: These are identical trees

Licence:

Copyright (C) 2004 Stephen Wan

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

To contact the author, send mail to swan@ics.mq.edu.au

@author Stephen Wan