LINGUIST List 19.166

Tue Jan 15 2008

Software: Release of Parallel Treebank Corpus and Tool

Editor for this issue: Hannah Morales <hannah@linguistlist.org>


        1.    Martin Volk, Release of Parallel Treebank Corpus and Tool


Message 1: Release of Parallel Treebank Corpus and Tool
Date: 15-Jan-2008
From: Martin Volk <volk@ling.su.se>
Subject: Release of Parallel Treebank Corpus and Tool

The Computational Linguistics Group at the Department of Linguistics at Stockholm University makes available an aligned parallel treebank (called SMULTRON) and an accompanying alignment and query tool (called the Stockholm TreeAligner).

SMULTRON (Stockholm MULtilingual TReebank) is a parallel treebank containing around 1000 sentences in English, German and Swedish. The sentences have been PoS-tagged and annotated with phrase structure trees. The trees have been aligned across languages at the sentence, phrase and word level. Additionally, the German and Swedish monolingual treebanks contain lemma information.

SMULTRON is freely available for research purposes from http://www.ling.su.se/DaLi/research/smultron/index.htm

The Stockholm TreeAligner allows the user to view alignment links across two parallel trees. It also allows the user to create and modify such links between corresponding nodes or words in two treebanks.

The Stockholm TreeAligner displays trees from input files in TigerXML format with node labels, edge labels, and crossing branches, making it useful for browsing TigerXML files.
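For readers unfamiliar with TigerXML, the following Python sketch shows roughly how one sentence in that format can be read and printed as a labelled bracketing. The sample sentence, its labels, and the helper name bracketed are invented for illustration and are not taken from SMULTRON or the TreeAligner; the element and attribute names (s, graph, terminals, t, nonterminals, nt, edge, idref) follow the general TigerXML layout, and real files may carry additional attributes such as lemma.

```python
import xml.etree.ElementTree as ET

# Invented sample sentence in TigerXML layout (an assumption for illustration,
# not an excerpt from SMULTRON).
SAMPLE = """
<corpus id="demo">
  <body>
    <s id="s1">
      <graph root="s1_501">
        <terminals>
          <t id="s1_1" word="The" pos="DT"/>
          <t id="s1_2" word="tree" pos="NN"/>
          <t id="s1_3" word="grows" pos="VBZ"/>
        </terminals>
        <nonterminals>
          <nt id="s1_500" cat="NP">
            <edge label="--" idref="s1_1"/>
            <edge label="HD" idref="s1_2"/>
          </nt>
          <nt id="s1_501" cat="S">
            <edge label="SBJ" idref="s1_500"/>
            <edge label="HD" idref="s1_3"/>
          </nt>
        </nonterminals>
      </graph>
    </s>
  </body>
</corpus>
"""


def bracketed(sentence):
    """Return a bracketed string for one <s> element (hypothetical helper)."""
    graph = sentence.find("graph")
    # Index terminal (word) and nonterminal (phrase) nodes by their id.
    terminals = {t.get("id"): t for t in graph.iter("t")}
    nonterminals = {nt.get("id"): nt for nt in graph.iter("nt")}

    def render(node_id, edge_label=None):
        # Prefix each non-root node with the label of its incoming edge.
        prefix = f"{edge_label}:" if edge_label else ""
        if node_id in terminals:
            t = terminals[node_id]
            return f"({prefix}{t.get('pos')} {t.get('word')})"
        nt = nonterminals[node_id]
        children = " ".join(
            render(e.get("idref"), e.get("label")) for e in nt.findall("edge")
        )
        return f"({prefix}{nt.get('cat')} {children})"

    # The graph's root attribute names the top node of the tree.
    return render(graph.get("root"))


root = ET.fromstring(SAMPLE)
for s in root.iter("s"):
    print(s.get("id"), bracketed(s))
```

Because children are referenced by id rather than nested in the XML, a node may dominate words that are not adjacent in the sentence, which is how the format can encode crossing branches.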

Moreover, the Stockholm TreeAligner allows querying parallel treebanks. Its query language is inspired by the TIGERSearch query language but additionally allows alignment queries. Search results are highlighted in a graphical display.

The Stockholm TreeAligner is free software and can be downloaded from http://www.ling.su.se/dali/downloads/treealigner/index.htm

Linguistic Field(s): Computational Linguistics; Syntax; Text/Corpus Linguistics