
General information
*********************

UAM Text Tools (UTT) is a package of language processing tools
developed at Adam Mickiewicz University. Its functionality includes:
* tokenization
* dictionary-based morphological analysis
* heuristic morphological analysis of unknown words
* spelling correction
* pattern search
* sentence splitting
* generation of concordance tables
                     
The toolkit is intended for processing raw (unannotated),
unrestricted text for any purpose.
                        

Installation
**************

1) unpack the UTT tar archive
2) in the same directory, unpack the tar archives of all UTT dictionary modules you have
3) run
	make install
   in the root directory of the installation
4) add the bin directory to the PATH variable
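
The steps above can be sketched as a pair of shell functions. This is
only an illustration under stated assumptions: the archive name, the
name of the unpacked root directory, and the location of the bin
directory directly under that root are placeholders, not names taken
from the UTT distribution. The function names are hypothetical too.

```shell
# All file and directory names below are assumptions for illustration.
UTT_ARCHIVE="${UTT_ARCHIVE:-utt.tar.gz}"   # hypothetical archive name
UTT_ROOT="${UTT_ROOT:-$PWD/utt}"           # hypothetical unpacked root

# Steps 1-3: unpack UTT, unpack the dictionary modules given as
# arguments into the same directory, then run 'make install'.
utt_install() {
    tar -xf "$UTT_ARCHIVE" || return 1
    for module in "$@"; do
        tar -xf "$module" || return 1
    done
    (cd "$UTT_ROOT" && make install) || return 1
}

# Step 4: prepend the bin directory to PATH unless it is already there.
utt_add_to_path() {
    case ":$PATH:" in
        *":$UTT_ROOT/bin:"*) ;;
        *) PATH="$UTT_ROOT/bin:$PATH"; export PATH ;;
    esac
}

# Example call (the module archive name is hypothetical):
#   utt_install utt-dict-pl.tar.gz && utt_add_to_path
```

To make the PATH change permanent, the same assignment would normally
go into your shell startup file.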


Requirements
*************

* File::HomeDir

  the Perl package File::HomeDir must be installed
  (to install it, run 'perl -MCPAN -e shell' and type
   'install File::HomeDir' at the 'cpan>' prompt)
   
* flex

  to run the ser component, flex must be installed on your system

* ruby

  to run the tre component, ruby must be installed on your system

* locale pl_PL.iso-8859-2

  the locale pl_PL.iso-8859-2 (pl_PL for short) must be installed
  and active when using UTT with the Polish module. The text you
  process with UTT must be encoded in iso-8859-2.
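
The requirements listed above can be checked before installation with
a short script. The sketch below is not part of UTT; the helper names
(check_cmd, check_perl_module, check_locale) are hypothetical. Locale
names are spelled differently from system to system, so the locale
check may need adjusting.

```shell
# Sketch of a prerequisite check. The helper names are hypothetical
# and are not part of UTT.

# Report whether a command is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Report whether a Perl module can be loaded.
check_perl_module() {
    if perl -M"$1" -e 1 >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Report whether a Polish ISO-8859-2 locale is installed.
# 'locale -a' spells charset names differently across systems
# (for example pl_PL.iso88592), so the match is deliberately loose.
check_locale() {
    if locale -a 2>/dev/null | grep -Eiq '^pl_PL\.iso-?8859-?2$'; then
        echo "ok: pl_PL.iso-8859-2"
    else
        echo "missing: pl_PL.iso-8859-2"
    fi
}

check_perl_module File::HomeDir
check_cmd flex
check_cmd ruby
check_locale
```

Each line of output starts with 'ok:' or 'missing:'. Text in another
encoding, such as UTF-8, can usually be converted beforehand with
'iconv -f UTF-8 -t ISO-8859-2', provided it contains only characters
that exist in iso-8859-2.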