u-versions (utf8) of principal programs

Tomasz Obrębski 2016-12-05 20:13:07 +01:00
parent 4a25a92db7
commit fb40f34eaa
61 changed files with 1742 additions and 1959 deletions

2
.gitignore vendored Normal file

@ -0,0 +1,2 @@
*~
\#*\#


@ -42,4 +42,4 @@ ifdef DOC_DIR
 endif
 clean:
-	rm utt.info utt.dvi utt.html utt.pdf utt.ps || true
+	rm -f *.info *.dvi *.html *.pdf *.ps *.log *.aux

209
doc/dgp.tex Normal file

@ -0,0 +1,209 @@
\documentclass[a4paper]{report}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\title{DGP}
\author{Tomasz Obrębski}
\begin{document}
\maketitle
\chapter{Introduction}
\chapter{Grammar}
\chapter{Parsing algorithm}
\chapter{Input}
The input for the parser is prepared as follows:
\begin{verbatim}
cat text.txt | tok | sen | lem | canonize | gph | dgp ...
\end{verbatim}
Input file

dgp takes a word graph as its input. The node numbers of this
graph are the values of the gph field. This field is added to the file by
the gph program.
Besides the gph field, dgp also reads the value of the lem field.
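
As an illustration only, the word graph can be recovered from the gph fields with a few lines of Python. The sketch below assumes that the field has the layout \verb|gph:<node>:<predecessors>|, where the predecessors are a comma-separated list of node numbers; this layout is inferred from the example figures in this document and is not a specification. The helper names \verb|parse_gph| and \verb|read_word_graph| are hypothetical.

```python
import re

# Assumed layout (inferred from the example figures, not a specification):
#   gph:<node>:<pred1>,<pred2>,...
GPH_RE = re.compile(r"\bgph:(\d+):([\d,]*)")


def parse_gph(line):
    """Return (node, predecessors) for a line carrying a gph field, else None."""
    match = GPH_RE.search(line)
    if match is None:
        return None
    node = int(match.group(1))
    predecessors = [int(p) for p in match.group(2).split(",") if p]
    return node, predecessors


def read_word_graph(lines):
    """Map each word-graph node number to the list of its predecessor nodes."""
    graph = {}
    for line in lines:
        parsed = parse_gph(line)
        if parsed is not None:
            node, predecessors = parsed
            graph[node] = predecessors
    return graph
```

Lines without a gph field (for example the separator segments of type S) are simply skipped, so the resulting dictionary contains only the nodes of the word graph.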
\chapter{Output}
Format:
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:4:1,2,3 dgp:6;s
\begin{verbatim}
dgp:<node>;<saturation>[;<links>][;<sets>][;<constraints>]
\end{verbatim}
\begin{description}
\item[{\it node}] Dependency graph node number.
\item[saturation] The information whether the node is saturated. A
node is saturated if the list of required connections for this node
is empty, it is unsaturated otherwise.
\item[links] The comma separated list of connections. For each node
either the list of its dependents or the list of its heads may be
printed, or both (this depends on the value of the \verb|--info|
parameter).
\item[sets] For each node, the sets of all its left neighbours,
transitive left heads, transitive left dependents, and nodes visible
on the left can be printed. (This information is useful for fast
tree generation.)
\item[constraints] The information on constraints imposed on the
node. Constraints follow from the SGL and REQ grammar rules and have
the form of a comma-separated list of dependency types required by
the node and forbidden for the node. The elements of the list have
the following format:
\begin{tabular}{ll}
\verb|!|{\it dependency type} & {\it dependency type} is required\\
\verb|&|{\it dependency type} & {\it dependency type} is forbidden
\end{tabular}
\end{description}
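
The field layout above can be split mechanically. The following Python sketch is not part of dgp; it assumes the four-part layout that appears in the example output later in this chapter (node, saturation, links, constraints) and does not handle the optional sets part. The function name \verb|parse_dgp| is hypothetical. The constraint markers are kept as they appear in the field, without interpretation.

```python
import re


def parse_dgp(field):
    """Split a dgp field into node, saturation, links and constraints.

    Assumption: the field has at most four ';'-separated parts, as in the
    example output (node;saturation;links;constraints). The optional sets
    part is not handled by this sketch.
    """
    if not field.startswith("dgp:"):
        raise ValueError("not a dgp field: %r" % field)
    parts = field[len("dgp:"):].split(";")
    parts += [""] * (4 - len(parts))  # pad the optional parts if absent
    node, saturation, links, constraints = parts[:4]
    return {
        "node": int(node),
        "saturated": saturation == "s",
        "links": [link for link in links.split(",") if link],
        # Each constraint is kept as (marker, dependency type); the marker
        # is the literal '!' or '&' character found in the field.
        "constraints": re.findall(r"([!&])([^!&,]+)", constraints),
    }
```

For instance, \verb|parse_dgp("dgp:11;u;;&pcmpl")| yields node 11, marked as unsaturated, with no links and a single constraint entry for \verb|pcmpl|.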
The result produced by dgp is a dependency graph. This graph may contain
(and usually does) more nodes than the input graph.

* node number in the output dependency graph

The node numbers in the output graph differ from those in the input graph.
While running, the parser creates copies (clones) of input nodes. This
happens when a dependency subject to constraints is attached to a node
(as its head). The constraints follow from the SGL and OBL grammar
rules.

SGL - single (at most one) dependency

OBL - obligatory dependency

node saturation \verb|s| or \verb|u|

s - saturated node

u - unsaturated node

An unsaturated node is one that lacks an obligatory dependent.
Obligatory dependents are specified in the OBL rules of the grammar.
connections

* connection list

The connection list is a comma-separated sequence of expressions of the form

\verb|--<type>-<w1>/<w2>|

if 'd' (for dependents) was given among the values of the \verb|--info|
parameter when the program was invoked, or

\verb|++<type>-<w1>/<w2>|

if 'h' was given among the values of the \verb|--info| parameter.
The list may also contain both kinds of expressions if both 'd' and 'h' were given.

The expression

\verb|--<type>-<w1>/<w2>|

denotes a possible dependency of type \verb|<type>| whose head is the current node and whose dependent is
node \verb|<w1>| (\verb|<w2>| is explained later).
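
A single connection expression can be decomposed with a regular expression. The Python sketch below is only an illustration; the name \verb|parse_link| and the keys of the returned dictionary are hypothetical. It records which \verb|--info| value the expression corresponds to ('d' for the \verb|--| form, 'h' for the \verb|++| form) and returns the two node numbers without interpreting the second one.

```python
import re

# <marker><type>-<w1>/<w2>, where <marker> is "--" or "++".
# The dependency type may itself contain underscores (e.g. cmpl_ga).
LINK_RE = re.compile(r"^(--|\+\+)(.+)-(\d+)/(\d+)$")


def parse_link(expr):
    """Decompose one connection expression into its components."""
    match = LINK_RE.match(expr)
    if match is None:
        raise ValueError("not a connection expression: %r" % expr)
    marker, dep_type, w1, w2 = match.groups()
    return {
        # 'd': printed when 'd' is among the --info values ("--" form),
        # 'h': printed when 'h' is among the --info values ("++" form).
        "info": "d" if marker == "--" else "h",
        "type": dep_type,
        "w1": int(w1),
        "w2": int(w2),
    }
```

Applied to \verb|--cmpl_ga-10/3|, the function reports the type \verb|cmpl_ga| with the node numbers 10 and 3.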
Example sentence: Pies goni czarnego kota w butach. (The dog chases the black cat in boots.)
\begin{figure}
\begin{verbatim}
0000 00 BOS *
0000 04 W Pies lem:pies,N/CnGaNs
0004 01 S _
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp
0009 01 S _
0010 08 W czarnego lem:czarny,ADJ/CaDpGapNs
0010 08 W czarnego lem:czarny,ADJ/CgDpGainpNs
0018 01 S _
0019 04 W kota lem:kota,N/CnGfNs
0019 04 W kota lem:kot,N/CaGaNs
0019 04 W kota lem:kot,N/CgGaNs
0023 01 S _
0024 01 W w lem:w,P/Cal
0025 01 S _
0026 06 W butach lem:buta,N/ClGfNp
0026 06 W butach lem:but,N/ClGiNp
0032 01 P .
0033 01 S \n
0034 00 EOS *
\end{verbatim}
\caption{Output of \texttt{tok | sen | lem | canonize}}
\end{figure}
\begin{figure}
\scriptsize
\begin{verbatim}
0000 00 BOS * gph:0:
0000 04 W Pies lem:pies,N/CnGaNs gph:1:0
0004 01 S _
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1
0009 01 S _
0010 08 W czarnego lem:czarny,ADJ/CaDpGapNs gph:3:2
0010 08 W czarnego lem:czarny,ADJ/CgDpGainpNs gph:4:2
0018 01 S _
0019 04 W kota lem:kota,N/CnGfNs gph:5:3,4
0019 04 W kota lem:kot,N/CaGaNs gph:6:3,4
0019 04 W kota lem:kot,N/CgGaNs gph:7:3,4
0023 01 S _
0024 01 W w lem:w,P/Cal gph:8:5,6,7
0025 01 S _
0026 06 W butach lem:buta,N/ClGfNp gph:9:8
0026 06 W butach lem:but,N/ClGiNp gph:10:8
0032 01 P .
0033 01 S \n
0034 00 EOS * gph:11:9,10
\end{verbatim}
\caption{Word graph representation: sentence annotated with gph.}
\end{figure}
\begin{figure}
\scriptsize
\begin{verbatim}
0000 00 BOS * gph:0: dgp:0;s;;
0000 04 W Pies lem:pies,N/CnGaNs gph:1:0 dgp:1;s;;
0004 01 S _
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1 dgp:2;s;;
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1 dgp:3;s;--subj-1/2;!subj
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1 dgp:8;s;--cmpl_ga-7/3,--cmpl_ga-10/3,--prep-11/8;!subj!cmpl_ga
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1 dgp:9;s;--cmpl_ga-7/2,--cmpl_ga-10/2,--prep-11/9;!cmpl_ga
0009 01 S _
0010 08 W czarnego lem:czarny,ADJ/CaDpGapNs gph:3:2 dgp:4;s;;
0010 08 W czarnego lem:czarny,ADJ/CgDpGainpNs gph:4:2 dgp:5;s;;
0018 01 S _
0019 04 W kota lem:kota,N/CnGfNs gph:5:3,4 dgp:6;s;--prep-11/6;
0019 04 W kota lem:kot,N/CaGaNs gph:6:3,4 dgp:7;s;--mod-4/7,--prep-11/7;
0019 04 W kota lem:kot,N/CgGaNs gph:7:3,4 dgp:10;s;--mod-5/10,--prep-11/10;
0023 01 S _
0024 01 W w lem:w,P/Cal gph:8:5,6,7 dgp:11;u;;&pcmpl
0024 01 W w lem:w,P/Cal gph:8:5,6,7 dgp:13;s;--pcmpl-12/11,--pcmpl-14/11;!pcmpl
0025 01 S _
0026 06 W butach lem:buta,N/ClGfNp gph:9:8 dgp:12;s;;
0026 06 W butach lem:but,N/ClGiNp gph:10:8 dgp:14;s;;
0032 01 P .
0033 01 S \n
0034 00 EOS * gph:11:9,10 dgp:15;s;;
\end{verbatim}
\caption{dgp output}
\end{figure}
\end{document}

14
doc/dgp/WARNINGS Normal file

@ -0,0 +1,14 @@
No implementation found for style `fontenc'
? brace missing for \contentsline
couldn't convert character `tilde into available encodings
...set $ACCENT_IMAGES to get an image
No number for "outputof"
Failed to convert image /tmp/l2h18811/image003.ps
Failed to convert image /tmp/l2h18811/image002.ps
Failed to convert image /tmp/l2h18811/image001.ps

35
doc/dgp/dgp.css Normal file

@ -0,0 +1,35 @@
/* Century Schoolbook font is very similar to Computer Modern Math: cmmi */
.MATH { font-family: "Century Schoolbook", serif; }
.MATH I { font-family: "Century Schoolbook", serif; font-style: italic }
.BOLDMATH { font-family: "Century Schoolbook", serif; font-weight: bold }
/* implement both fixed-size and relative sizes */
SMALL.XTINY { font-size : xx-small }
SMALL.TINY { font-size : x-small }
SMALL.SCRIPTSIZE { font-size : smaller }
SMALL.FOOTNOTESIZE { font-size : small }
SMALL.SMALL { }
BIG.LARGE { }
BIG.XLARGE { font-size : large }
BIG.XXLARGE { font-size : x-large }
BIG.HUGE { font-size : larger }
BIG.XHUGE { font-size : xx-large }
/* heading styles */
H1 { }
H2 { }
H3 { }
H4 { }
H5 { }
/* mathematics styles */
DIV.displaymath { } /* math displays */
TD.eqno { } /* equation-number cells */
/* document-specific styles come next */
DIV.navigation { }
SPAN.normalfont { }
PRE.preform { }
SPAN.it { }
SPAN.arabic { }

72
doc/dgp/dgp.html Normal file

@ -0,0 +1,72 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--Converted with LaTeX2HTML 2008 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>DGP</TITLE>
<META NAME="description" CONTENT="DGP">
<META NAME="keywords" CONTENT="dgp">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META NAME="Generator" CONTENT="LaTeX2HTML v2008">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="dgp.css">
<LINK REL="next" HREF="node1.html">
</HEAD>
<BODY >
<DIV CLASS="navigation"><!--Navigation Panel-->
<A NAME="tex2html5"
HREF="node1.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="/usr/share/latex2html/icons/next.png"></A>
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="/usr/share/latex2html/icons/up_g.png">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="/usr/share/latex2html/icons/prev_g.png">
<BR>
<B> Next:</B> <A NAME="tex2html6"
HREF="node1.html">Introduction</A>
<BR>
<BR></DIV>
<!--End of Navigation Panel-->
<H1 ALIGN="CENTER">DGP</H1>
<DIV CLASS="author_info">
<P ALIGN="CENTER"><STRONG>Tomasz Obr&#196;&#153;bski</STRONG></P>
</DIV>
<BR><HR>
<!--Table of Child-Links-->
<A NAME="CHILD_LINKS"></A>
<UL CLASS="ChildLinks">
<LI><A NAME="tex2html7"
HREF="node1.html">Introduction</A>
<LI><A NAME="tex2html8"
HREF="node2.html">Grammar</A>
<LI><A NAME="tex2html9"
HREF="node3.html">Parsing algorithm</A>
<LI><A NAME="tex2html10"
HREF="node4.html">Input</A>
<LI><A NAME="tex2html11"
HREF="node5.html">Output</A>
<LI><A NAME="tex2html12"
HREF="node6.html">About this document ...</A>
</UL>
<!--End of Table of Child-Links-->
<BR><HR>
<ADDRESS>
to
2014-12-19
</ADDRESS>
</BODY>
</HTML>

1
doc/dgp/images.aux Normal file

@ -0,0 +1 @@
\relax

357
doc/dgp/images.log Normal file

@ -0,0 +1,357 @@
This is pdfTeX, Version 3.1415926-2.4-1.40.13 (TeX Live 2012/Debian) (format=latex 2014.12.5) 19 DEC 2014 18:49
entering extended mode
restricted \write18 enabled.
%&-line parsing enabled.
**./images.tex
(./images.tex
LaTeX2e <2011/06/27>
Babel <v3.8m> and hyphenation patterns for english, dumylang, nohyphenation, po
lish, loaded.
(/usr/share/texlive/texmf-dist/tex/latex/base/report.cls
Document Class: report 2007/10/19 v1.4h Standard LaTeX document class
(/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo
File: size10.clo 2007/10/19 v1.4h Standard LaTeX file (size option)
)
\c@part=\count79
\c@chapter=\count80
\c@section=\count81
\c@subsection=\count82
\c@subsubsection=\count83
\c@paragraph=\count84
\c@subparagraph=\count85
\c@figure=\count86
\c@table=\count87
\abovecaptionskip=\skip41
\belowcaptionskip=\skip42
\bibindent=\dimen102
) (/usr/share/texlive/texmf-dist/tex/latex/base/ifthen.sty
Package: ifthen 2001/05/26 v1.1c Standard LaTeX ifthen package (DPC)
) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty
Package: fontenc 2005/09/27 v1.99g Standard LaTeX package
(/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def
File: t1enc.def 2005/09/27 v1.99g Standard LaTeX file
LaTeX Font Info: Redeclaring font encoding T1 on input line 43.
)) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty
Package: inputenc 2008/03/30 v1.1d Input encoding file
\inpenc@prehook=\toks14
\inpenc@posthook=\toks15
(/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def
File: utf8.def 2008/04/05 v1.1m UTF-8 support for inputenc
Now handling font encoding OML ...
... no UTF-8 mapping file for font encoding OML
Now handling font encoding T1 ...
... processing UTF-8 mapping file for font encoding T1
(/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu
File: t1enc.dfu 2008/04/05 v1.1m UTF-8 support for inputenc
defining Unicode char U+00A1 (decimal 161)
defining Unicode char U+00A3 (decimal 163)
defining Unicode char U+00AB (decimal 171)
defining Unicode char U+00BB (decimal 187)
defining Unicode char U+00BF (decimal 191)
defining Unicode char U+00C0 (decimal 192)
defining Unicode char U+00C1 (decimal 193)
defining Unicode char U+00C2 (decimal 194)
defining Unicode char U+00C3 (decimal 195)
defining Unicode char U+00C4 (decimal 196)
defining Unicode char U+00C5 (decimal 197)
defining Unicode char U+00C6 (decimal 198)
defining Unicode char U+00C7 (decimal 199)
defining Unicode char U+00C8 (decimal 200)
defining Unicode char U+00C9 (decimal 201)
defining Unicode char U+00CA (decimal 202)
defining Unicode char U+00CB (decimal 203)
defining Unicode char U+00CC (decimal 204)
defining Unicode char U+00CD (decimal 205)
defining Unicode char U+00CE (decimal 206)
defining Unicode char U+00CF (decimal 207)
defining Unicode char U+00D0 (decimal 208)
defining Unicode char U+00D1 (decimal 209)
defining Unicode char U+00D2 (decimal 210)
defining Unicode char U+00D3 (decimal 211)
defining Unicode char U+00D4 (decimal 212)
defining Unicode char U+00D5 (decimal 213)
defining Unicode char U+00D6 (decimal 214)
defining Unicode char U+00D8 (decimal 216)
defining Unicode char U+00D9 (decimal 217)
defining Unicode char U+00DA (decimal 218)
defining Unicode char U+00DB (decimal 219)
defining Unicode char U+00DC (decimal 220)
defining Unicode char U+00DD (decimal 221)
defining Unicode char U+00DE (decimal 222)
defining Unicode char U+00DF (decimal 223)
defining Unicode char U+00E0 (decimal 224)
defining Unicode char U+00E1 (decimal 225)
defining Unicode char U+00E2 (decimal 226)
defining Unicode char U+00E3 (decimal 227)
defining Unicode char U+00E4 (decimal 228)
defining Unicode char U+00E5 (decimal 229)
defining Unicode char U+00E6 (decimal 230)
defining Unicode char U+00E7 (decimal 231)
defining Unicode char U+00E8 (decimal 232)
defining Unicode char U+00E9 (decimal 233)
defining Unicode char U+00EA (decimal 234)
defining Unicode char U+00EB (decimal 235)
defining Unicode char U+00EC (decimal 236)
defining Unicode char U+00ED (decimal 237)
defining Unicode char U+00EE (decimal 238)
defining Unicode char U+00EF (decimal 239)
defining Unicode char U+00F0 (decimal 240)
defining Unicode char U+00F1 (decimal 241)
defining Unicode char U+00F2 (decimal 242)
defining Unicode char U+00F3 (decimal 243)
defining Unicode char U+00F4 (decimal 244)
defining Unicode char U+00F5 (decimal 245)
defining Unicode char U+00F6 (decimal 246)
defining Unicode char U+00F8 (decimal 248)
defining Unicode char U+00F9 (decimal 249)
defining Unicode char U+00FA (decimal 250)
defining Unicode char U+00FB (decimal 251)
defining Unicode char U+00FC (decimal 252)
defining Unicode char U+00FD (decimal 253)
defining Unicode char U+00FE (decimal 254)
defining Unicode char U+00FF (decimal 255)
defining Unicode char U+0102 (decimal 258)
defining Unicode char U+0103 (decimal 259)
defining Unicode char U+0104 (decimal 260)
defining Unicode char U+0105 (decimal 261)
defining Unicode char U+0106 (decimal 262)
defining Unicode char U+0107 (decimal 263)
defining Unicode char U+010C (decimal 268)
defining Unicode char U+010D (decimal 269)
defining Unicode char U+010E (decimal 270)
defining Unicode char U+010F (decimal 271)
defining Unicode char U+0110 (decimal 272)
defining Unicode char U+0111 (decimal 273)
defining Unicode char U+0118 (decimal 280)
defining Unicode char U+0119 (decimal 281)
defining Unicode char U+011A (decimal 282)
defining Unicode char U+011B (decimal 283)
defining Unicode char U+011E (decimal 286)
defining Unicode char U+011F (decimal 287)
defining Unicode char U+0130 (decimal 304)
defining Unicode char U+0131 (decimal 305)
defining Unicode char U+0132 (decimal 306)
defining Unicode char U+0133 (decimal 307)
defining Unicode char U+0139 (decimal 313)
defining Unicode char U+013A (decimal 314)
defining Unicode char U+013D (decimal 317)
defining Unicode char U+013E (decimal 318)
defining Unicode char U+0141 (decimal 321)
defining Unicode char U+0142 (decimal 322)
defining Unicode char U+0143 (decimal 323)
defining Unicode char U+0144 (decimal 324)
defining Unicode char U+0147 (decimal 327)
defining Unicode char U+0148 (decimal 328)
defining Unicode char U+014A (decimal 330)
defining Unicode char U+014B (decimal 331)
defining Unicode char U+0150 (decimal 336)
defining Unicode char U+0151 (decimal 337)
defining Unicode char U+0152 (decimal 338)
defining Unicode char U+0153 (decimal 339)
defining Unicode char U+0154 (decimal 340)
defining Unicode char U+0155 (decimal 341)
defining Unicode char U+0158 (decimal 344)
defining Unicode char U+0159 (decimal 345)
defining Unicode char U+015A (decimal 346)
defining Unicode char U+015B (decimal 347)
defining Unicode char U+015E (decimal 350)
defining Unicode char U+015F (decimal 351)
defining Unicode char U+0160 (decimal 352)
defining Unicode char U+0161 (decimal 353)
defining Unicode char U+0162 (decimal 354)
defining Unicode char U+0163 (decimal 355)
defining Unicode char U+0164 (decimal 356)
defining Unicode char U+0165 (decimal 357)
defining Unicode char U+016E (decimal 366)
defining Unicode char U+016F (decimal 367)
defining Unicode char U+0170 (decimal 368)
defining Unicode char U+0171 (decimal 369)
defining Unicode char U+0178 (decimal 376)
defining Unicode char U+0179 (decimal 377)
defining Unicode char U+017A (decimal 378)
defining Unicode char U+017B (decimal 379)
defining Unicode char U+017C (decimal 380)
defining Unicode char U+017D (decimal 381)
defining Unicode char U+017E (decimal 382)
defining Unicode char U+200C (decimal 8204)
defining Unicode char U+2013 (decimal 8211)
defining Unicode char U+2014 (decimal 8212)
defining Unicode char U+2018 (decimal 8216)
defining Unicode char U+2019 (decimal 8217)
defining Unicode char U+201A (decimal 8218)
defining Unicode char U+201C (decimal 8220)
defining Unicode char U+201D (decimal 8221)
defining Unicode char U+201E (decimal 8222)
defining Unicode char U+2030 (decimal 8240)
defining Unicode char U+2031 (decimal 8241)
defining Unicode char U+2039 (decimal 8249)
defining Unicode char U+203A (decimal 8250)
defining Unicode char U+2423 (decimal 9251)
)
Now handling font encoding OT1 ...
... processing UTF-8 mapping file for font encoding OT1
(/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu
File: ot1enc.dfu 2008/04/05 v1.1m UTF-8 support for inputenc
defining Unicode char U+00A1 (decimal 161)
defining Unicode char U+00A3 (decimal 163)
defining Unicode char U+00B8 (decimal 184)
defining Unicode char U+00BF (decimal 191)
defining Unicode char U+00C5 (decimal 197)
defining Unicode char U+00C6 (decimal 198)
defining Unicode char U+00D8 (decimal 216)
defining Unicode char U+00DF (decimal 223)
defining Unicode char U+00E6 (decimal 230)
defining Unicode char U+00EC (decimal 236)
defining Unicode char U+00ED (decimal 237)
defining Unicode char U+00EE (decimal 238)
defining Unicode char U+00EF (decimal 239)
defining Unicode char U+00F8 (decimal 248)
defining Unicode char U+0131 (decimal 305)
defining Unicode char U+0141 (decimal 321)
defining Unicode char U+0142 (decimal 322)
defining Unicode char U+0152 (decimal 338)
defining Unicode char U+0153 (decimal 339)
defining Unicode char U+2013 (decimal 8211)
defining Unicode char U+2014 (decimal 8212)
defining Unicode char U+2018 (decimal 8216)
defining Unicode char U+2019 (decimal 8217)
defining Unicode char U+201C (decimal 8220)
defining Unicode char U+201D (decimal 8221)
)
Now handling font encoding OMS ...
... processing UTF-8 mapping file for font encoding OMS
(/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu
File: omsenc.dfu 2008/04/05 v1.1m UTF-8 support for inputenc
defining Unicode char U+00A7 (decimal 167)
defining Unicode char U+00B6 (decimal 182)
defining Unicode char U+00B7 (decimal 183)
defining Unicode char U+2020 (decimal 8224)
defining Unicode char U+2021 (decimal 8225)
defining Unicode char U+2022 (decimal 8226)
)
Now handling font encoding OMX ...
... no UTF-8 mapping file for font encoding OMX
Now handling font encoding U ...
... no UTF-8 mapping file for font encoding U
defining Unicode char U+00A9 (decimal 169)
defining Unicode char U+00AA (decimal 170)
defining Unicode char U+00AE (decimal 174)
defining Unicode char U+00BA (decimal 186)
defining Unicode char U+02C6 (decimal 710)
defining Unicode char U+02DC (decimal 732)
defining Unicode char U+200C (decimal 8204)
defining Unicode char U+2026 (decimal 8230)
defining Unicode char U+2122 (decimal 8482)
defining Unicode char U+2423 (decimal 9251)
)) (/usr/share/texlive/texmf-dist/tex/latex/graphics/color.sty
Package: color 2005/11/14 v1.0j Standard LaTeX Color (DPC)
(/usr/share/texlive/texmf-dist/tex/latex/latexconfig/color.cfg
File: color.cfg 2007/01/18 v1.5 color configuration of teTeX/TeXLive
)
Package color Info: Driver file: dvips.def on input line 130.
(/usr/share/texlive/texmf-dist/tex/latex/graphics/dvips.def
File: dvips.def 1999/02/16 v3.0i Driver-dependant file (DPC,SPQR)
) (/usr/share/texlive/texmf-dist/tex/latex/graphics/dvipsnam.def
File: dvipsnam.def 1999/02/16 v3.0i Driver-dependant file (DPC,SPQR)
))
! LaTeX Error: Option clash for package inputenc.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
...
l.24
The package inputenc has already been loaded with options:
[utf8]
There has now been an attempt to load it with options
[latin1]
Adding the global options:
utf8,latin1
to your \documentclass declaration may fix this.
Try typing <return> to proceed.
\sizebox=\box26
\lthtmlwrite=\write3
(./images.aux)
\openout1 = `images.aux'.
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 123.
LaTeX Font Info: ... okay on input line 123.
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 123.
LaTeX Font Info: ... okay on input line 123.
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 123.
LaTeX Font Info: ... okay on input line 123.
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 123.
LaTeX Font Info: ... okay on input line 123.
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 123.
LaTeX Font Info: ... okay on input line 123.
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 123.
LaTeX Font Info: ... okay on input line 123.
latex2htmlLength hsize=349.0pt
latex2htmlLength vsize=682.0pt
latex2htmlLength hoffset=0.0pt
latex2htmlLength voffset=0.0pt
latex2htmlLength topmargin=0.0pt
latex2htmlLength topskip=0.00003pt
latex2htmlLength headheight=0.0pt
latex2htmlLength headsep=0.0pt
latex2htmlLength parskip=0.0pt plus 1.0pt
latex2htmlLength oddsidemargin=53.0pt
latex2htmlLength evensidemargin=53.0pt
LaTeX Font Info: Try loading font information for T1+cmtt on input line 153.
(/usr/share/texlive/texmf-dist/tex/latex/base/t1cmtt.fd
File: t1cmtt.fd 1999/05/25 v2.5h Standard LaTeX font definitions
)
l2hSize :figure26:238.0pt::0.0pt::349.0pt.
[1
]
l2hSize :figure31:163.55518pt::0.0pt::349.0pt.
[2
]
Overfull \hbox (63.67963pt too wide) in paragraph at lines 235--235
[]\T1/cmtt/m/n/7 0005 04 W goni lem:goni¢,V/AiMdNsP3TfrVp gph:2:1 dgp:8;s;--cmp
l_ga-7/3,--cmpl_ga-10/3,--prep-11/8;!subj!cmpl_ga[]
[]
Overfull \hbox (45.09045pt too wide) in paragraph at lines 235--235
[]\T1/cmtt/m/n/7 0005 04 W goni lem:goni¢,V/AiMdNsP3TfrVp gph:2:1 dgp:9;s;--cmp
l_ga-7/2,--cmpl_ga-10/2,--prep-11/9;!cmpl_ga[]
[]
l2hSize :figure36:195.55518pt::0.0pt::349.0pt.
[3
] (./images.aux) )
Here is how much of TeX's memory you used:
1035 strings out of 495025
11803 string characters out of 3181177
58180 words of memory out of 3000000
4265 multiletter control sequences out of 15000+200000
5797 words of font info for 18 fonts, out of 3000000 for 9000
34 hyphenation exceptions out of 8191
23i,5n,19p,4058b,297s stack positions out of 5000i,500n,10000p,200000b,50000s
Output written on images.dvi (3 pages, 3420 bytes).

6
doc/dgp/images.pl Normal file

@ -0,0 +1,6 @@
# LaTeX2HTML 2008 (1.71)
# Associate images original text with physical files.
1;

242
doc/dgp/images.tex Normal file

@ -0,0 +1,242 @@
\batchmode
\documentclass[a4paper]{report}
\RequirePackage{ifthen}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\title{DGP}
\author{Tomasz Obrębski}
\usepackage[dvips]{color}
\pagecolor[gray]{.7}
\usepackage[latin1]{inputenc}
\makeatletter
\makeatletter
\count@=\the\catcode`\_ \catcode`\_=8
\newenvironment{tex2html_wrap}{}{}%
\catcode`\<=12\catcode`\_=\count@
\newcommand{\providedcommand}[1]{\expandafter\providecommand\csname #1\endcsname}%
\newcommand{\renewedcommand}[1]{\expandafter\providecommand\csname #1\endcsname{}%
\expandafter\renewcommand\csname #1\endcsname}%
\newcommand{\newedenvironment}[1]{\newenvironment{#1}{}{}\renewenvironment{#1}}%
\let\newedcommand\renewedcommand
\let\renewedenvironment\newedenvironment
\makeatother
\let\mathon=$
\let\mathoff=$
\ifx\AtBeginDocument\undefined \newcommand{\AtBeginDocument}[1]{}\fi
\newbox\sizebox
\setlength{\hoffset}{0pt}\setlength{\voffset}{0pt}
\addtolength{\textheight}{\footskip}\setlength{\footskip}{0pt}
\addtolength{\textheight}{\topmargin}\setlength{\topmargin}{0pt}
\addtolength{\textheight}{\headheight}\setlength{\headheight}{0pt}
\addtolength{\textheight}{\headsep}\setlength{\headsep}{0pt}
\setlength{\textwidth}{349pt}
\newwrite\lthtmlwrite
\makeatletter
\let\realnormalsize=\normalsize
\global\topskip=2sp
\def\preveqno{}\let\real@float=\@float \let\realend@float=\end@float
\def\@float{\let\@savefreelist\@freelist\real@float}
\def\liih@math{\ifmmode$\else\bad@math\fi}
\def\end@float{\realend@float\global\let\@freelist\@savefreelist}
\let\real@dbflt=\@dbflt \let\end@dblfloat=\end@float
\let\@largefloatcheck=\relax
\let\if@boxedmulticols=\iftrue
\def\@dbflt{\let\@savefreelist\@freelist\real@dbflt}
\def\adjustnormalsize{\def\normalsize{\mathsurround=0pt \realnormalsize
\parindent=0pt\abovedisplayskip=0pt\belowdisplayskip=0pt}%
\def\phantompar{\csname par\endcsname}\normalsize}%
\def\lthtmltypeout#1{{\let\protect\string \immediate\write\lthtmlwrite{#1}}}%
\newcommand\lthtmlhboxmathA{\adjustnormalsize\setbox\sizebox=\hbox\bgroup\kern.05em }%
\newcommand\lthtmlhboxmathB{\adjustnormalsize\setbox\sizebox=\hbox to\hsize\bgroup\hfill }%
\newcommand\lthtmlvboxmathA{\adjustnormalsize\setbox\sizebox=\vbox\bgroup %
\let\ifinner=\iffalse \let\)\liih@math }%
\newcommand\lthtmlboxmathZ{\@next\next\@currlist{}{\def\next{\voidb@x}}%
\expandafter\box\next\egroup}%
\newcommand\lthtmlmathtype[1]{\gdef\lthtmlmathenv{#1}}%
\newcommand\lthtmllogmath{\dimen0\ht\sizebox \advance\dimen0\dp\sizebox
\ifdim\dimen0>.95\vsize
\lthtmltypeout{%
*** image for \lthtmlmathenv\space is too tall at \the\dimen0, reducing to .95 vsize ***}%
\ht\sizebox.95\vsize \dp\sizebox\z@ \fi
\lthtmltypeout{l2hSize %
:\lthtmlmathenv:\the\ht\sizebox::\the\dp\sizebox::\the\wd\sizebox.\preveqno}}%
\newcommand\lthtmlfigureA[1]{\let\@savefreelist\@freelist
\lthtmlmathtype{#1}\lthtmlvboxmathA}%
\newcommand\lthtmlpictureA{\bgroup\catcode`\_=8 \lthtmlpictureB}%
\newcommand\lthtmlpictureB[1]{\lthtmlmathtype{#1}\egroup
\let\@savefreelist\@freelist \lthtmlhboxmathB}%
\newcommand\lthtmlpictureZ[1]{\hfill\lthtmlfigureZ}%
\newcommand\lthtmlfigureZ{\lthtmlboxmathZ\lthtmllogmath\copy\sizebox
\global\let\@freelist\@savefreelist}%
\newcommand\lthtmldisplayA{\bgroup\catcode`\_=8 \lthtmldisplayAi}%
\newcommand\lthtmldisplayAi[1]{\lthtmlmathtype{#1}\egroup\lthtmlvboxmathA}%
\newcommand\lthtmldisplayB[1]{\edef\preveqno{(\theequation)}%
\lthtmldisplayA{#1}\let\@eqnnum\relax}%
\newcommand\lthtmldisplayZ{\lthtmlboxmathZ\lthtmllogmath\lthtmlsetmath}%
\newcommand\lthtmlinlinemathA{\bgroup\catcode`\_=8 \lthtmlinlinemathB}
\newcommand\lthtmlinlinemathB[1]{\lthtmlmathtype{#1}\egroup\lthtmlhboxmathA
\vrule height1.5ex width0pt }%
\newcommand\lthtmlinlineA{\bgroup\catcode`\_=8 \lthtmlinlineB}%
\newcommand\lthtmlinlineB[1]{\lthtmlmathtype{#1}\egroup\lthtmlhboxmathA}%
\newcommand\lthtmlinlineZ{\egroup\expandafter\ifdim\dp\sizebox>0pt %
\expandafter\centerinlinemath\fi\lthtmllogmath\lthtmlsetinline}
\newcommand\lthtmlinlinemathZ{\egroup\expandafter\ifdim\dp\sizebox>0pt %
\expandafter\centerinlinemath\fi\lthtmllogmath\lthtmlsetmath}
\newcommand\lthtmlindisplaymathZ{\egroup %
\centerinlinemath\lthtmllogmath\lthtmlsetmath}
\def\lthtmlsetinline{\hbox{\vrule width.1em \vtop{\vbox{%
\kern.1em\copy\sizebox}\ifdim\dp\sizebox>0pt\kern.1em\else\kern.3pt\fi
\ifdim\hsize>\wd\sizebox \hrule depth1pt\fi}}}
\def\lthtmlsetmath{\hbox{\vrule width.1em\kern-.05em\vtop{\vbox{%
\kern.1em\kern0.8 pt\hbox{\hglue.17em\copy\sizebox\hglue0.8 pt}}\kern.3pt%
\ifdim\dp\sizebox>0pt\kern.1em\fi \kern0.8 pt%
\ifdim\hsize>\wd\sizebox \hrule depth1pt\fi}}}
\def\centerinlinemath{%
\dimen1=\ifdim\ht\sizebox<\dp\sizebox \dp\sizebox\else\ht\sizebox\fi
\advance\dimen1by.5pt \vrule width0pt height\dimen1 depth\dimen1
\dp\sizebox=\dimen1\ht\sizebox=\dimen1\relax}
\def\lthtmlcheckvsize{\ifdim\ht\sizebox<\vsize
\ifdim\wd\sizebox<\hsize\expandafter\hfill\fi \expandafter\vfill
\else\expandafter\vss\fi}%
\providecommand{\selectlanguage}[1]{}%
\makeatletter \tracingstats = 1
\begin{document}
\pagestyle{empty}\thispagestyle{empty}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength hsize=\the\hsize}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength vsize=\the\vsize}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength hoffset=\the\hoffset}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength voffset=\the\voffset}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength topmargin=\the\topmargin}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength topskip=\the\topskip}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength headheight=\the\headheight}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength headsep=\the\headsep}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength parskip=\the\parskip}\lthtmltypeout{}%
\lthtmltypeout{latex2htmlLength oddsidemargin=\the\oddsidemargin}\lthtmltypeout{}%
\makeatletter
\if@twoside\lthtmltypeout{latex2htmlLength evensidemargin=\the\evensidemargin}%
\else\lthtmltypeout{latex2htmlLength evensidemargin=\the\oddsidemargin}\fi%
\lthtmltypeout{}%
\makeatother
\setcounter{page}{1}
\onecolumn
% !!! IMAGES START HERE !!!
\bgroup \egroup
\stepcounter{chapter}
\stepcounter{chapter}
\stepcounter{chapter}
\stepcounter{chapter}
\stepcounter{chapter}
{\newpage\clearpage
\lthtmlfigureA{figure26}%
\begin{figure}\begin{verbatim}
0000 00 BOS *
0000 04 W Pies lem:pies,N/CnGaNs
0004 01 S _
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp
0009 01 S _
0010 08 W czarnego lem:czarny,ADJ/CaDpGapNs
0010 08 W czarnego lem:czarny,ADJ/CgDpGainpNs
0018 01 S _
0019 04 W kota lem:kota,N/CnGfNs
0019 04 W kota lem:kot,N/CaGaNs
0019 04 W kota lem:kot,N/CgGaNs
0023 01 S _
0024 01 W w lem:w,P/Cal
0025 01 S _
0026 06 W butach lem:buta,N/ClGfNp
0026 06 W butach lem:but,N/ClGiNp
0032 01 P .
0033 01 S \n
0034 00 EOS *\end{verbatim}
\end{figure}%
\lthtmlfigureZ
\lthtmlcheckvsize\clearpage}
{\newpage\clearpage
\lthtmlfigureA{figure31}%
\begin{figure}\scriptsize
\begin{verbatim}
0000 00 BOS * gph:0:
0000 04 W Pies lem:pies,N/CnGaNs gph:1:0
0004 01 S _
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1
0009 01 S _
0010 08 W czarnego lem:czarny,ADJ/CaDpGapNs gph:3:2
0010 08 W czarnego lem:czarny,ADJ/CgDpGainpNs gph:4:2
0018 01 S _
0019 04 W kota lem:kota,N/CnGfNs gph:5:3,4
0019 04 W kota lem:kot,N/CaGaNs gph:6:3,4
0019 04 W kota lem:kot,N/CgGaNs gph:7:3,4
0023 01 S _
0024 01 W w lem:w,P/Cal gph:8:5,6,7
0025 01 S _
0026 06 W butach lem:buta,N/ClGfNp gph:9:8
0026 06 W butach lem:but,N/ClGiNp gph:10:8
0032 01 P .
0033 01 S \n
0034 00 EOS * gph:11:9,10\end{verbatim}
\end{figure}%
\lthtmlfigureZ
\lthtmlcheckvsize\clearpage}
{\newpage\clearpage
\lthtmlfigureA{figure36}%
\begin{figure} \scriptsize
\begin{verbatim}
0000 00 BOS * gph:0: dgp:0;s;;
0000 04 W Pies lem:pies,N/CnGaNs gph:1:0 dgp:1;s;;
0004 01 S _
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1 dgp:2;s;;
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1 dgp:3;s;--subj-1/2;!subj
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1 dgp:8;s;--cmpl_ga-7/3,--cmpl_ga-10/3,--prep-11/8;!subj!cmpl_ga
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1 dgp:9;s;--cmpl_ga-7/2,--cmpl_ga-10/2,--prep-11/9;!cmpl_ga
0009 01 S _
0010 08 W czarnego lem:czarny,ADJ/CaDpGapNs gph:3:2 dgp:4;s;;
0010 08 W czarnego lem:czarny,ADJ/CgDpGainpNs gph:4:2 dgp:5;s;;
0018 01 S _
0019 04 W kota lem:kota,N/CnGfNs gph:5:3,4 dgp:6;s;--prep-11/6;
0019 04 W kota lem:kot,N/CaGaNs gph:6:3,4 dgp:7;s;--mod-4/7,--prep-11/7;
0019 04 W kota lem:kot,N/CgGaNs gph:7:3,4 dgp:10;s;--mod-5/10,--prep-11/10;
0023 01 S _
0024 01 W w lem:w,P/Cal gph:8:5,6,7 dgp:11;u;;&pcmpl
0024 01 W w lem:w,P/Cal gph:8:5,6,7 dgp:13;s;--pcmpl-12/11,--pcmpl-14/11;!pcmpl
0025 01 S _
0026 06 W butach lem:buta,N/ClGfNp gph:9:8 dgp:12;s;;
0026 06 W butach lem:but,N/ClGiNp gph:10:8 dgp:14;s;;
0032 01 P .
0033 01 S \n
0034 00 EOS * gph:11:9,10 dgp:15;s;;\end{verbatim}
\end{figure}%
\lthtmlfigureZ
\lthtmlcheckvsize\clearpage}
\end{document}
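
The gph annotation shown in the second figure can be read back into a graph with a few lines of code. The sketch below is in Python, which UTT itself does not use. It assumes, judging from the figure alone, that the field has the form gph:<node>:<list> and that the list names the node's immediate predecessors in the word graph. The name read_word_graph is illustrative only.

```python
import re

# Assumed field shape, inferred from the figure: gph:<node>:<comma-separated predecessors>
GPH_RE = re.compile(r"\bgph:(\d+):([\d,]*)")

def read_word_graph(lines):
    """Return {node: [predecessor nodes]} for every segment that carries a gph field."""
    graph = {}
    for line in lines:
        m = GPH_RE.search(line)
        if m is None:
            continue  # segments such as spaces carry no gph field
        node = int(m.group(1))
        graph[node] = [int(p) for p in m.group(2).split(",") if p]
    return graph

sample = """\
0000 00 BOS * gph:0:
0000 04 W Pies lem:pies,N/CnGaNs gph:1:0
0004 01 S _
0005 04 W goni lem:gonić,V/AiMdNsP3TfrVp gph:2:1
0010 08 W czarnego lem:czarny,ADJ/CaDpGapNs gph:3:2
0010 08 W czarnego lem:czarny,ADJ/CgDpGainpNs gph:4:2
0019 04 W kota lem:kota,N/CnGfNs gph:5:3,4
""".splitlines()

graph = read_word_graph(sample)
print(graph[5])  # the two readings of "czarnego" both precede this reading of "kota"
```

Keeping the graph as a plain dictionary keyed by node number mirrors the annotation directly: each ambiguous reading is a separate node, and ambiguity shows up as several nodes sharing the same predecessors.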

72
doc/dgp/index.html Normal file

@ -0,0 +1,72 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--Converted with LaTeX2HTML 2008 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>DGP</TITLE>
<META NAME="description" CONTENT="DGP">
<META NAME="keywords" CONTENT="dgp">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META NAME="Generator" CONTENT="LaTeX2HTML v2008">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="dgp.css">
<LINK REL="next" HREF="node1.html">
</HEAD>
<BODY >
<DIV CLASS="navigation"><!--Navigation Panel-->
<A NAME="tex2html5"
HREF="node1.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="/usr/share/latex2html/icons/next.png"></A>
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="/usr/share/latex2html/icons/up_g.png">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="/usr/share/latex2html/icons/prev_g.png">
<BR>
<B> Next:</B> <A NAME="tex2html6"
HREF="node1.html">Introduction</A>
<BR>
<BR></DIV>
<!--End of Navigation Panel-->
<H1 ALIGN="CENTER">DGP</H1>
<DIV CLASS="author_info">
<P ALIGN="CENTER"><STRONG>Tomasz Obrębski</STRONG></P>
</DIV>
<BR><HR>
<!--Table of Child-Links-->
<A NAME="CHILD_LINKS"></A>
<UL CLASS="ChildLinks">
<LI><A NAME="tex2html7"
HREF="node1.html">Introduction</A>
<LI><A NAME="tex2html8"
HREF="node2.html">Grammar</A>
<LI><A NAME="tex2html9"
HREF="node3.html">Parsing algorithm</A>
<LI><A NAME="tex2html10"
HREF="node4.html">Input</A>
<LI><A NAME="tex2html11"
HREF="node5.html">Output</A>
<LI><A NAME="tex2html12"
HREF="node6.html">About this document ...</A>
</UL>
<!--End of Table of Child-Links-->
<BR><HR>
<ADDRESS>
to
2014-12-19
</ADDRESS>
</BODY>
</HTML>

13
doc/dgp/labels.pl Normal file

@ -0,0 +1,13 @@
# LaTeX2HTML 2008 (1.71)
# Associate labels original text with physical files.
1;
# LaTeX2HTML 2008 (1.71)
# labels from external_latex_labels array.
1;

1
doc/dgp/missfont.log Normal file

@ -0,0 +1 @@
mktexpk --mfmode ljfour --bdpi 8000 --mag 0+7000/(2*4000) --dpi 7000 ectt0800

63
doc/dgp/node1.html Normal file

@ -0,0 +1,63 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--Converted with LaTeX2HTML 2008 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>Introduction</TITLE>
<META NAME="description" CONTENT="Introduction">
<META NAME="keywords" CONTENT="dgp">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META NAME="Generator" CONTENT="LaTeX2HTML v2008">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="dgp.css">
<LINK REL="next" HREF="node2.html">
<LINK REL="previous" HREF="dgp.html">
<LINK REL="up" HREF="dgp.html">
<LINK REL="next" HREF="node2.html">
</HEAD>
<BODY >
<DIV CLASS="navigation"><!--Navigation Panel-->
<A NAME="tex2html21"
HREF="node2.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="/usr/share/latex2html/icons/next.png"></A>
<A NAME="tex2html19"
HREF="dgp.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="/usr/share/latex2html/icons/up.png"></A>
<A NAME="tex2html13"
HREF="dgp.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="/usr/share/latex2html/icons/prev.png"></A>
<BR>
<B> Next:</B> <A NAME="tex2html22"
HREF="node2.html">Grammar</A>
<B> Up:</B> <A NAME="tex2html20"
HREF="dgp.html">DGP</A>
<B> Previous:</B> <A NAME="tex2html14"
HREF="dgp.html">DGP</A>
<BR>
<BR></DIV>
<!--End of Navigation Panel-->
<H1><A NAME="SECTION00100000000000000000">
Introduction</A>
</H1>
<BR><HR>
<ADDRESS>
to
2014-12-19
</ADDRESS>
</BODY>
</HTML>

63
doc/dgp/node2.html Normal file

@ -0,0 +1,63 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--Converted with LaTeX2HTML 2008 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>Grammar</TITLE>
<META NAME="description" CONTENT="Grammar">
<META NAME="keywords" CONTENT="dgp">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META NAME="Generator" CONTENT="LaTeX2HTML v2008">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="dgp.css">
<LINK REL="next" HREF="node3.html">
<LINK REL="previous" HREF="node1.html">
<LINK REL="up" HREF="dgp.html">
<LINK REL="next" HREF="node3.html">
</HEAD>
<BODY >
<DIV CLASS="navigation"><!--Navigation Panel-->
<A NAME="tex2html31"
HREF="node3.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="/usr/share/latex2html/icons/next.png"></A>
<A NAME="tex2html29"
HREF="dgp.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="/usr/share/latex2html/icons/up.png"></A>
<A NAME="tex2html23"
HREF="node1.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="/usr/share/latex2html/icons/prev.png"></A>
<BR>
<B> Next:</B> <A NAME="tex2html32"
HREF="node3.html">Parsing algorithm</A>
<B> Up:</B> <A NAME="tex2html30"
HREF="dgp.html">DGP</A>
<B> Previous:</B> <A NAME="tex2html24"
HREF="node1.html">Introduction</A>
<BR>
<BR></DIV>
<!--End of Navigation Panel-->
<H1><A NAME="SECTION00200000000000000000">
Grammar</A>
</H1>
<BR><HR>
<ADDRESS>
to
2014-12-19
</ADDRESS>
</BODY>
</HTML>

63
doc/dgp/node3.html Normal file

@ -0,0 +1,63 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--Converted with LaTeX2HTML 2008 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>Parsing algorithm</TITLE>
<META NAME="description" CONTENT="Parsing algorithm">
<META NAME="keywords" CONTENT="dgp">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META NAME="Generator" CONTENT="LaTeX2HTML v2008">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="dgp.css">
<LINK REL="next" HREF="node4.html">
<LINK REL="previous" HREF="node2.html">
<LINK REL="up" HREF="dgp.html">
<LINK REL="next" HREF="node4.html">
</HEAD>
<BODY >
<DIV CLASS="navigation"><!--Navigation Panel-->
<A NAME="tex2html41"
HREF="node4.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="/usr/share/latex2html/icons/next.png"></A>
<A NAME="tex2html39"
HREF="dgp.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="/usr/share/latex2html/icons/up.png"></A>
<A NAME="tex2html33"
HREF="node2.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="/usr/share/latex2html/icons/prev.png"></A>
<BR>
<B> Next:</B> <A NAME="tex2html42"
HREF="node4.html">Input</A>
<B> Up:</B> <A NAME="tex2html40"
HREF="dgp.html">DGP</A>
<B> Previous:</B> <A NAME="tex2html34"
HREF="node2.html">Grammar</A>
<BR>
<BR></DIV>
<!--End of Navigation Panel-->
<H1><A NAME="SECTION00300000000000000000">
Parsing algorithm</A>
</H1>
<BR><HR>
<ADDRESS>
to
2014-12-19
</ADDRESS>
</BODY>
</HTML>

76
doc/dgp/node4.html Normal file

@ -0,0 +1,76 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--Converted with LaTeX2HTML 2008 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>Input</TITLE>
<META NAME="description" CONTENT="Input">
<META NAME="keywords" CONTENT="dgp">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META NAME="Generator" CONTENT="LaTeX2HTML v2008">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="dgp.css">
<LINK REL="next" HREF="node5.html">
<LINK REL="previous" HREF="node3.html">
<LINK REL="up" HREF="dgp.html">
<LINK REL="next" HREF="node5.html">
</HEAD>
<BODY >
<DIV CLASS="navigation"><!--Navigation Panel-->
<A NAME="tex2html51"
HREF="node5.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="/usr/share/latex2html/icons/next.png"></A>
<A NAME="tex2html49"
HREF="dgp.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="/usr/share/latex2html/icons/up.png"></A>
<A NAME="tex2html43"
HREF="node3.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="/usr/share/latex2html/icons/prev.png"></A>
<BR>
<B> Next:</B> <A NAME="tex2html52"
HREF="node5.html">Output</A>
<B> Up:</B> <A NAME="tex2html50"
HREF="dgp.html">DGP</A>
<B> Previous:</B> <A NAME="tex2html44"
HREF="node3.html">Parsing algorithm</A>
<BR>
<BR></DIV>
<!--End of Navigation Panel-->
<H1><A NAME="SECTION00400000000000000000">
Input</A>
</H1>
The input for the parser is prepared as follows:
<PRE>
cat text.txt | tok | sen | lem | canonize | gph | dgp ...
</PRE>
Input file
dgp takes a word graph as its input. The node numbers of this
graph are the values of the gph field. This field is added to the file by
the gph program.
Besides the gph field, dgp also reads the value of the lem field.
<BR><HR>
<ADDRESS>
to
2014-12-19
</ADDRESS>
</BODY>
</HTML>

204
doc/dgp/node5.html Normal file

@ -0,0 +1,204 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--Converted with LaTeX2HTML 2008 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>Output</TITLE>
<META NAME="description" CONTENT="Output">
<META NAME="keywords" CONTENT="dgp">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META NAME="Generator" CONTENT="LaTeX2HTML v2008">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="dgp.css">
<LINK REL="next" HREF="node6.html">
<LINK REL="previous" HREF="node4.html">
<LINK REL="up" HREF="dgp.html">
<LINK REL="next" HREF="node6.html">
</HEAD>
<BODY >
<DIV CLASS="navigation"><!--Navigation Panel-->
<A NAME="tex2html61"
HREF="node6.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="/usr/share/latex2html/icons/next.png"></A>
<A NAME="tex2html59"
HREF="dgp.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="/usr/share/latex2html/icons/up.png"></A>
<A NAME="tex2html53"
HREF="node4.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="/usr/share/latex2html/icons/prev.png"></A>
<BR>
<B> Next:</B> <A NAME="tex2html62"
HREF="node6.html">About this document ...</A>
<B> Up:</B> <A NAME="tex2html60"
HREF="dgp.html">DGP</A>
<B> Previous:</B> <A NAME="tex2html54"
HREF="node4.html">Input</A>
<BR>
<BR></DIV>
<!--End of Navigation Panel-->
<H1><A NAME="SECTION00500000000000000000">
Output</A>
</H1>
Format:
0005 04 W goni lem:goni&#196;&#135;,V/AiMdNsP3TfrVp gph:4:1,2,3 dgp:6;s
<PRE>
dgp:&lt;node&gt;;&lt;saturation&gt;[;&lt;links&gt;][;&lt;sets&gt;][;&lt;constraints&gt;]
</PRE>
<DL>
<DT><STRONG><I>node</I></STRONG></DT>
<DD>Dependency graph node number.
</DD>
<DT><STRONG>saturation</STRONG></DT>
<DD>The information whether the node is saturated. A
node is saturated if the list of required connections for this node
is empty, it is unsaturated otherwise.
</DD>
<DT><STRONG>links</STRONG></DT>
<DD>The comma-separated list of connections. For each node
either the list of its dependents or the list of its heads may be
printed, or both (this depends on the value of the <code>--info</code>
parameter).
</DD>
<DT><STRONG>sets</STRONG></DT>
<DD>For each node, the sets of all its left neighbours,
transitive left heads, transitive left dependents, and nodes visible
on the left can be printed. (This information is useful for fast
tree generation.)
</DD>
<DT><STRONG>constraints</STRONG></DT>
<DD>The information on constraints imposed on the
node. Constraints follow from the SGL and REQ grammar rules and have
the form of a comma-separated list of dependency types required by
the node and forbidden for the node. The elements of the list have
the following format:
<TABLE CELLPADDING=3>
<TR><TD ALIGN="LEFT"><code>!</code><I>dependency type</I></TD>
<TD ALIGN="LEFT"><I>dependency type</I> is forbidden</TD>
</TR>
<TR><TD ALIGN="LEFT"><code>&amp;</code><I>dependency type</I></TD>
<TD ALIGN="LEFT"><I>dependency type</I> is required</TD>
</TR>
</TABLE>
</DD>
</DL>
The result produced by dgp is a dependency graph. This graph may contain
(and usually does) more nodes than the input graph.
* node number in the output dependency graph
Node numbers in the output graph differ from those in the input graph. While running,
the parser creates copies (clones) of input nodes. This happens
when a dependency that is subject to constraints is attached to a node
acting as its head. The constraints follow from the SGL and OBL
grammar rules.
SGL - a dependency that may occur only once
OBL - an obligatory dependency
node saturation <code>s</code> or <code>u</code>
s - saturated node
u - unsaturated node
An unsaturated node is one that lacks an obligatory dependent.
Obligatory dependents are specified in the OBL rules of the grammar.
connections
* connection list
The connection list is a comma-separated sequence of expressions of the form
--&lt;type&gt;-&lt;w1&gt;/&lt;w2&gt;
if the values given for the --info parameter on the command line include 'd'
(for dependents)
or
++&lt;type&gt;-&lt;w1&gt;/&lt;w2&gt;
if the values given for the --info parameter on the command line include 'h'.
It may also contain both kinds of expressions if both 'd' and 'h' are given.
The expression
--&lt;type&gt;-&lt;w1&gt;/&lt;w2&gt;
denotes a possible dependency of type &lt;type&gt; whose head is the current node and whose dependent is
node &lt;w1&gt; (more on &lt;w2&gt; in a moment).
pies goni czarnego kota w butach.
<DIV ALIGN="CENTER"><A NAME="29"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure:</STRONG>
output of <code>tok | sen | lem | canonize</code></CAPTION>
<TR><TD></TD></TR>
</TABLE>
</DIV>
<DIV ALIGN="CENTER"><A NAME="34"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 5.2:</STRONG>
Word graph representation: sentence annotated with gph.</CAPTION>
<TR><TD></TD></TR>
</TABLE>
</DIV>
<DIV ALIGN="CENTER"><A NAME="39"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 5.3:</STRONG>
dgp output</CAPTION>
<TR><TD></TD></TR>
</TABLE>
</DIV>
<DIV CLASS="navigation"><HR>
<!--Navigation Panel-->
<A NAME="tex2html61"
HREF="node6.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="/usr/share/latex2html/icons/next.png"></A>
<A NAME="tex2html59"
HREF="dgp.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="/usr/share/latex2html/icons/up.png"></A>
<A NAME="tex2html53"
HREF="node4.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="/usr/share/latex2html/icons/prev.png"></A>
<BR>
<B> Next:</B> <A NAME="tex2html62"
HREF="node6.html">About this document ...</A>
<B> Up:</B> <A NAME="tex2html60"
HREF="dgp.html">DGP</A>
<B> Previous:</B> <A NAME="tex2html54"
HREF="node4.html">Input</A></DIV>
<!--End of Navigation Panel-->
<ADDRESS>
to
2014-12-19
</ADDRESS>
</BODY>
</HTML>
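
The dgp field described in node5.html can be split into its parts mechanically. The following Python sketch is not part of UTT. It assumes the four-part shape seen in the dgp output figure (node, saturation, links, constraints, with the sets part not printed) and link expressions of the form --type-w1/w2 or ++type-w1/w2. The names DgpField and parse_dgp are invented for illustration, and a field that also carries the sets part would need extra handling.

```python
import re
from typing import NamedTuple

# Assumed shapes, taken from the dgp output figure rather than from the parser source.
LINK_RE = re.compile(r"(--|\+\+)(\w+)-(\d+)/(\d+)")
CONSTRAINT_RE = re.compile(r"([!&])(\w+)")

class DgpField(NamedTuple):
    node: int
    saturated: bool
    links: list        # tuples (marker, dependency type, w1, w2)
    constraints: list  # tuples (marker, dependency type)

def parse_dgp(value):
    """Parse the text that follows 'dgp:' in the four-part form node;saturation;links;constraints."""
    parts = value.split(";")
    parts += [""] * (4 - len(parts))  # tolerate omitted trailing parts
    node, saturation, links, constraints = parts[:4]
    return DgpField(
        node=int(node),
        saturated=(saturation == "s"),
        links=[(m, t, int(w1), int(w2)) for m, t, w1, w2 in LINK_RE.findall(links)],
        constraints=CONSTRAINT_RE.findall(constraints),
    )

field = parse_dgp("8;s;--cmpl_ga-7/3,--cmpl_ga-10/3,--prep-11/8;!subj!cmpl_ga")
print(field.node, field.saturated)
print(field.links)
print(field.constraints)
```

Extracting links and constraints with regular expressions rather than by splitting on commas is deliberate: in the figure the constraint markers are concatenated without separators, so a pattern anchored on the marker characters is the more forgiving choice.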

74
doc/dgp/node6.html Normal file

@ -0,0 +1,74 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--Converted with LaTeX2HTML 2008 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>About this document ...</TITLE>
<META NAME="description" CONTENT="About this document ...">
<META NAME="keywords" CONTENT="dgp">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META NAME="Generator" CONTENT="LaTeX2HTML v2008">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="dgp.css">
<LINK REL="previous" HREF="node5.html">
<LINK REL="up" HREF="dgp.html">
</HEAD>
<BODY >
<DIV CLASS="navigation"><!--Navigation Panel-->
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="/usr/share/latex2html/icons/next_g.png">
<A NAME="tex2html67"
HREF="dgp.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="/usr/share/latex2html/icons/up.png"></A>
<A NAME="tex2html63"
HREF="node5.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="/usr/share/latex2html/icons/prev.png"></A>
<BR>
<B> Up:</B> <A NAME="tex2html68"
HREF="dgp.html">DGP</A>
<B> Previous:</B> <A NAME="tex2html64"
HREF="node5.html">Output</A>
<BR>
<BR></DIV>
<!--End of Navigation Panel-->
<H1><A NAME="SECTION00600000000000000000">
About this document ...</A>
</H1>
<STRONG>DGP</STRONG><P>
This document was generated using the
<A HREF="http://www.latex2html.org/"><STRONG>LaTeX</STRONG>2<tt>HTML</tt></A> translator Version 2008 (1.71)
<P>
Copyright &#169; 1993, 1994, 1995, 1996,
Nikos Drakos,
Computer Based Learning Unit, University of Leeds.
<BR>
Copyright &#169; 1997, 1998, 1999,
<A HREF="http://www.maths.mq.edu.au/~ross/">Ross Moore</A>,
Mathematics Department, Macquarie University, Sydney.
<P>
The command line arguments were: <BR>
<STRONG>latex2html</STRONG> <TT><A NAME="tex2html4"
HREF="../dgp.tex">dgp.tex</A></TT>
<P>
The translation was initiated by to on 2014-12-19
<BR><HR>
<ADDRESS>
to
2014-12-19
</ADDRESS>
</BODY>
</HTML>


@ -11,19 +11,25 @@ PATTERN {
 if(yytext[yyleng-1]!='\n')
 {fprintf(stderr,"ser: pattern matches incomplete line\n"); exit(1);}
 n++;
-sscanf(yytext,"%d %d",&start,&len);
+if( sscanf(yytext,"%d %d",&start,&len) != 2 ) {start=-1; len=-1;};
 yytext[yyleng-1]='\0';
 if(tmp=strrchr(yytext,'\n'))
 {
 lastseg=tmp+1;
-sscanf(lastseg,"%d %d", &end, &len);
+if( sscanf(lastseg,"%d %d",&end,&len) != 2 ) {start=-1; len=-1;};
 }
 else
 end=start;
 yytext[yyleng-1]='\n';
-printf("%04d 00 BOM * ser:%d\n",start,n);
+if(start >= 0 && end >=0)
+printf("%04d 00 BOM * ser:%d\n",start,n);
+else
+printf("BOM * ser:%d\n",n);
 ECHO;
-printf("%04d 00 EOM * ser:%d\n",end+len,n);
+if(start>=0 && end >=0)
+printf("%04d 00 EOM * ser:%d\n",end+len,n);
+else
+printf("EOM * ser:%d\n",n);
 }


@ -15,7 +15,7 @@ gram.dgp: gram.dgc
 .PHONY: install
-install:
+install: install-dictionaries
 .PHONY: install-grammar
 install-grammar:
@ -26,31 +26,31 @@ install-grammar:
 install-dictionaries:
 ifdef LANG_DIR
 install -d $(LANG_DIR)/pl_PL.ISO-8859-2
-install -d $(LANG_DIR)/pl_PL.UTF-8
+# install -d $(LANG_DIR)/pl_PL.UTF-8
 install -m 0644 pl_PL.ISO-8859-2/cor.bin $(LANG_DIR)/pl_PL.ISO-8859-2
 install -m 0644 pl_PL.ISO-8859-2/gue.bin $(LANG_DIR)/pl_PL.ISO-8859-2
 install -m 0644 pl_PL.ISO-8859-2/lem.bin $(LANG_DIR)/pl_PL.ISO-8859-2
-install -m 0644 pl_PL.ISO-8859-2/lem.fst $(LANG_DIR)/pl_PL.ISO-8859-2
+# install -m 0644 pl_PL.ISO-8859-2/lem.fst $(LANG_DIR)/pl_PL.ISO-8859-2
 install -m 0644 pl_PL.ISO-8859-2/lem.cats $(LANG_DIR)/pl_PL.ISO-8859-2
 install -m 0644 pl_PL.ISO-8859-2/pl_PL.ISO-8859-2.sym $(LANG_DIR)/pl_PL.ISO-8859-2
-install -m 0644 pl_PL.UTF-8/lem.bin $(LANG_DIR)/pl_PL.UTF-8
+# install -m 0644 pl_PL.UTF-8/lem.bin $(LANG_DIR)/pl_PL.UTF-8
 install -m 0644 weights.kor $(LANG_DIR)
 endif
 .PHONY: uninstall
 uninstall:
 ifdef LANG_DIR
-rm $(LANG_DIR)/weights.kor
-rm $(LANG_DIR)/gram.*
-rm $(LANG_DIR)/pl_PL.UTF-8/lem.bin
-rm $(LANG_DIR)/pl_PL.ISO-8859-2/pl_PL.ISO-8859-2.sym
-rm $(LANG_DIR)/pl_PL.ISO-8859-2/lem.cats
-rm $(LANG_DIR)/pl_PL.ISO-8859-2/lem.bin
-rm $(LANG_DIR)/pl_PL.ISO-8859-2/lem.fst
-rm $(LANG_DIR)/pl_PL.ISO-8859-2/gue.bin
-rm $(LANG_DIR)/pl_PL.ISO-8859-2/cor.bin
+rm -f $(LANG_DIR)/weights.kor
+rm -f $(LANG_DIR)/gram.*
+rm -f $(LANG_DIR)/pl_PL.UTF-8/lem.bin
+rm -f $(LANG_DIR)/pl_PL.ISO-8859-2/pl_PL.ISO-8859-2.sym
+rm -f $(LANG_DIR)/pl_PL.ISO-8859-2/lem.cats
+rm -f $(LANG_DIR)/pl_PL.ISO-8859-2/lem.bin
+# rm -f $(LANG_DIR)/pl_PL.ISO-8859-2/lem.fst
+rm -f $(LANG_DIR)/pl_PL.ISO-8859-2/gue.bin
+rm -f $(LANG_DIR)/pl_PL.ISO-8859-2/cor.bin
 rmdir $(LANG_DIR)/pl_PL.ISO-8859-2
-rmdir $(LANG_DIR)/pl_PL.UTF-8
+# rmdir $(LANG_DIR)/pl_PL.UTF-8
 endif


@ -1,9 +1,9 @@
 include ../../config.mak
-#TARGETS = lem.bin lem.cats cor.bin gue.bin
+TARGETS = lem.bin lem.cats cor.bin gue.bin
 .PHONY: all
-all: $(TARGETS)
+all:
 # ------------------------------------------------------------------
 # main section
@ -21,4 +21,4 @@ lem.cats: lem.dic
 .PHONY: clean
 clean:
-rm -f lem.bin lem.fst lem.cats
+rm -f lem.fst lem.cats


@ -21,7 +21,6 @@ ifdef BIN_DIR
install -m 0755 compdic-dic-to-fst $(BIN_DIR) install -m 0755 compdic-dic-to-fst $(BIN_DIR)
install -m 0755 compdic-dic-to-cats $(BIN_DIR) install -m 0755 compdic-dic-to-cats $(BIN_DIR)
install -m 0755 compdic-fst-to-bin $(BIN_DIR) install -m 0755 compdic-fst-to-bin $(BIN_DIR)
install -m 0755 canonize $(BIN_DIR) install -m 0755 canonize $(BIN_DIR)
install -m 0755 fsm2aut $(BIN_DIR) install -m 0755 fsm2aut $(BIN_DIR)
install -m 0755 aut2fsa $(BIN_DIR) install -m 0755 aut2fsa $(BIN_DIR)

50
src/compdic/canonize Executable file

@ -0,0 +1,50 @@
#!/usr/bin/perl
#package: UAM TExt Tools
#component: canonize
#version: 1.0
#author: Tomasz Obrebski
use lib "/usr/local/lib/utt";
use lib "$ENV{'HOME'}/.local/lib/utt";
use strict;
use Getopt::Long;
use attr;
my $help;
GetOptions("help|h" => \$help);
if($help)
{
print <<'END'
Transforms syntactic categories to their canonical form.
Usage: canonize
Options:
--help -h Help.
END
;
exit 0;
}
#$|=1;
my %tra;
while(<>)
{
s/$attr::pos_re\/$attr::avlist_re/trans($&)/ge;
print;
}
sub trans
{
my $cat=shift;
exists($tra{$cat}) ? $tra{$cat} : ( $tra{$cat} = attr::canonize $cat );
}
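
The canonize script above rewrites each category it finds and caches the result per category, so repeated tags are canonized only once. The same substitute-with-cache pattern is sketched below in Python. Since attr::pos_re, attr::avlist_re and attr::canonize are not shown in this commit, the regular expressions and the canonize_category function are hypothetical stand-ins: here a category is assumed to be a part of speech, a slash, and attribute-value pairs such as CnGaNs, and "canonical" is assumed to mean attributes sorted alphabetically.

```python
import re
from functools import lru_cache

# Hypothetical category shape: POS, slash, then pairs of an uppercase attribute
# name and lowercase/digit values, e.g. "N/CnGaNs". Not taken from the attr module.
CATEGORY_RE = re.compile(r"[A-Z]+/(?:[A-Z][a-z0-9]+)+")
ATTR_RE = re.compile(r"[A-Z][a-z0-9]+")

@lru_cache(maxsize=None)  # plays the role of the %tra hash in the Perl script
def canonize_category(cat):
    """Hypothetical stand-in for attr::canonize: order the attributes alphabetically."""
    pos, avlist = cat.split("/", 1)
    return pos + "/" + "".join(sorted(ATTR_RE.findall(avlist)))

def canonize_line(line):
    """Replace every category in the line with its cached canonical form."""
    return CATEGORY_RE.sub(lambda m: canonize_category(m.group(0)), line)

print(canonize_line("0019 04 W kota lem:kot,N/NsGaCa"))
```

Caching pays off because a corpus contains far fewer distinct categories than tokens, so most lookups after the first few lines are cache hits.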

File diff suppressed because it is too large


@ -1 +0,0 @@
cmdline.o cmdline.d : cmdline.cc cmdline.h


@ -1,52 +0,0 @@
package "dgp"
version "0.1"
option "grammar" g "Grammar file"
string no typestr="filename"
option "long" l "Long output"
flag off
option "debug" d "Debug mode."
flag off
option "time" - "Print parse time."
flag off
option "info" - "Print info.
h - heads d - dependents
s - sets
c - constraints n - node/arc counts"
string no default="h"
#section "Common UTT options"
option "input" f "Input file" string no
option "output" o "Output file" string no
option "only-fail" - "Print only segments the program failed to process" flag off hidden
option "no-fail" - "Print only segments the program processed" flag off hidden
option "copy" c "Copy succesfully processed segments to output" flag off
option "process" p "Process segments of this type only" string no multiple
option "select" s "Select only segments containing this field" string no multiple
option "ignore" S "Select only segments, which doesn't contain this field" string no multiple
option "output-field" O "Output field name (default: program name)" string no
option "input-field" I "Input field name (default: the FORM field)" string no multiple
option "interactive" i "Toggle interactive mode" flag off
option "config" - "Configuration file" string typestr="FILENAME" no
option "one-field" 1 "Print all alternative results in one field (creates compact ambiguous annotation)" flag off
option "one-line" - "Print annotation alternatives as additional fields in the same segment" flag off
option "language" - "Language." string no


@ -1,294 +0,0 @@
/** @file cmdline.h
* @brief The header file for the command line option parser
* generated by GNU Gengetopt version 2.22.6
* http://www.gnu.org/software/gengetopt.
* DO NOT modify this file, since it can be overwritten
* @author GNU Gengetopt by Lorenzo Bettini */
#ifndef CMDLINE_H
#define CMDLINE_H
/* If we use autoconf. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include <stdio.h> /* for FILE */
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
#ifndef CMDLINE_PARSER_PACKAGE
/** @brief the program name (used for printing errors) */
#define CMDLINE_PARSER_PACKAGE "dgp"
#endif
#ifndef CMDLINE_PARSER_PACKAGE_NAME
/** @brief the complete program name (used for help and version) */
#define CMDLINE_PARSER_PACKAGE_NAME "dgp"
#endif
#ifndef CMDLINE_PARSER_VERSION
/** @brief the program version */
#define CMDLINE_PARSER_VERSION "0.1"
#endif
/** @brief Where the command line options are stored */
struct gengetopt_args_info
{
const char *help_help; /**< @brief Print help and exit help description. */
const char *full_help_help; /**< @brief Print help, including hidden options, and exit help description. */
const char *version_help; /**< @brief Print version and exit help description. */
char * grammar_arg; /**< @brief Grammar file. */
char * grammar_orig; /**< @brief Grammar file original value given at command line. */
const char *grammar_help; /**< @brief Grammar file help description. */
int long_flag; /**< @brief Long output (default=off). */
const char *long_help; /**< @brief Long output help description. */
int debug_flag; /**< @brief Debug mode. (default=off). */
const char *debug_help; /**< @brief Debug mode. help description. */
int time_flag; /**< @brief Print parse time. (default=off). */
const char *time_help; /**< @brief Print parse time. help description. */
char * info_arg; /**< @brief Print info.
h - heads d - dependents
s - sets
c - constraints n - node/arc counts (default='h'). */
char * info_orig; /**< @brief Print info.
h - heads d - dependents
s - sets
c - constraints n - node/arc counts original value given at command line. */
const char *info_help; /**< @brief Print info.
h - heads d - dependents
s - sets
c - constraints n - node/arc counts help description. */
char * input_arg; /**< @brief Input file. */
char * input_orig; /**< @brief Input file original value given at command line. */
const char *input_help; /**< @brief Input file help description. */
char * output_arg; /**< @brief Output file. */
char * output_orig; /**< @brief Output file original value given at command line. */
const char *output_help; /**< @brief Output file help description. */
int only_fail_flag; /**< @brief Print only segments the program failed to process (default=off). */
const char *only_fail_help; /**< @brief Print only segments the program failed to process help description. */
int no_fail_flag; /**< @brief Print only segments the program processed (default=off). */
const char *no_fail_help; /**< @brief Print only segments the program processed help description. */
int copy_flag; /**< @brief Copy succesfully processed segments to output (default=off). */
const char *copy_help; /**< @brief Copy succesfully processed segments to output help description. */
char ** process_arg; /**< @brief Process segments of this type only. */
char ** process_orig; /**< @brief Process segments of this type only original value given at command line. */
unsigned int process_min; /**< @brief Process segments of this type only's minimum occurreces */
unsigned int process_max; /**< @brief Process segments of this type only's maximum occurreces */
const char *process_help; /**< @brief Process segments of this type only help description. */
char ** select_arg; /**< @brief Select only segments containing this field. */
char ** select_orig; /**< @brief Select only segments containing this field original value given at command line. */
unsigned int select_min; /**< @brief Select only segments containing this field's minimum occurreces */
unsigned int select_max; /**< @brief Select only segments containing this field's maximum occurreces */
const char *select_help; /**< @brief Select only segments containing this field help description. */
char ** ignore_arg; /**< @brief Select only segments, which doesn't contain this field. */
char ** ignore_orig; /**< @brief Select only segments, which doesn't contain this field original value given at command line. */
unsigned int ignore_min; /**< @brief Select only segments, which doesn't contain this field's minimum occurreces */
unsigned int ignore_max; /**< @brief Select only segments, which doesn't contain this field's maximum occurreces */
const char *ignore_help; /**< @brief Select only segments, which doesn't contain this field help description. */
char * output_field_arg; /**< @brief Output field name (default: program name). */
char * output_field_orig; /**< @brief Output field name (default: program name) original value given at command line. */
const char *output_field_help; /**< @brief Output field name (default: program name) help description. */
char ** input_field_arg; /**< @brief Input field name (default: the FORM field). */
char ** input_field_orig; /**< @brief Input field name (default: the FORM field) original value given at command line. */
unsigned int input_field_min; /**< @brief Input field name (default: the FORM field)'s minimum occurrences */
unsigned int input_field_max; /**< @brief Input field name (default: the FORM field)'s maximum occurrences */
const char *input_field_help; /**< @brief Input field name (default: the FORM field) help description. */
int interactive_flag; /**< @brief Toggle interactive mode (default=off). */
const char *interactive_help; /**< @brief Toggle interactive mode help description. */
char * config_arg; /**< @brief Configuration file. */
char * config_orig; /**< @brief Configuration file original value given at command line. */
const char *config_help; /**< @brief Configuration file help description. */
int one_field_flag; /**< @brief Print all alternative results in one field (creates compact ambiguous annotation) (default=off). */
const char *one_field_help; /**< @brief Print all alternative results in one field (creates compact ambiguous annotation) help description. */
int one_line_flag; /**< @brief Print annotation alternatives as additional fields in the same segment (default=off). */
const char *one_line_help; /**< @brief Print annotation alternatives as additional fields in the same segment help description. */
char * language_arg; /**< @brief Language. */
char * language_orig; /**< @brief Language original value given at command line. */
const char *language_help; /**< @brief Language help description. */
unsigned int help_given ; /**< @brief Whether help was given. */
unsigned int full_help_given ; /**< @brief Whether full-help was given. */
unsigned int version_given ; /**< @brief Whether version was given. */
unsigned int grammar_given ; /**< @brief Whether grammar was given. */
unsigned int long_given ; /**< @brief Whether long was given. */
unsigned int debug_given ; /**< @brief Whether debug was given. */
unsigned int time_given ; /**< @brief Whether time was given. */
unsigned int info_given ; /**< @brief Whether info was given. */
unsigned int input_given ; /**< @brief Whether input was given. */
unsigned int output_given ; /**< @brief Whether output was given. */
unsigned int only_fail_given ; /**< @brief Whether only-fail was given. */
unsigned int no_fail_given ; /**< @brief Whether no-fail was given. */
unsigned int copy_given ; /**< @brief Whether copy was given. */
unsigned int process_given ; /**< @brief Whether process was given. */
unsigned int select_given ; /**< @brief Whether select was given. */
unsigned int ignore_given ; /**< @brief Whether ignore was given. */
unsigned int output_field_given ; /**< @brief Whether output-field was given. */
unsigned int input_field_given ; /**< @brief Whether input-field was given. */
unsigned int interactive_given ; /**< @brief Whether interactive was given. */
unsigned int config_given ; /**< @brief Whether config was given. */
unsigned int one_field_given ; /**< @brief Whether one-field was given. */
unsigned int one_line_given ; /**< @brief Whether one-line was given. */
unsigned int language_given ; /**< @brief Whether language was given. */
} ;
/** @brief The additional parameters to pass to parser functions */
struct cmdline_parser_params
{
int override; /**< @brief whether to override possibly already present options (default 0) */
int initialize; /**< @brief whether to initialize the option structure gengetopt_args_info (default 1) */
int check_required; /**< @brief whether to check that all required options were provided (default 1) */
int check_ambiguity; /**< @brief whether to check for options already specified in the option structure gengetopt_args_info (default 0) */
int print_errors; /**< @brief whether getopt_long should print an error message for a bad option (default 1) */
} ;
/** @brief the purpose string of the program */
extern const char *gengetopt_args_info_purpose;
/** @brief the usage string of the program */
extern const char *gengetopt_args_info_usage;
/** @brief the description string of the program */
extern const char *gengetopt_args_info_description;
/** @brief all the lines making the help output */
extern const char *gengetopt_args_info_help[];
/** @brief all the lines making the full help output (including hidden options) */
extern const char *gengetopt_args_info_full_help[];
/**
* The command line parser
* @param argc the number of command line options
* @param argv the command line options
* @param args_info the structure where option information will be stored
* @return 0 if everything went fine, NON 0 if an error took place
*/
int cmdline_parser (int argc, char **argv,
struct gengetopt_args_info *args_info);
/**
* The command line parser (version with additional parameters - deprecated)
* @param argc the number of command line options
* @param argv the command line options
* @param args_info the structure where option information will be stored
* @param override whether to override possibly already present options
* @param initialize whether to initialize the option structure my_args_info
* @param check_required whether to check that all required options were provided
* @return 0 if everything went fine, NON 0 if an error took place
* @deprecated use cmdline_parser_ext() instead
*/
int cmdline_parser2 (int argc, char **argv,
struct gengetopt_args_info *args_info,
int override, int initialize, int check_required);
/**
* The command line parser (version with additional parameters)
* @param argc the number of command line options
* @param argv the command line options
* @param args_info the structure where option information will be stored
* @param params additional parameters for the parser
* @return 0 if everything went fine, NON 0 if an error took place
*/
int cmdline_parser_ext (int argc, char **argv,
struct gengetopt_args_info *args_info,
struct cmdline_parser_params *params);
/**
* Save the contents of the option struct into an already open FILE stream.
* @param outfile the stream where to dump options
* @param args_info the option struct to dump
* @return 0 if everything went fine, NON 0 if an error took place
*/
int cmdline_parser_dump(FILE *outfile,
struct gengetopt_args_info *args_info);
/**
* Save the contents of the option struct into a (text) file.
* This file can be read by the config file parser (if generated by gengetopt)
* @param filename the file where to save
* @param args_info the option struct to save
* @return 0 if everything went fine, NON 0 if an error took place
*/
int cmdline_parser_file_save(const char *filename,
struct gengetopt_args_info *args_info);
/**
* Print the help
*/
void cmdline_parser_print_help(void);
/**
* Print the full help (including hidden options)
*/
void cmdline_parser_print_full_help(void);
/**
* Print the version
*/
void cmdline_parser_print_version(void);
/**
* Initializes all the fields of a cmdline_parser_params structure
* to their default values
* @param params the structure to initialize
*/
void cmdline_parser_params_init(struct cmdline_parser_params *params);
/**
* Allocates dynamically a cmdline_parser_params structure and initializes
* all its fields to their default values
* @return the created and initialized cmdline_parser_params structure
*/
struct cmdline_parser_params *cmdline_parser_params_create(void);
/**
* Initializes the passed gengetopt_args_info structure's fields
* (also set default values for options that have a default)
* @param args_info the structure to initialize
*/
void cmdline_parser_init (struct gengetopt_args_info *args_info);
/**
* Deallocates the string fields of the gengetopt_args_info structure
* (but does not deallocate the structure itself)
* @param args_info the structure to deallocate
*/
void cmdline_parser_free (struct gengetopt_args_info *args_info);
/**
* The config file parser (deprecated version)
* @param filename the name of the config file
* @param args_info the structure where option information will be stored
* @param override whether to override possibly already present options
* @param initialize whether to initialize the option structure my_args_info
* @param check_required whether to check that all required options were provided
* @return 0 if everything went fine, NON 0 if an error took place
* @deprecated use cmdline_parser_config_file() instead
*/
int cmdline_parser_configfile (const char *filename,
struct gengetopt_args_info *args_info,
int override, int initialize, int check_required);
/**
* The config file parser
* @param filename the name of the config file
* @param args_info the structure where option information will be stored
* @param params additional parameters for the parser
* @return 0 if everything went fine, NON 0 if an error took place
*/
int cmdline_parser_config_file (const char *filename,
struct gengetopt_args_info *args_info,
struct cmdline_parser_params *params);
/**
* Checks that all the required options were specified
* @param args_info the structure to check
* @param prog_name the name of the program that will be used to print
* possible errors
* @return 0 if all the required options were specified, NON 0 otherwise
*/
int cmdline_parser_required (struct gengetopt_args_info *args_info,
const char *prog_name);
#ifdef __cplusplus
}
#endif /* __cplusplus */
#endif /* CMDLINE_H */

Binary file not shown.

Binary file not shown.

@@ -1,3 +0,0 @@
-dgp1.o dgp1.d : dgp1.cc dgp1.hh grammar.hh const.hh thesymbols.hh symbol.hh \
-sgraph.hh mgraph.hh ../common/common.h ../common/../lib/const.h \
-../common/../dgp/cmdline.h boubble.hh global.hh

Binary file not shown.

@@ -1 +0,0 @@
-global.o global.d : global.cc global.hh

Binary file not shown.

@@ -1,3 +0,0 @@
-grammar.o grammar.d : grammar.cc grammar.hh const.hh thesymbols.hh symbol.hh \
-sgraph.hh mgraph.hh ../common/common.h ../common/../lib/const.h \
-../common/../dgp/cmdline.h boubble.hh global.hh

Binary file not shown.

@@ -1,3 +0,0 @@
-main.o main.d : main.cc global.hh mgraph.hh const.hh thesymbols.hh symbol.hh \
-../common/common.h ../common/../lib/const.h ../common/../dgp/cmdline.h \
-sgraph.hh boubble.hh grammar.hh dgp1.hh cmdline.h

Binary file not shown.

@@ -1,2 +0,0 @@
-mgraph.o mgraph.d : mgraph.cc mgraph.hh const.hh thesymbols.hh symbol.hh \
-../common/common.h ../common/../lib/const.h ../common/../dgp/cmdline.h

Binary file not shown.

@@ -1,3 +0,0 @@
-sgraph.o sgraph.d : sgraph.cc sgraph.hh const.hh mgraph.hh thesymbols.hh symbol.hh \
-../common/common.h ../common/../lib/const.h ../common/../dgp/cmdline.h \
-boubble.hh global.hh grammar.hh

Binary file not shown.

@@ -1 +0,0 @@
-symbol.o symbol.d : symbol.cc symbol.hh

Binary file not shown.

@@ -6,12 +6,14 @@ grp:
 install:
 ifdef BIN_DIR
 install -m 0755 grp $(BIN_DIR)
+install -m 0755 ugrp $(BIN_DIR)
 endif
 .PHONY: uninstall
 uninstall:
 ifdef BIN_DIR
 rm $(BIN_DIR)/grp
+rm $(BIN_DIR)/ugrp
 endif
 clean:

@@ -156,6 +156,7 @@ $grepre =~ s/\\W/[^a-z
 # extensions
 $grepre =~ s/\\l/[a-ząćęłńóśźż]/g; #lowercase letter
 $grepre =~ s/\\L/[A-ZĄĆĘŁŃÓŚŹŻ]/g; #uppercase letter
+$grepre =~ s/`,'/,/g;
 my $grep_command = ($action =~ /g/) ? "egrep '$grepre'" : " cat ";

src/grp/ugrp Executable file

@ -0,0 +1,11 @@
case $LANG in
pl_PL.UTF-8 ) ARGS=''
for a in "$@"
do
ARG=$(printf '%s' $a | recode -f utf8..l2)
ARGS="$ARGS $ARG"
done
recode -f utf8..l2 | LANG=pl_PL grp $ARGS | recode -f l2..utf8;;
pl_PL|pl_PL.ISO-8859-2 ) grp $@ ;;
* ) echo "LANG variable must be set pl_PL.UTF-8, pl_PL, or pl_PL.ISO-8859-2";;
esac
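
The locale dispatch used by the u-wrappers can be sketched in isolation. This is a minimal sketch, not part of the commit: the function name `utt_locale_mode` is hypothetical, and the non-zero return status for an unsupported locale is an assumption added here (the wrappers above only print a message).

```sh
#!/bin/sh
# Hypothetical helper, not part of UTT: classify a locale name the same way
# the case statement in the u-wrappers does.
#   recode      -> input/output must be converted between UTF-8 and ISO-8859-2
#   direct      -> the Latin-2 program can be called as is
#   unsupported -> neither of the above (assumption: signalled with status 1)
utt_locale_mode() {
    case $1 in
        pl_PL.UTF-8)            echo recode ;;
        pl_PL|pl_PL.ISO-8859-2) echo direct ;;
        *)                      echo unsupported; return 1 ;;
    esac
}

# Usage (left as a comment so that sourcing this file has no side effects):
#   mode=$(utt_locale_mode "$LANG") || echo "unsupported locale: $LANG" >&2
```

Returning a status lets a calling pipeline stop early instead of continuing with empty input.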

@@ -6,12 +6,14 @@ kon:
 install:
 ifdef BIN_DIR
 install -m 0755 kon $(BIN_DIR)
+install -m 0755 ukon $(BIN_DIR)
 endif
 .PHONY: uninstall
 uninstall:
 ifdef BIN_DIR
 rm $(BIN_DIR)/kon
+rm $(BIN_DIR)/ukon
 endif
 clean:

src/kon/ukon Executable file

@ -0,0 +1,7 @@
case $LANG in
pl_PL.UTF-8 ) recode -f utf8..l2 | LANG=pl_PL kon $@ | recode -f l2..utf8;;
pl_PL|pl_PL.ISO-8859-2 ) kon $@ ;;
* ) echo "LANG variable must be set pl_PL.UTF-8, pl_PL, or pl_PL.ISO-8859-2";;
esac

@@ -6,12 +6,14 @@ kot:
 install:
 ifdef BIN_DIR
 install -m 0755 kot $(BIN_DIR)
+install -m 0755 ukot $(BIN_DIR)
 endif
 .PHONY: uninstall
 uninstall:
 ifdef BIN_DIR
 rm $(BIN_DIR)/kot
+rm $(BIN_DIR)/ukot
 endif
 clean:

@@ -51,15 +51,11 @@ GetOptions("gap-fill|g=s" => \$gap_fill,
 if($help)
 {
 print <<'END'
-Usage: ser [OPTIONS] [file ..]
+Usage: kot [OPTIONS] [file ..]
 Options:
 --gap-fill -g Help.
 --spaces -r
---define=FILE Read macrodefinitions from FILE.
---flex-template=FILE Read flex code template from FILE.
---only-matching -m Print only fragments matching PATTERN.
---flex Print only the generated flex code and exit.
 END
 ;
 exit 0;
@@ -76,16 +72,25 @@ my $count=0;
 while(<>)
 {
-my ($start,$len,$type,$form) = /^\s*(\d+)\s+(\d+)\s+(\S+)\s+(\S+)/;
+my ($start,$len) = /^\s*(\d+)\s+(\d+)/;
+my ($type,$form) = /^(?:\d|\s)*(\S+)\s+(\S+)/;
-if($start > $prevend)
+if ($start && $len)
 {
-print $gap_fill unless $count++ == 0;
+if($start > $prevend)
+{
+print $gap_fill unless $count++ == 0;
+}
+$prevend=$start+$len;
+next if $len==0;# || $form eq "*";
+}
+else
+{
+next if $form eq "*";
+$prevend = -1;
 }
-$prevend=$start+$len;
-next if $len==0;# || $form eq "*";
 $form =~ s/\\\*/*/g;

src/kot/ukot Executable file

@ -0,0 +1,5 @@
case $LANG in
pl_PL.UTF-8 ) recode -f utf8..l2 | LANG=pl_PL kot $@ | recode -f l2..utf8;;
pl_PL|pl_PL.ISO-8859-2 ) kot $@ ;;
* ) echo "LANG variable must be set pl_PL.UTF-8, pl_PL, or pl_PL.ISO-8859-2";;
esac

@@ -1,5 +1,7 @@
 include ../../config.mak
+export LANG=pl_PL
 LDFLAGS += -static
 CXXFLAGS += -O2 -fpermissive
@@ -45,10 +47,12 @@ clean.cmdline:
 install:
 ifdef BIN_DIR
 install -m 0755 lem $(BIN_DIR)
+install -m 0755 ulem $(BIN_DIR)
 endif
 .PHONY: uninstall
 uninstall:
 ifdef BIN_DIR
 rm $(BIN_DIR)/lem
+rm $(BIN_DIR)/ulem
 endif

src/lem/ulem Executable file

@ -0,0 +1,7 @@
case $LANG in
pl_PL.UTF-8 ) recode -f utf8..l2 | LANG=pl_PL lem $@ | recode -f l2..utf8;;
pl_PL|pl_PL.ISO-8859-2 ) lem $@ ;;
* ) echo "LANG variable must be set pl_PL.UTF-8, pl_PL, or pl_PL.ISO-8859-2";;
esac

@@ -6,5 +6,4 @@
 #author: Tomasz Obrebski
-/[0-9]+[ \t]+[0-9]+[ \t]+BOS/!
-s/[0-9]+[ \t]+[0-9]+[ \t]//
+/[0-9]+[ \t]+[0-9]+[ \t]+BOS/! s/[0-9]+[ \t]+[0-9]+[ \t]//
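
The effect of joining the address and the substitution on one line can be tried on sample segments. The sketch below is an illustration only: the function name `strip_positions` and the sample input are made up, `sed -E` is assumed to be available, and `[[:blank:]]` is used in place of `[ \t]` so that the pattern does not depend on how a given sed treats `\t` inside a bracket expression.

```sh
#!/bin/sh
# Hypothetical wrapper around the one-line form of the rule:
# on every line that is NOT a BOS segment, delete the leading
# "<start> <length> " pair; BOS lines are passed through unchanged.
strip_positions() {
    sed -E '/[0-9]+[[:blank:]]+[0-9]+[[:blank:]]+BOS/!s/[0-9]+[[:blank:]]+[0-9]+[[:blank:]]//'
}

# Two made-up segments: a sentence marker and a word.
printf '%s\n' '0000 00 BOS *' '0000 03 W Ala' | strip_positions
# prints:
#   0000 00 BOS *
#   W Ala
```

With the address and the command on separate lines, as in the old version, the negated address has no command attached to it, which is what the change corrects.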

@@ -1,5 +1,7 @@
 include ../../config.mak
+export LANG=pl_PL
 ifeq ($(BUILD_STATIC), yes)
 LDFLAGS += -static
 endif
@@ -17,12 +19,14 @@ lex.yy.c: sen.l
 install:
 ifdef BIN_DIR
 install -m 0755 sen $(BIN_DIR)
+install -m 0755 usen $(BIN_DIR)
 endif
 .PHONY: uninstall
 uninstall:
 ifdef BIN_DIR
 rm $(BIN_DIR)/sen
+rm $(BIN_DIR)/usen
 endif
 clean: clean.flex

src/sen/usen Executable file

@ -0,0 +1,5 @@
case $LANG in
pl_PL.UTF-8 ) recode -f utf8..l2 | LANG=pl_PL sen $@ | recode -f l2..utf8;;
pl_PL|pl_PL.ISO-8859-2 ) sen $@ ;;
* ) echo "LANG variable must be set pl_PL.UTF-8, pl_PL, or pl_PL.ISO-8859-2";;
esac

@@ -6,12 +6,14 @@ ser:
 install:
 ifdef BIN_DIR
 install -m 0755 ser $(BIN_DIR)
+install -m 0755 user $(BIN_DIR)
 endif
 .PHONY: uninstall
 uninstall:
 ifdef BIN_DIR
 rm $(BIN_DIR)/ser
+rm $(BIN_DIR)/user
 endif
 clean:

src/ser/user Executable file

@ -0,0 +1,15 @@
ser $@
# SAFER BUT SLOWER:
#
# case $LANG in
# pl_PL.UTF-8 ) ARGS=''
# for a in "$@"
# do
# ARG=$(printf '%s' $a | recode -f utf8..l2)
# ARGS="$ARGS $ARG"
# done
# recode -f utf8..l2 | LANG=pl_PL ser $ARGS | recode -f l2..utf8;;
# pl_PL|pl_PL.ISO-8859-2 ) ser $@ ;;
# * ) echo "LANG variable must be set pl_PL.UTF-8, pl_PL, or pl_PL.ISO-8859-2";;
# esac
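
The commented-out variant converts each argument and then joins the results into one string, so an argument that contains spaces is split again when `$ARGS` is expanded. A possible alternative is sketched below. It is an assumption, not the author's method: the function name `transcode_args_and_run` and the `TRANSCODE` variable are hypothetical, and `iconv` is swapped in for `recode` as the default filter.

```sh
#!/bin/sh
# Hypothetical sketch: convert every argument separately and keep argument
# boundaries intact, then run the command with the converted arguments.
# TRANSCODE is an assumed knob holding the filter command; iconv replaces the
# recode call used in the wrappers above.
TRANSCODE=${TRANSCODE:-"iconv -f UTF-8 -t ISO-8859-2"}

transcode_args_and_run() {
    cmd=$1
    shift
    n=$#
    # Rotate through the positional parameters: take the first one, convert
    # it, and append the result at the end. After n rounds the list holds the
    # converted arguments in their original order.
    while [ "$n" -gt 0 ]; do
        # $TRANSCODE is left unquoted on purpose so it splits into a command
        # and its options. Command substitution drops trailing newlines.
        arg=$(printf '%s' "$1" | $TRANSCODE)
        shift
        set -- "$@" "$arg"
        n=$((n - 1))
    done
    "$cmd" "$@"
}

# Usage (comment only, since ser is not assumed to be installed here):
#   recode -f utf8..l2 | LANG=pl_PL transcode_args_and_run ser "$@" | recode -f l2..utf8
```

The cost is one filter process per argument, which matches the "safer but slower" trade-off noted in the script.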

@@ -1,5 +1,7 @@
 include ../../config.mak
+export LANG=pl_PL
 ifeq ($(BUILD_STATIC), yes)
 LDFLAGS += -static
 endif
@@ -35,12 +37,14 @@ cmdline.c cmdline.h: cmdline.ggo
 install:
 ifdef BIN_DIR
 install -m 0755 tok_c $(BIN_DIR)
+install -m 0755 utok $(BIN_DIR)
 endif
 .PHONY: uninstall
 uninstall:
 ifdef BIN_DIR
 rm $(BIN_DIR)/tok_c
+rm $(BIN_DIR)/utok
 endif
 clean: clean.cmdline

src/tok.c/utok Executable file

@ -0,0 +1,5 @@
case $LANG in
pl_PL.UTF-8 ) recode -f utf8..l2 | LANG=pl_PL tok $@ | recode -f l2..utf8;;
pl_PL|pl_PL.ISO-8859-2 ) tok $@ ;;
* ) echo "LANG variable must be set pl_PL.UTF-8, pl_PL, or pl_PL.ISO-8859-2";;
esac