PCQRSCANER/venv/Lib/site-packages/nltk/test/japanese.doctest

2019-12-22 21:51:47 +01:00
.. Copyright (C) 2001-2019 NLTK Project
.. For license information, see LICENSE.TXT

============================
Japanese Language Processing
============================

    >>> from nltk import *

-------------
Corpus Access
-------------

KNB Corpus
----------

    >>> from nltk.corpus import knbc
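
NLTK loads corpora lazily, so a missing data package usually surfaces on first access rather than at import time. The sketch below guards that first access. The helper name ``first_tagged_word`` is hypothetical and not part of NLTK, and it assumes that a missing corpus is reported as ``LookupError``.

```python
import importlib


def first_tagged_word(corpus_name):
    """Return the first (word, tag) pair of an NLTK corpus, or None if unavailable.

    Hypothetical helper for illustration; not part of NLTK.
    """
    try:
        corpus_module = importlib.import_module("nltk.corpus")
        reader = getattr(corpus_module, corpus_name)
        return reader.tagged_words()[0]
    except (ImportError, AttributeError, LookupError):
        # ImportError: nltk is not installed.
        # AttributeError: no corpus loader with that name.
        # LookupError: assumed to mean the corpus data is not installed.
        return None


pair = first_tagged_word("knbc")
print("knbc available" if pair is not None else "knbc not available")
```

Returning ``None`` keeps the caller free to skip the corpus checks when the data has not been installed, for example via ``nltk.download``.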

Access the words. This should produce a list of strings:

    >>> type(knbc.words()[0]) is not bytes
    True

Access the sentences. This should produce a list of lists of strings:

    >>> type(knbc.sents()[0][0]) is not bytes
    True
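
The ``is not bytes`` checks above pass only when the reader hands back decoded text rather than raw bytes. The sketch below shows that distinction with the standard library alone. The helper ``decode_token`` is hypothetical, and EUC-JP is used purely as an illustrative encoding; this file does not state which encoding the corpus reader uses.

```python
def decode_token(raw, encoding="euc_jp"):
    """Decode one raw token to text; the default encoding is an assumption."""
    return raw.decode(encoding)


# Bytes as they might sit on disk (illustrative sample, not corpus data).
raw_token = "日本語".encode("euc_jp")
text_token = decode_token(raw_token)

print(type(raw_token) is bytes)       # the undecoded form is bytes
print(type(text_token) is not bytes)  # the decoded form passes the doctest-style check
```

The same comparison is what the doctests apply to the first word and to the first word of the first sentence.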

Access the tagged words. This should produce a list of (word, tag) pairs:

    >>> type(knbc.tagged_words()[0])
    <... 'tuple'>

Access the tagged sentences. This should produce a list of lists of (word, tag) pairs:

    >>> type(knbc.tagged_sents()[0][0])
    <... 'tuple'>
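
The nesting described above can be sketched with stand-in data. Everything below is hypothetical: the helper names are not part of NLTK, and the sample words and tags are invented for illustration rather than taken from the KNB corpus.

```python
def is_tagged_word(item):
    """True if item looks like a (word, tag) pair whose word is text."""
    return isinstance(item, tuple) and len(item) == 2 and isinstance(item[0], str)


def is_tagged_sent(sent):
    """True if sent is a non-empty sequence of (word, tag) pairs."""
    return len(sent) > 0 and all(is_tagged_word(pair) for pair in sent)


# Hypothetical stand-in for the tagged-sentence view: a list of lists of pairs.
sample_tagged_sents = [
    [("猫", "名詞"), ("が", "助詞"), ("走る", "動詞")],
    [("はい", "感動詞")],
]

# Flattening the sentences gives the word-level view: a list of pairs.
sample_tagged_words = [pair for sent in sample_tagged_sents for pair in sent]

print(all(is_tagged_sent(sent) for sent in sample_tagged_sents))
print(all(is_tagged_word(pair) for pair in sample_tagged_words))
```

The doctests check only the first element at each level, which is enough to catch a reader that returns bare strings or bytes where pairs are expected.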

JEITA Corpus
------------

    >>> from nltk.corpus import jeita

Access the tagged words. This should produce a list of (word, tag) pairs, where each tag is a string:

    >>> type(jeita.tagged_words()[0][1]) is not bytes
    True