CorPy
What is CorPy?
A fancy plural for corpus ;) Also, a collection of handy but not especially well-integrated tools for dealing with linguistic data. It abstracts away functionality that is often needed in practice for teaching and/or day-to-day work at the Czech National Corpus, without aspiring to be a fully featured or consistent NLP framework.
The short URL to the docs is: https://corpy.rtfd.io/
Here’s an idea of what you can do with CorPy:
- tokenize and morphologically tag raw textual data using MorphoDiTa
- wrangle corpora in the vertical format, originally devised for CWB and also used by (No)SketchEngine
- plus some command line utilities
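
To give a rough feel for what "vertical format" means, here is a minimal sketch using only the Python standard library. It does not use CorPy's own API: the function name `iter_sentences`, the sample data, and the choice of `<s>` as the sentence structure are assumptions made for illustration. The underlying idea is that each token sits on its own line with tab-separated attributes (for instance word, lemma, tag), while XML-like tags on separate lines mark structures such as documents and sentences.

```python
import io
from typing import Iterable, Iterator, List, Tuple

Token = Tuple[str, ...]


def iter_sentences(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield one list of tokens per <s>...</s> block (hypothetical helper).

    Simplification: any tab-free line that starts with "<" and ends with ">"
    is treated as a structural tag, which real corpora may not guarantee.
    """
    sentence: List[Token] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        is_tag = "\t" not in line and line.startswith("<") and line.endswith(">")
        if is_tag:
            # A closing sentence tag flushes the tokens collected so far.
            if line == "</s>" and sentence:
                yield sentence
                sentence = []
            continue
        # Positional attributes of a token are separated by tabs.
        sentence.append(tuple(line.split("\t")))
    if sentence:
        # Tolerate input whose last sentence is not explicitly closed.
        yield sentence


# Made-up sample data: word, lemma and tag columns.
SAMPLE = (
    '<doc id="example">\n'
    "<s>\n"
    "Dogs\tdog\tNNS\n"
    "bark\tbark\tVBP\n"
    ".\t.\t.\n"
    "</s>\n"
    "<s>\n"
    "Cats\tcat\tNNS\n"
    "purr\tpurr\tVBP\n"
    "</s>\n"
    "</doc>\n"
)

sentences = list(iter_sentences(io.StringIO(SAMPLE)))
for sent in sentences:
    print(" ".join(token[0] for token in sent))
```

For real work, the tools in CorPy itself are likely the better choice; see the docs linked above for what they actually provide.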
Installation
$ pip3 install corpy
License
Copyright © 2016–present ÚČNK/David Lukeš
Distributed under the GNU General Public License v3.