corpy.vis

Convenience wrappers for visualizing linguistic data.

corpy.vis.size_in_pixels(width, height, unit='in', ppi=300)

Convert size in inches/cm to pixels.

Parameters:
  • width – width, measured in unit

  • height – height, measured in unit

  • unit – "in" for inches, "cm" for centimeters

  • ppi – pixels per inch

Returns:

(width, height) in pixels

Return type:

(int, int)

Sample values for ppi:

  • for displays: a typical value is 96 (double that for HiDPI screens); you can detect your monitor’s DPI at <https://www.infobyip.com/detectmonitordpi.php>

  • for print output: at least 300; 600 is high quality
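
The conversion described above comes down to simple arithmetic: pixels = inches × ppi, with centimeters first divided by 2.54. The following is a minimal sketch of that arithmetic, not the library's actual implementation; the name size_in_pixels_sketch is hypothetical, and rounding to the nearest integer is an assumption.

```python
CM_PER_INCH = 2.54


def size_in_pixels_sketch(width, height, unit="in", ppi=300):
    """Illustrative stand-in for the documented conversion (hypothetical helper)."""
    if unit == "in":
        divisor = 1.0
    elif unit == "cm":
        divisor = CM_PER_INCH
    else:
        raise ValueError(f"unit must be 'in' or 'cm', got {unit!r}")
    # Convert to inches, scale by pixels per inch, then round to whole pixels
    # (the rounding strategy is an assumption made for this sketch).
    return round(width / divisor * ppi), round(height / divisor * ppi)


print_size = size_in_pixels_sketch(2, 3, unit="in", ppi=300)
screen_size = size_in_pixels_sketch(10, 5, unit="cm", ppi=96)
```

A tuple produced this way has the (width, height) shape expected by the size parameter of wordcloud().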

corpy.vis.wordcloud(data, size=(400, 400), *, rounded=False, fast=True, fast_limit=800, **kwargs)

Generate a wordcloud.

If data is a string, the wordcloud is generated using the method WordCloud.generate_from_text(), which automatically ignores stopwords (customizable with the stopwords argument) and includes “collocations” (i.e. bigrams).

If data is a sequence or a mapping, the wordcloud is generated using the method WordCloud.generate_from_frequencies() and these preprocessing responsibilities fall to the user.
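
When passing a mapping, the preprocessing mentioned above might look something like the sketch below, which uses only the standard library. The helper token_frequencies and its stopword and punctuation handling are hypothetical, chosen for illustration; they are not part of corpy.

```python
from collections import Counter


def token_frequencies(tokens, stopwords=frozenset(), lowercase=True):
    """Count word types, skipping stopwords and punctuation-only tokens.

    Hypothetical helper for illustration; not part of corpy.
    """
    counts = Counter()
    for token in tokens:
        word = token.lower() if lowercase else token
        # Skip stopwords and tokens with no letters or digits (e.g. ".").
        if word in stopwords or not any(ch.isalnum() for ch in word):
            continue
        counts[word] += 1
    return counts


tokens = "The cat sat on the mat . The mat was flat .".split()
freqs = token_frequencies(tokens, stopwords={"the", "on", "was"})
```

The resulting mapping could then be passed as data, for instance wordcloud(freqs, size=(800, 600)), in which case no further stopword filtering or bigram detection is applied.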

Parameters:
  • data – input data – either one long string of text, or an iterable of tokens, or a mapping of word types to their frequencies; use the second or third option if you want full control over the output

  • size – size in pixels, as a tuple of integers, (width, height); if you want to specify the size in inches or cm, use the size_in_pixels() function to generate this tuple

  • rounded – whether or not to enclose the wordcloud in an ellipse; incompatible with the mask keyword argument

  • fast – when True, optimizes large wordclouds for speed of generation rather than precision of word placement

  • fast_limit – speed optimizations for “large” wordclouds are applied when the requested canvas size is larger than fast_limit**2

  • kwargs – remaining keyword arguments are passed on to the wordcloud.WordCloud initializer

Returns:

The word cloud.

Return type:

wordcloud.WordCloud