corpy.vis
Convenience wrappers for visualizing linguistic data.
- corpy.vis.size_in_pixels(width, height, unit='in', ppi=300)
Convert size in inches/cm to pixels.
- Parameters:
width – width, measured in unit
height – height, measured in unit
unit – "in" for inches, "cm" for centimeters
ppi – pixels per inch
- Returns:
(width, height) in pixels
- Return type:
(int, int)
Sample values for ppi:
for displays: a typical value is 96 (double that for HiDPI); you can detect your monitor’s DPI using the following website: <https://www.infobyip.com/detectmonitordpi.php>
for print output: at least 300; 600 is high quality
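The unit arithmetic behind such a conversion can be sketched as follows. This is a minimal illustration, not corpy's actual implementation: the helper name to_pixels is hypothetical, and rounding to the nearest pixel is an assumption made for this sketch.

```python
CM_PER_INCH = 2.54


def to_pixels(width, height, unit="in", ppi=300):
    """Hypothetical stand-in showing the unit arithmetic; not corpy's code."""
    if unit == "in":
        factor = ppi
    elif unit == "cm":
        # Convert centimeters to inches first, then apply pixels per inch.
        factor = ppi / CM_PER_INCH
    else:
        raise ValueError(f"unit must be 'in' or 'cm', got {unit!r}")
    # Rounding to the nearest whole pixel is an assumption of this sketch.
    return round(width * factor), round(height * factor)


print(to_pixels(4, 3))                       # 4 in x 3 in at 300 ppi
print(to_pixels(10, 10, unit="cm", ppi=96))  # 10 cm x 10 cm on a 96 ppi display
```

A tuple produced this way has the same (width, height) shape that the size argument of wordcloud() expects.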
- corpy.vis.wordcloud(data, size=(400, 400), *, rounded=False, fast=True, fast_limit=800, **kwargs)
Generate a wordcloud.
If data is a string, the wordcloud is generated using the method WordCloud.generate_from_text(), which automatically ignores stopwords (customizable with the stopwords argument) and includes “collocations” (i.e. bigrams).
If data is a sequence or a mapping, the wordcloud is generated using the method WordCloud.generate_from_frequencies() and these preprocessing responsibilities fall to the user.
- Parameters:
data – input data: either one long string of text, an iterable of tokens, or a mapping of word types to their frequencies; use the second or third option if you want full control over the output
size – size in pixels, as a tuple of integers, (width, height); if you want to specify the size in inches or cm, use the size_in_pixels() function to generate this tuple
rounded – whether or not to enclose the wordcloud in an ellipse; incompatible with the mask keyword argument
fast – when True, optimizes large wordclouds for speed of generation rather than precision of word placement
fast_limit – speed optimizations for “large” wordclouds are applied when the requested canvas size is larger than fast_limit**2
kwargs – remaining keyword arguments are passed on to the wordcloud.WordCloud initializer
- Returns:
The word cloud.
- Return type:
wordcloud.WordCloud
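When data is passed as a sequence or a mapping, stopword filtering is left to the caller. The sketch below shows one way that preprocessing might look, using only the standard library. The STOPWORDS set and the token_frequencies helper are hypothetical names introduced for illustration, and the commented-out call at the end assumes corpy is installed.

```python
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "and"}


def token_frequencies(tokens, stopwords=STOPWORDS):
    """Count lowercased tokens, skipping any that appear in stopwords."""
    lowered = (token.lower() for token in tokens)
    return Counter(token for token in lowered if token not in stopwords)


tokens = "The cat and the hat and the bat of the cat".split()
freqs = token_frequencies(tokens)
print(freqs.most_common(1))  # [('cat', 2)]

# With corpy installed, the resulting mapping could be passed on, e.g.:
#   from corpy.vis import wordcloud
#   wc = wordcloud(freqs, size=(800, 600), rounded=True)
#   wc.to_file("cloud.png")  # to_file() is a method of wordcloud.WordCloud
```

Building the mapping yourself keeps decisions such as case folding and stopword removal explicit, which is the "full control" the data parameter description refers to.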