The most commonly used words of 24 corpora across 10 diverse human languages exhibit a clear positive bias, a big-data confirmation of the Pollyanna hypothesis. The study’s findings are based on 5 million individual human scores and pave the way for the development of powerful language-based tools for measuring emotion.
Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (i) the words of natural human language possess a universal positivity bias, (ii) the estimated emotional content of words is consistent between languages under translation, and (iii) this positivity bias is strongly independent of frequency of word use. Alongside these general regularities, we describe interlanguage variations in the emotional spectrum of languages that allow us to rank corpora. We also show how our word evaluations can be used to construct physical-like instruments for both real-time and offline measurement of the emotional content of large-scale texts.
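As a rough illustration of how per-word evaluations might be turned into a measurement of a text, the sketch below computes a frequency-weighted mean of word ratings. This is a minimal sketch under stated assumptions, not the instrument described in the study: the `EXAMPLE_SCORES` values, the 1-to-9 rating scale, the tokenizer, and the function name `text_score` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
import re

# Hypothetical word ratings on an assumed 1-9 scale.
# These values are illustrative only and are not taken from the study's data.
EXAMPLE_SCORES = {
    "love": 8.0,
    "happy": 8.0,
    "rain": 5.0,
    "war": 2.0,
}


def text_score(text, scores):
    """Return the frequency-weighted mean rating of the rated words in `text`.

    Words without a rating are ignored. Returns None when the text
    contains no rated words.
    """
    # Simple lowercase tokenizer (an assumption; real corpora need more care).
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[w] * n for w, n in counts.items()) / total


if __name__ == "__main__":
    print(text_score("Love, love, war", EXAMPLE_SCORES))
```

Because the score is an average over word counts, it can be recomputed over a sliding window of text for a real-time reading or over a whole archive for an offline one; how unrated or neutral words should be handled is a design choice this sketch leaves open.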