Avg. inf. cont., in bigrams

Description:

Average information content of a type, based on an evaluation of all bigrams having this type as their second component.

For any individual bigram, the information content of its second component is calculated from the conditional probability of the second component given the first, as the negative logarithm of that probability. Informally speaking, a high conditional probability corresponds to a low information content of the second component, whereas a low conditional probability corresponds to a high information content.

The average information content of a type, as given by this column, is defined as the negative average log conditional probability of this type (over all bigrams having this type as their second component):

AvgInfCont(w2) = -1*sum(C(w1w2)*log10(P(w2|w1))) / sum(C(w1w2))

where C is the frequency (count) of a type or bigram, and both sums run over all first components w1 that occur in a bigram with w2. This definition of average information content follows Piantadosi et al. (2011).

Data type:

Data type: Number
Data subtype: Double precision floating point number
Query operators: greater or equal, lower or equal
Null value: -1.0
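
Because the measure is a negated logarithm of a probability, genuine values are never negative, so the null value -1.0 cannot collide with a real result. When working with exported values outside the query interface, a plain "lower or equal" comparison would still match null rows. The following sketch shows one way to guard against that. The function name filter_avg_inf_cont and the (word, value) row layout are hypothetical, and whether the query interface itself excludes null rows is not stated here.

```python
NULL_VALUE = -1.0  # null marker given in the data type description

def filter_avg_inf_cont(rows, lower=None, upper=None):
    """Keep (word, value) rows with lower <= value <= upper, skipping null rows."""
    kept = []
    for word, value in rows:
        if value == NULL_VALUE:
            continue  # no average information content available for this type
        if lower is not None and value < lower:
            continue
        if upper is not None and value > upper:
            continue
        kept.append((word, value))
    return kept

# Hypothetical rows; the last one carries the null marker.
rows = [("cat", 0.197), ("dog", 0.181), ("xyzzy", -1.0)]
filtered = filter_avg_inf_cont(rows, upper=0.19)
print(filtered)
```

Here only the "dog" row is kept: the "cat" row exceeds the upper bound, and the null row is dropped even though -1.0 is numerically below the bound.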

Available in tables:

Also, in the ngrams tables, you can use this filter on any of the components: