Filters
This page gives an overview of all of dlexDB's filters across all tables.
Corpus bits
Derived textual representations
Downcased corpus bits:
- Type DC (downcased)
- Type bigram DC (downcased)
- Type trigram DC (downcased)
- Character DC (downcased)
- Character bigram DC (downcased)
- Character trigram DC (downcased)
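As a rough illustration of what the downcased (DC) representations amount to, the following sketch merges case-sensitive type counts into downcased counts and splits a downcased type into character n-grams. The function names and the toy counts are hypothetical and do not come from dlexDB; the sketch also assumes plain Unicode lowercasing, which may differ from the normalization dlexDB actually applies.

```python
from collections import Counter


def char_ngrams(word, n):
    """Return all contiguous character n-grams of a word, left to right."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def downcased_type_counts(type_counts):
    """Merge case-sensitive type counts into downcased (DC) type counts."""
    merged = Counter()
    for word, count in type_counts.items():
        # Assumption: simple lowercasing; dlexDB's own normalization may differ.
        merged[word.lower()] += count
    return merged


# Hypothetical toy frequency list, not taken from dlexDB.
toy_counts = {"Essen": 3, "essen": 5, "Haus": 2}

dc_counts = downcased_type_counts(toy_counts)   # "Essen" and "essen" collapse
dc_bigrams = char_ngrams("Haus".lower(), 2)     # character bigrams DC
dc_trigrams = char_ngrams("Haus".lower(), 3)    # character trigrams DC
```

In this sketch the case-sensitive types "Essen" and "essen" fall together into a single downcased type, which is why the DC frequencies listed further down can differ from their case-sensitive counterparts.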
Linguistic representations:
Frequencies
Case-sensitive:
- Annotated type frequency
- Type frequency
- Annotated type bigram frequency
- Annotated type trigram frequency
- Type bigram frequency
- Type trigram frequency
- Character corpus frequency
- Character lexicon frequency
- Character bigram corpus frequency
- Character bigram lexicon frequency
- Character trigram corpus frequency
- Character trigram lexicon frequency
Case-insensitive:
- Type DC frequency
- Type bigram DC frequency
- Type trigram DC frequency
- Character DC corpus frequency
- Character DC lexicon frequency
- Character bigram DC corpus frequency
- Character bigram DC lexicon frequency
- Character trigram DC corpus frequency
- Character trigram DC lexicon frequency
Other:
Conditional probabilities
Case-sensitive:
- Annotated type bigram cond. prob.
- Annotated type trigram cond. prob.
- Type bigram cond. prob.
- Type trigram cond. prob.
Case-insensitive:
Numerical filters
Case-sensitive:
- Familiarity
- Regularity
- Document frequency
- Sentence frequency
- Cumulative syllable corpus frequency
- Cumulative syllable lexicon frequency
- Cumulative character corpus frequency
- Cumulative character lexicon frequency
- Cumulative character bigram corpus frequency
- Cumulative character bigram lexicon frequency
- Cumulative character trigram corpus frequency
- Cumulative character trigram lexicon frequency
- Initial letter
- Initial bigram
- Initial trigram
- Uniqueness point (orth.) prefix length
- Uniqueness point (orth.) neg. offs.
- Uniqueness point (lemma) prefix length
- Uniqueness point (lemma) neg. offs.
- Avg. cond. prob., in bigrams
- Avg. inf. cont., in bigrams
- Avg. cond. prob., in trigrams
- Avg. inf. cont., in trigrams
Case-insensitive:
- Type DC familiarity
- Type DC regularity
- Type DC document frequency (not yet available)
- Type DC sentence frequency (not yet available)
- Cumulative character DC corpus frequency
- Cumulative character DC lexicon frequency
- Cumulative character bigram DC corpus frequency
- Cumulative character bigram DC lexicon frequency
- Cumulative character trigram DC corpus frequency
- Cumulative character trigram DC lexicon frequency
- Initial letter DC
- Initial bigram DC
- Initial trigram DC
- Uniqueness point DC (orth.) prefix length
- Uniqueness point DC (orth.) neg. offs.
- Uniqueness point DC (lemma) prefix length
- Uniqueness point DC (lemma) neg. offs.
- Avg. cond. prob. DC, in bigrams
- Avg. inf. cont. DC, in bigrams
- Avg. cond. prob. DC, in trigrams
- Avg. inf. cont. DC, in trigrams
Neighborhood measures
Case-sensitive:
- Neighbors Coltheart higher freq., cum. freq.
- Neighbors Coltheart higher freq., count
- Neighbors Coltheart all, cum. freq.
- Neighbors Coltheart all, count
- Neighbors Levenshtein higher freq., cum. freq.
- Neighbors Levenshtein higher freq., count
- Neighbors Levenshtein all, cum. freq.
- Neighbors Levenshtein all, count
Case-insensitive:
- Neighbors Coltheart DC higher freq., cum. freq.
- Neighbors Coltheart DC higher freq., count
- Neighbors Coltheart DC all, cum. freq.
- Neighbors Coltheart DC all, count
- Neighbors Levenshtein DC higher freq., cum. freq.
- Neighbors Levenshtein DC higher freq., count
- Neighbors Levenshtein DC all, cum. freq.
- Neighbors Levenshtein DC all, count
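The neighborhood filters above can be illustrated with a minimal sketch. It assumes the usual definitions: a Coltheart neighbor has the same length and differs in exactly one position, while a Levenshtein neighbor is at edit distance 1 (one substitution, insertion, or deletion). It also assumes that "higher freq." compares a neighbor's frequency with that of the target word. The function names and the toy lexicon are hypothetical, and dlexDB's exact definitions may differ.

```python
def is_coltheart_neighbor(a, b):
    """Same length, exactly one differing position (one substitution)."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def is_levenshtein_neighbor(a, b):
    """Edit distance exactly 1: one substitution, insertion, or deletion."""
    if is_coltheart_neighbor(a, b):
        return True
    if abs(len(a) - len(b)) != 1:
        return False
    short, long_ = sorted((a, b), key=len)
    # Deleting one character from the longer word must yield the shorter one.
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def neighbor_stats(word, lexicon, is_neighbor):
    """Count and cumulative frequency for all and higher-frequency neighbors."""
    own_freq = lexicon.get(word, 0)
    neighbors = {w: f for w, f in lexicon.items() if is_neighbor(word, w)}
    higher = {w: f for w, f in neighbors.items() if f > own_freq}
    return {
        "all_count": len(neighbors),
        "all_cum_freq": sum(neighbors.values()),
        "higher_count": len(higher),
        "higher_cum_freq": sum(higher.values()),
    }


# Hypothetical toy lexicon with made-up frequencies, not dlexDB data.
lexicon = {"Hand": 50, "Band": 30, "Wand": 70, "Hund": 60, "Han": 1, "Handy": 5}

coltheart = neighbor_stats("Hand", lexicon, is_coltheart_neighbor)
levenshtein = neighbor_stats("Hand", lexicon, is_levenshtein_neighbor)
```

In this toy lexicon, "Hand" has three Coltheart neighbors ("Band", "Wand", "Hund"); the Levenshtein variant additionally admits "Han" and "Handy", so its counts and cumulative frequencies are never smaller than the Coltheart ones.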
Neighbors tables
Current version
- 0.3
- New tables: all measures are now also available in a case-insensitive variant.