Project description
dlex is a cooperative project of the departments of cognitive psychology and theoretical computational linguistics at the University of Potsdam and the project Digital Dictionary of the German Language (DWDS: Digitales Wörterbuch der deutschen Sprache) at the Berlin-Brandenburg Academy of Sciences and Humanities. It is funded by the DFG (Deutsche Forschungsgemeinschaft).
The goal of the project is to establish a lexical database for psychological and linguistic research.
The underlying resource is the Kernkorpus (core corpus) of the DWDS, which comprises over 100 million tokens. In addition to the planned frequency counts at the supralexical (n-gram), sublexical (morpheme and syllable structure) and lexical levels, specific variables such as contextual diversity and orthographic neighbours are provided. These data go well beyond what the lexicographically oriented DWDS project has compiled.
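To make these variables concrete, the following minimal Python sketch computes them on a toy corpus. Everything in it is an assumption for illustration: the three sample documents, the function names, and the specific definitions used (contextual diversity as the number of documents containing a word form, orthographic neighbours as same-length word forms differing in exactly one letter). The definitions actually used in dlexDB may differ.

```python
from collections import Counter

# Toy corpus: each inner list is one document.
# Hypothetical data for illustration, not taken from the DWDS Kernkorpus.
documents = [
    ["das", "haus", "und", "die", "maus"],
    ["die", "maus", "im", "haus"],
    ["der", "hase", "und", "die", "laus"],
]


def token_frequency(docs):
    """Lexical level: how often each word form occurs in the whole corpus."""
    return Counter(token for doc in docs for token in doc)


def bigram_frequency(docs):
    """Supralexical level: how often each pair of adjacent word forms occurs."""
    return Counter(pair for doc in docs for pair in zip(doc, doc[1:]))


def contextual_diversity(docs):
    """Number of documents in which each word form occurs at least once."""
    return Counter(token for doc in docs for token in set(doc))


def orthographic_neighbours(word, vocabulary):
    """Word forms of equal length that differ from `word` in exactly one letter."""
    return sorted(
        other
        for other in vocabulary
        if len(other) == len(word)
        and sum(a != b for a, b in zip(word, other)) == 1
    )


freq = token_frequency(documents)
bigrams = bigram_frequency(documents)
diversity = contextual_diversity(documents)
neighbours = orthographic_neighbours("haus", set(freq))
```

In this toy corpus, "haus" occurs twice, appears in two of the three documents, and has the neighbours "laus" and "maus"; "hase" differs in two positions and is therefore not counted.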
To validate the relevance of the different variables for human language processing, we are conducting a series of eye-movement and reaction-time experiments as part of the project.
Compiling linguistic data such as morpheme, phoneme and syllable frequencies for the word forms of a corpus of this size cannot be done manually. We therefore draw on existing research results such as the efficient morphological and syntactic analysis provided by the division of theoretical computational linguistics at the University of Potsdam.
The lexical database dlexDB aims to provide a public resource for studies in experimental psychology, psycholinguistics and linguistics, and to supplement CELEX in this area.
Funded by Deutsche Forschungsgemeinschaft (KL 955/12-1 and KL 955/19-1).
Contents
Current version
- 0.3
- New tables: all measures in a case-insensitive variant.