List query

Suppose you have already compiled a list of words (e.g., from your own research corpus) and would like to retrieve some of dlexDB's variables for each of these words: this is what the list query interface is for.

On the query page, the selection Filter query vs. List query is on the right-hand side, below the table selection area. When you choose List query, one designated filter is removed from the list of active filters (if any) and disappears from the filters area in the center of the screen. In its place, a list entry box appears where you can enter a list of words.

Please note that as of dlexDB version 0.3, each word must be on its own line. It is no longer possible to enter a list of words separated only by spaces. This change was necessary because list query is now also possible for type bigrams and type trigrams, which contain spaces themselves.

Table                       List search on column
Annotated types             Type
Annotated type bigrams      Type bigram
Annotated type trigrams     Type trigram
Types                       Type
Type bigrams                Type bigram
Type trigrams               Type trigram
Types DC                    Type DC (downcased)
Type bigrams DC             Type bigram DC (downcased)
Type trigrams DC            Type trigram DC (downcased)
Characters                  Character
Character bigrams           Character bigram
Character trigrams          Character trigram
Characters DC               Character DC (downcased)
Character bigrams DC        Character bigram DC (downcased)
Character trigrams DC       Character trigram DC (downcased)
Lemmata                     Lemma
Syllables                   Syllable
Neighbors Coltheart         Type
Neighbors Levenshtein       Type
Neighbors Coltheart DC      Type DC (downcased)
Neighbors Levenshtein DC    Type DC (downcased)

Of course, you can transfer word lists from other applications by selecting, copying and pasting them into the multi-line input field. Another possibility, especially for longer word lists, is to upload a file you have created beforehand. dlexDB reads only plain text data, not Word documents or other formats. However, you can create a plain text file with your word processing software by exporting the document as Text (.TXT). If you are using spreadsheet software or a database application, you may be able to export a single column as a CSV file.

The text file to upload must contain one word per line. If the word list contains diacritical or other special characters, the file must be in UTF-8 encoding (default on most operating systems).
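The file requirements above (one entry per line, UTF-8) can be met with a few lines of Python if your words live in a spreadsheet export. The following is only a sketch: the function name `column_to_word_list`, the column name and the file names are hypothetical, and it assumes the CSV file has a header row and is itself UTF-8 encoded.

```python
import csv
from pathlib import Path


def column_to_word_list(csv_path, column, out_path):
    """Write one CSV column to a UTF-8 text file, one entry per line."""
    with open(csv_path, newline="", encoding="utf-8") as src:
        # Missing cells come back as None, so fall back to an empty string.
        entries = [(row.get(column) or "").strip() for row in csv.DictReader(src)]
    # Drop empty cells; inner spaces are kept so bigrams and trigrams stay intact.
    entries = [entry for entry in entries if entry]
    Path(out_path).write_text("\n".join(entries) + "\n", encoding="utf-8")
    return entries
```

For example, `column_to_word_list("stimuli.csv", "word", "wordlist.txt")` would produce a file that can be uploaded or pasted into the list entry box.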

After entering or uploading a list, please run your query. The results table will be shown right below the Execute query button. For each word on your list, the output contains one or several rows. The first column shows the position of the word in your input list, and the second column shows the corresponding input word. The third column (depending on which base table you are querying) contains the dlexDB entity (e.g., a type, see table above) which has been associated with your input word. By default, a final column shows the dlexDB frequency of this entity. Using the Filter selection tree, you can add more variables (output columns) by activating the eye symbol next to the filter in question. Then re-run your query to refresh the results table.

Depending on which base table you are querying, your input list will be applied in a different manner. When querying the Types table, for each input word dlexDB fetches the orthographically identical dlexDB type (if available). In this case, for each word in the input there will be exactly one row in the output. But if you choose to run the query in case-insensitive mode (by activating Ignore case next to the list input), each input word will generally match several entries in dlexDB's Types table, and the results table will contain several rows per input word. For example, if your input word is singen, the output table will contain three corresponding rows for the three dlexDB types singen, Singen and SINGEN with their respective frequencies.
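The difference between the two matching modes can be illustrated with a small in-memory sketch. This is not dlexDB's implementation: `match_types` and `toy_types` are hypothetical names, the toy list stands in for the Types table, and case folding via `str.lower` is an assumption. The sketch also simply omits input words without a match, which may differ from how dlexDB reports them.

```python
def match_types(input_words, db_types, ignore_case=False):
    """Return (input number, input word, matched type) rows."""
    rows = []
    for number, word in enumerate(input_words, start=1):
        if ignore_case:
            # Case-insensitive: compare lower-cased forms (an assumption).
            hits = [t for t in db_types if t.lower() == word.lower()]
        else:
            # Case-sensitive: only the orthographically identical type matches.
            hits = [t for t in db_types if t == word]
        rows.extend((number, word, t) for t in hits)
    return rows


# Toy stand-in for the Types table (not real dlexDB data).
toy_types = ["singen", "Singen", "SINGEN", "Haus"]

print(match_types(["singen"], toy_types))
# [(1, 'singen', 'singen')]
print(match_types(["singen"], toy_types, ignore_case=True))
# [(1, 'singen', 'singen'), (1, 'singen', 'Singen'), (1, 'singen', 'SINGEN')]
```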

When querying the Annotated types table, the output will almost always be longer than your input list. For each word from the list, the orthographically identical dlexDB type is looked up, but many dlexDB types have more than one entry in the Annotated types table, with different part-of-speech tags, different lemma associations and different frequencies. In the case-insensitive query mode, there will be even more results.
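Why the output grows can be seen from a one-to-many lookup. The sketch below uses a hypothetical toy table of (type, part-of-speech tag, lemma) entries; the tags and entries are illustrative values only, not dlexDB data, and `lookup_annotated` is a made-up name rather than part of dlexDB.

```python
from collections import defaultdict

# Illustrative entries only: (type, part-of-speech tag, lemma).
toy_annotated = [
    ("singen", "VVINF", "singen"),
    ("singen", "VVFIN", "singen"),
    ("Singen", "NN", "Singen"),
]


def lookup_annotated(input_words, annotated, ignore_case=False):
    """Return one output row per annotated entry matching each input word."""
    key = str.lower if ignore_case else (lambda s: s)
    # Index the table by (optionally lower-cased) type.
    index = defaultdict(list)
    for entry in annotated:
        index[key(entry[0])].append(entry)
    rows = []
    for number, word in enumerate(input_words, start=1):
        for entry in index.get(key(word), []):
            rows.append((number, word) + entry)
    return rows


print(len(lookup_annotated(["singen"], toy_annotated)))
# 2
print(len(lookup_annotated(["singen"], toy_annotated, ignore_case=True)))
# 3
```

A single input word thus yields two rows in the case-sensitive mode of this toy example and three rows once case is ignored.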

When querying the Lemmata table in List query mode, for each word in your input list dlexDB tries to find an orthographically identical lemma. Such a query generally makes sense only if your input list consists of the canonical or headword forms of words (e.g., for verbs, the infinitive). For inflected forms, there won't be any matches in the Lemmata table. This query can also be run case-insensitively.

Finally, with a List query, you can additionally apply any filter that dlexDB offers for the base table you are querying. So you can upload a word list and filter or sort it with respect to the measures that dlexDB provides (e.g., frequency, familiarity, regularity or neighborhood measures).