Lexicon
Dictionaries at Sefaria
A Lexicon (dictionary) in Sefaria is composed of a Lexicon object representing the whole work, and LexiconEntry objects for each entry.
Additionally, a Lexicon can have WordForm objects for each written expression of a word in the lexicon. WordForms represent the various conjugations of words that appear in our texts, and attempt to link each one to its headword.
If the Lexicon is to be viewed as an independent text, it needs to have an Index record as well. This Index will have special fields, and can optionally have a Version record for any additional textual content which is not a dictionary entry (e.g. introductory text to the lexicon).
To see documentation on our lexicon APIs, see here
Understanding the Lexicon Object Model
Lexicon
If the Lexicon has an associated Index, Lexicon.index_title must match the title of the Index object, and on the Index, lexiconName must match the name of the Lexicon object. If it has a Version associated with it, place the title and language of the version in the version_title and version_lang attributes of the Lexicon.
- name is the key field for the Lexicon
- attribution, source and source_url are descriptive.
- language
- to_language
- text_categories
Example of a Lexicon object in our database:
{
"text_categories" : [],
"attribution" : "Rabbi Marcus Jastrow",
"name" : "Jastrow Dictionary",
"language" : "heb.talmudic",
"version_lang" : "en",
"to_language" : "eng",
"index_title" : "Jastrow",
"source" : "Jastrow Dictionary",
"version_title" : "London, Luzac, 1903",
"should_autocomplete" : true
}
Lexicon Entry
Recall that every Lexicon object is composed of LexiconEntry objects, which represent the words in that Lexicon.
Notes:
- parent_lexicon must match Lexicon.name
- headword - together with parent_lexicon, the key for the Lexicon entry. Must be unique for this Lexicon.
- prev_hw - The headword for the entry just before this one. (Required when the Lexicon is presented as a text, with an Index.)
- next_hw - The headword for the entry just after this one. (Required when the Lexicon is presented as a text, with an Index.)
- rid - unique ID. Used for lexical sorting. When presented in order, the rid should be in order. rid values should begin with a letter, to ensure lexical, and not numeric, sorting.
{
"headword" : "אִיזְמֵיל",
"parent_lexicon" : "Jastrow Dictionary",
"rid" : "A01200",
"language_code" : " h. a. ch.",
"refs" : [
"Chullin 31a:11",
"Shemot Rabbah 26"
],
"content" : {
"senses" : [
{
"definition" : " (זמל √<span dir=\"rtl\">מל</span>; cmp. b. h. <span dir=\"rtl\">סמל</span>; cmp. <a dir=\"rtl\" class=\"refLink\" href=\"/Jastrow,_*אִיזְגָּרָא.1\" data-ref=\"Jastrow, *אִיזְגָּרָא 1\">אִיזְגָּרָא</a>) <i>cutting tool, knife</i>, esp. <i>surgeon’s knife</i>. <a class=\"refLink\" href=\"/Aramaic_Targum_to_Job.16.9\" data-ref=\"Aramaic Targum to Job 16:9\">Targ. Job XVI, 9</a>; a. e.—<a class=\"refLink\" href=\"/Chullin.31a.11\" data-ref=\"Chullin 31a:11\">Ḥull. 31ᵃ</a> <span dir=\"rtl\">א׳ שיש לו קרנים</span> a knife which has hornlike projections as ornaments. Y. Sabb. XIX, beg. 16ᵈ <span dir=\"rtl\">אנשון מייתי או׳</span> they had forgotten to bring the knife (for circumcision). <a class=\"refLink\" href=\"/Shemot_Rabbah.26\" data-ref=\"Shemot Rabbah 26\">Ex. R. s. 26</a> man <span dir=\"rtl\">מכה באי׳ וכ׳</span> wounds with a knife (operating) and heals &c. Pl. Chald. <span dir=\"rtl\">אִזְמֵלַיָּיא</span>; <span dir=\"rtl\">אִזְמַלְוָון</span> (f.). <a class=\"refLink\" href=\"/Targum_Jonathan_on_Isaiah.44.13\" data-ref=\"Targum Jonathan on Isaiah 44:13\">Targ. Is. XLIV, 13</a>. <a class=\"refLink\" href=\"/Targum_Jonathan_on_Joshua.5.2\" data-ref=\"Targum Jonathan on Joshua 5:2\">Targ. Josh. V, 2</a>."
}
],
"morphology" : "m."
},
"plural_form" : [
"אִזְמֵלַיָּיא",
"אִזְמַלְוָון"
],
"alt_headwords" : [
"אִיזְמֵל",
"אִזְ׳",
"(אוּזְמֵל)"
],
"quotes" : [
],
"prev_hw" : "אִיזְמָא",
"next_hw" : "איזמר"
}
When a Lexicon is to be presented as a text in itself (with an Index record), the LexiconEntry objects are arranged with pointers to the entries before and after.
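The pointer arrangement described above can be sketched in Python. This is a minimal illustration that works on plain dictionaries shaped like the LexiconEntry example, not on Sefaria's model classes; the function names walk_entries and check_chain and the sample data are hypothetical. It assumes all entries belong to a single Lexicon, so that headword alone identifies an entry.

```python
def walk_entries(entries, first_headword):
    """Yield entries in reading order by following next_hw pointers."""
    by_headword = {e["headword"]: e for e in entries}
    seen = set()  # guards against a cycle in the pointers
    hw = first_headword
    while hw is not None and hw in by_headword and hw not in seen:
        seen.add(hw)
        entry = by_headword[hw]
        yield entry
        hw = entry.get("next_hw")


def check_chain(entries, first_headword):
    """Return a list of problems in the prev_hw pointers and rid ordering."""
    problems = []
    previous = None
    for entry in walk_entries(entries, first_headword):
        if previous is not None:
            if entry.get("prev_hw") != previous["headword"]:
                problems.append(f"{entry['headword']}: prev_hw does not point back")
            # rid values are compared as strings, i.e. lexically
            if entry["rid"] <= previous["rid"]:
                problems.append(f"{entry['headword']}: rid out of order")
        previous = entry
    return problems


# Hypothetical sample data, deliberately stored out of order
entries = [
    {"headword": "b", "rid": "A00002", "prev_hw": "a", "next_hw": "c"},
    {"headword": "a", "rid": "A00001", "next_hw": "b"},
    {"headword": "c", "rid": "A00003", "prev_hw": "b"},
]
ordered = [e["headword"] for e in walk_entries(entries, "a")]
problems = check_chain(entries, "a")
```

Because rid values are compared as strings, a value such as "A00002" sorts before "A00010" only when the numeric part is zero-padded to a fixed width, which is one reason to keep the format uniform across a Lexicon.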
WordForm
There can be many WordForm objects corresponding to a LexiconEntry, and many LexiconEntry objects for a WordForm.
Example:
{
"form" : "פתיח",
"lookups" : [
{
"parent_lexicon" : "Klein Dictionary",
"headword" : "פָּתֽיחַ ᴵ"
}
],
"c_form" : "פתיח",
"refs" : [
"Eruvin 24b:4",
"Megillah 26b:13"
],
"generated_by" : "prefix_adder_1"
}
Note: lookups is the list of LexiconEntry objects that this form points to, each identified by parent_lexicon and headword. There can be many.
Refs in WordForm
The refs list is meant to be a way to further restrict the correspondence between the naturally occurring word in a given text and a LexiconEntry. While the word may appear in the same form throughout the Sefaria library, it may have different meanings based on the context or nature of the work within which it appears. The refs list associates a given definition of a word with the places where the word appears with that meaning.
For example, if a word in Biblical Hebrew has a different meaning than when it appears in Modern Hebrew works, there will be separate WordForm objects: one representing the Biblical Hebrew word and associating it with refs from the Bible, and the other representing the Modern Hebrew word and associating it with refs corresponding to its appearance in Modern Hebrew works.
Many-to-Many
When querying for a WordForm for a given string, one may receive many results. This allows maximum flexibility in representing natural language.
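As a rough sketch of how the refs restriction and the many-to-many relationship interact, the following Python works on plain dictionaries shaped like the WordForm example. The function name lookups_for, the lexicon names, and the sample data are hypothetical. The fallback to all matching forms when no WordForm lists the ref is an assumption made for this illustration, not documented Sefaria behavior.

```python
def lookups_for(word_forms, form, ref=None):
    """Return (parent_lexicon, headword) pairs for a surface form.

    If ref is given and at least one matching WordForm lists it in refs,
    only those WordForms are used. Otherwise every matching WordForm is
    used (an assumption made for this sketch).
    """
    matching = [wf for wf in word_forms if wf.get("form") == form]
    if ref is not None:
        restricted = [wf for wf in matching if ref in wf.get("refs", [])]
        if restricted:
            matching = restricted
    results = []
    for wf in matching:
        for lookup in wf.get("lookups", []):
            pair = (lookup["parent_lexicon"], lookup["headword"])
            if pair not in results:  # keep order, drop duplicates
                results.append(pair)
    return results


# Hypothetical data: one surface form with two context-dependent meanings
word_forms = [
    {"form": "x", "refs": ["Genesis 1:1"],
     "lookups": [{"parent_lexicon": "Lexicon A", "headword": "x1"}]},
    {"form": "x", "refs": ["Modern Work 3"],
     "lookups": [{"parent_lexicon": "Lexicon B", "headword": "x2"}]},
]
in_context = lookups_for(word_forms, "x", "Genesis 1:1")
no_context = lookups_for(word_forms, "x")
```

Here in_context contains only the Lexicon A entry, while no_context contains both entries, since a bare string query can match several WordForm objects.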
Index
When a dictionary is presented as a text, it has a special Index record that has a lexiconName element at the root of the record, and a DictionaryNode element in the schema. The lexiconName must match the name of the Lexicon object. (From the reverse direction, Lexicon.index_title must match the title of the Index object.)
DictionaryNode
A DictionaryNode can be placed anywhere within a complex schema tree.
- nodeType - will be DictionaryNode
- lexiconName
- default - If it's true, entries can be referenced just with the dictionary name.
- lastWord
- firstWord
- headwordMap
Below is the full Index record for the Jastrow dictionary. Note the lexiconName element at the root and the DictionaryNode element in the schema.
{
"categories" : [
"Reference"
],
"schema" : {
"key" : "Jastrow",
"nodes" : [
{
"lexiconName" : "Jastrow Dictionary",
"default" : true,
"lastWord" : "תתנא",
"firstWord" : "א",
"nodeType" : "DictionaryNode",
"headwordMap" : [
[
"א",
"Jastrow, א"
],
[
"ב",
"Jastrow, ב"
],
[
"ג",
"Jastrow, ג"
],
[
"ד",
"Jastrow, ד"
],
[
"ה",
"Jastrow, ה"
],
[
"ו",
"Jastrow, ו"
],
[
"ז",
"Jastrow, ז"
],
[
"ח",
"Jastrow, ח"
],
[
"ט",
"Jastrow, ט"
],
[
"י",
"Jastrow, י"
],
[
"כ",
"Jastrow, כ"
],
[
"ל",
"Jastrow, ל"
],
[
"מ",
"Jastrow, מ"
],
[
"נ",
"Jastrow, נ"
],
[
"ס",
"Jastrow, ס"
],
[
"ע",
"Jastrow, ע"
],
[
"פ",
"Jastrow, פ"
],
[
"צ",
"Jastrow, צ"
],
[
"ק",
"Jastrow, ק"
],
[
"ר",
"Jastrow, ר"
],
[
"ש",
"Jastrow, שׁ"
],
[
"ת",
"Jastrow, ת"
]
]
},
{
"key" : "Preface",
"addressTypes" : [
"Integer"
],
"sectionNames" : [
"Paragraph"
],
"nodeType" : "JaggedArrayNode",
"depth" : 1,
"titles" : [
{
"primary" : true,
"lang" : "he",
"text" : "הקדמה"
},
{
"primary" : true,
"lang" : "en",
"text" : "Preface"
}
]
},
{
"key" : "Hebrew or Aramaic Abbreviations",
"addressTypes" : [
"Integer"
],
"sectionNames" : [
"Line"
],
"nodeType" : "JaggedArrayNode",
"depth" : 1,
"titles" : [
{
"primary" : true,
"lang" : "he",
"text" : "קיצורים בעברית או בארמית"
},
{
"primary" : true,
"lang" : "en",
"text" : "Hebrew or Aramaic Abbreviations"
}
]
},
{
"key" : "List of Abbreviations",
"addressTypes" : [
"Integer"
],
"sectionNames" : [
"Line"
],
"nodeType" : "JaggedArrayNode",
"depth" : 1,
"titles" : [
{
"primary" : true,
"lang" : "he",
"text" : "רשימת קיצורים"
},
{
"primary" : true,
"lang" : "en",
"text" : "List of Abbreviations"
}
]
}
],
"titles" : [
{
"primary" : true,
"lang" : "he",
"text" : "מילון יסטרוב"
},
{
"primary" : true,
"lang" : "en",
"text" : "Jastrow"
}
]
},
"enDesc" : "A Dictionary of the Targumim, the Talmud Bavli and Yerushalmi, and the Midrashic Literature",
"order" : [
],
"is_cited" : false,
"pubPlace" : "New York",
"compPlace" : "Philadelphia",
"lexiconName" : "Jastrow Dictionary",
"pubDate" : "1903",
"title" : "Jastrow",
"era" : "CO",
"errorMargin" : "10",
"authors" : [
"Marcus Jastrow"
],
"compDate" : "1893"
}
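The matching rules between a Lexicon and its Index can be checked with a short Python sketch over plain dictionaries shaped like the records on this page. The function names find_dictionary_nodes and check_links are hypothetical, and the sample records are trimmed versions of the Jastrow examples. The check that the DictionaryNode's own lexiconName equals the Lexicon name is an assumption based on the example record, where the two values agree.

```python
def find_dictionary_nodes(node):
    """Recursively collect every DictionaryNode in a schema tree."""
    found = []
    if node.get("nodeType") == "DictionaryNode":
        found.append(node)
    for child in node.get("nodes", []):
        found.extend(find_dictionary_nodes(child))
    return found


def check_links(lexicon, index):
    """Return a list of mismatches between a Lexicon and its Index."""
    problems = []
    if lexicon.get("index_title") != index.get("title"):
        problems.append("Lexicon.index_title does not match Index title")
    if index.get("lexiconName") != lexicon.get("name"):
        problems.append("Index lexiconName does not match Lexicon name")
    nodes = find_dictionary_nodes(index.get("schema", {}))
    if not nodes:
        problems.append("schema has no DictionaryNode")
    for node in nodes:
        # Assumption: the node-level lexiconName should also match
        if node.get("lexiconName") != lexicon.get("name"):
            problems.append("DictionaryNode lexiconName does not match Lexicon name")
    return problems


# Trimmed versions of the Jastrow records shown on this page
lexicon = {"name": "Jastrow Dictionary", "index_title": "Jastrow"}
index = {
    "title": "Jastrow",
    "lexiconName": "Jastrow Dictionary",
    "schema": {
        "key": "Jastrow",
        "nodes": [
            {"nodeType": "DictionaryNode", "default": True,
             "lexiconName": "Jastrow Dictionary"},
            {"nodeType": "JaggedArrayNode", "key": "Preface"},
        ],
    },
}
problems = check_links(lexicon, index)
```

The search is recursive because a DictionaryNode can sit at any depth of a complex schema tree, not only directly under the root.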
Version
A Version record is necessary for any regular (non-definition) text in the dictionary, such as introductory text.
Important Notes
- In sefaria/model/lexicon.py, each dictionary relates to a subclass of DictionaryEntry. Those correspondences are listed in LexiconEntrySubClassMapping.
- In sefaria.js, there is a line that lists the dictionaries: Sefaria.virtualBooksDict = [...]
- If this lexicon participates in the cross-dictionary auto-completer, it needs to be listed in library.build_lexicon_auto_completers