The Structure of a Simple Text
Exploring the Index schema for "simple" texts at Sefaria.
The Schema of a Simple Text
The simplest Index records in the system have schema trees with just one node: a content node. Specifically, they have a JaggedArrayNode, which is a type of content node.
For the JaggedArrayNode, there's a corresponding JaggedArray on the Version (not included in the diagram) containing the text.
Example: The Book of Genesis
An example of a simple schema is the book of Genesis. The text for Genesis is stored in a depth 2 JaggedArray: an array of arrays. Each outer element represents a chapter, and each chapter is an array of strings, which are the verses.
The content for this book looks like this:
[
    ["Verse 1:1", "Verse 1:2", ...],  # Chapter 1
    ["Verse 2:1", "Verse 2:2", ...],  # Chapter 2
    [...]
]
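To make the chapter-and-verse addressing concrete, here is a minimal Python sketch that treats the depth 2 JaggedArray as a plain list of lists. The sample strings and the helper `get_verse` are hypothetical, made up for illustration, and are not part of Sefaria's code. The sketch assumes the usual 1-based chapter and verse numbering.

```python
# Hypothetical stand-in for a depth 2 JaggedArray: a list of chapters,
# each chapter being a list of verse strings. Sample data only.
sample_text = [
    ["Verse 1:1", "Verse 1:2", "Verse 1:3"],  # Chapter 1
    ["Verse 2:1", "Verse 2:2"],               # Chapter 2
]


def get_verse(text, chapter, verse):
    """Return one verse, given 1-based chapter and verse numbers."""
    if chapter < 1 or verse < 1:
        raise IndexError("chapter and verse numbers start at 1")
    # Python lists are 0-based, so shift both numbers down by one.
    return text[chapter - 1][verse - 1]


print(get_verse(sample_text, 2, 1))  # prints "Verse 2:1"
```

Because chapters can have different numbers of verses, the inner lists are allowed to differ in length, which is what makes the array "jagged".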
The Index schema describing the book looks like this.
{
    "title" : "Genesis",
    "maps" : [],
    "order" : [1, 1],
    "categories" : ["Tanach", "Torah"],
    "schema" : {
        "titles" : [
            {
                "lang" : "en",
                "text" : "Genesis",
                "primary" : True
            },
            {
                "lang" : "en",
                "text" : "Bereishit"
            },
            {
                "lang" : "he",
                "text" : "בראשית",
                "primary" : True
            }
        ],
        "nodeType" : "JaggedArrayNode",
        "lengths" : [50, 1533],
        "depth" : 2,
        "sectionNames" : ["Chapter", "Verse"],
        "addressTypes" : ["Integer", "Integer"],
        "key" : "Genesis"
    }
}
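As a rough illustration of how the titles list can be read, the sketch below pulls the primary title for one language out of a schema dictionary shaped like the one above. The function name `primary_title` is hypothetical and is not Sefaria's API; the dictionary is an abbreviated copy of the Genesis schema node.

```python
# Abbreviated copy of the Genesis schema node, as a plain Python dict.
genesis_schema = {
    "titles": [
        {"lang": "en", "text": "Genesis", "primary": True},
        {"lang": "en", "text": "Bereishit"},
        {"lang": "he", "text": "בראשית", "primary": True},
    ],
    "nodeType": "JaggedArrayNode",
    "depth": 2,
    "sectionNames": ["Chapter", "Verse"],
    "addressTypes": ["Integer", "Integer"],
    "key": "Genesis",
}


def primary_title(schema, lang):
    """Return the single primary title for lang ("en" or "he").

    Hypothetical helper: raises ValueError unless exactly one title
    in that language is marked primary.
    """
    matches = [
        t["text"]
        for t in schema["titles"]
        if t["lang"] == lang and t.get("primary")
    ]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one primary {lang!r} title, found {len(matches)}"
        )
    return matches[0]


print(primary_title(genesis_schema, "en"))  # prints "Genesis"
```

Titles without a "primary" key, such as "Bereishit" here, are treated as alternate titles and skipped by the lookup.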
Let's dive into the various properties on the schema:
Property | Value in Genesis | Explanation |
---|---|---|
key | Genesis | A text field. For a single-node schema like this, the value of key is the same as the value of the title field on the Index record. |
nodeType | JaggedArrayNode | This corresponds to a related class in the Python code. For single-node schemas, the value is currently always JaggedArrayNode. |
titles | [ { "lang" : "en", "text" : "Genesis", "primary" : True }, { "lang" : "en", "text" : "Bereishit" }, { "lang" : "he", "text" : "בראשית", "primary" : True } ] | An array of dictionaries specifying titles for this node (for a full description of how titles work, see Titles). Each title dictionary has two required keys, text (the title string) and lang (either "en" or "he"), and one conditional key, primary, which must be present and True for exactly one Hebrew title and exactly one English title. |
depth | 2 | The depth of the JaggedArray. A two-dimensional array (i.e. a list of lists) has a depth of 2, a three-dimensional array (i.e. a list of lists of lists) has a depth of 3, and so on. |
addressTypes | ["Integer", "Integer"] | Array with depth number of values, each one indicating how that level of the JaggedArray is addressed. Most commonly these values are Integer, but they can also be Talmud or one of the less common values defined in sefaria.model.schema. |
sectionNames | ["Chapter","Verse"] | Array with depth number of values, each one a string name for that level of the JaggedArray. |
toc_zoom | n/a | An integer value used primarily to adjust how commentaries are organized around the base text (and therefore not present on the Index of Genesis). Usually, a commentary is organized by the segments of commentary on the verse of the base text it comments on. Adjusting toc_zoom lets you display the commentary on a verse-by-verse basis (section), a chapter-by-chapter basis (super-section), or at a different level of the index depth. toc_zoom sets the depth for display in the table of contents according to the following values: 0 displays segments (each string in the JaggedArray); 1 displays sections; 2 displays super-sections. If not set, the table of contents displays the section level (or the segment level for depth 1 texts). An example of a commentary Index with an adjusted toc_zoom can be found here, where comments are aggregated by verse of the base text for display (sections), instead of by individual comment (segments). |
lengths (optional) | [50, 1533] | Array with up to depth number of values, each one an integer specifying how many elements exist at that level of the JaggedArray. In this case, Genesis has 50 chapters and 1533 verses in total. |
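The per-level constraints in the table can be checked mechanically. The sketch below is a minimal illustration, not Sefaria's own validation code: `compute_lengths` and `check_array_fields` are hypothetical helper names, and the sample text is made up. It assumes, following the Genesis example, that each entry in lengths is the total count of elements at that level (all chapters, then all verses).

```python
def compute_lengths(text, depth):
    """Count the elements at each level of a nested list of the given depth."""
    lengths = []
    level = [text]  # lists whose items live at the current level
    for _ in range(depth):
        # Flatten one level: gather every item of every list at this level.
        items = [item for lst in level for item in lst]
        lengths.append(len(items))
        level = items  # above the last level, these items are lists again
    return lengths


def check_array_fields(node):
    """Return a list of problems with the per-level array fields of a node."""
    depth = node["depth"]
    problems = []
    for field in ("sectionNames", "addressTypes"):
        if len(node[field]) != depth:
            problems.append(f"{field} needs exactly {depth} values")
    # lengths is optional and may describe fewer levels than depth.
    if len(node.get("lengths", [])) > depth:
        problems.append(f"lengths may have at most {depth} values")
    return problems


# Made-up sample: 2 chapters holding 5 verses in total.
sample_text = [
    ["Verse 1:1", "Verse 1:2", "Verse 1:3"],
    ["Verse 2:1", "Verse 2:2"],
]
genesis_node = {
    "depth": 2,
    "sectionNames": ["Chapter", "Verse"],
    "addressTypes": ["Integer", "Integer"],
    "lengths": [50, 1533],
}

print(compute_lengths(sample_text, 2))   # prints [2, 5]
print(check_array_fields(genesis_node))  # prints []
```

Run against the full text of Genesis, a function like `compute_lengths` would be expected to reproduce the [50, 1533] value stored on the schema.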