JaggedArray and JaggedArray Nodes
JaggedArrays and JaggedArrayNodes are the objects used to represent the text content and the structure of simple texts.
Content Nodes
Content nodes describe the structure in which the text is stored, from that level of the tree downward. Currently, all content nodes are JaggedArrayNodes: nodes that describe a JaggedArray (list of lists) structure.
JaggedArray
A JaggedArray is a nested array (i.e. a list of lists) of a certain depth, with the lowest level being an array of strings. A JaggedArray with a depth of 2 is an array of arrays of strings. A common example of a text represented by a JaggedArray of depth 2 is a book of Tanakh. A book of Tanakh has many chapters, and each chapter has many verses. Each chapter is represented by an array within the outer array and contains numerous strings, each representing a verse. The position of an item in the array encodes the structural information about that item.
An Example of a JaggedArray of depth=2
The example below mocks a JaggedArray for a text with four chapters, each containing a variable number of verses. Each sub-array represents a chapter. Each string inside a sub-array is a verse of that chapter.
[
    ["Text of 1:1", "Text of 1:2"],  # Chapter 1
    ["Text of 2:1", "Text of 2:2", "Text of 2:3", "Text of 2:4", "Text of 2:5"],  # Chapter 2
    ["Text of 3:1", "Text of 3:2", "Text of 3:3"],  # Chapter 3
    ["Text of 4:1", "Text of 4:2", "Text of 4:3", "Text of 4:4"]  # Chapter 4
]
We call these objects JaggedArrays because the sub-arrays need not be the same length and any spot in the array could be empty, as many of our versions of texts are incomplete.
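Because position encodes structure, looking up a verse amounts to indexing twice. The following is a minimal Python sketch of that idea, not Sefaria's own code: the helper name `get_segment` is hypothetical, chapter and verse numbers are assumed to be 1-based, and an empty string is assumed to mark a spot with no text.

```python
from typing import List, Optional

# Mock depth-2 JaggedArray: outer list = chapters, inner lists = verses.
text: List[List[str]] = [
    ["Text of 1:1", "Text of 1:2"],
    ["Text of 2:1", "", "Text of 2:3"],  # 2:2 is missing in this mock version
    [],                                   # chapter 3 is entirely empty
    ["Text of 4:1"],
]

def get_segment(ja: List[List[str]], chapter: int, verse: int) -> Optional[str]:
    """Return the string at 1-based (chapter, verse), or None if absent."""
    if chapter < 1 or verse < 1:
        return None
    if chapter > len(ja) or verse > len(ja[chapter - 1]):
        return None
    segment = ja[chapter - 1][verse - 1]
    # Assumption: an empty string means "no text at this position".
    return segment or None

print(get_segment(text, 2, 3))  # Text of 2:3
print(get_segment(text, 2, 2))  # None
print(get_segment(text, 3, 1))  # None
```

Returning `None` for both out-of-range and empty positions keeps callers from having to distinguish a short chapter from an incomplete one.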
JaggedArrayNode
A JaggedArrayNode in the schema of an Index is a node which indicates that a JaggedArray will be present to represent this part of the text in the Version record. The simplest example is a schema that consists of a single JaggedArrayNode. In the Version, the entire book exists within a single array.
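One way to picture the relationship is that the node declares a shape and the Version supplies content that fits it. The Python sketch below is an illustration only: the dictionary keys (`depth`, `section_names`) and the function `matches_depth` are hypothetical names chosen for this example and are not taken from Sefaria's actual schema format.

```python
from typing import Any, Dict

# Hypothetical, simplified stand-in for a schema node (not Sefaria's real format).
node: Dict[str, Any] = {"depth": 2, "section_names": ["Chapter", "Verse"]}

def matches_depth(content: Any, depth: int) -> bool:
    """True if `content` is nested lists exactly `depth` levels deep, with strings at the bottom."""
    if depth == 0:
        return isinstance(content, str)
    if not isinstance(content, list):
        return False
    # An empty list is accepted at any level, since parts of a text may be missing.
    return all(matches_depth(item, depth - 1) for item in content)

version_text = [["Text of 1:1", "Text of 1:2"], ["Text of 2:1"]]
print(matches_depth(version_text, node["depth"]))      # True
print(matches_depth(["flat", "list"], node["depth"]))  # False
```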
Segments and Sections
In Sefaria, we refer to the strings at the bottom of the JaggedArray as segments. The arrays that contain those strings are called sections. In general, segments are the finest resolution we can reach in a given text: one cannot directly reference a part of a segment.
For example, in a book of Tanakh which is organized into chapters and verses, a chapter would be a section and a verse would be considered a segment.
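To make the section and segment terminology concrete, the sketch below walks a depth-2 JaggedArray and yields each segment together with a "chapter:verse" label. It is a minimal Python illustration under stated assumptions: `iter_segments` is a hypothetical helper, numbering is assumed to be 1-based, and empty strings are skipped as missing text.

```python
from typing import Iterator, List, Tuple

def iter_segments(ja: List[List[str]]) -> Iterator[Tuple[str, str]]:
    """Yield ("chapter:verse", text) for every non-empty segment, numbered from 1."""
    for chapter_num, section in enumerate(ja, start=1):        # each section is a chapter
        for verse_num, segment in enumerate(section, start=1):  # each segment is a verse
            if segment:  # skip empty positions
                yield f"{chapter_num}:{verse_num}", segment

book = [["Text of 1:1", "Text of 1:2"], ["Text of 2:1", "", "Text of 2:3"]]
for label, segment in iter_segments(book):
    print(label, segment)
```

The label stops at the verse level because a segment is the smallest addressable unit; nothing inside the string gets its own address.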