Thursday, 8 March 2007

Semantic Encoding of the Hebrew Text

Hebrew Syntax Encoding Initiative Working Papers, no. 96.4
revised draft no. 2

* * *


1. Introduction

Part I: Assignment of Semantic Functions
1. Source of Function Labels and Initial Characterizations
2. Lexical Rules (Verbal Case Frames)
3. Formal Rules

Part II: Critical Consideration of the Semantic Encoding Scheme
4. Adequacy of Scheme
5. Redundancy and Decomposition

* * *

1. Introduction

1.1 The working paper HSEIWP96.2, "'Theory-Neutral' Syntactic
Tagging of the Text," presents a brief outline of the encoding
scheme and indicates the theorizing behind it. This paper,
HSEIWP96.4, is the sister of HSEIWP96.3, "On the Syntactic
Encoding of the Text," which examines choices in implementation
of phrase-structure parsing and assignment of syntactic case. It
was thought useful to separate out the semantic case or
"functions" for independent consideration here in HSEIWP96.4.

1.2 These three working papers, together with the independent
work by Kirk Lowery, HSEIWP96.1, which introduces the Encoding
project and its goals, constitute a basic documentation package
to accompany the HSEI texts, initially of Jonah with the four
books of Samuel-Kings to follow.

Part I: Assignment of Semantic Functions

1. Source of Function Labels and Initial Characterizations

1.1 To date, the source of the semantic tags has been Kirk
Lowery's paper, "The Role of Semantics in the Adequacy of
Syntactic Models of Biblical Hebrew," pp. 101-128 in Bible and
Computer: Desk and Discipline: The Impact of Computers in Bible
Studies: Proceedings of the Fourth International Colloquium of
the Association Internationale Bible et Informatique (AIBI),
Amsterdam, 15-18 August 1994 (Travaux de Linguistique
Quantitative, no. 57; Paris: Honore Champion, 1995):

percept (with experiencer)


1.2 This scheme appears to strike a balance between fine- and
coarse-grained approaches to Biblical Hebrew semantic functions.
The goal is to apply the labels in a consistent fashion so that
they can be easily manipulated by the discriminating functionalist

1.3 In initial attempts to tag Jonah and fragments of other
texts, the semantic roles were the most difficult to assign
confidently and systematically. I have adopted a series of formal
rules, outlined below, to aid the encoder in this more difficult
area of tagging. It is hoped that such rules might be developed
into lexical "features" of verbs, and that such a lexicon could
be invoked in the envisioned unification-based parsing of the
text.

2. Lexical Rules (Verbal Case Frames)

2.1 Verbs of Motion. The theme tag is reserved here strictly for
the constituent in motion; this may be the argument of the verb,
e.g., hlk, or independently motivated by prepositions such as el
"to." The theme may be the subject, or the object, in which case
the subject is the agent. There are therefore two schemas. Here
as elsewhere the question of the redundancy of the syntactic case
assignment arises.



2.2 Verbs of Cognition. The paired functions "experiencer"
and "percept" are reserved for verbs of cognition. It is
expected that "experiencer" will typically be associated with
the subject position.

2.3 Verbs of Speech. These verbs are not assigned
semantic-functional labels, only tags for the speaker, for the
content, and for the addressee.

2.4 Verbs of Exchange. Under the interpretation of "theme"
adopted, i.e., the constituent in motion, that which is exchanged
is analyzed as a theme. The subject will necessarily be an agent.
The goal is marked "recipient" and associated with dative
assignment. In summary, verbs of exchange pattern similarly to
verbs of motion.
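
The case frames of section 2 can be sketched as a small lookup
table. This is only an illustration under stated assumptions: the
names CASE_FRAMES and assign_roles, the verb-class keys, and the
slot and role strings are hypothetical and are not the HSEI tag
inventory. Pairing "percept" with the object slot is likewise an
assumption, since the text states only the subject pairing.

```python
# Hypothetical case-frame table: verb class -> {syntactic slot: semantic function}.
# Class, slot, and role names are illustrative, not actual HSEI tags.
CASE_FRAMES = {
    # Two schemas for verbs of motion (section 2.1).
    "motion_intransitive": {"subject": "theme"},
    "motion_transitive": {"subject": "agent", "object": "theme"},
    # Section 2.2; the object slot for "percept" is an assumption.
    "cognition": {"subject": "experiencer", "object": "percept"},
    # Section 2.4: patterns like transitive motion, plus a recipient.
    "exchange": {"subject": "agent", "object": "theme", "dative": "recipient"},
}


def assign_roles(verb_class, slots):
    """Return {slot: role} for the given slots; None where the frame is silent."""
    frame = CASE_FRAMES.get(verb_class, {})
    return {slot: frame.get(slot) for slot in slots}
```

A call such as assign_roles("exchange", ["subject", "object",
"dative"]) returns the agent, theme, and recipient pairing, which
makes the parallel with transitive verbs of motion easy to see.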

3. Formal Rules

3.1 Purpose is automatically assigned to the phrase headed by the
preposition l- "to" which in turn governs the .

3.2 Why/Reason is assigned to the ki- as default. The preposition
9al "on" in 9al ken "therefore" automatically triggers .
The complex construction with be-$el-l-mi is assigned as


s are assigned as a default.

3.4 A headed by ka'a$er triggers "manner."

3.5 The preposition 9al triggers by stipulation in its sense of

3.6 The preposition 9im "with" is assigned "comitative."
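
The formal rules of this section amount to defaults keyed on a
head, which might be sketched as follows. The names FORMAL_RULES
and default_function are hypothetical. Only pairings the text
states are listed (the pairing of 9im with "comitative" follows
section 5.1); rules whose outcome cannot be read off the text are
left out, and the further condition on what l- governs is ignored
for simplicity.

```python
# Hypothetical rule table: transliterated head -> default semantic function.
# Only pairings stated in the text appear; this is not the full rule set.
FORMAL_RULES = {
    "l-": "purpose",       # 3.1 (condition on the governed element ignored)
    "ki": "reason",        # 3.2, assigned as default
    "ka'a$er": "manner",   # 3.4
    "9im": "comitative",   # 3.6, pairing as given in section 5.1
}


def default_function(head):
    """Return the default function triggered by a head, or None if no rule applies."""
    return FORMAL_RULES.get(head)
```

Returning None for an unlisted head leaves the decision to the
encoder, which matches the role of these rules as aids and not as
a complete procedure.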

Part II: Critical Consideration of the Semantic Encoding Scheme

4. Adequacy of Scheme

4.1 The scheme as applied to the text of Jonah does not present
any insurmountable difficulties. Nevertheless, scholars using the
database would be well advised to consider the slipperiness of
the semantics in working with the results of searches.

4.2 It seems unlikely that the full range of semantic tags would
be required for automated parsing. However, there is the question
of the relative order of verbal modifiers; and the more fine-
grained scheme here should facilitate testing hypotheses in this

5. Redundancy and Decomposition

5.1 Redundancy and the Lexicon. There appears to be a great deal
of redundancy at the lexical level. Individual prepositions
appear to be redundantly associated with semantic tags, e.g., the
preposition 9im "with" and the function "comitative." Such
observations extend to the arguments of verbs as noted in
section 2 above. Given that the verb is, e.g., yd9 "to know," the
subject is redundantly tagged "experiencer."

5.2 Such observations, however, hold out hope for automated
tagging. Lexical entries could in some fashion bear semantic
"features" that could be used in assigning semantic "functions."

5.3 Consideration of a lexicon and its structure suggests that
many of the semantic-function labels are not "primitives" but are
subject to decomposition. The label "purpose," e.g., appears
to be a function of a particular preposition, l- "to," as noted
in section 3 above.

5.4 Such considerations suggest that groups might be related
systematically at a more primitive level. No doubt "time,"
"duration," and "frequency" could be so related;
indeed, may be the primitive, the others derived by

5.5 The basic suggestion is that individual heads introduce
semantic features into representations by virtue of their lexical
representations. These features can percolate and interact with
features and syntactic configurations to produce the ultimate
assignment of "function" to a given .

5.6 An example might be the assignment of "time" to bre$it
in Genesis 1:1. Under the null hypothesis, b- "in" might be
assigned by default, but otherwise remain unmarked. On the
other hand, re$it will bring with it the feature by virtue
of its semantic lexical entry. The proposal can be graphically
represented in (a) and (b).

(a) variables in < >          (b) variables bound by

       PP < >                        PP
      /  \                          /  \
    P      NP < >    -->          P      NP
    |      |                      |      |
    b- < > N                      b-     N
           |                             |
         re$it                         re$it
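
The percolation idea in 5.5 and 5.6 can be illustrated with a toy
tree. This is a deliberately crude sketch and not the envisioned
unification-based parser: every feature simply climbs to every
ancestor. The Node class, the percolate function, and the feature
name "time" carried by re$it are all assumptions made for the
illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    features: set = field(default_factory=set)
    children: list = field(default_factory=list)


def percolate(node):
    """Bottom-up pass: each node absorbs the features of its children."""
    for child in node.children:
        node.features |= percolate(child)
    return node.features


# Hypothetical lexical entries: b- is unmarked, re$it carries "time".
noun = Node("N", {"time"})                     # re$it
noun_phrase = Node("NP", children=[noun])
prep = Node("P")                               # b-
pp = Node("PP", children=[prep, noun_phrase])

percolate(pp)
```

After the pass, the PP and NP nodes carry the feature contributed
by the noun while the preposition stays unmarked, which mirrors
the move from tree (a) to tree (b).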