Automatic Semantic Classification of Verbs from their Syntactic Contexts: An Implemented Classifier for Stativity

Michael R. Brent
MIT AI Lab
545 Technology Square
Cambridge, Massachusetts 02139
michael@ai.mit.edu

Abstract

This paper discusses an implemented program that automatically classifies verbs into those that describe only states of the world, such as to know, and those that describe events, such as to look. It works by exploiting the constraint between the syntactic environments in which a verb can occur and its meaning. The only input is on-line text. This demonstrates an important new technique for the automatic generation of lexical databases.

1 Introduction

Young children and natural language processing programs face a common problem: everyone else knows a lot more about words. Children, it is hypothesized, catch up by observing the linguistic and non-linguistic contexts in which words are used. This paper focuses on the value and accessibility of the linguistic context. It argues that linguistic context by itself can provide useful cues about verb meaning to an artificial learner. This is demonstrated by a program that exploits two particular cues from the linguistic context to classify verbs automatically into those whose sole sense is one describing a state and those that have a sense describing an event.1 The approach described here accounts for a certain degree of noise in the input, due both to misapprehension of input sentences and to their occasional malformation.
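As a rough illustration of what accounting for noise can mean in practice, the sketch below treats cue occurrences as possibly spurious and accepts a verb as having an event sense only when the cue appears more often than noise alone would plausibly explain. It is a minimal sketch under stated assumptions, not the procedure used by the program described here: the one-sided binomial test, the function names, the noise rate of 0.02, and the threshold of 0.05 are all hypothetical choices made for the example.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def has_event_sense(total: int, cue_hits: int,
                    noise_rate: float = 0.02, alpha: float = 0.05) -> bool:
    """Decide whether a verb shows evidence of an event sense.

    total      -- number of occurrences of the verb in the corpus
    cue_hits   -- occurrences in a syntactic context assumed to be
                  unavailable to purely stative verbs (hypothetical cue)
    noise_rate -- assumed probability that a stative verb appears to occur
                  in that context through a parsing error or a malformed
                  sentence (hypothetical value)
    alpha      -- significance threshold (hypothetical value)
    """
    if total == 0:
        return False  # no evidence either way; default to "state only"
    # Probability of seeing at least cue_hits spurious cues if the verb
    # were purely stative. A small value means noise is an unlikely cause.
    return binomial_tail(total, cue_hits, noise_rate) < alpha


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(has_event_sense(total=200, cue_hits=15))
    print(has_event_sense(total=200, cue_hits=3))
```

The design choice worth noting is that a handful of cue occurrences on a frequent verb does not trigger the event classification, which keeps isolated parsing mistakes or ill-formed sentences from flipping a verb's class.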
This work shows that the two cues are available and are reliable given the statistical methods applied.

1 The input sentences are those compiled in the Lancaster-Oslo/Bergen (LOB) Corpus, a balanced corpus of one million words of British English. The LOB consists primarily of edited prose.

Language users, whether natural or artificial, need detailed semantic and syntactic classifications of words. Ultimately, any artificial language user must be able to add new words to its lexicon, if only to .