may occur as homonymy, a word having
two distinct meanings, as polysemy, a word having two
related meanings, as vagueness, or as structural ambiguity.
- ...[Jäppinen et al., 1983, Jäppinen and Ylilammi, 1986]
practically complete model of Finnish morphology,
the two-level model, was developed by [Koskenniemi, 1983]
at the University of Helsinki.
The two-level model is applicable
across a wide range of languages and language families.
Accordingly, prototypes of the two-level model have been implemented for
over 30 languages. The most comprehensive implementations exist for
Finnish, English, Swedish, Russian, Swahili, French, Arabic and
Basque [Lindén, 1993]. A language-independent formalism, Constraint
Grammar (CG), has also been developed for syntactic
analysis [Karlsson, 1990, Karlsson et al., 1995c, Karlsson et al., 1995a, Karlsson et al., 1995b].
The recognition rate for a large English corpus,
when parsing new unrestricted running text and after
a morphological analysis by the two-level model,
is approximately 98%, i.e., only 2 words out of 100 get the wrong
syntactic code [Järvinen, 1994].
- The term 'processing' is here used
to refer to applications, whereas
the term 'interpretation' emphasizes the
cognitive point of view.
- Collected by
Ben Chi.
- A definition of semantics
and pragmatics that would be widely accepted
is somewhat difficult to give. Levinson (1983)
gives multiple possible definitions. One of
the definitions is as follows: ``Pragmatics
is the study of the ability of language users
to pair sentences with the contexts in which
they would appear.'' He further states
that this definition fits well with the
definition of semantics according to
which semantic theories are concerned
with the recursive assignment of
truth conditions to well-formed
expressions of language. One general
aspect of defining semantics is
that it specifies the relation between
linguistic expressions and the referents
of the expressions.
- 'Connectionist' means here an approach in which artificial
neural networks are used so that they are adaptive, and
the intermediate representations are numerical and
their interpretation can only be based on the adaptation
- In this work,
words are handled as the original word forms
appearing in the text.
- Linell (1982) has written
about the written language bias in the following way:
``Our conception of linguistic behavior is biased by a tendency
to treat processes, activities, and conditions of them
in terms of object-like, static, autonomous and
permanent structures, i.e., as if they shared such properties
with written characters, words, texts, pictures and images. [...]
In general, most of Western philosophy and science has been
stuck with the metaphysical assumption that the world is made up of
'things' or 'objects'.''
- In philosophy, this view of
the relationship between language and world was,
among others, strongly
advocated in the early works of Ludwig Wittgenstein.
Perhaps his arrogance
in the Tractatus Logico-Philosophicus (saying that
all the main problems are solved) was just premature.
- A remark on the notion of 'symbol'
may be necessary: the basic idea is to consider the possibility
of grounding the symbols based on the unsupervised learning scheme.
The symbols are used on the level of communication and
may be used as the labels for the (usually) continuous multi-dimensional
conceptual spaces. The role of these symbols in further
processing is left open in this work. However, the requirement
of grounding the symbols is considered to be crucial.
- Symbol grounding, embodiment and
their connectionist modeling are central
topics, e.g., in [Varela et al., 1993, Regier, 1995].
- A connection
to the relation between thought and language
may be considered: if the weight of the linguistic input
is high enough, the overall organization is
partly determined by it rather than being
only based on the ``perceptual'' component.
- Not the present author.
- The bibliography
of the Self-Organizing Map
(SOM) and Learning Vector Quantization (LVQ)
compiled in the Neural Networks Research Centre at Helsinki
University of Technology is available at the WWW address
- The two standard
measures of retrieval effectiveness
are precision, the ratio of relevant retrieved
documents to the total number of retrieved
documents, and recall, the ratio of relevant
retrieved documents to the total number of (known)
relevant documents.
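
As an illustration of the two measures, the following minimal Python sketch computes precision and recall from two collections of document identifiers. The function name `precision_recall` and the document identifiers are hypothetical and chosen only for this example; returning 0.0 for an empty set is a convention assumed here, not something specified in this work.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers of the documents returned by the system
    relevant:  identifiers of the documents known to be relevant
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    # Assumed convention: an empty denominator gives 0.0 instead of an error.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four documents retrieved, three known to be relevant,
# two of which appear among the retrieved ones.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(f"precision = {p:.2f}, recall = {r:.2f}")
```

In this hypothetical query, two of the four retrieved documents are relevant, and two of the three known relevant documents are retrieved.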
- An n-gram is a
sequence of n characters.
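
A minimal Python sketch of this definition, extracting all overlapping character n-grams from a string, is given below. The function name `char_ngrams` and the example word are hypothetical; the use of overlapping windows is an assumption made for the illustration.

```python
def char_ngrams(text, n):
    """Return all overlapping character n-grams of text, in order."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # A string shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Character trigrams (n = 3) of a six-letter word.
print(char_ngrams("websom", 3))  # ['web', 'ebs', 'bso', 'som']
```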
- A demonstration of the ET-Map is
available at the WWW address
- The effect
of this smoothing has, however, been found to be rather
small, at least in some applications [Kaski, 1997a].
- HyperText Markup Language
- The demonstration
can be seen at the WWW address http://websom.hut.fi/websom/.