(originally published on 2019-04-11; markdown version available here)

Semantics: Primes and Universals, a book review

1. State of the industry

Amazement:
    X feels something
    sometimes a person thinks something like this:
        something is happening now
        I didn't know before now: this can happen
        I want to know more about it
    because of this, this person feels something
    X feels something like this

Some people, when confronted with a problem, think "I know, I'll use neural networks."

As Semantics: Primes and Universals ("SPU") itself recounts in its first chapter, a short history of the study of semantics in linguistics, the academic approach to understanding semantics seems to have mostly resembled what we've been doing in computer science to handle language: throw fuzziness at it until it somehow works itself out.

In linguistics, that fuzziness has apparently ranged from "semantics is too hard, just don't study it" to trying to apply prototype theory to everything. That theory, in short, claims that a concept is a set of ideal attributes (for "bird", whatever we think of when we think of a bird) and that specific instances match a given concept to a certain degree. Some creatures are extremely bird, some creatures are somewhat bird, some creatures are not very bird at all, etc.
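
To make the idea of graded membership concrete, here is a minimal Python sketch. The attribute sets and the scoring rule (the fraction of the prototype's attributes that an instance shares) are assumptions made up for illustration; they come neither from SPU nor from any particular formulation of prototype theory.

```python
# Hypothetical prototype for "bird": a set of ideal attributes (illustrative only).
BIRD_PROTOTYPE = {"has feathers", "flies", "lays eggs", "has a beak", "sings"}

def membership(instance_attributes, prototype):
    """Return the fraction of prototype attributes the instance has (0.0 to 1.0)."""
    return len(prototype & instance_attributes) / len(prototype)

# Made-up instances, each described by a made-up attribute set.
robin = {"has feathers", "flies", "lays eggs", "has a beak", "sings"}
penguin = {"has feathers", "lays eggs", "has a beak", "swims"}
bat = {"flies", "has fur"}

for name, attrs in [("robin", robin), ("penguin", penguin), ("bat", bat)]:
    print(f"{name}: {membership(attrs, BIRD_PROTOTYPE):.1f}")
# robin: 1.0
# penguin: 0.6
# bat: 0.2
```

Under this toy scoring, the robin is "extremely bird", the penguin "somewhat bird", and the bat "not very bird at all": membership is a matter of degree rather than a yes-or-no question.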

In computer science, that fuzziness has expressed itself mostly along the lines of the "it's too hard, don't try" approach. Indeed, we tend to just throw artificial neural networks at linguistic problems (and many other problems) and wait for the whole fuzzy system to approach a working solution.

Neural networks aren't magic; they find algorithms, or processes of some form, which solve the problem. Oftentimes, such processes are considered too complex for people to try to understand; but does that approach have to apply to the understanding of language?

Not necessarily; or so Anna Wierzbicka has been claiming since the 1970s. In developing her Natural Semantic Metalanguage ("NSM") from her first book in 1972, through the 1996 book I'm reviewing here, and up to recent years (the latest version of the NSM is from 2017), she has been making two fairly strong postulates:

  1. All human languages share a common semantic core of (for now) fewer than a hundred primitive human concepts, and for each of them a number of primitive syntactic frames,
  2. Combining these primitives using their respective syntactic frames, we can define all words of all human languages; and in fact, supposedly any expressible human concept.

Both because of my interest in AI and human cognition, and because of my interest in constructed languages (especially oligosynthetic ones), the sheer discreteness of the whole project has been to me a refreshing ray of hope: in a world where the approach to everything seems to have become nihilistic fuzzyism, maybe language, at least, can be modeled and formalized by people in a rigorous manner.

2. What NSM is made of

Sky:
    something very big
    people can see it
    people can think like this about this something:
        it is a place
        it is above all other places
        it is far from people

I won't list all 65 semantic primitives here; a table of them can be found at this page (the latest version currently is the link to the PDF chart). In the book, however, Wierzbicka justifies each semantic primitive with three sources of evidence:

  1. Whether the concept can be expressed using a combination of the other primitives
  2. How universal the concept is amongst languages (SPU makes sure to explore a wide variety of them)
  3. How early children seem to acquire the concept

Some of her justifications aren't entirely convincing, such as the inclusion of the supposedly basic concepts A LONG TIME and A SHORT TIME; but they seem to have managed to stay in NSM up to this very day, and considering the ambition of the whole project, the seeming soundness of most of the NSM is impressive enough that I think the whole project should be taken seriously.

These semantic primitives (14 in her first publications, 55 in the book, and 65 in the latest version of the NSM) are meant to be the basic building blocks of all human concepts; supposedly, even, of human thought. As a minimalist core of its field, NSM would fulfill the same purpose as λ-calculus in functional programming or Turing machines in imperative programming.

NSM also includes, for each of the concepts, a fairly limited set of syntactic frames, together forming the "NSM Grammar". Although that list was still in its early stage in SPU, it deserves attention, since not all grammatical structures are universal, and only the universal ones are deemed essential to NSM. That said, the chart linked above contains a series of example frames under each word, such as:

THINK
    someone thinks about someone else/something
    someone thinks something good/bad about someone else/something
    someone thinks like this: “…”
    many people think like this: “…”

3. KIND & LIKE, GOOD & BAD, and Color Terms

X is green:
    in some places many things grow out of the ground
    when one sees things like X one can think of this

The presence of some words, and the absence of others, implies some significant and very nontrivial claims about the nature of basic human thought; claims which SPU makes explicitly, and justifies at length.

KIND (as in X is a kind of Y) points to the idea that the human mind has some fundamental notion of taxonomy; that is, a categorization of objects into mutually disjoint categories. Furthermore, LIKE (as in X is like Y) indicates a mere notion of similarity between things, and implies that this notion is fundamentally different from taxonomy.

The presence of both GOOD and BAD points not only to the universality and semantic irreducibility of those concepts, but also to the notion that one can't properly describe one using the other. Indeed, or so Wierzbicka claims, "bad" and "not good" are two different ideas, and that difference (as well as the presence of both of those concepts) is universal to all human languages.

As I was reading the first chapters, I was skeptical about the absence of emotion words and of color words from NSM. How can one describe colors without any "primary color" to start with? But both of these topics are addressed later in the book; suggested definitions taken from the book, such as those of Amazement or Green, have been included at the start of each part of this review, in the hope of providing a clearer picture of what NSM is like.

Not only is the concept "Green", for example, apparently not as universal as one would expect (indeed, various languages describe and discriminate colors in widely different ways), but it is sky and plant that are semantically simpler concepts than blue and green, and not the other way around.

Indeed, how should someone who has never seen anything green have an inherent notion of green? This way of defining so-called primary colors from physical objects makes me wonder if even sensory records, and thus even memory, can be assembled purely from discrete grammatical constructs.

4. Application to lexicography

X tempted Y to do Z:
    X wanted Y to do Z
    Y thought something like this:
        if I do Z it will be bad
        because of this, I don't want to do it
    X knew this
    because of this, X said something like this to Y:
        if you do it, something very good will happen to you
        you will feel something very good because of this
    X thought something like this:
        maybe Y will do it because of this
    X wanted this

One of the big applications that Wierzbicka proposes for NSM is defining words. As she points out using many directed graphs, dictionaries love to define words in a circular manner:

[figure: circular definitions]

In fact, dictionaries seem plagued with circular definitions. Although they can be of some use to someone familiar with some words but not others, they seem for the most part useless at actually defining the nuances between words for similar concepts, multiple meanings of a word, or concepts altogether when one is not already familiar with most items in the graph.
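
If a dictionary is treated as a directed graph, with an edge from each word to the words used in its definition, circularity can be detected mechanically. The sketch below is a plain depth-first search; the tiny dictionary is invented for illustration and is not taken from SPU or from any real dictionary.

```python
def find_cycle(definitions):
    """Return one cycle of words (first word repeated at the end), or None.

    `definitions` maps each word to the list of words used to define it.
    Words that have no entry of their own are treated as dead ends.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited / on the current path / finished
    state = {word: WHITE for word in definitions}
    path = []

    def visit(word):
        state[word] = GREY
        path.append(word)
        for used in definitions.get(word, []):
            status = state.get(used, BLACK)
            if status == GREY:                      # back edge: a cycle
                return path[path.index(used):] + [used]
            if status == WHITE:
                cycle = visit(used)
                if cycle:
                    return cycle
        path.pop()
        state[word] = BLACK
        return None

    for word in definitions:
        if state[word] == WHITE:
            cycle = visit(word)
            if cycle:
                return cycle
    return None

# Invented toy dictionary: each word points at the words defining it.
toy = {"demand": ["request"], "request": ["ask"], "ask": ["demand", "say"], "say": []}
print(find_cycle(toy))   # ['demand', 'request', 'ask', 'demand']
```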

NSM aims to solve this problem by proposing that any one concept be defined in terms of its undefinable primitives — or at least in terms of other concepts themselves semantically simpler, forming a proper hierarchy of definitions with primitives at the top.
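
A hierarchy of this kind can be checked mechanically: every concept should bottom out in primitives after a finite number of definition layers. The following is a minimal sketch under stated assumptions: the set of primitives is only an illustrative subset, and the two definitions are heavily simplified stand-ins written for this example (the one for "blue" in particular is hypothetical), not explications quoted from SPU.

```python
# Illustrative subset of primitives; the full NSM inventory is larger.
PRIMITIVES = {"SOMETHING", "VERY", "BIG", "PEOPLE", "SEE", "THINK",
              "PLACE", "ABOVE", "FAR", "LIKE"}

# Hypothetical, simplified definitions: each concept lists the primitives
# or simpler concepts it is built from.
DEFINITIONS = {
    "sky": ["SOMETHING", "VERY", "BIG", "PEOPLE", "SEE", "THINK",
            "PLACE", "ABOVE", "FAR"],
    "blue": ["SEE", "THINK", "LIKE", "sky"],
}

def depth(concept, definitions, primitives, _path=()):
    """Number of definition layers between `concept` and the primitives."""
    if concept in primitives:
        return 0
    if concept in _path:
        raise ValueError("circular definition: " + " -> ".join(_path + (concept,)))
    if concept not in definitions:
        raise KeyError(f"{concept!r} is neither a primitive nor defined")
    return 1 + max(
        (depth(part, definitions, primitives, _path + (concept,))
         for part in definitions[concept]),
        default=0,
    )

print(depth("sky", DEFINITIONS, PRIMITIVES))    # 1
print(depth("blue", DEFINITIONS, PRIMITIVES))   # 2
```

A circular pair of entries, such as `{"a": ["b"], "b": ["a"]}`, makes `depth` raise a `ValueError` instead of looping forever, which is the property a grounded dictionary is supposed to guarantee.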

Relatedly, Wierzbicka emphasizes the distinction between meaning and knowledge, or between dictionaries and encyclopaedias; as she brilliantly puts it,

Paradoxically, of the two, it is the dictionary entry, not the encyclopaedia entry, which can be said to be "objective" and non-arbitrary, and to represent a "hard fact". Psychocultural fact, of course, not biological fact [in the case of "mouse"]. An encyclopaedia entry for mouse may be provisional, biased, and subjective in its choices and in its emphases, but it doesn't aim at establishing psychocultural facts; it does not aim at discovering conceptual structures. Encyclopaedic knowledge is cumulative and inexhaustible. By contrast, the meanings of words are discrete and finite. They embody a special kind of knowledge … and they constitute a vital point of reference for both communication and cognition.

Where the encyclopaedia aims to collect facts about some object or concept, the dictionary merely aims to define it — that is, to describe in what terms it is thought of. Two people can disagree on facts about a mouse, but they both know what that mouse thing they're talking about is.

In addition, NSM is general enough to support the prototypes mentioned above. Where an exact definition would say X is this, a prototypical definition can say X is something like this; making the prototype explicit when needed, or absent when a prototype isn't appropriate to a definition.

One can only imagine the usefulness of a dictionary based on the NSM: noticing patterns in the construction of concepts might help categorize them, and translation might be helped by having words clearly defined in terms of just a handful of primitives. Even subtle nuances between words in various languages might be formalized:

(A) X feels happy. =
    X feels something
    sometimes a person thinks something like this:
        something good happened to me
        I wanted this
        I don't want anything more now
    because of this, this person feels something good
    X feels like this

(B) X feels szczęśliwy (glücklich, heureux, etc.). =
    X feels something
    sometimes a person thinks something like this:
        something very good happened to me
        I wanted this
        everything is good now
        I can't want anything more now
    because of this, this person feels something very good
    X feels like this
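
Because both explications are short lists of lines over a shared vocabulary, the nuance between them can be extracted mechanically. The sketch below stores each explication as a list of lines (indentation dropped) and reports the lines unique to each; the function name `only_in` and the line-by-line comparison are choices made for this illustration, not a method proposed in SPU.

```python
HAPPY = [
    "X feels something",
    "sometimes a person thinks something like this:",
    "something good happened to me",
    "I wanted this",
    "I don't want anything more now",
    "because of this, this person feels something good",
    "X feels like this",
]

SZCZESLIWY = [
    "X feels something",
    "sometimes a person thinks something like this:",
    "something very good happened to me",
    "I wanted this",
    "everything is good now",
    "I can't want anything more now",
    "because of this, this person feels something very good",
    "X feels like this",
]

def only_in(first, second):
    """Lines of `first` that do not appear anywhere in `second`, in order."""
    return [line for line in first if line not in second]

for line in only_in(HAPPY, SZCZESLIWY):
    print("(A) only:", line)
for line in only_in(SZCZESLIWY, HAPPY):
    print("(B) only:", line)
```

The lines unique to (B) are the "very good" variants, the added "everything is good now", and "can't" in place of "don't": exactly the intensity and completeness that distinguish szczęśliwy from happy.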

5. Discreteness all the way down

Head:
    a part of a person's body
    this part is above all the other parts of the body
    when a person thinks, something happens in this part

If we're willing to go with this discreteness paradigm, where might one end up?

The physical universe is discrete. Even though it is, in general, most usefully modeled using real numbers, because of the Planck constant(s) a finite volume of space with a finite number of particles in it has only a finite number of meaningfully different states.

Neurons in the brain aren't like neurons in an artificial neural network. Whereas all neurons in an ANN are activated at the same time, each by a floating-point value, a biological neuron stays inactive until it has accumulated enough potential, and then activates all at once before returning to its inactive state. Since a given neuron has a given activation threshold, the potential coming out of a neuron is roughly the same every time. In fact, the amount by which you lift a finger or turn your arm apparently depends only on the frequency of the input signal.[citation needed]
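
The accumulate-then-fire behaviour described here can be sketched as a non-leaky "integrate-and-fire" unit. This is a deliberately crude toy, not a physiological model: time is cut into steps, potentials are integers in arbitrary units, and the threshold of 100 is an assumption picked to keep the arithmetic exact.

```python
def count_spikes(input_per_step, steps, threshold=100):
    """Simulate a non-leaky integrate-and-fire unit and count its spikes.

    The unit accumulates `input_per_step` at every time step. Once the
    accumulated potential reaches `threshold`, it fires one all-or-nothing
    spike and resets to zero.
    """
    potential = 0
    spikes = 0
    for _ in range(steps):
        potential += input_per_step
        if potential >= threshold:
            spikes += 1        # every spike is identical; only timing differs
            potential = 0
    return spikes

for strength in (10, 25, 50):
    print(strength, count_spikes(strength, steps=100))
# 10 10
# 25 25
# 50 50
```

A stronger input does not produce a bigger spike, only more spikes per unit of time, which is the sense in which the signal is carried by frequency.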

As for the "weights" of synapses, although they can change over time, on a small time scale a neuron can only receive input from a finite number of neurons, and thus has a finite number of input neuron combinations/sequences that can lead to its activation.
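
The counting argument can be made explicit. Assuming, for illustration only, that each of n input neurons either fires or stays silent within a short time window, there are 2^n possible input patterns, and the ones that activate the neuron can simply be enumerated. The integer weights and the threshold below are hypothetical.

```python
from itertools import product

def activating_combinations(weights, threshold):
    """List every on/off input pattern whose weighted sum reaches the threshold."""
    return [
        pattern
        for pattern in product((0, 1), repeat=len(weights))
        if sum(w * x for w, x in zip(weights, pattern)) >= threshold
    ]

weights = [2, 1, 1]   # hypothetical synaptic weights of three input neurons
patterns = activating_combinations(weights, threshold=3)

print(2 ** len(weights))   # 8 possible input patterns in total
print(patterns)            # [(1, 0, 1), (1, 1, 0), (1, 1, 1)]
```

The enumeration grows exponentially with the number of inputs, so it is finite but not necessarily small.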

If physics, cognition, and language are all discrete systems, how hard can they be to understand and model?