The Handbook of Speech Perception. Group of authors



      So far, our example may seem tedious and somewhat arbitrary: we had to come up with attributes such as “manufactured” or “edible,” then consider their merit as semantic feature dimensions without any obvious objective criteria. However, there are many ways to search for word embeddings automatically, without needing to dream up a large set of semantic fields. An incrementally more complex way is to rely on context words: for each of our target words, we count the other words it occurs with in a corpus of sentences. Consider a corpus that contains exactly four sentences.

      1 The boy rode on the airplane.

      2 The boy also rode on the boat.

      3 The celery tasted good.

      4 The strawberry tasted better.
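The counting procedure for this four-sentence corpus can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that a target word's context is every other word in the same sentence, that words are lowercased and stripped of the final period, and that the context vocabulary is sorted alphabetically. The names `tokenize`, `context_counts`, `vocab`, and `embeddings` are hypothetical.

```python
from collections import Counter

corpus = [
    "The boy rode on the airplane.",
    "The boy also rode on the boat.",
    "The celery tasted good.",
    "The strawberry tasted better.",
]
targets = ["airplane", "boat", "celery", "strawberry"]


def tokenize(sentence):
    """Lowercase the sentence, drop periods, and split on whitespace."""
    return sentence.lower().replace(".", "").split()


def context_counts(corpus, targets):
    """Count, for each target word, the other words in its sentences."""
    counts = {t: Counter() for t in targets}
    for sentence in corpus:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token in counts:
                # Assumption: every other token in the sentence is context.
                counts[token].update(tokens[:i] + tokens[i + 1:])
    return counts


counts = context_counts(corpus, targets)

# The context vocabulary is every word seen as context, in alphabetical order.
vocab = sorted(set().union(*counts.values()))

# Each embedding is the row of counts over that vocabulary.
embeddings = {t: [counts[t][w] for w in vocab] for t in targets}

print(vocab)
for t in targets:
    print(t, embeddings[t])
```

Under these assumptions, 'airplane' receives a count of 2 for 'the' because 'the' occurs twice in its sentence, and a count of 0 for 'tasted' because the two words never share a sentence.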

      Unlike the previous semantic‐field embeddings, which were constructed using our “expert opinions,” the context‐word embeddings tabulated below were learned from data (a corpus of four sentences). Learning a set of word embeddings from data can be very powerful. Indeed, we can automate the procedure, and even a modest computer can process very large corpora of text to produce embeddings for hundreds of thousands of words in seconds. Another strength of creating word embeddings like these is that the procedure is not limited to concrete nouns, since context words can be found for any target word – whether an abstract noun, a verb, or even a function word. You may be wondering how context words are able to represent meaning, but notice that words with similar meanings are bound to co‐occur with similar context words. For example, an ‘airplane’ and a ‘boat’ are both vehicles that you ride in, so they will both occur quite frequently in sentences with the word ‘rode’; however, one will rarely find sentences that contain both ‘celery’ and ‘rode.’ Compared to ‘airplane’ and ‘boat,’ ‘celery’ is more likely to occur in sentences containing the word ‘tasted.’ As the English phonetician Firth (1957, p. 11) wrote: “You shall know a word by the company it keeps.”

Word        also  better  boy  good  on  rode  tasted  the
airplane    0     0       1    0     1   1     0       2
boat        1     0       1    0     1   1     0       2
celery      0     0       0    1     0   0     1       1
strawberry  0     1       0    0     0   0     1       1
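To make the idea of "similar context words" concrete, the rows of the table can be compared numerically. The sketch below uses cosine similarity, which is a common choice for comparing word vectors but is an assumption here, since the passage does not name a particular measure. The helper name `cosine` is hypothetical, and the vectors are copied directly from the table.

```python
import math

# Rows of the context-word table; columns are
# also, better, boy, good, on, rode, tasted, the.
embeddings = {
    "airplane":   [0, 0, 1, 0, 1, 1, 0, 2],
    "boat":       [1, 0, 1, 0, 1, 1, 0, 2],
    "celery":     [0, 0, 0, 1, 0, 0, 1, 1],
    "strawberry": [0, 1, 0, 0, 0, 0, 1, 1],
}


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


pairs = [
    ("airplane", "boat"),
    ("celery", "strawberry"),
    ("airplane", "celery"),
]
for w1, w2 in pairs:
    print(f"{w1:>10} vs {w2:<10} {cosine(embeddings[w1], embeddings[w2]):.2f}")
```

With these counts, 'airplane' and 'boat' score about 0.94, 'celery' and 'strawberry' about 0.67, and 'airplane' and 'celery' about 0.44. The last value is not zero only because both words co-occur with 'the', which shows how a frequent function word can inflate similarity in raw counts.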
