Informatics and Machine Learning. Stephen Winters-Hilt


Chapter 5 shows some (very) basic extensions of FSA analysis in applications to text. It begins with a simple frequency analysis of words, which for some classics (in their original languages) reveals important word‐frequency results with meanings implied by the author (polysemous word usage by Machiavelli, for example). The frequency of word groupings in a given text can be studied as well, with some useful results for texts of sufficient size that follow clear stylistic conventions. Authors who structure their lines of text according to iambic pentameter (Shakespeare, for example) can also be identified from the profile (histogram) of syllables used on each line (i.e. a count of 10 will dominate for iambic pentameter).
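The kinds of counts described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the book's code: the function names are hypothetical, and the syllable counter is a crude vowel‐run heuristic (it miscounts silent vowels, for example), so a real analysis would likely need a pronunciation dictionary.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text):
    """Count how often each word occurs."""
    return Counter(tokenize(text))

def word_group_frequencies(text, n=2):
    """Count how often each run of n consecutive words occurs."""
    words = tokenize(text)
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def estimate_syllables(word):
    """Crude heuristic (an assumption): one syllable per run of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def syllables_per_line_histogram(text):
    """Map each estimated syllables-per-line count to the number of lines with that count."""
    hist = Counter()
    for line in text.splitlines():
        words = tokenize(line)
        if words:  # skip blank lines
            hist[sum(estimate_syllables(w) for w in words)] += 1
    return hist
```

Each function makes a single pass over the text. For verse in iambic pentameter, the histogram would be expected to peak near 10, subject to the errors of the syllable heuristic.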

      Text analytics can also extend what is still O(L) processing to mapping the mood or sentiment of text samples by use of word‐scored sentiment tables. The generation and use of such sentiment tables is its own craft, usually proprietary, so only minimal examples are given. Thus Chapter 5 shows an elaboration of FSA‐based analysis that might be done when there is no clear definition of state, such as in language. NLP in general encompasses a much more complete grammatical knowledge of the language, but in the end both NLP and the FSA‐based “add‐on” still suffer from not being able to manage word context easily (the states cannot simply be words, since words can have different meanings according to context). The inability to use HMMs has been an obstacle to a “universal translator”, one that has since been overcome with Deep Learning using NNs (Chapter 13), where immense amounts of translation data, such as the massive corpus of dual‐language Canadian Government proceedings, are sufficient to train a translator (English–French). Most of the remaining chapters focus on situations where a clear delineation of signal state can be given, and thus benefit from the use of HMMs.
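As a minimal sketch of table‐based sentiment scoring, the following Python makes one pass over the words of a sample, so the cost stays O(L). The table entries and the function name are invented for illustration. Real sentiment tables are, as noted, usually proprietary and far larger.

```python
import re

# Hypothetical toy table: word -> sentiment score (values invented for illustration).
SENTIMENT_TABLE = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}

def sentiment_score(text, table=SENTIMENT_TABLE):
    """Return (total, mean) sentiment over the words of text that appear in the table."""
    words = re.findall(r"[a-z']+", text.lower())
    scored = [table[w] for w in words if w in table]  # one dictionary lookup per word
    total = sum(scored)
    mean = total / len(scored) if scored else 0.0
    return total, mean

total, mean = sentiment_score("A great day, not bad")
```

The sample phrase "not bad" is scored as negative because each word is looked up in isolation. This is the word‐context limitation described above.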

Schematic illustration of the Viterbi path. (Left) The Viterbi path is recursively defined, and thus tabulatable, with each column depending only on the prior column. (Right) A related recursive algorithm, used to perform sequence alignment extensions with gaps (the Smith–Waterman algorithm), is given by the recursively defined neighbor‐cell relation shown.
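Both recursions in the caption can be sketched briefly in Python. This is a sketch under stated assumptions, not the book's implementation: the toy HMM (state names and probabilities) is hypothetical, all probabilities are assumed strictly positive so that logarithms are defined, and the Smith–Waterman scoring values (match, mismatch, linear gap penalty) are arbitrary choices.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_state_path) for a discrete HMM."""
    # prev[s]: best log-probability of any state path ending in s at the prior column.
    prev = {s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]]) for s in states}
    backpointers = []
    for obs in observations[1:]:
        curr, back = {}, {}
        for s in states:
            # Each new column depends only on the prior column.
            best_prev = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            curr[s] = prev[best_prev] + math.log(trans_p[best_prev][s]) + math.log(emit_p[s][obs])
            back[s] = best_prev
        backpointers.append(back)
        prev = curr
    last = max(states, key=lambda s: prev[s])
    path = [last]
    for back in reversed(backpointers):  # trace the best path backwards
        path.append(back[path[-1]])
    path.reverse()
    return prev[last], path

def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score; each cell depends on its three neighbor cells."""
    prev_row = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        row = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev_row[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            row[j] = max(0, diag, prev_row[j] + gap, row[j - 1] + gap)
            best = max(best, row[j])
        prev_row = row
    return best

# Hypothetical two-state toy model, used only to exercise the recursion.
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}
log_p, path = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
```

Working in log space avoids numerical underflow on long observation sequences. Keeping only the prior column (or prior row) of scores reflects the one‐column dependence noted in the caption, although the backpointers are still stored so that the full path can be recovered.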

      HMM tools have recently been developed with a number of computationally efficient improvements (described in detail in Chapter 7). Application of the HMM methods will be described for gene‐finding, alt‐splice gene‐finding, and nanopore‐detector signal analysis.
