The Handbook of Multimodal-Multisensor Interfaces, Volume 1. Sharon Oviatt. ACM Books.
auditory, tactile), not just their visual properties [Gaver 1991, Norman 1988]. For example, the acoustic qualities of an animated computer persona’s voice can influence a user’s engagement and the content of their dialogue contributions. In one study, when an animated persona sounded like a master teacher by speaking with higher amplitude and wider pitch excursions, children asked more questions about science [Oviatt et al. 2004b]. This example not only illustrates that affordances can be auditory, but also that they affect the nature of communicative actions as well as physical ones [Greeno 1994, Oviatt et al. 2012]. Furthermore, this impact on communication patterns involves all modalities, not just spoken language [Oviatt et al. 2012].

      Recent interpretations of Affordance theory, especially as applied to computer interface design, specify that it is human perception of interface affordances that elicits specific types of activity, not merely the presence of specific physical attributes. Affordances can be described at different levels, including biological, physical, perceptual, and symbolic/cognitive [Zhang and Patel 2006]. They are distributed representations that are the by-product of external representations of an object (e.g., streetlight color) and internal mental representations that a person maintains about their action potential (e.g., cultural knowledge that “red” means stop), which together determine the person’s physical response. This example of an internal representation involves a cognitive affordance, which originates in cultural conventions mediated by symbolic language (i.e., “red”) that are specific to a person and her cultural/linguistic group.

      Affordance theory emphasizes that interfaces should be designed to facilitate easy discoverability of the actions they are intended to support. It is important to note that the behavioral attunements that arise from object affordances depend on perceived action possibilities that are distinct from specific learned patterns. As such, they are potentially capable of stimulating human activity in a way that facilitates learning in contexts never encountered before. For this reason, if interface affordances are well matched with a task domain, they can increase human activity patterns that stimulate exploratory learning, cognition, and overall performance.

      Motivated by both Affordance theory and Activity theory, research on human-computer interaction has shown that more expressively powerful interfaces can substantially stimulate human communicative activity and corresponding cognition. An expressively powerful computer interface is one that can convey information involving multiple modalities, representations, or linguistic codes [Oviatt 2013]. Recent research has shown that different input capabilities, such as a keyboard vs. digital pen, have affordances that prime qualitatively different types of communicative content. In one study, students expressed 44% more nonlinguistic representational content (e.g., numbers, symbols, diagrams) when using a pen interface. In contrast, when the same students worked on the same type of problems with keyboard input, they switched to expressing 36% more linguistic content (e.g., words, abbreviations) [Oviatt et al. 2012].

      These differences in communication pattern corresponded with striking changes in students’ cognition. In particular, when students used a pen interface and wrote more nonlinguistic content, they also generated 36% more appropriate biology hypotheses. A regression analysis revealed that knowledge of individual students’ level of nonlinguistic fluency accounted for a substantial 72% of all the variance in their ability to produce appropriate science ideas (see Figure 1.4, left; Oviatt et al. 2012). However, when the same students used the keyboard interface and communicated more linguistic content, a regression now indicated a substantial decline in science ideation (see Figure 1.4, right). In this case, knowledge of students’ level of linguistic communication had a negative predictive relation with their ability to produce appropriate science ideas. That is, it accounted for 62% of the variation in students’ inability to produce biology hypotheses.

      Figure 1.4 Regression analysis showing positive relation between nonlinguistic communicative fluency and ideational fluency (left). Regression showing negative relation between linguistic communicative fluency and ideational fluency (right). (From Oviatt et al. [2012])
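The “variance accounted for” figures reported above are R² values from simple linear regression: the proportion of variance in ideational fluency explained by a least-squares fit on communicative fluency. As a purely illustrative sketch (using synthetic numbers, not the study’s data), the following Python snippet fits such a line and reports R²; the variable names and data ranges are invented for this example:

```python
import numpy as np

def r_squared(x, y):
    """Proportion of variance in y explained by a degree-1 least-squares fit on x."""
    slope, intercept = np.polyfit(x, y, 1)   # highest-degree coefficient first
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)    # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
    return 1 - ss_res / ss_tot

# Synthetic example: per-student nonlinguistic fluency vs. count of hypotheses
rng = np.random.default_rng(0)
fluency = rng.uniform(10, 50, size=30)               # nonlinguistic expressions per student
hypotheses = 0.2 * fluency + rng.normal(0, 1, 30)    # positive relation, plus noise

print(f"R^2 = {r_squared(fluency, hypotheses):.2f}")
```

A value near 1 means the predictor accounts for most of the variance (as in the 72% finding for nonlinguistic fluency); a negative slope with high R² corresponds to the negative predictive relation described for linguistic fluency.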

      From an Activity theory perspective, neuroscience, behavioral, and human-computer interface research all consistently confirm that engaging in more complex and multisensory-multimodal physical actions, such as writing letter shapes, can stimulate human cognition more effectively than passive viewing, naming, or tapping on a keyboard. Keyboard interfaces were never designed to be a thinking tool. They constrict the representations, modalities, and linguistic codes that can be communicated when using computers, and therefore fail to provide comparable expressive power [Oviatt 2013].

      In addition, research related to Activity theory has highlighted the importance of communication as a type of activity that directly stimulates and shapes human cognition. This cognitive facilitation has been demonstrated in a variety of communication modalities. In summary, multimodal interface design is a fertile direction for supporting computer applications involving extended thinking and reasoning.

      All of the theories presented in this chapter have limitations in scope, but collectively they provide converging perspectives on multisensory perception, multimodal communication, and the design of multimodal interfaces that effectively blend information sources. The focus of this chapter has been to summarize the strengths of each theory, and to describe how they have been applied to date in the design of multimodal interfaces. In this regard, the present chapter is by no means exhaustive. Rather, it highlights examples of how theory has influenced past multimodal interface design, often in rudimentary ways. In the future, new and more refined theories will be needed that can predict and coherently explain multimodal research findings, and shed light on how to design truly well-integrated multimodal interfaces and systems.

      1.1. Describe the two main types of theory that have provided a basis for understanding multisensory perception and multimodal communication.

      1.2. What neuroscience findings currently support Gestalt theory, Working Memory theory, Activity theory, Embodied Cognition theory, and Communication Accommodation theory?

      1.3. What human-computer interaction research findings support these theories?

      1.4. What Gestalt laws have become especially important in recent research on multisensory integration? And how has the field of multisensory perception substantially expanded our understanding of multisensory fusion beyond these initial Gestalt concepts?

      1.5. What Working Memory theory concept has been central to understanding the performance advantages of multimodal interfaces, as well as how to design them?

      1.6. Activity theory and related research asserts that communicative activity in all modalities mediates thought, and plays a direct role in guiding and improving human performance. What is the evidence for this in human-computer interaction studies? And what are the key implications for designing multimodal-multisensor interfaces?

      1.7. How is the action-perception loop, described by Embodied Cognition theory, relevant to multisensory perception and multimodal actions? Give one or more specific examples.

      1.8. How do multisensory and multimodal activity patterns influence the brain and its neurological substrates, compared with unimodal activity? What are the implications for multimodal-multisensor interface design?

      1.9. What is the principle of complementarity, and how does it relate to designing “well-integrated” multimodal-multisensor systems? What are the various ways that modality complementarity can be defined, as well as
