Cyberphysical Smart Cities Infrastructures. Group of authors
3 Embodied AI‐Driven Operation of Smart Cities: A Concise Review
Farzan Shenavarmasouleh1, Ghareh Mohammadi1, M. Hadi Amini2, and Hamid Reza Arabnia1
1Department of Computer Science, Franklin College of Arts and Sciences, University of Georgia, Athens, GA, USA
2Knight Foundation School of Computing and Information Sciences, Florida International University, Miami, FL, USA
3.1 Introduction
A smart city is an urban area that employs information and communication technologies (ICT) [1]: an intelligent network of connected devices and sensors that work interdependently [2, 3] and in a distributed manner [4] to continuously monitor the environment, collect data, and share them with the other assets in the ecosystem. A smart city uses all the available data to make real‐time decisions about its many individual components, easing the lives of its citizens and making the whole system more efficient, more environmentally friendly, and more sustainable [5]. This serves as a catalyst for creating a city with faster transportation, fewer accidents, enhanced manufacturing, more reliable medical services and utilities, less pollution [6], and so on. The good news is that any city, even one with traditional infrastructure, can be transformed into a smart city by integrating Internet of Things (IoT) technologies [7].
An essential part of a smart city is its use of smart agents. These agents can vary widely in size, shape, and functionality. They can be as simple as light sensors that, together with their controller, act as energy‐saving agents, or they can be more advanced machines with complicated controllers and interconnected components that are capable of tackling harder problems. The latter agents usually come with an embodiment that has numerous sensors and controllers built in, enabling them to perform high‐level, human‐level tasks such as talking, walking, seeing, and complex reasoning, along with the ability to interact with the environment. Embodied artificial intelligence is the field of study that takes a deeper look at these agents and explores how they can fit into the real world and how they can eventually act as our future community workers, personal assistants, robocops, and many more.
Imagine arriving home after a long working day and seeing your home robot waiting for you at the entrance door. Although it is not the most romantic thing ever, you walk up to it and ask it to make you a cup of coffee and to add two teaspoons of sugar if there is any in the cabinet. For this to become reality, the robot has to have a vast range of skills. It should be able to understand your language and translate questions and instructions into actions. It should be able to see its surroundings and recognize objects and scenes. Last but not least, it must know how to navigate in a large dynamic environment, interact with the objects within it, and be capable of long‐term planning and reasoning.
In the past few years, there has been significant progress in the fields of computer vision, natural language processing, and reinforcement learning, thanks to the advancements in deep learning models. Many things that seemed impossible a few years ago are now possible because of these advances. However, most of the work has been done in isolation from other lines of work. This means that a trained model can take only one type of data (e.g. image, text, video) as input and perform only the single task it was trained for. Consequently, such a model acts as a single‐sensory machine as opposed to a multi‐sensory one. Also, for the most part, these models belong to Internet artificial intelligence (AI) rather than embodied AI. The goal of Internet AI is to learn patterns in text, images, and videos from datasets collected from the Internet.
If we zoom out and look at the way models in Internet AI are trained, we realize that supervised classification is generally the way to go. For instance, we provide a certain number of dog and cat photos along with the corresponding labels to a perception model. If the number is large enough, the model can then successfully learn the differences between these two animals and discriminate between them. For humans, learning via flashcards falls under the same umbrella.
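The labeled-examples idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method used by any of the cited works: each photo is assumed to have already been reduced to two hypothetical numeric features, the toy values are invented, and a nearest-centroid rule stands in for a real perception model.

```python
# Minimal supervised-classification sketch (nearest centroid).
# ASSUMPTIONS: the two features per "photo" and all values below are
# hypothetical toy data; real perception models learn from raw pixels.

def train(samples):
    """Average the feature vectors of each label into one centroid."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [sum(column) / len(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(label):
        return sum((c - f) ** 2 for c, f in zip(centroids[label], features))
    return min(centroids, key=squared_distance)

# Labeled training set: (features, label) pairs, like flashcards.
TRAINING = [
    ((0.90, 0.20), "cat"), ((0.80, 0.30), "cat"), ((0.85, 0.25), "cat"),
    ((0.30, 0.80), "dog"), ((0.20, 0.90), "dog"), ((0.35, 0.70), "dog"),
]

model = train(TRAINING)
print(predict(model, (0.88, 0.22)))
print(predict(model, (0.25, 0.85)))
```

The model never interacts with anything: it only sees fixed, pre-labeled examples, which is the defining trait of the Internet AI setting described here.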
An extensive amount of time has been devoted in the past years to gathering and building huge datasets for the imaging and language communities. Notable examples include ImageNet [8], MS COCO [9], SUN [10], Caltech‐256 [11], and Places [12], created for vision tasks; SQuAD [13], GLUE [14], and SWAG [15], built for language objectives; and Visual Genome [16] and VQA [17], created for joint vision and language purposes, to name a few.
Apart from playing a pivotal role in the recent advances of the main fields, these datasets have also proved useful when combined with transfer learning methods to help related disciplines such as biomedical imaging [18, 19]. However, the aforementioned datasets are prone to limitations. Firstly, it can be extremely costly, in terms of both time and money, to gather all the required data for a collection and label them. Secondly, the collection has to be monitored constantly to ensure that it follows certain rules, both to avoid creating biases that could lead to erroneous results in future works [20] and to make sure that the collected data are normalized and uniform in terms of attributes such as background, size, position of the objects, lighting conditions, etc. In real‐world scenarios, by contrast, this cannot be the case, and robots have to deal with a mixture of unnormalized, noisy, irrelevant data along with the relevant, well‐curated ones. Additionally, an agent in the wild would be able to interact with objects (e.g. picking an object up and looking at it from another angle) and also use its other senses, such as smell and hearing, to collect information (Figure 3.1).
Figure 3.1 Embodied AI in smart cities.
Humans learn from interaction, and interaction is a must for true intelligence in the real world. In fact, this holds not only for humans but also for animals. In the kitten carousel experiment [21], Held and Hein demonstrated this beautifully. They studied the visual development of two kittens in a carousel over time. One of them could touch the ground and control its own motion within the restrictions of the device, while the other was just a passive observer. At the end of the experiment, they found that the visual development of the former kitten was normal, whereas that of the latter was not, even though both had seen the same things. This indicates that being able to physically experience the world and interact with it is a key element of learning [22].
The goal of embodied AI is to bring the ability to interact and the ability to use multiple senses simultaneously into play, enabling the robot to learn continuously, in a lightly supervised or even unsupervised way, in a rich dynamic environment.
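To make the contrast with label-driven training concrete, the following is a minimal sketch of an agent that learns only from its own interaction with an environment. The chapter does not prescribe an algorithm at this point; tabular Q-learning is used here purely as a familiar stand-in, and the five-cell corridor, the reward, and every constant and name are hypothetical choices made for the illustration.

```python
import random

# ASSUMPTIONS: the corridor world, reward scheme, and hyperparameters below
# are hypothetical; tabular Q-learning is a stand-in, not the chapter's method.
N_STATES = 5            # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)      # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

def step(state, action):
    """Toy environment: move, clamp to the corridor, reward 1 at the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def train(episodes=500, seed=0, max_steps=1000):
    """Learn action values by trial and error; no labels are provided."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            left, right = q[state]
            # Explore with probability EPSILON, or when the values are tied.
            if rng.random() < EPSILON or left == right:
                a = rng.randrange(2)
            else:
                a = 0 if left > right else 1
            nxt, reward = step(state, ACTIONS[a])
            done = nxt == N_STATES - 1
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
            if done:
                break
    return q

def greedy_policy(q):
    """Best known action for every non-goal cell."""
    return [ACTIONS[0] if left > right else ACTIONS[1] for left, right in q[:-1]]

q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

The only feedback the agent receives is the reward it encounters while acting, which is a very small instance of the lightly supervised, interaction-driven learning described above. A real embodied agent would replace the corridor with rich multisensory observations.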
3.2