PENED

Artificial Intelligence & Information Analysis

Partners

– Dept. of Psychology, Aristotle Univ.

– Multimedia Lab., Dept. of Informatics, Aristotle Univ.

Project’s Rationale

The main objective of this action is to investigate how virtual reality tools can be used for education on natural disasters. Current training methods include formal classes, books, multimedia applications, interactive simulations, on-the-job training, etc. Even though on-the-job training is particularly effective for complex tasks involving a great deal of independence, it is also the most expensive of all. Moreover, the need for experienced personnel to conduct such training, as well as the frequent unavailability of the training context, makes its use prohibitive. For these reasons, Virtual Reality (VR) applications have been developed to fulfill those requirements.

We are developing a system that allows trainees to build the psychological skills they need to face reality at an emergency site. The training focuses on rapid situation assessment and decision-making under exceptionally stressful conditions (situation training). Initially, a portion (30%) of the students participating in the training will be trained in the Virtual Reality environment, while the rest will be instructed in traditional ways. Afterwards, each trainee will be immersed in a 3D virtual model of a school classroom, inside which he or she will experience an earthquake simulation. Before and after the simulation, physiological measurements (such as heart rate and skin conductance, which indicates sweating) will be recorded and analyzed after the experiment, as illustrated in the sketch below.
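As a hedged illustration of the intended analysis (not the project's actual pipeline), the following Python sketch compares pre- and post-simulation heart rate measurements with a paired t-test; the per-trainee values are hypothetical placeholders.

import numpy as np
from scipy import stats

# Hypothetical per-trainee heart rates (beats per minute), recorded
# before and after the earthquake simulation.
hr_before = np.array([72.0, 68.0, 75.0, 70.0, 66.0, 74.0])
hr_after = np.array([88.0, 81.0, 90.0, 79.0, 77.0, 92.0])

# Paired t-test: did the simulation significantly change heart rate?
t_stat, p_value = stats.ttest_rel(hr_after, hr_before)
print(f"mean increase: {np.mean(hr_after - hr_before):.1f} bpm, p = {p_value:.4f}")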

All of the above measurements will be used to identify the emotion that best describes the psychological state of the trainee during the earthquake simulation.

Our Research Objectives

Emotion Analysis for Virtual Earthquakes

The Emotion Analysis program is a proposal of the Psychology Dept. of Aristotle Univ. which aims at the development of a Virtual Reality system with augmented-environment capabilities, such as heart pulse and sweat measurement feedback. The trainee's performance is evaluated through emotion analysis based on the prosodic characteristics of speech and emotion recognition from facial expressions. The system, built by the AIIA lab together with the Psychology department, is an experimental prototype that offers a promising way to train children for earthquakes, which are common in Greece, one of the ten countries with the greatest seismic activity.

Hardware Setup

- Sound: An AKG microphone with a sound console

- Biosignals: Plethysmograph and Galvanic Skin Response, recorded with an IWORX 114

- VR glasses: iglasses

- 2 Cameras: Panasonic, Canon

- 2 Portable Computers

Our Contributions

Emotional Speech Recognition

Emotion is an important factor in communication. For example, simple text dictation that does not reveal any emotion does not adequately convey the semantics of the text. An emotional speech synthesizer could solve such a communication problem. Speech emotion recognition systems can be used by disabled people for communication, by actors to check the consistency of emotional speech, in interactive TV, for constructing virtual teachers, in the study of human brain malfunctions, and in the advanced design of speech coders. Until recently, many voice synthesizers could not faithfully reproduce emotional human speech, which resulted in unnatural and unattractive output. Nowadays, the major speech processing labs worldwide are trying to develop efficient algorithms for emotional speech synthesis as well as emotional speech recognition. To achieve such ambitious goals, the collection of emotional speech databases is a prerequisite.

Our purpose is to design a useful tool for psychology that automatically classifies utterances into five emotional states: anger, happiness, neutral, sadness, and surprise. The major contribution of our investigation is to rate the discriminating capability of a set of features for emotional speech recognition. A total of 87 features have been calculated over 500 utterances from the Danish Emotional Speech database. The Sequential Forward Selection (SFS) method has been used to discover the subset of 5 to 10 features that classifies the utterances best.
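As an illustration, the following is a minimal sketch of Sequential Forward Selection with cross-validated accuracy as the selection criterion. The feature matrix X (500 utterances × 87 features) and labels y are random placeholders, and the base classifier chosen here (nearest mean, via scikit-learn's NearestCentroid) is only one of the criteria discussed below; this is not the project's actual code.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import NearestCentroid

def sfs(X, y, n_features, clf, cv=10):
    """Greedy Sequential Forward Selection: starting from the empty set,
    repeatedly add the feature whose inclusion maximizes the
    cross-validated classification score."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_features):
        scores = [
            (cross_val_score(clf, X[:, selected + [j]], y, cv=cv).mean(), j)
            for j in remaining
        ]
        best_score, best_j = max(scores)
        selected.append(best_j)
        remaining.remove(best_j)
        print(f"added feature {best_j}, CV accuracy = {best_score:.3f}")
    return selected

# Placeholder data: 500 utterances x 87 prosodic features, 5 emotion classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 87))
y = rng.integers(0, 5, size=500)

best_features = sfs(X, y, n_features=5, clf=NearestCentroid())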

The criterion used in SFS is the cross-validated correct classification score of one of the following classifiers: the nearest mean classifier, and a Bayes classifier in which the class pdfs are either approximated via Parzen windows or modelled as Gaussians. After selecting the 5 best features, we reduce the dimensionality to two by applying principal component analysis. The result is a correct classification rate of 51.6% ± 3% at the 95% confidence level for the five aforementioned emotions, whereas random classification would yield a correct classification rate of 20%.
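A sketch of this evaluation step, under the assumption that the Gaussian class-conditional Bayes classifier can be approximated with scikit-learn's QuadraticDiscriminantAnalysis (one Gaussian per class): project the selected features onto the first two principal components and estimate the correct classification rate by cross-validation. The data are again placeholders.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X5 = rng.normal(size=(500, 5))   # the 5 features chosen by SFS (placeholder)
y = rng.integers(0, 5, size=500)

# Reduce the 5 selected features to 2 dimensions, then fit a Bayes
# classifier with Gaussian class-conditional densities.
model = make_pipeline(PCA(n_components=2), QuadraticDiscriminantAnalysis())
scores = cross_val_score(model, X5, y, cv=10)

# Mean rate and a rough 95% confidence interval (normal approximation).
mean = scores.mean()
half_width = 1.96 * scores.std(ddof=1) / np.sqrt(len(scores))
print(f"correct classification rate: {mean:.3f} +/- {half_width:.3f}")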

Furthermore, we identify the two-class emotion recognition problems whose error rates contribute most heavily to the average error, and we indicate that the error rates reported here could be reduced further by employing two-class classifiers and combining their decisions.
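A minimal sketch of such a pairwise scheme, using scikit-learn's OneVsOneClassifier, which trains one two-class classifier per pair of emotions and combines their decisions by voting; the base classifier and data are placeholder choices, not the project's actual setup.

import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsOneClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = rng.integers(0, 5, size=500)

# One two-class classifier per pair of the 5 emotions (10 in total);
# predictions are combined by majority vote.
ovo = OneVsOneClassifier(QuadraticDiscriminantAnalysis())
print(cross_val_score(ovo, X, y, cv=10).mean())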