Our research
Immersive Virtual Reality in Child Interview Skills Training: A Comparison of 2D and 3D Environments
The current study aims to evaluate and compare the subjective quality of an AI-based training system developed for conducting child interviews, focusing on the distinction between immersive 3D (using virtual reality) and 2D desktop environments. To this end, a structured user study was conducted, involving 36 participants who were exposed to these two distinct environments. The study evaluated various aspects of user experience, namely presence, usability, visual fidelity, emotion, responsiveness, appropriateness, and training effectiveness. The findings reveal significant differences in user experience between the 2D and 3D environments. Notably, the 3D environment enhanced presence, visual fidelity, training effectiveness, and empathy. In contrast, the 2D environment was favored for usability. The study highlights the potential of immersive VR while also pointing out the need to improve the system response and emotional expressiveness of the avatars.
Enhancing investigative interview training using a child avatar system: A comparative study of interactive environments
Crowdsourced and remote user studies have recently gained popularity as alternatives to traditional laboratory studies. However, they are subject to unreliability, and it is challenging to ensure that valid results are collected, especially when conducting user studies with experts.
Enhancing questioning skills through child avatar chatbot training with feedback
Training child investigative interviewing skills is a specialized task. Those being trained need opportunities to practice their skills in realistic settings and receive immediate feedback. A key step in ensuring the availability of such opportunities is to develop a dynamic, conversational avatar, using artificial intelligence (AI) technology that can provide implicit and explicit feedback to trainees. In this iterative process, using a chatbot avatar to test the language and conversation model is crucial. This study used a pre-post training design to assess the learning effects on questioning skills across four child interview sessions that involved training with a child avatar chatbot fine-tuned with interview data and realistic scenarios. Thirty university students from the areas of child welfare, social work, and psychology were divided into two groups; one group received direct feedback (n = 12), whereas the other received no feedback (n = 18). An automatic coding function in the language model identified the question types. Information on question types was provided as feedback in the direct feedback group only. The scenario included a 6-year-old girl being interviewed about alleged physical abuse. After the first interview session (baseline), all participants watched a video lecture on memory, witness psychology, and questioning before they conducted two additional interview sessions and completed a post-experience survey. One week later, they conducted a fourth interview and completed another post-experience survey. All chatbot transcripts were coded for interview quality. The language model's automatic feedback function was found to be highly reliable in classifying question types, reflecting the substantial agreement among the raters [Cohen's kappa (κ) = 0.80] in coding open-ended, cued recall, and closed questions.
Participants who received direct feedback showed a significantly higher improvement in open-ended questioning than those in the non-feedback group, with a significant increase in the number of open-ended questions used between the baseline and each of the other three chat sessions. This study demonstrates that child avatar chatbot training improves interview quality with regard to recommended questioning, especially when combined with direct feedback on questioning.
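The agreement statistic reported in this abstract, Cohen's kappa, corrects raw agreement between two raters for the agreement expected by chance. The following is a minimal sketch of that calculation, not the study's actual coding pipeline; the function name `cohen_kappa` and the ten example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of items that both raters labelled identically.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical question-type codes for ten interviewer questions
# (illustrative only; these are NOT the study's data).
human = ["open", "open", "closed", "cued", "closed",
         "open", "cued", "closed", "open", "closed"]
model = ["open", "open", "closed", "cued", "closed",
         "cued", "cued", "closed", "open", "open"]

kappa = cohen_kappa(human, model)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on 8 of 10 items, but kappa is lower than 0.8 because some of that agreement would be expected by chance alone.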
A field assessment of child abuse investigators' engagement with a child-avatar to develop interviewing skills
Child investigative interviewing is a complex skill requiring specialised training. A critical training element is practice. Simulations with digital avatars are cost-effective options for delivering training. This study of real-world data provides novel insights evaluating a large number of trainees' engagement with LiveSimulation (LiveSim), an online child-avatar that involves a trainee selecting a question (i.e., an option-tree) and the avatar responding with the level of detail appropriate for the question type. While LiveSim has been shown to facilitate learning of open-ended questions, its utility (from a user engagement perspective) remains to be examined.
A Comparative Study of Interactive Environments for Investigative Interview of A Virtual Child Avatar
In cases of suspected child abuse, police and child protection services (CPS) personnel often conduct investigative interviews. These interviews are intended to obtain a reliable account from the child about the alleged crime so that proper legal and clinical interventions can be implemented, with the prosecution presenting the case in court. Studies show that the quality of these interviews is often poor, and that current training programs for interviewers are inefficient. Training must be spaced and repeated over time. In this paper, we present a system that simulates the scenario of interviewing a victim of child sexual abuse in four different interactive environments. We conducted a user study in which participating experts interviewed a simulated child to determine which interactive environment would provide the highest quality of experience (QoE) and motivate experts to spend more time practicing with the system. The experts include CPS workers and child welfare students. This study measures the overall QoE, realism, responsiveness, presence, and flow of each user interaction, as well as the learning effects, engagement in learning, and self-efficacy. The study shows that 66% of the participants would prefer to use virtual reality (VR) for interactive training over other environments. VR scored highest on most of the assessment metrics.
A Virtual Reality Talking Avatar for Investigative Interviews of Maltreat Children
Interviews conducted with maltreated children are often the primary source of evidence in prosecution. Many alleged incidents of abuse are not prosecuted because the children's testimony is collected in an unreliable way. Research shows the consistently poor quality of these interviews and highlights the need for better training of Child Protection Services (CPS) and police personnel who interview abused child witnesses. The currently available systems for training CPS and police personnel are rigid in design and fall short in generating dynamic responses. Moreover, these systems require human input, such as an actor mimicking a child or an operator controlling prerecorded child responses during the interactions. This paper demonstrates the prototype of an interview training program with an artificially intelligent Child Avatar in Virtual Reality (VR), enabling CPS and police personnel to practice interviewing abused children. The program is developed using the Unity game engine and artificial intelligence-based technologies such as dialogue models, talking visual avatars, text-to-speech, and speech-to-text components.
Human vs. GPT-3: The challenges of extracting emotions from child responses
Conducting interviews with abused children requires a specific skill set that is obtained by undergoing special training. Once acquired, these interviewing skills also have to be constantly refreshed to maintain a high level of quality. Technology, such as synthetic video generation and natural language processing, is at a stage that could allow the construction of a system that makes this task easier. Thus, we aim to design a training system aided by machine learning that can support interview training with an interactive child avatar capable of meaningful interactions with the trainees. In these interviews, emotions play an important role, so we conduct three different user studies in a remote study setting with the aim of analyzing child emotions in these interviews. In these user studies, the participants had to classify different transcript excerpts as one of the possible predefined emotions. These human annotations are used to measure the performance of sentiment analysis using GPT-3. We investigate different approaches to obtain the correct classifications by changing the amount of context the participants and the model get to see. Our experiments show that humans have a hard time agreeing when choosing between seven different emotions. Agreement improves when we reduce the set of emotions to four. In addition, we found that context is needed to make a motivated choice, but too much context can make the choice vague, reducing the quality of the judgment.
Comparison of Crowdsourced and Remote Subjective User Studies: A Case Study of Investigative Child Interviews
Crowdsourced and remote user studies have recently gained popularity as alternatives to traditional laboratory studies. However, they are subject to unreliability, and it is challenging to ensure that valid results are collected, especially when conducting user studies with experts. Experts are a sparse resource, usually having busy schedules and heavy workloads, and are not necessarily geographically close. They are therefore often unwilling to participate in studies which require physical attendance. In this paper, we compare three alternative methods: crowdsourced user study with non-experts, remote user study with non-experts, and remote user study with domain experts, for a use case involving investigative child interview training. We present the results from three subjective studies about the perception of AI-generated child avatars, which are developed using various technologies such as dialogue models, a game engine, text-to-speech, and speech-to-text components. The study was conducted with three different user groups, and our results indicate the importance of using best practice measures for ensuring the collection of reliable results in crowdsourced settings as compared to remote studies, and highlight the difference between the perspectives of domain experts and non-experts.
Towards an AI-driven talking avatar in virtual reality for investigative interviews of children
Artificial intelligence (AI) and gaming systems have advanced to the stage where the current models and technologies can be used to address real-world problems. The development of such systems comes with various challenges, most of them related to system performance, complexity, and user testing. Using a virtual reality (VR) environment, we have designed and developed a game-like system that mimics an abused child and can assist police and child protection service (CPS) personnel in training to interview maltreated children. Current research in this area points to the poor quality of conducted interviews, and emphasises the need for better training methods. Information obtained in these interviews is the core piece of evidence in the prosecution process. We utilised advanced dialogue models, talking visual avatars, and VR to build a virtual child avatar that can interact with users. We discuss our proposed architecture and the performance of the developed child avatar prototype, and we present the results from the user study conducted with CPS personnel. The user study investigates the users' perceived quality of experience (QoE) and their learning effects. Our study confirms that such a gaming system can increase the knowledge and skills of the users. We also benchmark and discuss the system performance aspects of the child avatar. Our results show that the proposed prototype works well in practice and is well received by the interview experts.
Synthesizing a Talking Child Avatar to Train Interviewers Working with Maltreated Children
When responding to allegations of child sexual, physical, and psychological abuse, Child Protection Service (CPS) workers and police personnel need to elicit detailed and accurate accounts of the abuse to assist in decision-making and prosecution. Current research emphasizes the importance of the interviewer's ability to follow empirically based guidelines. In doing so, it is essential to implement economical and scientific training courses for interviewers. Building on recent advances in artificial intelligence, we propose to generate a realistic and interactive child avatar, aiming to mimic a child. Our ongoing research involves the integration and interaction of different components with each other, including how to handle the language, auditory, emotional, and visual components of the avatar. This paper presents three subjective studies that investigate and compare various state-of-the-art methods for implementing multiple aspects of the child avatar. The first user study evaluates the whole system, shows that the system is well received by the experts, and highlights the importance of its realism. The second user study investigates the emotional component and how it can be integrated with video and audio, and the third user study investigates realism in the auditory and visual components of the avatar created by different methods. The insights and feedback from these studies have contributed to the refined and improved architecture of the child avatar system, which we present here.
Is More Realistic Better? A Comparison of Game Engine and GAN-based Avatars for Investigative Interviews of Children
The success of investigative interviews with maltreated children is often defined by the interviewer's ability to elicit a reliable and coherent account of the alleged incident from the child. Research shows that a child avatar mimicking a maltreated child can improve interviewers' performance in conducting these interviews. The realism of such a child avatar is considered one of the most critical factors. Based on this, the current study aims to generate realistic child avatars in real time that utilize multimodal data and different components from artificial intelligence. This paper discusses the subjective findings of a study of two types of child avatar videos: animated avatars created using the Unity game engine and photorealistic talking-head avatars created using generative adversarial networks (GANs). The results show that although the state-of-the-art GAN-generated avatars are significantly more realistic, they do not necessarily create a better experience, as most of the participants prefer talking to animated avatars.
Multimodal Virtual Avatars for Investigative Interviews with Children
In this article, we present our ongoing work in the field of training police officers who conduct interviews with abused children. The objectives in this context are to protect vulnerable children from abuse, facilitate prosecution of offenders, and ensure that innocent adults are not accused of criminal acts. There is therefore a need for more data that can be used for improved interviewer training to equip police with the skills to conduct high-quality interviews. To support this important task, we propose to research a training program that utilizes different system components and multimodal data from the field of artificial intelligence such as chatbots, generation of visual content, text-to-speech, and speech-to-text. This program will be able to generate an almost unlimited amount of interview data and training data. The goal of combining all these different technologies and data types is to create an immersive and interactive child avatar that responds in a realistic way, supports the training of police interviewers, and can also produce synthetic data of interview situations that can be used to solve different problems in the same domain.