Publications
* denotes equal contribution and joint lead authorship.
2025
Translating Requirements into CARLA Executable Scripts: An LLM-Driven Automated Scenario Realization
Scenario-based testing is critical for ensuring the safety and robustness of autonomous driving (AD) systems, particularly in extreme scenarios such as heavy rain, pedestrian-involved crashes, and nighttime conditions. Our previous work integrated the CARLA simulator with the Multi-view Modeling Framework for ML Systems (M3S), facilitating scenario generation but still requiring substantial manual scripting. In this paper, we extend our framework by incorporating Large Language Models (LLMs) with M3S to automate scenario generation. Using a hierarchical prompt design, our approach extracts structured parameters from M3S descriptions into a JSON schema, which guides the LLM to generate accurate CARLA simulation scripts. Our evaluation demonstrates significant improvements in automation accuracy and efficiency, substantially reducing manual intervention and enhancing continuous testing cycles for AD systems.
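The two-stage pipeline described above (extract structured parameters into a JSON schema, then use that JSON to guide script generation) can be sketched as follows. This is a minimal illustration, not the paper's implementation; the schema fields, function names, and prompt wording are all hypothetical.

```python
import json

# Hypothetical schema for scenario parameters extracted from an M3S
# description (field names are illustrative, not taken from the paper).
SCENARIO_SCHEMA = {
    "weather": "string, e.g. 'heavy_rain' or 'clear'",
    "time_of_day": "string, e.g. 'night'",
    "actors": "list of participants, e.g. ['ego_vehicle', 'pedestrian']",
    "event": "string, e.g. 'pedestrian_crossing'",
}

def build_extraction_prompt(m3s_description: str) -> str:
    """Stage 1 of a hierarchical prompt: ask the LLM to map a free-text
    M3S scenario description onto the fixed JSON schema."""
    return (
        "Extract the scenario parameters from the description below and "
        "return ONLY a JSON object with these keys:\n"
        f"{json.dumps(SCENARIO_SCHEMA, indent=2)}\n\n"
        f"Description: {m3s_description}"
    )

def build_script_prompt(params: dict) -> str:
    """Stage 2: the extracted (and validated) JSON guides generation of a
    CARLA simulation script."""
    return (
        "Generate a CARLA Python simulation script implementing this "
        f"scenario:\n{json.dumps(params, indent=2)}"
    )

# Example of the structured parameters flowing between the two stages.
params = {"weather": "heavy_rain", "time_of_day": "night",
          "actors": ["ego_vehicle", "pedestrian"],
          "event": "pedestrian_crossing"}
script_prompt = build_script_prompt(params)
```

Constraining the intermediate representation to a fixed JSON schema is what makes the second prompt reliable: the script-generation step sees a small, validated parameter set instead of free-form text.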
Enhancing Safety in Autonomous Driving: Integrating CARLA for Multi-Sensor Dataset Generation and Advanced Scenario Testing
Simulation environments are vital to autonomous driving research, enabling safe and cost-effective studies of dense traffic, adverse weather, and sidewalk navigation. Yet real-world data collection for these scenarios can be hazardous and expensive. To address this, we integrate the CARLA simulator for dataset generation and scenario construction. Our approach leverages CARLA’s autopilot to capture traffic-sign data via LiDAR detection and semantic segmentation, and employs a custom manual script for sidewalk data. We also vary weather and traffic density, using RoadRunner for specialized maps. Preliminary results suggest CARLA-generated data helps identify domain gaps when combined with real GTSRB data, and improves segmentation (IoU) in sidewalk scenes. Looking ahead, we propose automated scenario generation integrated with M3S: engineers define high-level objectives, which are then incorporated into CARLA, ensuring robust evaluations for critical autonomous driving scenarios.
VRTopic: Advancing Topic Modeling in Virtual Reality User Reviews with Large Language Models
With the rapid development of Virtual Reality (VR) technology, effectively understanding user feedback has become a core task for improving user experience and optimizing system functionality. However, extracting meaningful insights from VR user reviews remains challenging. Traditional topic modeling methods often generate unannotated and ambiguous topics, requiring extensive manual annotation and analysis. To address this issue, this study proposes an innovative approach that leverages state-of-the-art Large Language Models (LLMs) to automatically identify and precisely summarize key topics from VR user reviews. Ultimately, this research aims to generate accurate topics from VR-related textual inputs that genuinely reflect user concerns. By filling the gap in the application of LLMs to VR text analysis, this study provides VR developers with precise user insights, aiding product optimization and iterative improvement.
2024
Generative AI for Requirements Engineering: A Systematic Literature Review
Context: Generative AI (GenAI) has emerged as a transformative tool in software engineering, with requirements engineering (RE) actively exploring its potential to revolutionize processes and outcomes. The integration of GenAI into RE presents both promising opportunities and significant challenges that necessitate systematic analysis and evaluation. Objective: This paper presents a comprehensive systematic literature review (SLR) analyzing state-of-the-art applications and innovative proposals leveraging GenAI in RE. It surveys studies focusing on the utilization of GenAI to enhance RE processes while identifying key challenges and opportunities in this rapidly evolving field. Method: A rigorous SLR methodology was used to conduct an in-depth analysis of 27 carefully selected primary studies. The review examined research questions pertaining to the application of GenAI across various RE phases, the models and techniques used, and the challenges encountered in implementation and adoption. Results: The most salient findings include (i) a predominant focus on the early stages of RE, particularly the elicitation and analysis of requirements, indicating potential for expansion into later phases; (ii) the dominance of large language models, especially the GPT series, highlighting the need for diverse AI approaches; and (iii) persistent challenges in domain-specific applications and the interpretability of AI-generated outputs, underscoring areas requiring further research and development. Conclusions: The results highlight the critical need for comprehensive evaluation frameworks, improved human–AI collaboration models, and thorough consideration of ethical implications in GenAI-assisted RE. Future research should prioritize extending GenAI applications across the entire RE lifecycle, enhancing domain-specific capabilities, and developing strategies for responsible AI integration in RE practices.
2023
Analysis of Spectro-Temporal Modulation Representation for Deep-Fake Speech Detection
The 15th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2023), Taipei, Taiwan, 31 October - 3 November 2023.
Deep-fake speech detection aims to develop effective techniques for identifying fake speech generated with advanced deep-learning methods, reducing the negative impact of maliciously produced or disseminated fake speech in real-life scenarios. Although humans can distinguish genuine from fake speech relatively easily thanks to the human auditory mechanism, machines find it difficult to tell them apart. One major reason for this challenge is that machines struggle to effectively separate speech content from information about the human vocal system. Common features used in speech processing have difficulty handling this issue, hindering neural networks from learning the discriminative differences between genuine and fake speech. To address this, we investigated spectro-temporal modulation representations of genuine and fake speech, which simulate the human auditory perception process. The spectro-temporal modulation features were then fed into a light convolutional neural network with bidirectional long short-term memory for classification. We conducted experiments on the benchmark datasets of the Automatic Speaker Verification and Spoofing Countermeasures Challenge 2019 (ASVspoof2019) and the Audio Deep synthesis Detection Challenge 2023 (ADD2023), achieving equal-error rates of 8.33% and 42.10%, respectively. The results show that spectro-temporal modulation representations can distinguish genuine from deep-fake speech and achieve adequate performance on both datasets.