I’m a third-year Ph.D. student at the School of Electrical Engineering and Computer Science, the University of Queensland, mentored by A/Prof. Guangdong Bai.
My research focuses on detecting security and privacy issues in systems and third-party application ecosystems, using privacy documentation as the reference. My work has been published in leading conferences and journals, including FSE, ICSE, ASE, and PETs.
Chuan Yan, Mark Huasong Meng, Liuhuo Wan, Tian Yang Ooi, Ruomai Ren, Guangdong Bai
ASE'24: The 39th IEEE/ACM International Conference on Automated Software Engineering 2024
As the flagship large language model (LLM) product of OpenAI, ChatGPT has gained global attention for its remarkable ability to handle complex natural language understanding and generation tasks. Inspired by the success of the mobile app ecosystems, OpenAI enables third-party developers to create ChatGPT plugins to further expand ChatGPT’s capabilities. These plugins are distributed through OpenAI’s plugin store and are easily accessible to users. With ChatGPT as the powerful backbone, this app ecosystem has illustrated great business potential by offering users personalized services in a conversational manner. Nonetheless, this ecosystem is still in its nascent stage and undergoing dynamic evolution. Many crucial aspects regarding app development, deployment, and security have yet to be thoroughly studied in the research community, potentially hindering a wider adoption by both developers and users. In this work, we conduct the first comprehensive study of the ChatGPT app ecosystem, aiming to unveil its landscape to our research community. Our study focuses on the distribution and deployment models in the integration of LLMs and third-party apps, and assesses their security and privacy implications. We investigate the runtime execution mechanism of ChatGPT apps and accordingly propose a three-layer security assessment model from the perspectives of users, app developers, and store operators. Our evaluation of all 1,038 plugins available in the store reveals their uneven distribution of functionality, underscoring prevalent and emerging topics. Our security assessment also reveals a concerning status quo of security and privacy in the ChatGPT app ecosystem. We find that the authentication and user data protection for third-party app APIs integrated within LLMs contain severe flaws.
For example, 173 plugins have broken access control vulnerabilities, 368 plugins are subject to leaking manifest files, and 271 plugins provide inaccessible legal document links. Our study for the first time highlights the immaturity of the ChatGPT app ecosystem. Our findings should especially raise an alert to OpenAI and third-party developers to collaboratively maintain the security and privacy compliance of this emerging ecosystem.
Chuan Yan, Mark Huasong Meng, Fuman Xie, Guangdong Bai
FSE'24: Proceedings of the ACM on Software Engineering, Volume 1, Issue FSE 2024
Android has empowered third-party apps to access data and services on mobile devices since its genesis. This involves a wide spectrum of user privacy-sensitive data, such as the device ID and location. In recent years, Android has taken proactive measures to adapt its access control policies for such data, in response to the increasingly strict privacy protection regulations around the world. When each new Android version is released, its privacy changes induced by the version evolution are transparently disclosed, and we refer to them as documented privacy changes (DPCs). Implementing DPCs in Android OS is a non-trivial task, due to not only the dispersed nature of those access control points within the OS, but also the challenges posed by backward compatibility. As a result, whether the actual access control enforcement in the OS implementations aligns with the disclosed DPCs becomes a critical concern. In this work, we conduct the first systematic study on the consistency between the operational behaviors of the OS at runtime and the officially disclosed DPCs. We propose DopCheck, an automatic DPC-driven testing framework equipped with a large language model (LLM) pipeline. It features a series of analyses to extract the ontology from the privacy change documents written in natural language, and then harnesses the few-shot capability of LLMs to construct test cases for the detection of DPC-compliance issues in OS implementations. We apply DopCheck to the latest versions (10 to 13) of the Android Open Source Project (AOSP). Our evaluation involving 79 privacy-sensitive APIs demonstrates that DopCheck can effectively recognize DPCs from Android documentation and generate rigorous test cases. Our study reveals that the status quo of the DPC-compliance issues is concerning, evidenced by 19 bugs identified by DopCheck.
Notably, 12 of them are discovered in Android 13 and 6 in Android 10 for the first time, exposing more than 35% of Android users to the risk of privacy leakage. Our findings should raise an alert to Android users and app developers on the DPC compliance issues when using or developing an app, and also underscore the necessity for Google to comprehensively validate the actual implementation against its privacy documentation prior to the OS release.