Abstract
Conversational Agents (CAs) increasingly provide interactive assistance to users. However, current dialogue modelling techniques for CAs are predominantly based on hard-coded rules and rigid interaction flows, which limits their flexibility and scalability. Large Language Models (LLMs) offer an alternative, but they often provide limited privacy protection for end-users, since most of them run on cloud services. To address these problems, we leverage transfer learning and study how to best fine-tune lightweight pre-trained LLMs to predict the intent of user queries. Importantly, our models can be deployed on-device, making them suitable for personalised, ubiquitous, and privacy-preserving scenarios. Our experiments suggest that RoBERTa and XLNet offer the best trade-off under these constraints, and that, after fine-tuning, they perform on par with ChatGPT. We also discuss the implications of this research for relevant stakeholders, including researchers and practitioners. Taken together, this paper provides insights into the suitability of LLMs for on-device CAs and highlights the middle ground between LLM performance and memory footprint, while also considering privacy implications.
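For illustration, the sketch below shows the general fine-tuning recipe described above: adapting a lightweight pre-trained LLM (here RoBERTa) to query intent classification with Hugging Face Transformers and TensorFlow/Keras. This is not the paper's training code; the toy data, number of intents, and hyperparameters are illustrative assumptions.

    # Minimal fine-tuning sketch (illustrative, not the paper's main.py).
    import tensorflow as tf
    from transformers import RobertaTokenizerFast, TFRobertaForSequenceClassification

    NUM_INTENTS = 7  # assumption: depends on the chosen intent dataset
    tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
    model = TFRobertaForSequenceClassification.from_pretrained(
        "roberta-base", num_labels=NUM_INTENTS
    )

    # Toy training data; in practice, use one of the public intent datasets.
    queries = ["play some jazz music", "what is the weather tomorrow"]
    labels = [0, 1]
    enc = tokenizer(queries, padding=True, truncation=True, max_length=64, return_tensors="tf")
    dataset = tf.data.Dataset.from_tensor_slices((dict(enc), labels)).batch(2)

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    model.fit(dataset, epochs=3)
    model.save_weights("roberta_intent.h5")  # same .h5 format as the released checkpoints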
Research highlights
- We fine-tune 8 lightweight pre-trained LLMs to predict query intents in 5 public datasets.
- The models are suitable for personalised, ubiquitous, and privacy-preserving scenarios.
- The models compare favourably against ChatGPT; some of them (e.g. RoBERTa) even outperform it.
Resources
- Paper (PDF, 475KB)
- Training software to generate all the model files (.h5 file extension) we used in our paper, together with other relevant files such as model configuration, tokenizer, etc. Please read the comments in main.py for detailed instructions.
- Model checkpoints of RoBERTa Base (approx. 500MB each):
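For convenience, here is a minimal sketch of how such a checkpoint might be loaded for on-device intent prediction, assuming the released .h5 files are Keras weights for a TFRobertaForSequenceClassification model; the checkpoint path and number of intent labels are placeholders, so please refer to main.py for the exact configuration.

    # Minimal inference sketch (paths and label count are placeholders).
    import tensorflow as tf
    from transformers import RobertaTokenizerFast, TFRobertaForSequenceClassification

    NUM_INTENTS = 7  # placeholder: must match the dataset the checkpoint was trained on
    tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
    model = TFRobertaForSequenceClassification.from_pretrained(
        "roberta-base", num_labels=NUM_INTENTS
    )
    model.load_weights("path/to/roberta_checkpoint.h5")  # placeholder path

    enc = tokenizer("book a table for two tonight", return_tensors="tf")
    logits = model(**enc).logits
    intent_id = int(tf.argmax(logits, axis=-1)[0])
    print("Predicted intent id:", intent_id)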
Citation
- Mateusz Dubiel, Yasmine Barghouti, Kristina Kudryavtseva, Luis A. Leiva. On-device Query Intent Prediction with Lightweight LLMs to Support Ubiquitous Conversations. Scientific Reports, Vol. 14 (12731), 2024.
@Article{llmgui,
  author  = {Mateusz Dubiel and Yasmine Barghouti and Kristina Kudryavtseva and Luis A. Leiva},
  title   = {On-device Query Intent Prediction with Lightweight LLMs to Support Ubiquitous Conversations},
  journal = {Scientific Reports},
  volume  = {14},
  number  = {12731},
  year    = {2024},
}
Disclaimer
Our software is free for scientific use (licensed under the MIT license). The software must not be distributed without prior permission of the authors. Please contact us if you plan to use the software for commercial purposes. The authors are not responsible for any consequences arising from the use of this software.