School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China
ZHU Qian-ye (2000— ), female, M.S. candidate at Beijing University of Posts and Telecommunications. Research interests: natural language processing, intelligent dialogue systems.
E Hai-hong (1982— ), female, Ph.D., professor and doctoral supervisor at the School of Computer Science, Beijing University of Posts and Telecommunications; Beijing Young Talent; director of the BUPT Joint Laboratory for Key Technologies of Data Circulation. Research interests: big data and artificial intelligence, knowledge graphs and natural language processing, big data middle platforms.
Print publication date: 2023-09-15
ZHU Qian-ye, E Hai-hong. Research on Vertical Domain Dialogue Systems Based on Large Language Model[J]. New Generation of Information Technology, 2023, 6(17): 8-16. DOI: 10.3969/j.issn.2096-6091.2023.17.002.
Building task-oriented dialogue systems for vertical domains faces three obstacles: a lack of labeled data, poor generalization, and the inability to cold-start. Recently, large language models such as ChatGPT have driven many advances in natural language processing, yet few studies have explored the emergent capabilities of large language models for constructing task-oriented dialogue systems in vertical domains. To address these challenges, this paper proposes an approach to intent detection and slot filling based on large language models, with three components: (1) fine-tuning a large language model to improve intent detection and slot filling performance; (2) adopting multi-turn interaction with the model to enhance recognition; and (3) generating training data with the model for data augmentation. Combined, these methods provide an efficient solution for building vertical-domain dialogue systems, reducing reliance on manually labeled data and improving accuracy in few-shot and zero-shot settings.
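The prompt-based intent detection and slot filling described in the abstract can be sketched as a prompt-and-parse loop. The intent and slot label sets, prompt wording, and JSON answer format below are illustrative assumptions, not the paper's actual implementation, and the model reply is mocked rather than produced by a real LLM call:

```python
import json

def build_prompt(utterance, intents, slots):
    """Construct a prompt asking an LLM to jointly predict one intent
    and extract slot values from a user utterance."""
    return (
        "Classify the user utterance into one intent and extract slot values.\n"
        f"Intents: {', '.join(intents)}\n"
        f"Slots: {', '.join(slots)}\n"
        'Answer as JSON: {"intent": "...", "slots": {"slot": "value"}}\n'
        f"Utterance: {utterance}"
    )

def parse_response(text):
    """Parse the model's JSON answer into (intent, slots)."""
    data = json.loads(text)
    return data["intent"], data.get("slots", {})

# Example with a mocked model reply (no real LLM call):
prompt = build_prompt(
    "book a flight from Boston to Denver",
    intents=["BookFlight", "Weather"],
    slots=["origin", "destination"],
)
reply = '{"intent": "BookFlight", "slots": {"origin": "Boston", "destination": "Denver"}}'
intent, slots = parse_response(reply)
```

In a multi-turn variant, the parsed result of one turn (e.g., missing slots) would be fed back into the next prompt; the same prompt-and-parse machinery can also generate labeled utterances for data augmentation.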