LI Lin, ZHANG Cheng. Research on Voiceprint Recognition Method Based on SE-B-ResNet-50[J]. New Generation of Information Technology, 2023, 6(17): 01-07. DOI: 10.3969/j.issn.2096-6091.2023.17.001.
Research on Voiceprint Recognition Method Based on SE-B-ResNet-50
Aiming at the low recognition rate and complicated implementation of traditional voiceprint recognition methods, a voiceprint recognition method based on SE-B-ResNet-50 is proposed. Taking ResNet-50 as the backbone, the first layer of the model is first optimized according to the characteristics of voiceprint features, and global cross-scale connections are added between the first layer and the other layers; SE-Net is then integrated into the resulting model. The method establishes dependencies among the feature channels of the network, using global information to enhance useful features while suppressing useless ones, and extracts deep voiceprint features by combining B-ResNet with SE-Net. Experimental results show that the recognition accuracy of the SE-B-ResNet-50 method exceeds 97%, far higher than that of the baseline ResNet-50.
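The channel-dependency mechanism the abstract describes is the squeeze-and-excitation (SE) operation: global average pooling summarizes each channel, a small bottleneck network produces per-channel gates in (0, 1), and the feature map is rescaled channel-wise so that useful channels are emphasized and useless ones suppressed. The following is a minimal NumPy sketch of that operation only, not the paper's full SE-B-ResNet-50; the channel count, reduction ratio, and random weights are illustrative assumptions.

```python
import numpy as np

def se_block(x, w1, b1, w2, b2):
    """Squeeze-and-Excitation channel reweighting (illustrative sketch).

    x: feature map of shape (C, H, W).
    w1, b1 / w2, b2: weights of the bottleneck MLP (C -> C//r -> C);
    all parameters here are hypothetical, not taken from the paper.
    """
    # Squeeze: global average pooling over the spatial dims -> (C,)
    z = x.mean(axis=(1, 2))
    # Excitation: bottleneck with ReLU, then sigmoid gates in (0, 1)
    s = np.maximum(0.0, w1 @ z + b1)
    s = 1.0 / (1.0 + np.exp(-(w2 @ s + b2)))
    # Rescale: per-channel multiplication boosts or suppresses channels
    return x * s[:, None, None]

# Illustrative shapes: 8 channels, 4x4 spatial map, reduction ratio r = 2
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)); b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)); b2 = np.zeros(C)
y = se_block(x, w1, b1, w2, b2)
print(y.shape)  # output keeps the input shape: (8, 4, 4)
```

Because the gates lie strictly between 0 and 1, every channel is attenuated by its learned importance; in a trained network the gate weights would come from backpropagation rather than random initialization.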