TY - GEN
T1 - TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities
T2 - 61st Annual Meeting of the Association for Computational Linguistics, ACL-DEMO 2023
AU - Zhao, Zhe
AU - Li, Yudong
AU - Hou, Cheng
AU - Zhao, Jing
AU - Tian, Rong
AU - Liu, Weijie
AU - Chen, Yiren
AU - Sun, Ningyuan
AU - Liu, Haoyan
AU - Mao, Weiquan
AU - Guo, Han
AU - Guo, Weigang
AU - Wu, Taiqiang
AU - Zhu, Tao
AU - Shi, Wenhang
AU - Chen, Chen
AU - Huang, Shan
AU - Chen, Sihong
AU - Liu, Liqun
AU - Li, Feifei
AU - Chen, Xiaoshuai
AU - Sun, Xingwu
AU - Kang, Zhanhui
AU - Du, Xiaoyong
AU - Shen, Linlin
AU - Yan, Kimmo
N1 - Publisher Copyright:
© ACL-DEMO 2023. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Recently, the success of pre-training in the text domain has been fully extended to vision, audio, and cross-modal scenarios. The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework. In this paper, we present TencentPretrain, a toolkit supporting pre-training models of different modalities. The core feature of TencentPretrain is its modular design. The toolkit uniformly divides pre-training models into five components: embedding, encoder, target embedding, decoder, and target. As almost all common modules are provided in each component, users can choose the desired modules from different components to build a complete pre-training model. The modular design enables users to efficiently reproduce existing pre-training models or build brand-new ones. We test the toolkit on text, vision, and audio benchmarks and show that it can match the performance of the original implementations.
AB - Recently, the success of pre-training in the text domain has been fully extended to vision, audio, and cross-modal scenarios. The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework. In this paper, we present TencentPretrain, a toolkit supporting pre-training models of different modalities. The core feature of TencentPretrain is its modular design. The toolkit uniformly divides pre-training models into five components: embedding, encoder, target embedding, decoder, and target. As almost all common modules are provided in each component, users can choose the desired modules from different components to build a complete pre-training model. The modular design enables users to efficiently reproduce existing pre-training models or build brand-new ones. We test the toolkit on text, vision, and audio benchmarks and show that it can match the performance of the original implementations.
UR - http://www.scopus.com/inward/record.url?scp=85170847533&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85170847533
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 217
EP - 225
BT - System Demonstrations
PB - Association for Computational Linguistics (ACL)
Y2 - 10 July 2023 through 12 July 2023
ER -