Tencent Applies for Patent on 'Large Language Model Training Method' to Enhance Model Generalization and Accuracy

According to the Tianyancha app, Tencent Technology (Shenzhen) Co., Ltd. recently applied for a patent titled "Training Method, Device, Computer Equipment, and Storage Medium for Large Language Models." The patent abstract states that the method introduces a first and a second summary text during training, giving the large language model more information to learn from.

According to the patent description, the two summary texts carry different amounts of information, and the first summary text contains both correct and incorrect statements. By applying contrastive learning to these two summaries of the same text and distinguishing correct statements from incorrect ones, the method is said to avoid problems such as overfitting and inaccurate generation that can arise when training on a single summary text.
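The idea of contrasting correct and incorrect statements about the same text can be illustrated with a small sketch. The report does not disclose the patent's loss function, its data format, or how the second summary enters the objective, so everything below is an assumption for illustration: `Statement`, `toy_score`, and `contrastive_loss` are hypothetical names, the word-overlap scorer is only a stand-in for a real model, and treating the second summary as an extra positive example is a guess.

```python
import math
from dataclasses import dataclass


@dataclass
class Statement:
    text: str
    is_correct: bool  # hypothetical label: does the statement agree with the source?


def toy_score(source: str, statement: str) -> float:
    """Stand-in for a model score: fraction of statement words found in the source."""
    source_words = set(source.lower().split())
    words = statement.lower().split()
    return sum(w in source_words for w in words) / len(words)


def contrastive_loss(pos_scores, neg_scores):
    """Generic softmax-style contrastive loss (not the patent's formula).

    The loss is small when each positive score is higher than all negative scores.
    """
    total = 0.0
    for p in pos_scores:
        logits = [p, *neg_scores]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum - p  # equals -log softmax(p)
    return total / len(pos_scores)


source = "the patent describes a training method for large language models"

# First summary: mixes a correct and an incorrect statement (assumed format).
first_summary = [
    Statement("the patent describes a training method", True),
    Statement("the patent describes a new smartphone", False),
]
# Second summary: a shorter summary carrying less information (assumed format).
second_summary = "a training method for language models"

pos = [toy_score(source, s.text) for s in first_summary if s.is_correct]
pos.append(toy_score(source, second_summary))  # assumption: treated as a positive
neg = [toy_score(source, s.text) for s in first_summary if not s.is_correct]

loss = contrastive_loss(pos, neg)
print(f"contrastive loss: {loss:.4f}")
```

In a real training setup the scores would come from the language model itself, for example the likelihood it assigns to each statement given the source text, and the loss would be minimized by gradient descent. The sketch only shows the shape of the objective: correct statements are pushed to score higher than incorrect ones.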


According to the filing, the innovation lies in improving the model's generalization and accuracy. By introducing more varied summary content, the method aims to make the training of large language models more efficient and precise.

