Xiaohongshu Proposes an Innovative Framework: Fully Utilizing Negative Samples to Enhance Large Language Model Inference Capabilities

According to KPMG China's recent report, "The First 50 Health Tech Companies," China accounts for more than 70% of medical large models released worldwide. This figure reflects both China's rapid progress in intelligent healthcare and the broad adoption of large language models across the industry. The report notes that about 65% of the medical large models released to date are large language models; these models process and generate natural language, playing a significant supporting role in medical data analysis, patient communication, and scientific research.

The long-running copyright infringement lawsuit filed by The New York Times against OpenAI has seen significant progress. According to Ars Technica, the federal judge presiding over the case has authorized The New York Times and its co-plaintiffs, the New York Daily News and the Center for Investigative Reporting, to access OpenAI's user logs, including deleted content, in order to determine the precise scope of the alleged infringement. The New York Times argues that ChatGPT users may delete their chat history after using the service to bypass its paywall, making large-scale log collection necessary.

Large language models (LLMs) have made significant progress on complex reasoning tasks by combining task prompts with large-scale reinforcement learning (RL), as demonstrated by models such as Deepseek-R1-Zero, which applies RL directly to a base model and exhibits strong reasoning capabilities. However, this success has proven difficult to replicate across other base model families, especially the Llama series. This raises a core question: what factors cause different base models to perform so inconsistently under reinforcement learning?

In recent years, the field of artificial intelligence has undergone tremendous change, with large language models (LLMs) making remarkable progress on multi-modal tasks. These models have demonstrated strong potential in understanding and generating language, but most current multi-modal models still adopt auto-regressive (AR) architectures, whose strictly sequential, token-by-token generation makes the inference process rigid and inflexible. To address this limitation, a research team from The University of Hong Kong and Huawei Noah's Ark Lab has proposed a new model, FUDOKI, which aims to break these constraints. The core innovation of FUDOKI is its replacement of autoregressive generation with a discrete flow matching framework.