ZhiYuan Research Institute Releases Open-Source JudgeLM Evaluation Model to Assess Various Large Models and Provide Scores

The Beijing Academy of Artificial Intelligence (BAAI, also known as the ZhiYuan Research Institute) has open-sourced JudgeLM, a judge model that evaluates other large models efficiently and accurately. At just 1/120 of the cost of GPT-4, JudgeLM achieves over 90% consistency in evaluation results. It applies to a wide range of evaluation scenarios, including pure text and multimodal content, and outputs scores, judgments, and explanations for its decisions. Through new methods, JudgeLM's consistency with reference answers has exceeded 90%, approaching human performance. BAAI has also open-sourced a dataset of training and validation samples to support in-depth research on large language model judging. Going forward, the JudgeLM team plans to refine the model further so that it can evaluate large language models more accurately, efficiently, and comprehensively across more scenarios.
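As an illustration only, the kind of consistency figure quoted above can be read as the share of samples on which two judges reach the same verdict. The sketch below is not JudgeLM's code or its published evaluation protocol; the function names, the verdict labels, and the toy scores are all hypothetical.

```python
def verdict(score_1, score_2):
    """Turn a pair of scores for two answers into a verdict label (hypothetical labels)."""
    if score_1 > score_2:
        return "answer_1"
    if score_1 < score_2:
        return "answer_2"
    return "tie"


def agreement_rate(verdicts_a, verdicts_b):
    """Fraction of samples on which two judges give the same verdict."""
    if len(verdicts_a) != len(verdicts_b):
        raise ValueError("verdict lists must have the same length")
    if not verdicts_a:
        raise ValueError("need at least one verdict")
    matches = sum(a == b for a, b in zip(verdicts_a, verdicts_b))
    return matches / len(verdicts_a)


# Toy data (made up): each tuple holds the scores one judge gave to two answers.
judge_a_scores = [(8, 6), (5, 9), (7, 7), (3, 4)]
judge_b_scores = [(9, 5), (4, 8), (6, 7), (2, 6)]

judge_a = [verdict(s1, s2) for s1, s2 in judge_a_scores]
judge_b = [verdict(s1, s2) for s1, s2 in judge_b_scores]

rate = agreement_rate(judge_a, judge_b)
print(f"agreement: {rate:.0%}")
```

On this toy data the two judges disagree only on the third sample, where one sees a tie and the other prefers the second answer.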

ZhiYuan Research Institute Releases Emu2: A New Generation Generative Multimodal Foundation Model

ZhiYuan Research Institute has released Emu2, a new-generation multimodal foundation model that pushes the boundaries of multimodal in-context learning. Emu2 surpasses Flamingo-80B and IDEFICS-80B on few-shot multimodal understanding tasks and achieves the best performance on multiple few-shot understanding, visual question answering, and image generation tasks. Emu2-Chat accurately understands text-image instructions, while Emu2-Gen generates flexible, controllable, high-quality images.

ZhiYuan Research Institute Releases Uni3D, a 1-Billion-Parameter General 3D Vision Model

ZhiYuan Research Institute has recently open-sourced Uni3D, a 1-billion-parameter model designed for general 3D vision tasks. The model processes point cloud data, uses a unified Transformer architecture, and introduces a multimodal alignment training method. It has achieved state-of-the-art results across a range of mainstream 3D vision tasks. ZhiYuan Research Institute states that the open-source release of Uni3D will contribute to the future of 3D computing.

ZhiYuan Research Institute Releases Open-Source Bilingual Model Wudao・Tianying Aquila2-34B with 34 Billion Parameters

ZhiYuan Research Institute has unveiled Wudao・Tianying Aquila2-34B, a new open-source bilingual model with 34 billion parameters that excels in reasoning, generalization, and other capabilities. The institute has also released a comprehensive open-source toolkit to promote collaborative innovation in large model research. Aquila2-34B surpasses other open-source foundation models in overall capability, and the ZhiYuan team developed the NLPE method to improve the model's extrapolation capabilities.

ZhiYuan Releases the World's Largest Chinese-English Semantic Vector Model Training Dataset MTP

The ZhiYuan Research Institute has released MTP, the world's largest training dataset for Chinese-English semantic vector models, containing 300 million text pairs. MTP is the largest open-source dataset of related text pairs in Chinese and English and provides an important foundation for training semantic vector models. It draws on multiple sources and covers text types such as Q&A, comments, and news. The ZhiYuan Research Institute stated that this data plays a crucial role in training large models and will promote collaborative innovation in artificial intelligence. The release is expected to help address the shortage of training datasets for Chinese models.
