Reporters learned on the 8th from the Institute of Automation of the Chinese Academy of Sciences that researchers from the Institute and partner institutions have, for the first time, completed the full training and inference pipeline of the original brain-inspired spiking large model "Instant Insight" 1.0 on a domestic GPU computing platform. They have officially released the 7-billion-parameter version of the model and opened a test website for the 76-billion-parameter version. This is the world's first brain-inspired spiking large model to run fully on a domestic stack, marking an important breakthrough for China in the integration of brain-inspired computing and large models.

Current large models based on the Transformer architecture rely mainly on simple "point neurons" and massive computing power to scale up intelligence. Their training and inference costs, however, rise sharply with text length, severely limiting ultra-long-text processing. In this study, the research team drew on the working mechanisms of biological neurons to propose a brain-inspired spiking large-model architecture with linear complexity, grounded in endogenous complexity, and built "Instant Insight" 1.0 on it.

"This model not only reveals a new computing path in theory, but also establishes a training and inference framework adapted to domestic computing power, opening a new route toward large models that are more efficient, more complex, and more capable," said Li Guoqi, a researcher at the Institute of Automation of the Chinese Academy of Sciences.
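The contrast the article draws, attention cost growing sharply with text length versus a linear-complexity architecture, can be illustrated with a back-of-the-envelope FLOP count. This is a generic sketch: the formulas and the model dimension used here are illustrative assumptions, not details of "Instant Insight" 1.0.

```python
# Rough FLOP comparison: self-attention scales quadratically with
# sequence length n, while a linear-complexity layer (e.g. a recurrent
# or spiking formulation) scales linearly in n.
# The hidden dimension d is an illustrative assumption.

def attention_flops(n: int, d: int = 4096) -> int:
    # QK^T score matrix plus attention-weighted values: ~2 * n^2 * d
    return 2 * n * n * d

def linear_layer_flops(n: int, d: int = 4096) -> int:
    # Linear-complexity sequence layer: cost proportional to n * d^2
    return n * d * d

for n in (4_096, 65_536, 1_048_576):  # 4K, 64K, 1M tokens
    ratio = attention_flops(n) / linear_layer_flops(n)
    print(f"n={n:>9}: attention/linear cost ratio = {ratio:.1f}x")
```

Under these assumptions the ratio is 2n/d, so the gap widens from roughly even at 4K tokens to hundreds of times at million-token lengths, which is why linear complexity matters most for ultra-long sequences.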
Compared with traditional models, "Instant Insight" 1.0 has four core advantages. First, it trains efficiently on very small amounts of data, markedly improving long-sequence training efficiency. Second, its inference efficiency improves by orders of magnitude, with especially pronounced gains on ultra-long sequences. Third, it establishes a domestically independent and controllable brain-inspired large-model ecosystem, supporting efficient conversion of existing Transformer models into brain-inspired spiking architectures. Fourth, it introduces a multi-scale sparsity mechanism that provides strong support for running brain-inspired models at low power.

Li Guoqi said the achievement is not only a major breakthrough for China in brain-inspired spiking large-model architecture and full-pipeline construction on domestic computing power, but also provides more efficient modeling tools for ultra-long-sequence applications such as law, medicine, and scientific simulation, and will inspire next-generation neuromorphic computing theory and chip design. (China News Service)
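The event-driven sparsity behind low-power spiking models can be sketched with a minimal leaky integrate-and-fire (LIF) neuron: it emits a binary spike only when its membrane potential crosses a threshold, so most timesteps produce no output and hence little work downstream. This is a textbook illustration, not the neuron model or the multi-scale sparsity mechanism actually used in "Instant Insight" 1.0; the decay and threshold values are assumptions.

```python
# Minimal leaky integrate-and-fire (LIF) neuron. A spike (1) is emitted
# only when the membrane potential v crosses the threshold; otherwise
# the output is 0. Sparse spikes are what make event-driven,
# low-power execution possible. Parameters are illustrative.

def lif_run(inputs, decay=0.9, threshold=1.0):
    v = 0.0
    spikes = []
    for x in inputs:
        v = decay * v + x          # leaky integration of input current
        if v >= threshold:
            spikes.append(1)       # fire a spike...
            v = 0.0                # ...and reset the potential
        else:
            spikes.append(0)       # silent timestep: no downstream work
    return spikes

out = lif_run([0.3, 0.3, 0.6, 0.1, 0.0, 0.9, 0.4])
print(out, f"activity = {sum(out) / len(out):.0%}")
```

In this toy run only a fraction of timesteps fire, and in hardware a silent timestep can skip computation entirely, which is the basic intuition behind the power savings the article describes.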
Editor: Wang Shuying    Responsible editor: Li Jie
Source: Science and Technology Daily