1.05 trillion kilowatt-hours (kWh)!
This is the International Energy Agency's (IEA) recent projection, in its report "Electricity 2024", for the highest total electricity consumption of global data centers in 2026. One kilowatt-hour is one "unit" of electricity, so this means "more than a trillion units". According to the report's estimate, this is roughly the entire annual electricity consumption of Japan.
Computing infrastructure such as data centers and intelligent computing centers serves as the data hub and computing carrier of artificial intelligence (AI). With the rapid development of AI, especially generative AI and large-model technology, demand for computing power has surged, and AI's energy consumption has drawn growing attention. At several recent international conferences, technology leaders have voiced concerns about the energy consumption brought by AI's development.
How to solve the energy consumption problem while improving intelligence is a "big test" facing the AI industry.
AI's energy consumption in the inference stage cannot be ignored
Any discussion of AI's energy consumption inevitably turns to AI large language models (hereinafter "large models").
"Generative artificial intelligence is the focus of current AI technology development." Wang Peng, a senior expert of Tencent Research Institute, said in an interview with the reporter of Zhongqing Daily and Zhongqing.com.. He said that at present, the foundation of generative artificial intelligence technology is a large model marked by the stacking of data and computing power, and its training and application need a lot of computing power support. "Behind computing power is the huge power demand brought by the power consumption of computing infrastructure."
Zhang Yunquan, a member of the National Committee of the Chinese People's Political Consultative Conference and a researcher at the Institute of Computing Technology, Chinese Academy of Sciences, pointed out that the larger a model's parameter count and data scale, the better its intelligence. For large models, the "scaling laws" hold that when parameters and data are large enough, intelligent performance leaps upward, a phenomenon known as "intelligent emergence". "So far, we have not seen the upper limit of 'intelligent emergence'."
"Generally speaking, the larger the parameters, the greater the computational power consumption of the large model, and the more power it consumes." Wang Peng said that because the upper limit has not been reached, artificial intelligence companies represented by OpenAI, driven by "Scaling Laws", continue to increase the parameters and data scale of large models in order to achieve the goal of General Artificial Intelligence (AGI), resulting in a huge increase in computing power demand and power demand in the short term.
"Because GPT-3 has 175 billion parameters and 1024 NVIDIA A100 chips are used for training, it is called’ thousand calories and thousands of ginseng’ in the industry." Tian Feng, president of Shangtang Science and Technology Intelligent Industry Research Institute, said that at present, GPT-4, GPT-5 and other large models have all reached the scale of "Wanka Wancan", and the chips used in training models have also been updated from NVIDIA A100 to NVIDIA H100 and B200. "The surge in parameters will lead to a significant increase in energy consumption".
Beyond model training, AI's energy consumption in the inference stage cannot be ignored. "Inference is the process by which a large model responds to a user's request," Zhang Yunquan explained. The electricity consumed by a single response is small, "but as the user base grows, that consumption keeps accumulating."
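Zhang Yunquan's point about accumulation can be illustrated with a rough calculation. The per-query energy and usage figures below are illustrative assumptions for the sake of the sketch, not data from the article or any measurement:

```python
# Back-of-envelope sketch of how inference energy scales with users.
# All figures below are illustrative assumptions, not numbers from the article.

ENERGY_PER_QUERY_WH = 3.0       # assumed energy per model response, in watt-hours
QUERIES_PER_USER_PER_DAY = 10   # assumed average queries per user per day

def annual_inference_energy_kwh(users: int) -> float:
    """Total yearly inference energy, in kWh, for a given user base."""
    daily_wh = users * QUERIES_PER_USER_PER_DAY * ENERGY_PER_QUERY_WH
    return daily_wh * 365 / 1000  # convert Wh per day to kWh per year

# One user's annual footprint is small (about 11 kWh under these assumptions),
# but 100 million users push the total past a billion kWh.
print(annual_inference_energy_kwh(1))
print(annual_inference_energy_kwh(100_000_000))
```

Under these assumptions, a single user accounts for only about 11 kWh a year, yet at the scale of hundreds of millions of users the total exceeds a billion kWh, which is the accumulation effect described above.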
Recently, a claim from the United States that "if 100,000 NVIDIA H100 chips were deployed in the same area for model training, the power grid would collapse" drew public concern.
In interviews, several experts explained that AI can strain the grid because large-model training is a phased task whose computing power must be concentrated in a single data center. Training a large model within such a limited time and space places an enormous load on the local power grid.
"Sudden huge load disturbance in a stable power grid system will have an impact on the stability and security of the power grid." Zhang Yunquan pointed out that with the further increase of large model parameters and data scale, the energy consumption problem of AI will become more and more prominent, especially for countries and regions with tight power supply. "In the long run, the energy consumption of the AI ? ? reasoning process will become larger and larger; In the short term, the energy consumption of large model training is the largest increase in AI energy consumption. "
In Wang Peng's view, AI's power consumption is huge compared with household electricity use, but it remains a very small share of total social electricity consumption, "far from the order of magnitude of manufacturing's electricity use".
Solutions: technological innovation and new energy
The Uptime Institute, a US organization, predicts that by 2025 the share of AI-related business in global data center electricity consumption will rise from 2% to 10%, and that by 2030 the annual electricity consumed by intelligent computing will account for 5% of total global power generation.
"Solving the energy consumption problem is an important prerequisite for the development of AI technology." Tian Feng told reporters that although the current energy consumption of AI will not cause a large-scale "power shortage", with the large-scale application of AI, there may be a "power shortage" of AI in the future, and it is necessary to find a suitable solution so that the limited electric energy can accommodate a larger scale of computing power.
Through research and practice, understanding of AI's energy consumption has gradually deepened, and a series of solutions has followed. On the demand side, optimizing large-model architectures and improving chip and computing efficiency are considered effective ways to reduce AI's energy use.
Zhang Yunquan said that, first, dedicated chips can be designed for AI model training, with efficiency more than ten times that of the GPUs (graphics processing units) now commonly used for AI computing. Second, model parameters can be optimized: many small models with only a few billion parameters achieve results comparable to large models. In addition, by optimizing and compressing the inference process and designing dedicated inference chips, energy consumption in the inference stage can be reduced further.
"The big model becomes smaller, and the effect of reducing energy consumption is the best at present." Zhang Yunquan takes Phi-3, a self-developed small-size AI model released by Microsoft at the end of April, as an example. It is understood that there are currently three versions of Phi-3 model, among which Phi-3 mini is a language model with 3.8 billion parameters, which can be deployed on mobile phones. According to the experimental and test results, its performance can be comparable to that of GPT-3.5 and other large models.
On the supply side, diversified new energy and state-level macro planning will help solve AI's energy consumption problem. Guo Tao, an angel investor and veteran AI expert, told reporters that new energy, including renewables such as solar, wind and hydropower, is gradually becoming the preferred energy choice for data centers. "If there is not enough renewable energy to meet the growth of AI's energy consumption, dependence on fossil fuels may increase, with negative consequences for the environment. Data centers can also use intelligent algorithms to optimize energy efficiency and achieve coordinated development between AI and the power grid."
Many AI companies have already begun to turn to new energy. In 2021, OpenAI CEO Sam Altman invested 375 million US dollars in Helion Energy, a nuclear fusion startup; in March 2024, Amazon Web Services (AWS) acquired a data center campus in Pennsylvania that draws its electricity from a neighboring nuclear power plant.
"Solving the problem of AI energy consumption involves the coordination and cooperation of multiple systems such as computing power and electricity." Wang Peng pointed out that on the one hand, it is necessary to reduce energy consumption from AI itself, including optimizing algorithms, reducing model parameters, and improving computing performance; On the other hand, the entire energy system should also actively respond to the energy consumption demand of AI.
Integrated planning of "source, grid and storage"
New energy will be a "key" to solving AI's energy consumption problem, a view that aligns with China's "Eastern Data, Western Computing" project.
According to the National Energy Administration, China's newly installed renewable energy capacity in 2023 reached 305 million kilowatts, accounting for 82.7% of the country's newly installed power generation capacity and half of the world's new renewable capacity; national renewable power generation was nearly 3 trillion kWh, close to one-third of society-wide electricity consumption. China has built the world's largest power supply system and clean power generation system, and northwestern regions such as Qinghai, Inner Mongolia and Ningxia are "rich mines" of clean energy.
In 2021, China proposed the "Eastern Data, Western Computing" project to guide data centers toward resource-rich western regions, make local data centers low-carbon, green and sustainable, and meet the computing demand of the east. In February 2022, eight regions, including Inner Mongolia, Guizhou and Gansu, began building national computing hub nodes, 10 national data center clusters were written into the project's overall plan, and "Eastern Data, Western Computing" was fully launched.
"In the era of big model, the project of’ East Counting and West Computing’ will play an important macro-control role in the national power demand and computing power demand." Zhang Yunquan predicted that in the future, more and more large-scale computing centers or intelligent computing centers will be located in the western region of China, and "training from the east to the west" (that is, the AI model in the eastern region will be trained in the western region-reporter’s note) will become a typical scenario for the coordinated development of AI and new energy. However, he stressed that energy storage is a problem that needs to be solved to promote new energy to better empower the development of AI.
"The construction of large-scale energy storage determines whether new energy can better meet the demand for computing power." Tian Feng also agreed with Zhang Yunquan. Tian Feng pointed out that new energy sources, including photoelectricity and wind power, have the characteristics of intermittent power generation, so it is necessary to rely on energy storage system to store the generated electricity in time, so as to ensure the balance between supply and demand of the power grid.
According to the latest National Energy Administration data, by the end of the first quarter of 2024 the cumulative installed capacity of new-type energy storage projects completed and put into operation in China had reached 35.3 million kilowatts, up more than 210% year on year, with more than half of that capacity in storage stations of 100,000 kilowatts or above, showing a trend toward centralized, large-scale development.
On energy storage, Wang Peng emphasized the distributed storage capacity of new energy vehicles. "As battery charge-discharge cycles and service life keep improving, hundreds of millions of electric vehicles could use the peak-valley price difference to store energy and feed it back into the grid, essentially making car use cost-free or even profitable while also solving the grid's peak-shaving problem."
Wang Peng also believes the distributed linkage and fine-grained coordination of the "data network" and the "power network" should be rethought. To meet the rapidly growing demand for AI inference computing in the short term, he pointed out, it is necessary not only to realize "Eastern Data, Western Computing" by siting large computing centers in renewable-rich western regions, but also to deploy distributed renewable energy near data centers and computing centers on the eastern demand side, such as building-integrated photovoltaics (BIPV) on urban, rural and agricultural buildings, combined with integrated photovoltaics, storage, direct current and flexible loads. "Moreover, 'source, grid and storage' must be considered as a whole, using microgrids to balance local peaks and valleys as much as possible and reduce the curtailment of wind and solar power."
"This requires the cooperation of electricity price policy, infrastructure construction, policy support and user behavior." In Wang Peng’s view, the high coupling of the whole computing network, transmission network, distributed energy network and vehicle (charging) network may be the key to solve the future AI energy consumption problem in China.
"Considering the general ledger of input and output, AI actually further improves the production efficiency of society and reduces energy consumption." Tian Feng believes that AI, as a new quality productivity, is empowering economic and social development. Today’s AI big model has become an important basic scientific research facility, and the investment in its training will eventually bring dividends of new quality productivity to the whole society.
At present, energy costs account for more than half of the training cost of large AI models. Tian Feng said that from the perspective of basic research, investment in AI technology should continue to grow: "Now is the time to catch up; we should not tie our own hands and feet." On AI energy consumption specifically, he suggested supporting large-model training with dedicated energy policies.
China Youth Daily / Youth.cn trainee reporter Jia Yiye and reporter Zhu Caiyun. Source: China Youth Daily