PaddlePaddle · HydrogenSulfate · Mar 10, 2025 · Feb 28, 2025 · Mar 4, 2025 · Mar 4, 2025
diff --git a/docs/zh/examples/pangu_weather.md b/docs/zh/examples/pangu_weather.md
@@ -0,0 +1,112 @@
+# Pangu-Weather
+
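+=== "Model training command"
+
+    Not available yet.
+
+=== "Model evaluation command"
+
+    Not available yet.
+
+=== "Model export command"
+
+    Not available yet.
+
+=== "Model inference command"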
+
+    ``` sh
+    # Download sample input data
+    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/input_surface.npy -P ./data
+    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/input_upper.npy -P ./data
+
+    # Download pretrained model weights
+    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/pangu_weather_1.onnx -P ./inference
+    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/pangu_weather_3.onnx -P ./inference
+    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/pangu_weather_6.onnx -P ./inference
+    wget -nc https://paddle-org.bj.bcebos.com/paddlescience/models/Pangu/pangu_weather_24.onnx -P ./inference
+
+    # 1h interval-time model inference
+    python predict.py INFER.export_path=inference/pangu_weather_1
+    # 3h interval-time model inference
+    python predict.py INFER.export_path=inference/pangu_weather_3
+    # 6h interval-time model inference
+    python predict.py INFER.export_path=inference/pangu_weather_6
+    # 24h interval-time model inference
+    python predict.py INFER.export_path=inference/pangu_weather_24
+    ```
+
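+## 1. Background
+
+Pangu-Weather is the first AI-based method whose forecast accuracy exceeds that of traditional numerical weather prediction. It provides pretrained models with forecast intervals of 1, 3, 6, and 24 hours. Its data cover 13 pressure levels in the vertical, each with five meteorological variables (temperature, humidity, geopotential, and the wind speed components along the longitude and latitude directions), plus four surface variables (2 m temperature, the 10 m wind speed components along the longitude and latitude directions, and mean sea level pressure). For lead times from 1 hour to 7 days, its forecast accuracy is higher than that of the traditional numerical method (namely the operational IFS of the European Centre for Medium-Range Weather Forecasts).
+
+In addition, Pangu-Weather needs only 1.4 seconds on a single V100 GPU to produce a 24-hour global weather forecast, which is more than 10,000 times faster than traditional numerical prediction.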
+
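+## 2. Model principles
+
+This section only briefly introduces the principles of Pangu-Weather. For the detailed theory, see [Pangu-Weather: A 3D High-Resolution System for Fast and Accurate Global Weather Forecast](https://arxiv.org/pdf/2211.02556).
+
+The overall architecture of the model is shown below:
+
+<figure markdown>
+  ![result](https://paddle-org.bj.bcebos.com/paddlescience/docs/pangu-weather/model_architecture.png){ loading=lazy style="margin:0 auto;"}
+  <figcaption>Model architecture</figcaption>
+</figure>
+
+The main idea is to use a 3D variant of the vision transformer to process complex, non-uniform meteorological variables. Because meteorological data have a very high resolution, the researchers reduced the network's encoder and decoder to 2 levels (8 blocks), in contrast to common vision transformer methods, and adopted the shifted-window attention mechanism of Swin transformer to reduce the computational cost of the network.
+
+The model runs inference with pretrained weights. The inference process is described next.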
+
+## 3. Model construction
+
+In this example, `PanguWeatherPredictor` is implemented to run inference with the ONNX models:
+
+``` py linenums="67" title="examples/pangu_weather/predict.py"
+--8<--
+examples/pangu_weather/predict.py:67:97
+--8<--
+```
+
+``` yaml linenums="29" title="examples/pangu_weather/conf/pangu_weather.yaml"
+--8<--
+examples/pangu_weather/conf/pangu_weather.yaml:29:44
+--8<--
+```
+
+Here, `input_file` and `input_surface_file` are the upper-air and surface meteorological data, respectively, that are fed to the network.
+
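+## 4. Visualizing the results
+
+First convert the data from npy to NetCDF format, then visualize it with ncvue.
+
+1. Install the dependencies
+``` sh
+pip install cdsapi netCDF4 ncvue
+```
+
+2. Convert the data with the script
+``` sh
+python convert_data.py
+```
+
+3. Open the converted NetCDF files with ncvue. For details on ncvue, see the [official ncvue documentation](https://github.com/mcuntz/ncvue).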
+
+## 5. Complete code
+
+``` py linenums="1" title="examples/pangu_weather/predict.py"
+--8<--
+examples/pangu_weather/predict.py
+--8<--
+```
+
+## 6. Results
+
+The figure below shows the model's temperature prediction. Other variables can be inspected with ncvue.
+
+<figure markdown>
+  ![result](https://paddle-org.bj.bcebos.com/paddlescience/docs/pangu-weather/temperature.png){ loading=lazy style="margin:0 auto;"}
+  <figcaption>Temperature prediction</figcaption>
+</figure>
+
+## 7. References
+
+- [Pangu-Weather: A 3D High-Resolution System for Fast and Accurate Global Weather Forecast](https://arxiv.org/pdf/2211.02556)
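The four pretrained models can be chained to reach lead times that none of them covers in a single step. The paper describes a greedy strategy (hierarchical temporal aggregation) that always applies the longest interval that still fits, which keeps the number of iterations small. The sketch below only plans the sequence of models. `plan_forecast_steps` and `AVAILABLE_INTERVALS` are hypothetical names used for illustration and are not part of `predict.py`; the commands in the document run one model per invocation.

```python
from typing import List, Sequence

# Intervals (in hours) of the pretrained ONNX models listed in the document.
AVAILABLE_INTERVALS = (24, 6, 3, 1)


def plan_forecast_steps(
    lead_hours: int, intervals: Sequence[int] = AVAILABLE_INTERVALS
) -> List[int]:
    """Return the model intervals to apply, in order, to reach lead_hours.

    Greedy rule: always take the longest interval that does not overshoot.
    """
    if lead_hours <= 0:
        raise ValueError("lead_hours must be a positive integer")
    steps: List[int] = []
    remaining = lead_hours
    for interval in sorted(intervals, reverse=True):
        count, remaining = divmod(remaining, interval)
        steps.extend([interval] * count)
    if remaining:
        raise ValueError(f"cannot reach {lead_hours} h with intervals {intervals}")
    return steps


if __name__ == "__main__":
    # A 56-hour forecast: two 24 h steps, one 6 h step, two 1 h steps.
    print(plan_forecast_steps(56))
```

Each entry would correspond to one run of the matching `pangu_weather_<interval>.onnx` model, with the previous output fed back in as the next input.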
diff --git a/examples/pangu_weather/conf/pangu_weather.yaml b/examples/pangu_weather/conf/pangu_weather.yaml
@@ -0,0 +1,44 @@
+defaults:
+  - ppsci_default
+  - INFER: infer_default
+  - hydra/job/config/override_dirname/exclude_keys: exclude_keys_default
+  - _self_
+
+hydra:
+  run:
+    # fixed output directory for this example
+    dir: ./outputs_pangu_weather
+  job:
+    name: ${mode} # name of logfile
+    chdir: false # keep current working directory unchanged
+  callbacks:
+    init_callback:
+      _target_: ppsci.utils.callbacks.InitCallback
+  sweep:
+    # output directory for multirun
+    dir: ${hydra.run.dir}
+    subdir: ./
+
+# general settings
+mode: infer # running mode: infer
+seed: 2023
+output_dir: ${hydra:run.dir}
+log_freq: 20
+
+# inference settings
+INFER:
+  pretrained_model_path: null
+  export_path: inference/pangu_weather_24
+  onnx_path: ${INFER.export_path}.onnx
+  device: gpu
+  engine: onnx
+  precision: fp32
+  ir_optim: false
+  min_subgraph_size: 30
+  gpu_mem: 100
+  gpu_id: 0
+  max_batch_size: 1
+  num_cpu_threads: 10
+  batch_size: 1
+  input_file: './data/input_upper.npy'
+  input_surface_file: './data/input_surface.npy'
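Because `onnx_path` is interpolated from `INFER.export_path`, overriding `export_path` on the command line should be enough to switch between the four models. The helper below is a hypothetical sketch (`MODEL_DIR`, `export_path`, `export_override`, and `onnx_path` are illustrative names, not part of the example) that builds the override string and the file name the interpolation is expected to resolve to, assuming the weights were downloaded to `./inference` as in the documentation.

```python
import posixpath

MODEL_DIR = "inference"  # assumption: download directory used in the docs
SUPPORTED_INTERVALS = (1, 3, 6, 24)  # hours, one ONNX file per interval


def export_path(interval_hours: int) -> str:
    """Path prefix (without extension) of the model for one interval."""
    if interval_hours not in SUPPORTED_INTERVALS:
        raise ValueError(
            f"interval must be one of {SUPPORTED_INTERVALS}, got {interval_hours}"
        )
    return posixpath.join(MODEL_DIR, f"pangu_weather_{interval_hours}")


def export_override(interval_hours: int) -> str:
    """Hydra command-line override that selects the model."""
    return f"INFER.export_path={export_path(interval_hours)}"


def onnx_path(interval_hours: int) -> str:
    """Mirror of `onnx_path: ${INFER.export_path}.onnx` in the config."""
    return export_path(interval_hours) + ".onnx"


if __name__ == "__main__":
    for hours in SUPPORTED_INTERVALS:
        print(f"python predict.py {export_override(hours)}  # loads {onnx_path(hours)}")
```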
diff --git a/examples/pangu_weather/convert_data.py b/examples/pangu_weather/convert_data.py
@@ -0,0 +1,159 @@
+# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
+
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+
+#     http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# ref: https://github.com/HaxyMoly/Pangu-Weather-ReadyToGo/blob/main/forecast_decode_functions.py
+
+import os
+from os import path as osp
+from typing import Dict
+
+import hydra
+import netCDF4 as nc
+import numpy as np
+
+from ppsci.utils import logger
+
+
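+def convert_surface_data_to_nc(
+    surface_file: str, file_name: str, output_dir: str
+) -> None:
+    """Convert a surface-variable npy file to a NetCDF file in output_dir."""
+    # Channel order along axis 0:
+    # mean sea level pressure, 10 m u-wind, 10 m v-wind, 2 m temperature
+    surface_data = np.load(surface_file)
+    mean_sea_level_pressure = surface_data[0]
+    u_component_of_wind_10m = surface_data[1]
+    v_component_of_wind_10m = surface_data[2]
+    temperature_2m = surface_data[3]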
+
+    with nc.Dataset(
+        os.path.join(output_dir, file_name), "w", format="NETCDF4_CLASSIC"
+    ) as nc_file:
+        # Create dimensions
+        nc_file.createDimension("longitude", 1440)
+        nc_file.createDimension("latitude", 721)
+
+        # Create variables
+        nc_lon = nc_file.createVariable("longitude", np.float32, ("longitude",))
+        nc_lat = nc_file.createVariable("latitude", np.float32, ("latitude",))
+        nc_msl = nc_file.createVariable(
+            "mean_sea_level_pressure", np.float32, ("latitude", "longitude")
+        )
+        nc_u10 = nc_file.createVariable(
+            "u_component_of_wind_10m", np.float32, ("latitude", "longitude")
+        )
+        nc_v10 = nc_file.createVariable(
+            "v_component_of_wind_10m", np.float32, ("latitude", "longitude")
+        )
+        nc_t2m = nc_file.createVariable(
+            "temperature_2m", np.float32, ("latitude", "longitude")
+        )
+
+        # Set variable attributes
+        nc_lon.units = "degrees_east"
+        nc_lat.units = "degrees_north"
+        nc_msl.units = "Pa"
+        nc_u10.units = "m/s"
+        nc_v10.units = "m/s"
+        nc_t2m.units = "K"
+
+        # Write data to variables (0.25-degree grid, longitude starts at 0)
+        nc_lon[:] = np.linspace(0, 359.75, 1440)
+        nc_lat[:] = np.linspace(90, -90, 721)
+        nc_msl[:] = mean_sea_level_pressure
+        nc_u10[:] = u_component_of_wind_10m
+        nc_v10[:] = v_component_of_wind_10m
+        nc_t2m[:] = temperature_2m
+
+    logger.info(
+        f"Converted surface data file {surface_file} to NetCDF format and saved it to {output_dir}/{file_name}."
+    )
+
+
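+def convert_upper_data_to_nc(upper_file: str, file_name: str, output_dir: str) -> None:
+    """Convert an upper-air-variable npy file to a NetCDF file in output_dir."""
+    # Load the saved numpy array. Channel order along axis 0:
+    # geopotential, specific humidity, temperature, u-wind, v-wind
+    upper_data = np.load(upper_file)
+    geopotential = upper_data[0]
+    specific_humidity = upper_data[1]
+    temperature = upper_data[2]
+    u_component_of_wind = upper_data[3]
+    v_component_of_wind = upper_data[4]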
+
+    with nc.Dataset(
+        os.path.join(output_dir, file_name), "w", format="NETCDF4_CLASSIC"
+    ) as nc_file:
+        # Create dimensions
+        nc_file.createDimension("longitude", 1440)
+        nc_file.createDimension("latitude", 721)
+        nc_file.createDimension("level", 13)
+
+        # Create variables
+        nc_lon = nc_file.createVariable("longitude", np.float32, ("longitude",))
+        nc_lat = nc_file.createVariable("latitude", np.float32, ("latitude",))
+        nc_geopotential = nc_file.createVariable(
+            "geopotential", np.float32, ("level", "latitude", "longitude")
+        )
+        nc_specific_humidity = nc_file.createVariable(
+            "specific_humidity", np.float32, ("level", "latitude", "longitude")
+        )
+        nc_temperature = nc_file.createVariable(
+            "temperature", np.float32, ("level", "latitude", "longitude")
+        )
+        nc_u_component_of_wind = nc_file.createVariable(
+            "u_component_of_wind", np.float32, ("level", "latitude", "longitude")
+        )
+        nc_v_component_of_wind = nc_file.createVariable(
+            "v_component_of_wind", np.float32, ("level", "latitude", "longitude")
+        )
+
+        # Set variable attributes
+        nc_lon.units = "degrees_east"
+        nc_lat.units = "degrees_north"
+        nc_geopotential.units = "m**2 s**-2"
+        nc_specific_humidity.units = "kg/kg"
+        nc_temperature.units = "K"
+        nc_u_component_of_wind.units = "m/s"
+        nc_v_component_of_wind.units = "m/s"
+
+        # Write data to variables (0.25-degree grid, longitude starts at 0)
+        nc_lon[:] = np.linspace(0, 359.75, 1440)
+        nc_lat[:] = np.linspace(90, -90, 721)
+        nc_geopotential[:] = geopotential
+        nc_specific_humidity[:] = specific_humidity
+        nc_temperature[:] = temperature
+        nc_u_component_of_wind[:] = u_component_of_wind
+        nc_v_component_of_wind[:] = v_component_of_wind
+
+    logger.info(
+        f"Converted upper data file {upper_file} to NetCDF format and saved it to {output_dir}/{file_name}."
+    )
+
+
+def convert(cfg: Dict):
+    output_dir = cfg.output_dir
+
+    convert_surface_data_to_nc(
+        osp.join(output_dir, "output_surface.npy"), "output_surface.nc", output_dir
+    )
+    convert_upper_data_to_nc(
+        osp.join(output_dir, "output_upper.npy"), "output_upper.nc", output_dir
+    )
+
+
+@hydra.main(version_base=None, config_path="./conf", config_name="pangu_weather.yaml")
+def main(cfg: Dict):
+    if cfg.mode == "infer":
+        convert(cfg)
+    else:
+        raise ValueError(f"cfg.mode should be in ['infer'], but got '{cfg.mode}'")
+
+
+if __name__ == "__main__":
+    main()
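Both converters index the leading axis of the array by position, so the channel order is an implicit contract between the inference output and this script. One way to make that contract explicit is a name table. The sketch below is illustrative only: `SURFACE_VARS`, `UPPER_VARS`, and `split_channels` are hypothetical names, not part of `convert_data.py`. It works on any sequence indexed along its first axis, which includes a NumPy array.

```python
from typing import Any, Dict, Sequence

# Channel order taken from the indexing in convert_data.py.
SURFACE_VARS = (
    "mean_sea_level_pressure",
    "u_component_of_wind_10m",
    "v_component_of_wind_10m",
    "temperature_2m",
)
UPPER_VARS = (
    "geopotential",
    "specific_humidity",
    "temperature",
    "u_component_of_wind",
    "v_component_of_wind",
)


def split_channels(data: Sequence[Any], names: Sequence[str]) -> Dict[str, Any]:
    """Map each variable name to its slice along the first axis of data."""
    if len(data) != len(names):
        raise ValueError(
            f"expected {len(names)} channels along axis 0, got {len(data)}"
        )
    return {name: data[index] for index, name in enumerate(names)}


if __name__ == "__main__":
    # Tiny stand-in for a (4, lat, lon) surface array.
    fake_surface = [[[101325.0]], [[1.5]], [[-0.5]], [[288.0]]]
    fields = split_channels(fake_surface, SURFACE_VARS)
    print(fields["temperature_2m"])
```

The length check would also catch an upper-air file passed to the surface converter by mistake, which positional indexing alone would accept silently.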