Computational and Energy Use of CMIP6
Design of CPMIP (Computational Performance Model Intercomparison Project) for CMIP6: a set of metrics designed to assess model performance and the cost of running these simulations.
Result: estimated carbon footprint of ~1600 tonnes of CO2Eq
Total energy = Simulated years x JPSY (Joules per simulated year)
Carbon footprint = Total energy x CF x PUE
where:
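- CF = Greenhouse gas conversion factor (CO2Eq emitted per MWh, so total energy must first be converted from joules to MWh)
- PUE = Power usage effectiveness (ratio of total data center energy to the energy used by the computing equipment; accounts for overheads such as cooling)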
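The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and all input values are hypothetical, and the joule-to-MWh conversion is made explicit because CF is expressed per MWh.

```python
JOULES_PER_MWH = 3.6e9  # 1 MWh = 3.6e9 J


def total_energy_joules(simulated_years, jpsy):
    """Total energy = simulated years x JPSY (joules per simulated year)."""
    return simulated_years * jpsy


def carbon_footprint(total_energy_j, cf_per_mwh, pue):
    """Carbon footprint = total energy x CF x PUE.

    cf_per_mwh is the conversion factor in tonnes CO2Eq per MWh,
    so the energy is converted from joules to MWh first.
    """
    energy_mwh = total_energy_j / JOULES_PER_MWH
    return energy_mwh * cf_per_mwh * pue


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
energy = total_energy_joules(simulated_years=1000, jpsy=3.6e9)  # 1000 MWh
footprint = carbon_footprint(energy, cf_per_mwh=0.25, pue=1.2)  # about 300 t CO2Eq
print(f"{energy:.3e} J, {footprint:.1f} t CO2Eq")
```

A PUE of 1.0 would mean no data center overhead at all, so the footprint scales up linearly with PUE.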
Useful metrics:
- SYPD: simulated years per day (of wall-clock time)
- CHSY: core hours per simulated year
- Parallelization, complexity, and resolution: cluster- and experiment-dependent metrics (constant for a given configuration)
- Data output cost: greatly influenced by I/O configuration
- Data intensity: production efficiency of data (data generated per core hour, correlated with SYPD)
- Workflow and infrastructure cost: cost of running the simulation, highly infrastructure dependent (11% - 75% of total cost)
- Coupling cost: overhead of coupling the ESM component models
\( CC = \frac{T_M P_M - \sum_c T_C P_C}{T_M P_M} \)
where:
- \( T_M \) = total runtime for coupled model
- \( P_M \) = parallelization for coupled model
- \( T_C \) = total runtime for individual component
- \( P_C \) = parallelization for individual component
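A minimal sketch of how SYPD, CHSY, and the coupling cost could be computed from run logs. The function names, the choice of hours as the time unit, and the example numbers are assumptions made for illustration.

```python
def sypd(simulated_years, wallclock_hours):
    """Simulated years per day of wall-clock time."""
    return simulated_years / (wallclock_hours / 24.0)


def chsy(cores, wallclock_hours, simulated_years):
    """Core hours per simulated year."""
    return cores * wallclock_hours / simulated_years


def coupling_cost(t_m, p_m, components):
    """CC = (T_M * P_M - sum(T_C * P_C)) / (T_M * P_M).

    components is a list of (T_C, P_C) pairs, one per model component.
    """
    coupled = t_m * p_m
    standalone = sum(t_c * p_c for t_c, p_c in components)
    return (coupled - standalone) / coupled


# Hypothetical run: 10 simulated years in 48 hours on 1000 cores.
speed = sypd(simulated_years=10, wallclock_hours=48)             # 5.0
cost = chsy(cores=1000, wallclock_hours=48, simulated_years=10)  # 4800.0

# Hypothetical coupled model: 10 h on 1000 cores, with two components
# that take 9 h on 600 cores and 8 h on 300 cores.
cc = coupling_cost(t_m=10.0, p_m=1000, components=[(9.0, 600), (8.0, 300)])  # 0.22
print(speed, cost, cc)
```

With these definitions CHSY = 24 x cores / SYPD, which is why speed and cost are usually reported together. A coupling cost of 0.22 means 22% of the coupled model's core hours are not accounted for by the components themselves.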
Other metrics:
- Speed / cost / parallel: closely related to model speed and parallelizability
- Memory bloat = (Mem size - Parallel x File size) / Ideal mem size, typically ~10 - 100
- Useful simulated years: years that are actually used for analysis
- Data produced: data generated by the simulation
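The memory bloat formula in the list above, written as a small function. The argument names mirror the terms of the formula; the numbers are hypothetical and only need to share one unit (GB here).

```python
def memory_bloat(mem_size, parallel, file_size, ideal_mem_size):
    """Memory bloat = (Mem size - Parallel x File size) / Ideal mem size."""
    return (mem_size - parallel * file_size) / ideal_mem_size


# Hypothetical values in GB: 500 GB actually used across 1000 processes,
# 0.1 GB of file size per process, and an ideal memory size of 10 GB.
bloat = memory_bloat(mem_size=500.0, parallel=1000, file_size=0.1, ideal_mem_size=10.0)
print(f"memory bloat: {bloat:.1f}")
```

The result for these inputs is about 40, which falls inside the ~10 - 100 range quoted above.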
How to estimate missing numbers? Use mean(ESGF data) or mean(Total data)
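One possible reading of this note is simple mean imputation: a model that did not report a value gets the mean of the values that were reported. The sketch below assumes that reading; the function name, the model names, and the numbers are all hypothetical.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the reported entries."""
    known = [v for v in values.values() if v is not None]
    if not known:
        raise ValueError("no reported values to average")
    fallback = mean(known)
    return {name: (fallback if v is None else v) for name, v in values.items()}


# Hypothetical data volumes (TB) per model; model_c did not report.
reported = {"model_a": 100.0, "model_b": 300.0, "model_c": None}
filled = fill_missing_with_mean(reported)
print(filled)
```

The same function could be applied to either the ESGF data volumes or the total data volumes, depending on which mean is chosen as the estimator.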