最近要在一个新机器上跑CMAQ, 先对机器和模式做个校验, 以前都是跑出来自己写个ncl脚本比对的, 现在觉得用ioapi的m3diff也挺好, 不过还需要基于那个结果进行后续分析, 这里记录一下
m3diff
用法:
! m3diff.f
'USAGE: m3diff [INFILEA INFILEB [REPORTFILE]] [DEFAULT]'
这里INFILEA, INFILEB, REPORTFILE最好都是环境变量, 直接是文件名也行, 但是字符串长度只有16...
#!/bin/bash
# m3diff USAGE: m3diff [INFILEA INFILEB [REPORTFILE]] [DEFAULT]
EXEC=......Linux2_x86_64ifort/m3diff # can can need
export ANC=../output_GoogleDrive/CCTM_24august2017.ACONC.CMAQ52b_mpi_SE52BENCH.35.cmas_2011182
export BNC=../output_CCTM_v52_intel_SE52BENCH/CCTM_ACONC_v52_intel_SE52BENCH_20110701.nc
export RFILE=rst-m3diff.txt
$EXEC ANC BNC RFILE DEFAULT
输出:
m3diff的默认输出如下 (截取一个计算点, 即一小时的全部空间计算结果), 包含Max, Min, Mean, Sigma
...
Date and time 2011182:000000 (0:00:00 July 1, 2011)
A:ANC/NO2 vs B:BNC/NO2 vs (A - B)
MAX @( C, R, L) Min @( C, R, L) Mean Sigma
A 4.87336E-02@( 31,38, 1) 1.02405E-04@(100,13, 1) 2.13298E-03 3.05647E-03
B 4.83082E-02@( 31,38, 1) 1.02051E-04@(100,13, 1) 2.12775E-03 3.04345E-03
A:B 4.25424E-04@( 31,38, 1) -9.22559E-05@( 6, 8, 1) 5.22923E-06 1.63126E-05
...
结果分析
虽然直接看原始结果也不是不行, 但是看着比较累, 我就是想看看偏差大不大比如说, 所以我从现有的结果中以一个标准去筛选即可
考虑到Mean偏差和Sigma偏差以及max min偏差的意义不一样, 这里我以riskLevel去衡量偏离程度
比如, Mean偏差1%, riskLevel+4, Sigma偏差2%, riskLevel+2, MaxMin的location偏差10个网格坐标, riskLevel+1, 具体参数都可调, 这样每个点位都有一个riskLevel, 再以heatMap的形式去画出来即可方便的找到差异较大的点位
这里点位我是指某个变量的某个时间维, 也是m3diff输出的最小单位
参考脚本如下, 不过绘图部分极其简陋, 简单看看用的, 没仔细调整, 就不放图了...
# coding: utf-8
# v0.1
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb
import re
def main():
# >>>>>>>>>>>> prepare
Diff.criterias = {'mean' : 0.01, 'std' : 0.02, 'loc' : 10}
targetVars = None # ['SO2'] # None
m3diff_rst = "rst-m3diff.txt"
# >>>>>>>>>>>> get diff data
diffs = resolve_m3diff_rst(m3diff_rst)
# >>>>>>>>>>>> get datetime and varlist
datetimes = set()
varList = set()
for d in diffs:
d.calRisk()
datetimes.add(d.DateTime)
varList.add(d.var)
# randomly arranged if not sorted
datetimes = sorted(list(datetimes))
varList = sorted(list(varList)) if targetVars is None else targetVars
# >>>>>>>>>>>> get DataFrame
df = pd.DataFrame(index = varList, columns = datetimes, dtype = np.float32)
for d in diffs:
if d.var in varList:
df.loc[d.var, d.DateTime] = d.riskLevel
# >>>>>>>>>>>> draw
fig, ax = plt.subplots(figsize = (10, 60))
ax = sb.heatmap(df, vmax = 7, vmin = 0, cmap = 'binary')
fig.savefig('m3diff.hmp.png', dpi = 300, bbox_inches = 'tight')
class Diff: # contains info for each data-point
# criterias = {'mean' : 0.01, 'std' : 0.02, 'loc' : 10}
def __init__(self , lines6):
assert len(lines6) == 6, 'Error! class Diff init need 6-lines text!'
# >>>>>>>>>>>>>>>> resolve data
self.DateTime = re.search(r"\d{7}:\d{6}", lines6[0]).group()
self.var = re.search(r'/(.*) vs B', lines6[1]).group(1)
a, b, c, d, e, f = re.search(r'A +(.*?)@\((.*?)\) +(.*?)@\((.*?)\) +(.*) +(.*)$', lines6[3]).groups()
self.Max_A = float(a)
self.Min_A = float(c)
self.Mean_A = float(e)
self.Sigma_A = float(f)
self.MaxLoc_A = eval(b)
self.MinLoc_A = eval(d)
a, b, c, d, e, f = re.search(r'B +(.*?)@\((.*?)\) +(.*?)@\((.*?)\) +(.*) +(.*)$', lines6[4]).groups()
self.Max_B = float(a)
self.Min_B = float(c)
self.Mean_B = float(e)
self.Sigma_B = float(f)
self.MaxLoc_B = eval(b)
self.MinLoc_B = eval(d)
a, b, c, d, e, f = re.search(r'A:B +(.*?)@\((.*?)\) +(.*?)@\((.*?)\) +(.*) +(.*)$', lines6[5]).groups()
self.MeanDiff = float(e)
self.SigmaDiff = float(f)
self.MaxLocDiff = 0
self.MinLocDiff = 0
ndims = len(self.MaxLoc_A)
for i in range(ndims):
self.MaxLocDiff = self.MaxLocDiff + abs(self.MaxLoc_A[i] - self.MaxLoc_B[i])
self.MinLocDiff = self.MinLocDiff + abs(self.MinLoc_A[i] - self.MinLoc_B[i])
self.riskLevel = 0
def calRisk(self):
self.riskLevel = 0
if self.MeanDiff != 0 and max(abs(self.MeanDiff / self.Mean_A), abs(self.MeanDiff / self.Mean_B)) > Diff.criterias['mean']:
self.riskLevel += 4
if self.SigmaDiff != 0 and max(abs(self.SigmaDiff / self.Sigma_A), abs(self.SigmaDiff / self.Sigma_B)) > Diff.criterias['std']:
self.riskLevel += 2
if max(self.MaxLocDiff, self.MinLocDiff) > Diff.criterias['loc']:
self.riskLevel += 1
return
def resolve_m3diff_rst(infile): # get all Diff from input file
m3diff_rst = infile
rstLines = open(m3diff_rst, "r").read().splitlines()
i = 0
diffs = []
while True:
if rstLines[i].startswith(' Date and time'):
diffs.append(Diff(rstLines[i : i+6]))
i = i + 6
if i == len(rstLines):
break
i += 1
return diffs
if __name__ == '__main__':
main()