此处保存一些我写过/用过/见过的以后可能会用到的脚本合集。
- 之前想在神经网络里实现Early Stopping,花了一段时间install pytorchtools因为可以import里面的EarlyStopping,但是好像新版本没有了,所以只能手动贴源码:
class EarlyStopping:
"""Early stops the training if validation loss doesn't improve after a given patience."""
def __init__(self, patience=7, verbose=False, delta=0):
"""
Args:
patience (int): How long to wait after last time validation loss improved.
上次验证集损失值改善后等待几个epoch
Default: 7
verbose (bool): If True, prints a message for each validation loss improvement.
如果是True,为每个验证集损失值改善打印一条信息
Default: False
delta (float): Minimum change in the monitored quantity to qualify as an improvement.
监测数量的最小变化,以符合改进的要求
Default: 0
"""
self.patience = patience
self.verbose = verbose
self.counter = 0
self.best_score = None
self.early_stop = False
self.val_loss_min = np.Inf
self.delta = delta
def __call__(self, val_loss, model):
score = -val_loss
if self.best_score is None:
self.best_score = score
self.save_checkpoint(val_loss, model)
elif score < self.best_score + self.delta:
self.counter += 1
print(f'EarlyStopping counter: {self.counter} out of {self.patience}')
if self.counter >= self.patience:
self.early_stop = True
else:
self.best_score = score
self.save_checkpoint(val_loss, model)
self.counter = 0
def save_checkpoint(self, val_loss, model):
'''
Saves model when validation loss decrease.
验证损失减少时保存模型。
'''
if self.verbose:
print(f'Validation loss decreased ({self.val_loss_min:.6f} --> {val_loss:.6f}). Saving model ...')
# torch.save(model.state_dict(), 'checkpoint.pt') # 这里会存储迄今最优模型的参数
torch.save(model, 'finish_model.pkl') # 这里会存储迄今最优的模型
self.val_loss_min = val_loss
early_stopping = EarlyStopping(patience=20, verbose=True)
for e in range(1, epochs+1):
# train
...
# validation
...
early_stopping(val_avr, model)
if early_stopping.early_stop:
print('Early stopping')
break
model.load_state_dict(torch.load('finish_model.pkl'))
-
在IRIS等网站下载了数据后,得到的是服务器上的文件列表链接,没有批量下载的途径。
所以先保存了网页为.html文件,其中包含每个文件的下载链接;接下来以"为分隔符提取出网址。
cat *.html | awk -F'"' '{print $8}' > list
将不要的几行删掉。
for i in $(ls list); do (wget -c --continue -t 0 ${i});done
或
wget -i list
或直接下载总网址上所有可下载的链接:
wget -r --level=0 -E --ignore-length -x -k -p -erobots=off -np -N http://www.remote.com/remote/presentation/dir
备注1:下载的地址默认在当前目录,所以下载大量文件时最好先提前新建一个文件夹,进入文件夹再下载。
备注2:以上三种方式我没有具体去评估比较,如果文件列表中有子文件夹,可能会有所区别。但第三种经检验可以全部下载下来。
- 上一点下载下来之后发现有些链接存储的不是文件,而是有文本的网页,这样用wget下载下来只会是html文件;我想要把文本存储为文件下下来。
for i in $(cat ../list); do (fn=`echo ${i} | awk -F'/' 'print $11}'`; curl ${i} >${fn}); done
curl是专门用来下载服务器文件的命令,链接无论存储的是文件还是文本网页都可以下载存为文本文件。
备注:list里存要下的链接,必须是下载文件的具体链接;如果是总文件夹链接,只会下载总网页文本信息。
-
辐射花样和震源球的计算和绘制
辐射花样:已知断层参数(strike, dip, rake)
- 注意:下半球面离源角为0-90°,最后写入dat中要转成纬度(离源角-90°)。dat文件为[方位角(0-360°),纬度(-90°-0°),辐射花样值]
再通过GMT中伍尔夫投影,即可得到震源球。
#!/bin/bash
R=0/360/-90/0
J=S0/-90/15c
B=a30f10
radpat_file=Radpat_P
gmt set MAP_FRAME_TYPE=plain
gmt set FORMAT_GEO_MAP=+D
gmt begin my_radpat_P png
gmt xyz2grd ${radpat_file}.dat -G${radpat_file}.nc -I1/1 -R$R
gmt grd2cpt ${radpat_file}.nc -Cpolar -E100 -H > ${radpat_file}.cpt
#gmt plot -R$R -J$J -T
gmt grdimage ${radpat_file}.nc -R$R -J$J -C${radpat_file}.cpt -B$B -BN+t"Radiation Pattern for P Wave"
gmt grdcontour ${radpat_file}.nc -R$R -J$J -L-0.001/0.001 -C1 -W2p
#gmt plot -R$R -J$J -T
gmt end show
rm gmt.conf gmt.history ${radpat_file}.cpt ${radpat_file}.nc
- FFT(频谱)和PSD(功率谱密度):
频谱:常用于能量信号(无穷长时间内能量有限)和周期信号,单位是被测物理量的单位。
功率谱:常用于随机信号(和周期信号同属功率信号,无穷长时间内功率有限,功率=能量/时间),单位是被测物理量单位^2/Hz,反映某一频率的能量密度。
https://zhuanlan.zhihu.com/p/417454806 (原理参考)
https://zhuanlan.zhihu.com/p/334501240 (脚本)
- 分布与边缘分布一并绘制:
样图
- 傅里叶变换与时频谱分析
Fourier Transform:
from scipy.fftpack import fft,ifft
#dt = tr.stats.npts/tr.stats.sampling_rate # 数据时间长度
df = tr.stats.sampling_rate
def FFT (Fs,data):
data = data - np.mean(data)
L = len(data) # 信号长度
N = int(np.power(2,np.ceil(np.log2(L)))) # 下一个最近二次幂
FFT_y = (abs(fft(data,N)))/L*2 # N点FFT,但除以实际信号长度 L
Fre = np.arange(int(N/2))*Fs/N # 频率坐标
FFT_y = FFT_y[range(int(N/2))] # 取一半
return Fre, FFT_y
#Fre, FFT_y1 = FFT(df, count)
Fre, FFT_y1 = FFT(df, tr.data)
plt.figure(figsize=(8,2),dpi=100)
plt.plot(Fre,FFT_y1)
#plt.plot(1/Fre/3600/24,FFT_y1)
plt.grid()
#plt.xlabel('Period (day)')
plt.xlabel('Frequency (Hz)')
plt.ylabel('Power')
plt.xscale('log')
#plt.xlim(0,0.00002)
plt.show()
Spectrogram:
from scipy.signal import stft
df = tr.stats.sampling_rate
f, t, zxx = stft(tr.data, fs=df, nperseg=512, noverlap=512*0.9, nfft=512)
plt.figure(figsize=(10,10))
plt.imshow(np.abs(zxx), aspect='auto',
extent = (min(t), max(t), max(f), min(f))
)
plt.gca().invert_yaxis()
#cb = plt.colorbar()
#cb.ax.tick_params(labelsize = 20)
plt.xticks(size = 20)
plt.yticks(size = 20)
plt.title('STFT Magnitude', size = 30)
plt.xlabel('Time [sec]', size = 30)
plt.ylabel('Frequency [Hz]', size = 30)
- 在python里使用shell命令:
import subprocess
s = "taup time -mod prem -h 12.67 -deg 1.131 -ph P,p -time"
proc = subprocess.Popen(s, stdout=subprocess.PIPE, shell=True)
res = proc.communicate()[0]
res.decode('gbk')
- self-attention里加position embedding的方法:
(1)绝对位置
def position_matrix(start=0, stop=2, T=2, n=4):
"""
Create a positional encoding matrix.
Args:
start: lowest value in the positional encoding vector
stop: stop is the highest value in the positional encoding vector
T: number of values in one cycle of the lowest bit in the PE vector
n: length of the positional encoding vector
Return: a matrix of shape (2^(n-1)*T, n)
"""
step = (stop - start) / T
one_cycle = np.arange(start, stop, step)
pos_matrix = np.zeros((np.power(2, (n-1))*T, n))
for j in range(n):
block = np.repeat(one_cycle, np.power(2, j))
pos_matrix[:,n-j-1] = np.tile(block, np.power(2, (n-1-j)))
return pos_matrix
forward中:
ppmm = self.position_mat[:x.size(1)].repeat(x.size(0), 1, 1)
ppmm = ppmm.cuda()
ppmm = ppmm.to(torch.float32)
# print(ppmm)
#x = x + ppmm
主函数中:
num = 90; n = 12 # num———self-attention的seq_len; n———self-attention的input_dim
T = max(2, int(num/2**(n-1))+1)
pm = torch.from_numpy(position_matrix(0,2,T,n)) # (*,n)
pm.to(device)
model=Self_attention(*,*,pm)
(2)正弦余弦
class PositionalEncoding(nn.Module):
"""Positional encoding."""
def __init__(self, num_hiddens, dropout, max_len=1000):
super(PositionalEncoding, self).__init__()
self.dropout = nn.Dropout(dropout)
# Create a long enough `P`
self.P = torch.zeros((1, max_len, num_hiddens))
X = torch.arange(max_len, dtype=torch.float32).reshape(
-1, 1) / torch.pow(10000, torch.arange(
0, num_hiddens, 2, dtype=torch.float32) / num_hiddens)
self.P[:, :, 0::2] = torch.sin(X)
self.P[:, :, 1::2] = torch.cos(X)
def forward(self, X):
X = X + self.P[:, :X.shape[1], :].to(X.device)
return self.dropout(X)
主函数中:
encoding_dim, num_steps = 32, 90
pos_encoding = PositionalEncoding(encoding_dim, 0)
pos_encoding.eval()
X = pos_encoding(torch.zeros((1, num_steps, encoding_dim)))
P = pos_encoding.P[:, :X.shape[1], :]
- 绘制子图时加colorbar不占据子图空间(即单独给colorbar加一个轴)
fig, ax = plt.subplots(2, 1, figsize=(3,3), dpi=300,
gridspec_kw={'height_ratios': [1, 5]},
sharex=True)
ax1 = ax[0]
ax2 = ax[1]
ax1.plot(np.arange(len(tr.data))/df, tr.data, 'k', lw=0.4)
ax1.set_xlim(0, len(tr.data)/df)
ax1.set_ylim(-0.08,0.08)
sc = ax2.pcolormesh(t1, f1[1:], amp_zxx[1:], cmap='OrRd', vmin=0, vmax=0.1)
#plt.pcolormesh(new_t, f_log, amp_zxx)
#plt.yscale('log')
#plt.gca().invert_yaxis()
cax = fig.add_axes([ax2.get_position().x1+0.02,ax2.get_position().y0,0.03,ax2.get_position().height])
plt.colorbar(sc, cax=cax)
#cax.tick_params(labelsize = 10)
plt.xticks(size = 10)
plt.yticks(size = 10)
#plt.title('STFT Magnitude', size = 30)
ax2.set_xlabel('Time [sec]', size = 10)
ax2.set_ylabel('Frequency [Hz]', size = 10)
plt.subplots_adjust(hspace=0)
plt.show()
结果示例:
- plot 3d elevation for a region by GMT
Reference: GMT Chinese community
https://docs.gmt-china.org/latest/examples/ex029/#gmtplot-116c7b68ca8c491adade37c19a9a9b62
I plot the topography and add the locations of the array on it:
R=76.1167/76.6667/-69.6167/-69.35/-100/1000
gmt begin elevation_3d pdf
#gmt grdpaste REMA1.nc REMA2.nc -GREMA.nc
gmt grdtrack -R76.1167/76.6667/-69.6167/-69.35 -GREMA.nc staloc.txt > staloc.xyz # generate the corresponding elevations of the stations by track locations on topography file REMA.nc
gmt makecpt -Cgray -T0/900
gmt grdview REMA.nc -R$R -JM10c -JZ3c -N-100+ggray -C -Qi300 -I -Ba -BwsENZ -p30/25
cat staloc.xyz | awk '{print $1,$2,$5}' | gmt plot3d -R$R -JM10c -JZ3c -St0.05c -W1p,white -Gwhite -p
gmt colorbar -C -Ba -DJTC+o0/1c -p
gmt basemap -TdjRB+w3c+l+o0.5c/0.5c -p30/25/700 # plot the North arrow
gmt end show
rm staloc.xyz
- pytorch: If I want to manually overwrite the parameters of several layers of a model:
old_model = torch.load('OLD.pt')
new_model = MODEL()
new = new_model.state_dict()
for param_tensor in new:
new[param_tensor] = old_model.state_dict()[param_tensor]
new_model.load_state_dict(new) # necessary! important!