linux command

Last updated: 2022/9/21 10:04

# vim

· Indent lines 10-30 by 4 spaces: :10,30s/^/    /

# awk

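· Print the rows of a tab-separated file whose 1st column equals a given value: awk -F'\t' '{if($1==123) print $0}' data.txt

· Match the 1st column of a data file against the entries of another file and print the 8th column: awk -F '\t' 'BEGIN{while((getline < "userid.txt") > 0) a[$1]=1;} {if(a[$1]==1) print $8;}' data.txt

(This builds a temporary dictionary whose keys are the 1st column of userid.txt, each with value 1. The 1st column of each data.txt row is then used as a key to check whether its value is 1.)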
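For comparison, the same lookup can be sketched in Python. This is only an illustration, not part of the original notes: the function name filter_by_ids and its parameters are hypothetical, and it takes lists of lines instead of file names so that it can be tried without any files on disk.

```python
def filter_by_ids(id_lines, data_lines, key_col=0, out_col=7, sep="\t"):
    """Return column `out_col` of every data row whose `key_col` is a known id.

    Column indices are 0-based, so the defaults correspond to $1 and $8 in awk.
    """
    # Equivalent of the BEGIN block: collect the first column of the id file.
    wanted = {
        line.rstrip("\n").split(sep)[key_col]
        for line in id_lines
        if line.strip()
    }
    selected = []
    for line in data_lines:
        fields = line.rstrip("\n").split(sep)
        # Unlike awk, rows that are too short are skipped instead of
        # producing an empty output line.
        if len(fields) > out_col and fields[key_col] in wanted:
            selected.append(fields[out_col])
    return selected


ids = ["u1", "u3"]
rows = [
    "u1\ta\tb\tc\td\te\tf\tH1",
    "u2\ta\tb\tc\td\te\tf\tH2",
    "u3\ta\tb\tc\td\te\tf\tH3",
]
matched = filter_by_ids(ids, rows)
print(matched)
```

With real files, the two arguments could be open file objects, since iterating over a file yields its lines.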

# ssh

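· ssh config file: vim ~/.ssh/config

· Passwordless ssh:

Generate a local key pair: ssh-keygen -t rsa

Copy the public key to the new server: ssh-copy-id name (name is the short alias you gave this server's hostname in config)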

Reference: 使用Linux，从正确配置ssh开始 - 知乎

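· Use sshfs to mount a server folder onto a local Linux folder, so that images and other files on the server can be viewed without scp:

sudo chmod 777 localpath

sshfs username@ip:path localpath

To unmount: fusermount -u localpath (or sudo umount localpath). Note that -o nonempty is an sshfs mount option for mounting over a non-empty local folder, not an unmount option.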

Reference: SSHFS使用指南_eatlemon的博客-CSDN博客_sshfs

· Resumable file transfer over ssh (rsync instead of scp):

rsync + ssh 替代scp 传送数据和断点续传_胡二妞的博客-CSDN博客_ssh断点续传

# conda / pip

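· If installing a package fails with a permission error, run chmod -R 777 on the anaconda path. -R is required so that the permissions of all files in the subdirectories are updated as well

· If pip fails with a SOCKS-related error, it is a proxy problem: run unset ALL_PROXY && unset all_proxy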

Pip install 报错 Failed to establish a new connection: [Errno 111] Connection refused_HUGOkungggg的博客-CSDN博客

# CPU

· List idle compute nodes: pbsnodes -l free

· Switch to a compute node: ssh cu01/cu02/...

· Change password: passwd

# GPU

· Check the CUDA version: nvcc -V

· If installing a Python package with pip prints *Defaulting to user installation because normal site-packages is not writeable*, change pip install xxx to python -m pip install xxx. If WARNING: Retrying appears during Collecting, try a different package index. To switch the index for a single command: python3 -m pip install xxx -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com

· When running ML on a GPU, the following warning may appear:

NVIDIA A100-PCIE-40GB with CUDA capability sm_80 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70. If you want to use the NVIDIA A100-PCIE-40GB GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

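This means the installed PyTorch build does not match the CUDA version: the build was not compiled for sm_80. Usually PyTorch has to be rolled back one or two versions, to a build made for the matching CUDA version, before it runs properly. For example:

pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html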

# compress and extract

· tar -czvf test.tar.gz test

· tar -xzvf test.tar.gz

# Linux terminal command

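· Count all entries (files and folders) in the current directory: ls | wc -l

· Count files: ls -l | grep "^-" | wc -l

· Count folders: ls -l | grep "^d" | wc -l

· Find empty folders under a path: find $path -type d -empty

· Delete empty folders: find $path -type d -empty | xargs rm -r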

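· On “Argument list too long”: rewrite the command so that files are found and processed as they are found, instead of expanding a wildcard on the command line

e.g. change rm -rf temp* to find . -name 'temp*' | xargs rm -rf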
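The same "find one, handle one" idea can be sketched in Python. This is a minimal sketch under assumptions: remove_matching is a hypothetical helper, and unlike rm -rf it only deletes regular files, not directories.

```python
import fnmatch
import os


def remove_matching(directory, pattern):
    """Delete files under `directory` (recursively) whose names match `pattern`.

    Each file is removed as it is found, so no long argument list is built.
    Returns the number of files removed.
    """
    removed = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                os.remove(os.path.join(root, name))
                removed += 1
    return removed
```

For example, remove_matching(".", "temp*") would play the role of the find command above for plain files.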

· View running (background) processes of a user: ps -aux | grep husir

· Disk space and usage: df -h

· Total space used by a directory: du -sh .

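· Give a long path a short name:

A. Add alias myFold="cd /storage/lab/husir/" to ~/.bashrc

Typing myFold in the terminal then enters that directory

B. Add the following two lines to ~/.bashrc:

shopt -s cdable_vars

export myFold=/storage/lab/husir/

cd myFold then enters that directory

(Takes effect after restarting the terminal or running source ~/.bashrc)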

· Create a conda environment from a .yml file: conda env create -f XXX.yml

· Extract the final file name from an absolute path: long_path=/home/husir/Desktop/xxxx.dat

file=`echo $long_path | awk -F "/" '{print $NF}'`

Reference: Linux shell中提取文件名和路径 - 鲁娜的博客 | Luna's Blog
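In a Python script the same split can be done with the standard library. A short sketch using the example path from above (the variable names are only illustrative):

```python
import os.path
from pathlib import PurePosixPath

long_path = "/home/husir/Desktop/xxxx.dat"

file_name = os.path.basename(long_path)   # last component of the path
directory = os.path.dirname(long_path)    # everything before the last "/"
stem = PurePosixPath(long_path).stem      # file name without its extension

print(file_name, directory, stem)
```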

· Batch-rename files under a path:

Remove a substring such as filtered from the file names: for i in ./*.filtered*; do mv "${i}" "${i%.filtered*}.jpg"; done
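A Python version of this rename, as a rough sketch: strip_marker and its parameters are hypothetical names, and the default values simply mirror the shell loop (drop everything from the last ".filtered" onward, then append ".jpg").

```python
from pathlib import Path


def strip_marker(directory, marker=".filtered", new_suffix=".jpg"):
    """Rename files in `directory` whose names contain `marker`.

    Everything from the last occurrence of `marker` onward is dropped and
    `new_suffix` is appended. Returns the new names, sorted.
    Note: an existing file with the new name may be overwritten.
    """
    renamed = []
    # sorted() materialises the listing before any file is renamed.
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and marker in path.name:
            new_name = path.name[: path.name.rfind(marker)] + new_suffix
            path.rename(path.with_name(new_name))
            renamed.append(new_name)
    return sorted(renamed)
```

Calling strip_marker(".") would act on the current directory, like the loop above.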

· scp only the files/folders named in a listing file (here a file called ls) to the server:

awk '{printf "scp -r PATH/%s cpu:PATH\n",$1}' ls | sh

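· After switching to dual monitors, the deepin-wine WeChat stopped working for some reason (clicking the icon no longer opened it), so I tried installing WeChat with docker instead (Reference: docker安装微信)

When logging in to WeChat on the laptop kicks out the session here, either leave the login window open on the desktop, or restart the container with sudo docker stop wechat; sudo docker start wechat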

# python

· Regular expressions: find every substring enclosed in []: re.findall(r"\[.*?\]", text)
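A small example of how this pattern behaves; the sample string is made up for illustration. The non-greedy .*? stops at the first closing bracket, and a capture group returns only the contents.

```python
import re

text = "x[ab] y[] z[c d]"

with_brackets = re.findall(r"\[.*?\]", text)    # non-greedy: one match per [...]
contents_only = re.findall(r"\[(.*?)\]", text)  # capture group drops the brackets
greedy = re.findall(r"\[.*\]", text)            # greedy: first "[" to last "]"

print(with_brackets)   # ['[ab]', '[]', '[c d]']
print(contents_only)   # ['ab', '', 'c d']
print(greedy)          # ['[ab] y[] z[c d]']
```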

· Sort an array by its first column, ascending: data = data[np.argsort(data[:,0])]

Descending: data = data[np.argsort(data[:,0])[::-1]]

Sort a list by its i-th column, ascending: ls = sorted(ls, key=lambda x: x[i])

Descending: ls = sorted(ls, key=lambda x: x[i], reverse=True)
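A quick check of the list version with made-up rows. operator.itemgetter is shown as an alternative to the lambda. The sort is stable, so rows with equal keys keep their original relative order, also with reverse=True.

```python
from operator import itemgetter

rows = [[3, "c"], [1, "a"], [2, "b"], [1, "z"]]
i = 0  # column to sort by

ascending = sorted(rows, key=lambda x: x[i])
descending = sorted(rows, key=itemgetter(i), reverse=True)

print(ascending)    # [[1, 'a'], [1, 'z'], [2, 'b'], [3, 'c']]
print(descending)   # [[3, 'c'], [2, 'b'], [1, 'a'], [1, 'z']]
```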

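· Logarithmic x axis in matplotlib: ax.set_xscale('log')

· Smooth data points: data_smooth = savgol_filter(data, 11, 3) (from scipy.signal import savgol_filter)

The second argument is the smoothing window length (must be a positive odd integer) and the third is the order of the fitted polynomial, which must be smaller than the window length

For details see: python 数据、曲线平滑处理——方法总结(Savitzky-Golay 滤波器、make_interp_spline插值法和convolve滑动平均滤波)_Yale曼陀罗的博客-CSDN博客_make_interp_spline

· Read a data file into a pandas DataFrame: pd.read_csv(filename, skiprows=?, sep=r'\s+', usecols=[0,1,???], names=['?','?','?'])

sep=r'\s+' matches one or more whitespace characters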

· Progress bar: from tqdm import tqdm

for i in tqdm(range(100)):

    xxx

· Sort a list with a key function: def take(ls): return ls[1]

ls.sort(key=take, reverse=False) (ascending)

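· Set the tick spacing of the x/y axis: from matplotlib.ticker import MultipleLocator

x_major_locator=MultipleLocator(1)

ax.xaxis.set_major_locator(x_major_locator)

· Keep the rows of a DataFrame (e.g. read from Excel) whose value in a column is in a given list: info = info[info['cluster'].isin([1,2,3])]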

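· The asynchronous multiprocessing Pool methods (starmap_async, apply_async, etc.) do not report worker errors by default, so a run can look fine on the surface while it is actually stuck. To surface the errors, add error reporting to the parallel calls:

For details see: python multiprocessing 高级用法（打印报错解决Pool线程不运行）_Ricky_Yan的博客-CSDN博客_multiprocessing 打印线程号

Later I found that this did not seem to help much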
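One way to make worker errors visible is sketched below, under assumptions: reciprocal and run_all are hypothetical names, and the "fork" start method is assumed because these notes target Linux. It is not necessarily what the referenced post does. It relies on two documented behaviours: error_callback receives the exception raised in the worker, and get() re-raises that exception in the parent.

```python
import multiprocessing as mp


def reciprocal(x):
    """Toy worker (hypothetical): raises ZeroDivisionError for x == 0."""
    return 1 / x


def run_all(values):
    """Run one task per value; return (results, errors).

    A failed task contributes None to results and its exception to errors.
    """
    errors = []
    ctx = mp.get_context("fork")  # assumption: Linux
    with ctx.Pool(processes=2) as pool:
        pending = [
            pool.apply_async(reciprocal, (v,), error_callback=errors.append)
            for v in values
        ]
        results = []
        for task in pending:
            try:
                # get() re-raises the worker's exception in the parent;
                # the timeout keeps a hung worker from blocking forever.
                results.append(task.get(timeout=30))
            except Exception:
                results.append(None)
    return results, errors


if __name__ == "__main__":
    results, errors = run_all([1, 0, 4])
    print(results)
    print([type(e).__name__ for e in errors])
```

Without the get() calls or the error_callback, the ZeroDivisionError in this example would pass unnoticed, which matches the symptom described above.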

· Create an evenly spaced time series with pandas:

pd.date_range(start=datetime.datetime(2019,12,6,0,0),end=datetime.datetime(2020,1,2,23,0), freq='H')

Add or subtract time with datetime:

+datetime.timedelta(days=1)
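To show the timedelta arithmetic without pandas, here is a standard-library sketch that builds the same hourly sequence by hand. hourly_range is a hypothetical helper, and it returns a plain list instead of a DatetimeIndex.

```python
import datetime


def hourly_range(start, end, step=datetime.timedelta(hours=1)):
    """Return all timestamps from start to end (inclusive), spaced by step."""
    stamps = []
    current = start
    while current <= end:
        stamps.append(current)
        current += step  # datetime + timedelta gives a new datetime
    return stamps


stamps = hourly_range(datetime.datetime(2019, 12, 6, 0, 0),
                      datetime.datetime(2020, 1, 2, 23, 0))
next_day = stamps[0] + datetime.timedelta(days=1)

print(len(stamps))   # 672 hourly stamps (28 days x 24 hours)
print(next_day)      # 2019-12-07 00:00:00
```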

# Obspy

· Read several sac files without using *: st = read('[NEZ].sac') reads N.sac, E.sac and Z.sac

· Plot a beachball:

from obspy.imaging.mopad_wrapper import beachball

mt = [-0.5, 1, 0.5, 0, 0, 0] # CLVD moment tensor (Mrr, Mtt, Mpp, Mrt, Mrp, Mtp)

beachball(mt, size=200)

Notebook