Setting the maximum memory used on the server

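A node typically has about 250G of memory, and if no memory limit is set for a job, the default is 250G. When only a few cores are requested, the job may be placed on a node that has only a few idle cores left. Such a node usually has little free memory left as well, so the job may be killed, as shown in the figure below.

[Figure: d842cfa239723c4ac3c46d8336ac420.png]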

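It is therefore usually necessary to configure the memory explicitly. You can set both the memory the software itself uses and the memory requested from the server.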

  1. Set it in the parameters of the software you are running, taking NOVOPlasty's config.txt as an example
Project:
-----------------------
Project name          = Test
Type                  = mito
Genome Range          = 12000-22000
K-mer                 = 33
Max memory            =
Extended log          = 0
Save assembled reads  = no
Seed Input            = /path/to/seed_file/Seed.fasta
Extend seed directly  = no
Reference sequence    = /path/to/reference_file/reference.fasta (optional)
Variance detection    =
Chloroplast sequence  = /path/to/chloroplast_file/chloroplast.fasta (only for "mito_plant" option)

Dataset 1:
-----------------------
Read Length           = 151
Insert size           = 300
Platform              = illumina
Single/Paired         = PE
Combined reads        =
Forward reads         = /path/to/reads/reads_1.fastq
Reverse reads         = /path/to/reads/reads_2.fastq
-----------------------
MAF                   =
HP exclude list       =
PCR-free              =

Optional:
-----------------------
Insert size auto      = yes
Use Quality Scores    = no


Project:
-----------------------
Project name         = Choose a name for your project, it will be used for the output files.
Type                 = (chloro/mito/mito_plant) "chloro" for chloroplast assembly, "mito" for mitochondrial assembly and
                       "mito_plant" for mitochondrial assembly in plants.
Genome Range         = (minimum genome size-maximum genome size) The expected genome size range of the genome.
                       Default value for mito: 12000-20000 / Default value for chloro: 120000-200000
                       If the expected size is known, you can lower the range, this can be useful when there is a repetitive
                       region, which could lead to a premature circularization of the genome.
K-mer                = (integer) This is the length of the overlap between matching reads (Default: 33).
                       If reads are shorter than 90 bp or you have low coverage data, this value should be decreased down to 23.
                       For reads longer than 101 bp, this value can be increased, but this is not necessary.
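Max memory           = You can choose a max memory usage, suitable to automatically subsample the data or when you have limited
                       memory capacity. If you have sufficient memory, leave it blank, else write your available memory in GB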
                       (if you have for example a 8 GB RAM laptop, put down 7 or 7.5 (don't add the unit in the config file))
Seed Input           = The path to the file that contains the seed sequence.
Extend seed directly = This gives the option to extend the seed directly, instead of finding matching reads. Only use this when your seed
                       originates from the same sample and there are no possible mismatches (yes/no)
Reference (optional) = If a reference is available, you can give here the path to the fasta file.
                       The assembly will still be de novo, but references of the same genus can be used as a guide to resolve
                       duplicated regions in the plant mitochondria or the inverted repeat in the chloroplast.
                       References from a different genus haven't been tested yet.
Variance detection   = If you select yes, you should also have a reference sequence (previous line). It will create a vcf file
                       with all the variances compared to the given reference (yes/no)
Chloroplast sequence = The path to the file that contains the chloroplast sequence (Only for mito_plant mode).
                       You have to assemble the chloroplast before you assemble the mitochondria of plants!

Dataset 1:
-----------------------
Read Length          = The read length of your reads.
Insert size          = Total insert size of your paired end reads, it doesn't have to be accurate but should be close enough.
Platform             = illumina/ion - The performance on Ion Torrent data is significantly lower
Single/Paired        = For the moment only paired end reads are supported.
Combined reads       = The path to the file that contains the combined reads (forward and reverse in 1 file)
Forward reads        = The path to the file that contains the forward reads (not necessary when there is a merged file)
Reverse reads        = The path to the file that contains the reverse reads (not necessary when there is a merged file)

Heteroplasmy:
-----------------------
MAF                  = (0.007-0.49) Minor Allele Frequency: If you want to detect heteroplasmy, first assemble the genome without this option. Then give the resulting
                       sequence as a reference and as a seed input. And give the minimum minor allele frequency for this option
                       (0.01 will detect heteroplasmy of >1%)

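You can request more cores and also set the memory limit in the config file. For example, to allow 50G, enter 50 after Max memory (the value is in GB, without the unit).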
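
The same edit can be made from the command line. Below is a minimal sketch, assuming the config file contains a `Max memory            =` line laid out as in the template above. `set_max_memory` is a hypothetical helper name, not part of NOVOPlasty; it rewrites only that one line and leaves the rest of the file untouched.

```shell
# set_max_memory CONFIG GB
# Hypothetical helper: writes GB (a plain number, no unit) after "Max memory =".
set_max_memory() {
    local config="$1" gb="$2" tmp
    tmp="$(mktemp)" || return 1
    # Replace whatever follows "Max memory ... =" with the new value.
    if sed -E "s/^(Max memory[[:space:]]*=).*/\1 ${gb}/" "$config" > "$tmp"; then
        # Copy the contents back so the file keeps its original permissions.
        cat "$tmp" > "$config"
    fi
    rm -f "$tmp"
}
```

Usage: `set_max_memory config.txt 50`, then check the result with `grep '^Max memory' config.txt`.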

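  1. Set the memory when submitting with Slurm
    The maximum memory can be set in the job script (.sh). The following two options can be used as a reference.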

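1) Set the maximum memory for the job: --mem=<size[units]> specifies the memory the job requests on each node. For example, to request 2G of memory:

#SBATCH --mem=2G

2) Set the maximum memory per CPU: --mem-per-cpu=<size[units]> specifies the memory requested for each allocated CPU (per core, not per process). For example, to request 512M of memory per CPU:

#SBATCH --mem-per-cpu=512M

The two options are mutually exclusive, so use only one of them in a script. If no unit is given, the value is interpreted as megabytes.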
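
To show how the pieces fit together, here is a sketch of a complete job script. The core count (8) and the 56G request are illustrative assumptions, chosen to leave some headroom above a 50 GB software-side limit; the partition name is the one used in this post. `mem_per_cpu_mb` is a hypothetical helper that only does the arithmetic for the equivalent `--mem-per-cpu` value, and the commented-out NOVOPlasty command is an assumption, since the script name depends on the installed version.

```shell
#!/bin/bash
# Directives must come before the first command in the script.
#SBATCH --partition=xhacnormala
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=56G

# mem_per_cpu_mb TOTAL_GB CPUS
# Prints the per-CPU memory in MB (rounded up) that adds up to TOTAL_GB.
mem_per_cpu_mb() {
    local total_gb="$1" cpus="$2"
    echo $(( (total_gb * 1024 + cpus - 1) / cpus ))
}

# The same total could be requested per CPU instead of per node.
echo "Equivalent request: --mem-per-cpu=$(mem_per_cpu_mb 56 8)M"

# Actual workload goes here, for example (assumed command, adjust to your install):
# perl NOVOPlasty.pl -c config.txt
```

With 8 CPUs, 56G per node corresponds to 7168M per CPU, so `--mem=56G` and `--mem-per-cpu=7168M` ask for the same total here.
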
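  1. View the remaining resources of each node
pestat -p xhacnormala
(xhacnormala is the partition name)

The output lists the node name, node state, CPUs in use and total CPUs (cores), CPUload (the load average, that is, the number of processes running on or waiting for the CPU, averaged over a period of time), total node memory, free node memory, and the job list.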

Print only nodes in partition xhacnormala
Hostname     Node Num_CPU  CPUload  Memsize  Freemem  Joblist
            State Use/Tot              (MB)     (MB)  JobId User ...
hara1004      mix  36  64    4.16*   257663   195961    
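
When a partition has many nodes, it can help to filter this table. The sketch below assumes the column layout shown above (used cores in column 3, total cores in column 4, free memory in MB in column 7); `filter_nodes` is a hypothetical helper name, and other pestat versions or options may print different columns.

```shell
# filter_nodes MIN_FREE_CORES MIN_FREE_MB   (reads pestat output on stdin)
# Prints: hostname, free cores, free memory (MB) for nodes that meet both limits.
filter_nodes() {
    awk -v cores="$1" -v mem="$2" '
        # Only data rows have plain numbers in the used/total core columns.
        $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
            free = $4 - $3
            fm = $7
            gsub(/[^0-9]/, "", fm)   # drop any flag character such as "*"
            if (fm != "" && free >= cores + 0 && fm + 0 >= mem + 0)
                print $1, free, fm
        }'
}
```

Usage: `pestat -p xhacnormala | filter_nodes 8 51200` lists nodes with at least 8 idle cores and about 50 GB of free memory.
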
  1. View the remaining resources of a single node (here through its partition, which contains one node)
sinfo -p xhacexclu04 -o "%P %a %D %c %C"

Output (CPUS(A/I/O/T) stands for allocated/idle/other/total CPUs):

PARTITION AVAIL NODES CPUS CPUS(A/I/O/T)
xhacexclu04 up 1 64 60/4/0/64
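
To pull just the idle core count out of that output, a small awk filter is enough. This is a sketch that assumes the format string used above, so that the `A/I/O/T` field is the last column; `idle_cpus` is a hypothetical helper name.

```shell
# idle_cpus   (reads the sinfo output shown above on stdin)
# Prints: partition name and number of idle CPUs, skipping the header line.
idle_cpus() {
    awk 'NR > 1 {
        split($NF, c, "/")   # c[1]=allocated, c[2]=idle, c[3]=other, c[4]=total
        print $1, c[2]
    }'
}
```

Usage: `sinfo -p xhacexclu04 -o "%P %a %D %c %C" | idle_cpus`. Since this post is about memory, it may also be worth adding `%m` (memory per node) and `%e` (free memory) to the format string, if your Slurm version supports them.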

Reference:
https://www.cnblogs.com/nandi001/p/11643414.html
