2025-08-23 小马的acRIP call完peak用DESeq2分析

一开始感觉要直接做三组平均,写了个peakaverage.sh 暂时没用上

#!/bin/bash

# Define input and output directories

INPUT_DIR="/home/data/t210424/ac4c/bam"

OUTPUT_DIR="/home/data/t210424/ac4c/peaksaverage" 

# Create output directory if it doesn't exist

mkdir -p "$OUTPUT_DIR"

# Change to input directory

cd "$INPUT_DIR"

# Call peaks for Sham group

macs2 callpeak \

  -t Sham2_IP.bam Sham3_IP.bam Sham4_IP.bam \

  -c Sham_Input.bam \

  -f BAMPE \

  -g mm \

  -n Sham_group \

  --outdir "$OUTPUT_DIR" \

  --broad \

  --broad-cutoff 0.1 \

  -q 0.01 \

  --nomodel \

  --extsize 200 \

  --keep-dup all

# Call peaks for SNI group

macs2 callpeak \

  -t SNI1_IP.bam SNI2_IP.bam SNI4_IP.bam \

  -c SNI_Input.bam \

  -f BAMPE \

  -g mm \

  -n SNI_group \

  --outdir "$OUTPUT_DIR" \

  --broad \

  --broad-cutoff 0.1 \

  -q 0.01 \

  --nomodel \

  --extsize 200 \

  --keep-dup all

单个样本的callpeak其实和之前一样,不过文件名改过了干脆重新call了,脚本是peaksingle.sh

#!/usr/bin/env bash

set -euo pipefail

bam=/home/data/t210424/ac4c/bam

out=/home/data/t210424/ac4c/peaksingle

mkdir -p "$out"

declare -A IN=( ["Sham"]=Sham_Input.bam ["SNI"]=SNI_Input.bam )

for s in Sham2 Sham3 Sham4 SNI1 SNI2 SNI4; do

  grp=${s%%[0-9]*}  # 提取 Sham 或 SNI

  macs2 callpeak \

    -t "$bam/${s}_IP.bam" \

    -c "$bam/${IN[$grp]}" \

    -f BAM -g mm --nomodel --extsize 200 --keep-dup all \

    -n "${s}" --outdir "$out"

done

对/home/data/t210424/ac4c/peaksingle里每一个bed进行合并

脚本是ac4c文件夹下的tongji1.sh

#!/usr/bin/env bash

set -euo pipefail

bam=/home/data/t210424/ac4c/bam

bed=/home/data/t210424/ac4c/peaksingle

out=/home/data/t210424/ac4c/peaksingle/bed

mkdir -p "$out"

# 1) union peak(按坐标并集)

cat $bed/*.narrowPeak \

  | cut -f1-3 | sort -k1,1 -k2,2n | bedtools merge -i - \

  > $out/union.bed

# 2) 统计每个 peak 在各样本的 reads(IP 与 Input 分开做)

bedtools multicov -bams \

  $bam/Sham2_IP.bam $bam/Sham3_IP.bam $bam/Sham4_IP.bam \

  $bam/SNI1_IP.bam  $bam/SNI2_IP.bam  $bam/SNI4_IP.bam \

  -bed $out/union.bed \

  > $out/counts_ip.txt

bedtools multicov -bams \

  $bam/Sham_Input.bam $bam/Sham_Input.bam $bam/Sham_Input.bam \

  $bam/SNI_Input.bam  $bam/SNI_Input.bam  $bam/SNI_Input.bam \

  -bed $out/union.bed \

  > $out/counts_input.txt

然后就转进R了 先做的DESeq2 双因素模型,SHAM对比SNI IP对比INPUT两种因素

library("ashr")

library(DESeq2)

library(tidyverse)

setwd("F:/data")

# 1. 导入 count 矩阵

counts <- read.table("all_counts.txt", header=TRUE, row.names=1, sep="\t")

# 2. 构建样本信息,并指定 factor 水平

#    Sham 作为 condition 的 reference,Input 作为 antibody 的 reference

sample_info <- data.frame(

  row.names = colnames(counts),

  condition = factor(c("Sham","Sham","Sham","SNI","SNI","SNI","Sham","SNI"),

                    levels = c("Sham","SNI")),

  antibody  = factor(c("IP","IP","IP","IP","IP","IP","Input","Input"),

                    levels = c("Input","IP"))

)

# 3. 构建 DESeq2 对象

dds <- DESeqDataSetFromMatrix(countData = counts,

                              colData  = sample_info,

                              design    = ~ condition + antibody + condition:antibody)

# 4. 运行差异分析

dds <- DESeq(dds)

# 5. 提取交互项(Sham vs SNI 在 IP 相对于 Input 的差异)

res <- results(dds, name="conditionSNI.antibodyIP")

# 6. 收缩 log2FC(推荐使用 ashr)

res <- lfcShrink(dds, coef="conditionSNI.antibodyIP", res=res, type="ashr")

# 7. 输出结果

write.csv(as.data.frame(res), file="acRIP_diff_peaks.csv")

结果太少了 又换回单因素了,先用IP/INPUT再做,DESeq只接受整数所以换了limma

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容