FH (Functional Haplotype) Construction

Reference:

Reveal genomic insights into cotton domestication and improvement using gene level functional haplotype-based GWAS | Nature Communications

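Note:

Before building the FHs, replace the SNP column of test.traw with IDs built from the chromosome, position, and allele columns: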

import sys

import pandas as pd

input_file = "test.traw"
output_file = "test_renamed.traw"

# Read the tab-separated .traw file
traw_data = pd.read_csv(input_file, sep="\t", low_memory=False)
traw_columns = traw_data.columns.tolist()

# Set the column names to match your file, then check them.
# The REF field is deliberately taken from the ALT column and the
# ALT field from the COUNTED column (see the note after the script).
chr_col = "CHR"
pos_col = "POS"
ref_col = "ALT"
alt_col = "COUNTED"
required_cols = [chr_col, pos_col, ref_col, alt_col]

# Check that every required column exists
missing_cols = [col for col in required_cols if col not in traw_columns]
if missing_cols:
    print("Error: the following columns were not found; check the column names in your file.")
    print(f"Missing columns: {', '.join(missing_cols)}")
    print(f"All columns in your file: {', '.join(traw_columns)}")
    sys.exit(1)  # Stop here to avoid errors further down

# Build the new SNP column as CHR_POS_REF_ALT
traw_data["SNP"] = (
    traw_data[chr_col].astype(str) + "_" +
    traw_data[pos_col].astype(str) + "_" +
    traw_data[ref_col].astype(str) + "_" +
    traw_data[alt_col].astype(str)
)

# Save the file
traw_data.to_csv(output_file, sep="\t", index=False, header=True, quoting=0)
print(f"Done. Output file: {output_file}")

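Note that the REF and ALT columns are in the opposite order from the original file here: the script takes the REF part of the ID from the ALT column and the ALT part from the COUNTED column.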
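The effect of the swapped columns is easiest to see on a small example. The sketch below applies the same string concatenation to a toy table laid out like a .traw file. The data values, the sample column name, and the helper name build_snp_ids are invented for illustration and are not part of the original script.

```python
import pandas as pd

# Toy data in the .traw column layout (all values are made up for illustration).
toy = pd.DataFrame({
    "CHR": ["A01", "A01"],
    "SNP": [".", "."],
    "(C)M": [0, 0],
    "POS": [1024, 2048],
    "COUNTED": ["G", "T"],
    "ALT": ["A", "C"],
    "sample1": [0, 2],
})


def build_snp_ids(df, chr_col="CHR", pos_col="POS", ref_col="ALT", alt_col="COUNTED"):
    """Hypothetical helper: join four columns into CHR_POS_REF_ALT strings.

    The defaults mirror the script above: the REF field comes from the
    ALT column and the ALT field from the COUNTED column.
    """
    return (
        df[chr_col].astype(str) + "_" +
        df[pos_col].astype(str) + "_" +
        df[ref_col].astype(str) + "_" +
        df[alt_col].astype(str)
    )


toy["SNP"] = build_snp_ids(toy)
print(toy["SNP"].tolist())  # ['A01_1024_A_G', 'A01_2048_C_T']
```

In the first row the ALT column holds A and COUNTED holds G, so the ID ends in A_G rather than G_A. If the IDs in your own file come out in the wrong order, swapping the ref_col and alt_col arguments is the only change needed.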

Then run the script again:

This yields the correct number of genes.
