Genome landscape and genetic manipulation (CRISPR/Cas9) of the black soldier fly

Genomic landscape and genetic manipulation of the black soldier fly Hermetia illucens, a natural waste recycler

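On November 25, 2019, the teams of Huang Yongping (Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences) and Zhang Jibin (Huazhong Agricultural University), together with colleagues, published a research paper online in Cell Research titled "Genomic landscape and genetic manipulation of the black soldier fly Hermetia illucens, a natural waste recycler". The study reports a high-quality genome of the black soldier fly (BSF) and, using CRISPR/Cas9 gene editing, obtains a genotype with markedly increased feeding capacity. It provides valuable genomic and technical resources for optimizing BSF lines for industrial use.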


The BSF genome

Abstract

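The black soldier fly is a dipteran of the family Stratiomyidae that converts organic matter into resources usable as animal feed. Its genome is about 1.1 Gb and contains 16,770 protein-coding genes. Compared with other dipterans, the BSF genome shows large gene expansions in functional categories related to septic adaptation (adaptation to decaying environments), including immune system factors, olfactory receptors, and cytochrome P450s. The midgut transcriptome is strongly enriched for pathways related to the digestive system and to defense against bacteria. The microbiome of BSF larvae fed representative organic wastes shows that Firmicutes are the dominant gut bacteria. A genotype with enhanced feeding capacity was obtained with CRISPR/Cas9-based technology.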

Data availability: NCBI under BioProjectID PRJNA547968 and SRA under SRR10158821.

Introduction

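As the human population grows, more and more organic waste is produced. It is mainly handled in three ways: incineration, landfill, and composting, all of which cause some degree of secondary pollution. BSF is regarded as the only insect worldwide that can be used as a feed ingredient for both aquaculture and poultry. It efficiently converts organic waste into protein and fat while reducing carbon dioxide emissions and contamination by pathogens and antibiotics. Taking advantage of advances in sequencing technology, the paper combines genomics, transcriptomics, metagenomics, and genetic manipulation to explore the genetic basis of BSF biology.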

RESULTS AND DISCUSSION

Characteristics of the BSF genome

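The sequenced sample came from a line inbred for 10 generations. Illumina sequencing at roughly 300× depth, including paired-end libraries of short inserts and mate-pair libraries of long inserts, yielded 1102 Mb of assembled scaffolds with a 1.69 Mb N50 length. The BSF genome is large because of transposons, repetitive noncoding DNA, and a large amount of other repeat sequence.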
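For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch of how it is commonly computed from a list of scaffold lengths. The function name `n50` and the example lengths are illustrative assumptions, not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical scaffold lengths in bp, for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

A larger N50 means that more of the assembly sits in long scaffolds, which is why the value is reported next to the total assembly size.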


Figure: sequencing strategy

Sequencing depth and GC content are both approximately normally distributed, indicating little contamination in the assembly.



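The final set of 16,770 protein-coding genes was derived by combining homology alignments against six dipteran species, transcriptome data from 12 consecutive BSF developmental stages, and three ab initio gene prediction sets.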

Comparison of the BSF genome with those of other dipterans

The BSF genome fills a gap between the Nematocera, the earliest diverging suborder of Diptera, and more recent flies (Brachycera).



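Analysis of nonsynonymous-to-synonymous substitution (dN/dS) ratios between BSF and the housefly and fruit fly identified 342 genes with high dN/dS ratios. These rapidly evolving genes are mainly enriched in ribosome-related functional modules involved in protein synthesis, and in pathways related to amino acid metabolism and immunity. This may be because BSF has long lived in environments rich in protein and pathogens.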
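The figure for this analysis plots one dot per pathway: the median dN/dS ratio of the genes in that pathway. A minimal sketch of that summary step follows. In the paper the dN and dS values come from codeml (PAML); the gene names, pathway names, and numbers here are made up for illustration, and skipping genes with dS = 0 is an assumption of this sketch.

```python
from statistics import median


def dnds_ratio(dn, ds):
    """Return dN/dS, or None when dS is zero (ratio undefined)."""
    if ds <= 0:
        return None
    return dn / ds


def pathway_medians(gene_rates, pathway_genes):
    """Median dN/dS per pathway.

    gene_rates: gene -> (dN, dS)
    pathway_genes: pathway -> list of gene names
    Genes that are missing or have an undefined ratio are skipped.
    """
    ratios = {}
    for gene, (dn, ds) in gene_rates.items():
        ratio = dnds_ratio(dn, ds)
        if ratio is not None:
            ratios[gene] = ratio

    medians = {}
    for pathway, genes in pathway_genes.items():
        values = [ratios[g] for g in genes if g in ratios]
        if values:
            medians[pathway] = median(values)
    return medians


# Hypothetical inputs, for illustration only.
gene_rates = {
    "g1": (0.125, 0.5),
    "g2": (0.25, 0.5),
    "g3": (0.1, 0.0),   # dS = 0, ratio undefined, skipped
    "g4": (0.75, 1.0),
}
pathway_genes = {"ribosome": ["g1", "g2", "g4"], "immune": ["g4", "g3"]}
print(pathway_medians(gene_rates, pathway_genes))
```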


dN/dS

b. Identification of pathways that have rapidly evolved in BSF. dN/dS ratios were calculated independently in two parallel evolutionary lineages, M. domestica and D. melanogaster, using BSF as the common ancestor. Each dot indicates the median dN/dS ratios of all related genes in the corresponding pathway. Significantly enriched (FDR-adjusted P < 0.05), rapidly evolving genes in KEGG pathways are highlighted in red.
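BSF has 1798 species-specific duplicated genes, the most among the Brachycera. These genes are mainly expressed in the final larval stage, which may be related to the feeding behavior behind its waste conversion.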

Expansions in gene families are related to BSF environmental interactions

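Compared with other dipterans, BSF shows large expansions in genes for detoxification enzymes, olfactory reception, immune factors, and immune pathways, which is consistent with its adaptation to its environment.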



Fig. 3 Expansions in gene families related to BSF environmental adaptation. a Number of gene copies in the indicated families related to environmental adaptation in dipteran species. The area size of each pie indicates the relative gene number in each family. b–e Phylogenetic relationships across three dipteran species for gene families with prominent expansions in BSF: gram-negative binding proteins (b), cecropin antimicrobial peptides (c), olfactory receptors (d), cytochrome P450s (e). Phylogenetic trees were estimated using the maximum likelihood method.

Intestinal transcriptome of BSF larvae fed on organic waste

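The authors fed BSF larvae food waste, poultry manure, dairy manure, or swine manure, and dissected midguts on days 4, 6, 8, and 12 for transcriptome analysis.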



Fig. 4 Intestinal transcriptome in BSF larvae fed with organic waste. Midguts of BSF larvae fed with food waste (FW), poultry manure (PM),
dairy manure (DM), or swine manure (SM) were sampled on days 4, 6, 8, and 12 of feeding with the indicated diet. The samples were subjected
to RNA-seq. a Distributions of expressed genes (n = 9417) across 16 samples: Genes expressed at each time point under each type of diet are
labeled “All”; those expressed in 15 out of 16 samples are labeled “Almost all”; genes commonly expressed under each diet but not at every
time point are labeled “Broad”; genes only expressed in one sample are labeled “Orphan”; genes only expressed by larvae fed with manure are
labeled “Manure”; and genes only expressed in larvae fed with food waste are labeled “Waste”. b Principal component analysis of intestinal
samples based on their overall expression profiles. The first two eigenvectors that explained 34.2% and 20.4% of the variance are plotted. c
Venn diagram of the 500 most highly expressed genes (~5% of all expressed genes), selected for each type of diet based on the average
expression values across all time points. A total of 326 genes were expressed by larvae fed all four diets. d The 326 genes expressed by larvae
fed all four diets were subjected to KEGG enrichment analysis. Pathways in blue belong to digestive systems, and pathways in red indicate
those related to infectious diseases. Gene counts are presented as histograms. Hypergeometric test (FDR-adjusted): *P < 0.05, ***P < 0.005,
****P < 0.001. e A representative gene cluster specific to BSF and highly expressed in larvae fed with organic waste. Genomic organization in
BSF and the homologous region in D. melanogaster are shown. Homolog pairs between these species are linked by lines. Genes in green and
blue indicate BSF-specific genes that belong to two ortholog groups. These 14 genes do not have homology to genes of any other sequenced
invertebrate species. Note that this cluster is located in the end of an assembled BSF scaffold. The heatmap shows the expression pattern of
corresponding genes in BSF larvae fed with the other diets at each of the four time points.
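The expression categories in panel a of Fig. 4 can be illustrated with a small classifier over the 16 samples (4 diets × 4 time points). This is only a sketch under stated assumptions: the TPM threshold of 1.0, the order in which overlapping categories are resolved, and the extra labels "Not expressed" and "Other" are choices made here, not details given in the caption.

```python
DIETS = ("FW", "PM", "DM", "SM")   # food waste and three manures
DAYS = (4, 6, 8, 12)
MANURE = {"PM", "DM", "SM"}
ALL_SAMPLES = {(diet, day) for diet in DIETS for day in DAYS}


def classify_gene(tpm, threshold=1.0):
    """Assign one gene to an expression category.

    tpm maps (diet, day) -> TPM; missing samples count as not expressed.
    threshold is an assumed cutoff for calling a gene expressed.
    """
    expressed = {key for key in ALL_SAMPLES if tpm.get(key, 0.0) >= threshold}
    diets_on = {diet for diet, _ in expressed}
    n = len(expressed)

    if n == 0:
        return "Not expressed"
    if n == len(ALL_SAMPLES):
        return "All"
    if n == len(ALL_SAMPLES) - 1:
        return "Almost all"
    if n == 1:
        return "Orphan"
    if diets_on == set(DIETS):
        return "Broad"          # every diet, but not every time point
    if diets_on <= MANURE:
        return "Manure"
    if diets_on == {"FW"}:
        return "Waste"
    return "Other"


def make_profile(on, level=5.0):
    """Helper for the example: TPM = level in the listed samples, 0 elsewhere."""
    return {key: (level if key in on else 0.0) for key in ALL_SAMPLES}


print(classify_gene(make_profile({("PM", 4), ("DM", 6)})))
```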

Microbiota of BSF larvae fed on organic wastes

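16S rRNA sequencing was used to profile the taxa and abundance of the BSF gut microbiota across diets and time points. Larvae fed dairy manure or swine manure carried a more diverse set of gut microbial taxa. Unlike the midgut transcriptome, whose expression profiles showed no clear pattern, the gut microbiota correlated strongly with diet. Firmicutes contributed the largest number of bacterial taxa. In the paper's words: "Firmicutes have an important role in digestion of animal manure as these bacteria secrete a variety of proteases and pectinases and are involved in degradation of indigestible carbohydrates in straw-related compost."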



Fig. 5 Microbiome of BSF larvae fed with different types of organic waste. a Within-sample diversity estimates of the bacterial communities in
larvae fed with the indicated diets. b Constrained principal coordinate analysis of between-sample diversity. Bray-Curtis distances between
samples constrained by diets plotted for the first two CPCoAs. c The dynamic landscape of OTUs across all communities at a phylum level.
OTU richness is indicated by the area of corresponding symbols. Symbols indicate counts of contained sequences. Colors indicate the fraction
of target OTUs relative to all OTUs of the corresponding sample.

Genetic manipulation to facilitate the utilization of BSF larvae

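The main idea is to make BSF larvae eat more, and so convert more organic matter, and to reduce how far adults can move, so that large populations can be built up.
First, insect metamorphosis is controlled by a series of hormones and neuropeptides, and prothoracicotropic hormone (Ptth) regulates the synthesis and release of ecdysone. Knocking out Ptth effectively prolongs the time from larva to pupa. The authors used two sgRNAs, targeted to the second and fourth exons, to disrupt HiPtth substantially in vivo. The last larval instar increased from 4–5 days in controls to > 85 days in mutant larvae of any mosaic forms of disrupted HiPtth. Body size and weight also increased markedly, probably because of the longer feeding period.
Second, the authors identified the BSF homolog of Vestigial (Vg), a gene that controls wing size and shape in Drosophila, by homology alignment with Drosophila wing-development genes. Knocking it out produced wingless adults without affecting adult development.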


Fig. 6 Mutagenesis of Ptth leads to increased feeding capacity in BSF larvae. The CRISPR/Cas9 system was used to induce mutations at the
HiPtth locus in H. illucens. a Schematic representation of the exon/intron boundaries of the HiPtth gene. Exons are shown as boxes; thin lines
represent introns; numbers are fragment lengths in base pairs (bp). Target site (TS) locations are noted and PAM sequences are shown in red.
b Sequences of the targeted region in the HiPtth locus in the mutants. The PAM sequence is in red. The numbers of nucleotides deleted in
each line are indicated on the right. c Morphology of HiPtth mutants showing their greater size relative to wild type (WT) controls. d Average
body weights of mutants and control (n = 30; mean values ± SEM).


Fig. 7 Mutagenesis of Vg in BSF eliminates wings in adults.
a Schematic representation of the exon/intron boundaries of HiVg.
Exons are shown as boxes and thin lines represent the introns.
Target site (TS) locations are noted and PAM sequences are shown in
red. b Sequences of the targeted region in the corresponding loci of
Vg mutants. The PAM sequence is in red. The numbers of
nucleotides deleted in each line are indicated on the right.
c Phenotypic images show that Vg mutants lack wings in the
adult stage.

MATERIALS AND METHODS

Genome sequencing

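DNA extracted from a single pupa was used for genome sequencing. Paired-end and mate-pair libraries with different insert sizes were constructed to build contigs and scaffolds.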

Genome assembly

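Genome size was estimated by k-mer analysis. Seqtk v1.0 was used to trim adaptors and low-quality bases, k-mers were counted with Jellyfish (21-mers), and heterozygosity and other genome features were estimated with GenomeScope.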
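The idea behind a k-mer based size estimate can be shown with a toy sketch: count k-mers, find the main coverage peak, and divide the total number of k-mers by the peak depth. The paper used Jellyfish and GenomeScope for this. The functions `count_kmers` and `estimate_genome_size` and the `min_depth` cutoff below are hypothetical, and the sketch ignores reverse complements, heterozygosity, and repeats, which a model such as GenomeScope accounts for.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(kmer_counts, min_depth=2):
    """Estimate genome size as total k-mers / peak k-mer depth.

    k-mers seen fewer than min_depth times are treated as likely
    sequencing errors and ignored (an assumption of this sketch).
    """
    histogram = Counter(kmer_counts.values())   # depth -> distinct k-mers
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth


# Toy example: a 20-bp "genome" sequenced 10 times without errors.
genome = "ACGTTGCAAGGCTATCCGAT"
reads = [genome] * 10
print(estimate_genome_size(count_kmers(reads, k=5)))
```

For this toy input the estimate equals the number of distinct 5-mers in the sequence, which is slightly below the true length because a sequence of length L contains only L - k + 1 k-mers.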

  • MiSeq read pairs were used to assemble contigs with DiscovarDeNovo. Initial contigs were processed by redundans v0.11c63 to remove potentially redundant sequences.
  • The paired-end read information from the long libraries was used step by step, from 800-bp to 13-kb insert size, to join contigs into scaffolds using SSPACE.
  • The remaining gaps within scaffolds were iteratively filled with paired-end reads of 250-bp and 800-bp inserts using GapCloser, available in SOAPdenovo.
  • CEGMA (Core Eukaryotic Genes Mapping Approach) and BUSCO (Benchmarking Universal Single-Copy Orthologs) were used to assess assembly quality.

Genome annotation

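Repeat annotation

  • Tandem Repeats Finder to annotate the tandem repeats (Tandem Repeats Database)
  • RepeatModeler to construct a de novo repeat library
  • RepeatMasker to search for similar TEs against the known Repbase TE library and the de novo repeat library
  • LTR_FINDER to find long terminal repeats (LTRs)

Protein-coding gene annotation

  1. transcriptome evidence
    Transcriptome data from 12 consecutive BSF developmental stages, with two biological replicates each. HISAT2 was used to map RNA-seq reads to the reference genome and StringTie to predict exons.
  2. homolog alignments
    GeneWise with protein inputs from six dipteran species.
  3. ab initio gene annotation
    Three independent gene predictors were applied to generate ab initio signatures: AUGUSTUS, SNAP, and Genscan.

The results of these three pipelines were finally merged by GLEAN into a consensus gene set.
Functional annotation of specific gene families required manual curation: TBLASTN searches with dipteran homologs were used to locate the genomic loci, gene structures were predicted with GeneWise, and conserved domains and biological pathways were obtained from KEGG KO annotation. Gene family contraction and expansion were assessed by running a local InterProScan against the dipteran genomes. Gene expression was quantified with salmon, and normalized expression values are given as TPM.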
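Since expression is reported as TPM (transcripts per million), a short sketch of the standard TPM formula may help: each gene's read count is divided by its length, and the resulting rates are rescaled to sum to one million. Salmon computes TPM itself, using effective transcript lengths. The function `tpm` and the example numbers below are illustrative assumptions only.

```python
def tpm(counts, effective_lengths):
    """Convert read counts to TPM.

    counts: gene -> number of assigned reads
    effective_lengths: gene -> effective length in bp
    """
    rates = {gene: counts[gene] / effective_lengths[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("all counts are zero")
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical genes: equal counts, but gene_b is three times longer.
example = tpm({"gene_a": 100, "gene_b": 100}, {"gene_a": 1000, "gene_b": 3000})
print(example)
```

Because of the length normalization, the shorter gene in the example receives three times the TPM of the longer one despite having the same read count.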

Comparative genomics

OrthoMCL was used to identify the final orthologs, inparalogs, and co-orthologs. Multiple alignments of protein sequences for each group
were performed using Muscle, and Gblocks was used to identify conserved blocks.
Conserved blocks were finally concatenated to 10 super genes with 255,475 amino acids, which were used to estimate the maximum likelihood
phylogeny using RAxML.
Codeml from the PAML package was used to calculate dN/dS ratios under the F3X4 codon frequency.
Functional enrichment analyses were performed via an online OMICSHARE cloud platform (http://www.omicshare.com/tools/Home/Soft/pathwaygsea).

Analysis of the BSF intestinal transcriptome

  • Each sample was independently mapped to the reference genome and subjected to expression profiling using the mode "quant" of salmon with the parameter "--validateMappings". All independent profiles were finally merged into a TPM matrix using the mode "quantmerge" of salmon.
  • Expression profile-based principal component analysis was performed using the built-in R function "prcomp".

Metagenomic analyses of BSF intestinal microbiota

16S rRNA sequencing of the gut microbiota.

  • Clean read pairs were merged using the built-in command "join_paired_ends.py" from QIIME.
  • OTU analyses were performed by VSEARCH. Within- and between-sample diversities were estimated by the built-in QIIME scripts “alpha_diversity.py” and “beta_diversity.py”, respectively.
  • The dynamic landscape of OTUs was generated using the online platform, SILVAngs (https://www.arb-silva.de/ngs).
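To make the two kinds of diversity concrete, here is a minimal sketch of one within-sample index and one between-sample distance computed from OTU count vectors. Bray-Curtis is the distance named in the Fig. 5 caption. The choice of the Shannon index for within-sample diversity is an assumption of this sketch, since the notes do not say which metric was passed to the QIIME scripts. The OTU counts are made up.

```python
import math


def shannon(counts):
    """Shannon diversity (natural log) of one sample's OTU counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples' OTU counts.

    0 means identical composition, 1 means no shared OTUs.
    """
    if len(a) != len(b):
        raise ValueError("samples must list the same OTUs")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical OTU counts for two samples over the same two OTUs.
sample_1 = [6, 4]
sample_2 = [4, 6]
print(shannon(sample_1), bray_curtis(sample_1, sample_2))
```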

Mutagenesis of BSF target genes

  • The predicted ORFs of HiPtth and HiVg were obtained by homology alignment with other dipterans. With the PAM sequences in consideration, newly designed sgRNAs should follow the NNN19GG rule.

  • Fertilized eggs were collected within 1 h and microinjection was performed within 2 h of oviposition. Cas9 protein (200 ng/μL) with the sgRNA-1 (100 ng/μL) and sgRNA-2 (100 ng/μL) molecules were co-injected into preblastoderm embryos.

  • First-instar larvae were selected for genomic DNA preparation. Fragments covering the two target sites were amplified, cloned into a pJET1.2 vector (Fermentas), and sequenced on the Sanger platform.
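The NNN19GG rule mentioned in the first step can be sketched as a simple sequence scan. The sketch reads the rule as 21 arbitrary nucleotides followed by GG, that is, a 20-nt protospacer plus an NGG PAM. That reading, the function names, and the demo sequence are assumptions made here for illustration. Real sgRNA design would also screen candidates for off-target matches, which this sketch does not do.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Lookahead so that overlapping candidate sites are all reported.
_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_targets(seq):
    """List candidate sites as (strand, start, protospacer, PAM).

    For the "-" strand, start is a position on the reverse-complemented
    sequence, not on the input sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for match in _SITE.finditer(template):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Hypothetical sequence containing exactly one site on the "+" strand.
demo = "TTT" + "ACGT" * 5 + "TGG" + "AAA"
for hit in find_targets(demo):
    print(hit)
```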



Reference

https://www.nature.com/articles/s41422-019-0252-6#Sec18
