寄蜉蝣于天地,渺沧海之一粟。哀吾生之须臾,羡长江之无穷。
今天给大家分享一篇刚刚发表在Nature Communications上关于蜉蝣基因组的文章。文章主要介绍了蜉蝣作为半变态昆虫,其生活史经历水中和陆地两种环境,其中OBP的数量扩增尤其在若虫鳃状结构中表达。
成虫拥有性别两型的视觉蛋白的表达,这与两者复眼结构有关。文中还鉴定了一系列保守的翅相关的基因,其中成虫翅与若虫鳃状结构存在转录上的相似性。
🐝 🦋
- 有翅昆虫的进化彻底改变了陆地生态系统,并形成了地球最大的物种种群。
- We describe the genome of the mayfly Cloeon dipterum(半变态昆虫--二翅蜉蝣) and its gene expression throughout its aquatic and aerial life cycle and specific organs. We discover ==an expansion of odorant-binding-protein genes, some expressed specifically in breathing gills of aquatic nymphs==, suggesting a novel sensory role for this organ.
- Flying adults use an ==enlarged opsin set in a sexually dimorphic manner==, with some expressed only in males.
- We identify ==a set of wing-associated genes deeply conserved in the pterygote== insects and find ==transcriptomic similarities between gills and wings==, suggesting a common genetic program.
- It was only after insects evolved wings that this lineage (Pterygota) became the most prominent animal group in terms of number and diversity of species, and completely revolutionized Earth ecosystems.
- The development of wings also marked ==the appearance of a hemimetabolous life cycle==, with two clearly differentiated living phases (non-flying juveniles and flying adults).
- The appearance of wings and the capacity to fly greatly increased the capability of insects for ==dispersal, escape and courtship== and allowed them access to previously unobtainable nutrient sources, while establishing new ecological interactions.
- Mayflies have abdominal gills during the aquatic stages, a feature that places them in a privileged position to assess the different hypotheses accounting for the origin of wings, which suggested that ==wings are either homologous to tergal structures (dorsal body wall), or pleural structures (including gills) or a fusion of the two.==
- Some mayfly families exhibit a striking ==sexual dimorphism in their visual systems==. includes ==the presence of a second set of large compound eyes in males==.
Cloeon dipterum is a species of mayfly with a Holarctic distribution. It is the most common mayfly in ponds in the British Isles and ==the only ovoviviparous mayfly in Europe==. Males differ from females in having turbinate eyes.
The C. dipterum genome was assembled in 1395 scaffolds, with an N50 of 0.461 Mb. The total genome assembly length of C. dipterum is ==180 Mb==, which in comparison to other pterygote species, constitutes a relatively compact genome, probably due to ==the low fraction of transposable elements (TEs)== (5%, in contrast to the median of 24% ± 12% found in other insects15).
- We performed a temporal soft-clustering analysis of stage-specific transcriptomes using Mfuzz and ==focused on clusters containing genes whose expression peaks at each of these phases==.
- Clusters of genes transcribed preferentially during embryonic stages, such as ==cluster 21 and 8==, showed enrichment in Gene Ontology (GO) terms that reflected the processes of ==embryogenesis and organogenesis happening== during these stages (i.e. regulation of gene expression, neuroblast fate commitment, DNA binding, etc., Fig. 2a).
- Clusters with genes highly expressed during the aquatic phase (e.g., ==cluster 18==, Fig. 2b) presented GO enrichment in terms consistent with the continued moulting process that mayfly nymphs undergo, such as ==chitin-based cuticle development and defence response==.
- ==Cluster 30==, which contained genes with the highest expression during adulthood showed a striking enrichment of GO terms associated with ==visual perception==.
- The aquatic phase is characterised by genes involved in the perception of chemical cues, whereas vision becomes the main sensory system during the terrestrial/aerial adult phase.
-
We examined the expression of CS genes in specific tissues and organs and found that most of the ==CS genes were expressed in a highly tissue-specific manner,== as indicated by high (>0.8) tau values.
Tau index: we calculated which genes had a tau index (from 0 to 1) higher than 0.8 through the formula: sum(1 − x/max(x))/(length(x) − 1), according to their cRPKM values.
-
We found that more than half of CS genes, and among these, nearly 80% of OBPs (147 out of 276 CS genes and 121 out of 152 OBPs included in the soft-clustering analysis) were mostly assigned to just four clusters (18, 12, 9 and 10), ==all of them related to nymphal or pre-nymphal stages==.
-
The gills, where 34% of the 276 CS genes were expressed, (5/16 CSP, 8/34 IR and 82/167 OBP genes). These included 25 gill specific OBPs, several of which (OBP219, OBP199 and OBP260) were shown to be expressed ==in discrete cell clusters within the gills by in situ hybridization, often located at the branching points of their tracheal arborization==.
horseradish peroxidase (HRP), a marker of insect neurons. Remarkably, OBP expressing cell clusters were HRP-positive, suggesting a neurosensory nature.
- These results strongly suggest that, beyond their respiratory role, ==the gills are a major chemosensory organ of the aquatic mayfly nymph==.
- This ==LWS (long wave sensitive) cluster== was also present within Ephemeroptera(蜉蝣目) in the distantly related Ephemeridae, Ephemera danica, emphasizing the importance that these light-sensing molecules have had in the evolution and ecology of the entire mayfly group.
- We also found that the ==blue-sensitive opsin== underwent independent duplications in the Ephemeridae and Baetidae. E. danica has three different blue-Ops, while Baetis species with available transcriptomic data and C. dipterum have two blue-Ops, which in the latter case are located together in tandem.
- Their delayed expression onset and their sexual dimorphism strongly suggested that ==both UV-Ops4 and blue-Ops2 are turbanate eye-specific opsins==, while the rest of C. dipterum opsins function in the lateral compound eyes and/or ocelli of males and females.
- In situ hybridization assays showed that the shared ==UV-Ops2 was expressed in the compound eye of both sexes== (Fig. 4e, f), while ==UV-Ops4 was predominantly expressed in the turbanate eyes of males== but undetectable in females.
加权基因共表达网络分析 (WGCNA, Weighted correlation network analysis)是用来描述不同样品之间基因关联模式的系统生物学方法,可以用来鉴定高度协同变化的基因集, 并根据基因集的内连性和基因集与表型之间的关联鉴定候补生物标记基因或治疗靶点。
相比于只关注差异表达的基因,WGCNA利用数千或近万个变化最大的基因或全部基因的信息识别感兴趣的基因集,并与表型进行显著性关联分析。一是充分利用了信息,二是把数千个基因与表型的关联转换为数个基因集与表型的关联,免去了多重假设检验校正的问题。
Module:(模块)指表达模式相似的基因聚为一类,这样的一类基因称为模块。
-
This analysis revealed deeply ==conserved co-regulated gene modules associated with muscle, gut, ovaries, brain and Malpighian tubules (insect osmoregulatory organ)==, among others, indicating that the shared morphological features of the pterygote body plan are mirrored by deep transcriptomic conservation.
-
One of the highly correlated pairs of ==modules between mayfly and fruitfly corresponded to the wings== (wing pad and wing imaginal disc modules, respectively).
-
This gene set exhibited an enrichment in GO terms such as ==morphogenesis of a polarized epithelium, consistent with wing development==, and in agreement with this, some of these orthologues have been shown to participate in wing development in Drosophila.
-
We functionally tested this hypothesis for eight of these genes in Drosophila by ==RNAi knockdown assays== using the wing-specific nubbin-GAL4 driver. Indeed, these experiments resulted in wing phenotypes in all cases (8/8) and in particular, abnormal wing venation (7/8).
-
These results suggested that the transcriptomic programme responsible for the wings was assembled from genes that were already forming part of pre-existing gene networks in other tissues.
-
Clustering C. dipterum genes based on their expression across the different tissues, organs and developmental stages revealed that ==gills were the most closely related organ to (developing) wings==.
-
In fact, gills were the tissue that shared most specific genes with wing pads (42/98, 43%), as assessed by looking for genes with high expression only in wings and an additional tissue.
-
When interrogated with respect to gene expression associated with the two disparate environments that mayflies adapted to (aquatic and aerial), the data suggest a sensory specialization, with ==nymphs predominantly using chemical stimuli, whereas adults rely predominantly on their visual system.==
-
The presence of ==these chemosensory structures in the gills== challenges the classic idea of gills as exclusively respiratory organs.
-
Males use their dorsal turbanate eyes to locate females flying above their swarm, sexual selection might have played an important role during the evolution of UV and blue-opsin duplications and their specific expression in turbanate eyes of males.
-
Whatever the case, the transcriptomic similarities observed between gills and wings suggest that they share a common genetic programme.
The assembly was generated using the hybrid approach of Maryland Super-Read Celera Assembler (MaSuRCA) with 95.9x short-read Illumina and 36.3x Oxford Nanopore Technologies (ONT) long-read coverage.
37 RNA-seq datasets (including replicates) of multiple developmental stages (four embryonic stages) and dissected tissues and organs (nymphal and adult dissected organs and whole heads) were generated using the Illumina technology.
Reads from all paired-end transcriptomes were assembled altogether using Trinity and subsequently aligned to the genome using the Programme to Assemble Spliced Alignments (PASA) pipeline.
Reads were aligned to the genome using Spliced Transcripts Alignment to a Reference (STAR) and transcriptome assembled using Stringtie.
To obtain phylogeny-based orthology relationships between different taxa, the predicted proteomes of 14 species representing major arthropod lineages and outgroups were used as input for OrthoFinder2.
Mfuzz software was used to perform soft clustering of genes according to developmental and life history expression dynamics using normalised read counts.
Taking the orthology groups from C. dipterum or Strigamia maritima to D. melanogaster GO database from Ensembl BIOMART, we generated a topGO gene to GO key, by copying across all GO terms represented in each orthogroup.
-
We used the sequence database and the HMM profiles in the program BITACORA to (i) identify new CS members, or to curate the already annotated ones, among the pre-compiled gene models of these families (obtained with automatic methods), and (ii) to generate new models (of previously undetected copies) from the genomic sequences. Briefly, we performed various iterative rounds of BLASTP and HMMER searches against the automatically annotated proteins of C. dipterum and curated incorrect and incomplete (when possible) gene models. Also, we used TBLASTN against the genomic sequence to identify novel (not annotated by the automatic methods) regions encoding CS proteins. We generated a GTF file containing our curated annotation of C. dipterum CS genes.
-
First, we examined the presence of premature stop codons; these features could represent real non-functional copies (pseudogenes), errors in sequencing or genome assembly steps or inaccuracies in our automatic annotation step based on TBLASTN hits. All sequences encoding complete proteins (CPs) that were free of stop codons were included in the first category (CP set). Operationally, we considered a CP when its length was >80% of the corresponding average protein domain length.
-
for the GR and OR families, we also required that the CP members contained a minimum of 5 of the 7 transmembrane domains (defined by the software TMHMM version 2.0c 65; Phobius version 1.0166). For the CP IR/iGluR members, we required the presence of the ligand channel domains, namely, PF00060 (ligand-gated ion channel), present in all IR/iGluR subfamilies, i.e., kainate, AMPA, NMDA, conserved IRs (IR25a/IR8a), and divergent IRs 67. The remaining sequences that were free of stop codons and did not pass the length filter criteria were classified as incomplete proteins (IP set).
Specific primers were designed to generate DIG-labelled RNA probes against OBP260, OBP219, OBP199, UV-Ops2 and UV-Ops4.
A total of 1247 opsins from all the major metazoan groups were used as seeds in a BLAST search of the predicted protein sequences of C. dipterum, E. danica (Edan_2.0; https://www.ncbi.nlm.nih.gov/assembly/GCA_000507165.2/) and L. fulva (Lful_2.0; https://www.ncbi.nlm.nih.gov/assembly/GCA_000376725.2/).
To classify genes by origin (phylostratigraphy) we used an expanded dataset of 28 species . OrthoFinder2 was used to compute orthogroups.
Vienna Drosophila Research Centre (VDRC) lines were crossed to yw; nubbing-Gal4 (nub-Gal4); + line to express the RNAi constructs specifically in the wing.
To determine whether genes were expressed in a tissue-specific manner or ubiquitously throughout our developmental stages and organ samples, we calculated which genes had a tau index (from 0 to 1) higher than 0.8 through the formula: sum(1 − x/max(x))/(length(x) − 1), according to their cRPKM values.
We calculated which genes where expressed in wing pads and one of the other tissues preferentially, according to cRPKM.