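Brassica juncea (mustard-type rapeseed), a widely cultivated crop, produces seeds of different colors. Seed pigmentation results from the deposition of proanthocyanidins (PAs) in the endothelial cells of the seed coat; these end products are formed through a branch of the flavonoid biosynthetic pathway. To better understand the gene regulatory network underlying seed pigmentation in B. juncea, the researchers used the Illumina/Solexa sequencing platform to sequence the seed coat transcriptomes of a yellow-seeded inbred line (SY) and its brown-seeded near-isogenic line (NILA), and assembled the data de novo. More than 116 million high-quality reads were assembled into 69,605 unigenes, of which about 71.5% (49,758 unigenes) could be aligned to the Nr protein database at an E-value cutoff of 1E-5. RPKM analysis showed that, relative to the yellow seed coat, 802 unigenes were up-regulated and 502 were down-regulated in the brown seed coat. Biological pathway analysis identified 46 unigenes involved in flavonoid biosynthesis. In the yellow-seeded line, the unigenes encoding dihydroflavonol reductase (DFR), leucoanthocyanidin dioxygenase (LDOX), and anthocyanidin reductase (ANR), which act in late flavonoid biosynthesis, were not expressed or were expressed at very low levels. This suggests that these PA biosynthesis genes may be associated with seed coat color in B. juncea, a result that qRT-PCR further confirmed.
This study was carried out by Professor Liu Zhongsong's group at the Oilseed Crops Institute of Hunan Agricultural University. It is the first report of seed coat transcriptome sequencing in B. juncea. The unigenes obtained will help elucidate the molecular mechanism of seed coat pigmentation in B. juncea and provide a foundation for future genomics research on this species. The Illumina HiSeq 2000 sequencing and data analysis services used in the study were provided by Shanghai Biotechnology Corporation (上海伯豪生物技术有限公司).
Transcriptome sequencing and de novo assembly: The SY and NILA transcriptomes were sequenced. After removal of adapter sequences, low-quality sequences, and similar noise, high-quality reads were obtained; the results are shown in the table below: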
Figure 2 Overview of the Brassica juncea seed coat transcriptome assembly. (A) The size distribution of the scaffolds; (B) The size distribution of unigenes.
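Using CLC Genomics Workbench 4.9, the high-quality reads were assembled into 99,096 contigs, each at least 200 bp long. Paired-end joining and gap filling then produced 79,520 scaffolds, with an average scaffold length of 200 bp. Figure 2A shows the scaffold size distribution. CD-HIT (v4.5.4) was then used to further assemble the scaffolds into 69,605 unigenes, whose size distribution is shown in Figure 2B. Detailed parameters are given in the table.
Functional annotation: The unigenes were compared by BLAST against the NCBI non-redundant (Nr) protein database with an E-value cutoff of 1E-5. About 71.5% (49,758 unigenes) could be aligned to the database. Of these, 62.01% had E-values below 1E-50, indicating strong homology, and the remaining 37.99% had E-values between 1E-5 and 1E-50 (Figure 3A). Similarity analysis of the Nr-matched unigenes showed that 50.06% had similarity above 90%, 46.21% had similarity between 60% and 90%, and only 3.72% had similarity below 60% (Figure 3B). The species distribution showed that about 96.49% of the Nr-matched unigenes matched six top species: Arabidopsis lyrata (46.59%), Arabidopsis thaliana (40.59%), salt cress (3.37%), and three others, all of which belong to the Brassicaceae. The 20 top species are named in Figure 3C.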
Figure 3. Characteristics of homology search of Brassica juncea seed coat unigenes. (A) E-value distribution of the top Blastx hits against the non-redundant (Nr) protein database for each unigene; (B) Similarity distribution of the best Blastx hits for each unigene; (C) Number of unigenes matching the 20 top species using Blastx in the Nr database.
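The E-value thresholds described in the functional annotation step (a 1E-5 cutoff, with 1E-50 separating strong from moderate hits) can be illustrated with a minimal Python sketch. The unigene IDs and E-values below are made up for illustration, and the function names are hypothetical; whether the cutoff itself is inclusive is an assumption, since the source does not say.

```python
from collections import Counter


def evalue_bin(evalue: float, cutoff: float = 1e-5, strong: float = 1e-50) -> str:
    """Place a best-hit E-value into one of the bins described in the text."""
    if evalue > cutoff:
        return "rejected"   # fails the 1E-5 cutoff, so not counted as annotated
    if evalue < strong:
        return "strong"     # below 1E-50: strong homology
    return "moderate"       # between 1E-50 and 1E-5


def summarize(best_hits: dict[str, float]) -> Counter:
    """Count how many unigenes fall into each E-value bin."""
    return Counter(evalue_bin(e) for e in best_hits.values())


# Hypothetical best-hit E-values keyed by made-up unigene IDs.
hits = {"u1": 0.0, "u2": 3e-72, "u3": 4e-20, "u4": 2e-6, "u5": 0.3}
counts = summarize(hits)
annotated = counts["strong"] + counts["moderate"]
print(dict(counts), "annotated:", annotated)
```

Dividing the "strong" and "moderate" counts by the number of annotated unigenes gives the kind of split reported for Figure 3A.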
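Figure 4. GO classification of unigenes of B. juncea seed coat.
GO classification: GO classification was performed on the basis of sequence homology. Of all assembled unigenes, 19,618 could be assigned to 37 functional groups. Across the three main categories (biological process, cellular component, and molecular function), 31,026, 22,918, and 26,267 GO term assignments were made, respectively. See Figure 4.
COG classification: The 49,758 unigenes with matches in the Nr protein database were subjected to COG classification, and 25,140 of them clustered into 25 functional categories. The largest category was signal transduction mechanisms (10,471 unigenes, about 41.6%). See Figure 5.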
Figure 5. COG function classification of transcriptome.
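KEGG classification: Analysis of the 69,605 unigenes against the Kyoto Encyclopedia of Genes and Genomes (KEGG) mapped 14,998 unigenes to 258 pathways. The most represented was metabolic pathways (3,506 unigenes, about 23.37%), followed by biosynthesis of secondary metabolites (1,785; 11.9%), microbial metabolism in diverse environments (802; 5.35%), RNA degradation (538; 3.59%), and ribosome (535; 3.57%). Focusing on the secondary metabolite biosynthesis pathways related to seed coat pigmentation in B. juncea, the researchers found 154 unigenes involved in phenylpropanoid biosynthesis, 114 in phenylalanine, tyrosine, and tryptophan biosynthesis, 46 in flavonoid biosynthesis, and 9 in flavone and flavonol biosynthesis.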
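The pathway percentages quoted for the KEGG step are shares of the 14,998 mapped unigenes. As a quick arithmetic check, the following Python sketch recomputes each share from the counts given in the text; the English pathway names are translations, and the helper name is hypothetical.

```python
def percent_share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


TOTAL_MAPPED = 14_998  # unigenes mapped to KEGG pathways, as reported

# Pathway name: (unigene count, percentage reported in the text).
pathways = {
    "Metabolic pathways": (3506, 23.37),
    "Biosynthesis of secondary metabolites": (1785, 11.9),
    "Microbial metabolism in diverse environments": (802, 5.35),
    "RNA degradation": (538, 3.59),
    "Ribosome": (535, 3.57),
}

shares = {name: percent_share(n, TOTAL_MAPPED) for name, (n, _) in pathways.items()}
for name, (n, reported) in pathways.items():
    print(f"{name}: {n} unigenes, {shares[name]:.2f}% (reported {reported}%)")
```

The recomputed values agree with the reported ones to within rounding in the second decimal place.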
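Identification of transcription factors in the seed coat transcriptome: All assembled unigenes were compared by Blastx against the AGRIS (Arabidopsis Gene Regulatory Information Server) database, requiring an E-value below 1E-5 and identity above 70%. In total, 2,347 unigenes were putatively assigned to 48 transcription factor families, including MYB (100 unigenes) and bHLH (190 unigenes), two families associated with flavonoid biosynthesis in plants.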
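The two-threshold filter used for transcription factor assignment (E-value below 1E-5 and identity above 70%) can be sketched in Python as follows. The `Hit` record, the example hits, and the rule of counting each unigene once per family are illustrative assumptions, not details taken from the study.

```python
from collections import Counter
from typing import NamedTuple


class Hit(NamedTuple):
    unigene: str
    family: str      # transcription factor family of the matched entry
    identity: float  # percent identity of the alignment
    evalue: float


def passes(hit: Hit, max_evalue: float = 1e-5, min_identity: float = 70.0) -> bool:
    """Apply both thresholds quoted in the text."""
    return hit.evalue < max_evalue and hit.identity > min_identity


def family_counts(hits: list[Hit]) -> Counter:
    """Count unigenes per family, counting each unigene once per family."""
    accepted = {(h.unigene, h.family) for h in hits if passes(h)}
    return Counter(family for _, family in accepted)


# Hypothetical hits; IDs and values are made up.
example_hits = [
    Hit("u1", "MYB", 85.2, 1e-40),
    Hit("u1", "MYB", 78.0, 1e-30),   # second hit for u1, not counted twice
    Hit("u2", "bHLH", 91.0, 1e-60),
    Hit("u3", "bHLH", 65.0, 1e-45),  # identity too low
    Hit("u4", "MYB", 88.0, 1e-3),    # E-value too high
]
tf_counts = family_counts(example_hits)
print(dict(tf_counts))
```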
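Differentially expressed transcripts between yellow and brown seed coats: To compare gene expression between the two seed coat colors, the expression of the 69,605 assembled unigenes in each was analyzed by RPKM. A total of 1,304 unigenes were differentially expressed: relative to the yellow seed coat, 802 were up-regulated and 502 were down-regulated in the brown seed coat. Among them, 170 (12.8%) differed 15-fold and 471 (36.4%) differed 2- to 3-fold; see Figure 6. Annotation of the differentially expressed unigenes showed that 455 could be assigned to GO groups, while 849 could not be classified; see Figure 7.
Figure 6. The fold change distribution of unigenes differentially expressed between the yellow- and brown-seeded testa of Brassica juncea.
Figure 7. Functional categories of unigenes differentially expressed between the yellow- and brown-seeded testa of Brassica juncea.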
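RPKM (reads per kilobase of transcript per million mapped reads) normalizes a read count by transcript length and sequencing depth, which is what makes the brown and yellow libraries comparable. The Python sketch below shows the standard RPKM formula and a log2 fold change; the read counts, the library sizes, and the small pseudocount used to avoid division by zero are illustrative assumptions, and the source does not state the thresholds used to call differential expression.

```python
import math


def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(rpkm_brown: float, rpkm_yellow: float,
                     pseudocount: float = 0.01) -> float:
    """log2(brown / yellow); the pseudocount (an assumption) guards against zeros."""
    return math.log2((rpkm_brown + pseudocount) / (rpkm_yellow + pseudocount))


# Hypothetical 1,500 bp unigene with made-up read counts and library sizes.
brown = rpkm(read_count=600, gene_length_bp=1500, total_mapped_reads=50_000_000)
yellow = rpkm(read_count=30, gene_length_bp=1500, total_mapped_reads=60_000_000)
lfc = log2_fold_change(brown, yellow)
print(f"brown RPKM={brown:.2f}, yellow RPKM={yellow:.2f}, log2FC={lfc:.2f}")
```

A positive value means higher expression in the brown seed coat, matching the "up-regulated in brown" direction used in the text.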
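Seed coat transcriptome genes related to the flavonoid biosynthesis pathway: In Arabidopsis, PA biosynthesis genes are expressed in seed endothelial cells from 3 days after pollination (DAP). In the B. juncea seed coat, PA accumulation does not appear until 10 DAP. Microarray results showed that during brown seed development in Ethiopian mustard, at 22 DAP (silique formation), six flavonoid genes (CHS, F3H, FOMT, DFR, GST, and TTG1) were up-regulated and two (F3'H and FLS) were down-regulated, a pattern not observed during yellow seed development. A comparison of the secondary-wall-rich seed coat of Brassica napus with its hypocotyl found that transcripts of the flavonoid biosynthesis genes ANR, FLS, and CHS were more abundant in the seed coat. This means that flavonoid biosynthesis genes are highly expressed in the seed coat, consistent with PA deposition there.
Figure 8 shows the expression of flavonoid biosynthesis genes in B. juncea. Previous studies have shown that DFR, LDOX, and ANR are involved in PA synthesis and that DFR and LDOX are not expressed in yellow-seeded Brassica plants. This study found that DFR and ANR were almost unexpressed in the yellow-seeded line but highly expressed in the brown-seeded line, and that LDOX expression was also higher in the brown-seeded line than in the yellow-seeded line. These results indicate that absent or low expression of PA biosynthesis genes accounts for the lack of PA accumulation in yellow-seeded B. juncea.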
Figure 8. Unigenes involved in the flavonoid biosynthesis pathway in seed coat of Brassica juncea. Abbreviations: ANR, anthocyanidin reductase; CHS, chalcone synthase; CHI, chalcone isomerase; DFR, dihydroflavonol 4-reductase; F3H, flavanone 3-hydroxylase; F3'H, flavonoid 3'-hydroxylase; FLS, flavonol synthase; LDOX, leucoanthocyanidin dioxygenase.
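qRT-PCR analysis of genes in the flavonoid biosynthesis pathway: To further validate the RPKM results, eight unigenes involved in flavonoid biosynthesis (Figure 9) were analyzed by qRT-PCR. In the brown seed coat, Unigene_920 (CHS), Unigene_29246 (CHI), Unigene_7597 (DFR), Unigene_7701 (LDOX), and Unigene_16036 (ANR) were up-regulated, Unigene_28310 (FLS) was down-regulated, and Unigene_682 (F3H) and Unigene_396 (F3'H) showed no obvious change. These results are consistent with the RPKM analysis.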
Figure 9. qRT-PCR validation of RPKM analysis of the eight unigenes involved in flavonoid biosynthesis of Brassica juncea seed coat.
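The summary does not say how relative expression was calculated from the qRT-PCR data. One widely used approach is the 2^(-ΔΔCt) method, sketched below in Python purely as an illustration; the Ct values, the use of a single reference gene, and the assumption of roughly 100% amplification efficiency are all hypothetical and not taken from the study.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Relative expression by the 2^(-ddCt) method (assumes ~100% PCR efficiency)."""
    delta_sample = ct_target_sample - ct_ref_sample              # dCt in the sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator  # dCt in the calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Made-up Ct values: brown seed coat as the sample, yellow seed coat as the calibrator.
fold = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=18.0,          # brown: target gene, reference gene
    ct_target_calibrator=27.0, ct_ref_calibrator=18.0,  # yellow: target gene, reference gene
)
print(f"Relative expression (brown vs. yellow): {fold:.1f}-fold")
```

A value above 1 corresponds to up-regulation in the sample relative to the calibrator, and a value below 1 to down-regulation.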
Original article: De Novo Transcriptome of Brassica juncea Seed Coat and Identification of Genes for the Biosynthesis of Flavonoids
Abstract: Brassica juncea, a worldwide cultivated crop plant, produces seeds of different colors. Seed pigmentation is due to the deposition in endothelial cells of proanthocyanidins (PAs), end products from a branch of flavonoid biosynthetic pathway. To elucidate the gene regulatory network of seed pigmentation in B. juncea, transcriptomes in seed coat of a yellow-seeded inbred line and its brown-seeded near-isogenic line were sequenced using the next-generation sequencing platform Illumina/Solexa and de novo assembled. Over 116 million high-quality reads were assembled into 69,605 unigenes, of which about 71.5% (49,758 unigenes) were aligned to Nr protein database with a cut-off E-value of 1E-5. RPKM analysis showed that the brown-seeded testa up-regulated 802 unigenes and down-regulated 502 unigenes as compared to the yellow-seeded one. Biological pathway analysis revealed the involvement of forty six unigenes in flavonoid biosynthesis. The unigenes encoding dihydroflavonol reductase (DFR), leucoanthocyanidin dioxygenase (LDOX) and anthocyanidin reductase (ANR) for late flavonoid biosynthesis were not expressed at all or at a very low level in the yellow-seeded testa, which implied that these genes for PAs biosynthesis be associated with seed color of B. juncea, as confirmed by qRT-PCR analysis of these genes. To our knowledge, it is the first time to sequence the transcriptome of seed coat in Brassica juncea. The unigene sequences obtained in this study will not only lay the foundations for insight into the molecular mechanisms underlying seed pigmentation in B. juncea, but also provide the basis for further genomics research on this species or its allies.