Genomics and Comparative Genomics: lecture slides (基因组学与比较基因组学课件.ppt)

Uploader: 飞****2 | Document ID: 79048419 | Uploaded: 2023-03-19 | Format: PPT | Pages: 67 | Size: 2.48 MB

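Chapter 11: Genomics and Comparative Genomics
By Hongwei Guo, Peking University, 2008.12.29

Basic Molecular Biology: final exam arrangements
Exam time: January 12, 2009, 2:00-4:00 PM
Exam rooms: Teaching Building 3 (三教), Room 301 (60 students), Room 304 (38), Room 306 (37), Room 308 (37)

Genome projects
A genome is the complete set of genetic information in an organism, that is, the sum of all the DNA molecules in a cell of a given species.
Genomics is the discipline that studies and interprets all the genetic information in an organism's entire genome.
A genome project is the effort to sequence the whole genome of humans or of another organism.
The Human Genome Project (HGP) was proposed in the 1990s and is now essentially complete. Together with the atomic bomb of the 1940s and the Moon landing of the 1960s, it is regarded as one of the three great feats in the history of twentieth-century science and technology.

History of the Human Genome Project
  1990  Official start of HGP with $3 billion and a 15-year horizon
  1999  Sanger Centre publishes chromosome 22
  2001  Draft genome published: Celera and the public effort
  2003  Completion (almost) of the human genome
  Celera: Craig Venter
  Intl. Cons.: Francis Collins

Sequencing Strategies
[Figure: public-effort strategy vs. Celera strategy]

Celera's view of the International Consortium
  Unfair competition: the IC is delivering the same goods but with state funding.
International Consortium's view of Celera
  Unfair competition: Celera is delivering the same goods but can use IC data, while the IC cannot use Celera data.

Public effort strategy: construction of large-insert genomic DNA libraries
  YAC (yeast artificial chromosome): contains three essential components: a centromere, telomeres, and an origin of replication. It is the highest-capacity cloning vector to date; inserts average 200-1000 kbp and can reach 2 Mbp.
  BAC (bacterial artificial chromosome): a bacterial chromosome cloning vector built from the bacterial F plasmid and its regulatory genes, with a cloning capacity of about 125-150 kbp. BAC-based vectors form chimeras at low frequency, transform efficiently, and exist as circular molecules in the bacterial cell, which makes them easy to identify, isolate, and purify. They have been widely adopted.

Construction of BACs
  pBAC108L is derived from a small bacterial F plasmid. oriS and repE control the initiation of plasmid replication; parA and parB control copy number.
  Insert size: 100-150 kbp

Genetic map vs. physical map
  A genetic map, also called a linkage map, gives the relative positions (or distances) of genes or DNA markers on a chromosome. Distance is usually expressed as the frequency with which two genes or DNA segments are separated by crossing over, in centimorgans (cM). The larger the cM value, the greater the genetic distance between the two.
  A physical map is a genome map that gives the actual positions on the chromosome of DNA fragments of known sequence (sequence-tagged sites, STS). Distances between sites are measured in base pairs (bp, kb, Mb).

DNA markers
  The first-generation DNA marker is RFLP (restriction fragment length polymorphism). A small change in DNA sequence, even a change in a single nucleotide, can destroy or create a restriction enzyme cut site and thereby change the lengths of the restriction fragments.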
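As a rough illustration of the RFLP idea above (not part of the original slides), the sketch below digests a made-up linear sequence at the EcoRI recognition site GAATTC and shows how a single-base change removes one cut site and changes the fragment lengths. The sequences and the function name `digest` are hypothetical; a real analysis would also handle circular molecules and multiple enzymes.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset=1 models EcoRI, which cuts between G and AATTC.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical wild-type allele with two EcoRI sites.
wild = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
# Single-nucleotide change (A -> G) inside the second site: GAATTC -> GAGTTC.
mutant = wild[:20] + "G" + wild[21:]

print(digest(wild))    # [5, 14, 9]
print(digest(mutant))  # [5, 23]
```

On a gel, the two alleles would differ in fragment pattern (three fragments vs. two), which is what makes the site usable as a marker.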

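  The second-generation DNA marker, SSLP (simple sequence length polymorphism), exploits the large number of repetitive sequences in the human genome. These include minisatellite DNA, with repeat units of about 15-65 nucleotides, and microsatellite DNA, with repeat units of 2-6 nucleotides.
  The third-generation DNA marker, SNP (single nucleotide polymorphism), is also the most widespread genetic marker: a difference in a single base, scattered throughout the genome. Such differences include single-base deletions and insertions, but single-nucleotide substitutions are more common. More than 10 million SNP sites have been found in the human genome so far, on average one SNP every 300 bp.

[Figures: RFLP marker; SSLP markers (WT vs. mut); SNP marker]

Comparison of the genetic map (right) and physical map (left) of yeast chromosome III
  Because the experimental methods differ, the genetic distance between many markers does not correspond to their distance on the physical map.

Shotgun sequencing
  Whole-genome shotgun sequencing: randomly pick plasmids carrying genomic DNA inserts, sequence the insert ends, then assemble the sequences with a computer program.
  The main drawbacks of shotgun sequencing are, first, that the number of fragments to be sequenced grows greatly as the genome gets larger and, second, that the genomes of higher eukaryotes (such as humans) contain many repetitive sequences, which lead to assembly errors.

Mate-Pair Shotgun DNA Sequencing
[Figure: DNA target sample; shear and size (e.g., 10 Kbp, 8% std. dev.); clone and end sequence; end reads / mate pairs (590 bp reads from a 10,000 bp insert)]

Large-scale DNA sequence assembly
  The DNA sequence assembly problem resembles the shortest superstring problem in combinatorics: given a set of strings, find a shortest string (the superstring) that contains every element of the set as a substring.

Popular Assemblers
  TIGR Assembler (TIGR)
  Phrap (Wash U)
  Celera Assembler (Celera, TIGR)
  Arachne (MIT Broad)
  Phusion (Sanger uses Phrap)
  Atlas (Baylor HGSC)

Assembly of the Individual Sequences
  Individual sequencing reads are compared to each other, and where they overlap they can be assembled to create contigs.
  Keep adding individual sequencing reads to build larger and fewer contigs.
  Eventually all sequencing reads merge into a single consensus sequence (a large contig) for each chromosome.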
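The overlap-and-merge process described above can be sketched as a greedy approximation to the shortest superstring problem. This is a toy illustration under strong assumptions (error-free reads, a single strand, no repeats longer than the minimum overlap); it is not how any of the assemblers listed above is implemented, and the names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of contigs with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are wholly contained in another read.
    contigs = [r for r in contigs if not any(r != s and r in s for s in contigs)]
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no remaining overlaps: these are the final contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Three overlapping reads from a made-up 16-base "genome".
reads = ["AGTCCTGA", "ACGTTGCA", "TGCAAGTC"]
print(greedy_assemble(reads))  # ['ACGTTGCAAGTCCTGA']
```

With repeats longer than `min_len`, the greedy choice can join the wrong pair of reads, which is the failure mode the slides attribute to shotgun sequencing of higher eukaryotic genomes.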

Shotgun sequencing cannot resolve repetitive sequences in the genomes of higher eukaryotes

Improved shotgun method (adopted by both the IC and Celera)

High-throughput DNA sequencing
  The success of the Human Genome Project owes much to technical advances that cut the cost of DNA sequencing. Improved methods and increasing automation lowered the cost 100-fold, from $10/bp in the 1970s-80s to $0.1/bp at the start of this century.
  If a skilled analyst using early-1980s methods sequenced 1000 bp per day, the full human genome (about 3×10^9 bp) would take about 3×10^6 person-days, so roughly 100 such workers would need on the order of 100 years to finish it.
  High-throughput sequencers of the 2000s reach 1-6 Mbp per month.
  Today? $2 per 1 Mbp; 3 Gbp/machine/day

DNA sequencing technologies
  "Classical"
    Sanger dideoxy sequencing
  "Next Generation", commercialized
    Roche 454: pyrosequencing
    Solexa/Illumina: cyclical base addition
    ABI SOLiD: sequencing by ligation
  Single molecule (tethered DNA polymerase)
    Heliscope (cyclical base addition)
    VisiGen (real time, FRET-based)

  Instrument                             Throughput
  Applied Biosystems ABI 3730XL          1 Mb/day
  Roche/454 Genome Sequencer FLX         100 Mb/run
  Illumina/Solexa Genome Analyzer        2000 Mb/run
  Applied Biosystems SOLiD               3000 Mb/run

Sanger sequencing
[Figure: four-step workflow; reads are maybe 800 bp long]

Roche/454: Genome Sequencer FLX
  Real-time sequencing by synthesis
  Chemiluminescence detection in picotiter plates
  Amplification: emulsion PCR
  Pyrosequencing
  Up to 400,000 reads/run
  On average 250 bases/read
  Up to 100 Mb/run

Pyrosequencing: 454 Sequencing
  Genome sequencing in microfabricated high-density picolitre reactors. Margulies, M., Egholm, M., et al. Nature. 2005 Sep 15;437(7057):376-80.
[Figure: a section of pyrosequencing reads]

Illumina/Solexa: Genome Analyzer
  Real-time sequencing by synthesis
  Clonal Single Molecule Array
  Amplification: bridging PCR
  60 million reads/run
  Up to 50 bases/read
  2 Gb/run
  8 channels, approx. 5 million reads/channel
  Fluorescent labels
  Reversible 3' OH blocking

Reversible terminator-based sequencing (Solexa)
  Fragment DNA and ligate adaptors

Comparison
                               WGS                 454                                            Solexa
  Cloning                      Yes                 No                                             No
  Chemistry                    Sanger              Pyrosequencing                                 Reversible terminators
  Cost                         [values garbled in source: "$ to $"]
  Accuracy                     Consensus 99.99%    Single read 99.5%; consensus 99.99%            ?
  Assembly                     Best                Better                                         Bad
  Gap closure and finishing    Tough               Tougher                                        Possible?

Applied Biosystems SOLiD
  Real-time sequencing by ligation
  Emulsion PCR and beads on slides
  85 million reads/run
  Up to 35 bases/read
  3 Gb/run
  Dual fluorescent labels
  8 individual channels/flowcell
  2 flowcells/run

Oligonucleotide Ligation & Detection (SOLiD)
SOLiD: Substrate attachment; dibase probes
  Make sequencing library by shearing and adapter ligation
  Attach DNA fragments to beads and amplify polonies in emulsion
  Attach beads to slide
SOLiD: Sequencing ligation cycles
SOLiD: Data collection and image analysis
SOLiD: Decoding the sequence
  Mardis 2008

Comparison of "Next Generation" Sequencing Technologies
Applications: Genome Analyzer

Single Molecule Sequencing Technologies: on the horizon
  Array of tethered DNA polymerase molecules
  Bound to template strand + primer
  Heliscope: cyclical base addition (similar to Solexa)
  VisiGen: real time, imaging FRET flashes
  Hopeful prediction: 1 Mb/sec

"Next Generation" Sequencing Technologies: Rate Limiting Factors
  Front end: making the sequencing library
  Back end: bioinformatics to make sense of the "sequence tsunami" (assembly)

What do we sequence? (applications)
  De novo genome sequencing
  Genome resequencing (SNP identification)
  Metagenomes or complex samples
  Transcriptome profiling
  Small RNA identification

Examples of Applications of "Next Generation" Sequencing Technologies
  Best for "re-sequencing", i.e., aligning generated sequence to a reference genome
  Next-generation DNA technologies may replace microarrays for some applications
  Shendure & Ji 2008

The "$10,000 human genome sequencing" prize
  To the first team that can build a device and use it to sequence:
    100 human genomes within 10 days or less,
    Accuracy: at most 1 error per 100,000 bases,
    Accurate coverage of at least 98% of the genome,
    Recurring cost of no more than $10,000 (US) per genome.
  Prize: $10 million
  Deadline: 12:01 AM PST, October 4, 2013
  Donors: X Prize Foundation, J. Craig Venter Foundation

Human HapMap Project
  The HapMap is constructed in three steps:
  (a) Identify single nucleotide polymorphisms (SNPs) in DNA samples from multiple individuals.
  (b) Group adjacent SNPs that are inherited together, and whose frequency in the population exceeds 1%, into haplotypes.
  (c) Within the haplotypes, find the tag SNPs that identify those haplotypes.
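Step (c) above can be illustrated with a small brute-force search for the fewest SNP positions that still tell a set of haplotypes apart. This is only a sketch under stated assumptions: the four haplotypes are invented, the data are already phased, and `tag_snps` and `identify` are hypothetical helper names. The HapMap project itself used statistical methods on real population data.

```python
from itertools import combinations


def tag_snps(haplotypes):
    """Return the smallest tuple of SNP positions that distinguishes all haplotypes."""
    n_sites = len(haplotypes[0])
    for size in range(1, n_sites + 1):
        for cols in combinations(range(n_sites), size):
            signatures = {tuple(h[c] for c in cols) for h in haplotypes}
            if len(signatures) == len(haplotypes):
                return cols
    return None  # some haplotypes are identical and cannot be separated


def identify(alleles_at_tags, haplotypes, cols):
    """Look up which haplotype matches the alleles observed at the tag SNPs."""
    for index, h in enumerate(haplotypes):
        if tuple(h[c] for c in cols) == tuple(alleles_at_tags):
            return index
    return None


# Four invented haplotypes, each listing the allele at six SNP positions.
haplotypes = ["ACGTAC", "ATGTGC", "GCGAAC", "GTGAGT"]
tags = tag_snps(haplotypes)
print(tags)                                   # (0, 1)
print(identify(("G", "C"), haplotypes, tags))  # 2
```

Here typing two of the six positions is enough to recover the haplotype, which is the saving that tag SNPs are meant to provide.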

  Genotyping the three tag SNPs shown in the figure is enough to determine which haplotype each individual carries.
  SCIENCE 315:1781 (30 March 2007)

Metagenomes or complex samples

Transcript maps (transcript profiling)
  The gene transcript map (cDNA map), or map of gene cDNA fragments, that is, the expressed sequence tag (EST) map, is an important component of the genome map.
  The main procedure for large-scale EST production is: isolate total mRNA from a specific tissue at a given developmental stage or under a given physiological condition, synthesize double-stranded cDNA, clone it into a plasmid, and sequence both ends.

Pyrosequencing Provides Evidence for Novel Transcripts and Transcript Architecture

Small RNA identification (e.g., microRNA)

Genome projects completed by the end of 2006 (http://www.genomesonline.org/)
  According to data from January 2007, 2296 genome projects had been launched worldwide, of which 607 were complete. 481 genome sequences had been published: 403 bacterial, 33 archaeal, and 45 eukaryotic.

Other genome projects
  Progress of the world's major genome projects as of December 2006

Why do we sequence genomes?
  To catalog all the genes present in one organism.
  To compare the gene content of one organism to another organism.
  To study gene/genome evolution.
  To study organismal evolution.
  As a foundation for future experimentation.

Comparison of the genomes of model organisms
  Species                        Genome size    Estimated gene number
  Mycoplasma genitalium          580 kb         467
  Mycoplasma pneumoniae          816 kb         677
  Haemophilus influenzae         1.8 Mb         1709
  Bacillus subtilis              4.2 Mb         4100
  Escherichia coli               4.6 Mb         4288
  Saccharomyces cerevisiae       13 Mb          6275
  Caenorhabditis elegans         100 Mb         18891
  Arabidopsis thaliana           125 Mb         25498
  Drosophila melanogaster        165 Mb         14113
  Homo sapiens                   3 Gb           about 25,000

Comparative genomics
  The power of comparative genomics is that knowledge of the relevant genes in one organism can be used to understand, interpret, and even clone and isolate the genes of another.
  Comparisons between distantly related genomes provide a theoretical basis for understanding how general biological mechanisms are and for finding the experimental models needed to study complex physiological and pathological processes. Comparisons between closely related genomes provide parameters for understanding details such as gene structure and function.

  Species        Gene number      Transcription factors    Proportion (%)
  Arabidopsis    about 29388      1533                     5.9
  Yeast          about 5885       209                      3.5
  C. elegans     about 18891      669                      3.5
  Drosophila     about 13379      635                      4.5

Functional Categories in Eukaryotic Proteomes

Applications to Medicine
  1. A key application of human genome research is to find disease genes by positional cloning.
  2. This method involves mapping the chromosomal region containing the gene by linkage analysis in affected families.
  3. The human genomic sequence in public databases allows rapid identification in silico of candidate genes, followed by mutation screening of relevant candidates, aided by information on gene structure.
  4. For a Mendelian disorder, a gene search can now often be carried out in a matter of months with only a modestly sized team.

Next Steps on HGP
  1. Finish the human sequence
  2. Large-scale identification of regulatory regions
  3. Sequencing of additional large genomes
  4. Completing the catalogue of human variation
  5. From sequence to function

Future Technology Development
  1. Functional genomics aims to understand how genes are regulated and what they do, largely through massively parallel studies of gene expression in a variety of tissues.
  2. Proteomics promises to make the identity of each protein known and to elucidate protein-protein interactions.
  3. Bioinformatics enhances the ability of researchers to collect, manipulate, and analyze data more quickly and in new ways.

Happy New Year to all students, and good luck on the exam!
