An Introduction to Parallel Programming (lecture slides)


Roadmap
- Why we need ever-increasing performance.
- Why we're building parallel systems.
- Why we need to write parallel programs.
- How do we write parallel programs?
- What we'll be doing.
- Concurrent, parallel, distributed!

Changing times
- From 1986 to 2002, microprocessors were speeding along like a rocket, increasing in performance an average of 50% per year.
- Since then, it's dropped to about 20% per year.

An intelligent solution
- Instead of designing and building faster microprocessors, put multiple processors on a single integrated circuit.

Now it's up to the programmers
- Adding more processors doesn't help much if programmers aren't aware of them or don't know how to use them.
- Serial programs don't benefit from this approach (in most cases).

Why we need ever-increasing performance
- Computational power is increasing, but so are our computational problems and needs.
- Problems we never dreamed of have been solved because of past increases, such as decoding the human genome.
- More complex problems are still waiting to be solved, for example: climate modeling, protein folding, drug discovery, energy research, data analysis.

Why we're building parallel systems
- Up to now, performance increases have been attributable to the increasing density of transistors.
- But there are inherent problems.

A little physics lesson
- Smaller transistors = faster processors.
- Faster processors = increased power consumption.
- Increased power consumption = increased heat.
- Increased heat = unreliable processors.

Solution
- Move away from single-core systems to multicore processors.
- "core" = central processing unit (CPU).
- Introducing parallelism!

Why we need to write parallel programs
- Running multiple instances of a serial program often isn't very useful. Think of running multiple instances of your favorite game: what you really want is for it to run faster.

Approaches to the serial problem
- Rewrite serial programs so that they're parallel.
- Write translation programs that automatically convert serial programs into parallel programs.
  - This is very difficult to do, and success has been limited.

More problems
- Some coding constructs can be recognized by an automatic program generator and converted into a parallel construct.
- However, it's likely that the result will be a very inefficient program.
- Sometimes the best parallel solution is to step back and devise an entirely new algorithm.

Example
- Compute n values and add them together.
- Serial solution (a minimal sketch follows):
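A minimal runnable sketch of the serial solution in C. The slides treat Compute_next_value as a black box; the body given to it here is a made-up stand-in so the example runs:

    #include <stdio.h>

    /* Compute_next_value is a black box on the slides; this body is a
     * made-up stand-in so the sketch compiles and runs. */
    static double Compute_next_value(int i) {
        return (double)(i % 10);   /* arbitrary placeholder values */
    }

    int main(void) {
        int n = 24;                /* number of values, as in the slides' example */
        double sum = 0.0;          /* the single running total */
        for (int i = 0; i < n; i++)
            sum += Compute_next_value(i);   /* one core does all the work */
        printf("sum = %f\n", sum);
        return 0;
    }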

Example (cont.)
- We have p cores, with p much smaller than n.
- Each core computes a partial sum of approximately n/p values.
- Each core uses its own private variables and executes this block of code independently of the other cores (sketched below).
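A sketch of the block each core executes, using the book-style my_ prefix for private variables. It assumes my_rank (in 0..p-1), n, and p are set up by the surrounding parallel framework, and that n is divisible by p; this is not the slide's exact code:

    /* One core's share of the work. */
    int    my_n       = n / p;               /* values handled per core   */
    int    my_first_i = my_rank * my_n;      /* first index for this core */
    int    my_last_i  = my_first_i + my_n;   /* one past this core's last */
    double my_sum     = 0.0;                 /* private partial sum       */
    for (int my_i = my_first_i; my_i < my_last_i; my_i++)
        my_sum += Compute_next_value(my_i);  /* accumulate privately      */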

Example (cont.)
- After each core completes execution of the code, its private variable my_sum contains the sum of the values computed by its calls to Compute_next_value.
- E.g., with 8 cores and n = 24, the calls to Compute_next_value return:
  1, 4, 3, 9, 2, 8, 5, 1, 1, 6, 2, 7, 2, 5, 0, 4, 1, 8, 6, 5, 1, 2, 3, 9

Example (cont.)
- Once all the cores are done computing their private my_sum, they form a global sum by sending their results to a designated "master" core, which adds up the final result.

Example (cont.)

    Core     0    1    2    3    4    5    6    7
    my_sum   8   19    7   15    7   13   12   14

  Global sum: 8 + 19 + 7 + 15 + 7 + 13 + 12 + 14 = 95

  After the master adds:

    Core     0    1    2    3    4    5    6    7
    my_sum  95   19    7   15    7   13   12   14

But wait!
- There's a much better way to compute the global sum.

Better parallel algorithm
- Don't make the master core do all the work; share it among the other cores.
- Pair the cores so that core 0 adds its result to core 1's result, core 2 adds its result to core 3's result, etc., working with odd- and even-numbered pairs of cores.

Better parallel algorithm (cont.)
- Repeat the process, now with only the evenly ranked cores: core 0 adds the result from core 2, core 4 adds the result from core 6, etc.
- Then the cores divisible by 4 repeat the process, and so forth, until core 0 has the final result (sketched below).
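A minimal sketch of this pairwise, tree-structured reduction, written as serial C over an array of the eight partial sums so the pairing pattern is visible. In a real parallel program each addition at a given level would run on a different core, with a message receive in place of the array read:

    #include <stdio.h>

    /* Tree-structured global sum: at each level, every core whose rank is a
     * multiple of 2*half adds in the partial sum of the core half ranks away. */
    int main(void) {
        double sums[8] = {8, 19, 7, 15, 7, 13, 12, 14};  /* my_sum values from the slides */
        int p = 8;
        for (int half = 1; half < p; half *= 2)          /* log2(p) = 3 levels          */
            for (int rank = 0; rank + half < p; rank += 2 * half)
                sums[rank] += sums[rank + half];         /* "receive" and add           */
        printf("global sum = %f\n", sums[0]);            /* prints 95                   */
        return 0;
    }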

Multiple cores forming a global sum
- (Figure: the eight partial sums combined pairwise in a tree, three levels deep.)

Analysis
- In the first example, the master core performs 7 receives and 7 additions.
- In the second example, the master core performs only 3 receives and 3 additions.
- The improvement is more than a factor of 2!

Analysis (cont.)
- The difference is more dramatic with a larger number of cores. If we have 1000 cores:
  - The first example would require the master to perform 999 receives and 999 additions.
  - The second example would require only 10 receives and 10 additions.
- That's an improvement of almost a factor of 100!
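In general (a step the slides leave implicit): with p cores, the first scheme costs the master p - 1 receives and p - 1 additions, while the tree-structured scheme costs only the ceiling of log2(p) of each:

    \[
    p = 8:\ \; p - 1 = 7, \quad \lceil \log_2 8 \rceil = 3; \qquad
    p = 1000:\ \; p - 1 = 999, \quad \lceil \log_2 1000 \rceil = 10, \quad \tfrac{999}{10} \approx 100.
    \]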

How do we write parallel programs?
- Task parallelism: partition the various tasks carried out in solving the problem among the cores.
- Data parallelism: partition the data used in solving the problem among the cores; each core carries out similar operations on its part of the data.

Professor P
- 15 questions, 300 exams.

Professor P's grading assistants
- TA#1, TA#2, TA#3.

Division of work: data parallelism
- TA#1: 100 exams; TA#2: 100 exams; TA#3: 100 exams.

Division of work: task parallelism
- TA#1: questions 1-5; TA#2: questions 6-10; TA#3: questions 11-15.

Division of work: data parallelism
- (Figure: the same split applied to the global-sum example, with each core summing its own share of the values.)

Division of work: task parallelism
- Tasks: 1) receiving, 2) addition (the two kinds of work in forming the global sum).
- (A code sketch contrasting the two decompositions of the grading job follows.)
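A sketch contrasting the two decompositions on the grading example. The names grade_question, N_EXAMS, and so on are invented for illustration:

    /* Hypothetical illustration: 300 exams, 15 questions, 3 graders. */
    #define N_EXAMS     300
    #define N_QUESTIONS  15
    #define N_GRADERS     3

    /* Made-up stand-in for "grade one answer". */
    static void grade_question(int exam, int question) { (void)exam; (void)question; }

    /* Data parallelism: each grader handles ALL questions on ITS 100 exams. */
    static void grade_data_parallel(int my_rank) {
        int per = N_EXAMS / N_GRADERS;                       /* 100 exams each */
        for (int e = my_rank * per; e < (my_rank + 1) * per; e++)
            for (int q = 0; q < N_QUESTIONS; q++)
                grade_question(e, q);
    }

    /* Task parallelism: each grader handles ITS 5 questions on ALL 300 exams. */
    static void grade_task_parallel(int my_rank) {
        int per = N_QUESTIONS / N_GRADERS;                   /* 5 questions each */
        for (int e = 0; e < N_EXAMS; e++)
            for (int q = my_rank * per; q < (my_rank + 1) * per; q++)
                grade_question(e, q);
    }

    int main(void) {
        for (int r = 0; r < N_GRADERS; r++) grade_data_parallel(r);  /* serial stand-in for 3 TAs */
        for (int r = 0; r < N_GRADERS; r++) grade_task_parallel(r);
        return 0;
    }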

Coordination
- Cores usually need to coordinate their work.
- Communication: one or more cores send their current partial sums to another core.
- Load balancing: share the work evenly among the cores so that no core is heavily loaded.
- Synchronization: because each core works at its own pace, make sure cores do not get too far ahead of the rest.

What we'll be doing
- Learning to write programs that are explicitly parallel.
- Using the C language.
- Using three different extensions to C:
  - Message-Passing Interface (MPI)
  - POSIX Threads (Pthreads)
  - OpenMP

Types of parallel systems
- Shared-memory: the cores can share access to the computer's memory; coordinate the cores by having them examine and update shared memory locations.
- Distributed-memory: each core has its own, private memory; the cores must communicate explicitly by sending messages across a network.
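As a first taste of the shared-memory style, a minimal Pthreads sketch (Pthreads being one of the three C extensions named above) that forms the running example's global sum by having each thread update a shared total under a mutex. The mutex-per-update approach is an illustrative choice, not the slides' code:

    #include <pthread.h>
    #include <stdio.h>

    #define P 8                          /* number of threads ("cores") */

    double global_sum = 0.0;             /* shared memory location      */
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    double partial[P] = {8, 19, 7, 15, 7, 13, 12, 14};  /* my_sum values from the slides */

    static void *add_my_sum(void *arg) {
        long my_rank = (long)arg;
        pthread_mutex_lock(&mutex);      /* coordinate by updating shared memory */
        global_sum += partial[my_rank];
        pthread_mutex_unlock(&mutex);
        return NULL;
    }

    int main(void) {
        pthread_t threads[P];
        for (long r = 0; r < P; r++)
            pthread_create(&threads[r], NULL, add_my_sum, (void *)r);
        for (int r = 0; r < P; r++)
            pthread_join(threads[r], NULL);
        printf("global sum = %f\n", global_sum);  /* prints 95 */
        return 0;
    }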

Types of parallel systems
- (Figure: shared-memory and distributed-memory architectures.)

Terminology
- Concurrent computing: a program in which multiple tasks can be in progress at any instant.
- Parallel computing: a program in which multiple tasks cooperate closely to solve a problem.
- Distributed computing: a program that may need to cooperate with other programs to solve a problem.

Concluding Remarks (1)
- The laws of physics have brought us to the doorstep of multicore technology.
- Serial programs typically don't benefit from multiple cores.
- Automatic parallel program generation from serial program code isn't the most efficient approach to getting high performance from multicore computers.

Concluding Remarks (2)
- Learning to write parallel programs involves learning how to coordinate the cores.
- Parallel programs are usually very complex, and therefore require sound programming techniques and development practices.

Copyright © 2010, Elsevier Inc. All rights reserved.
Thank you for watching.
