Module5-HadoopTechnicalReview.ppt (76 slides)

Google Cluster Computing Faculty Training Workshop
Module V: Hadoop Technical Review
Spinnaker Labs, Inc.

Overview
- Hadoop technical walkthrough
- HDFS
- Databases
- Using Hadoop in an academic environment
- Performance tips and other tools

You Say, "Tomato"
Google calls it → Hadoop equivalent:
- MapReduce → Hadoop
- GFS → HDFS
- Bigtable → HBase
- Chubby → (nothing yet, but planned)

Some MapReduce Terminology
- Job: a "full program", an execution of a Mapper and Reducer across a data set
- Task: an execution of a Mapper or a Reducer on a slice of data; a.k.a. Task-In-Progress (TIP)
- Task Attempt: a particular instance of an attempt to execute a task on a machine
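The Job/Task terminology above centers on one pass of a Mapper and Reducer over a data set. A plain-Java word count (no Hadoop dependency; the class and method names here are illustrative, not Hadoop's API) shows the shape of such a program:

```java
import java.util.*;

// Minimal word count expressed as map and reduce phases.
// A single-process sketch of the programming model, not Hadoop's API.
public class WordCountSketch {

    // Map phase: emit one (word, 1) pair per word in the input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Reduce phase: sum the values grouped under one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    // Driver: groups map output by key (the "shuffle"), then reduces.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("the cat", "the hat")));
    }
}
```

Running this over 20 input files in Hadoop would be one job; each file's map pass would be one task, with at least one attempt apiece.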

Terminology Example
- Running "Word Count" across 20 files is one job
- 20 files to be mapped imply 20 map tasks + some number of reduce tasks
- At least 20 map task attempts will be performed; more if a machine crashes, etc.

Task Attempts
- A particular task will be attempted at least once, possibly more times if it crashes
- If the same input causes crashes over and over, that input will eventually be abandoned
- Multiple attempts at one task may occur in parallel with speculative execution turned on
- The Task ID from TaskInProgress is not a unique identifier; don't use it that way

MapReduce: High Level

Node-to-Node Communication
- Hadoop uses its own RPC protocol
- All communication begins in slave nodes, which prevents circular-wait deadlock
- Slaves periodically poll for a "status" message
- Classes must provide explicit serialization

Nodes, Trackers, Tasks
- The master node runs a JobTracker instance, which accepts job requests from clients
- TaskTracker instances run on slave nodes
- A TaskTracker forks a separate Java process for task instances

Job Distribution
- MapReduce programs are contained in a Java "jar" file plus an XML file containing serialized program configuration options
- Running a MapReduce job places these files into HDFS and notifies TaskTrackers where to retrieve the relevant program code
- Where's the data distribution?
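The XML-serialized configuration mentioned above can be pictured with the stdlib java.util.Properties class (a sketch only; Hadoop's actual JobConf XML format differs in detail, and the second property key is illustrative):

```java
import java.io.*;
import java.util.Properties;

// Sketch of serializing job configuration options to XML, in the
// spirit of the jar + XML-config pair described above.
// Uses java.util.Properties; Hadoop's real format differs in detail.
public class JobConfigSketch {

    static byte[] toXml(Properties conf) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        conf.storeToXML(out, "serialized job configuration");
        return out.toByteArray();
    }

    static Properties fromXml(byte[] xml) throws IOException {
        Properties conf = new Properties();
        conf.loadFromXML(new ByteArrayInputStream(xml));
        return conf;
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties();
        conf.setProperty("mapred.map.tasks", "20");   // key cited in the slides
        conf.setProperty("mapred.reduce.tasks", "5"); // illustrative
        Properties roundTrip = fromXml(toXml(conf));
        System.out.println(roundTrip.getProperty("mapred.map.tasks"));
    }
}
```

The point is only that a (String key → String value) configuration round-trips cleanly through XML, which is what lets the JobTracker hand the same options to every TaskTracker.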

Data Distribution
- Implicit in the design of MapReduce!
- All mappers are equivalent, so map whatever data is local to a particular node in HDFS
- If lots of data does happen to pile up on the same node, nearby nodes will map instead
- Data transfer is handled implicitly by HDFS

Configuring With JobConf
- MR programs have many configurable options
- JobConf objects hold (key, value) pairs mapping a String name to a value, e.g. "mapred.map.tasks" → 20
- The JobConf is serialized and distributed before running the job
- Objects implementing JobConfigurable can retrieve elements from a JobConf

What Happens in MapReduce? Depth First

Job Launch Process: Client
- The client program creates a JobConf
- Identifies classes implementing the Mapper and Reducer interfaces: JobConf.setMapperClass(), setReducerClass()
- Specifies inputs and outputs: JobConf.setInputPath(), setOutputPath()
- Optionally, other options too: JobConf.setNumReduceTasks(), JobConf.setOutputFormat()

Job Launch Process: JobClient
- Pass the JobConf to JobClient.runJob() or submitJob(); runJob() blocks, submitJob() does not
- JobClient determines the proper division of the input into InputSplits
- Sends job data to the master JobTracker server

Job Launch Process: JobTracker
- Inserts the jar and the JobConf (serialized to XML) in a shared location
- Posts a JobInProgress to its run queue

Job Launch Process: TaskTracker
- TaskTrackers running on slave nodes periodically query the JobTracker for work
- Retrieve the job-specific jar and config
- Launch the task in a separate instance of Java; main() is provided by Hadoop

Job Launch Process: Task
- TaskTracker.Child.main() sets up the child TaskInProgress attempt
- Reads the XML configuration
- Connects back to the necessary MapReduce components via RPC
- Uses TaskRunner to launch the user process

Job Launch Process: TaskRunner
- TaskRunner, MapTaskRunner, and MapRunner work in a daisy-chain to launch your Mapper
- The Task knows ahead of time which InputSplits it should be mapping
- Calls the Mapper once for each record retrieved from the InputSplit
- Running the Reducer is much the same

Creating the Mapper
- You provide the instance of Mapper; it should extend MapReduceBase
- One instance of your Mapper is initialized by the MapTaskRunner for a TaskInProgress
- It exists in a separate process from all other instances of Mapper: no data sharing!

Mapper
- void map(WritableComparable key, Writable value, OutputCollector output, Reporter reporter)

What Is Writable?
- Hadoop defines its own "box" classes for strings (Text), integers (IntWritable), etc.
- All values are instances of Writable
- All keys are instances of WritableComparable

Writing for Cache Coherency
    while (more input exists) {
        myIntermediate = new intermediate(input);
        myIntermediate.process();
        export outputs;
    }

Writing for Cache Coherency
    myIntermediate = new intermediate(junk);
    while (more input exists) {
        myIntermediate.setupState(input);
        myIntermediate.process();
        export outputs;
    }

Writing for Cache Coherency
- Running the GC takes time; reusing locations allows better cache usage
- The speedup can be as much as two-fold
- All serializable types must be Writable anyway, so make use of the interface
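The reuse pattern in the second pseudocode sketch can be written out as real Java (illustrative names; no Hadoop dependency):

```java
import java.util.*;

// Contrast to allocating a fresh intermediate object per record:
// one preallocated, mutable holder is refilled for each input, the
// pattern Hadoop's reusable Writable box classes encourage.
public class ReuseSketch {

    // Small mutable holder standing in for a Writable-style box class.
    static class Intermediate {
        private final StringBuilder state = new StringBuilder();

        void setupState(String input) {   // reset + refill, no new allocation
            state.setLength(0);
            state.append(input);
        }

        String process() {                // toy "processing": uppercase
            return state.toString().toUpperCase();
        }
    }

    static List<String> runWithReuse(List<String> inputs) {
        List<String> outputs = new ArrayList<>();
        Intermediate myIntermediate = new Intermediate(); // allocated once
        for (String input : inputs) {
            myIntermediate.setupState(input);             // reuse the same object
            outputs.add(myIntermediate.process());
        }
        return outputs;
    }

    public static void main(String[] args) {
        System.out.println(runWithReuse(Arrays.asList("cat", "hat")));
    }
}
```

With one holder allocated outside the loop, the hot path creates no garbage per record, which is where the GC savings and cache locality claimed above come from.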

Getting Data to the Mapper
- Data sets are specified by InputFormats
- An InputFormat defines the input data (e.g., a directory)
- Identifies partitions of the data that form an InputSplit
- Acts as a factory for RecordReader objects that extract (k, v) records from the input source

FileInputFormat and Friends
- TextInputFormat: treats each \n-terminated line of a file as a value
- KeyValueTextInputFormat: maps \n-terminated text lines of the form "k SEP v"
- SequenceFileInputFormat: binary file of (k, v) pairs with some additional metadata
- SequenceFileAsTextInputFormat: same, but maps (k.toString(), v.toString())

Filtering File Inputs
- FileInputFormat will read all files out of a specified directory and send them to the mapper
- Delegates filtering of this file list to a method subclasses may override
- e.g., create your own "xyzFileInputFormat" to read *.xyz from the directory list

Record Readers
- Each InputFormat provides its own RecordReader implementation
- Provides (unused?) capability multiplexing
- LineRecordReader: reads a line from a text file
- KeyValueRecordReader: used by KeyValueTextInputFormat
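The LineRecordReader convention, where each record's key is the offset at which the line starts and the value is the line itself, can be sketched without Hadoop (character offsets stand in for byte offsets here):

```java
import java.util.*;

// Sketch of LineRecordReader-style record extraction: each record is
// (offset of the line's start, line contents), the convention used by
// TextInputFormat. Plain Java, no Hadoop dependency.
public class LineRecordSketch {

    // Returns an ordered map of offset -> line for \n-terminated text.
    static LinkedHashMap<Long, String> records(String text) {
        LinkedHashMap<Long, String> out = new LinkedHashMap<>();
        long offset = 0;
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n') {
                out.put(offset, line.toString()); // key = where the line started
                line.setLength(0);
                offset = i + 1;                   // next line starts after '\n'
            } else {
                line.append(c);
            }
        }
        if (line.length() > 0) out.put(offset, line.toString());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(records("cat\nhat\n"));
    }
}
```

A real RecordReader does the same walk, but starting from the (file, offset, length) triple describing its InputSplit rather than the whole file.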

Input Split Size
- FileInputFormat will divide large files into chunks
- The exact size is controlled by mapred.min.split.size
- RecordReaders receive the file, offset, and length of the chunk
- Custom InputFormat implementations may override the split size, e.g., a "NeverChunkFile"

Sending Data to Reducers
- The map function receives an OutputCollector object
- OutputCollector.collect() takes (k, v) elements
- Any (WritableComparable, Writable) pair can be used

WritableComparator
- Compares WritableComparable data
- Will call compare() on the WritableComparable values
- Can provide a fast path for serialized data
- Set via JobConf.setOutputValueGroupingComparator()

Sending Data to the Client
- The Reporter object sent to the Mapper allows simple asynchronous feedback
  - incrCounter(Enum key, long amount)
  - setStatus(String msg)
- Allows self-identification of input: InputSplit getInputSplit()

Partition and Shuffle

Partitioner
- int getPartition(key, val, numPartitions): outputs the partition number for a given key
- One partition = the values sent to one reduce task
- HashPartitioner is used by default; it uses key.hashCode() to return the partition number
- JobConf sets the Partitioner implementation

Reduction
- reduce(WritableComparable key, Iterator values, OutputCollector output, Reporter reporter)
- Keys and values sent to one partition all go to the same reduce task
- Calls are sorted by key: "earlier" keys are reduced and output before "later" keys

Finally: Writing the Output

OutputFormat
- Analogous to InputFormat
- TextOutputFormat: writes "key value\n" strings to the output file
- SequenceFileOutputFormat: uses a binary format to pack (k, v) pairs
- NullOutputFormat: discards output

HDFS

HDFS Limitations
- "Almost" GFS
- No file update options (record append, etc.); all files are write-once
- Does not implement demand replication
- Designed for streaming; random seeks devastate performance

NameNode
- "Head" interface to the HDFS cluster
- Records all global metadata

Secondary NameNode
- Not a failover NameNode!
- Records metadata snapshots from the "real" NameNode
- Can merge update logs in flight
- Can upload snapshots back to the primary

NameNode Death
- No new requests can be served while the NameNode is down
- The secondary will not fail over as a new primary
- So why have a secondary at all?

NameNode Death, cont'd
- If the NameNode dies from a software glitch, just reboot
- But if the machine is hosed, the metadata for the cluster is irretrievable!

Bringing the Cluster Back
- If the original NameNode can be restored, the secondary can re-establish the most current metadata snapshot
- If not, create a new NameNode, use the secondary to copy metadata to the new primary, and restart the whole cluster
- Is there another way?
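Returning to the Partition and Shuffle slides above: the default HashPartitioner behavior amounts to hashing the key modulo the number of reduce tasks. A plain-Java sketch (Hadoop additionally masks the sign bit so the index is never negative):

```java
// Sketch of the default HashPartitioner logic: the partition number is
// derived from key.hashCode(), masked non-negative, modulo the number
// of reduce tasks. Plain Java, no Hadoop dependency.
public class HashPartitionSketch {

    static int getPartition(Object key, int numPartitions) {
        // Mask the sign bit so a negative hashCode cannot produce a
        // negative partition index.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        // Every occurrence of the same key lands in the same partition,
        // so one reduce task sees all values for that key.
        System.out.println(getPartition("cat", 4) == getPartition("cat", 4));
    }
}
```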

Keeping the Cluster Up
- Problem: DataNodes "fix" the address of the NameNode in memory and can't switch in flight
- Solution: bring a new NameNode up, but use DNS to make the cluster believe it's the original one
- The secondary can be the "new" one

Further Reliability Measures
- The NameNode can output multiple copies of its metadata files to different directories, including an NFS-mounted one
- May degrade performance; watch for NFS locks

Databases

Life After GFS
- Straight GFS files are not the only storage option
- HBase (on top of GFS) provides column-oriented storage
- mySQL and other db engines are still relevant

HBase
- Can interface directly with Hadoop
- Provides its own Input- and OutputFormat classes; sends rows directly to the mapper, receives new rows from the reducer
- But might not be ready for classroom use (least stable component)

MySQL Clustering
- A MySQL database can be sharded on multiple servers
- For fast IO, use the same machines as Hadoop
- Tables can be split across machines by row key range
- Multiple replicas can serve the same table

Sharding & Hadoop Partitioners
- For best performance, the Reducer should go straight to a local mysql instance
- Get all the data onto the right machine in one copy
- Implement a custom Partitioner to ensure a particular key range goes to a mysql-aware Reducer
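A custom range Partitioner of the kind suggested for mysql-aware Reducers might look like the following plain-Java sketch (shard boundaries and class names are illustrative):

```java
import java.util.*;

// Sketch of a range partitioner: keys are routed by key range rather
// than by hash, so each reducer's output lands on the shard (e.g., a
// local mysql instance) that owns that range. Boundaries illustrative.
public class RangePartitionSketch {

    // Sorted upper-bound keys; partition i owns keys < boundaries[i].
    private final String[] boundaries;

    RangePartitionSketch(String... boundaries) {
        this.boundaries = boundaries.clone();
        Arrays.sort(this.boundaries);
    }

    // numPartitions = boundaries.length + 1 (one extra tail range)
    int getPartition(String key) {
        for (int i = 0; i < boundaries.length; i++) {
            if (key.compareTo(boundaries[i]) < 0) return i;
        }
        return boundaries.length; // keys >= last boundary
    }

    public static void main(String[] args) {
        RangePartitionSketch p = new RangePartitionSketch("h", "p");
        // "cat" < "h" -> shard 0; "mouse" in [h, p) -> shard 1; "rat" >= "p" -> shard 2
        System.out.println(p.getPartition("cat") + " "
                + p.getPartition("mouse") + " " + p.getPartition("rat"));
    }
}
```

Contiguous key ranges per reducer are what make the "one copy onto the right machine" property possible; a hash partitioner would scatter each shard's rows across all reducers.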

Academic Hadoop Requirements

Server Profile
- UW cluster: 40 nodes, 80 processors total
- 2 GB RAM/processor
- 24 TB raw storage space (8 TB replicated)
- One node reserved for the JobTracker/NameNode; two more wouldn't cooperate
- But still vastly overpowered

Setup & Maintenance
- Took about two days to set up and configure, mostly hardware-related issues; the Hadoop setup itself was only a couple of hours
- Maintenance: only a few hours/week, mostly rebooting the cluster when jobs got stuck

Total Usage
- About 15,000 CPU-hours consumed by 20 students, out of 130,000 available over the quarter
- Average load is about 12%

Analyzing Student Usage Patterns

Not Quite the Whole Story
- Realistically, students did most work very close to the deadline
- The cluster sat unused for a few days, followed by overloading for two days straight
- Lesson: resource demands are NOT constant!

Hadoop Job Scheduling
- A FIFO queue matches incoming jobs to available nodes
- No notion of fairness; it never switches out a running job
- Run-away tasks could starve other students' jobs

Hadoop Security
- But on the bright(?) side: there is no security system for jobs
- Anyone can start a job, but they can also cancel other jobs
- Realistically, students did not cancel other students' jobs, even when they should have

Hadoop Security: The Dark Side
- No permissions in HDFS either (just now added in 0.16)
- One student deleted the common data set for a project; email subject: "Oops"
- No students could test their code until the data set was restored from backup

Job Scheduling Lessons
- Getting students to "play nice" is hard: no incentive, plus just plain bad/buggy code
- Cluster contention caused problems at deadlines
- Work in groups; stagger deadlines

Another Possibility
- Amazon EC2 provides on-demand servers; students may be able to use these for jobs
- The "lab fee" would be $150/student
- Simple web-based interfaces exist
- HadoopOnDemand (HOD) coming soon: injects new nodes into live clusters

More Performance & Scalability

Number of Tasks
- Mappers = 10 * nodes (or 3/2 * cores)
- Reducers = 2 * nodes (or 1.05 * cores)
- Two degrees of freedom in mapper run time: the number of tasks/node, and the size of the InputSplits
- See http://wiki.apache.org/lucene-hadoop/HowManyMapsAndReduces

More Performance Tweaks
- Hadoop defaults to a heap cap of 200 MB; set mapred.child.java.opts=-Xmx512m
- 1024 MB/process may also be appropriate
- The DFS block size is 64 MB; for huge files, set dfs.block.size=134217728
- mapred.reduce.parallel.copies: set to 15-50; more data = more copies

Dead Tasks
- Student jobs would "run away"; an admin restart was needed
- Very often stuck in a huge shuffle process
- Students did not know about the Partitioner class and may have had a non-uniform distribution
- Did not use many Reducer tasks
- Lesson: design algorithms to use Combiners where possible

Working With the Scheduler
- Remember: Hadoop has a FIFO job scheduler; no notion of fairness or round-robin
- Design your tasks to "play well" with one another
- Decompose long tasks into several smaller ones which can be interleaved at the job level

Additional Languages & Components

Hadoop and C++
- Hadoop Pipes: a library of bindings for native C++ code
- Operates over a local socket connection
- Straight computation performance may be faster
- Downside: kernel involvement and context switches

Hadoop and Python
- Option 1: use Jython (caveat: Jython is a subset of full Python)
- Option 2: HadoopStreaming

HadoopStreaming
- Allows the shell pipe | operator to be used with Hadoop
- You specify two programs, for map and reduce
- (+) stdin and stdout do the rest
- (-) Requires serialization to text, context switches
- (+) "cat | grep | sort | uniq" is now a valid MR!

Eclipse Plugin
- Support for Hadoop in the Eclipse IDE
- Allows MapReduce job dispatch
- A panel tracks live and recent jobs
- Included in Hadoop since 0.14 (but works with older versions)
- Contributed by IBM

Conclusions
- Hadoop systems will put up with reasonable amounts of student abuse
- The biggest pitfall is deadlines
- HBase may not be ready for this quarter's students; next year, almost certainly
- Other tools provide student design projects with additional options
