An extensible administration and configuration tool for Linux clusters

John D. Fogarty B.Sc

A dissertation submitted to the University of Dublin, in partial fulfillment of the requirements for the degree of Master of Science in Computer Science

1999

Declaration

I declare that the work described in this dissertation is, except where otherwise stated, entirely my own work and has not been submitted as an exercise for a degree at this or any other university.

Signed: ___________________
John D. Fogarty
15th September, 1999

Permission to lend and/or copy

I agree that Trinity College Library may lend or copy this dissertation upon request.

Signed: ___________________
John D. Fogarty
15th September, 1999

Summary

This project addresses the lack of system administration tools for Linux clusters. The goals of the project were to design and implement an extensible system that would facilitate the administration and configuration of a Linux cluster. Cluster systems are inherently scalable and therefore the cluster administration tool should also scale well to facilitate the addition of new nodes to the cluster.

The tool allows the administration and configuration of the entire cluster from a single node. Administration of the cluster is simplified by way of command replication across one, some or all nodes. Configuration of the cluster is made possible through the use of a flexible variable substitution scheme, which allows common configuration files to reflect differences between nodes. The system uses a GUI and is intuitively simple to use.

Extensibility is incorporated into the system by allowing the dynamic addition of new commands and output display types. Through the use of a menus configuration file the system is easily extended to include additional commands. For the majority of commands this can be accomplished without reprogramming. An API has been provided to allow different types of component to be displayed on the main GUI panel. The system thus exhibits extensibility in supporting different types of display components such as text or graphics. This extensible scheme can then be used to support, for example, graphical output.

Through the use of a configuration file for the nodes in the system, the system can scale well. New nodes are easily added through a single entry in this configuration file. Scalability is also incorporated into the design of the system, whereby as much work as possible is distributed to the server nodes.
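As a rough illustration of the variable substitution idea described above, the following minimal Java sketch expands placeholders in a shared configuration template using per-node values. The `${NAME}` placeholder syntax, the class and method names, and the node names and addresses are all assumptions made for this example; they are not taken from the system described in this dissertation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: class, method and placeholder syntax are hypothetical.
class VariableSubstitution {

    // Assumed placeholder form ${NAME}; the real scheme may use a different syntax.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    // Replaces each ${NAME} in the template with the node's value for NAME.
    // Placeholders with no value for this node are left unchanged.
    static String substitute(String template, Map<String, String> nodeVars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = nodeVars.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement stops '$' and '\' being treated as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // One common template shared by every node (hypothetical content).
        String template = "hostname=${HOSTNAME} ip=${IP}";

        // Per-node variable values (hypothetical node names and addresses).
        Map<String, Map<String, String>> nodes = new LinkedHashMap<>();
        Map<String, String> node01 = new LinkedHashMap<>();
        node01.put("HOSTNAME", "node01");
        node01.put("IP", "10.0.0.1");
        nodes.put("node01", node01);
        Map<String, String> node02 = new LinkedHashMap<>();
        node02.put("HOSTNAME", "node02");
        node02.put("IP", "10.0.0.2");
        nodes.put("node02", node02);

        // The same template yields a different configuration line per node.
        for (Map.Entry<String, Map<String, String>> e : nodes.entrySet()) {
            System.out.println(e.getKey() + ": " + substitute(template, e.getValue()));
        }
    }
}
```

Leaving unknown placeholders untouched is a design choice of this sketch: it makes a missing per-node value visible in the generated file instead of silently producing an empty setting.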

Acknowledgements

I would like to thank my supervisor Dr. Paddy Nixon for his advice and assistance throughout the project. I would also like to express my appreciation to Dr. Simon Dobson who also assisted me during this work. Stefan Weber was unstintingly generous with his time and his expertise so “danke schön”. Finally, cheers to my classmates.

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
    INTRODUCTION
    MOTIVATIONS
    GOALS
    ROADMAP
    SUMMARY
CHAPTER 2 BACKGROUND RESEARCH
    INTRODUCTION
    CLUSTER DEFINITION
    CLUSTER CHARACTERISTICS
    TYPES OF CLUSTERS
        Server farms
        Failover clusters
        Coupled clusters
        shared nothing cluster
        shared memory cluster
    CLUSTER CONFIGURATIONS
    CLUSTER FORCES
    CLUSTER TAXONOMY
        Pile Of PCs
        Cluster of Workstations (COW)
        Network of Workstations (NOW)
        Beowulf
        Beowulf History
        Advantages of a Beowulf
        D-Shared memory Clusters and SMPs
        NT-PC
    CLUSTER EXAMPLES
        Beowulf class
        Loki/Hygalc
        Avalon
        Stone SouperComputer
        ASCI Red
        Other Examples
        CAG Cluster
        Windows NT example
    CLUSTER SOFTWARE
        Operating system software
        Linux
        Linux Modules
        Administration software
        SMILE
        Windows NT class cluster software
    SUMMARY
CHAPTER 3 PROBLEM SPECIFICATION
    INTRODUCTION
    GOALS
    REQUIREMENTS OF THE SYSTEM
        Command replication and configuration changes across multiple nodes
        Variable substitution
        Module management
        Formatted output from some menu commands
        Shutdown & reboot of nodes and unloading of server software
        Output comparison between nodes
    SYSTEM CONSTRAINTS
        CORBA
        Java
        Advantages of Java
    SUMMARY
CHAPTER 4 CLUSTER MANAGER DESIGN
    INTRODUCTION
    OVERVIEW
        Clusters
        Cluster System Manager Overview
        Client Overview
        Communications Overview
        Server Node Operation Overview
    CLUSTER MANAGER DETAILED DESIGN
    CLIENT
        Client GUI
        GUI Design
        Menus Generation
        Flexibility through Variable Text Substitution
        Scalability
        Client StartUp
        Node selection
        Command Execution
        Configuration File Management
        Editing a file
        Saving a file
    CLIENT-SERVER INTERFACE
        Communication between nodes & cluster manager
    SERVER
        Command Execution Responses
        Command Execution
        PP
        GP
        NP
        NPP
        Error messages
    MODULES
    SUMMARY
CHAPTER 5 IMPLEMENTATION
    INTRODUCTION
    CLIENT GUI
        Menus Generation
        Example
    VARIABLE TEXT SUBSTITUTION
    SCALABILITY NODES CONFIGURATION FILE
    CLIENT STARTUP
    NODE SELECTION
    COMMAND EXECUTION
        Redesign
        ExecThreads
        SwingUtilities invokeLater()
        SaveThreads
    CONFIGURATION FILE MANIPULATION
        Edit File
        Save File
        Cancel Edit
        Cat File
        Log Files
    CLIENT SERVER INTERFACE
        Communication between nodes & cluster manager
    SERVER
        Server Operation
        PP - Perl Script
        GP - Generic Parsing
        NP - Perl but no parsing
        NPP - No Perl, no parsing
        NRP - No reply
        Error messages
    MENUS IMPLEMENTED
        File
        System
        Shutdown server daemon
        ShutDown/Reboot
        Modules
        Module Manipulation
        Load Module
        Unload module
        System Statistics
    IMPLEMENTATION PROBLEMS
    SUMMARY
CHAPTER 6 ANALYSIS
    INTRODUCTION
    REVIEW OF WORK
    ANALYSIS
        Goals
        Extensibility
        Commands
        GUI components
        Scalability
        Flexibility
        Ease of use
        Portability
        Functional Requirements
        Command replication across a number of nodes
        Variable substitution
        Module management
        Formatted output from some menu commands
        Shutdown or reboot of nodes
        Comparison of nodal output
    COMPARISON WITH OTHER TOOLS
        SMILE CMS
    FUTURE WORK
        Provide a web interface
        Extend the menus framework
        Implement callbacks
        Automatic on-line node checking
        Build a dynamic Perl parser
        Integrate with existing cluster tools
    CONCLUSIONS
        Personal Achievements
    SUMMARY
APPENDIX 1 LINUX MODULES
    INTRODUCTION
    LINUX MODULES
    ADVANTAGES
    MODULE ELEMENTS
    IMPLEMENTATION
        Manual loading: Command line operation
        Automatic loading: kerneld
    COMPILING THE KERNEL
    SYSTEM CALLS SPECIFIC TO MODULES
    THE KERNEL DAEMON
BIBLIOGRAPHY

Table of Figures

Figure 1: A Taxonomy of Parallel Computing
Figure 2: The Smile Cluster Management system
Figure 3: Basic architecture of a cluster system
Figure 4: Cluster System Manager Architecture Overview
Figure 6: Screenshot of the Cluster Manager Interface
Figure 7: Node selection
Figure 8: Client Server Communication Implementation
Figure 9: Output of the mount command
Figure 10: The Load Module list box
Figure 11: The UnLoad Module list box
Figure 12: The xosview utility

Table of Tables

Table 1: menus configuration file format
Table 2: Output types from command executions

Chapter 1 Introduction

Introduction

Clusters of computers, where a cluster is a group of independent systems that act and appear as a single system, have become popular both in research establishments in need of high performance computing power and in business organisations seeking high availability from their computing resources. The popularity of clusters has arisen due to a number of factors. Not least amongst these, particularly with regard to research computing, has been the dramatic increase in the price-performance of PC chips, which has shown much greater improvement vis-à-vis traditional supercomputer components. This has led people to experiment with using commodity off-the-shelf components in building their supercomputers out of clusters of PCs. Other factors that have reinforced the interest in clusters include the fact that clusters scale
well. They can grow very large, accommodating a large number of nodes, and nodes can also be added in a piecemeal fashion.

Linux is the dominant operating system with regard to cluster systems. It is a popular choice mainly because it is free, stable and full source code is available. Additionally, Linux implements a feature of operating system customisability whereby modules can be loaded and unloaded on demand. This is particularly important in relation to clusters, where small kernels are often desired in order to maximise performance. It also means that different nodes in the cluster can run modified versions of the kernel if needed.

Motivations

Concomitant with the deployment of clusters is the need for tools to facilitate their administration. Administration in this context relates to the normal administrative duties needed to effectively manage a single machine. The bulk of research effort related to cluster software has up to now been devoted to the development of management tools to aid parallel programming, job scheduling and load balancing. A lack of administration tools continues to render the management, particularly of large scale clusters, more difficult. This has been the main motivation underpinning this project.

Goals

The goals of the project were to design and implement an extensible system that would facilitate the administration of a Linux cluster. Extensibility would allow new commands to be easily added to the system; the system should easily accommodate commands that return different types of output.

One of the features and main attractions of cluster systems is their scalability, which allows nodes to be added incrementally and supports a large number of nodes. The cluster administration tool should also scale well. In addition to these general design goals, a number of specific requirements are demanded of the system.

Roadmap

A conventional thesis pattern is used. This chapter has given an introduction to the project and the problem that is being addressed. The motivations and goals of the project were described.

Chapter 2 reviews the field of cluster systems and cluster administration tools. It explains the essential characteristics of clusters and describes the different classes of clusters. It also gives examples of the major types of clusters in use today. The place of cluster machines in the hierarchy of parallel computing machines is examined. The chapter then turns its attention to the
software that is used on these machines. Specifically, a description of available administration tools is given.

Chapter 3 gives a detailed problem specification. The chapter details the goals of the project and outlines the requirements of the system.

Chapter 4 details the design of the cluster management system. It explains how the various goals and requirements of the system will be accomplished through the proposed design.

Chapter 5 describes the implementation of the system. It details how particular design features were incorporated into the system. It also outlines any problems encountered during the implementation.

Chapter 6 gives a detailed analysis of the cluster tool developed in this project. It reviews the goals and requirements of the system and assesses the extent to which these have been achieved. The chapter also suggests what future work might be carried out to extend and improve the system.

Summary

A broad overview of the project has been given in this chapter. The problem under consideration has been outlined along with the motivations and goals of the project. The next chapter examines the field of cluster computing and the software tools available to administer a Linux cluster.

Chapter 2 Background Research

Introduction

The problem being investigated by this project is the design and implementation of a Linux cluster administration tool. This chapter explains the essential characteristics of clusters and examines the different types of clusters that are in use today.
Contrasts are made with other types of parallel machines such as SMPs (Symmetric Multi-Processors) and NOWs (Networks of Workstations). Examples of the different types of clusters are used to illustrate the flexibility and power of this type of system.

The chapter concludes with an examination of the software that is used on these machines. Specifically, software relating to system administration is detailed.

Cluster definition

Achieving a precise definition for a cluster is difficult, as evidenced by the inconclusive and sometimes rancorous electronic submissions of the IEEE Task Force on Cluster Computing HREF1. However, a consensus of sorts was formed around the following definition:

“A cluster is a type of parallel or distributed system that consists of a collection of interconnected whole computers used as a single, unified computing resource” FIS97.

Whole computer in this context is taken to mean a normal computer that can be used on its own, i.e. one that contains processor(s), memory, I/O, OS, software subsystems and applications.

An alternative but similar definition is given by Microsoft HREF2:

“A cluster is a group of independent systems working together as a single system. A client interacts with the cluster as if it were a single server. A node is a server in the cluster.”

Cluster characteristics

Taking the loosest interpretation of this definition would allow that the PCs on a LAN can together be considered a form of cluster. In general though, clusters can be distinguished from distributed systems, which tend to be loosely connected and use slow
45、sely connected and use slow-5-interconnects(not always).Furthermore nodes in a cluster have a strong sense ofmembership which generally nodes in a distributed system do not.Other characteristics of a cluster include:it can be managed as a single system,reflecting the tight coupling services are clus
46、ter wide it can tolerate component failure components can be added transparently to usersTypes of ClustersIn general there are two broad categories of clusters in use today:i)research machines comprising many compute nodes which are used for massivelycomplex computations such as weather forecasting
47、or N-body particlesimulations.ii)smaller=2 clusters which are used in business as web servers,mail servers,database servers etc.These might be classified as commercial clusters and arepopular as a preferred alternative to other forms of high availability technologysuch as data mirroring,server mirro
48、ring and fault tolerant systems.The former type of machine is generally found in academic institutions or governmentinstallations and generally uses Linux or some other free Unix operating systemdistribution to run the machine.On the other hand the latter is found in business andfrequently uses Wind
49、ows NT as its operating system of choice.Both these types ofsystem are described in further detail below.Within these two broad categories,a further classification of clusters can be made on thetype of use to which they are put and the type of service they are expected to provide:-6-Server farmsThes
50、e are the oldest and simplest type of cluster whereby a cluster consists of a group ofcompute nodes which demand work from a central server.This is well suited toapplications that require large amounts of processing and that need limited inter-nodecommunication.In this cluster,if a node fails,the se