Setting up two-node hot standby on Red Hat Linux 9 is fairly simple. The following packages are required:
heartbeat-1.0.4-2.rh.9.um.1.i386.rpm
heartbeat-pils-1.0.4-2.rh.9.um.1.i386.rpm
heartbeat-stonith-1.0.4-2.rh.9.um.1.i386.rpm
net-snmp-5.0.6-17.i386.rpm
Install them in this order:
1. heartbeat-pils-1.0.4-2.rh.9.um.1.i386.rpm
2. net-snmp-5.0.6-17.i386.rpm
3. heartbeat-stonith-1.0.4-2.rh.9.um.1.i386.rpm
4. heartbeat-1.0.4-2.rh.9.um.1.i386.rpm
# rpm -ivh heartbeat-pils-1.0.4-2.rh.9.um.1.i386.rpm
# rpm -ivh net-snmp-5.0.6-17.i386.rpm
# rpm -ivh heartbeat-stonith-1.0.4-2.rh.9.um.1.i386.rpm
# rpm -ivh heartbeat-1.0.4-2.rh.9.um.1.i386.rpm
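The install order can be captured in a small script so it is not mistyped on the second server. This is only a sketch: the helper name install_heartbeat_rpms and the DRY_RUN switch are made up for illustration, and it assumes all four packages sit in one directory.

```shell
#!/bin/sh
# Packages in the order given in the article: pils, net-snmp, stonith, heartbeat.
HB_PACKAGES="heartbeat-pils-1.0.4-2.rh.9.um.1.i386.rpm
net-snmp-5.0.6-17.i386.rpm
heartbeat-stonith-1.0.4-2.rh.9.um.1.i386.rpm
heartbeat-1.0.4-2.rh.9.um.1.i386.rpm"

# install_heartbeat_rpms DIR   (hypothetical helper, not part of heartbeat)
# Installs each package from DIR in order and stops at the first failure.
# With DRY_RUN=1 the rpm commands are printed instead of executed.
install_heartbeat_rpms() {
    dir=${1:-.}
    for pkg in $HB_PACKAGES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "rpm -ivh $dir/$pkg"
        else
            rpm -ivh "$dir/$pkg" || return 1
        fi
    done
}
```

Running `DRY_RUN=1 install_heartbeat_rpms /path/to/rpms` first shows the commands without touching the system; dropping DRY_RUN performs the real install as root.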
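Once installation is complete, configure the primary server. The configuration files live in /etc/ha.d. Installing from RPM does not create them, so copy the three files ha.cf, authkeys, and haresources from /usr/share/doc/heartbeat-1.0.4 into /etc/ha.d.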
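The copy step might look like the sketch below. The helper name copy_ha_samples is hypothetical; the default paths are the ones named in the article, and the chmod reflects the "Must be mode 600" note in the authkeys sample file.

```shell
#!/bin/sh
# copy_ha_samples [SRC] [DST]   (hypothetical helper, not part of heartbeat)
# Copies the three sample files into the configuration directory and
# restricts authkeys to mode 600, as the sample file itself requires.
copy_ha_samples() {
    src=${1:-/usr/share/doc/heartbeat-1.0.4}
    dst=${2:-/etc/ha.d}
    for f in ha.cf authkeys haresources; do
        cp "$src/$f" "$dst/$f" || return 1
    done
    chmod 600 "$dst/authkeys"
}
```

Called with no arguments as root, it uses the two directories from the article; passing other directories makes it easy to try out on a scratch copy first.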
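ha.cf is the main heartbeat configuration file, authkeys is the authentication (security) configuration file, and haresources is the resource file. Each file is explained below; the notes in 【】 are annotations added for this article and are not part of the files.
ha.cf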
#############################################################################################
#
# There are lots of options in this file. All you have to have is a set
# of nodes listed {"node ...}
# and one of {serial, bcast, mcast, or ucast}
#
# ATTENTION: As the configuration file is read line by line,
# THE ORDER OF DIRECTIVE MATTERS!
#
# In particular, make sure that the timings and udpport
# et al are set before the heartbeat media are defined!
# All will be fine if you keep them ordered as in this
# example.
#
#
# Note on logging:
# If any of debugfile, logfile and logfacility are defined then they
# will be used. If debugfile and/or logfile are not defined and
# logfacility is defined then the respective logging and debug
# messages will be logged to syslog. If logfacility is not defined
# then debugfile and logfile will be used to log messages. If
# logfacility is not defined and debugfile and/or logfile are not
# defined then defaults will be used for debugfile and logfile as
# required and messages will be sent there.
#
# File to write debug messages to
debugfile /var/log/ha-debug 【file where heartbeat writes its debug messages】
#
#
# File to write other messages to
#
logfile /var/log/ha-log 【log file】
#
#
# Facility to use for syslog()/logger
#
logfacility local0 【log to syslog; optional】
#
#
# A note on specifying "how long" times below...
#
# The default time unit is seconds
# 10 means ten seconds
#
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
#
#
# keepalive: how long between heartbeats?
#
keepalive 3 【send a keepalive message every 3 seconds】
#
# deadtime: how long-to-declare-host-dead?
#
deadtime 15 【declare a node dead if no keepalive message arrives for 15 seconds】
#
# warntime: how long before issuing "late heartbeat" warning?
# See the FAQ for how to use warntime to tune deadtime.
#
warntime 10 【how long to wait before logging a "late heartbeat" warning】
#
#
# Very first dead time (initdead)
#
# On some machines/OSes, etc. the network takes a while to come up
# and start working right after you've been rebooted. As a result
# we have a separate dead time for when things first come up.
# It should be at least twice the normal dead time.
#
initdead 60 【after a reboot a node may need some time to bring up its network, so this startup timeout is set separately from deadtime】
#
#
# nice_failback: determines whether a resource will
# automatically fail back to its "primary" node, or remain
# on whatever node is serving it until that node fails.
#
# The default is "off", which means that it WILL fail
# back to the node which is declared as primary in haresources
#
# "on" means that resources only move to new nodes when
# the nodes they are served on die. This is deemed as a
# "nice" behavior (unless you want to do active-active).
#
nice_failback on 【after a failed primary node comes back up it does not automatically become primary again; it takes over again only when the currently active node fails】
#
# hopfudge maximum hop count minus number of nodes in config
#hopfudge 1
#
#
# Baud rate for serial ports...
# (must precede "serial" directives)
#
#baud 19200
#
# serial serialportname ...
#serial /dev/ttyS0 # Linux
#serial /dev/cuaa0 # FreeBSD
#serial /dev/cua/a # Solaris
#
# What UDP port to use for communication?
# [used by bcast and ucast]
#
#udpport 694
#
# What interfaces to broadcast heartbeats over?
#
#bcast eth1 # Linux
#bcast eth1 eth2 # Linux
#bcast le0 # Solaris
#bcast le1 le2 # Solaris
#
# Set up a multicast heartbeat medium
# mcast [dev] [mcast group] [port] [ttl] [loop]
#
# [dev] device to send/rcv heartbeats on
# [mcast group] multicast group to join (class D multicast address
#     224.0.0.0 - 239.255.255.255)
# [port] udp port to sendto/rcvfrom (no reason to differ
#     from the port used for broadcast heartbeats)
# [ttl] the ttl value for outbound heartbeats. This affects
#     how far the multicast packet will propagate. (1-255)
# [loop] toggles loopback for outbound multicast heartbeats.
#     if enabled, an outbound packet will be looped back and
#     received by the interface it was sent on. (0 or 1)
#     This field should always be set to 0.
#
#
mcast eth1 225.0.0.22 694 1 0 【send keepalive messages to multicast group 225.0.0.22 on port 694】
#
# Set up a unicast / udp heartbeat medium
# ucast [dev] [peer-ip-addr]
#
# [dev] device to send/rcv heartbeats on
# [peer-ip-addr] IP address of peer to send packets to
#
#ucast eth0 192.168.1.2
#
#
# Watchdog is the watchdog timer. If our own heart doesn't beat for
# a minute, then our machine will reboot.
#
#watchdog /dev/watchdog
#
# "Legacy" STONITH support
# Using this directive assumes that there is one stonith
# device in the cluster. Parameters to this device are
# read from a configuration file. The format of this line is:
#
# stonith <stonith_type> <configfile>
#
# NOTE: it is up to you to maintain this file on each node in the
# cluster!
#
#stonith baytech /etc/ha.d/conf/stonith.baytech
#
# STONITH support
# You can configure multiple stonith devices using this directive.
# The format of the line is:
# stonith_host <hostfrom> <stonith_type> <params...>
#
# Note that if you put your stonith device access information in
# here, and you make this file publicly readable, you're asking
# for a denial of service attack ;-)
#
#
#stonith_host * baytech 10.0.0.3 mylogin mysecretpassword
#stonith_host ken3 rps10 /dev/ttyS1 kathy 0
#stonith_host kathy rps10 /dev/ttyS1 ken3 0
#
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node rh-9-a 【defines the node names; each must match the node's hostname】
node rh-9-b
#
# Less common options...
#
# Treats 10.10.10.254 as a pseudo-cluster-member
#
#ping www.163.com www.google.com
#
# Started and stopped with heartbeat. Restarted unless it exits
# with rc=100
#
#respawn userid /path/name/to/run
##################################################################
authkeys
#
# Authentication file. Must be mode 600
#
#
# Must have exactly one auth directive at the front.
# auth send authentication using this method-id
#
# Then, list the method and key that go with that method-id
#
# Available methods: crc sha1, md5. Crc doesn't need/want a key.
#
# You normally only have one authentication method-id listed in this file
#
# Put more than one to make a smooth transition when changing auth
# methods and/or keys.
#
#
# sha1 is believed to be the "best", md5 next best.
#
# crc adds no security, except from packet corruption.
# Use only on physically secure networks.
#
auth 3 【selects the authentication method; 3 is the number of the method line to use】
#1 crc
#2 sha1 HI!
3 md5 Hello! 【use md5 with the key Hello!】
###############################################################################
haresources
#
# This is a list of resources that move from machine to machine as
# nodes go down and come up in the cluster. Do not include
# "administrative" or fixed IP addresses in this file.
#
#
# The haresources files MUST BE IDENTICAL on all nodes of the cluster.
#
# The node names listed in front of the resource group information
# is the name of the preferred node to run the service. It is
# not necessarily the name of the current machine. If you are running
# nice_failback OFF then these services will be started
# up on the preferred nodes - any time they're up.
#
# If you are running with nice_failback ON, then the node information
# will be used in the case of a simultaneous start-up.
#
# BUT FOR ALL OF THESE CASES, the haresources files MUST BE IDENTICAL.
# If your files are different then almost certainly something
# won't work right.
#
# We refer to this file when we're coming up, and when a machine is being
# taken over after going down.
#
# You need to make this right for your installation, then install it in
# /etc/ha.d
#
# Each logical line in the file constitutes a "resource group".
# A resource group is a list of resources which move together from
# one node to another - in the order listed. It is assumed that there
# is no relationship between different resource groups. These
# resource in a resource group are started left-to-right, and stopped
# right-to-left. Long lists of resources can be continued from line
# to line by ending the lines with backslashes ("\").
#
# These resources in this file are either IP addresses, or the name
# of scripts to run to "start" or "stop" the given resource.
#
# The format is like this:
#
#node-name resource1 resource2 ... resourceN
#
#
# If the resource name contains an :: in the middle of it, the
# part after the :: is passed to the resource script as an argument.
# Multiple arguments are separated by the :: delimiter
#
# In the case of IP addresses, the resource script name IPaddr is
# implied.
#
# For example, the IP address 135.9.8.7 could also be represented
# as IPaddr::135.9.8.7
#
# THIS IS IMPORTANT!! vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
#
# The given IP address is directed to an interface which has a route
# to the given address. This means you have to have a net route
# set up outside of the High-Availability structure. We don't set it
# up here -- we key off of it.
#
# The broadcast address for the IP alias that is created to support
# an IP address defaults to the highest address on the subnet.
#
# The netmask for the IP alias that is created defaults to the same
# netmask as the route that it selected in the step above.
#
# The base interface for the IPalias that is created defaults to the
# same interface as the route that it selected in the step above.
#
# If you want to specify that this IP address is to be brought up
# on a subnet with a netmask of 255.255.255.0, you would specify
# this as IPaddr::135.9.8.7/24 .
#
# If you wished to tell it that the broadcast address for this subnet
# was 135.9.8.210, then you would specify that this way:
# IPaddr::135.9.8.7/24/135.9.8.210
#
# If you wished to tell it that the interface to add the address to
# is eth0, then you would need to specify it this way:
# IPaddr::135.9.8.7/24/eth0
#
# And this way to specify both the broadcast address and the
# interface:
# IPaddr::135.9.8.7/24/eth0/135.9.8.210
#
# The IP addresses you list in this file are called "service" addresses,
# since they're the publicly advertised addresses that clients
# use to get at highly available services.
#
# For a hot/standby (non load-sharing) 2-node system with only
# a single service address,
# you will probably only put one system name and one IP address in here.
# The name you give the address to is the name of the default "hot"
# system.
#
# Where the nodename is the name of the node which "normally" owns the
# resource. If this machine is up, it will always have the resource
# it is shown as owning.
#
# The string you put in for nodename must match the uname -n name
# of your machine. Depending on how you have it administered, it could
# be a short name or a FQDN.
#
#-------------------------------------------------------------------
#
# Simple case: One service address, default subnet and netmask
# No servers that go up and down with the IP address
#
#just.linux-ha.org 135.9.216.110
#
#-------------------------------------------------------------------
#
# Assuming the administrative addresses are on the same subnet...
# A little more complex case: One service address, default subnet
# and netmask, and you want to start and stop http when you get
# the IP address...
#
#just.linux-ha.org 135.9.216.110 http
#-------------------------------------------------------------------
#
# A little more complex case: Three service addresses, default subnet
# and netmask, and you want to start and stop http when you get
# the IP address...
#
#just.linux-ha.org 135.9.216.110 135.9.215.111 135.9.216.112 httpd
#-------------------------------------------------------------------
#
# One service address, with the subnet, interface and bcast addr
# explicitly defined.
#
#just.linux-ha.org 135.9.216.3/28/eth0/135.9.216.12 httpd
#
#-------------------------------------------------------------------
#
# An example where a shared filesystem is to be used.
# Note that multiple arguments are passed to this script using
# the delimiter '::' to separate each argument.
#
rh-9-a 11.1.1.96/24/eth0 【defines the public IP address, netmask, and interface used by the primary node】
#
# Regarding the node-names in this file:
#
# They must match the names of the nodes listed in ha.cf, which in turn
# must match the `uname -n` of some node in the cluster. So they aren't
# virtual in any sense of the word.
#
Adjust the configuration files to suit your environment. The heartbeat configuration must be identical on both servers; otherwise heartbeat will not start.
To start heartbeat (use stop or restart in place of start as needed):
/etc/rc.d/init.d/heartbeat start
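Before starting heartbeat it can help to sanity-check the timing directives in ha.cf. The sketch below is not part of heartbeat; the helper names ha_get and check_ha_timings are invented here. The rule that initdead should be at least twice deadtime comes from the sample file's own comments, while requiring keepalive and warntime to be smaller than deadtime is an assumption that follows from what those directives mean. Only whole-second values are handled, not the "1500ms" form.

```shell
#!/bin/sh
# ha_get FILE KEY   (hypothetical helper)
# Prints the first value of an active (uncommented) directive.
ha_get() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# check_ha_timings [FILE]   (hypothetical helper)
# Returns 0 if the timings look consistent, 1 if a rule is violated,
# 2 if a value is missing or is not a whole number of seconds.
check_ha_timings() {
    cf=${1:-/etc/ha.d/ha.cf}
    keepalive=$(ha_get "$cf" keepalive)
    deadtime=$(ha_get "$cf" deadtime)
    warntime=$(ha_get "$cf" warntime)
    initdead=$(ha_get "$cf" initdead)
    for v in "$keepalive" "$deadtime" "$warntime" "$initdead"; do
        case $v in
            ''|*[!0-9]*) echo "missing or non-integer timing value: '$v'"; return 2 ;;
        esac
    done
    rc=0
    if [ "$keepalive" -ge "$deadtime" ]; then
        echo "keepalive ($keepalive) should be smaller than deadtime ($deadtime)"; rc=1
    fi
    if [ "$warntime" -ge "$deadtime" ]; then
        echo "warntime ($warntime) should be smaller than deadtime ($deadtime)"; rc=1
    fi
    if [ "$initdead" -lt $((deadtime * 2)) ]; then
        echo "initdead ($initdead) should be at least twice deadtime ($deadtime)"; rc=1
    fi
    return $rc
}
```

With the values used in this article (keepalive 3, deadtime 15, warntime 10, initdead 60), `check_ha_timings /etc/ha.d/ha.cf` prints nothing and returns 0.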