ELJonline: Highly Available Networking
John Mehaffey   (January, 2002)

Achieve five nines reliability with the bonding network driver and the high availability dæmon.

High availability (HA) means different things to different people. This article defines availability as the percentage of time that a computer system is capable of providing the service it is assigned to provide. A good availability figure for computer systems that run business-critical tasks, such as a telephone switch or an enterprise data communication network, is 99.999% (five nines). This translates to less than six minutes per year during which the service is unavailable.
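
The six-minute figure follows directly from the definition of availability. As a quick check, the sketch below computes the yearly downtime budget for a given availability percentage; the function name is made up for this illustration, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for percent in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(percent)
        print(f"{percent}% available -> {minutes:.2f} minutes of downtime per year")
```

For five nines the budget comes out at roughly 5.26 minutes per year, and each additional nine cuts it by a factor of ten.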

CompactPCI traditionally has been the platform of choice for these five nines systems, because hot swap of components into and out of a running system is usually also a requirement.

In a highly available network there should be multiple independent paths to each system in the network to avoid single points of failure (SPOF). Physical separation is also a good idea because if both paths are in the same conduit, and the conduit gets cut by accident, the network will go down.

The Bonding Driver

Key to availability is the ability to detect failure quickly and transparently switch from one LAN connection to another. Putting the burden of handling redundancy in the networking driver allows for easier HA hardening of networked applications, as it relieves the application of having to be aware of network topology.

The Linux bonding driver can detect link failure and reroute network traffic around a failed link in a manner transparent to the application. With certain network switches, it can also aggregate network traffic across all working links to achieve higher throughput. This is sometimes referred to as trunking.

The bonding driver accomplishes this by enslaving all of the Ethernet ports in the bond to the same Ethernet MAC address, which ensures the proper routing of packets across the links. With a hub arrangement, there should not be more than one link with the same MAC address active at any one time, so the bonding driver can be set up to have only one channel active at a time. This is called active-backup mode, and it will route all traffic through one channel until it detects a failure, at which point it switches to the next backup channel.

With a switch instead of a hub, it is possible to send traffic over all live links at the same time. This is called round-robin mode: each successive packet is sent over the next working link in the bonding rotation, effectively aggregating the bandwidth of all usable links. Round-robin mode provides availability as well as aggregation, but not all switches are capable of supporting aggregation. The bonding documentation (see Resources) contains a list of some switches that do support aggregation.
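
The difference between the two modes comes down to which slave carries each outgoing packet. The following is a toy model of that choice, not the driver's actual code: the class, its method names and the mode strings are invented for illustration, and link state is set by hand rather than read from hardware.

```python
class Bond:
    """Toy model of a bond: decides which slave carries each outgoing packet."""

    def __init__(self, slaves, mode):
        self.slaves = list(slaves)                  # e.g. ["eth0", "eth1"]
        self.link_up = {s: True for s in self.slaves}
        self.mode = mode                            # "active-backup" or "round-robin"
        self.active = self.slaves[0]                # current active slave (active-backup)
        self._next = 0                              # rotation position (round-robin)

    def set_link(self, slave, up):
        """Record a link-status change for one slave."""
        self.link_up[slave] = up

    def pick_slave(self):
        """Return the slave that should transmit the next packet."""
        if not any(self.link_up.values()):
            raise RuntimeError("no working links in the bond")
        count = len(self.slaves)
        if self.mode == "active-backup":
            # Stay on the active slave until it fails, then move to the
            # next working slave in configured order.
            if not self.link_up[self.active]:
                start = self.slaves.index(self.active)
                for step in range(1, count + 1):
                    candidate = self.slaves[(start + step) % count]
                    if self.link_up[candidate]:
                        self.active = candidate
                        break
            return self.active
        # Round-robin: walk the rotation, skipping slaves whose link is down.
        while True:
            candidate = self.slaves[self._next % count]
            self._next += 1
            if self.link_up[candidate]:
                return candidate
```

In this model an active-backup bond keeps using its backup after the failed link recovers; whether a real bond fails back is a policy choice that the model does not try to settle.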

The program that creates the bond is the ifenslave program. It is similar in function to the ifconfig program that configures nonbonded Ethernet interfaces, except that it configures all members of the bond to the same network configuration (IP, MAC, broadcast addresses, etc.). To configure the bonding driver, use ifconfig to configure the bond0 device, and use ifenslave to configure the members of the bond (the slaves).

Many recent distributions, including the Hard Hat Linux HA Framework 2.0 release, come with bonding and ifenslave already in the distribution. Bonding is also available as a patch that contains the bonding driver and the ifenslave program, as well as some other modifications necessary to make the whole package work properly. The driver can be compiled in or run as a module.

Listing 1 shows a typical configuration scenario. The first line installs the bonding driver as a module in active-backup mode with a link-status check period of 100 ms. Round-robin mode would use a mode parameter of 0. The ifconfig command then sets the IP address for the bond0 device. The two ifenslave commands that follow enslave eth0 and eth1 to the bond0 device. The bond0 device takes the MAC address of the first slave configured in the bond, and this becomes the MAC address for all devices in the bond.

Listing 1. Typical Configuration Scenario


The networking stack talks to the bond0 device, which sends packets out over whichever slave device is appropriate, given the mode and availability status. In Listing 1, the mode is active-backup, and the active Ethernet device is eth0. Inactive Ethernet slaves have NOARP in the status line.
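
The body of Listing 1 is not reproduced in this copy of the article. As a rough stand-in, the sketch below builds the command sequence that the description implies; it only returns the strings and does not run them. The helper name is invented, the IP address is a placeholder borrowed from the had.conf discussion later in the article, and the use of modprobe with the bonding module's mode and miimon parameters is an assumption about how the original listing loaded the driver.

```python
def bonding_setup_commands(ip_address, slaves=("eth0", "eth1"), mode=1, miimon_ms=100):
    """Build (but do not run) the shell commands for a simple bond0 setup.

    mode=1 selects active-backup and mode=0 round-robin; miimon is the
    link-status check period in milliseconds.
    """
    commands = [
        # Load the bonding driver as a module with the chosen mode and check period.
        f"modprobe bonding mode={mode} miimon={miimon_ms}",
        # Give the bond0 device its IP address and bring it up.
        f"ifconfig bond0 {ip_address} up",
    ]
    # Enslave each Ethernet port; the first one supplies the bond's MAC address.
    commands += [f"ifenslave bond0 {slave}" for slave in slaves]
    return commands


if __name__ == "__main__":
    # 10.0.1.1 is only a placeholder address for this illustration.
    for line in bonding_setup_commands("10.0.1.1"):
        print(line)
```

Running the commands themselves requires root privileges and a kernel with bonding support, which is why the sketch stops at printing them.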

Hot Swap

When a component fails, it is not enough to detect and mask the failure. The failing component must be repaired so that the next failure does not cause loss of service. For an Ethernet cable or hub or switch, it is usually a simple matter of replacing it with a working one. For an Ethernet board in a running computer, it is not always so simple.

The PCI Industrial Computer Manufacturers Group (PICMG) has created a set of standards for CompactPCI hardware and software that make it easier to replace defective hardware in a running system. With PICMG-compliant hardware and the proper drivers and dæmons, replacing a defective board in a running system is a simple matter of removing the defective board and replacing it with a working one.

PICMG standard 2.1 is a hardware standard that covers the mechanical and electrical requirements necessary to remove and/or plug in a board in a running system (hot swap). PICMG standard 2.12 is a software standard that covers the driver requirements to handle hot-swap events. The SourceForge PICMG hot-swap site has the hot-swap driver routines and HA dæmon for handling hot swapping.

Hot swap requires additional coordination with drivers and the PCI subsystem to handle PCI devices that come and go. When an Ethernet card fails and the operator wants to remove it, all he or she has to do is open the handle switch on the CompactPCI board. This sends an ENUM# interrupt to the PICMG 2.12 driver, which calls the routine registered to receive hot-swap events. This routine is responsible for notifying the driver for the card, removing the device from the kernel PCI tree and turning on the blue hot-swap LED on the board, which indicates to the operator that it is safe to remove the card. It also notifies the HA dæmon so that it can take any user-space actions necessary (such as removing an Ethernet device from a bond or removing a driver that is no longer used).

When a replacement card is inserted, it also causes an ENUM# interrupt, which gets routed to the same routine mentioned above. This routine is then responsible for inserting the device in the kernel PCI tree and notifying the HA dæmon that a new device has been inserted.
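
The order of operations in the hot-swap routine can be summarized with a small model. This is a sketch of the bookkeeping only, written for illustration: the class and action names are invented, the event type is passed in by the caller rather than decoded from hardware, and switching the LED off on insertion is an assumption the article does not state.

```python
class HotSwapModel:
    """Toy model of the hot-swap routine's bookkeeping; no hardware access."""

    def __init__(self):
        self.pci_tree = set()     # slots currently present in the kernel PCI tree
        self.blue_led_on = {}     # slot -> True when the board is safe to pull
        self.actions = []         # ordered log of what the routine did

    def on_enum(self, slot, event):
        """Handle one ENUM# interrupt; event is 'extract' or 'insert'."""
        if event == "extract":
            self.actions.append(("notify_driver", slot))
            self.pci_tree.discard(slot)
            self.actions.append(("remove_from_pci_tree", slot))
            self.blue_led_on[slot] = True
            self.actions.append(("blue_led_on", slot))
            self.actions.append(("notify_had", slot, "extract"))
        elif event == "insert":
            self.pci_tree.add(slot)
            self.actions.append(("add_to_pci_tree", slot))
            # Assumption: the LED is off once a board is back in service.
            self.blue_led_on[slot] = False
            self.actions.append(("notify_had", slot, "insert"))
        else:
            raise ValueError(f"unknown hot-swap event: {event!r}")
```

The point of the model is the ordering: on extraction the blue LED comes on only after the driver has been told and the device has left the PCI tree, and the HA dæmon is notified in both directions.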

HA Dæmon

The HA Dæmon (HAD) is a user-space program that receives events from the hot-swap subsystem. It takes two configuration files, one to specify which devices are supported (and their corresponding drivers) and one to specify actions to take when a hot-swap event is received.

If the hot-swap subsystem receives an insert event and does not have a driver loaded for the card that was inserted, it sends a load-driver message to the HAD. The HAD checks its device-driver configuration file (/etc/pcidrivers.conf), and if it knows the driver for the card, it loads it. If the card is unknown, the HAD just ignores the insert event.
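
The lookup that the HAD performs on a load-driver message might look something like the sketch below. The format of /etc/pcidrivers.conf is not shown in the article, so the table here is a hypothetical stand-in keyed by PCI vendor and device ID, its single entry is illustrative only, and the function names are invented. The loader is passed in as a callback so that the sketch never touches real kernel modules.

```python
# Hypothetical stand-in for the contents of /etc/pcidrivers.conf.
# Keys are (vendor_id, device_id); values are driver module names.
KNOWN_DRIVERS = {
    (0x8086, 0x1229): "eepro100",   # illustrative entry only
}


def driver_for(vendor_id, device_id, table=KNOWN_DRIVERS):
    """Return the driver module name for a card, or None if it is unknown."""
    return table.get((vendor_id, device_id))


def handle_load_driver(vendor_id, device_id, load):
    """React to a load-driver message; return True if a driver was loaded.

    `load` is a caller-supplied function that takes a module name. An
    unknown card is ignored, mirroring the behavior described above.
    """
    module = driver_for(vendor_id, device_id)
    if module is None:
        return False
    load(module)
    return True
```

A caller could pass a function that invokes the system's module loader, or simply a list's append method when testing.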

The HAD also has another major duty with regard to hot swap, and that is configuring the card that has just been inserted. For example, if the card is involved in networking, it needs to have its address established, or if it is a member of a bond, it needs to be enslaved.

The HAD's configuration file is /etc/had.conf, shown in Listing 2. This file is for a Motorola 8216 chassis with two I/O domains. The first two lines in Section 1 state that this processor is going to control both I/O domains. A chassis with only one I/O domain may skip this section.

Listing 2. Sample /etc/had.conf File


The first line in Section 2 indicates that the HAD will start the bonding driver for bond0 and configure it with IP address 10.0.1.1.

The next two lines define Ethernet configurations that will be used by the ports in the boards described by Section 3.

Configuration config1 is an example of a nonbonded Ethernet configuration. It has four parameters: the IP address, network mask, network address and broadcast address.

Configuration config2 is an example of a bonding configuration that will enslave any board that uses it to the bond0 device configured by the bond command in Section 2.

The remainder of the had.conf file states which configurations are used by devices in the backplane. The first parameter of the device command is the slot, and the second is the subdevice. Thus, the card in slot 2 is a dual Ethernet card, and both Ethernet ports will be enslaved to bond0. The device in slot 12 is also a dual Ethernet card, and the device in slot 16 is a single Ethernet that will be configured with the nonbonded configuration specified by config1.
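
The body of Listing 2 is not reproduced in this copy, so its actual syntax cannot be shown. The relationship it describes, in which each (slot, subdevice) pair points at a named configuration, can still be sketched as plain data. Everything concrete below is an assumption made for illustration: the dictionary layout, the zero-based subdevice numbers, the addresses in config1, and the interface names passed in by the caller. Slot 12 is left out because the text does not say which configuration it uses.

```python
# Named configurations, loosely following the description of Section 2.
CONFIGS = {
    # Nonbonded: IP address, netmask, network address, broadcast (made-up values).
    "config1": {"kind": "plain", "ip": "10.0.2.1", "netmask": "255.255.255.0",
                "network": "10.0.2.0", "broadcast": "10.0.2.255"},
    # Bonded: any port using this configuration is enslaved to bond0.
    "config2": {"kind": "bond", "bond": "bond0"},
}

# (slot, subdevice) -> configuration name, following the description of Section 3.
DEVICES = {
    (2, 0): "config2",    # dual Ethernet card in slot 2, first port
    (2, 1): "config2",    # dual Ethernet card in slot 2, second port
    (16, 0): "config1",   # single Ethernet card in slot 16
}


def commands_for(slot, subdevice, ifname):
    """Return the commands that would configure a newly inserted port."""
    name = DEVICES.get((slot, subdevice))
    if name is None:
        return []                       # no configuration listed for this port
    cfg = CONFIGS[name]
    if cfg["kind"] == "bond":
        return [f"ifenslave {cfg['bond']} {ifname}"]
    # ifconfig derives the network address from the IP address and netmask.
    return [f"ifconfig {ifname} {cfg['ip']} netmask {cfg['netmask']} "
            f"broadcast {cfg['broadcast']} up"]
```

For example, a port inserted in slot 2 would yield a single ifenslave command against bond0, while the card in slot 16 would be given its own address with ifconfig.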

Conclusion

The Linux bonding driver can be an important component of a highly available system and, coupled with the hot-swap capability of CompactPCI hardware, is capable of providing networking with five nines of availability.

The bonding driver could use a number of improvements. It only detects link failure through the Ethernet link-status indicator and could use a mechanism to diagnose more subtle failures. The bonding driver also should be enhanced to provide monitoring software with an indication of when it has detected a link failure and routed around it so that a repair strategy can be implemented. But the beauty of Linux and open source is that you don't have to wait for someone else to do it; you can do it yourself.

Resources




About the author: John Mehaffey is the author of the PICMG 2.12 driver in SourceForge, as well as a participant in a number of PICMG working groups including the 2.12 and 2.13 standards. John works for MontaVista Software as a technical marketing engineer and is also the mayor of Saratoga, California, a city of 30,000 in Silicon Valley. Contact John at mehaf@mvista.com.



Copyright © 2001 Specialized Systems Consultants, Inc. All rights reserved. Embedded Linux Journal Online is a cooperative project of Embedded Linux Journal and LinuxDevices.com.

