Click here to learn
about this Sponsor:
Home  |  News  |  Articles  |  Polls  |  Forum

Keywords: Match:
Design of a Fully Preemptable Linux Kernel
George Anzinger and Nigel Gamble, MontaVista Software     (September 6, 2000)

Introduction

MontaVista has developed a hard real-time fully preemptable Linux kernel, based on Linux kernel 2.4. The preemptable kernel has the potential to dramatically improve application responsiveness of the Linux kernel, while fully preserving the standard Linux programming model. The current prototype of the preemptable Linux kernel (for IA32/X86 platforms) is available for ftp download, at ftp://ftp.mvista.com.

Preemption model

The preemption model used is to allow the kernel to be preempted at any time when it is not locked. This is very different from the model where preemption is actively requested by the currently executing code. Using this model, when an event occurs that causes a higher priority task to be executable, the system will preempt the current task and run the higher priority task. Of course, there are times when this should not be done. These include:
  • While handling interrupts

  • While doing "bottom half" processing. Bottom half processing is work that an interrupt routine needs to do, but which can be done at a more relaxed pace.

  • While holding a spinlock, writelock, or readlock. These locks were put in the kernel to protect it from other processors in Symmetric Multiprocessing (SMP) Systems. While these locks are held, the kernel is not preemptable for reentrancy or data protection reasons (just as in the SMP case).

  • While the kernel is executing the scheduler itself. The scheduler is charged with executing the "best" task and if it is engaged in making that decision, it should not be confused by asking it to give up the processor.
At all other times the MontaVista algorithm allows preemption. Also, whenever the system exits from one of the above states, a test is made to see if preemption is called for. If so, the current task is preempted. (Please see the final paragraph below for important comments on this attribute.)

MontaVista's effort to bring preemption to Linux focused on two changes.

First the basic interrupt entry and completion code "entry.S" and its helpers was modified to ensure that it never returns to a user with a pending soft interrupt, context switch, or signal delivery. It was also modified to never return to system code with a pending soft interrupt, or allowed context switch pending. Here "allowed" means that the preemption lock count is zero. The preemption lock count is incremented whenever a spinlock, writelock, readlock, interrupt or trap is taken and decremented when ever these conditions clear.

The second change centered on modifying the header files for spinlock, writelock, readlock, and some SMP code to produce code that modified the preemption count.

The scheduler was changed only slightly to directly prevent its own preemption and to honor an additional "state" TASK_PREEMPTING flag. This flag is set whenever preemption is taken and tells the scheduler that the task is to be treated as running, even though its actual state may be other than running. This allows preemption to occur during wake_up set up times when the kernel sets the current tasks state to something other than running and then does other set up work on the way to calling the scheduler. By using the TASK_PREEMPTING flag on the state (the underlying state is preserved) the scheduler can distinguish between the preemption call and the completion of the wake_up set up. Without this flag, the task could be put to sleep prior to completion of the wake_up setup, and thus would never wake_up.

Continuing development

MontaVista is continuing to work on preemption and is currently looking at the following:
  • Change the spinlock, writelock, and readlock macros that also disable interrupts to not bother with the preemption counter. The interrupt system being off prevents preemption, so the flag is not needed.

  • Set up the spinlock, writelock, and readlock macros so that both SMP and preemption can be turned on at the same time. There is no reason an SMP system cannot be preemptable.

  • Measure the impact on throughput of a variety of workloads. Note that under combined I/O and CPU intensive workloads, throughput may actually improve.

  • Do timing analysis on the system locks.

  • Using the above analysis for short locks, consider using the interrupt system as a lock, instead of the preemption counter. This will reduce system overhead.

  • Again, using the above analysis, consider ways to reduce the longer times. This would most likely entail using mutex locks to protect the regions, thus blocking only code that needed the same function, rather than the whole processor. These mutex locks would be based on priority-inheriting binary semaphores. (Priority-inheriting semaphores are designed such that a task holding such a semaphore runs at, or above, the priority of the highest priority task waiting for the semaphore. This prevents a low priority task from being blocked by priority while it is holding a semaphore that a higher priority task wants.)
Over the long term, MontaVista is investigating whether preemption locks can be eliminated (or at least greatly reduced in number) by protecting all the short-duration critical regions with spinlocks that also disable interrupts on the local CPU, and the long-duration critical regions with mutex locks. ("Long duration" means much greater than the time taken by two context switches.) This will reduce the overhead of the preemptable kernel, since there will no longer be any need to test for preemption ("polling for preemption") at the end of a preemption-locked region (which could happen tens of thousands of times per second on an average system). Instead, preemption would happen automatically as part of the interrupt servicing that causes a higher-priority process to become runnable ("event-driven preemption"). Typically, this only happens a few times to a few tens of times per second with an average system workload, making the "event-driven preemption" model much more efficient than the "polling for preemption" model. This method also has an added efficiency in that the system will take advantage of the cache disruption caused by the interrupt (which is unavoidable) to continue with the preemption.

Related stories:
   MontaVista unveils fully preemptable Linux kernel prototype
   MontaVista unveils transparent real-time scheduler for Linux
   The Linux Real-Time Characterization Project
   MontaVista Partners with FSMLabs for RTLinux


(Click here for further information)


7 Advantages of D2D Backup
For decades, tape has been the backup medium of choice. But, now, disk-to-disk (D2D) backup is gaining in favor. Learn why you should make the move in this whitepaper.

4 Legal Reasons to Control Internet Access
The Internet is obviously a valuable resource for many organizations. However, many are exposed to legal liability concerns because they fail to control Internet access. Learn if you're safe in this white paper.

Rapidly Resolve J2EE Application Problems
Whether you are in the process of building J2EE applications or have J2EE applications already running in production, you must ensure that they deliver the expected ROI. Learn how in this white paper.

Load Testing 2.0 for Web 2.0
There are many unknowns in stress testing Web 2.0 applications. Find out how to test the performance of Web 2.0 in this white paper.

Build Better Games Online
For the game infrastructure providers, life is complex. Making money from games has become more complicated. Why? Find out in this white paper.

Building a Virtual Infrastructure from Servers to Storage
This white paper discusses the virtual storage solutions that reduce cost, increase storage utilization, and address the challenges of backing up and restoring Server environments.

Gaining Faster Wireless Connections with WiMAX
Welcome to what is quickly becoming the hyperconnected world where anything that would benefit from being connected to the network will be connected. Learn more in this white paper.

Is Your Desktop a Security Threat?
The new wave of sophisticated crimeware not only targets specific companies, but also targets desktops and laptops as backdoor entryways into those business’ operations and resources. Learn how to stay safe in this white paper.

Increasing SAN Reliability by 100 Percent
Storage area networks (SAN) are a strong part of storage plans. Learn how to increase your reliability and uptime by 100 percent in this case study.

 


Got a HOT tip?   please tell us!
Free weekly newsletter
Enter your email...
Click here for a profile of each sponsor:
PLATINUM SPONSORS
GOLD SPONSORS
(Become a sponsor)

ADVERTISEMENT
(Advertise here)

Check out the latest Linux powered...

mobile phones!

other cool
gadgets
HOWTOs: from DevShed & IBM DeveloperWorks:



BREAKING NEWS

• Balanced XIP and the AXFS filesystem
• ARM-based controller has StackableUSB expansion
• Intel multi-core threading library supports Sun Studio
• Multi-protocol wireless mesh gateway runs Debian
• Embedded Linux is doomed. DOOOMED!
• Article explores Linux hot-patching utility
• Sun demo's RIA tools on Android
• Panel PCs support PoE, Linux
• New Palms on horizon?
• Embedded track added to Linux conference
• Ubuntu ported to a PDA
• Linux-friendly SBC hosts dual quad-core Xeons
• Embedded Linux SQL database adds Java, C# bindings
• Lightweight, embedded graphics framework rev'd
• Shock resistant automation computer runs Linux


Most popular stories -- past 30 days:
• Ubuntu ported to ARM
• Linux still top embedded OS
• Linux gains new architecture support
• Linux 2.6.25 release bolsters ARM
• Linux-based diskless notebook costs under $300
• Low-cost MP3 player gains fancy Linux port
• Dutch UMPC runs Ubuntu Linux
• Mini-notebook boasts Linux, near-fullsize keyboard
• PC/104 module runs x86 Linux on 1.85 Watts
• Low-cost Linux-based NAS device supports RAID 1
• Free router distro gains wild WiFi features


Linux-Watch headlines:
• Running a small business on desktop Linux
• Sun launches OpenSolaris
• Via tiptoes toward openness
• Linux certification comes to Italy
• Installing Ubuntu Hardy Heron as a web hosting server
• Black Duck Buys Koders
• Open source conference co-locates with Ubuntu show
• Ubuntu 8.04 ready to challenge Windows
• KDE Linux reaches 52 million Brazilian kids
• Free x86-based Linux router distro rev'd


Also visit our sister site:


Sign up for LinuxDevices.com's...

news feed

Home  |  News  |  Articles  |  Polls  |  Forum  |  About  |  Contact
 
Use of this site is governed by our Terms of Service and Privacy Policy. Except where otherwise specified, the contents of this site are copyright © 1999-2008 Ziff Davis Enterprise Holdings Inc. All Rights Reserved. Reproduction in whole or in part in any form or medium without express written permission of Ziff Davis Enterprise is prohibited. Linux is a registered trademark of Linus Torvalds. All other marks are the property of their respective owners.