EndurIX


		Department of Computer Science 4




Process Cruise Control
Scalability of the core frequency is a common feature of low-power processor architectures. Many heuristics for frequency scaling have been proposed to find the best trade-off between energy efficiency and computational performance. With complex applications that exhibit unpredictable behavior, these heuristics cannot reliably adjust the operating point of the hardware because they do not know where the energy is spent and why performance is lost.
Embedded hardware monitors in the form of event counters have proven to offer valuable information in the field of performance analysis. We will demonstrate that counter values can also capture the power-specific characteristics of a thread.
We propose an energy-aware scheduling policy that benefits from event counters. By exploiting the information from these counters, the scheduler determines the appropriate clock frequency for each individual thread running in a time-sharing environment. A recurrent analysis of the thread-specific energy and performance profile allows the frequency to be adjusted to behavioral changes of the application. While the clock frequency may vary over a wide range, application performance should suffer only slightly (e.g. a 10% performance loss compared to execution at the highest clock speed). Because of the similarity to a car's cruise control, we call our scheduling policy Process Cruise Control. This adaptive clock scaling is accomplished by the operating system without any application support.
Process Cruise Control has been implemented on the Intel XScale architecture, which offers a variety of frequencies and a set of configurable event counters. Energy measurements of the target architecture under variable load show the advantage of the proposed approach.
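The selection step can be illustrated with a minimal sketch. It is not the policy from the paper: it assumes a simple model in which the time a thread spends on core cycles scales with the clock while memory stall time does not, and it picks the lowest frequency whose predicted slowdown stays within the 10% bound. The frequency steps, the CounterSample fields and the function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical frequency steps in MHz; real hardware offers its own set.
FREQUENCIES_MHZ = (333, 400, 466, 533, 600, 666, 733)


@dataclass
class CounterSample:
    """Counter-derived profile of one thread's last time slice (assumed layout)."""
    core_cycles: int        # work that scales with the core clock
    memory_stall_ns: float  # waiting time assumed independent of the core clock


def predicted_time_ns(sample, freq_mhz):
    # cycles / MHz gives microseconds; multiply by 1000 for nanoseconds.
    return sample.core_cycles * 1000.0 / freq_mhz + sample.memory_stall_ns


def select_frequency(sample, max_loss=0.10, frequencies=FREQUENCIES_MHZ):
    """Return the lowest frequency whose predicted slowdown is within max_loss."""
    f_max = max(frequencies)
    budget_ns = predicted_time_ns(sample, f_max) * (1.0 + max_loss)
    for freq in sorted(frequencies):
        if predicted_time_ns(sample, freq) <= budget_ns:
            return freq
    return f_max


compute_bound = CounterSample(core_cycles=1_000_000, memory_stall_ns=0.0)
memory_bound = CounterSample(core_cycles=1_000_000, memory_stall_ns=20_000_000.0)
print(select_frequency(compute_bound), select_frequency(memory_bound))
```

Under this model a memory-bound thread is clocked down, because most of its time does not depend on the core clock, while a compute-bound thread stays at the highest step.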

Process Cruise Control: Event-Driven Clock Scaling for Dynamic Power Management. Andreas Weißel, Frank Bellosa. Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES 2002), Grenoble, France, October 2002.

Memory Compression

Optimal Processor Speed for Low Power Memory Compression
Using event-driven clock scaling, we want to compare the speed and energy consumption of memory compression algorithms running at different processor clock frequencies. Depending on the amount of free memory, compression can be performed at reduced clock speeds to save energy. Under critical memory conditions and for decompression, the processor is set to run at maximum clock speed.
We will study algorithms that differ in compression speed and energy characteristics, e.g. the well-known Ziv-Lempel compressors or the WK algorithms introduced in [2], which are optimized for in-memory data pages.
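The clock choice described above could be sketched as follows. This is only an illustration under stated assumptions: the frequency steps, the two thresholds (critical_free, relaxed_free) and the linear mapping from memory pressure to clock speed are hypothetical and not taken from the project.

```python
# Hypothetical frequency steps in MHz.
FREQUENCIES_MHZ = (333, 400, 466, 533, 600, 666, 733)


def compression_clock_mhz(free_fraction, decompressing,
                          frequencies=FREQUENCIES_MHZ,
                          critical_free=0.05, relaxed_free=0.50):
    """Pick a clock for a (de)compression job.

    free_fraction: share of physical memory currently free (0.0 to 1.0).
    decompressing: True if a page has to be decompressed.
    """
    f_min, f_max = min(frequencies), max(frequencies)
    # Decompression and critical memory conditions always run at full speed.
    if decompressing or free_fraction <= critical_free:
        return f_max
    # With plenty of free memory, compression can run at the lowest clock.
    if free_fraction >= relaxed_free:
        return f_min
    # In between, raise the clock linearly as free memory shrinks (assumed mapping).
    pressure = (relaxed_free - free_fraction) / (relaxed_free - critical_free)
    target = f_min + pressure * (f_max - f_min)
    return min(f for f in frequencies if f >= target)


print(compression_clock_mhz(0.80, decompressing=False))
print(compression_clock_mhz(0.30, decompressing=False))
print(compression_clock_mhz(0.80, decompressing=True))
```

Rounding up to the next available step keeps the sketch on the conservative side when memory becomes scarce.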

Static Power Consumption of the Memory vs. CPU Power Consumption for Compression
RAM modules contribute to the static power consumption of the whole system. Would it be possible to save energy by replacing part of the system's physical RAM with a virtual memory module of approximately the same size using memory compression? The additional energy consumed by the processor when compressing and decompressing memory pages must be taken into consideration.
In contrast to the previous work presented in [2] and [3], we do not use memory compression to avoid slow disk paging: our objective is to save energy by replacing the static power consumption of memory modules with the energy required for the additional computation.
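The trade-off can be made concrete with a small back-of-the-envelope calculation. The sketch below assumes a constant static power for the removed module and a fixed energy cost per compressed or decompressed page; all numbers and function names are hypothetical placeholders, not measurements.

```python
def net_energy_saving_j(static_power_w, interval_s,
                        compressions, decompressions,
                        e_compress_j, e_decompress_j):
    """Energy saved by the removed RAM module minus energy spent on (de)compression."""
    saved = static_power_w * interval_s
    spent = compressions * e_compress_j + decompressions * e_decompress_j
    return saved - spent


def break_even_ops_per_second(static_power_w, e_op_j):
    """Rate of page operations at which compression cost equals the static saving."""
    return static_power_w / e_op_j


# Hypothetical example: a 0.1 W module removed for 60 s,
# 1000 compressions at 1 mJ and 2000 decompressions at 0.5 mJ.
print(net_energy_saving_j(0.1, 60.0, 1000, 2000, 0.001, 0.0005))
print(break_even_ops_per_second(0.1, 0.001))
```

A positive result means the substitution pays off for that interval; the break-even rate gives an upper bound on how often pages may be compressed or decompressed before the saving is lost.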

Hibernation of Memory Banks
State-of-the-art RAM devices support various low-power modes. A significant amount of energy can be saved by putting as many memory chips as possible into a sleep state (see [4]). If more memory has to be allocated, the system can choose between two options, both of which induce short delays: additional banks can be activated, or part of the memory in the already active banks can be compressed. The figure below shows a system with 4 memory banks under increasing memory usage. As a first step, page allocation should cluster an application's pages into the active banks (a). As memory usage increases, part of the active banks' memory is reserved for compressed storage and less frequently used pages are moved to the compressed area (b). As memory usage increases further, more and more chips are activated or used for storing compressed pages (c). Memory banks are ordered by latency (with sleeping banks having the highest) to enable optimal page allocation.
A loss in system performance is unavoidable because of the trade-off between fast and slow (compressed) memory regions and the resynchronization delays when activating memory banks. The memory management system has to minimize this loss while maximizing energy savings.
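The three stages (a) to (c) can be sketched as a toy allocator. This is an illustration only: it assumes a fixed compression ratio, a fixed per-bank limit for the compressed area, and a strict preference for compressing before waking a bank. The class name, its fields and the default sizes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BankedMemory:
    banks: int = 4                 # total memory banks
    pages_per_bank: int = 8        # physical pages per bank (toy size)
    ratio: int = 2                 # logical pages per physical page when compressed (assumed)
    max_compressed_pages: int = 4  # physical pages per active bank usable as compressed area
    active: int = 1                # banks currently awake
    uncompressed: int = 0          # logical pages stored uncompressed
    compressed: int = 0            # logical pages stored compressed

    def physical_used(self):
        # Compressed pages are packed `ratio` to a physical page (rounded up).
        return self.uncompressed + -(-self.compressed // self.ratio)

    def capacity(self):
        return self.active * self.pages_per_bank

    def allocate(self):
        """Allocate one page and report which stage served the request."""
        # (a) Cluster pages into the already active banks.
        if self.physical_used() < self.capacity():
            self.uncompressed += 1
            return "active"
        # (b) Move less frequently used pages to the compressed area.
        limit = self.active * self.max_compressed_pages * self.ratio
        while (self.physical_used() >= self.capacity()
               and self.compressed < limit and self.uncompressed > 0):
            self.uncompressed -= 1
            self.compressed += 1
        if self.physical_used() < self.capacity():
            self.uncompressed += 1
            return "compress"
        # (c) Wake an additional bank.
        if self.active < self.banks:
            self.active += 1
            self.uncompressed += 1
            return "wake"
        raise MemoryError("all banks active and compressed area full")


mem = BankedMemory()
print([mem.allocate() for _ in range(13)])
```

A real policy would weigh the delay and energy cost of both options instead of always compressing first, and would track which pages are actually less frequently used.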

[Figure: Memory Compression. A system with 4 memory banks under increasing memory usage, stages (a) to (c)]

[1] Andreas Weißel and Frank Bellosa. Process Cruise Control: Event-Driven Clock Scaling for Dynamic Power Management. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES 2002), Grenoble, France, October 2002.
[2] Paul R. Wilson, Scott F. Kaplan, and Yannis Smaragdakis. The Case for Compressed Caching in Virtual Memory Systems. In Proceedings of the USENIX Technical Conference, June 1999.
[3] Michael J. Freedman. The Compression Cache: Virtual Memory Compression for Handheld Computers. March 2000.
[4] Alvin R. Lebeck, Xiaobo Fan, Heng Zeng, and Carla Ellis. Power Aware Page Allocation. In Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS IX), November 2000.



Last modified: 2004-06-28 11:35 AW