Dynamic Thermal Management for Distributed Systems
Andreas Weissel, Frank Bellosa, "Dynamic Thermal Management for
Distributed Systems", Proceedings of the First Workshop on
Temperature-Aware Computer Systems (TACS-1), Munich, Germany, June 2004
Abstract:
In modern data centers, increased scale and power densities have an enormous
impact on thermal properties and pose new challenges for the designers of
both computing and cooling systems. Control-theoretic techniques have proven
effective at managing heat dissipation and temperature to avoid thermal
emergencies, but they are not aware of the task currently executing or its
specific service requirements. In this work we investigate an approach to dynamic
thermal management with respect to the demands of individual applications,
users or services. We show that energy consumption and temperature can be
determined at a fine-grained level, without the need for measurement,
using information from event monitors embedded in modern processors. We extend
the well-known abstraction of resource containers to an infrastructure for
transparent energy and temperature management in distributed systems. In a
cluster-based server, the processing of a request can be throttled to meet the
thermal requirements of the system, even if machine boundaries are crossed,
e.g. by remote procedure calls in a client/server relationship. With this
facility, energy consumption can be accounted for and the resulting heat
generation controlled precisely without the need for expensive hardware.
Experiments on a Pentium 4 architecture show that energy and temperature are
accurately determined and that thermal limits for the individual CPU and the
whole distributed system are not exceeded. Use cases and important implications of our
approach are discussed.
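
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that energy is estimated as a weighted sum of event-counter values, that temperature follows a simple first-order (exponential) model, and that a request's energy is charged to a container object that could travel with the request across machines. All event names, weights, constants, and the names `estimate_energy_joules`, `ThermalModel`, `ResourceContainer`, and `run_fraction` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass

# Hypothetical energy cost per counted event, in nanojoules.
# These are illustrative values, not calibrated weights from the paper.
EVENT_WEIGHTS_NJ = {
    "uops_retired": 5.0,
    "l2_misses": 50.0,
    "mem_transactions": 100.0,
}
IDLE_POWER_W = 10.0  # assumed static power draw


def estimate_energy_joules(counts, interval_s):
    """Estimate energy for one interval from event counts (no power meter)."""
    dynamic_nj = sum(EVENT_WEIGHTS_NJ[name] * n for name, n in counts.items())
    return dynamic_nj * 1e-9 + IDLE_POWER_W * interval_s


@dataclass
class ThermalModel:
    """First-order thermal model: temperature approaches a power-dependent
    steady state exponentially. All parameters are assumed values."""
    ambient_c: float = 25.0
    resistance_k_per_w: float = 0.5
    time_constant_s: float = 20.0
    temperature_c: float = 25.0

    def step(self, power_w, dt_s):
        """Advance the temperature estimate by dt_s at constant power."""
        steady_c = self.ambient_c + self.resistance_k_per_w * power_w
        decay = math.exp(-dt_s / self.time_constant_s)
        self.temperature_c = steady_c + (self.temperature_c - steady_c) * decay
        return self.temperature_c

    def power_budget_w(self, limit_c):
        """Largest sustained power that keeps the steady state at limit_c."""
        return (limit_c - self.ambient_c) / self.resistance_k_per_w


@dataclass
class ResourceContainer:
    """Accounting entity for one request or service. Its identifier could be
    passed along with a remote procedure call so that every machine charges
    the same container."""
    name: str
    energy_j: float = 0.0

    def charge(self, joules):
        self.energy_j += joules


def run_fraction(power_w, budget_w):
    """Fraction of time a task may run so its average power meets the budget."""
    if budget_w <= 0.0:
        return 0.0
    if power_w <= budget_w:
        return 1.0
    return budget_w / power_w


if __name__ == "__main__":
    model = ThermalModel()
    container = ResourceContainer("request-1")
    counts = {"uops_retired": 2_000_000_000, "l2_misses": 5_000_000,
              "mem_transactions": 1_000_000}
    energy = estimate_energy_joules(counts, interval_s=1.0)
    container.charge(energy)
    temperature = model.step(power_w=energy / 1.0, dt_s=1.0)
    budget = model.power_budget_w(limit_c=60.0)
    print(container.energy_j, temperature, run_fraction(energy / 1.0, budget))
```

In this sketch, throttling is expressed as a run fraction, so a task whose estimated power exceeds the budget derived from the thermal limit is allowed to execute only part of the time. A real system would calibrate the weights against measurements on the target processor.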