Understand System Level Thermal Resistance

From George, CEO Celsia: Today’s post on understanding thermal resistance is from Dr. Ross Wilcoxon, a thermal management veteran working for Rockwell Collins as a research and development engineer. He specializes in mechanical packaging of electronics used in harsh environments.

If any of you would like to become a guest blogger, please contact me at gmeyer@celsiainc.com.

From Dr. Wilcoxon:

System Level Thermal Resistance – Understand the Problem Before Trying to Solve It

Unless your electronics happen to be in an unusual place, such as on a satellite or a downhole drilling head, they will be air cooled. Even systems that are liquid cooled ultimately dump heat to the ambient air from a radiator. If you look at things from a high enough level, this convective system cooling is pretty straightforward. The overall system-level cooling thermal resistance (i.e. the thermal resistance between a system’s external surfaces and the inlet cooling air) can be expressed as:

R_system-level = R_convection + R_latent, where:

R_convection = 1/hA, with h = the convection coefficient and A = surface area

R_latent = 1/mc_p, where m is the mass flow of cooling fluid and c_p is the fluid’s specific heat

OK, for you marginal purists who think that the definition of R_latent should have a 2 in the denominator, I tend to agree – but it depends a bit on your system design requirements. For you real purists who think that the whole equation is bogus and we need to use a heat exchanger analysis to be truly accurate, again I agree. But the preceding equations are the easiest to use for this discussion and are often close enough for electronics cooling applications. Keep in mind that the system level thermal resistance does not include any internal thermal resistances (the temperature gradients associated with getting heat from the power dissipating components to the external surface of the system); it only accounts for transferring heat from some external surface of the system to the surrounding air.
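To make the two terms concrete, here is a minimal Python sketch of the equations above. The function name and the sample inputs are illustrative assumptions, not values from any real system. The `mean_air_reference` flag is one reading of the "2 in the denominator" remark: if the surface is compared against the average air temperature rather than the exit air temperature, the air term becomes 1/(2 m c_p).

```python
def system_thermal_resistance(h, area, mass_flow, cp=1000.0, mean_air_reference=False):
    """Return (r_convection, r_latent, r_system) in C/W.

    h          convection coefficient, W/(m^2 K)
    area       heat transfer surface area, m^2
    mass_flow  mass flow rate of cooling air, kg/s
    cp         specific heat of the air, J/(kg K)
    mean_air_reference  if True, use 1/(2 m c_p) for the air term
    """
    r_convection = 1.0 / (h * area)
    factor = 2.0 if mean_air_reference else 1.0
    r_latent = 1.0 / (factor * mass_flow * cp)
    return r_convection, r_latent, r_convection + r_latent


# Illustrative inputs only (assumed, not taken from a real system)
r_conv, r_lat, r_sys = system_thermal_resistance(h=50.0, area=0.1, mass_flow=0.01)
print(f"R_convection = {r_conv:.2f} C/W, R_latent = {r_lat:.2f} C/W, R_system = {r_sys:.2f} C/W")
```

Multiplying R_system by the power dissipation gives a rough estimate of how far the external surface sits above the inlet air temperature.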

There are a couple things to note about the parameters in the overall thermal resistance equation. First of all, since they are all in the denominators of the equations, if we want to reduce the thermal resistance we have to increase one or more of them. We can increase the convection coefficient by making the flow more turbulent and, in some cases, by increasing the air velocity; we can increase the surface area by adding fins; more flow corresponds to a higher mass flow rate (as does using higher density fluid); and a fluid with a larger specific heat will increase cooling capacity. So, for example, high humidity air has a slightly higher specific heat than dry air – but it also has an even more slightly lower density, which partially offsets the already small improvement.

More importantly, the overall thermal resistance is the sum of two different resistances. So it is important that the contribution of each thermal resistance be understood in order to ensure that attempted improvements actually accomplish something. If the convective thermal resistance is 10 C/W and the latent thermal resistance is 1 C/W, adding more fans to increase the flow rate of air won’t do much of anything to reduce the overall thermal resistance. OK, it might help a little bit because higher flow rates may lead to somewhat higher convection coefficients, but the improvement won’t likely be huge. On the other hand, if your system doesn’t have a sufficient flow rate, putting in a bigger heat sink will lead to less than stellar improvements.
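The arithmetic behind that example can be sketched in a few lines of Python. The 10 C/W and 1 C/W values come from the paragraph above; the "doubling" scenarios, and the simplification that the convection coefficient does not change with flow, are assumptions made purely for illustration.

```python
r_convection = 10.0  # C/W, from the example above
r_latent = 1.0       # C/W, from the example above

baseline = r_convection + r_latent

# Doubling the mass flow halves the latent term (h assumed unchanged).
doubled_flow = r_convection + r_latent / 2.0

# Doubling h*A (e.g. more fin area) halves the convection term.
doubled_area = r_convection / 2.0 + r_latent

flow_gain = 100.0 * (baseline - doubled_flow) / baseline
area_gain = 100.0 * (baseline - doubled_area) / baseline

print(f"Baseline:      {baseline:.1f} C/W")
print(f"Twice the flow: {doubled_flow:.1f} C/W ({flow_gain:.1f}% better)")
print(f"Twice the h*A:  {doubled_area:.1f} C/W ({area_gain:.1f}% better)")
```

With these numbers, twice the airflow buys less than a 5% improvement, while twice the effective surface area cuts the total nearly in half.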

Understanding the contributions of the individual thermal resistances provides guidance on where to focus your efforts to improve a design. If the overall thermal resistance is dominated by convection, there may be opportunities to reduce fan speed and therefore the noise. Or if the latent thermal resistance is more significant in a design with some margin, a smaller heat sink could reduce system weight without degrading thermal performance. Maybe these suggestions are obvious, but it just seems worth pointing out that it is a good idea to estimate the magnitude of both terms in the system thermal resistance before embarking on some attempt to improve it.

That is, of course, assuming that you have control over both terms; sometimes one of the terms may be constrained by other factors. For example, commercial avionics receive a specific amount of airflow from the aircraft’s cooling system based on their power dissipation. Under normal operation, a system will receive sufficient airflow such that the air’s exhaust temperature is ~15C above the inlet temperature – regardless of what the system power dissipation is.
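A short sketch of what that constraint implies, using the same m = Q/(c_p ΔT) relationship: if the air temperature rise is pinned at roughly 15 C, the supplied mass flow scales with power, and the latent term is no longer something the designer can tune. The power levels below are hypothetical, and c_p is rounded to 1000 J/(kg K).

```python
CP_AIR = 1000.0     # J/(kg K), rounded value for air
DELTA_T_AIR = 15.0  # C, approximate exhaust rise quoted above


def supplied_mass_flow(power_w, delta_t=DELTA_T_AIR, cp=CP_AIR):
    """Mass flow (kg/s) that produces the given air temperature rise."""
    return power_w / (cp * delta_t)


def latent_resistance(mass_flow, cp=CP_AIR):
    """R_latent = 1 / (m c_p), in C/W."""
    return 1.0 / (mass_flow * cp)


results = []
for power_w in (50.0, 100.0, 200.0):  # hypothetical power levels
    m = supplied_mass_flow(power_w)
    r = latent_resistance(m)
    results.append((power_w, m, r, power_w * r))
    print(f"{power_w:5.0f} W: m = {m * 1000:.2f} g/s, "
          f"R_latent = {r:.3f} C/W, air rise = {power_w * r:.1f} C")
```

Whatever the power, the product Q × R_latent stays at 15 C, so any remaining thermal margin has to come from the convection term.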

Keeping a big picture perspective on the influence of mass flow on temperature rise can extend beyond just the simple thermal resistance equation that I showed. Sometimes, things that are done to improve the thermal resistance for a given sub-system can lead to it actually getting hotter if one doesn’t keep track of the overall system. I saw an example of this a few years ago when I was working on a prototype radio for an airborne demonstration. The radio was mounted in a chassis that was placed in an auxiliary rack of an electronics pod on a military aircraft. The pod included a primary electronics system that used chilled air that was supplied by a refrigeration system built into the pod. Once the cooling air flowed through that primary system, it then flowed through the auxiliary rack before leaving the pod. The air flow was split more or less equally between the three slots in the auxiliary rack; for our demonstration two of the slots were empty.

There was considerable concern about our radio getting too hot in this demonstration. It had actually not been designed for use in this type of application; it really was just a lab prototype that was pressed into service for a field demonstration. Everyone involved was intent on improving the radio’s thermal management. So I didn’t think too much of it when one of our electrical engineers mentioned that he had helped to rebalance the airflow in order to improve the cooling in our slot in the pod.

A week later I was pulled into a panicked telecon because the system was shutting down due to high temperatures during the field demonstrations. Everyone had had such high hopes for this test since the airflow had been rebalanced. I eventually smartened up enough to ask exactly what was meant by ‘rebalancing’ the airflow. It turned out that it consisted of putting tape across the exits of the two empty slots in the auxiliary rack to force all of the cooling air to go through the slot that held our radio. That sounded fine until one actually thought about it. Since the three slots were in parallel at the end of the flow channel, blocking two of the slots increased the total pressure drop through the system and thereby reduced the flow rate that the fan could generate.

The primary electronics system in the pod probably dissipated ten times more power than our radio. So any reduction in the flow going through the system would lead to a significant increase in the temperature of the air as it left the primary electronics system – which meant that our radio was being cooled by hotter air. In terms of the system level thermal resistance, the increase in latent resistance could overwhelm any reduction in convective resistance associated with the higher flow velocity in one slot of the auxiliary rack. I recommended getting rid of the tape on the other slots but, while it did help somewhat, that still wasn’t enough to get things to run cool enough. Heroic measures were performed by the guys at the test to keep the radio running (that’s probably a topic for another essay), and they made it through the day with sufficient data and a reasonably happy customer.
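The effect can be illustrated with a rough Python sketch. Every number here is hypothetical: the supply temperature, both power levels (chosen only to keep the ten-to-one ratio), the flow rates, and especially the assumption that taping two slots cut the total flow in half. The real reduction would depend on the fan curve and the pod's flow resistance.

```python
CP_AIR = 1000.0  # J/(kg K), rounded value for air


def radio_air_temps(total_flow, radio_slot_fraction, t_supply=10.0,
                    q_primary=1000.0, q_radio=100.0):
    """Return (air temp entering the radio slot, air temp leaving it) in C.

    total_flow           total pod airflow, kg/s (hypothetical)
    radio_slot_fraction  share of that flow passing through the radio slot
    """
    # All of the air first picks up the primary system's heat.
    t_into_rack = t_supply + q_primary / (total_flow * CP_AIR)
    # Only the radio slot's share of the flow carries away the radio's heat.
    slot_flow = total_flow * radio_slot_fraction
    t_out_of_slot = t_into_rack + q_radio / (slot_flow * CP_AIR)
    return t_into_rack, t_out_of_slot


# Three open slots: flow splits three ways.
open_in, open_out = radio_air_temps(total_flow=0.06, radio_slot_fraction=1 / 3)
# Two slots taped: all flow through the radio slot, but less total flow (assumed half).
taped_in, taped_out = radio_air_temps(total_flow=0.03, radio_slot_fraction=1.0)

print(f"Open slots:  air in {open_in:.1f} C, air out {open_out:.1f} C")
print(f"Taped slots: air in {taped_in:.1f} C, air out {taped_out:.1f} C")
```

Under these assumptions the radio slot sees more flow after taping, yet the air reaching it is roughly 17 C hotter because the upstream system heats a smaller total flow.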

As it turned out, when things were torn down at the end of the day, someone discovered a small washer had been left in the excessively thick film of thermal grease between the radio and the mounting plate. This had apparently happened when our radio was connected to its mounting plate (by someone at the company that we were teamed with on this program). Once that washer (as well as most of the thermal grease) was removed, the thermal performance of the assembly considerably improved. With the improved, non-washer configuration, the system met thermal performance requirements – and the detrimental effects of the ‘rebalanced’ flow configuration would have likely been more noticeable.

There are probably a couple morals in that story, the first of which being that if you have so much thermal grease that you are misplacing washers in it, that is probably going to be a problem. Applying too much thermal grease is a) apparently a characteristic that is embedded in human DNA; people inherently believe that if a little grease is good then a lot of grease must be great, and b) a topic suitable for a stand-alone blog entry/rant that goes beyond the system-level perspective that was the goal of this entry.

The main point relevant to that system-level perspective is that one should keep in mind that faster air flow, or even more of it, does not necessarily lead to better cooling. Same for a bigger heat sink. It’s important to remember that, when you do something to reduce one of the two thermal resistances that make up the system level resistance, you want to avoid doing something stupid that increases the other one.

The challenge in that is knowing enough about your system to be able to judge the magnitude of the two thermal resistances. Specific heat of air is the easy one: just assume it is 1 kJ/kg K. You should be able to make a rough estimate of the heat transfer area if you know anything about the geometry of your system. For the convection coefficient, you can just SWAG numbers of 10 and 50 W/m^2 K for free and forced convection respectively. Flow rate is a bit tougher – if you know what the maximum flow rate of the fan is, assume you are running at half that. Or if you have an actual system and can measure an exhaust temperature (and know your power dissipation), you can estimate the mass flow rate by m = Q/[c_p (T_exit – T_inlet)]. That should work for either free or forced convection. All these numbers are likely to be fairly inaccurate, but that’s OK since the goal is not to nail down precise numbers. Instead, the main objective is to compare the magnitudes of the two thermal resistances to determine if one of them is dominating the other and should receive more attention in any attempts to improve the thermal management.
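That back-of-the-envelope procedure might look like the following Python sketch. The rule-of-thumb constants are the ones quoted above; the function names and the example system (100 W, a 10 C air temperature rise, 0.05 m^2 of surface area) are made up for illustration.

```python
CP_AIR = 1000.0   # J/(kg K), i.e. 1 kJ/kg K
H_FREE = 10.0     # W/(m^2 K), rough guess for free convection
H_FORCED = 50.0   # W/(m^2 K), rough guess for forced convection


def mass_flow_from_exhaust(power_w, t_exit, t_inlet, cp=CP_AIR):
    """Estimate mass flow (kg/s) from m = Q / [c_p (T_exit - T_inlet)]."""
    return power_w / (cp * (t_exit - t_inlet))


def compare_resistances(h, area, mass_flow, cp=CP_AIR):
    """Return (r_convection, r_latent, name of the larger term)."""
    r_convection = 1.0 / (h * area)
    r_latent = 1.0 / (mass_flow * cp)
    dominant = "convection" if r_convection > r_latent else "latent"
    return r_convection, r_latent, dominant


# Hypothetical system: 100 W, air warms from 25 C to 35 C, 0.05 m^2 of fin area.
m = mass_flow_from_exhaust(power_w=100.0, t_exit=35.0, t_inlet=25.0)
r_conv, r_lat, dominant = compare_resistances(H_FORCED, area=0.05, mass_flow=m)

print(f"Estimated mass flow: {m:.3f} kg/s")
print(f"R_convection = {r_conv:.2f} C/W, R_latent = {r_lat:.2f} C/W")
print(f"Dominant term: {dominant}")
```

For this made-up system the convection term is about four times the latent term, which would point toward more surface area rather than more airflow.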


Please contact us if you’d like to learn more about how Celsia can help with your next heat sink project. We’ve worked on everything from consumer devices to industrial test equipment that require heat sinks to cool anywhere from a few watts to a few kilowatts.

Vapor Chambers & Heat Pipes Cool Performance FBDIMMs


In a recent post, I talked about the use of fans and micro-thin heat pipes to cool smartphones. Today, I’d like to take you through a project Celsia tackled a number of years ago: cooling performance DDR3 ‘gamer’ memory modules using ultra-thin vapor chambers. This example should serve to illustrate the design advantages of vapor chambers over heat pipes as well as to comment on how consumer perception affects product success.

Whether it’s the CPU, graphics card or memory modules, PC gamers are notorious for demanding products that push performance limits. What’s better than being able to seamlessly run a taxing application in a higher resolution, fire an extra round of ammunition before your opponent, or have the bragging rights to the coolest looking gaming rig around? Nothing!

Mushkin approached us several years ago about creating a memory module cooler that helped gamers build this kind of machine. For most of us, bare-naked modules are just fine. Gamers’ speed requirements drive them to seek the fastest, most stable components which are then overclocked. The added heat generally requires more robust thermal solutions where every degree counts. In the case of performance memory, the most common solution is a simple aluminum spreader as seen below.

Memory Module with Aluminum Heat Spreader (Source: Kingston)

Attempting to further cool these modules, many of the top companies added one or more heat pipes. While visually aggressive, these coolers rely on the aluminum side spreaders to transport heat to the two phase device, limiting their thermal transport efficiency. Later models tried to solve this problem using flattened heat pipes that made direct contact with the heat source, although they appreciably increased the overall thickness of the module and didn’t entirely cover each FBDIMM. Moreover, both of these solutions increased the height of the heat sink to the point where they couldn’t be used with larger, more extravagant CPU coolers – memory slots are typically located very close to the CPU.

Heat Pipes for Cooling Performance Memory (Source: Apacer & OCZ)

Large CPU Heat Sink with FBDIMMs Installed (Source: overclock.net)

Celsia’s challenge was to design and manufacture a low profile solution (x,y,z dimensions) that increased both heat spreading and dissipation while still allowing the use of any CPU heat sink and the ability to populate every memory slot. Since the mid-2000s, we had been working on perfecting the thermal efficiency and mass production yield rates of one-piece vapor chambers. These devices differ from heat pipes in that they can be made thinner while still allowing adequate vapor flow. They differ from the traditional two-piece vapor chamber designs due to both lower cost and reduced thickness. In either case, this new solution needed to outperform existing competitive offerings while hitting cost targets.

We modeled and turned around our first prototype within a few weeks. A couple of design tweaks later, we arrived at a solution which used two 1.5mm thick vapor chambers and thermal interface material (TIM) sandwiched between ribbed aluminum spreaders with a small vertical fin stack. Mushkin attached these heat sinks to their top of the line memory and named them “eVCI Coolers” (enhanced vapor chamber interface – oh those marketing folks). In addition to meeting all the design and performance criteria, it was the first time vapor chambers had been used to cool FBDIMMs.

Celsia Designed Vapor Chamber Memory Cooler

Two 1.5mm Vapor Chambers per Module

Mushkin tested this heat sink against bare modules, in real-world gaming scenarios, in order to highlight the benefits. The temperature of modules without a heat sink was 43.7 degrees C above ambient while those using eVCI heat sinks measured 22.7 degrees C above ambient. In spite of achieving some really stellar thermal performance figures against many FBDIMM heat sink styles, these modules were not received well by the market. Perhaps because they were thermally cool but not visually cool, gamers opted for a more in-your-face design. Maybe elaborate heat pipe solutions just looked like they’d cool better. Market mysteries! But, it might serve as a lesson to thermal engineers who design for consumers whose devices are viewed as an extension of themselves and who revel in technical achievement as well as visual appeal.

This project is a good example of how ultra-thin vapor chambers can be used in space constrained environments. Power and power densities were low enough in this application to allow for a custom mesh wick to be used while a one-piece vapor chamber design kept the cost down. If you’d like to learn more about how Celsia can help with your next heat sink project, please contact us. We’ve worked on everything from consumer devices to industrial test equipment that require heat sinks to cool anywhere from a few watts to a few kilowatts.