Platform: All Platforms Versions: All versions

## Problem Description

I am going to buy a new dedicated computer for running COMSOL Multiphysics®. What hardware do you recommend?

## Solution

Due to the wide range of problem types that COMSOL Multiphysics® solves, the rapid pace of software and hardware development, and the variety of hardware available at significantly different price points, there is no single optimal choice of computer for all use cases.

#### Memory

The single most important factor is that you have enough physical memory (RAM) to solve the largest models that you want to work with, and that the RAM is correctly installed. If you do not have enough RAM, then there will be significant slowdown, regardless of all other hardware choices.

To predict RAM requirements, solve similar, but smaller, models that contain the same physics that you want to solve in your largest models. Monitor the memory used and the number of degrees of freedom, both of which are reported in the Solver Log. Fit a curve of the form A × (dof)^N to this data, where A and N are fitting coefficients and dof is the number of degrees of freedom, and use the fitted curve to predict the memory requirements of your larger models. The exponent N will usually be between 1 and 2. When using multigrid preconditioning with an iterative solver, N will be closer to 1, and when using a direct solver it will be closer to 2. The factor A depends on the sparsity of the problem. For example, for a thermal radiation problem where the degrees of freedom are nonlocally coupled, A will be much higher than for a conductive heat transfer problem, where there are only local couplings between degrees of freedom.
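The fitting procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a COMSOL feature: the function names `fit_power_law` and `predict_memory` are hypothetical, the sample measurements are invented placeholders, and the fit is done by ordinary least squares on the logarithms of the data, which is one common way to fit a power law. Replace the sample values with the degrees of freedom and memory usage read from your own Solver Log.

```python
import math

def fit_power_law(dofs, mem_gb):
    """Fit mem = A * dof**N by linear least squares in log-log space.

    Returns the tuple (A, N). Needs at least two distinct dof values.
    """
    xs = [math.log(d) for d in dofs]
    ys = [math.log(m) for m in mem_gb]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    exponent = sxy / sxx                      # slope of the log-log line is N
    factor = math.exp(y_mean - exponent * x_mean)  # intercept gives log(A)
    return factor, exponent

def predict_memory(factor, exponent, dof):
    """Extrapolate the fitted curve to a larger model."""
    return factor * dof ** exponent

if __name__ == "__main__":
    # Placeholder measurements (assumed values, for illustration only).
    dofs = [50_000, 100_000, 200_000, 400_000]
    mem_gb = [1.2, 2.9, 7.0, 16.9]

    A, N = fit_power_law(dofs, mem_gb)
    print(f"A = {A:.3e}, N = {N:.2f}")
    print(f"Predicted memory at 2e6 dof: {predict_memory(A, N, 2_000_000):.1f} GB")
```

Because the fit is an extrapolation, treat the prediction as a rough estimate and leave some headroom for the operating system and other processes.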

Be aware that memory usage versus degrees of freedom can be very different between different model types, so you may need to repeat this procedure for every type of model that you wish to solve. You will need a computer with at least this amount of RAM. Also be aware that there is no advantage to having significantly more RAM than is actually needed. Make sure to use the fastest possible memory speed supported by the CPU that you choose.

Performance is also strongly dependent on how the memory is installed. All computers access the installed memory via a multichannel memory bus. The memory speed will be clocked down if the memory banks are not correctly populated. For example, consider a single-CPU computer with four memory channels and four memory banks (one per memory channel), where each bank has four open slots, for a total of 16 open DIMM slots, as shown in the schematic below.

Usually, the memory speed is reduced if more than two slots are used in any bank, but on some systems a slowdown occurs as soon as more than one slot per bank is used. Your hardware vendor should provide this information. So, for example, if you want to install 16GB of RAM in the above system, then install either four 4GB or eight 2GB DIMMs, and make sure that all memory banks are used. Installing four 4GB DIMMs leaves the most space for installing more RAM, and takes best advantage of the multiple memory channels. Do not install the DIMMs in such a way that some of the memory channels are left unused, since this leads to significant slowdown. If necessary, add more DIMMs so that every memory channel is populated. This is summarized in the schematic below.
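The population rule above can be expressed as a small enumeration. The sketch below is illustrative only: the function name `balanced_dimm_options` and its parameters are hypothetical, it considers only configurations with identical DIMMs and the same number of DIMMs on every channel, and the limit on DIMMs per channel before the memory is clocked down is an assumption that you should confirm with your hardware vendor.

```python
def balanced_dimm_options(total_gb, channels, dimm_sizes_gb, max_per_channel=2):
    """List DIMM layouts that reach total_gb with every channel equally populated.

    max_per_channel is the assumed number of DIMMs per channel that can be
    used without a speed reduction (vendor-specific; 2 is only a default).
    Returns tuples of (dimm_count, dimm_size_gb, dimms_per_channel).
    """
    options = []
    for per_channel in range(1, max_per_channel + 1):
        count = channels * per_channel        # same DIMM count on each channel
        for size in dimm_sizes_gb:
            if count * size == total_gb:
                options.append((count, size, per_channel))
    return options

if __name__ == "__main__":
    # The 16GB, four-channel example from the text.
    for count, size, per_channel in balanced_dimm_options(16, 4, [2, 4, 8, 16]):
        print(f"{count} x {size}GB DIMMs, {per_channel} per channel")
```

For the four-channel example this lists four 4GB DIMMs and eight 2GB DIMMs, matching the two layouts named in the text. A layout such as two 8GB DIMMs is not listed because it would leave two channels unused.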

#### Other Factors Affecting Overall Software Speed

There is a complicated relationship between performance, CPU type, CPU base frequency, cache, number of CPUs, number of cores per CPU, and hardware cost. The COMSOL codebase is composed of several different classes of algorithms, and these algorithms have different scaling properties. Therefore, some hardware factors will weigh more heavily on performance than others, and the relative merits of these factors are both problem-type and problem-size dependent. It is thus very difficult to make specific hardware recommendations. The following are general recommendations.

##### CPU Type

Different CPU architectures offer different sets of features, at significantly different prices.

High-end CPUs, such as the Intel® Xeon® Gold and Platinum or AMD® EPYC® processors, have CPU-to-CPU interconnects that enable multiple CPUs per computer and allow the CPUs to communicate with each other to access very large amounts of memory. These processors have the highest memory bandwidth, that is, the ability to quickly move a lot of data back and forth between RAM and the processor. That is their primary advantage when running COMSOL. High-end CPUs should be used in dual-CPU, or even four-CPU or eight-CPU, configurations. Such configurations are justified if you need to address very large amounts of memory, or are planning to continuously run many simulations in parallel. When solving a single model, performance will improve with an increasing number of CPUs, but the relative performance improvement depends on model size. Larger models will see greater speedup on multi-CPU systems. If you are considering purchasing a four- or eight-CPU system, please contact COMSOL Technical Support.

Mid-range CPUs, such as the Intel® Xeon® W or AMD® Ryzen™ Threadripper™ processors, do not have CPU-to-CPU interconnects and are thus an appropriate choice for a single-CPU computer. Their clock speeds and core counts are comparable to those of high-end systems. They are an attractive all-around choice.

Entry-level CPUs, such as Intel® Xeon® E processors, have two memory channels, do not have CPU-to-CPU interconnects, and cannot address as much memory. They have the lowest memory bandwidth, but can have high clock speeds. They are not as good a choice for running multiple simulations in parallel, but can often solve single models very quickly.

The CPUs listed above are current-generation processors marketed towards the professional engineering community. There are also processors marketed mostly towards consumers that share many of the same features and can have comparable performance, usually at lower cost.

##### Clock Frequency

Higher clock frequency will generally lead to faster performance of the software in all areas. If all other hardware specifications are the same, the relative performance between two computers will be most directly dependent on clock frequency.

##### Cache Memory

Cache memory is built directly into the processor. A larger cache is better. All other factors being equal, a machine with a larger cache will show better performance.

##### Number of Cores

The more cores in the processor, the more parallel threads can be executed at once. This is known as multithreading. COMSOL will automatically take advantage of all available cores, but there is a computational cost to this. Using too many cores in parallel may even lead to a slowdown, although usually only for relatively small models. Some models are even dominated by their single-thread performance. In general, six- or eight-core systems are a good all-around choice, but more cores than that can be better, especially when running multiple models in parallel, or when using the PARDISO direct solver.

#### General Recommendations

##### Parametric Sweeps

If you plan to solve for many geometric variations, different meshes, different sets of materials, or other parameters within each unique model, then you will be using the Parametric Sweep functionality. For example, a sweep over 10 variations of a part dimension along with a sweep over 10 different materials and 10 different model parameters would require solving a similar model 1000 times, and the solution time when running this as a single job on a single computer will be (in the worst case) almost exactly 1000 times greater.
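The growth in the number of solves is simply the product of the number of values of each swept parameter. The short sketch below makes this explicit for the example above. The function name `count_sweep_cases` and the parameter names in the dictionary are hypothetical labels chosen for illustration and are not COMSOL identifiers.

```python
import math
from itertools import product

def count_sweep_cases(sweep):
    """Number of solves in a full-combination sweep over all listed parameters."""
    return math.prod(len(values) for values in sweep.values())

if __name__ == "__main__":
    # Ten values each of three swept quantities, as in the example in the text.
    sweep = {
        "part_dimension": list(range(10)),
        "material": list(range(10)),
        "model_parameter": list(range(10)),
    }
    n_cases = count_sweep_cases(sweep)
    # Cross-check by enumerating every combination explicitly.
    assert n_cases == len(list(product(*sweep.values())))
    print(f"{n_cases} model solves required")
```

Adding a fourth parameter with 10 values would multiply the count by 10 again, which is why it is worth checking the total before launching a sweep.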

Solution time for sweeps over large numbers of parameters can be reduced by running jobs in parallel, either on a single computer, using any license type, or on a cluster computer, using the Floating Network License.

To solve in parallel on a single computer, use the Batch Sweep functionality. Running parametric sweeps in parallel on a single computer is only advised if all models will fit within memory at the same time. For example, if one instance of the model requires 3GB of RAM to solve, then it can make sense to run four simultaneous jobs on a 16GB RAM computer. For models with small memory requirements, you may see an improvement running as many simultaneous jobs as there are cores. The relative speedup when using Batch Sweep is both model- and hardware-dependent.
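As a rough planning aid, the number of simultaneous jobs is bounded by both the available memory and the number of cores. The sketch below is an assumption-laden estimate, not a COMSOL setting: the function name `max_simultaneous_jobs` is hypothetical, and the `reserve_gb` headroom for the operating system and other processes is an assumed value (2GB here) that you should adjust for your own system.

```python
def max_simultaneous_jobs(ram_gb, ram_per_job_gb, cores, reserve_gb=2.0):
    """Upper bound on parallel jobs such that all of them fit in RAM at once.

    reserve_gb is an assumed allowance for the operating system and other
    software. Returns 0 if even a single job does not fit.
    """
    by_memory = int((ram_gb - reserve_gb) // ram_per_job_gb)
    return max(0, min(by_memory, cores))

if __name__ == "__main__":
    # The example from the text: 3GB per job on a 16GB, eight-core computer.
    print(max_simultaneous_jobs(ram_gb=16, ram_per_job_gb=3, cores=8))
```

With the assumed 2GB reserve, the example from the text gives four jobs. This is only an upper bound: as noted above, the actual speedup is model- and hardware-dependent, so it is worth timing a few settings below this limit.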

To solve Parametric Sweeps in parallel on a cluster, use the Cluster Sweep functionality. The number of parallel jobs that you can run at once is limited only by the number of available nodes on the cluster. You can run on your own cluster or use a third-party cluster. COMSOL maintains a list of Technology Partners who provide on-demand computing resources for cluster computations. Each node of the cluster need only meet the requirements described for running a unique model. For further guidance on cluster hardware, see Knowledge Base 1116.

Always consider if you can avoid large sweeps by using the Optimization Module.

##### OS

In versions of COMSOL Multiphysics prior to version 5.4, Linux and macOS operating systems could outperform Windows on some processors with many cores.

##### Hard Drives

Solid State Drives give overall better system performance compared to Hard Drives. Faster drives are always better, but if the system is using the drive for swap space (virtual memory) on the models you are solving, it is better to upgrade the RAM rather than to invest in faster drives.

##### Graphics

We recommend modern AMD- or NVIDIA-based dedicated graphics cards. A list of tested graphics cards can be found on the system requirements page. The larger the memory in the graphics card, the more complex the models that can be visualized. Note that a model that requires large amounts of RAM to solve will not necessarily require a large video card to display, and vice versa.

##### GPUs

General-purpose computing on graphics processing units is not currently supported.