Platform: All Platforms
Applies to: COMSOL Multiphysics®
Versions: 5.3a

Problem Description

You compute a distributed parametric sweep on a cluster. When you process the results, some values are missing, or you get errors such as "The solution number is invalid" or "Solution number is out of range".


Solution

This issue is fixed in COMSOL Multiphysics update 4. To install this software update, select Help > Check for Product Updates (available in the File menu on Windows). The update is also available for manual download on the Product Update page.

Workaround (if update is not possible)

In some cases, you can avoid the problem by changing the Keep solutions in memory option in the Parametric Sweep settings to Only last. In that case, use probes to extract the data you need, and enable Accumulated probe table so that probe updates are accumulated both on the solver level and on the parametric sweep level. Another option is to enable Save each solution as model file, which stores the result of each parameter tuple in a separate file that you can later open for postprocessing.

Settings for the distributed Parametric Sweep

If you need all data for postprocessing, you can use the Cluster Sweep functionality instead of the distributed Parametric Sweep. Replace the Parametric Sweep node with a Cluster Sweep node that has the same settings. Set the Number of nodes to 1 and enable the Synchronize solutions option under Batch Settings. You also have to set up the cluster computing settings; see Settings for the Cluster Computing Node in How to run on clusters from the COMSOL Desktop environment. The same settings apply to the Cluster Sweep node. If you already have a Cluster Computing node, the Cluster Sweep node replaces both the Cluster Computing node and the Parametric Sweep node.

Settings for the Cluster Sweep

Difference between a Distributed Parametric Sweep and a Cluster Sweep

If you are running COMSOL Multiphysics on a cluster and your model contains a Parametric Sweep, you typically select the Distribute parametric sweep check box to distribute the sweep parameters across the nodes. The parameter values or tuples are divided among the compute nodes, and each compute node processes one or more of them. The resources (cores) assigned to a compute process are used when solving. Several parameter values are thus solved in parallel by the compute nodes of the parallel job.
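The idea of dividing parameter tuples among compute nodes can be illustrated with a minimal Python sketch. It is not COMSOL code, and the round-robin assignment is an assumption made for illustration only; this article does not state which scheduling strategy COMSOL Multiphysics uses internally. The parameter names and the function distribute_round_robin are hypothetical.

```python
from itertools import product


def distribute_round_robin(tuples, n_nodes):
    """Assign parameter tuples to compute nodes in round-robin order.

    Illustrative assumption only: the real distribution strategy used by
    the solver is not documented here.
    """
    assignment = [[] for _ in range(n_nodes)]
    for i, t in enumerate(tuples):
        assignment[i % n_nodes].append(t)
    return assignment


# Hypothetical sweep over all combinations of two parameters (6 tuples).
lengths = [1.0, 2.0, 3.0]
voltages = [10, 20]
tuples = list(product(lengths, voltages))

# With 4 compute nodes, some nodes receive two tuples and some only one.
assignment = distribute_round_robin(tuples, n_nodes=4)
for node, work in enumerate(assignment):
    print(f"node {node}: {work}")
```

In this sketch, every tuple is solved exactly once, and nodes work on their own share in parallel, which mirrors the description above of one or more parameter values or tuples per compute node.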

If you are using a Cluster Sweep node, a designated cluster job is launched for every parameter value or parameter tuple specified in the parameter list in the Study Settings section of the Cluster Sweep node. This configuration can also be beneficial when some parameter values are expected to fail. If the cluster jobs are not managed by an external scheduler (for example, SLURM or PBS), the Number of simultaneous jobs in the Study Extensions section of the Cluster Sweep node needs to be set to the number of available nodes or to the number of parameter values or parameter tuples.

If the work per parameter value or tuple is small, consider running fewer cluster jobs and letting each job handle multiple parameter values or tuples. This reduces the overhead caused by starting many cluster jobs. In this case, you need to set up a nested Cluster Sweep and Parametric Sweep, with an additional parameter in the Cluster Sweep that controls the offset or the parameterization of the parameter list handled by the (nondistributed) Parametric Sweep.
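The offset-based nesting can be sketched in plain Python to show the arithmetic involved. This is not COMSOL code: the names chunk_offsets, inner_values, and chunk_size are hypothetical, and the sketch assumes the outer (Cluster Sweep) parameter is an integer offset into the full parameter list while the inner (Parametric Sweep) handles a fixed-size slice starting at that offset.

```python
def chunk_offsets(n_values, chunk_size):
    """Offsets that the outer sweep would step through (one per cluster job)."""
    return list(range(0, n_values, chunk_size))


def inner_values(all_values, offset, chunk_size):
    """Parameter values handled by the inner sweep for a given offset."""
    return all_values[offset:offset + chunk_size]


# Hypothetical full parameter list with 10 values.
all_values = list(range(100, 1100, 100))
chunk_size = 4

# Three cluster jobs instead of ten; the last job gets the remainder.
offsets = chunk_offsets(len(all_values), chunk_size)
jobs = [inner_values(all_values, off, chunk_size) for off in offsets]
for off, values in zip(offsets, jobs):
    print(f"offset {off}: {values}")
```

Choosing a larger chunk size lowers the number of cluster jobs and hence the startup overhead, at the cost of less parallelism across jobs.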

The Cluster Sweep is comparable to a Batch Sweep with additional settings for distributed computations. For further information, see the blog posts How to Use the Cluster Sweep Node in COMSOL Multiphysics and The Power of the Batch Sweep.