
Results of parallel computing in HYDRO_AS-2D software
Report
The objective of these tests was to determine the viability of the HYDRO_AS-2D software's new
parallel computing capabilities for use in production environments. Several test cases were created
with different input parameters, and simulation durations were recorded to measure the speed-up
obtained by running the simulation on different numbers of CPU threads and on the GPGPU.
A machine running Windows 8 was used in conjunction with the SMS 11.2 (Surface-water Modeling
System) software for viewing the results and setting the simulation parameters. On the hardware
side, the machine is equipped with an i7 CPU and xx GB of RAM, as well as an NVIDIA Tesla K40 for
general-purpose calculation on the GPU.
All the simulations were run using the provided .bat files, in which the environment variables were
edited to set the number of CPU cores used and the index of the GPU used.
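As an illustration, a minimal sketch of such a launch script is shown below. The exact variable
names read by HYDRO_AS-2D are not stated in this report; OMP_NUM_THREADS and CUDA_VISIBLE_DEVICES
are the standard OpenMP and CUDA variables and stand in here as assumptions, and the executable
and project file names are likewise illustrative.

    @echo off
    rem Minimal launch sketch. The variable names are assumptions:
    rem OMP_NUM_THREADS and CUDA_VISIBLE_DEVICES are the standard OpenMP and
    rem CUDA variables; the ones read by the provided .bat files may differ.

    rem Number of CPU threads for the solver.
    set OMP_NUM_THREADS=8

    rem GPU index: 0 selects the first device (here the Tesla K40).
    set CUDA_VISIBLE_DEVICES=0

    rem Run the solver on the prepared case (paths are illustrative).
    hydro_as-2d.exe case1\project.2dm

    rem To record durations for several thread counts, loop over the values:
    for %%T in (2 4 8 12) do (
        set OMP_NUM_THREADS=%%T
        hydro_as-2d.exe case1\project.2dm
    )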

Case 1
A prepared mesh of a part of the river Rječina up to its mouth was used, consisting of approximately
50,000 elements. A mass-flow inlet was set up on a nodestring at the edge of the mesh and set to be
almost constant throughout the simulation. The terrain slope was set up as the outlet on a nodestring
far enough from the mouth of the river to produce better results at the mouth itself.
HYDRO_AS-2D | Simulation time | Inlet Q  | Writing interval (SMS)
1-step      | 4000 s          | 420 m³/s | 50 s

When running the simulation purely on the CPU, counterintuitive results were obtained: the
simulation time for a run with more designated CPU threads was higher than for a run that
used fewer threads.

To confirm the results obtained when running on the GPU, another two simulations were
performed.

The conclusion is that setting the environment variable controlling the number of threads
adversely affects the calculation time on the GPU.
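Given this result, a GPU run would presumably be launched with the thread count kept at its
minimum; a sketch using the same assumed variable names as above:

    rem GPU run: select the K40 but keep the CPU thread count minimal,
    rem since higher set thread counts were observed to slow GPU runs.
    set OMP_NUM_THREADS=1
    set CUDA_VISIBLE_DEVICES=0
    hydro_as-2d.exe case1\project.2dm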

To test the effect of the frequency of writing data for SMS visualization, the writing interval was
significantly increased and the simulations were repeated.
HYDRO_AS-2D | Simulation time | Inlet Q  | Writing interval (SMS)
1-step      | 4000 s          | 420 m³/s | 4000 s
The same effect of the set number of threads on the GPU calculation was observed as in the
example with the lower writing interval. A similar pattern was also noticed concerning the
increase of the calculation time when running purely on the CPU as with the lower writing interval.

[Figure: Simulation time vs. number of CPU threads; 1-step, simulation time = 4000 s, writing
intervals = 4000 s (SMS), 4000 (Q-strg). The i7 (CPU-only) runs stay at about 3810 s regardless
of thread count, while the K40 (GPU) runs range from about 1351 s to 1822 s.]

Case 2
To observe the effect of mesh size on the calculation times, an auxiliary simple rectangular mesh
was created, containing 1,058,816 nodes.
HYDRO_AS-2D | Simulation time | Inlet Q  | Writing interval (SMS)
1-step      | 400 s           | 400 m³/s | 400 s

In this case, a higher number of set threads during the GPU calculation actually slightly reduced
the simulation time, while there was almost no change in calculation time when using just the CPU
for the simulation.

Accelerating ANSYS Fluent 15.0 using NVIDIA Tesla GPUs


Speed-up of computational fluid dynamics simulations can be accomplished by using NVIDIA's
general-purpose graphics processing units (GPGPUs) alongside CPUs.
When running ANSYS Fluent 15.0 interactively, the Parallel Settings tab in the Fluent Launcher
panel, shown in the picture below, allows you to specify settings for running ANSYS Fluent in
parallel. This tab is only available if you have selected Parallel under Processing Options. In this
panel, you can specify the number of CPU processes using the Processes field and the number
of GPUs using the GPGPUs per Machine field. It is assumed that the number of GPUs on all
machines/nodes is the same.
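The same parallel settings can also be given on the command line when Fluent is run in batch
mode. A minimal sketch follows; -t and -gpgpu= are the command-line counterparts of the
Processes and GPGPUs per Machine fields, while the solver mode and journal file name are
illustrative.

    rem Batch-mode counterpart of the Fluent Launcher settings above (sketch).
    rem 3ddp selects the 3-D double-precision solver, -t12 requests 12 CPU
    rem processes, -gpgpu=4 requests 4 GPGPUs per machine, -g disables the
    rem GUI, and -i names the journal file to run (name is illustrative).
    fluent 3ddp -t12 -gpgpu=4 -g -i run_case.jou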

The speed-up factor is defined as the total wall-clock time of a CPU-only run divided by the total
wall-clock time of the GPU+CPU run. The solver is based on the iterative Algebraic Multi-Grid
(AMG) method. In flow problems, the pressure-based coupled solver typically spends 60-70% of its
time in AMG, whereas the segregated solver spends only 30-40% of its time there. Higher AMG
fractions are ideal for GPU acceleration, so coupled problems benefit most from GPUs.
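Written as a formula (the numbers in the example are purely illustrative):

    \text{speed-up factor} = \frac{T_{\text{CPU}}}{T_{\text{GPU+CPU}}},
    \qquad \text{e.g.}\quad \frac{100\ \text{s}}{40\ \text{s}} = 2.5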

The Fluent speed-up from GPU acceleration can be seen in the picture below.

The speed-up in simulation time for the GPU+CPU combination in a water jacket analysis can be
seen in the picture below.

Raising the complexity of a model yields a larger speed-up factor: more demanding models with a
large number of elements and a complex mesh show better acceleration of the simulation. As an
example, in an aerodynamics simulation of a truck body structure, an Intel Xeon E5-2667 with
12 cores per node was used as the CPU alongside 4 NVIDIA Tesla K40 accelerators per node. The
picture below compares the speed-up of two simulation processes, one with 14 million cells and
the other with 111 million cells.
