Viscoacoustic modeling on a layered model

In the first synthetic example, we perform viscoacoustic modeling on a multi-scale layered model, using a single GeForce GTX760 GPU and a single core of an Intel Core i5-4460 CPU for speedup comparison. As shown in Figure 3, the layered models range in size from $128\times 128$ to $2048\times 2048$ grids. At each model scale, we record the mean runtime per time step of viscoacoustic modeling on the single CPU core and on the single GPU, together with the corresponding speedup ratio (CPU runtime divided by GPU runtime); the results are presented in Table 2. The CPU-based simulation is compiled with the GNU C++ compiler (g++ 4.8.4) and uses FFTW 3.3.2. The GPU-based simulation is written in CUDA C and uses the CUFFT library API. Figure 4 plots the mean runtime per time step and the corresponding speedup ratio against model scale. It indicates that the presented cu$Q$-RTM package running on a single GPU card is roughly 50-80 times faster than the conventional CPU implementation on a single CPU core. Furthermore, the speedup ratio grows with model scale, from about 53 at $128\times 128$ to about 83 at $2048\times 2048$.
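The per-step timing described above can be sketched as follows. This is a minimal illustration in Python, not the package's own timing code: the function name `mean_runtime_per_step_ms` and the callable `step` are hypothetical, and the source does not say where the timer was placed or whether warm-up steps were excluded.

```python
import time


def mean_runtime_per_step_ms(step, n_steps, n_warmup=0):
    """Return the mean wall-clock time of one call to `step`, in milliseconds.

    `step` is a hypothetical callable that advances the wavefield by one
    time step. `n_warmup` calls are made first and excluded from the timing
    (an assumption; the source does not mention warm-up).
    """
    for _ in range(n_warmup):
        step()

    t0 = time.perf_counter()
    for _ in range(n_steps):
        step()
    elapsed_s = time.perf_counter() - t0

    # Total elapsed time divided by the number of timed steps.
    return 1e3 * elapsed_s / n_steps
```

For a GPU solver, `step` would also have to wait for the device to finish its work before returning, because CUDA kernel launches are asynchronous with respect to the host. Otherwise the measurement mostly reflects launch overhead.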

Figure 3. Velocity models for the multi-scale layered model.


Table 2: Mean runtime per time step of viscoacoustic modeling on a single GTX760 GPU and on a single core of the four-core Intel Core i5-4460 CPU, with the corresponding speedup ratio, at each model scale.

| Model Scale (grids) | $128\times 128$ | $256\times 256$ | $512\times 512$ | $1024\times 1024$ | $2048\times 2048$ |
|---|---|---|---|---|---|
| CPU Runtime (ms) | 9.7170 | 43.5925 | 101.3938 | 359.0682 | 1855.8382 |
| GPU Runtime (ms) | 0.1839 | 0.8195 | 1.8262 | 6.3267 | 22.3263 |
| Speedup Ratio | 52.8385 | 53.1940 | 55.5217 | 56.7544 | 83.1234 |
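The speedup ratios in Table 2 can be checked directly from the tabulated runtimes. The short Python sketch below does only that arithmetic; the variable names are illustrative, and the inputs are the rounded values printed in the table, so the recomputed ratios may differ from the tabulated ones in the last digits.

```python
# Values copied from Table 2 (mean runtime per time step, in ms).
grid_sizes = [128, 256, 512, 1024, 2048]
cpu_ms = [9.7170, 43.5925, 101.3938, 359.0682, 1855.8382]
gpu_ms = [0.1839, 0.8195, 1.8262, 6.3267, 22.3263]

# Speedup ratio = CPU runtime / GPU runtime at the same model scale.
speedup = [c / g for c, g in zip(cpu_ms, gpu_ms)]

for n, c, g, s in zip(grid_sizes, cpu_ms, gpu_ms, speedup):
    print(f"{n:>4} x {n:<4}  CPU {c:10.4f} ms  GPU {g:8.4f} ms  speedup {s:6.2f}")
```

The ratio stays in the mid-50s up to $1024\times 1024$ and rises to about 83 at $2048\times 2048$, consistent with the trend reported in the text.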

Figure 4. Mean runtime per time step of viscoacoustic modeling on a single GTX760 GPU and on a single core of the four-core Intel Core i5-4460 CPU, with the corresponding speedup ratio, plotted against model scale.


2020-04-03