
3D examples

We test the 3D multi-GPU implementation by computing impulse responses for 3D elastic media with different symmetry classes: isotropic, VTI, HTI and orthorhombic. Each example again uses the isotropic parameter set of $ v_p$ =2.0 km/s, $ v_s=v_p/\sqrt{3}$ , and $ \rho=2000$ kg/m$ ^3$ , but incorporates different Thomsen anisotropy parameters. We define our VTI medium by $ [\epsilon_1,\delta_1,\gamma_1] = [0.2, -0.1, 0.2]$ , our HTI model by $ [\epsilon_2,\delta_2,\gamma_2]=[0.2,-0.1,0.2]$ , and our orthorhombic medium by $ [\epsilon_1,\epsilon_2,\delta_1,\delta_2,\delta_3,\gamma_1,\gamma_2]=[0.2, 0.25,-0.1,-0.05,-0.075, 0.2, 0.5]$ . These parameters are transformed into stiffness tensor values using the appropriate transformation rules (Thomsen, 1986). Figures 9(a)-(d) present color-coded $ 6\times 6$ Voigt-notation stiffness matrices $ C_{ij}$ , derived from the tensor $ c_{ijkl}$ , for the isotropic, VTI, HTI and orthorhombic models used in the 3D impulse response tests, respectively.
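For the VTI case, the parameter-to-stiffness transformation can be sketched as follows. This is a minimal Python illustration using the exact (not weak-anisotropy) Thomsen relations; the function name and dictionary layout are ours, not taken from the ewefd3d code:

```python
import math

def thomsen_to_cij_vti(vp, vs, rho, eps, delta, gamma):
    """Convert Thomsen (1986) parameters to the nonzero VTI stiffness
    coefficients (Voigt notation, Pa). vp, vs in m/s; rho in kg/m^3."""
    c33 = rho * vp ** 2
    c44 = rho * vs ** 2
    c11 = c33 * (1.0 + 2.0 * eps)
    c66 = c44 * (1.0 + 2.0 * gamma)
    # Exact inversion of Thomsen's delta definition for c13:
    c13 = math.sqrt((c33 - c44) ** 2 + 2.0 * delta * c33 * (c33 - c44)) - c44
    c12 = c11 - 2.0 * c66  # isotropy relation in the horizontal plane
    return {"c11": c11, "c12": c12, "c13": c13,
            "c33": c33, "c44": c44, "c66": c66}

# Parameters from the text: vp = 2.0 km/s, vs = vp/sqrt(3), rho = 2000 kg/m^3,
# and the VTI triplet [eps, delta, gamma] = [0.2, -0.1, 0.2].
cij = thomsen_to_cij_vti(2000.0, 2000.0 / math.sqrt(3.0), 2000.0,
                         0.2, -0.1, 0.2)
```

Setting all three Thomsen parameters to zero recovers the isotropic relations $ c_{11}=c_{33}$ and $ c_{13}=c_{33}-2c_{44}$ , a useful sanity check on the transform.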

Figure 9.
Elastic stiffness moduli in $ 6\times 6$ Voigt notation for four elastic models with different symmetry. (a) Isotropic. (b) VTI. (c) HTI. (d) Orthorhombic.

We model seismic data on a $ N_x \times N_y \times N_z=204^3$ mesh with uniform $ \Delta x=\Delta y=\Delta z=0.005$  km spacing, using a 35 Hz Ricker wavelet stress source injected into each wavefield component. Figures 10(a)-(d) present the 3D impulse responses for the vertical component ($ u_z$ ) in isotropic, VTI, HTI and orthorhombic media, respectively. Again, the GPU-modeled wavefields for each medium match the results from the corresponding CPU code (not shown), as expected.
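A Ricker wavelet of the kind used as the source here is straightforward to generate; the sketch below uses the standard definition $ w(t)=(1-2\pi^2 f_0^2 t^2)e^{-\pi^2 f_0^2 t^2}$ with the 35 Hz peak frequency from the text. The time step and trace length are illustrative choices only, not the values used in the modeling runs:

```python
import numpy as np

def ricker(f0, dt, nt, t0=None):
    """Ricker wavelet of peak frequency f0 (Hz), sampled at dt (s) for nt
    samples, delayed by t0 (defaults to 1/f0 so the onset is causal)."""
    if t0 is None:
        t0 = 1.0 / f0
    t = np.arange(nt) * dt - t0
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

# 35 Hz source as in the text; dt and nt are illustrative.
w = ricker(f0=35.0, dt=0.001, nt=201)
```

The wavelet peaks at unit amplitude at $ t=t_0$ and is zero-mean, which avoids injecting a static (zero-frequency) component into the stress field.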

Figure 10.
3D impulse responses ($ u_z$ shown) for four elastic models with different symmetry. (a) Isotropic. (b) VTI. (c) HTI. (d) Orthorhombic.

Figure 11 presents performance metrics for a number of different cubic ($ N^3$ ) model dimensions. Figure 11(a) shows the runtimes of four different ewefd3d implementations: eight-core CPU (green line), single GPU (blue line), two GPUs with MPI communication (red line), and two GPUs with P2P communication (magenta line), the latter two within a single consolidated node. Each reported runtime is the mean of ten repeated trials. The speedup metric shown in Figure 11(b) documents up to a 16$ \times$ improvement over the CPU benchmark when using a single GPU device (blue line), and up to a 28$ \times$ improvement when using two GPU devices with P2P communication (magenta line). Generally, we observe increasing speedups when moving to larger model domains; future multi-GPU tests will determine where this trend levels off. Figure 11(c) presents the P2P-versus-MPI speedup benchmark. We note that the MPI-based communication has a 10-15% overhead cost, which is expected given the time required to repeatedly write to a pinned memory location during the numerous MPI send/receive transfers at each time step; this effect, though, diminishes with increasing model size. Note that these results are benchmarks for a single consolidated node; in a true distributed compute environment where GPUs are located in networked nodes, network bandwidth and latency will have a significant impact on total compute time.
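The data moved at each time step is a ghost-layer (halo) swap across the face shared by the two subdomains; the P2P path copies it directly between device memories, while the MPI path stages it through pinned host buffers. The exchange pattern itself can be sketched in NumPy. This illustrates the ghost-layer swap only; the one-cell halo width, the split along $ y$ , and the array names are our assumptions, not the ewefd3d layout:

```python
import numpy as np

# A small wavefield cube, 1D-decomposed along y across two subdomains,
# each padded with a one-cell ghost (halo) plane at the shared face.
nx, ny, nz = 8, 8, 8
full = np.arange(nx * ny * nz, dtype=float).reshape(nx, ny, nz)

lo = np.zeros((nx, ny // 2 + 1, nz))  # owns y = 0..ny/2-1, ghost at index -1
hi = np.zeros((nx, ny // 2 + 1, nz))  # owns y = ny/2..ny-1, ghost at index 0
lo[:, :-1, :] = full[:, :ny // 2, :]
hi[:, 1:, :] = full[:, ny // 2:, :]

def exchange_halos(lo, hi):
    """One halo swap per time step: each side copies its boundary plane
    into the neighbor's ghost plane (done via MPI or P2P on the GPUs)."""
    lo[:, -1, :] = hi[:, 1, :]   # hi's first owned plane -> lo's ghost
    hi[:, 0, :] = lo[:, -2, :]   # lo's last owned plane  -> hi's ghost

exchange_halos(lo, hi)
```

After the swap, each subdomain's stencil can be applied to its owned cells using only local data, which is what allows the two devices to advance the time step independently between exchanges.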

Figure 11.
Performance metrics showing the mean of ten trials for various cubic ($ N^3$ ) model domains using the ewefd3d code. (a) Computational run time for the CPU (green line), a single GPU (blue line), two GPUs with MPI communication (red line), and two GPUs with P2P communication (magenta line). (b) Speedup relative to the CPU for a single GPU (red line), and two GPUs with MPI (blue line) and P2P communication (magenta line). (c) Relative speed of P2P versus MPI communication.

Our last test illustrates the utility of the 3D FDTD code for modeling wavefields through 3D heterogeneous anisotropic elastic media. Figures 12(a)-(b) present the $ C_{44}$ coefficient of a layered earth model with a single dipping interface for the isotropic and HTI media, respectively. We superpose a snapshot of the propagating wavefield on each panel to demonstrate the complexity introduced by the HTI medium relative to the isotropic one. The evident differences between the two wavefields indicate the importance of modeling realistic 3D anisotropic behavior, especially for velocity and anisotropic parameter estimation applications.
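A model of this kind is easy to assemble directly on the grid. The sketch below builds a $ C_{44}$ cube with two layers separated by a planar interface dipping along $ x$ ; the grid size, dip, interface depth, and the two stiffness values are illustrative choices of ours, not the ones used for Figure 12:

```python
import numpy as np

def dipping_c44(n=204, dip=0.25, z0=0.5, c44_top=2.0e9, c44_bot=4.0e9):
    """Fill an n^3 C44 cube (Pa) with two layers separated by a planar
    interface dipping along x: depth z = z0*n + dip*x (in grid cells)."""
    x = np.arange(n)
    z_int = (z0 * n + dip * x).astype(int)      # interface depth per x column
    zz = np.arange(n)[None, None, :]            # z index, broadcast over (x, y)
    below = np.broadcast_to(zz >= z_int[:, None, None], (n, n, n))
    return np.where(below, c44_bot, c44_top)

model = dipping_c44(n=64)   # smaller than the paper's 204^3 grid, for illustration
```

Because the interface depth depends only on $ x$ , the mask is computed once per column and broadcast across $ y$ , which keeps the construction vectorized.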

Figure 12.
3D impulse responses calculated through a layered earth model with a single dipping interface. (a) Isotropic. (b) HTI.



2013-12-07