The second synthetic example is CUDA-based Q-RTM for the Marmousi model. Figures 5a and 5b show its velocity and Q models, which contain a high-attenuation zone with a low Q value. The model has 234 nodes with a sampling interval of m in depth and 663 nodes with a sampling interval of m in the horizontal direction. In the observation system, 60 sources are distributed laterally with a shot interval m, and each shot has 301 double-sided receivers with a maximum offset of 1500 m. The point source is a Ricker wavelet with a dominant frequency Hz. The synthetic seismic data are modeled by the PSM with time interval s, and the records last 2 s.
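As a rough illustration of the acquisition setup described above, the following Python sketch builds the double-sided receiver offsets for one shot and evaluates a Ricker wavelet. It is not taken from the code package: the function names are hypothetical, and `F0_HZ` is a placeholder value chosen only for illustration, not the dominant frequency used in this experiment. The 10 m receiver spacing is not stated in the text; it follows from 301 receivers spanning offsets of -1500 m to 1500 m, assuming uniform spacing.

```python
import math

F0_HZ = 20.0  # placeholder dominant frequency (assumption, not the paper's value)


def receiver_offsets(n_receivers=301, max_offset_m=1500.0):
    """Uniformly spaced double-sided offsets (m) relative to the shot position."""
    step = 2.0 * max_offset_m / (n_receivers - 1)
    return [-max_offset_m + i * step for i in range(n_receivers)]


def ricker(t, f0, t0=0.0):
    """Ricker wavelet with dominant frequency f0 (Hz), centred at time t0 (s)."""
    a = (math.pi * f0 * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)


offsets = receiver_offsets()
spacing_m = offsets[1] - offsets[0]  # implied receiver spacing
peak = ricker(0.0, F0_HZ)            # wavelet maximum at t = t0
print(f"{len(offsets)} receivers, spacing {spacing_m} m, peak amplitude {peak}")
```

In practice the wavelet is usually delayed (a nonzero `t0`) so that it starts near zero amplitude at the first time sample.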
Figure 6 shows the migrated images obtained using conventional RTM from acoustic data (Figure 6a), conventional RTM from viscoacoustic data without compensation (Figure 6b), and Q-RTM from viscoacoustic data (Figures 6c and 6d). The acoustic imaging result in Figure 6a serves as the reference for comparison. Because of the high-attenuation zone, the structure beneath it, marked by the blue frame in Figure 6b, is imaged with attenuated amplitudes and blurred reflectors. The attenuation also severely degrades the image of the anticlinal structure below the unconformity, marked by the green frame in Figure 6b. Figures 6c and 6d show the compensated images from Q-RTM using conventional low-pass filtering and the proposed adaptive stabilization scheme, respectively. Compared with the non-compensated image, the compensated images exhibit a clear anticlinal structure and recovered amplitudes. As a further comparison, Figure 7 shows migrated seismic traces selected arbitrarily at three distances of 1500 m, 3600 m and 5200 m from the imaging results shown in Figure 6. From these traces, one finds that the compensated traces match the reference traces well. This indicates that the developed cuQ-RTM package is capable of improving imaging quality.
A strong scaling plot shows how the execution time of a fixed-size problem decreases as the number of computing resources increases. In large-scale imaging, most of the computational time is spent simulating wave propagation, so the solver must be efficient and scale well. To test this, the 60 shots of Q-RTM are evenly distributed among the GPU cards, with the number of GPUs (Tesla K10) varying from one to six. We record the scheduling (manipulational) runtime and the computational runtime during every test and present them in Table 3. Figure 8 shows the results of the strong scaling test of cuQ-RTM on the Marmousi model. With a balanced load on each GPU, the efficiency is very close to ideal: the total runtime in Table 3 drops from 2646.91 s on one GPU to 461.89 s on six GPUs, a speedup of about 5.7. Thus, the code package exhibits excellent scalability and runs with almost ideal performance, in part because communications are almost entirely overlapped with calculations.
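The even distribution of shots among GPUs can be sketched as follows. The text states only that the 60 shots are evenly distributed; the round-robin rule and the function name `assign_shots` are assumptions made for illustration and may differ from the scheduling actually used in the package.

```python
def assign_shots(n_shots, n_gpus):
    """Round-robin plan: shot i is handled by GPU i % n_gpus (illustrative rule)."""
    plan = [[] for _ in range(n_gpus)]
    for shot in range(n_shots):
        plan[shot % n_gpus].append(shot)
    return plan


N_SHOTS = 60  # number of shots in the Marmousi experiment

for n_gpus in range(1, 7):
    shots_per_gpu = [len(shots) for shots in assign_shots(N_SHOTS, n_gpus)]
    print(n_gpus, shots_per_gpu)
```

Because 60 is divisible by every GPU count from one to six, each GPU receives exactly the same number of shots in all six tests, which is what makes the load balanced in this experiment.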
Figure 5. (a) Velocity and (b) Q models of the Marmousi model.
Figure 6. Migrated images of the Marmousi model using (a) conventional RTM from acoustic data, (b) conventional RTM from viscoacoustic data without compensation, (c) Q-RTM using low-pass filtering and (d) Q-RTM using the adaptive stabilization scheme.
Figure 7. Migrated seismic traces selected at three distances of (a) 1500 m, (b) 3600 m and (c) 5200 m from migration results shown in Figure 6.
| Number of GPUs | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Manipulational Runtime (s) | 7.62 | 10.07 | 10.52 | 11.02 | 11.41 | 11.92 |
| Computational Runtime (s) | 2639.29 | 1329.40 | 889.82 | 672.78 | 539.21 | 449.97 |
| Total Runtime (s) | 2646.91 | 1339.50 | 900.34 | 683.80 | 550.62 | 461.89 |
Figure 8. Strong scaling for cuQ-RTM on the Marmousi model using multiple Tesla K10 GPUs. Speedup ratios are plotted against the number of GPUs. The model has 234 nodes in depth and 663 nodes in the horizontal direction.
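The speedup ratios can be recomputed from the runtimes reported in Table 3. The sketch below assumes the usual definitions, speedup S(n) = T(1)/T(n) and parallel efficiency E(n) = S(n)/n, applied to the total runtime; the text does not state whether the plotted ratios use the total or the computational runtime, and the function name `strong_scaling` is hypothetical.

```python
gpus = [1, 2, 3, 4, 5, 6]
# Total runtime in seconds, copied from Table 3.
total_runtime_s = [2646.91, 1339.50, 900.34, 683.80, 550.62, 461.89]


def strong_scaling(n_units, runtimes):
    """Return (n, speedup, efficiency) tuples relative to the first entry."""
    t1 = runtimes[0]
    rows = []
    for n, t in zip(n_units, runtimes):
        speedup = t1 / t            # S(n) = T(1) / T(n)
        efficiency = speedup / n    # E(n) = S(n) / n, where 1.0 is ideal
        rows.append((n, speedup, efficiency))
    return rows


rows = strong_scaling(gpus, total_runtime_s)
for n, speedup, efficiency in rows:
    print(f"{n} GPU(s): speedup {speedup:.2f}, efficiency {efficiency:.1%}")
```

With these numbers the parallel efficiency on six GPUs stays above 95%. The loss relative to ideal scaling is partly explained by the manipulational runtime, which grows slowly with the number of GPUs rather than shrinking.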