FWI/JMI with directional TV has been demonstrated to be more effective than FWI/JMI without regularization or with conventional TV. We design the directional TV from the dip field calculated from an initial image. By taking the spatial gradient along the local structural directions and weighting its components according to the local dip, the proposed method achieves better results than either alternative. For complex subsurface structures, the local dip map cannot always be estimated accurately. However, directional TV regularization is not very sensitive to the accuracy of the estimated dip: in a complex area, even an arbitrary dominant direction is no worse than the fixed horizontal and vertical gradients used by conventional TV.
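
As a rough illustration of the construction described above, the following Python sketch evaluates a directional TV of a 2-D model by rotating the finite-difference gradient into a local dominant direction and its perpendicular, then weighting the two components. It is not the implementation used in this paper: the function names, the angle convention for `theta` (radians, measured from the horizontal axis), the anisotropic L1 form, and the assignment of $\alpha_1$ to the dominant direction are all assumptions made for illustration.

```python
import numpy as np


def directional_tv(m, theta, alpha1=3.0, alpha2=1.0, dx=1.0, dz=1.0):
    """Anisotropic directional TV of a 2-D model m with shape (nz, nx).

    theta is the assumed dominant direction in radians, either a scalar
    or an array with the same shape as m (e.g. an estimated dip field).
    alpha1 weights the derivative along that direction, alpha2 the
    derivative along the perpendicular direction (assumed convention).
    """
    # np.gradient returns one derivative per axis: axis 0 is z, axis 1 is x.
    gz, gx = np.gradient(m, dz, dx)
    # Rotate the gradient into the local (dominant, perpendicular) frame.
    d_dom = np.cos(theta) * gx + np.sin(theta) * gz
    d_perp = -np.sin(theta) * gx + np.cos(theta) * gz
    return float(np.sum(alpha1 * np.abs(d_dom) + alpha2 * np.abs(d_perp)))


def conventional_tv(m, dx=1.0, dz=1.0):
    """Conventional anisotropic TV from horizontal and vertical gradients."""
    gz, gx = np.gradient(m, dz, dx)
    return float(np.sum(np.abs(gx) + np.abs(gz)))
```

With `theta = 0` and equal weights, `directional_tv` reduces to `conventional_tv`, so conventional TV appears here as the special case in which the assumed dominant direction is horizontal everywhere.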

In terms of parameter selection, we use a relaxation strategy for $\mu$, which increases exponentially. In this way, we gradually relax the strength of the L1 constraint so that the inversion converges. $\lambda$ is a constant that depends on the scale of the data. We set $\lambda$ such that around $60\%$ to $70\%$ of the energy passes through the shrinkage step in Algorithm 1, which improves the stability of the algorithm. The weights on the dominant gradient direction and on its perpendicular direction depend on the accuracy of the estimated dip field and on how strongly the subsurface structures favor one direction. Usually, $\alpha_1 : \alpha_2 = 2 : 1$ is a safe choice. In this paper, we use $\alpha_1 : \alpha_2 = 3 : 1$ for both examples, which puts more weight on the dominant spatial direction of the velocity gradient, because the structures of the Marmousi model are strongly tilted.
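
The two tuning rules above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the shrinkage step is taken to be element-wise soft thresholding, "energy" is taken to be the sum of squares, and the helper names (`mu_schedule`, `threshold_for_fraction`, and so on) are hypothetical. How the threshold maps to $\lambda$ and $\mu$ depends on Algorithm 1 and is deliberately left open.

```python
import numpy as np


def mu_schedule(mu0, growth, n_iter):
    """Exponentially increasing mu: mu_k = mu0 * growth**k (growth > 1)."""
    return mu0 * growth ** np.arange(n_iter)


def soft_threshold(x, t):
    """Element-wise soft thresholding, the assumed form of the shrinkage."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def passed_energy_fraction(x, t):
    """Fraction of the sum-of-squares energy of x that survives shrinkage."""
    return float(np.sum(soft_threshold(x, t) ** 2) / np.sum(x ** 2))


def threshold_for_fraction(x, target=0.65, n_bisect=60):
    """Bisection for the threshold that passes about `target` of the energy.

    The passed fraction decreases monotonically from 1 (t = 0) to 0
    (t = max|x|), so bisection on t is sufficient. x must be nonzero.
    """
    lo, hi = 0.0, float(np.max(np.abs(x)))
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if passed_energy_fraction(x, mid) > target:
            lo = mid  # too much energy passes: raise the threshold
        else:
            hi = mid  # too little energy passes: lower the threshold
    return 0.5 * (lo + hi)
```

In such a sketch, `x` would be the input to the shrinkage step at a given iteration, and the resulting threshold would then be converted into a value of $\lambda$ according to the actual update rule.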

Regarding computational efficiency, JMI is more cost-effective than FWI. First, owing to its linearization, it does not require a good initial model. Second, it is implemented in the frequency domain without any finite-difference-based method, so the horizontal and vertical grid sizes do not have to satisfy a numerical dispersion condition but are instead determined by the spatial Nyquist criterion. For instance, in the JMI example, the frequency range is up to $40$ Hz and the chosen horizontal and vertical grid sizes are $20$ m and $10$ m, respectively.
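
To make the grid-size argument concrete, the short Python sketch below computes the Nyquist wavenumber of a given grid spacing and, for comparison, the spacing that a points-per-wavelength rule of thumb for finite-difference modeling would ask for. The minimum velocity of $1500$ m/s and the value of 5 grid points per wavelength are illustrative assumptions, not values taken from this paper, and the function names are hypothetical.

```python
def nyquist_wavenumber(spacing_m):
    """Largest spatial frequency (cycles per metre) a grid spacing can sample."""
    return 1.0 / (2.0 * spacing_m)


def fd_spacing(v_min, f_max, points_per_wavelength):
    """Grid spacing from a points-per-wavelength rule of thumb.

    The shortest wavelength is v_min / f_max; the rule asks for a fixed
    number of grid points across it to limit numerical dispersion.
    """
    return v_min / (f_max * points_per_wavelength)


# Grid of the JMI example: 20 m horizontally, 10 m vertically.
k_nyq_x = nyquist_wavenumber(20.0)  # 0.025 cycles/m
k_nyq_z = nyquist_wavenumber(10.0)  # 0.05 cycles/m

# Hypothetical finite-difference requirement at 40 Hz (assumed values).
dx_fd = fd_spacing(v_min=1500.0, f_max=40.0, points_per_wavelength=5)  # 7.5 m
```

Under these assumed values, a finite-difference scheme would call for a spacing of about $7.5$ m in both directions, finer than the $20$ m by $10$ m grid used in the JMI example.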