Journal of Technology: Summer 2022 additional articles
07/04/2022 | 03:43am EDT
The Aramco Journal of Technology
Gas Injection Optimization under Uncertainty in Subsurface Reservoirs: An Integrated Machine Learning Assisted Workflow
Xupeng He, Dr. Hussain Hoteit, Marwah M. AlSinan and Dr. Hyung T. Kwak
Gas injection in subsurface reservoirs is of significant interest to the petroleum industry for the enhanced
oil recovery (EOR) process. There exists geological uncertainty in the subsurface due to the limited
measurements. Optimization under such uncertainty is therefore required to make more robust operational
decisions that achieve maximum EOR with a minimum risk of early breakthrough.
This work introduces an integrated machine learning assisted workflow for the optimization under
uncertainty in subsurface reservoirs. The proposed workflow includes three steps: (1) Training sample
generation. We first identify the uncertain parameters that affect the objectives of interest. We then
generate the input designs using Latin Hypercube Sampling (LHS) based on the identified uncertain
parameters. High fidelity simulations based on the MATLAB Reservoir Simulation Toolbox (MRST)
are run for each of the input designs to obtain the objectives of interest as outputs. (2) Surrogate model
development. A data-driven surrogate model is then built to model the nonlinear mapping between the
inputs and outputs from Step 1. Herein, the Bayesian optimization technique is implemented to
obtain the surrogate model. (3) Optimization under uncertainty. We first conduct a blind test of the
proposed surrogate model against high fidelity simulations. We then use Monte Carlo simulation to perform
the uncertainty quantification and a genetic algorithm (GA) to conduct the optimization.
This work introduces an efficient, robust, and accurate machine learning assisted workflow for gas
injection optimization under uncertainty in subsurface reservoirs. To the best of our knowledge, this approach
is applied here for the first time.
Subsurface reservoirs exhibit geological uncertainty since we have a limited number of measurements. Such uncertainty can make the decision making process very challenging. Gas injection problems often feature more complex physics because of the significant contrast in compressibility and density compared to water injection cases. This work takes gas injection into oil reservoirs as an example, and introduces an efficient and robust workflow for optimization under uncertainty using machine learning techniques.
Recent advances in machine learning have inspired many applications in the petroleum industry. Examples include fracture recognition from outcrops, upscaling of discrete fracture models, fracture permeability estimation, multicomponent flash calculation, and carbon dioxide leakage rate forecasting1-6. These studies demonstrate that, given large, high quality data sets and optimal network hyperparameters, this technology is competitive with traditional approaches in terms of accuracy and efficiency.
The abovementioned applications correspond to four network architectures: (1) U-Net for image-to-image problems, (2) convolutional neural network for image-to-value problems, (3) artificial neural network (ANN) for value-to-value problems, and (4) long short-term memory (LSTM) for time series problems. An ANN model that takes time as an input can also handle time series problems while remaining simpler and more efficient than LSTM7, 8. In this work, the surrogate model developed by ANN replaces the expensive high fidelity simulation model.
The proposed workflow comprises three steps: (1) training sample generation, (2) surrogate model development, and (3) optimization under uncertainty.
We consider injecting gas into oil reservoirs as an example. The continuity equation and Darcy's law govern the corresponding physics. The continuity equation of phase α is expressed as:
$$\frac{\partial (\phi \rho_\alpha S_\alpha)}{\partial t} + \nabla \cdot (\rho_\alpha \mathbf{v}_\alpha) = Q_\alpha$$
where Q is the sink/source term, v is velocity, ρ is density, S is saturation, ϕ is porosity, and t is time.
Darcy's law models the velocity as:
$$\mathbf{v}_\alpha = -\frac{k_{r\alpha}}{\mu_\alpha} \mathbf{K} \left( \nabla p_\alpha - \rho_\alpha g \nabla z \right)$$
where kr is the relative permeability, µ is viscosity, K is the absolute permeability tensor, p is pressure, g is gravity acceleration, and z is the depth. We constrain phase saturations by using the following equation:
Sg + So = 1
and relate the two phase pressures through the capillary pressure function (denoted by Pc):
Pc(Sw) = Pg - Po
The time-dependent oil recovery factor (denoted by RFo), as one objective of interest, is given by:
$$RF_o(t) = \frac{\int_0^t Q_P B_o \, d\tau}{V_b \, \phi \, S_{oini}}$$
where QP is the production rate measured under standard conditions, Bo is the oil formation volume factor, Vb is the bulk reservoir volume, ϕ is porosity, and Soini is the initial oil saturation. The other objective of interest is the time at which gas breakthrough occurs (denoted by tbreak).
Fig. 1 The layered cake permeability distribution and well placement.
We implement the first Society of Petroleum Engineers benchmark as the high fidelity simulation model. It is a live oil/dry gas black oil model with nearly immobile water. The model is initially undersaturated with a uniform mixture of water (Swini = 0.12) and oil (Soini = 0.88) and no initial free gas (Sgini = 0). We assume a constant dissolved gas-oil ratio throughout the model. The model resembles a three-layer cake configuration, in which each layer is assumed to be homogeneous and isotropic. Future work will address significantly heterogeneous and anisotropic cases. The geological uncertainty is represented by changing the permeability value and corresponding porosity value for each layer. In this study, the high fidelity simulation is solved by a fully implicit black oil solver within the MATLAB Reservoir Simulation Toolbox (MRST) framework. A detailed numerical implementation can be found in Lie (2019)9.
We consider a base case in which permeability values of 2,000 mD, 200 mD, and 800 mD are assigned to the three layers, with a constant gas injection rate of 100 million standard cubic feet per day (MMscfd). The producer operates at a fixed oil production rate before breakthrough and switches to constant bottom-hole pressure control after breakthrough. Other parameters and their corresponding values are provided in Lie (2019)9 and Odeh (1981)10.
Figure 1 illustrates the layered cake permeability distribution and diagonally opposite well placement. Figure 2 shows the gas saturation over time for the base case.
Although the high fidelity simulation model provides the most accurate methodology for capturing the physics, it suffers from intensive computation costs. Applications such as optimization under uncertainty require many simulation runs, which makes the approach infeasible for practical engineering purposes, e.g., quick decision making. The proposed workflow integrates knowledge of sampling techniques, machine learning, uncertainty quantification, and multi-objective optimization.
Fig. 2 The gas saturation with increasing time. We observe the typical gravity dominated flow behavior, i.e., gas tending to migrate upwards, due to the significant contrast in density between the gas and oil.
The workflow is detailed in the following three steps (Fig. 3).
Training Sample Generation
We first identify the uncertain parameters that impact the objectives of interest, i.e., EOR and breakthrough time. With these identified uncertain parameters, various input designs are generated using Latin Hypercube Sampling (LHS). The implementation of LHS ensures that data samples are distributed in a space filling manner instead of a clustering manner2, 3, 11.
Fig. 3 The proposed workflow includes three steps: (1) training sample generation, (2) surrogate model development, and (3) optimization under uncertainty.
Table 1 summarizes the identified uncertain parameters, including geological and operational (in light blue) parameters. Their corresponding ranges are collected from the literature: the range of permeability is adopted and modified from the North Sea fields12. The correlation between porosity and permeability is modified from Chen and Pawar (2019)7; we reduce the exponential coefficient to account for a wider range of permeability values. The ranges of gas injection rates are collected from various projects in the North Sea and from example cases in MRST9, 13. We assume all uncertain parameters to be independent and uniformly distributed, except for porosity, whose values are calculated from the permeability values using the correlation in Table 1. (Note: K1, K2, and K3 correspond to permeability values of the first, second, and third layer, respectively.)
High fidelity simulations based on MRST9 are run for each input design to generate the corresponding objectives of interest as outputs. We then collect the inputs and outputs, which are used to train the surrogate model.
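The sampling step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper `latin_hypercube`, the seed, and in particular the numerical ranges in `lower` and `upper` are hypothetical placeholders, because the ranges of Table 1 are not reproduced here. Only the design size (50) and the porosity correlation come from the text.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng):
    """Return an (n_samples, n_dims) Latin hypercube design on [0, 1)."""
    # Each dimension is split into n_samples equal strata; every stratum
    # receives exactly one point, placed at a random offset inside it.
    jitter = rng.random((n_samples, n_dims))
    strata = np.column_stack(
        [rng.permutation(n_samples) for _ in range(n_dims)]
    )
    return (strata + jitter) / n_samples

# Hypothetical ranges (placeholders, NOT the values of Table 1).
# Columns: K1, K2, K3 in mD and Qinj in MMscfd.
lower = np.array([100.0, 100.0, 100.0, 50.0])
upper = np.array([3000.0, 3000.0, 3000.0, 150.0])

rng = np.random.default_rng(seed=0)
unit = latin_hypercube(50, 4, rng)          # 50 input designs
designs = lower + unit * (upper - lower)    # scale to physical ranges

# Porosity is not sampled independently; it follows from permeability
# through the correlation quoted in Table 1: phi = 0.082 * K**0.2.
porosity = 0.082 * designs[:, :3] ** 0.2
```

Each row of `designs`, together with the matching row of `porosity`, would then define one high fidelity simulation run.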
Surrogate Model Development
This step builds a data-driven, physics featuring surrogate model to map the nonlinear relation between the inputs and outputs obtained earlier. Figure 4 shows the implemented ANN architecture with one input layer, various hidden layers, and one output layer. The time term (in red) is added to the input layer to capture time-dependent problems. Three key elements are critical for obtaining a successful surrogate ANN model: the ratio of training to validation samples, a proper network architecture, and optimal weights and biases. Choosing appropriate hyperparameters related to these three elements is challenging.
The traditional approach of tuning hyperparameters by trial and error is tedious and labor intensive. As an alternative, Bayesian optimization is implemented to automate the tuning process in this work. A detailed description of Bayesian optimization can be found in Frazier (2018)14.
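To make the idea concrete, the loop below is a minimal sketch of Bayesian optimization with a Gaussian process and the expected improvement criterion, written with NumPy only. It is an illustration under assumptions rather than the tuning procedure used in this work: `validation_loss` is a hypothetical one-dimensional stand-in for "train the ANN with a given hyperparameter and return the validation error", and the kernel length scale, the initial points, and the iteration count are arbitrary choices.

```python
import math
import numpy as np

def validation_loss(x):
    """Hypothetical stand-in for training the ANN and returning its loss."""
    return (x - 0.3) ** 2 + 0.05 * np.sin(15.0 * x)

def rbf_kernel(a, b, length=0.15):
    """Squared exponential kernel with unit prior variance."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, jitter=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k_oo = rbf_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    k_on = rbf_kernel(x_obs, x_new)
    mean = k_on.T @ np.linalg.solve(k_oo, y_obs)
    var = 1.0 - np.sum(k_on * np.linalg.solve(k_oo, k_on), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Expected improvement over `best` for a minimization problem."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

candidates = np.linspace(0.0, 1.0, 201)   # normalized hyperparameter grid
x_obs = np.array([0.0, 0.5, 1.0])         # initial evaluations
y_obs = validation_loss(x_obs)

for _ in range(10):
    shift = y_obs.mean()                  # center data for the zero-mean prior
    mean, std = gp_posterior(x_obs, y_obs - shift, candidates)
    ei = expected_improvement(mean, std, (y_obs - shift).min())
    x_next = candidates[np.argmax(ei)]    # most promising candidate
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, validation_loss(x_next))

best_x = x_obs[np.argmin(y_obs)]
best_loss = y_obs.min()
```

In practice each call to the objective is a full training run, which is why a criterion that spends evaluations sparingly is attractive compared with trial and error.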
Table 1 The identified uncertain parameters and corresponding ranges and distributions.
Ø = 0.082 × K^0.2
Gas Injection Rate (Qinj)
Fig. 4 The implemented ANN architecture with one input layer, various hidden layers, and one output layer.
During the coupled training and validation process, attention should be paid to overfitting and vanishing gradients. Overfitting tends to occur when training runs for a considerable number of epochs. Vanishing gradients tend to occur when very deep neural networks are chosen.
Optimization under Uncertainty
In this step, we perform gas injection optimization under these geological uncertainties to achieve maximum EOR while maintaining the minimum risk of early breakthrough.
We further validate the developed surrogate model using various blind cases. The following parameters are used to evaluate its performance.
• APE: The average of prediction errors (PE) between the predicted (denoted by MP) and ground truth (denoted by MG) solutions. M refers to the objective of interest, i.e., recovery factor or gas breakthrough time.
• PPE: The percentage of PEs within an acceptable error margin, herein 10%.
• RMSE: The root-mean-square error of PEs.
In these definitions, N is the total number of points.
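The three metrics can be sketched as follows. Two points are assumptions made for this illustration and are not confirmed by the text: PE is taken as the relative absolute error of each point, and RMSE is computed on the raw residuals between prediction and ground truth. The function name `blind_test_metrics` and the sample values are hypothetical.

```python
import numpy as np

def blind_test_metrics(predicted, truth, margin=0.10):
    """Return (APE, PPE, RMSE) for paired surrogate and simulator outputs."""
    predicted = np.asarray(predicted, dtype=float)
    truth = np.asarray(truth, dtype=float)
    # Assumption: PE is the relative absolute error of each point.
    pe = np.abs(predicted - truth) / np.abs(truth)
    ape = pe.mean()                  # average prediction error
    ppe = np.mean(pe <= margin)      # fraction of points within the margin
    # Assumption: RMSE is taken over the raw residuals.
    rmse = np.sqrt(np.mean((predicted - truth) ** 2))
    return ape, ppe, rmse

# Hypothetical recovery factors from the surrogate and the simulator.
ape, ppe, rmse = blind_test_metrics(
    predicted=[0.42, 0.50, 0.32, 0.61],
    truth=[0.40, 0.50, 0.30, 0.70],
)
```

For these four points, three of the relative errors fall within the 10% margin, so `ppe` is 0.75.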
If the values of these three parameters are within certain ranges, the developed surrogate model passes the blind test and can be implemented as a fully trusted surrogate. Otherwise, we need to retrain the ANN model by increasing the number of samples or adjusting the ratio of training to validation samples. This process is repeated until the trained surrogate model passes the blind test.
We then perform Monte Carlo simulations based on the fully trusted surrogate model to explore uncertainty propagation behaviors. The corresponding responses are collected, and the probabilistic forecast percentiles P10, P50, and P90 are computed from them. We increase the number of runs until the P10, P50, and P90 values become stable. The resulting ranges provide a rough estimate of the objectives of interest under the ranges of uncertain parameters previously listed in Table 1.
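A minimal sketch of this Monte Carlo step is given below. Everything numerical in it is a placeholder: `toy_surrogate` is a hypothetical closed-form stand-in for the trained ANN, the uniform ranges are not those of Table 1, and the stability tolerance is arbitrary. The percentiles are taken in the cumulative sense (P10 is the value below which 10% of the responses fall), which is an assumption about the convention.

```python
import numpy as np

def toy_surrogate(k1, k2, k3, q_inj):
    """Hypothetical stand-in for the trained ANN; returns a recovery factor."""
    k_mean = (k1 + k2 + k3) / 3.0
    return 0.20 + 0.15 * np.tanh(q_inj / 100.0) + 0.02 * np.log10(k_mean)

rng = np.random.default_rng(seed=1)
previous = None

# Double the number of runs until P10, P50, and P90 stop changing.
for n_runs in (1_000 * 2 ** i for i in range(10)):
    k = rng.uniform(100.0, 3000.0, size=(n_runs, 3))   # placeholder range, mD
    q = rng.uniform(50.0, 150.0, size=n_runs)          # placeholder range, MMscfd
    rf = toy_surrogate(k[:, 0], k[:, 1], k[:, 2], q)
    current = np.percentile(rf, [10, 50, 90])
    if previous is not None and np.max(np.abs(current - previous)) < 1e-3:
        break
    previous = current

p10, p50, p90 = current
```

Because each surrogate evaluation is cheap, doubling the sample size until the percentiles settle costs little compared with a single high fidelity run.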
In this work, the genetic algorithm (GA) is applied to a multi-objective optimization problem, maximizing EOR while minimizing the risk of early breakthrough, to determine the optimized gas injection rate. We then use the Pareto front to seek a compromise between the oil recovery factor and gas breakthrough time.
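The role of the Pareto front can be illustrated with a small non-dominated filter. This sketch covers only the selection of compromise solutions, not the GA itself, and it assumes that both objectives are to be maximized (a higher recovery factor and a later breakthrough). The candidate values are hypothetical.

```python
import numpy as np

def pareto_front(objectives):
    """Indices of non-dominated rows when every column is maximized."""
    objectives = np.asarray(objectives, dtype=float)
    keep = []
    for i, point in enumerate(objectives):
        # A point is dominated if another point is at least as good in
        # every objective and strictly better in at least one.
        dominated = np.any(
            np.all(objectives >= point, axis=1)
            & np.any(objectives > point, axis=1)
        )
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical candidates: (recovery factor, breakthrough time in days).
candidates = [
    (0.30, 900.0),
    (0.35, 700.0),
    (0.33, 650.0),   # dominated by (0.35, 700.0)
    (0.40, 400.0),
    (0.28, 880.0),   # dominated by (0.30, 900.0)
]
front = pareto_front(candidates)
```

Here `front` contains the indices 0, 1, and 3: each of those candidates can only gain recovery by accepting an earlier breakthrough, which is the trade-off a decision maker has to weigh.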
The key to the proposed workflow is guaranteeing a fully trusted surrogate model. More rigorous measures are therefore required to assure the robustness of the proposed workflow.
Results and Discussion
Following the first two steps previously discussed, we generate 50 input designs based on LHS and run the high fidelity simulations using MRST. We select 20 time steps from each simulation, which gives a total of 1,000 data samples for the Bayesian optimized training validation process. Table 2 summarizes the optimal ratio of training to validation samples, the Bayesian optimized ANN architecture and hyperparameters, and model evaluation performance. For illustration purposes, we only show oil recovery as an example. Gas breakthrough time is handled in the same way and will not be detailed.
Table 2 The optimized ANN architecture, ANN hyperparameters, and model performance.
Four hidden layers with 4, 7, 8, and 6 neurons, respectively.
As observed in Table 2, the Bayesian optimized ANN architecture features four hidden layers with 4, 7, 8, and 6 neurons, respectively. The optimal ratio of training to validation samples is 7:3. The optimized model achieves accuracy exceeding 95% on both the training and validation samples, with PPE of 97% and 99%, respectively.
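To give a feel for the size of this network, the sketch below builds a forward pass with the reported hidden layer widths. Several details are assumptions for illustration only: the five inputs (K1, K2, K3, Qinj, and time), the tanh activation, and the random untrained weights are not taken from Table 2.

```python
import numpy as np

# Assumed inputs: K1, K2, K3, Qinj, t. Hidden widths 4, 7, 8, 6 as reported.
layer_sizes = [5, 4, 7, 8, 6, 1]

rng = np.random.default_rng(seed=0)
weights = [
    rng.normal(0.0, 0.5, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(x):
    """Forward pass: tanh on hidden layers (assumed), linear output."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(x @ w + b)
    return x @ weights[-1] + biases[-1]

n_parameters = sum(w.size + b.size for w, b in zip(weights, biases))
batch = rng.random((3, 5))        # three normalized input samples
prediction = forward(batch)       # one output value per sample
```

With these assumed inputs the network has 184 trainable weights and biases, which is small next to the 1,000 available data samples.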
Overfitting typically shows up as good predictions, i.e., small errors, on the training data but poor performance on the validation samples (Fig. 5). The model performance on the validation samples shows an optimum epoch: the model is underfitted before it and overfitted after it.
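One common way to locate such an optimum epoch is early stopping on the validation loss. The sketch below is an illustration of that idea, not a description of the training setup used here: the function name `best_epoch`, the `patience` value, and the loss history are hypothetical.

```python
def best_epoch(validation_losses, patience=3):
    """Return (epoch with the lowest validation loss, epoch where training stops)."""
    best_index, best_loss = 0, float("inf")
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            # Validation loss still improving: the model is underfitted so far.
            best_index, best_loss = epoch, loss
        elif epoch - best_index >= patience:
            # No improvement for `patience` epochs: overfitting has set in.
            return best_index, epoch
    return best_index, len(validation_losses) - 1

# Hypothetical validation MSE per epoch: falls, then rises again.
history = [0.90, 0.55, 0.32, 0.21, 0.18, 0.19, 0.22, 0.26, 0.31]
optimum, stopped = best_epoch(history)
```

For this history the optimum is epoch 4 and training would stop at epoch 7, after three epochs without improvement.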
Another issue that requires attention is the vanishing gradient: weights and biases are barely updated in a very deep network. According to the chain rule, the gradient term (highlighted in blue) in Eqn. 10 is the product of many derivative terms. In very deep neural networks this term is almost zero, so the weights are not updated. Therefore, a proper network architecture is also crucial for obtaining a successfully trained model.
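The effect of multiplying many derivative terms can be shown with a short numerical example. It assumes sigmoid activations, unit weights, and a single chain of scalar neurons, which is a simplification chosen for illustration and not the network used in this work. The sigmoid derivative never exceeds 0.25, so the product shrinks geometrically with depth.

```python
import math

def sigmoid_derivative(z):
    """Derivative of the logistic sigmoid; its maximum is 0.25 at z = 0."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def gradient_factor(n_layers, pre_activation=0.0, weight=1.0):
    """Product of the per-layer chain-rule terms weight * sigmoid'(z)."""
    return (weight * sigmoid_derivative(pre_activation)) ** n_layers

shallow = gradient_factor(4)    # 0.25 ** 4
deep = gradient_factor(30)      # 0.25 ** 30
```

Even in this best case for the sigmoid, the factor is about 0.0039 after four layers and below 1e-17 after thirty, so early layers in a very deep network receive practically no update.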
Figures 6a and 6b illustrate the diagonal plots between
Fig. 6 The diagonal plots showing the surrogate vs. ground truth predictions for (a) training, and (b) validation samples.
Fig. 5 The epoch vs. MSE of the model.
This is an excerpt of the original content.
Saudi Aramco - Saudi Arabian Oil Company published this content on 03 July 2022 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 04 July 2022 07:42:01 UTC.