Production Analysis Couples Multivariate Statistical Modeling and Pattern Recognition

Fig. 2—Map showing the three areas of this study.

Approximately 5,000 horizontal Eagle Ford wells have been completed in south Texas. Still, geologists and engineers question whether their companies are using the most appropriate operating practices. Multivariate statistical analysis of larger data sets offers sound interpretation across larger geographic areas, with the provision that correlations need to be scaled to local conditions. The purpose of this paper is to apply multivariate statistical modeling in conjunction with geographic-information-systems (GIS) pattern-recognition work to the Eagle Ford.

Introduction

With approximately 5,000 horizontal Eagle Ford wells now completed, large sets of public and proprietary data are available for production and completion-/stimulation-optimization studies. However, gathering the data, controlling their quality, learning what questions to ask of the data sets, and learning how to ask these questions in a statistically robust fashion all pose challenges. Data sets involving large well counts contain many variables that are not ideally distributed or that have missing or bad values.

This work uses data-mining methods, particularly GIS mapping and boosted-tree regression modeling, to overcome some of the challenges posed by the available data sets and to better understand the effect of key well, completion, and stimulation parameters on productivity and production efficiency.

In this work, the Eagle Ford formation of south Texas was divided into three major producing areas, each of which was studied with mapping techniques and modeled individually by use of boosted trees.

Formation, Study Area, and Goals

The Eagle Ford formation is a Late Cretaceous sedimentary-rock formation that underlies much of south Texas. The rocks are mainly organic-matter-rich fossiliferous marine shales of the Lower Eagle Ford interval. The play extends over an area of approximately 11 million acres, and the main body of the play stretches from the Texas/Mexico border to the eastern borders of Gonzales and Lavaca counties (Fig. 1). The northern part of the play (highlighted in green) is in the oil-maturation window; in addition to crude oil, it contains lesser amounts of natural gas and natural-gas liquids (NGLs). Situated to the south and southeast of the oil window, the wet-gas region (highlighted in yellow) produces gas along with high volumes of NGLs. The southernmost region (highlighted in red) contains mostly dry natural gas. Because oil and NGLs command a higher price than natural gas, producers have mostly focused on extracting the formation’s oil and NGL resources.

Fig. 1—Map showing general study area of Eagle Ford production, south Texas.

The goal of this study is to apply GIS and multivariate statistical data-mining methods to Eagle Ford data sets in order to focus on the effects of well location, well architecture, completion, and stimulation on production results in three geographical areas (Fig. 2 above).

Data Sources, Quality Control, and Methodology

The data sets forming the basis of this work are taken from both public and proprietary sources. Basic well header data, location, key well dates, producing formation, actual directional surveys, well-test and -treatment data, and monthly production-stream values come from a subscription to a commercial database. Well-treatment reports internal to the authors’ company were also used to sanity-check the public stimulation data, particularly fluid volumes and proppant quantities, for those wells treated by the authors’ company. After the initial data gathering, additional quality-control checks were performed as appropriate. Suspect data values were flagged to limit their influence on study results.

Well-architecture data included completed lateral length (CLAT), well azimuth, and well dip angle. CLAT was calculated as measured depth (MD) of the bottom perforation or sleeve minus MD of the top perforation or sleeve. Average azimuth was calculated from the actual directional survey over the completed lateral section of the well. Well dip angle was averaged from the same survey over the same completed section.
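The well-architecture calculations described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the survey layout (MD, inclination, azimuth), the function names, the use of a circular mean for azimuth, and the convention that dip equals inclination minus 90° are all assumptions made here. The averages are simple station averages with no weighting by station spacing.

```python
import math

def completed_lateral_length(top_md_ft, bottom_md_ft):
    """CLAT: MD of the bottom perforation/sleeve minus MD of the top one."""
    if bottom_md_ft < top_md_ft:
        raise ValueError("bottom MD must not be shallower than top MD")
    return bottom_md_ft - top_md_ft

def mean_azimuth_deg(azimuths_deg):
    """Circular mean, so that 340 and 10 average to 355 rather than 175."""
    s = sum(math.sin(math.radians(a)) for a in azimuths_deg)
    c = sum(math.cos(math.radians(a)) for a in azimuths_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

def survey_averages(survey, top_md_ft, bottom_md_ft):
    """survey: list of (md_ft, inclination_deg, azimuth_deg) stations.

    Only stations inside the completed interval are averaged.
    """
    stations = [s for s in survey if top_md_ft <= s[0] <= bottom_md_ft]
    if not stations:
        raise ValueError("no survey stations inside the completed interval")
    mean_inclination = sum(s[1] for s in stations) / len(stations)
    return {
        "clat_ft": completed_lateral_length(top_md_ft, bottom_md_ft),
        "mean_azimuth_deg": mean_azimuth_deg([s[2] for s in stations]),
        # Assumed convention: 90 deg inclination is horizontal, so a
        # positive dip means the lateral climbs toward the toe.
        "mean_dip_deg": mean_inclination - 90.0,
    }
```

The circular mean matters for wells drilled near due north, where an arithmetic average of azimuths on either side of 0° would point the wrong way.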

The well-stimulation-treatment data analyzed in this study focused on generic fracturing-fluid type and volume and on proppant type and mass. Stage-by-stage data were aggregated to the well level for analysis. As part of the data-validation process, public treatment data, even within the same data model, were cross checked for internal consistency, and inconsistencies were flagged.
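A minimal sketch of the stage-to-well aggregation and a consistency check of the kind described above is shown here. The record keys (`well_id`, `fluid_gal`, `proppant_lbm`), the comparison against a separately reported well total, and the 5% tolerance are hypothetical choices for illustration; the paper does not state the authors' field names or thresholds.

```python
from collections import defaultdict

def aggregate_stages(stage_rows):
    """Roll stage-level treatment records up to one record per well.

    Each row is a dict with the hypothetical keys
    'well_id', 'fluid_gal', and 'proppant_lbm'.
    """
    wells = defaultdict(
        lambda: {"stage_count": 0, "fluid_gal": 0.0, "proppant_lbm": 0.0}
    )
    for row in stage_rows:
        w = wells[row["well_id"]]
        w["stage_count"] += 1
        w["fluid_gal"] += row["fluid_gal"]
        w["proppant_lbm"] += row["proppant_lbm"]
    return dict(wells)

def flag_inconsistent(well_totals, reported_proppant_lbm, rel_tol=0.05):
    """Return ids of wells whose summed stage proppant disagrees with a
    separately reported well total by more than rel_tol (assumed 5%).
    Wells with a missing or nonpositive reported total are also flagged.
    """
    flagged = []
    for well_id, agg in well_totals.items():
        reported = reported_proppant_lbm.get(well_id)
        if reported is None or reported <= 0:
            flagged.append(well_id)
        elif abs(agg["proppant_lbm"] - reported) / reported > rel_tol:
            flagged.append(well_id)
    return flagged
```

Flagging rather than deleting suspect wells keeps the decision about how to treat them visible at modeling time.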

Production-Trend Analysis

Different hydrocarbon types in the Eagle Ford may be characterized by their cumulative gas/oil ratio (GOR). Fluid types evolve basinward from black oil to volatile oil, to condensate, and finally to dry gas and vary with increasing formation depth, pore pressure, gravity in °API, and thermal maturity (Fig. 3). Eagle Ford production is found at true vertical depths between 4,000 and 14,000 ft. Porosity ranges from approximately 6 to 11%, and pressure gradient ranges between 0.5 and 0.8 psi/ft.

Fig. 3—Types of Eagle Ford shale fluids characterized by GOR.

Special care was taken when calculating GOR and selecting wells for inclusion in or exclusion from the analysis. For the purposes of this paper, wells having GOR values greater than 15,000 scf/bbl were excluded from the study. These high-GOR wells were excluded because the study was focused primarily on liquids production. Indeed, better production rates were observed in each study area when GOR values were less than 5,000 scf/bbl.
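The GOR screen can be expressed as a short filter. The sketch below assumes that cumulative gas is stored in Mscf (thousand standard cubic feet) and cumulative oil in barrels, as is common in production databases; the key names are hypothetical, and treating wells with no reported oil as having infinite GOR is a choice made here, not one stated in the paper.

```python
GOR_CUTOFF_SCF_PER_BBL = 15000.0  # exclusion threshold stated in the text

def cumulative_gor(cum_gas_mscf, cum_oil_bbl):
    """Cumulative GOR in scf/bbl from gas in Mscf and oil in bbl."""
    if cum_oil_bbl <= 0:
        # Assumption: no oil produced is treated as infinitely gassy.
        return float("inf")
    return cum_gas_mscf * 1000.0 / cum_oil_bbl

def select_wells(wells):
    """Keep wells at or below the cutoff and attach their GOR.

    wells: list of dicts with hypothetical keys
    'cum_gas_mscf' and 'cum_oil_bbl'.
    """
    kept = []
    for w in wells:
        gor = cumulative_gor(w["cum_gas_mscf"], w["cum_oil_bbl"])
        if gor <= GOR_CUTOFF_SCF_PER_BBL:
            kept.append({**w, "gor_scf_bbl": gor})
    return kept
```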

Boosted-Tree Regression Models

Considering the complex nonlinear nature of the data set, the authors adopted a tree-boosting method (specifically, gradient boosting) for the purposes of this study. This machine-learning method generates a sequence of simple decision trees. Each tree is fit to the prediction residuals left by the trees before it, eventually producing a predictive model in the form of a weighted additive ensemble of simple trees. Compared with traditional multivariate modeling methods, tree boosting is more resistant to common data issues such as missing data and outliers. It also handles nonlinearity and variable interactions well because of the hierarchical structure of decision trees.
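To make the residual-fitting idea concrete, the following is a deliberately small gradient-boosting sketch for squared-error loss, written in plain Python. It is not the authors' model: it uses one-split trees (stumps), a fixed learning rate, and no handling of missing values, and every name in it is illustrative. Because stumps split on a single variable, this toy version cannot represent variable interactions; deeper trees would be needed for that.

```python
def fit_stump(X, residuals):
    """Find the one-split tree that minimizes squared error on the residuals.

    Returns (feature_index, threshold, left_value, right_value).
    """
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    if best is None:
        # Every feature is constant: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]

def fit_boosted(X, y, n_trees=50, learning_rate=0.1):
    """Fit an additive ensemble of stumps to y by repeated residual fitting."""
    base = sum(y) / len(y)          # initial prediction: the mean of y
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, thr, lv, rv = fit_stump(X, residuals)
        stumps.append((j, thr, lv, rv))
        pred = [p + learning_rate * (lv if row[j] <= thr else rv)
                for p, row in zip(pred, X)]
    return base, learning_rate, stumps

def predict(model, row):
    """Sum the base value and the shrunken contribution of every stump."""
    base, lr, stumps = model
    return base + sum(lr * (lv if row[j] <= thr else rv)
                      for j, thr, lv, rv in stumps)
```

The learning rate shrinks each tree's contribution, so many small corrections accumulate into the final prediction instead of any single tree dominating.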

Results and Discussion

Maximum-Oil-Rate Models. When multiple variables influence the target variable simultaneously, the goal is to learn which ones are the key drivers. The relative influence is essentially a weighted average of how frequently a variable is used for splitting trees; higher values on the influence plot suggest stronger effects on the target variable. For producing Eagle Ford wells in Area 1, GOR stands out as the most influential predictor, followed by proppant amount; X, Y location; and CLAT. The remaining variables are somewhat less influential.
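One common way to compute such a relative-influence score is to add up, for each variable, the error reduction achieved by every split that uses it, and then rescale so that the scores sum to 100. The sketch below assumes that weighting; the paper does not spell out the exact formula, and the input format (a flat list of splits) is hypothetical.

```python
from collections import defaultdict

def relative_influence(splits):
    """Rank variables by their share of total split improvement.

    splits: iterable of (feature_name, improvement) pairs, one pair per
    split in the ensemble, where improvement is the error reduction the
    split achieved. Returns a dict ordered from most to least
    influential, with values scaled to sum to 100.
    """
    totals = defaultdict(float)
    for name, improvement in splits:
        totals[name] += improvement
    grand_total = sum(totals.values())
    if grand_total <= 0:
        return {name: 0.0 for name in totals}
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {name: 100.0 * value / grand_total for name, value in ranked}
```

A variable that is used often but only for small improvements can therefore rank below one that is used rarely for large improvements.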

As another output from the boosting model, partial-dependence plots show the marginal effect of the chosen variables on the target variable. The partial-dependence plot of proppant quantity shows that increased proppant quantity is generally associated with increased productivity, at least up to the maximum 8-million-lbm treatments shown in the data set. Well location is also a key driver, as is CLAT, with increased peak oil rate being associated with longer laterals over the range of approximately 3,000 to 6,000 ft.
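A partial-dependence curve of this kind can be computed from any fitted model by forcing the chosen variable to each value on a grid, leaving all other variables as observed, and averaging the predictions. The sketch below assumes only that the model is available as a function from a feature row to a prediction; the function and argument names are illustrative.

```python
def partial_dependence(predict_fn, X, feature_index, grid):
    """Average prediction as one feature is swept across a grid.

    predict_fn: callable taking one feature row (a list) and returning
                a number.
    X:          observed feature rows; the other features keep these
                values.
    Returns a list of (grid_value, mean_prediction) pairs.
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)            # copy so X is not mutated
            modified[feature_index] = value
            total += predict_fn(modified)
        curve.append((value, total / len(X)))
    return curve
```

Because the curve averages over the observed values of the other variables, it shows a marginal association and should be read only within the range the data actually cover, such as the proppant quantities and lateral lengths noted above.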

Efficiency Models. In addition to production performance, operators are often concerned about production efficiency for economic analysis. To investigate a measure of well efficiency, boosting models were constructed on production over horizontal-well length. Plots reveal that CLAT appears to be negatively associated with production efficiency [i.e., longer CLAT was related to lower barrels of oil (BO)/CLAT]. The production efficiency is highest when CLAT is less than 2,000 ft. Wells with GOR between 1,000 and 5,000 scf/bbl showed better production efficiency. Optimum well azimuth, well location, and larger fracture-treatment sizes (up to 8 million lbm) also have positive effects on production efficiency.
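The efficiency target is simply oil production normalized by completed lateral length. The sketch below uses made-up numbers for two hypothetical wells to show how a longer lateral can produce more oil in total while yielding fewer barrels per foot; the production window over which oil is summed is not specified here and is left to the caller.

```python
def oil_per_lateral_ft(oil_bbl, clat_ft):
    """Production efficiency: barrels of oil per foot of completed lateral."""
    if clat_ft <= 0:
        raise ValueError("CLAT must be positive")
    return oil_bbl / clat_ft

# Hypothetical wells, for illustration only (not data from the study).
short_well = {"oil_bbl": 60000.0, "clat_ft": 3000.0}
long_well = {"oil_bbl": 90000.0, "clat_ft": 6000.0}

short_eff = oil_per_lateral_ft(**short_well)   # 20 bbl/ft
long_eff = oil_per_lateral_ft(**long_well)     # 15 bbl/ft
```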

Conclusions

  • Many variables in well architecture, completion, stimulation, and production are not normally distributed (e.g., they may be strongly skewed or bimodally distributed). For these data sets, boosted-tree regression models may serve better than more-classical multiple linear regression models.
  • Well location in each of the three study areas is a strong predictor of productivity. Location is interpreted to carry with it a relatively systematic variation in fundamental reservoir parameters such as matrix and system permeability, reservoir pressure, thickness, and reservoir-fluid viscosity.
  • With the maximum-oil-rate production metric as the target, GOR is also a strong predictor of productivity in all three study areas across the range of GOR values studied (<15,000 scf/bbl). GOR is also a strong predictor of BO/CLAT in Areas 1 and 3 but was not found to be significant in Area 2. GOR is interpreted to be a reasonable proxy for reservoir-fluid viscosity.
  • Within the range of lateral lengths represented in the data sets, longer laterals resulted in higher total production rates per well.
  • Within the range of lateral lengths studied, longer laterals were less efficient (i.e., they were associated with lower BO/CLAT values). Whether this relates to economic efficiency is beyond the scope of the study.
  • Based on the wells studied, the effect of stage count on Eagle Ford BO production appears much less significant than the effect of treatment size.
  • Larger treatments using more proppant were associated with better productivity (maximum-oil-rate model) in all study areas.
  • Larger treatments were associated with improved BO/CLAT in Area 1 but not in Areas 2 and 3.

This article, written by Special Publications Editor Adam Wilson, contains highlights of paper SPE 168628, “Application of Multivariate Statistical Modeling and Geographic-Information-Systems Pattern-Recognition Analysis to Production Results in the Eagle Ford Formation of South Texas,” by Randy F. LaFollette, SPE, Ghazal Izadi, SPE, and Ming Zhong, SPE, Baker Hughes, prepared for the 2014 SPE Hydraulic Fracturing Technology Conference, The Woodlands, Texas, USA, 4–6 February. The paper has not been peer reviewed.