Modeling Physics With Artificial Intelligence and Machine Learning

Domain experts who have a good understanding of artificial intelligence (AI) and machine learning (ML)—and have become expert practitioners of this technology—can be successful in modeling physics- and engineering-related problems purely based on data. When the data used for such modeling is generated using mathematical equations, the data-driven AI-based model is called a “smart proxy.” When the data used for such modeling is generated using field measurements, the data-driven AI-based model is called a “data-driven model.”

Combining these two techniques is not a good idea. Here is the reason: When data-driven models are developed on the basis of field measurements, ML algorithms are trained to model the physics of the phenomena of interest. This practice gives data-driven models a unique set of characteristics, which is the main reason for building them as an alternative to the traditional mathematical formulation of the physics: the avoidance of biases, preconceived notions, gross assumptions, and problem simplifications. Such biases and assumptions usually depend on several things, including the scientists’ current understanding of the physics of the phenomena being modeled. Another characteristic of data-driven models is a much shorter development time, which requires fewer resources from an operating company.

Smart-proxy modeling has its own set of characteristics that can contribute significantly to the use of numerical simulation and modeling, including numerical reservoir simulation as well as computational fluid dynamics. Combining data-driven models with smart-proxy modeling (or any other technique that uses mathematical equations to generate data) undermines those characteristics of the data-driven models.

Terms such as “physics-based data-driven modeling,” “physics-based AI,” “data physics,” “augmented AI,” and, specifically, “hybrid model” seem to be used when initial attempts to build data-driven models based purely on experience (field measurements) are unsuccessful. To overcome this failure to represent physics- and engineering-related problems accurately, the developers try to combine mathematical equations with the field measurements. The question that should be asked of such individuals is, “Would you still be interested in building hybrid models if you were able to build accurate and explainable data-driven models?”

People may use these terms for several reasons. Some use them to satisfy the traditionalists in our industry, and some use them only for marketing purposes. Those who find it too difficult to discuss the use of field measurements for data-driven modeling with the traditionalists in our industry may find this approach to be an easy solution. Using terminology that includes the word “physics” may help convince traditionalists that they are using physics in addition to data to accomplish their objectives. Those who use this terminology as a marketing tool, however, have very little interest (or scientific knowledge, for that matter) in the correctness and validity of such terminology.

Finally, some people may try to solve a problem that includes so little data (field measurements) that using ML to train a model is almost impossible. In some specific cases, getting help from the known physics in the form of equations may make sense; however, this does not apply to reservoir modeling in any way, or even to a large number of oil- and gas-related problems. Such an approach has been applied to certain geophysics-related inverse modeling problems. These are mainly technical and scientific exercises and have nothing to do with marketing schemes.

When it comes to petroleum data analytics, specifically reservoir engineering, reservoir modeling, and reservoir management, people have been applying the idea of hybrid models (including all the other terms mentioned previously) to both conventional and unconventional plays. While these approaches are inappropriate for conventional plays, they are far more unrealistic and completely nonscientific when applied to unconventional plays. Data-driven models can be the most important contributors to reservoir modeling and reservoir management in unconventional plays. Unfortunately, the understanding of AI and ML technology in our industry has not been deep and solid enough to take advantage of such a contribution, at least not until now. Let us hope that, eventually, our industry will figure out how AI and ML should be applied to unconventional plays.

Shahab D. Mohaghegh, a pioneer in the application of artificial intelligence and data mining in the exploration and production industry, is the president and chief executive officer of Intelligent Solutions Inc. and professor of petroleum and natural gas engineering at West Virginia University. He holds BS, MS, and Ph.D. degrees in petroleum and natural gas engineering. Mohaghegh has authored more than 150 technical papers and has been an SPE Distinguished Lecturer. He has been featured in the Distinguished Author Series of JPT four times. Mohaghegh is a founder of SPE’s Petroleum Data-Driven Analytics Technical Section.

