Application of machine learning algorithms in predicting pyrolytic analysis result
https://doi.org/10.32454/0016-7762-2020-63-6-8-19
Abstract
Introduction. Geochemical studies of organic matter in oil source rocks play an important role in assessing oil and gas accumulation in a territory. They are particularly important for forecasting unconventional oil and gas resources and reserves (so-called shale hydrocarbons). For rocks saturated with organic matter, it is recommended to carry out Rock-Eval pyrolysis on samples both before and after chloroform extraction. However, extraction is laborious and time-consuming, and it doubles both the load on laboratory equipment and the time required for analysis.
Aim. To obtain a working model that predicts the pyrolytic parameters of extracted samples without carrying out the extraction.
Materials and methods. Machine learning regression algorithms are applied to predict one of the pyrolysis parameters of extracted samples, using the pyrolytic analysis results of extracted and non-extracted samples. Five regression algorithms were applied and compared: multiple linear regression, polynomial regression, support vector regression, decision tree, and random forest.
Results. The prediction results show that the relationship between the parameters before and after extraction is complex and non-linear. Some methods proved unsuitable for the task, while others gave good or satisfactory results. The latter algorithms can be applied to predict all geochemical parameters of extracted samples.
Conclusions. The best machine learning algorithm for this task is random forest.
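The comparison of five regression algorithms described in the abstract could be set up roughly as in the sketch below. It is only an illustration under stated assumptions: it uses scikit-learn, the data are synthetic stand-ins rather than the authors' measurements, and the feature names (S1, S2, Tmax, TOC before extraction), the invented target relationship, and all hyperparameters are hypothetical choices that the abstract does not specify.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the authors' measurements).
rng = np.random.default_rng(0)
n = 400
s1 = rng.uniform(0.1, 5.0, n)        # hypothetical S1 before extraction
s2 = rng.uniform(0.5, 30.0, n)       # hypothetical S2 before extraction
tmax = rng.uniform(420.0, 460.0, n)  # hypothetical Tmax before extraction
toc = rng.uniform(0.5, 10.0, n)      # hypothetical TOC before extraction
X = np.column_stack([s1, s2, tmax, toc])

# Invented non-linear relationship for a parameter "after extraction";
# chosen only to give the models something non-linear to fit.
y = s2 * (1.0 - 0.3 * np.tanh(s1 / toc)) + rng.normal(0.0, 0.3, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The five algorithm families named in the abstract; settings are assumptions.
models = {
    "multiple linear regression": LinearRegression(),
    "polynomial regression (degree 2)": make_pipeline(
        StandardScaler(), PolynomialFeatures(degree=2), LinearRegression()
    ),
    "support vector regression": make_pipeline(
        StandardScaler(), SVR(kernel="rbf", C=10.0)
    ),
    "decision tree": DecisionTreeRegressor(random_state=0),
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Print the models from best to worst held-out R^2.
for name, r2 in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:35s} R^2 = {r2:.3f}")
```

The ranking this prints reflects only the synthetic data and says nothing about the ranking reported in the paper. With real Rock-Eval measurements, the columns of `X` would be the parameters measured before extraction and `y` the parameter measured after extraction.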
About the Authors
Thi Nhut Suong Le
Russian Federation
Le Thi Nhut Suong — student
65 Leninskiy ave., Moscow 119991
tel.: +7 (977) 586-98-45
Competing Interests:
the authors declare no conflict of interest
A. V. Bondarev
Russian Federation
Alexsandr V. Bondarev — Cand. of Sci. (Geol.-Min.), Assoc. Prof.
Scopus ID: 56308173600
SPIN-code: 6559-1469
65 Leninskiy ave., Moscow 119991
tel.: +7 (499) 507-88-88
L. I. Bondareva
Russian Federation
Liana I. Bondareva — senior lecturer
Scopus ID: 57209737387
SPIN-code: 1584-1518
65 Leninskiy ave., Moscow 119991
tel.: +7 (499) 507-84-32
A. S. Monakova
Russian Federation
Aleksandra S. Monakova — Cand. of Sci. (Geol.-Min.), Assoc. Prof.
Scopus ID: 8574084700
SPIN-code: 5619-7973
65 Leninskiy ave., Moscow 119991
tel.: +7 (916) 849-57-04
A. V. Barshin
Russian Federation
Andrey V. Barshin — lecturer
Scopus ID: 57221607978
SPIN-code: 3618-5049
65 Leninskiy ave., Moscow 119991
tel.: +7 (915) 127-32-11
For citations:
Le T., Bondarev A.V., Bondareva L.I., Monakova A.S., Barshin A.V. Application of machine learning algorithms in predicting pyrolytic analysis result. Proceedings of higher educational establishments. Geology and Exploration. 2020;63(6):8-19. (In Russ.) https://doi.org/10.32454/0016-7762-2020-63-6-8-19