Data analytics applied to the analysis of petroleum production in Brazil / Análise de dados aplicada à análise da produção de petróleo no Brasil


  • Alessandra Brito Leal, Brazilian Journals Publicações de Periódicos, São José dos Pinhais, Paraná
  • Thiago Rafael da Silva Moura



Data Science, Business Intelligence, Petroleum Production


We mine the data set on petroleum production and distribution in Brazil published by the ANP (Agência Nacional do Petróleo e Gás, the National Oil and Gas Agency). We use modern data science techniques to collect, analyze, process, and model hydrocarbon production data from all production units operating between February 2009 and 2020. We highlight Brazil's high hydrocarbon production, which is tied to the performance of Petrobras, responsible for about 95% of national production. We also report an apparent paradox: the Tupi field has the highest daily production, yet it is not the largest national producer, a position held by the Marlim field. We present the data analytics techniques we use to resolve this paradox.
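The abstract does not say how the paradox is resolved. One plausible reading, offered here only as an assumption, is that "highest daily production" and "largest producer" are different aggregations of the same records: a peak rate versus a cumulative total over the whole period. The sketch below illustrates that distinction with synthetic data. The field names, months, rates, and day counts are all made up and are not taken from the ANP data set.

```python
from collections import defaultdict

# Synthetic records (hypothetical, NOT ANP data):
# (field, month, average daily production in barrels/day, producing days)
RECORDS = [
    ("field_young", "2019-11", 900.0, 30),
    ("field_young", "2019-12", 950.0, 31),
    ("field_old", "2019-07", 400.0, 30),
    ("field_old", "2019-08", 400.0, 30),
    ("field_old", "2019-09", 400.0, 30),
    ("field_old", "2019-10", 400.0, 30),
    ("field_old", "2019-11", 400.0, 30),
    ("field_old", "2019-12", 400.0, 30),
]


def summarize(records):
    """Return {field: (peak daily rate, cumulative production)}."""
    peak = defaultdict(float)
    total = defaultdict(float)
    for field, _month, rate, days in records:
        # Peak rate: the highest daily rate seen in any month.
        peak[field] = max(peak[field], rate)
        # Cumulative volume: daily rate times producing days, summed.
        total[field] += rate * days
    return {field: (peak[field], total[field]) for field in total}


summary = summarize(RECORDS)

# Rank the same fields by the two different metrics.
top_by_rate = max(summary, key=lambda f: summary[f][0])
top_by_total = max(summary, key=lambda f: summary[f][1])

print("highest daily rate:", top_by_rate)
print("largest cumulative producer:", top_by_total)
```

In this toy data the field with the highest daily rate has produced for fewer months, so it ranks second on cumulative volume. Whether the Tupi and Marlim figures follow the same pattern is something only the full paper and the ANP data can confirm.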







Original articles