PROCESSING AND CLEANING OF AGRICULTURAL PRODUCTIVITY DATA: APPLICATION OF A PYTHON SCRIPT

Authors

  • Raphael Prazeres da Silva
  • Welington Gonzaga do Vale
  • Janyelle do Nascimento Silva
  • Valfran José Santos Andrade
  • Patricia de Azevedo Castelo Branco do Vale
  • Adilson Machado Enes
  • Diego Andrade Pereira

DOI:

https://doi.org/10.56238/edimpacto2025.015-011

Keywords:

Precision agriculture, Data analysis, Yield map

Abstract

A Python script was developed to process and clean agricultural yield data from grain harvesters on a farm located in Brasnorte (MT), with the aim of improving data reliability in Precision Agriculture. The code, built on the Pandas library, followed three main steps: (1) filtering by machine operation status (retaining only “Effective” records), (2) removal of outliers (values below 500 kg/ha or above twice the average), and (3) iterative adjustment of machine-specific yield values to match the field average. The cleaned data were interpolated in QGIS using the inverse distance weighting (IDW) method. In total, 58.8% of the raw data were discarded in Field 1 and 66.9% in Field 2, mainly due to failures or sensors reading zero. Average yield increased from 2.67 t/ha to 3.67 t/ha (Field 1) and from 2.52 t/ha to 3.82 t/ha (Field 2), and extreme values were eliminated. The generated maps highlighted critical zones near field edges and data gaps. The results suggest that the tool efficiently automates data cleaning, though future studies should consider including cross-validation to reinforce the reliability of the results.
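
The three cleaning steps described in the abstract can be illustrated with a minimal Pandas sketch. This is not the authors' script: the function name `clean_yield`, the column names (`status`, `yield_kg_ha`, `machine_id`), and the tolerance and iteration limit are hypothetical. It also assumes that the outlier mean is computed over the "Effective" records and that the machine adjustment rescales each machine's yields by the ratio of the field mean to that machine's mean; the abstract does not specify either detail.

```python
import pandas as pd


def clean_yield(df, status_col="status", yield_col="yield_kg_ha",
                machine_col="machine_id", min_yield=500.0,
                tol=1e-6, max_iter=20):
    """Illustrative three-step cleaning; all column names are assumptions."""
    # Step 1: keep only records logged while the machine was effectively harvesting.
    out = df.loc[df[status_col] == "Effective"].copy()

    # Step 2: drop outliers, i.e. yields below 500 kg/ha or above twice the mean.
    # Assumption: the mean is taken over the "Effective" records before trimming.
    mean_yield = out[yield_col].mean()
    keep = (out[yield_col] >= min_yield) & (out[yield_col] <= 2 * mean_yield)
    out = out.loc[keep].copy()

    # Step 3: iteratively rescale each machine's yields until every machine's
    # mean matches the field mean (assumed form of the adjustment).
    out[yield_col] = out[yield_col].astype(float)
    for _ in range(max_iter):
        field_mean = out[yield_col].mean()
        machine_mean = out.groupby(machine_col)[yield_col].transform("mean")
        factor = field_mean / machine_mean
        if (factor - 1).abs().max() < tol:
            break  # all machine means already agree with the field mean
        out[yield_col] = out[yield_col] * factor
    return out
```

Under these assumptions the rescaling preserves the field mean while removing systematic offsets between harvesters. The cleaned table could then be exported (for example with `DataFrame.to_csv`) and loaded into QGIS for IDW interpolation.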

Published

2025-07-03