PROCESSING AND CLEANING OF AGRICULTURAL PRODUCTIVITY DATA: APPLICATION OF A PYTHON SCRIPT
DOI: https://doi.org/10.56238/arev7n7-037
Keywords: Precision agriculture, Data analysis, Yield map
Abstract
A Python script was developed to process and clean yield data from grain harvesters on a farm in Brasnorte (MT), with the aim of improving data reliability in Precision Agriculture. Using the Pandas library, the code followed three main steps: (1) filtering by machine operation status (retaining only “Effective” records), (2) removing outliers (values below 500 kg/ha or above twice the average), and (3) iteratively adjusting machine-specific yield values to match the field average. The cleaned data were interpolated in QGIS using the inverse distance weighting (IDW) method. In total, 58.8% of the raw data were discarded in Field 1 and 66.9% in Field 2, mainly because of failures or zeroed sensors. After cleaning and the elimination of extreme values, average yield increased from 2.67 t/ha to 3.67 t/ha in Field 1 and from 2.52 t/ha to 3.82 t/ha in Field 2. The generated maps highlighted critical zones near field edges and data gaps. The results suggest that the tool efficiently automates data cleaning, though future studies should consider cross-validation to reinforce the reliability of the results.
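The three cleaning steps can be illustrated with a minimal Pandas sketch. This is not the authors' script: the column names (`status`, `machine`, `yield_kg_ha`), the function name `clean_yield`, and the sample values are hypothetical. Two details are assumptions as well, since the abstract does not specify them: the "average" used for the upper outlier limit is taken here as the mean of the “Effective” records, and the machine adjustment is implemented as a multiplicative rescaling of each machine's yields, repeated until every machine mean agrees with the field mean.

```python
import pandas as pd


def clean_yield(df, status_col="status", machine_col="machine",
                yield_col="yield_kg_ha", min_yield=500.0,
                max_iter=10, tol=1e-6):
    """Sketch of the three cleaning steps; all column names are hypothetical."""
    # Step 1: keep only records logged while the machine was effectively harvesting.
    out = df[df[status_col] == "Effective"].copy()

    # Step 2: drop outliers below the fixed floor or above twice the average.
    # Assumption: the average is computed over the "Effective" records.
    mean_yield = out[yield_col].mean()
    keep = (out[yield_col] >= min_yield) & (out[yield_col] <= 2 * mean_yield)
    out = out[keep].copy()

    # Step 3: rescale each machine's yields so its mean matches the field mean.
    # Assumption: a multiplicative correction, repeated until it converges.
    for _ in range(max_iter):
        field_mean = out[yield_col].mean()
        machine_mean = out.groupby(machine_col)[yield_col].transform("mean")
        factor = field_mean / machine_mean
        if (factor - 1).abs().max() < tol:
            break
        out[yield_col] = out[yield_col] * factor
    return out


# Made-up records for illustration only (not data from the study).
raw = pd.DataFrame({
    "status": ["Effective"] * 6 + ["Turning"],
    "machine": ["A", "A", "A", "B", "B", "B", "A"],
    "yield_kg_ha": [3000, 3200, 100, 4000, 4200, 20000, 3500],
})
cleaned = clean_yield(raw)
print(cleaned.groupby("machine")["yield_kg_ha"].mean())
```

With a purely multiplicative correction, the loop settles after a single adjustment pass, because rescaling every machine to the field mean leaves the field mean unchanged. The cleaned table could then be exported (for example with `DataFrame.to_csv`) and loaded into QGIS for IDW interpolation.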