Problem statement

Greenhouse tomato producers face intense competition now that tomato has become one of the most important vegetables in Morocco. This calls for precise control over the means of production (labor, fertilizers, water, energy, …) as well as over logistics and commercial aspects.

One of these aspects is the fruit harvest date, which heavily affects input usage, transportation, and contracts with buyers. Accurately forecasting this date for each monitored greenhouse would therefore greatly benefit producers by reducing losses in productivity, fruit quality, and money.

Our objective is to develop a machine learning application that uses weather data and fruit stage development to predict each fruit's harvest date by monitoring sample rows in greenhouses. Implementing this model in an IT monitoring system would give farmers real-time feedback on each greenhouse's estimated harvest date and let them act on the returned harvest calendar.

Analytic approach

The approach regresses the cumulative values of weather parameters (heat units and solar radiation) from blooming to harvest on other variables such as stem position, fruit position, and the heat units or solar radiation accumulated from one stage to the next. The monitored stages are blooming, fruit setting, fruit growth, and the seven stages of coloration.
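To make the notion of accumulated heat units concrete, here is a minimal sketch of how a running total could be computed from daily minimum and maximum temperatures. The formula (daily mean temperature above a base, floored at zero) and the base temperature of 10 °C are assumptions for illustration; the source does not state which definition or base temperature was used, and the function names are hypothetical.

```python
# Assumption: heat units are computed as daily growing degree days.
# The base temperature below is hypothetical, not taken from the study.
BASE_TEMP_C = 10.0


def daily_heat_units(t_min, t_max, base=BASE_TEMP_C):
    """Heat units for one day: mean temperature above the base, floored at zero."""
    return max(0.0, (t_min + t_max) / 2.0 - base)


def accumulated_heat_units(daily_temps, base=BASE_TEMP_C):
    """Running total of heat units over a sequence of (t_min, t_max) pairs."""
    total = 0.0
    running = []
    for t_min, t_max in daily_temps:
        total += daily_heat_units(t_min, t_max, base)
        running.append(total)
    return running


# Three illustrative days; the last one is too cold to add any heat units.
temps = [(12.0, 24.0), (14.0, 26.0), (8.0, 10.0)]
print(accumulated_heat_units(temps))  # [8.0, 18.0, 18.0]
```

The same running-total idea would apply to solar radiation, with the daily sensor reading added directly instead of a temperature-derived value.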

Since two weather parameters were monitored, two models were developed for the user to choose from: one using heat units and another using solar radiation.
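The two-model setup might be sketched as follows. This is a deliberately simplified illustration: it fits one least-squares line per weather parameter with a single predictor (units accumulated from blooming to first coloration), whereas the approach described above uses several predictors. The numbers are synthetic and the names `fit_line`, `models`, and `predict_total` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic, illustrative numbers only (not from the study's data set).
# xs: units accumulated from blooming to first coloration
# ys: units accumulated from blooming to harvest
training = {
    "heat_units": ([360.0, 380.0, 400.0], [500.0, 530.0, 560.0]),
    "solar_radiation": ([900.0, 1000.0, 1100.0], [1300.0, 1400.0, 1500.0]),
}

# One fitted (intercept, slope) pair per weather parameter.
models = {name: fit_line(xs, ys) for name, (xs, ys) in training.items()}


def predict_total(parameter, units_to_first_coloration):
    """Predict total units from blooming to harvest with the chosen model."""
    intercept, slope = models[parameter]
    return intercept + slope * units_to_first_coloration


# The user picks which weather parameter drives the prediction.
print(predict_total("heat_units", 390.0))
print(predict_total("solar_radiation", 1050.0))
```

Keeping the fitted models in a dictionary keyed by weather parameter lets the caller switch between them with a single argument.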

The data come from the database of an IT monitoring system and from greenhouse sensors that capture the weather parameters.

Data overview

From blooming, a fruit takes on average 72 days to reach harvest, which is equivalent to 528 accumulated heat units. Of these 72 days, an average of 52 days elapse between blooming and the first coloration and 9 days between the first and second coloration, while the remaining 11 days are spread across the other stages.

This suggests that the harvest date is mainly determined by the blooming date and by the transitions to the first and second coloration stages. This hypothesis is tested later.
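The averages quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in this section (72 days, 528 heat units, and the 52/9/11-day split); the variable names are illustrative.

```python
total_days = 72
total_heat_units = 528

# Average duration of each span, in days, as reported in the text.
stage_days = {
    "blooming to first coloration": 52,
    "first to second coloration": 9,
    "remaining stages": 11,
}

# The three spans should add up to the full blooming-to-harvest duration.
assert sum(stage_days.values()) == total_days

# Share of the total duration taken by each span.
shares = {stage: days / total_days for stage, days in stage_days.items()}
for stage, share in shares.items():
    print(f"{stage}: {share:.1%}")

# Share covered by the first two spans together.
first_two = (52 + 9) / total_days
print(f"first two spans combined: {first_two:.1%}")  # 84.7%

# Average heat units accumulated per day over the whole period.
mean_daily_heat_units = total_heat_units / total_days
print(f"mean heat units per day: {mean_daily_heat_units:.2f}")  # 7.33
```

The first two spans cover roughly 85% of the blooming-to-harvest period, which is consistent with the hypothesis that they dominate the harvest date.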