AgriSpec AI is an automated machine-learning pipeline that classifies crop types for the current agricultural season by learning the multi-temporal spectral signatures (NDVI, EVI, NDWI, SWIR and thermal bands) captured by Sentinel-2 and Landsat-8 across the previous cropped season.
01 - The problem
Official declarations arrive 6–18 months after harvest, too late for yield forecasting, subsidy control or market intelligence.
Generic land-cover maps (e.g. ESA WorldCover) mix fallow, pasture and 10+ crops under a single class.
A snapshot in July cannot separate maize from sorghum, or winter wheat from barley; phenology is required.
Sowing windows and rotation patterns are shifting year on year; static look-up tables decay quickly.
02 - Core hypothesis
Each crop carries a near-identical "phenological fingerprint" across years: a recurring curve of green-up, peak, senescence and harvest. A model trained on a full season of Sentinel-2 composites generalises to the next season's observations without any in-season labels.
x = [NDVI(t_1), EVI(t_1), NDWI(t_1), …, B11(t_{23})] ∈ ℝ^{23×8}
y_t = crop class ∈ {1, …, 17}
X_{t-1}, y_{t-1} (labelled LPIS / ground truth)
p(x_t | y_t) ≈ p(x_{t-1} | y_{t-1}) for stable climate years
03 - Methodology
From raw Sentinel-2 L1C tiles to a per-parcel crop label: no current-season labels are needed at inference time, and every step is reproducible.
Sentinel-2 L2A (10 m) and Landsat-8/9 (30 m) are streamed from a STAC catalogue. Only scenes with < 20 % cloud cover over the area of interest (AOI) are kept.
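The cloud filter can be sketched on plain STAC item dictionaries. This is a minimal illustration, not the pipeline's actual code: it assumes each item carries the `eo:cloud_cover` property from the STAC EO extension, which is a scene-level percentage. An AOI-level figure, as described above, would need a per-pixel mask instead. The function name and the sample items are hypothetical.

```python
MAX_CLOUD_COVER = 20.0  # percent, the threshold quoted in the text


def keep_clear_scenes(items, max_cloud=MAX_CLOUD_COVER):
    """Return STAC items whose scene-level cloud cover is strictly below max_cloud."""
    kept = []
    for item in items:
        cloud = item.get("properties", {}).get("eo:cloud_cover")
        # Items without the field are dropped rather than assumed clear.
        if cloud is not None and cloud < max_cloud:
            kept.append(item)
    return kept


# Hypothetical items, trimmed to the fields used here.
scenes = [
    {"id": "scene-a", "properties": {"eo:cloud_cover": 4.2}},
    {"id": "scene-b", "properties": {"eo:cloud_cover": 37.0}},
    {"id": "scene-c", "properties": {}},
]
clear_ids = [scene["id"] for scene in keep_clear_scenes(scenes)]
```

Dropping items that lack the field is a conservative choice; a production catalogue query would more likely push the same condition into the search request.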
Sen2Cor produces bottom-of-atmosphere (BOA) reflectances, and the scene classification (SCL) mask is applied. Temporal interpolation with a Whittaker smoother reconstructs a continuous 5-day composite stack per parcel.
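For readers unfamiliar with the Whittaker smoother, the sketch below shows the weighted form on one 23-step series. It assumes a dense solve with NumPy, a second-order difference penalty, and zero weights for cloud-masked dates. The smoothing parameter `lam` and the synthetic NDVI curve are illustrative values, not settings taken from the pipeline.

```python
import numpy as np


def whittaker_smooth(y, weights, lam=10.0, order=2):
    """Weighted Whittaker smoother: minimise sum w_i (y_i - z_i)^2 + lam * |D z|^2."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = y.size
    # D is the finite-difference operator of the given order, shape (n - order, n).
    D = np.diff(np.eye(n), n=order, axis=0)
    # Normal equations: (W + lam * D^T D) z = W y.
    A = np.diag(w) + lam * D.T @ D
    # Masked samples may be NaN, so replace them before multiplying by w = 0.
    rhs = w * np.where(w > 0, y, 0.0)
    return np.linalg.solve(A, rhs)


t = np.arange(23)
ndvi = 0.2 + 0.6 * np.exp(-(((t - 11) / 5.0) ** 2))  # synthetic seasonal curve
observed = ndvi.copy()
cloudy = np.array([3, 4, 12, 18])                    # dates removed by the mask
observed[cloudy] = np.nan
w = np.ones(23)
w[cloudy] = 0.0
smoothed = whittaker_smooth(observed, w, lam=1.0)    # gap-free series, length 23
```

The dense solve is fine for 23 time steps per parcel; for long series a sparse banded solver would be the usual choice.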
NDVI, EVI, NDWI, GCVI, LAI-proxy and tasselled-cap wetness/greenness/brightness are derived band-wise across 23 time steps.
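As an illustration, four of these indices can be computed from surface reflectance as follows. The sketch assumes reflectances scaled to [0, 1], the usual Sentinel-2 band mapping (B2 blue, B3 green, B4 red, B8 NIR, B11 SWIR), and the NIR/SWIR form of NDWI; the source does not say which NDWI variant is used. The LAI proxy and the tasselled-cap components are left out because their coefficients are not given.

```python
import numpy as np


def spectral_indices(blue, green, red, nir, swir1):
    """Return NDVI, EVI, NDWI and GCVI stacked on the last axis."""
    blue, green, red, nir, swir1 = (
        np.asarray(band, dtype=float) for band in (blue, green, red, nir, swir1)
    )
    ndvi = (nir - red) / (nir + red)
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
    ndwi = (nir - swir1) / (nir + swir1)  # NIR/SWIR variant, an assumption here
    gcvi = nir / green - 1.0
    return np.stack([ndvi, evi, ndwi, gcvi], axis=-1)


# One parcel, 23 time steps of constant synthetic reflectance.
steps = 23
stack = spectral_indices(
    blue=np.full(steps, 0.05),
    green=np.full(steps, 0.10),
    red=np.full(steps, 0.10),
    nir=np.full(steps, 0.40),
    swir1=np.full(steps, 0.20),
)  # shape (23, 4)
```

The function does not guard against zero denominators; masked or no-data pixels would need to be excluded beforehand.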
A 1D-CNN + bidirectional GRU encoder compresses the 23 × 8 cube into a 128-d latent vector per parcel: the transferable signature.
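A minimal PyTorch sketch of such an encoder is shown below to make the tensor shapes concrete. The class name, layer widths and kernel size are assumptions chosen for illustration. They do not reproduce the published architecture or its parameter count.

```python
import torch
from torch import nn


class SeasonEncoder(nn.Module):
    """1D-CNN followed by a bidirectional GRU; returns one latent vector per parcel."""

    def __init__(self, n_features=8, conv_channels=32, latent_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Two directions of latent_dim // 2 units concatenate to latent_dim.
        self.gru = nn.GRU(
            conv_channels, latent_dim // 2, batch_first=True, bidirectional=True
        )

    def forward(self, x):
        # x: (batch, time, features); Conv1d expects (batch, channels, time).
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        _, h_n = self.gru(h)  # h_n: (2, batch, latent_dim // 2)
        return torch.cat([h_n[0], h_n[1]], dim=-1)


encoder = SeasonEncoder()
parcels = torch.randn(4, 23, 8)  # four synthetic parcels, 23 steps, 8 features
signature = encoder(parcels)     # shape (4, 128)
```

Using the final hidden state of each direction is one simple way to pool over time; attention pooling over all 23 outputs would be an alternative.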
Domain-adversarial training with a Gradient Reversal Layer, supervised by previous-season labels, aligns the latent space so that current-season embeddings fall into the same class clusters as those of the previous season.
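The gradient reversal trick is compact enough to show in full. The sketch below follows the common PyTorch formulation: identity on the forward pass, gradient multiplied by a negative factor on the backward pass. The two-class season head, the random latent vectors and the factor `lam = 1.0` are hypothetical stand-ins, not details from the pipeline.

```python
import torch
from torch import nn


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; scales the gradient by -lam in the backward pass."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # lam is a plain float and receives no gradient, hence the trailing None.
        return -ctx.lam * grad_output, None


latent = torch.randn(6, 128, requires_grad=True)  # stand-in for encoder output
season_head = nn.Linear(128, 2)                   # hypothetical: previous vs current season
season = torch.tensor([0, 0, 0, 1, 1, 1])

logits = season_head(GradientReversal.apply(latent, 1.0))
loss = nn.functional.cross_entropy(logits, season)
loss.backward()  # latent.grad now pushes the encoder to confuse the season head
```

The season head still learns to tell seasons apart, while the reversed gradient drives the encoder toward features that carry no season information.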
A soft-attention classifier outputs class probabilities plus a per-parcel confidence flag. Parcels with a confidence below 0.65 are routed to a human-in-the-loop queue.
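The routing rule can be sketched as below. It assumes the confidence is simply the highest class probability; the source does not define how the flag is computed, so treat that as a placeholder. The function name and the three-class example are hypothetical.

```python
import numpy as np

CONFIDENCE_THRESHOLD = 0.65  # value quoted in the text


def route_parcels(probs, threshold=CONFIDENCE_THRESHOLD):
    """Return predicted class index, confidence and a review flag per parcel."""
    probs = np.asarray(probs, dtype=float)
    labels = probs.argmax(axis=1)      # zero-based class index
    confidence = probs.max(axis=1)     # assumed definition of confidence
    needs_review = confidence < threshold
    return labels, confidence, needs_review


probs = [
    [0.90, 0.05, 0.05],
    [0.50, 0.30, 0.20],
    [0.10, 0.70, 0.20],
]
labels, confidence, needs_review = route_parcels(probs)
review_queue = np.flatnonzero(needs_review).tolist()  # indices sent to a human
```

Here only the second parcel, whose top probability is 0.50, lands in the review queue.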
04 - Model architecture
The encoder is intentionally small (380 k parameters) so it can be retrained per AOI on commodity GPUs in under 40 minutes. The adversarial head is what makes the model transfer across seasons.
05 - Empirical results
Lower-bound 0.65, upper-bound 0.97.
Top confusable class pairs (rows: predicted).
06 - Training data
The model is trained on 1.2 M labelled parcels spanning 4 cropped seasons (2020-2023) across 9 European countries. Labels come from LPIS / GSAA declarations harmonised to the LUCAS taxonomy.
07 - Honest limits
Performance drops for parcels smaller than 0.5 ha, where mixed pixels dominate the spectral signal.
Drought or flood years weaken the cross-season transfer; per-AOI fine-tuning is then required.
Parcel identity is assumed stable; mergers and splits need a re-segmentation step upstream.
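A back-of-the-envelope calculation shows why the 0.5 ha limit bites. The sketch assumes an idealised square parcel aligned with the pixel grid, which real parcels are not, so the edge fractions are rough indications only. Both helper functions are hypothetical.

```python
import math


def pixels_per_parcel(area_ha, pixel_size_m):
    """Number of pixel footprints that fit in the parcel area."""
    return area_ha * 10_000 / pixel_size_m ** 2


def edge_pixel_fraction(area_ha, pixel_size_m):
    """Share of pixels on the boundary of a square, grid-aligned parcel."""
    n = math.sqrt(area_ha * 10_000) / pixel_size_m  # pixels along one side
    if n <= 2:
        return 1.0  # every pixel touches the boundary
    return 1.0 - ((n - 2) / n) ** 2


for size_m in (10, 20, 30):
    count = pixels_per_parcel(0.5, size_m)
    edge = edge_pixel_fraction(0.5, size_m)
    print(f"{size_m} m pixels: {count:.1f} per parcel, edge share {edge:.2f}")
```

A 0.5 ha parcel holds about 50 pixels at 10 m, and close to half of them sit on the boundary under this simplification. At 30 m it holds fewer than six, nearly all of them boundary pixels.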
Open notebooks, pre-trained encoders and a 100 k-parcel demo dataset are available for replication and extension.