At a time when data science and machine learning were still emerging fields, we faced the challenge of measuring the chemical composition of mushroom compost accurately and efficiently. Compost is a highly heterogeneous material, and analysing it had traditionally required labour-intensive manual chemical testing.
I set out to develop a scalable, intelligent solution that could automate and accelerate chemical analysis while maintaining exceptional levels of accuracy and adaptability across varying client requirements and sample types.
I combined near-infrared (NIR) spectra, which capture the molecular vibrations of a sample, with laboratory reference results to build and validate a predictive model. Applying early machine learning and chemometric techniques, I delivered the full development lifecycle: data preprocessing (noise reduction, normalization), multivariate calibration, cross-validation, and operational deployment. In parallel, I re-engineered laboratory workflows to reduce sample variability, which raised measurement standards and improved the reliability of the model.
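The preprocessing, calibration, and cross-validation steps can be illustrated with a minimal, dependency-free sketch. It is not the original implementation: the specific techniques are assumptions chosen as common chemometric stand-ins (a moving average for noise reduction, standard normal variate for normalization, and partial least squares with a single latent variable for calibration), all function names are hypothetical, and the spectra are synthetic rather than real compost data.

```python
import math
import random
from statistics import mean, pstdev


def smooth(spectrum, window=5):
    """Noise reduction: centred moving average (window shrinks at the edges)."""
    half = window // 2
    n = len(spectrum)
    return [mean(spectrum[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def snv(spectrum):
    """Normalization: standard normal variate, applied to one spectrum at a time."""
    m, s = mean(spectrum), pstdev(spectrum)
    return [(v - m) / s for v in spectrum]


def preprocess(spectrum):
    return snv(smooth(spectrum))


def fit_pls1(X, y):
    """Multivariate calibration: PLS regression with a single latent variable."""
    n, p = len(X), len(X[0])
    x_mean = [mean(row[j] for row in X) for j in range(p)]
    y_mean = mean(y)
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in wavelength space that covaries most with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores along that direction, then regress y on the scores.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, [q * wj for wj in w]


def predict(model, x):
    x_mean, y_mean, coef = model
    return y_mean + sum((xj - mj) * bj for xj, mj, bj in zip(x, x_mean, coef))


def rmsecv(X, y, k=5):
    """k-fold cross-validation; returns the root-mean-square prediction error."""
    squared_errors = []
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        model = fit_pls1([X[i] for i in train], [y[i] for i in train])
        squared_errors += [(predict(model, X[i]) - y[i]) ** 2 for i in test]
    return math.sqrt(mean(squared_errors))


def _band(j, centre, width=4.0):
    """Gaussian-shaped absorption band (illustrative only)."""
    return math.exp(-((j - centre) ** 2) / (2 * width ** 2))


def synthetic_dataset(n_samples=40, n_wavelengths=50, seed=0):
    """Made-up spectra: a fixed background band plus an analyte band, with a
    random offset, scale (scatter) and noise. Purely illustrative data."""
    rng = random.Random(seed)
    spectra, reference = [], []
    for _ in range(n_samples):
        c = rng.uniform(1.0, 3.0)  # hypothetical lab reference value
        offset, scale = rng.uniform(0.0, 0.5), rng.uniform(0.9, 1.1)
        spectra.append([offset + scale * (_band(j, 12) + 0.1 * c * _band(j, 35))
                        + rng.gauss(0.0, 0.002) for j in range(n_wavelengths)])
        reference.append(c)
    return spectra, reference


if __name__ == "__main__":
    raw, lab = synthetic_dataset()
    X = [preprocess(s) for s in raw]
    print(f"RMSECV: {rmsecv(X, lab):.3f} (reference spread: {pstdev(lab):.3f})")
```

In this sketch the normalization step removes per-sample offset and scale differences before calibration, and the cross-validated error is compared against the spread of the reference values to judge whether the model adds predictive value. A production system would more likely rely on an established chemometrics or machine learning library, with several latent variables selected by cross-validation.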
The deployed system consistently predicted unknown samples to within ±1% and generalized well across all clients. By implementing this predictive NIR system and refining laboratory processes, I reduced operational days from five per week to one, which was reserved for periodic model updates. This transformation dramatically increased operational efficiency, reduced costs, and demonstrated the early potential of blending machine learning with traditional laboratory science to create scalable, data-driven solutions.