Machine learning keeps solar one step ahead of soiling


Worldwide, many of the sites that offer the highest solar irradiation also come with the disadvantage of dry, dusty conditions on the ground that can cause various problems for PV system performance.

Dealing with the losses caused by the buildup of dust on the surface of a module is big business for the PV industry, since these losses can quickly add up to significant amounts of lost revenue. Cleaning modules too frequently or investing in the wrong type of cleaning equipment can also hurt the economics of a project. The ability to accurately predict soiling losses on both long- and short-term timescales is therefore something that PV project developers and system operators value greatly.

Various approaches exist, employing different combinations of on-site sensors, historical climate data, local weather data, satellite imaging and more. A group of scientists led by the University of Cyprus sought to compare the accuracy of a few of these by checking modeled forecasts of soiling loss against data from a test installation on the University of Cyprus campus in Nicosia.

Machine learning

Soiling losses at the test site were calculated by comparing a cleaned module with an uncleaned one installed side by side. Six different models – three using a physical modeling approach and three based on machine learning – were evaluated for accuracy against the site data.
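
As a rough illustration of the side-by-side comparison, the sketch below uses the common soiling-ratio definition (output of the soiled module divided by output of the clean one) to derive a percentage loss. The article does not say which electrical quantity the researchers compared, so the use of daily energy, the function name and the sample readings are all assumptions made for illustration.

```python
def soiling_loss_percent(soiled_energy, clean_energy):
    """Return the soiling loss in percent for one pair of readings.

    Assumes both modules are identical and measured over the same period,
    so any difference in output is attributed to soiling.
    """
    if clean_energy <= 0:
        raise ValueError("clean_energy must be positive")
    soiling_ratio = soiled_energy / clean_energy
    return (1.0 - soiling_ratio) * 100.0


# Hypothetical (soiled, clean) daily energy readings in kWh, invented for illustration
daily_pairs = [(4.90, 5.00), (4.80, 5.00), (4.75, 5.00)]
daily_losses = [soiling_loss_percent(soiled, clean) for soiled, clean in daily_pairs]
```

In practice, a measured series like this would be the reference against which each model's predicted losses are scored.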

The three “physical” models are well-established methods to model soiling, while the machine learning methods are open-source programs being applied to soiling measurement for the first time. Full details of the models used and how they were evaluated can be found in the paper “Characterizing soiling losses for photovoltaic systems in dry climates: A case study in Cyprus,” published in Solar Energy.

The evaluation showed that the physical models, fed with field-observed data, achieved the highest accuracy, with error rates (root mean square error) of 1.16% on daily soiling losses and 0.83% on monthly soiling losses.

The machine learning approaches, however, were not far behind: the highest-performing machine learning model, called CatBoost, showed an error of 1.55% on daily soiling losses and 1.18% on monthly losses. The researchers note that, given shortcomings in the availability of field-observed data covering a whole site over a sufficient period, the machine learning models, based on environmental data gathered by satellite, could also be a useful approach.
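
For readers unfamiliar with the metric, root mean square error can be sketched in a few lines. The values below are invented for illustration and are not taken from the study; the function name is likewise hypothetical, and the paper's exact evaluation procedure may differ.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical daily soiling losses in percent (not data from the study)
observed_daily = [1.0, 2.0, 3.0, 4.0]
modeled_daily = [2.0, 2.0, 2.0, 4.0]
daily_rmse = rmse(modeled_daily, observed_daily)
```

Because the inputs are soiling losses in percent, the result is in percentage points of soiling loss, and a lower value means the model tracks the measurements more closely. The same function could be applied to monthly values once the daily series has been aggregated.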

“Modelling soiling with this kind of satellite-derived environmental data might help in scheduling O&M strategies and operations along the year to minimize the soiling loss; particularly, in dust and arid regions where sudden changes in the aerosols loading can happen and precipitation is much less frequent,” the researchers explained.
