The development of the OpenDML Project draws on several major publications by researchers in dengue analytics, as well as by researchers working on data manipulation in other fields. Each contribution is significant in its own right. The overviews below summarize the research in each publication and its significance for the expansion of OpenDML:
The purpose of this study was to develop a dengue forecasting model that would provide early warning of a dengue outbreak several months in advance, allowing sufficient time for effective control measures to be implemented. The researchers constructed a statistical model using weekly mean temperature and rainfall. This involved 1) identifying the optimal lag period for forecasting dengue cases; 2) developing a model that described past dengue distribution patterns; and 3) performing sensitivity tests to analyze whether the selected model could detect actual outbreaks. The study also used the selected model to forecast dengue cases for 2011–2012 using weather data alone. The model exhibited high sensitivity in distinguishing between an outbreak and a non-outbreak.
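The lag-identification step can be illustrated with a minimal sketch. It assumes the lag is chosen by maximizing the absolute Pearson correlation between a weather series shifted by `lag` weeks and the case series; the study's actual selection procedure may differ. The function names `pearson` and `best_lag` are hypothetical, and the data are synthetic, made up only to show the mechanics.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def best_lag(weather, cases, max_lag):
    """Return (lag, r) with the largest |r| between weather[t - lag] and cases[t]."""
    scores = {}
    for lag in range(max_lag + 1):
        # Pair the weather value at week t - lag with the case count at week t.
        shifted_weather = weather[: len(weather) - lag]
        aligned_cases = cases[lag:]
        scores[lag] = pearson(shifted_weather, aligned_cases)
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]


# Synthetic data (not from the study): cases follow temperature with a 3-week delay.
temperature = [26, 27, 29, 31, 30, 28, 27, 26, 28, 30, 32, 29]
cases = [0, 0, 0] + [2 * (t - 25) for t in temperature[:-3]]

lag, r = best_lag(temperature, cases, max_lag=5)
print(f"best lag: {lag} weeks, r = {r:.3f}")
```

In a real setting the same search would be run separately for temperature and for rainfall, since the two predictors need not share a lag.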
Among the relevant features of the study was its analysis of annual cumulative rainfall (mm) and mean temperature (°C), and of their correlation with subsequent dengue cases over a period of 10 years.
Insight from this study can be used to make the simulation data used in OpenDML more realistic, and to define predictor variables for outbreaks more precisely.
Example of linear correlations in the study
Another key aspect is the influence of past outbreaks on the number of current cases. Though this work is still in its conceptual phase, we are likely to adopt an autoregressive model to account for the serial relationship between past and present observations in the data.
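As a rough illustration of what such a model could look like, the sketch below fits a first-order autoregressive model, y[t] = c + phi * y[t-1], by ordinary least squares. The model order, the fitting method, and the names `fit_ar1` and `forecast_next` are all assumptions made for this example; no design has been fixed for OpenDML. The series is synthetic.

```python
def fit_ar1(series):
    """Fit y[t] = c + phi * y[t-1] by ordinary least squares; return (c, phi)."""
    previous = series[:-1]  # y[t-1]
    current = series[1:]    # y[t]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    sxy = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(previous, current))
    phi = sxy / sxx
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_next(series, c, phi):
    """One-step-ahead forecast from the last observed value."""
    return c + phi * series[-1]


# Synthetic weekly case counts generated exactly by y[t] = 5 + 0.5 * y[t-1].
weekly_cases = [40.0]
for _ in range(7):
    weekly_cases.append(5 + 0.5 * weekly_cases[-1])

c, phi = fit_ar1(weekly_cases)
next_week = forecast_next(weekly_cases, c, phi)
print(f"c = {c:.3f}, phi = {phi:.3f}, next week = {next_week:.3f}")
```

A higher-order model, or one that also includes lagged weather terms, would follow the same pattern with more regressors.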
This study describes a novel prediction method that uses Fuzzy Association Rule Mining to extract relationships between clinical, meteorological, climatic, and socio-political data from Peru. These relationships take the form of rules. The best set of rules is chosen automatically and forms a classifier. That classifier is then used to predict future dengue incidence as either HIGH (outbreak) or LOW (no outbreak). The cutoff is the mean of previous dengue incidence plus two standard deviations: incidence above it is HIGH, and incidence below it is LOW.
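The HIGH/LOW labeling can be sketched in a few lines. This covers only the threshold definition, not the fuzzy rule mining or the classifier itself. The use of the sample standard deviation is our assumption, since the description above does not say which estimator is used. The function names and the incidence values are hypothetical.

```python
from statistics import mean, stdev


def outbreak_threshold(previous_incidence):
    """Mean of previous incidence plus two (sample) standard deviations."""
    return mean(previous_incidence) + 2 * stdev(previous_incidence)


def label(incidence, threshold):
    """Label an incidence value as HIGH (outbreak) or LOW (no outbreak)."""
    return "HIGH" if incidence > threshold else "LOW"


# Synthetic incidence history, for illustration only.
history = [10, 12, 11, 13, 9, 11, 12, 10]
threshold = outbreak_threshold(history)

for value in (12, 20):
    print(value, label(value, threshold))
```

Because the threshold depends on the history window, the same incidence value can be labeled differently as the window moves forward in time.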
This study's findings on how seroprevalence (the proportion of a population with antibodies to the pathogen) and vector habitat characteristics correlate with outbreak frequency help us determine appropriate methods for defining predictor variables. The study also provides dengue case data from the Peruvian Ministry of Health, which serves as a reference for the actual model constructed from the predictor variables.