Statistical Modelling and Forecasting of Reported HIV Cases in Nepal

The number of HIV cases in Nepal shows an increasing trend. Estimates of the total number of prevalent HIV infections attributable to the major routes of infection make an important contribution to public health policy. They can be used for planning healthcare services and for contributing to estimates of the future numbers of people with severe HIV infection, which are used for planning health promotion programmes.

The first case was diagnosed in 1988. As of 2009, a total of 14787 positive cases had been reported, of whom 13005 were receiving HIV care. There are gaps and challenges to be addressed in the fight against HIV and AIDS. Nepal is a low-prevalence country for HIV and AIDS (0.49 percent). However, some groups show evidence of a concentrated HIV epidemic, e.g. sex workers, migrant populations and intravenous drug users (IVDUs), in both rural and urban areas. After the first case was diagnosed in 1988, the Ministry of Health and Population/Department of Health Service (MoHP/DoHS) and different stakeholders came forward to address HIV and AIDS issues. The main focus was on prevention. In 1995 the MoHP, in consultation with different stakeholders, developed a policy for the control of HIV and AIDS. However, the activities were implemented in a sporadic and disorganized manner.
The true extent of the epidemic in Nepal is not clear, since the available details are based on risk groups. Within the risk groups the prevalence rate is high, but this may not represent the prevalence rate in the general population 1,2. The study by Kermack and McKendrick (1927) of the Bombay plague of 1905-06 proved the capability of mathematical models in understanding and predicting epidemics. Anderson and May (1991) present more models of infections, including HIV, with illustrations. Mukerji (1989) represents one of the earliest Indian attempts at modelling data on AIDS [3][4][5]. This model used annualized South Asian regional data and extrapolated AIDS cases into the future. Basu et al (1998) attempted to model the spread of AIDS in a comprehensive manner with limited data 6. Sreenivasa Rao (2003) examined the applicability of various models to predict AIDS in India, from classic simple epidemic models to more complex heterosexual transmission models and the back-calculation method. Until the last decade, conventional statistical methods were adopted to understand the trend and prevalence of HIV/AIDS in almost all countries. The objective of this study is to extract as much information as possible from the available data and to find the future trend of HIV cases.

Materials and Methods
A retrospective study was carried out on HIV data collected from the Health Ministry records of Nepal for 1988 to 2004. The major mode of transmission of HIV in the country is heterosexual. The numbers were compiled on the basis of reported voluntary testing and sentinel surveillance. The data were analysed using Excel 2003, R 2.8.0, Statistical Package for the Social Sciences (SPSS) for Windows Version 16.0 (SPSS Inc; Chicago, IL, USA) and EPI Info 3.5.1 for Windows. A p-value of < 0.05 (two-tailed) was used to establish statistical significance.
The annual reported numbers of HIV patients were plotted on the y-axis against the corresponding year on the x-axis. Curve fitting, also known as regression analysis, was used to find the "best fit" line or curve for the series of data points. Linear, logarithmic, inverse, quadratic and cubic models were fitted to the data. The F-test was used to select the best-fitting curve. An $R^2$ value > 0.80 was taken as indicating a good fit for prediction 10. The choice of a suitable prediction approach is governed by the relative performance of the models for monitoring and prediction; the model should also adequately interpret the phenomenon under study. The cubic model selected here closely fitted the curves for estimated and reported HIV cases (Fig 1). While building a model, the extremes (maxima and minima) play a great role. If the points are more scattered, the curve tries to adjust to the maximum number of observed points.
The cubic model is a third-degree polynomial, represented by the equation $Y = m_0 + m_1 X + m_2 X^2 + m_3 X^3$, where $m_0$ is the constant term and $m_1$, $m_2$, $m_3$ are the coefficients 11,12. Without the constant term, the equation of this model is $Y = m_1 X + m_2 X^2 + m_3 X^3$, where Y is the number of reported HIV cases annually and X is the corresponding year; 1=1988, 2=1989, 3=1990, 4=1991 and so on.
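The fitting step for the cubic model without a constant term can be sketched in a few lines of Python. This is only an illustration of ordinary least squares through the origin, not the SPSS or R procedure used in the study. The function names `fit_through_origin_cubic` and `solve_linear_system` are hypothetical, and the input series is synthetic: it is generated from the total-cases coefficients reported in the Results, not taken from the Health Ministry records.

```python
def solve_linear_system(a, b):
    """Solve a small square system a * m = b by Gaussian elimination
    with partial pivoting (hypothetical helper for this sketch)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest pivot into place
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - s) / aug[r][r]
    return sol


def fit_through_origin_cubic(xs, ys):
    """Least-squares estimates (m1, m2, m3) for Y = m1*X + m2*X^2 + m3*X^3,
    i.e. the cubic model with the constant term excluded."""
    rows = [[x, x ** 2, x ** 3] for x in xs]  # design matrix, no intercept column
    # Normal equations: (A^T A) m = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear_system(ata, aty)


# Year coding used in the paper: X = 1 for 1988, ..., X = 17 for 2004
xs = [year - 1987 for year in range(1988, 2005)]

# SYNTHETIC, noise-free series built from the reported total-cases equation;
# these are not the observed counts.
ys = [45.879 * x - 8.784 * x ** 2 + 0.571 * x ** 3 for x in xs]

m1, m2, m3 = fit_through_origin_cubic(xs, ys)
print(round(m1, 3), round(m2, 3), round(m3, 3))
```

With real, noisy counts the recovered coefficients would not reproduce any generating equation exactly. The sketch is only meant to show what "excluding the constant term" does to the design matrix.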

Results
The data were modelled using the curve fitting method.
Tables 1, 3 and 5 and Graph 1 show the model summary and the parameter estimates, including and excluding the constant term, for the different models. When the constant term was included, the p-values were > 0.05 for all the models and none of the models fitted well. After excluding the constant term from the equation, the cubic model was the best fit for forecasting HIV cases. The cubic model equations (1), (2) and (3) contain X and Y, which are the corresponding year and the frequency of reported HIV cases respectively; $m_1$, $m_2$ and $m_3$ were calculated from the observed data. The cubic model for the reported number of male HIV cases is equation (1), of the form $Y = m_1 X + m_2 X^2 + m_3 X^3$ with the parameter estimates for males, where Y is the number of reported male HIV cases annually and X is the corresponding year; 1=1988, 2=1989, 3=1990, 4=1991 and so on. Using equation (1), the reported numbers of male HIV cases were estimated.
The cubic model for the reported number of female HIV cases is equation (2), of the form $Y = m_1 X + m_2 X^2 + m_3 X^3$ with the parameter estimates for females, where Y is the number of reported female HIV cases annually and X is the corresponding year; 1=1988, 2=1989, 3=1990, 4=1991 and so on. Using equation (2), the reported numbers of female HIV cases were estimated. Table 4 shows the reported number of HIV cases up to the year 2004 and the estimated number of HIV cases up to the year 2015.
The equation for the cubic model for the reported number of total HIV cases is $Y = 45.879X - 8.784X^2 + 0.571X^3$ (3), where Y is the number of reported total HIV cases annually and X is the corresponding year; 1=1988, 2=1989, 3=1990, 4=1991 and so on. Using equation (3), the reported numbers of total HIV cases were estimated. Table 6 shows the reported number of HIV cases up to the year 2004 and the estimated number of HIV cases up to the year 2015.
Graph 1: Fitted curves for reported HIV cases (X-axis shows years; 1=1988, 2=1989, 3=1990, 4=1991 and so on; Y-axis shows number of reported HIV cases)
Graph 2: Year-wise estimates for reported HIV cases
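As a worked illustration of how the total-cases equation turns a calendar year into an annual estimate, the following minimal Python sketch applies the year coding (1=1988) and evaluates the cubic. The names `BASE_YEAR` and `estimated_total_cases` are hypothetical. Because the coefficients are the rounded values printed in the text, the results may differ slightly from the entries in Table 6.

```python
BASE_YEAR = 1987  # chosen so that X = 1 corresponds to 1988, as in the paper


def estimated_total_cases(year):
    """Annual reported total HIV cases from the cubic model without a
    constant term, using the rounded coefficients printed in the text."""
    x = year - BASE_YEAR
    return 45.879 * x - 8.784 * x ** 2 + 0.571 * x ** 3


# Last observed year and two forecast years within the 1988-2015 range
for year in (2004, 2010, 2015):
    print(year, round(estimated_total_cases(year)))
```

The cubic term dominates for large X, which is why the projected annual numbers keep rising over the forecast horizon.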

Discussion
Modelling and Extrapolation: A plot is a graphical representation of the collected data (independent and dependent variables) involved in a study. The association between these variables is then assessed by connecting the 'points' with a line. Although it describes the observed data faithfully, this association cannot be relied upon to predict the future trend of the data. A 'model' which 'fits best' to the observed data therefore has to be worked out. This is then 'fitted' and used to replace the existing set of data points as 'the appropriate model'. After 'modelling' the observed data, the model can be used to predict the future trend of the dependent variable for a given change in the independent variable. The foregoing statement implies several requirements for confident extrapolation from the model: the model selected must be the most appropriate for the collected data, and a usable and understandable curve-fitting method must be available, from which the model features that reflect future behaviour can be obtained 13,14.
Timely and accurate monitoring of the HIV epidemic requires measures of incidence, that is, the number of new infections in a defined population that occur during a defined time period. Unfortunately, longitudinal studies that have traditionally provided incidence measures are costly, time consuming and logistically complex, and may be subject to bias from differential loss to follow-up or an intervention effect [15][16][17]. As a result, public health agencies have generally relied on surveys that measure HIV prevalence, the proportion of persons infected at a specified point in time, to track the epidemic 18.
Using the curve fitting method, we estimated the number and trend of reported HIV cases in Nepal from 1988 to 2015. The cubic model provided closely fitted curves for estimated and reported HIV cases (Graph 1). While building a model, the extremes (maxima and minima) play a great role. If the points are more scattered, the curve tries to adjust to the maximum number of observed points. Therefore, it might inevitably give over- and under-estimates, but that is not the case in all situations. A sudden annual decrease or increase in the trend is possible, as the curve, because of its shape, cannot connect these data points exactly. To accommodate the over- and under-estimation, the model gave wide confidence intervals for some years (Table 6). In our study, the future annual reported HIV cases (Table 6) show an increasing trend. Such an increase is plausible, as HIV incidence in developing countries is expected to rise, principally due to the possible decline of mortality from infectious diseases and population growth. Our study hereby establishes the applicability of statistical modelling in predicting the reported number of HIV cases in the Nepalese context.

Conclusion
It is well known that HIV/AIDS is a fatal epidemic for which effective medicines have not yet been discovered; hence active precautionary measures have to be taken, with intensive care, against the spread of the epidemic. To determine the current levels and trends in this epidemic, the best possible information is essential. Correct information on incidence and prevalence is not currently obtainable because of the stigma attached to the disease. Statistical modelling approaches make a valuable contribution to a better understanding of the levels and trends in the HIV epidemic, given the limited information on which estimates are based. Estimates of the total number of prevalent HIV infections attributable to the major routes of infection make an important contribution to public health policy. They can be used for the planning of healthcare services and for contributing to estimates of the future numbers with severe HIV infection used for further planning of the programmes 19.

Nepal Journal of Epidemiology 2011;1(3): 106-110. Copyright © 2011 INEA. Published online by NepJOL-INASP, www.nepjol.info/index.php/NJE

Table 6: Observed and estimated total HIV cases
As of 2007, national estimates indicate that approximately 70,000 adults and children are infected with HIV in Nepal, with an estimated prevalence of about 0.49% in the adult population. As of June 2007, a total of 9756 cases of HIV, 1454 AIDS cases and 423 AIDS deaths had been reported to the National Centre for AIDS and STD Control (NCASC). Our model gives a cumulative number of 9614 reported HIV cases up to 2007, just 142 fewer than the number reported by NCASC. Likewise, the projected cumulative number up to 2009 is 14833, which differs from the actual reported cumulative number of HIV cases by 46.
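The cumulative comparison above can be approximated by summing the annual estimates from the total-cases cubic model over the coded years. The sketch below is an assumption-laden check, not the authors' calculation: it sums fitted values only (the paper does not state whether observed counts were used for 1988 to 2004), it uses the rounded coefficients printed in the text, and the names `annual_estimate`, `cumulative_estimate` and `reported_cumulative` are hypothetical.

```python
def annual_estimate(x):
    """Cubic model without a constant term for total reported HIV cases
    (rounded coefficients as printed in the text); x = 1 means 1988."""
    return 45.879 * x - 8.784 * x ** 2 + 0.571 * x ** 3


def cumulative_estimate(last_year, first_year=1988):
    """Sum of fitted annual estimates from first_year through last_year."""
    return sum(annual_estimate(year - 1987)
               for year in range(first_year, last_year + 1))


# Cumulative reported counts quoted in the text (NCASC, June 2007; 2009 total)
reported_cumulative = {2007: 9756, 2009: 14787}

for year, reported in reported_cumulative.items():
    modelled = cumulative_estimate(year)
    print(year, round(modelled), reported, round(modelled - reported))
```

Summing the rounded-coefficient curve gives totals of roughly 9,606 and 14,821, close to but not identical with the 9614 and 14833 quoted above. Differences of this size are consistent with coefficient rounding alone, since the cubic term is multiplied by a sum of cubes in the tens of thousands.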