Research Critique

Disease burden from viral Hepatitis A, B and C in the United States.
Hepatitis is among the most infectious diseases of recent times. To support any decision related to Hepatitis, statistical data are collected, organized and analyzed so that the decision rests on evidence. Such data are collected through surveys, interviews, questionnaires and surveillance tools available at medical centers. The data annexed below, for example, were compiled by the Centers for Disease Control and Prevention from the third National Health and Nutrition Examination Survey and from reports to the National Notifiable Diseases Surveillance System (Control, 2012). The unit of observation is one year, and the data cover 31 years. The variable is the number of infections in a particular year.
Looking at the figures provided, it is easy to conclude that there has been a steady decrease in the estimates of Hepatitis infections since 1998. However, our decisions and conclusions should be backed by statistical measures. Suppose we wish to show that there was a decrease in infections between 2001 and 2010. The best approach is to formulate a hypothesis and test whether the data allow us to reject it. The null hypothesis (H0) for this data is that there was no decrease in average infections of Hepatitis A between 2001 and 2010. The alternative hypothesis (H1) is that there was a decrease in average infections of Hepatitis A between 2001 and 2010. This gives us a sample of 10 observations (years) out of 31, assuming that the 31 years represent the population in question. We can then compare statistical measures between the sample and the population to strengthen our decisions. Such statistical work is inferential.
The mean is a measure of central tendency. For this data, it indicates the average number of infections and of reported acute cases per year over the 31-year period.
From the table, the population mean = 18852 for reported acute cases and 198000 for infections. The median is also a measure of central tendency and for infections, the median is 228000.
Measures of dispersion describe how widely the values in the distribution spread around the mean. The variance and the standard deviation are common measures of dispersion. The population standard deviation in infections is 120695. The simplest measure of variability is the range. For the infections, the lowest value is 17000 and the highest is 380000, giving a range of 363000.
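The measures of central tendency and dispersion described above can be computed with a short Python sketch. The function name `describe` and the yearly counts in `example` are hypothetical and are not the CDC figures; only the range check at the end uses numbers quoted in the text.

```python
import statistics


def describe(values):
    """Return the mean, median, population standard deviation and range."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # pstdev treats the values as the whole population, not a sample
        "pstdev": statistics.pstdev(values),
        "range": max(values) - min(values),
    }


# Hypothetical yearly counts for illustration only (NOT the CDC data)
example = [10, 20, 30, 40, 50]
summary = describe(example)
print(summary)

# Range of infections from the lowest and highest values quoted in the text
infections_range = 380000 - 17000
print(infections_range)
```

Applying `describe` to the 31 yearly infection counts from the annexed table should give the population figures quoted in this section.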
Working with the sample of 10 years between 2001 and 2010, the sample mean for reported acute cases is 5004 while that for infections is 44200. Assuming that reported acute cases and infections are normally distributed, the sample values can be tested at a chosen confidence level, such as 95%, to judge whether the sample mean differs from the population mean by more than chance would explain.
For infections of Hepatitis A, the Z score for the sample mean = (198000 - 44200) / (120695/√10) = 4.03. At the 95% level, the critical Z value is 1.96 for a two-tailed test (1.645 for a one-tailed test), which is well below the Z score for the sample mean. Therefore, we can confidently say that the mean for infections dropped significantly at the 95% level (and even at the 99% level of confidence) between 2001 and 2010 as compared to the period between 1980 and 2010.
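The Z score above can be checked with a minimal Python sketch that uses only the figures quoted in the text. The function names `z_statistic` and `one_sided_p_value` are illustrative, and the p-value calculation assumes, as the text does, that the data are normally distributed.

```python
import math


def z_statistic(pop_mean, sample_mean, pop_sd, n):
    """Z score of a sample mean, written as (population - sample) as in the text."""
    standard_error = pop_sd / math.sqrt(n)
    return (pop_mean - sample_mean) / standard_error


def one_sided_p_value(z):
    """Upper-tail probability P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Figures quoted in the text for Hepatitis A infections
z = z_statistic(pop_mean=198000, sample_mean=44200, pop_sd=120695, n=10)
p = one_sided_p_value(z)

print(round(z, 2))  # 4.03
print(z > 1.96)     # True: exceeds the 95% two-tailed critical value
print(p < 0.01)     # True: also significant at the 99% level
```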
With this information, we can now reject the null hypothesis and accept the alternative hypothesis. We can confidently state that Hepatitis A infections dropped significantly by the year 2010. Similarly, and following the same working, we can confidently state that reported acute cases dropped significantly during the same period. However, it should be noted that larger samples are preferred because they give more accurate statistical measures on which to base decisions about the whole population.
The following bar chart also supports this decision.
Working out a correlation between the two sets of data aims at showing that the number of reported acute cases of Hepatitis is related to the total number of new infections. A positive correlation coefficient indicates that the two rise and fall together, and a value close to 1 indicates that they are closely related. A scatter diagram also shows the relationship between such variables, which can further be summarized by a regression line. A scatter diagram for the data between reported cases and new infections is given below.
The line drawn is the regression line, which indicates that a linear relationship exists between the two variables.
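The correlation coefficient and regression line described above can be sketched in Python as follows. The function names and the paired values in `reported` and `infections` are hypothetical and chosen only to illustrate the calculation; they are not the CDC figures.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares regression line of y on x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired yearly values (NOT the CDC data)
reported = [1, 2, 3, 4, 5]
infections = [2, 4, 6, 8, 10]

r = pearson_r(reported, infections)
slope, intercept = least_squares_line(reported, infections)
print(r, slope, intercept)
```

With the real table, `reported` would hold the yearly reported acute cases and `infections` the yearly estimated new infections; a value of `r` close to 1 would support the linear relationship shown by the regression line.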
With such analysis, we can draw conclusions that are supported by the statistical measures. For example, the drop in reported cases and new infections of Hepatitis A is evident, and it can be attributed to improved medical practices that manage the disease and to new drugs that are more effective against Hepatitis. A drop in new infections is also accompanied by a drop in reported cases and vice versa. Other areas that may need further research can be suggested when drawing conclusions from the work.
Control, C. f. (2012, June 22). CDC DVH. Retrieved March 24, 2013, from Centers for Disease Control and Prevention website:
Owen, F., & Jones, R. (1982). Statistics. London: Butler & Tanner Ltd.