Influenza, or flu, is a viral infection that is readily transmitted through the air and causes respiratory problems in humans and other animals. It occurs seasonally and results in an average of 30,000 deaths in the U.S. annually, primarily among the young and old. There have been some severe epidemics, such as the pandemic of 1918, which resulted in over 30 million deaths around the world.

There are two primary types of influenza, A and B, which appear seasonally, usually starting in the fall and continuing through the winter months. These types mutate, creating new strains each year. Flu is an RNA virus with 11 genes that mutate regularly, making it difficult to create general vaccines. Each year scientists anticipate which mutant strain is most likely to dominate and spread in order to create a vaccine to protect the population. In addition to the many deaths, this disease results in huge economic losses from lost work and the treatment of fragile patients.

The disease follows a classic mathematical model known as an SIR model. The population is divided into susceptible, infected, and recovered individuals. Once a person has had a particular strain of one of the types of the influenza virus, that individual develops immunity to that specific strain, preventing reinfection. (This is why flu viruses need to mutate to keep finding new hosts.) Since flu acts over a very short period of time, we simplify our model by assuming that the population remains constant with a size of N. This means that it is sufficient to keep track of only two classes of individuals, the susceptibles, Sn, and the infected, In, as the recovered satisfy

Rn = N - Sn - In.


Since we are ignoring births and deaths in the population, we only consider the spread of the disease, which is based on successful contacts (such as a susceptible host inhaling infected aerosol particles from the sneeze of an infected individual). The discrete mathematical model is given by the system of equations:

Sn+1 = Sn - (b/N)SnIn,
In+1 = (1 - g)In + (b/N)SnIn,

where bSn/N represents the proportion of contacts by an infected individual that result in the infection of a susceptible individual. The parameter g is the probability that an infected person recovers (enters class R of the SIR model) during one time step. The ratio 1/g is the average length of the infectious period of the disease.
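For readers who want to check the recursion outside a spreadsheet, the two update equations can be sketched in Python. This is only a minimal sketch: the function name simulate_sir and the numbers in the example call (N = 1000, b = 0.5, g = 0.25) are illustrative assumptions, not values from the lab.

```python
def simulate_sir(b, g, N, S0, I0, steps):
    """Iterate the discrete SIR model; return lists S and I of length steps + 1."""
    S, I = [float(S0)], [float(I0)]
    for _ in range(steps):
        # New infections this step: (b/N) * S_n * I_n
        new_infections = (b / N) * S[-1] * I[-1]
        S.append(S[-1] - new_infections)
        # I[-1] is still I_n here, because only S has been extended so far.
        I.append((1.0 - g) * I[-1] + new_infections)
    return S, I


# Placeholder values chosen only to make the arithmetic easy to follow.
N = 1000
S, I = simulate_sir(b=0.5, g=0.25, N=N, S0=990, I0=10, steps=5)
R = [N - s - i for s, i in zip(S, I)]  # recovered class from R_n = N - S_n - I_n
print(S[1], I[1], R[1])
```

The recovered class is never stored during the iteration; it is recovered afterwards from the constant-population identity, which mirrors the simplification made in the text.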

Below we provide data from the Centers for Disease Control and Prevention (CDC) for one season of flu. The table gives the number of cases week by week for a control set of individuals. In particular, we consider the data for a particular strain of influenza A for the 2004-2005 flu season. Assume that the control population consists of N = 157,759 individuals. The first week, n = 0, corresponds to the last week in September, which is near the start of the flu season in the U.S.

n (wk)   In      n (wk)   In      n (wk)   In
0        3       17       1096    34       2
1        2       18       1354    35       0
2        7       19       1335    36       2
3        12      20       1109    37       1
4        9       21       936     38       6
5        10      22       627     39       0
6        27      23       476     40       0
7        21      24       295     41       1
8        36      25       164     42       0
9        63      26       94      43       0
10       108     27       37      44       0
11       255     28       26      45       1
12       472     29       15      46       0
13       675     30       8       47       3
14       580     31       5       48       0
15       844     32       3
16       974     33       1



a. We want to simulate the SIR model above, finding the best parameters, b and g, that match the data from the CDC. Insert the week number, n, and the data on infected individuals in the first two columns of an Excel spreadsheet. Use Columns C and D for the simulation of the discrete model, Sn and In. For initial values, take I0 = 3 and S0 = N - I0 = 157,756. Define your parameters, b and g, and take your initial guesses for these parameters to be b = 3.2 and g = 2.7. In Column E, compute the squared error between the number of infected individuals in the data and the number found by the model. Use Excel's Solver to find the best fitting parameters b and g that minimize the sum of squared errors of the infected individuals. Also, give the least sum of squared errors.
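The lab asks for Excel's Solver, but the same least-squares objective can be sketched in Python as a cross-check. The compass (pattern) search below is a crude stand-in for Solver, chosen only because it needs no external libraries; it may stop at a different point than Solver does, so treat its output as a sanity check rather than the expected answer. The names model_infected, sse, and compass_search are hypothetical; the data are the weekly counts from the table above.

```python
# Weekly infected counts for n = 0..48, copied from the CDC table above.
CDC_INFECTED = [
    3, 2, 7, 12, 9, 10, 27, 21, 36, 63, 108, 255, 472, 675, 580, 844, 974,
    1096, 1354, 1335, 1109, 936, 627, 476, 295, 164, 94, 37, 26, 15, 8, 5, 3, 1,
    2, 0, 2, 1, 6, 0, 0, 1, 0, 0, 0, 1, 0, 3, 0,
]
N = 157_759


def model_infected(b, g, weeks):
    """Return the model values I_0..I_weeks with I_0 = 3 and S_0 = N - I_0."""
    S, I = float(N - CDC_INFECTED[0]), float(CDC_INFECTED[0])
    out = [I]
    for _ in range(weeks):
        new_infections = (b / N) * S * I
        S, I = S - new_infections, (1.0 - g) * I + new_infections
        out.append(I)
    return out


def sse(b, g):
    """Sum of squared errors between model and data (the Column E total)."""
    model = model_infected(b, g, len(CDC_INFECTED) - 1)
    # Plain multiplication avoids OverflowError if a bad guess blows up.
    return sum((m - d) * (m - d) for m, d in zip(model, CDC_INFECTED))


def compass_search(b, g, step=0.1, tol=1e-6, max_iter=10_000):
    """Try +/- step in b and g; keep improvements, otherwise halve the step."""
    best = sse(b, g)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for db, dg in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            trial = sse(b + db, g + dg)
            if trial < best:
                b, g, best = b + db, g + dg, trial
                improved = True
        if not improved:
            step /= 2.0
    return b, g, best


b_fit, g_fit, best_sse = compass_search(3.2, 2.7)  # initial guesses from the lab
print(b_fit, g_fit, best_sse)
```

The search starts from the lab's initial guesses and only ever accepts a move that lowers the sum of squared errors, so the returned value is never worse than the starting one.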

Give the model prediction for the number of people infected with influenza at n = 15 and n = 25, and find the percent error at each of these times relative to the actual CDC data given. As noted above, the average length of the infectious period is equal to 1/g. Find this period in units of days. (Recall that n is in weeks.)
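The two small calculations in this step can be sketched as follows. The helper names are hypothetical, the model value of 800 is an invented placeholder (not a fitted result), and the percent error is assumed to be measured relative to the data value; 844 is the CDC count at n = 15 from the table.

```python
def percent_error(model_value, data_value):
    """Signed percent error of the model relative to the data."""
    return 100.0 * (model_value - data_value) / data_value


def infectious_period_days(g, days_per_step=7):
    """Average infectious period 1/g, converted from weekly steps to days."""
    return days_per_step / g


model_I15 = 800.0  # hypothetical model output, for illustration only
data_I15 = 844.0   # CDC value at n = 15 from the table
print(round(percent_error(model_I15, data_I15), 2))

# With the initial guess g = 2.7 (not the fitted value), 1/g weeks in days:
print(round(infectious_period_days(2.7), 2))
```

A negative percent error means the model underestimates the data at that week.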

Epidemiologists often examine what is called the basic reproduction ratio given by 

R0 = b/g,

which provides a measure of how rapidly a disease will spread and how much of the population will be affected by a particular disease. Use your values of b and g to find R0.
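One way to see why this ratio measures spread is to rearrange the equation for the infected class. The short derivation below uses only the model equations already given and assumes g > 0 and In > 0.

```latex
\begin{align*}
I_{n+1} - I_n &= \frac{b}{N} S_n I_n - g I_n \\
              &= g\, I_n \left( \frac{b}{g} \cdot \frac{S_n}{N} - 1 \right)
               = g\, I_n \left( R_0 \, \frac{S_n}{N} - 1 \right).
\end{align*}
```

The number of infected individuals therefore increases from one week to the next exactly when R0 Sn/N > 1. At the start of the season S0 is nearly N, so the outbreak grows when R0 > 1, and the infected class begins to shrink once Sn/N falls below 1/R0.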

To determine the impact of a particular flu season, we want to know the total number of individuals who were infected by the influenza virus. Since we are assuming a constant population N, and because the number of infected individuals is essentially zero at the end of the simulation, we estimate the total number of cases of flu by computing the number in the recovered class, so

Rn = N - Sn,

for n large. Estimate the number of Influenza A cases for the 2004 - 2005 flu season. What percent of the original population ultimately got this strain of influenza? 
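The estimate for n large can be sketched by iterating until the infected class is negligible and then reading off N - Sn. The stopping threshold of 0.5 individuals and the name total_cases are assumptions made for this sketch, and b = 3.2, g = 2.7 are the lab's initial guesses standing in for the fitted parameters.

```python
def total_cases(b, g, N, I0, threshold=0.5, max_weeks=10_000):
    """Run the model until I_n < threshold; return (N - S_n, weeks simulated)."""
    S, I = float(N - I0), float(I0)
    weeks = 0
    while I >= threshold and weeks < max_weeks:
        new_infections = (b / N) * S * I
        S, I = S - new_infections, (1.0 - g) * I + new_infections
        weeks += 1
    # With I_n essentially zero, R_n = N - S_n - I_n is approximately N - S_n.
    return N - S, weeks


N = 157_759
cases, weeks = total_cases(b=3.2, g=2.7, N=N, I0=3)  # placeholder parameters
percent = 100.0 * cases / N
print(f"about {cases:.0f} cases ({percent:.1f}% of N) after {weeks} weeks")
```

Note that N - Sn also counts the I0 individuals who were already infected at n = 0.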

b. In your Lab report, create one graph with the data and the model of the individuals infected with influenza. Create another graph showing the number of susceptible individuals for this particular strain of influenza over the 48 weeks. Write a brief paragraph discussing how well the model fits the data. Also, discuss whether the predicted value of g gives a reasonable estimate of the infectious period for the flu.

c. The CDC is interested in minimizing the impact of flu on the population, so it uses a number of different controls. We will examine three different controls that are employed to fight outbreaks of the flu.

The first line of defense, of which you are undoubtedly aware, is the annual flu vaccine. The effect of a vaccine is to lower the population of susceptible individuals, which in terms of our model is simply to lower S0 by moving a number of individuals to R0. (Here R0 denotes the initial recovered population, not the basic reproduction ratio from Part a.) In fact, many individuals already have immunity to a given strain of the flu because of earlier contact with a related strain. Suppose we vaccinate 5% of the total population at the very beginning of the flu season. This immediately shifts some of the population to the recovered individuals in the model. We calculate the initial susceptible and recovered populations by

S0 = 0.95N - I0,
R0 = 0.05N.

Assuming no other changes for this epidemic, the values of g and b remain the same as above. Simulate the SIR model with this new initial condition. Find the number of infecteds and susceptibles at n = 20 and n = 30. Also, simulate the model sufficiently long that there are essentially no new infecteds, and determine the total number of people who would have suffered from the flu if 5% of the population is vaccinated. This is given by the number of individuals in the recovered class (subtracting out the ones protected by the vaccine). Since the flu epidemic has faded out (I = 0), Rn ~ N - Sn for n large. An estimate of the number of people getting the flu is given by Rn - R0. What percent of the population becomes infected?
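The vaccination scenario changes only the initial condition, which the following sketch makes explicit. The function name simulate_vaccinated, the 200-week horizon, and the parameter values b = 3.2 and g = 2.7 (the lab's initial guesses, standing in for the fitted values from Part a) are all assumptions for illustration.

```python
def simulate_vaccinated(b, g, N, I0, vaccinated_fraction, weeks):
    """Discrete SIR model with a fraction of N moved to the recovered class at n = 0."""
    R_init = vaccinated_fraction * N       # R_0 = 0.05 N in the lab
    S, I = [N - R_init - I0], [float(I0)]  # S_0 = 0.95 N - I_0
    for _ in range(weeks):
        new_infections = (b / N) * S[-1] * I[-1]
        S.append(S[-1] - new_infections)
        I.append((1.0 - g) * I[-1] + new_infections)
    return S, I, R_init


N = 157_759
b, g = 3.2, 2.7  # placeholders; substitute the best-fit values from Part a
S, I, R_init = simulate_vaccinated(b, g, N, I0=3, vaccinated_fraction=0.05, weeks=200)

for n in (20, 30):
    print(f"n={n}: S={S[n]:.0f}, I={I[n]:.0f}")

# People who actually had the flu: R_n - R_0, with R_n = N - S_n - I_n.
flu_cases = (N - S[-1] - I[-1]) - R_init
print(f"flu cases: {flu_cases:.0f} ({100.0 * flu_cases / N:.1f}% of N)")
```

Subtracting R_init at the end keeps the vaccinated individuals from being counted as flu cases.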

d. A second method of control is the quarantine of infected individuals or the education of the public on how to lessen the contact between infected and susceptible individuals. This lowers the value of b. Suppose we lower the value of b by 5%, which is equivalent to multiplying the value of b found in Part a by 0.95. Use the original initial conditions, the best fitting value of g, and this new b value to simulate the SIR model. Find the number of infecteds and susceptibles at n = 20 and n = 30. Also, simulate the model sufficiently long that there are essentially no new infecteds, and determine the total number of people who would have suffered from the flu if this control was used. (Recall that since the flu epidemic has faded out (I = 0), the people who got the flu are the people in the recovered class. Thus, Rn ~ N - Sn for n large.) What percent of the population becomes infected?

e. Oseltamivir or Tamiflu is a drug that shortens the duration of flu symptoms for many people. If we assume that this drug shortens the period of infectivity of the infected individuals, then this can be modeled by increasing g. Suppose we increase the value of g by 5%, which is equivalent to multiplying the value of g found in Part a by 1.05. Use the original initial conditions and best fitting value of b to simulate the SIR model. Find the number of infecteds and susceptibles at n = 20 and n = 30. Also, simulate the model sufficiently long that there are essentially no new infected individuals, and determine the total number of people who would have suffered from the flu if this drug was used. What percent of the population becomes infected?
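Parts d and e differ from the baseline only in one parameter each, so both can be sketched with a single helper and a small table of scenarios. The name run_sir, the scenario labels, and the values b_fit = 3.2 and g_fit = 2.7 (the lab's initial guesses, used here in place of the fitted values from Part a) are assumptions for illustration.

```python
def run_sir(b, g, N=157_759, I0=3, weeks=48):
    """Return a list of (S_n, I_n) pairs for n = 0..weeks."""
    S, I = float(N - I0), float(I0)
    rows = [(S, I)]
    for _ in range(weeks):
        new_infections = (b / N) * S * I
        S, I = S - new_infections, (1.0 - g) * I + new_infections
        rows.append((S, I))
    return rows


b_fit, g_fit = 3.2, 2.7  # placeholders; substitute the best-fit values from Part a

scenarios = {
    "baseline": (b_fit, g_fit),
    "reduced contact (0.95 b)": (0.95 * b_fit, g_fit),  # Part d
    "antiviral (1.05 g)": (b_fit, 1.05 * g_fit),        # Part e
}

for name, (b, g) in scenarios.items():
    rows = run_sir(b, g)
    for n in (20, 30):
        S_n, I_n = rows[n]
        print(f"{name}: n={n}, S={S_n:.0f}, I={I_n:.0f}")
```

Keeping the scenarios in one dictionary makes it straightforward to plot all of the infected curves on a single graph later.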

f. In your Lab report, reproduce the graph with the original data and the best fitting SIR model. Add graphs of the infected individuals from the SIR model using the three means of controlling the disease discussed above. Be sure to label each type of control on the graph. Describe what you observe in your graphs both quantitatively and qualitatively. Compare and contrast the different approaches to controlling a flu outbreak. Give advantages and disadvantages of each of the controls. Include a discussion of the practicality and financial burden of each of these approaches. Give a couple of strengths and weaknesses of this SIR model. Select another disease that satisfies the SIR model and discuss how this lab applies to treatment of the disease you are considering. How does this study apply to the elimination of other serious diseases?


[1] CDC Flu website - www.cdc.gov/flu/ (last visited Sept. 2009). 
[2] Wikipedia - en.wikipedia.org/wiki/Influenza (last visited Sept. 2009).