AY 2025–26
Instructor: Debasis Sengupta
Office / Department: ASU
Email: sdebasis@isical.ac.in
Marking Scheme:
Assignments: 20% | Midterm Test: 30% | End Semester: 50%
Likelihood-based estimation for the Infant Mortality Rate (IMR), modeled with Poisson distributions across 3 decades. We work through it line by line, leaning on intuition and visual thinking.
Intuition: think of \(\alpha\) as the starting height and \(\gamma\) as the common ratio of a geometric progression driving the decade-to-decade change.
This is a two-parameter Poisson trend model where \(\alpha\) sets the initial level and \(\gamma\) controls geometric change across decades. The log-likelihood is concave in \(\alpha\) and has well-behaved mixed partials; the score equations cleanly separate expected vs. observed contributions, making interpretation straightforward.
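To make the "starting height / common ratio" picture concrete, one natural parameterization (an assumption on my part; the notes above do not spell out the exact form) is a Poisson count per decade with a geometrically evolving mean:

```latex
% One hedged parameterization of the two-parameter Poisson trend model:
% the decade-j mean starts at \alpha and scales by \gamma each decade.
\[
  Y_j \sim \mathrm{Pois}(\mu_j), \qquad
  \mu_j = \alpha\,\gamma^{\,j-1}, \quad j = 1,2,3,
\]
\[
  \log\mu_j = \log\alpha + (j-1)\log\gamma .
\]
% On the log scale the trend is linear in the decade index,
% which is exactly what makes \gamma a "common ratio" across decades.
```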
Ignoring constants, the complete-data log-likelihood is:
\[ L(\beta,\mathbf T;\mathbf x,\mathbf y) =\sum_{i=1}^n \Big[-(\beta+1)T_i + y_i(\log\beta+\log T_i) + x_i\log T_i\Big]. \]
The complete-data sufficient statistics are:
\[ S_X=\sum_i x_i,\quad S_Y=\sum_i y_i,\quad \text{and the per-unit totals } x_i+y_i\ \text{(each is the coefficient of }\log T_i\text{)}. \]
When nothing is missing, the MLEs satisfy:
\[ \hat{\beta}=\frac{\sum_i y_i}{\sum_i x_i},\qquad \hat T_i=\frac{x_i+y_i}{\hat\beta+1}. \]
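These MLEs drop out of the score equations; a quick sketch of the step the notes skip:

```latex
% Score equations from the complete-data log-likelihood:
\[
  \frac{\partial L}{\partial T_i} = -(\beta+1) + \frac{x_i+y_i}{T_i} = 0
  \;\Rightarrow\; \hat T_i = \frac{x_i+y_i}{\hat\beta+1},
\]
\[
  \frac{\partial L}{\partial \beta} = -\sum_i T_i + \frac{\sum_i y_i}{\beta} = 0
  \;\Rightarrow\; \hat\beta = \frac{\sum_i y_i}{\sum_i \hat T_i}.
\]
% Substituting \hat T_i into the second equation:
\[
  \hat\beta\,\frac{S_X+S_Y}{\hat\beta+1} = S_Y
  \;\Rightarrow\; \hat\beta\,S_X = S_Y
  \;\Rightarrow\; \hat\beta = \frac{S_Y}{S_X}.
\]
```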
Construct:
\[ Q(\beta,\mathbf T\mid \beta^{(n)},\mathbf T^{(n)}) = \mathbb E\!\left[L(\beta,\mathbf T;\mathbf X,\mathbf Y)\,\middle|\,\text{observed},\,\beta^{(n)},\mathbf T^{(n)}\right]. \]
Only terms involving the missing \(X_s\) need an expectation. Since:
\[ X_s \mid \beta^{(n)},\mathbf T^{(n)} \sim \text{Pois}\big(T_s^{(n)}\big),\quad \mathbb E[X_s\mid\cdot]=T_s^{(n)}, \]
and \(\log T_s\) is not random, we have:
\[ \mathbb E[X_s \log T_s \mid \cdot]=T_s^{(n)}\log T_s. \]
So, in \(Q\) replace missing \(x_s\) by \(T_s^{(n)}\). Concretely, use:
\[ \tilde S_X^{(n)}=\sum_{i\neq s} x_i + T_s^{(n)}. \]
Maximize \(Q\) as if \(x_s\) were replaced. Updates:
\[ \beta^{(n+1)}=\frac{\sum_i y_i}{\tilde S_X^{(n)}}=\frac{\sum_i y_i}{\sum_{i\neq s} x_i + T_s^{(n)}}. \]
\[ T_i^{(n+1)}=\frac{x_i+y_i}{\beta^{(n+1)}+1}\quad(i\neq s),\qquad T_s^{(n+1)}=\frac{T_s^{(n)}+y_s}{\beta^{(n+1)}+1}. \]
Repeat E→M until convergence.
Input: observed {x_i (i≠s), y_i (all i)}; choose initial β^(0) > 0 and T_i^(0) > 0
repeat
    # E-step
    Sx_tilde ← (∑_{i≠s} x_i) + T_s^(n)
    # M-step
    β^(n+1) ← (∑_i y_i) / Sx_tilde
    for i ≠ s: T_i^(n+1) ← (x_i + y_i) / (β^(n+1) + 1)
    for i = s: T_s^(n+1) ← (T_s^(n) + y_s) / (β^(n+1) + 1)
until convergence (e.g., relative change in β and all T_i below tolerance)
Output: β̂, T̂_1, …, T̂_n
Stopping rule:
\[ \max\left\{\frac{|\beta^{(n+1)}-\beta^{(n)}|}{\beta^{(n)}},\;\max_i \frac{|T_i^{(n+1)}-T_i^{(n)}|}{T_i^{(n)}}\right\}<10^{-6}\;(\text{tight}) \;\text{or } 10^{-4}\;(\text{looser}). \]
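The pseudocode and stopping rule above can be sketched as runnable Python (a minimal illustration; the function name, 0-based indexing, and data layout are my own, with the missing index `s` passed explicitly):

```python
# Minimal sketch of the EM iteration above: x[s] is missing, y is fully
# observed. Initialization follows the notes: guess the missing x_s by y_s.
def em_poisson(x, y, s, beta0=1.0, tol=1e-6, max_iter=10_000):
    n = len(y)
    sum_y = sum(y)
    sum_x_obs = sum(x[i] for i in range(n) if i != s)
    # T_i^(0) = (x_i + y_i) / (beta^(0) + 1), with x_s guessed as y_s.
    T = [((y[i] if i == s else x[i]) + y[i]) / (beta0 + 1.0) for i in range(n)]
    beta = beta0
    for _ in range(max_iter):
        # E-step: replace the missing x_s by its conditional mean T_s^(n).
        Sx_tilde = sum_x_obs + T[s]
        # M-step: closed-form updates.
        beta_new = sum_y / Sx_tilde
        T_new = [((T[i] if i == s else x[i]) + y[i]) / (beta_new + 1.0)
                 for i in range(n)]
        # Stopping rule: max relative change in beta and all T_i.
        rel = max(abs(beta_new - beta) / beta,
                  max(abs(T_new[i] - T[i]) / T[i] for i in range(n)))
        beta, T = beta_new, T_new
        if rel < tol:
            break
    return beta, T
```

A sanity check on the fixed point, easily read off the update equations: at convergence \(T_s\beta = y_s\), so \(\hat\beta = \sum_{i\neq s} y_i \big/ \sum_{i\neq s} x_i\) and \(\hat T_s = y_s/\hat\beta\); the unit with the missing \(x\) drops out of the estimate of \(\beta\).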
Suppose \(n=3\). Observed:
\[ (x_1, x_2, x_3)=(12,\,?,\,9),\qquad (y_1,y_2,y_3)=(8,\,10,\,7). \]
Init: \(\beta^{(0)}=1\); guess the missing \(x_2\) by its paired \(y_2=10\). Then \(T_i^{(0)}=(x_i+y_i)/(\beta^{(0)}+1)\) gives:
\[ T_1^{(0)}=10,\;\;T_2^{(0)}=10,\;\;T_3^{(0)}=8. \]
E: \(\tilde S_X^{(0)}=12+9+T_2^{(0)}=31.\)
M: \(\beta^{(1)}=\tfrac{25}{31}\approx0.8065.\)
Then:
\[ T_1^{(1)}=T_2^{(1)}=\frac{20}{1.8065}\approx11.07,\quad T_3^{(1)}=\frac{16}{1.8065}\approx8.86. \]
(Here \(T_2^{(0)}+y_2 = x_1+y_1 = 20\), so \(T_1^{(1)}\) and \(T_2^{(1)}\) happen to coincide.)
Next iteration uses \(T_2^{(1)}\), and so on, until convergence.
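The first iteration can be checked by direct arithmetic, mirroring the update formulas above (variable names are illustrative):

```python
# Check the first EM iteration of the worked example by direct arithmetic.
x1, x3 = 12, 9                  # observed x's (x_2 is missing)
y = [8, 10, 7]
beta0 = 1.0
T0 = [(12 + 8) / 2, (10 + 10) / 2, (9 + 7) / 2]   # = [10, 10, 8]

# E-step: impute the missing x_2 by T_2^(0).
Sx_tilde = x1 + x3 + T0[1]                         # 31
# M-step:
beta1 = sum(y) / Sx_tilde                          # 25/31 ≈ 0.8065
T1 = [(12 + 8) / (beta1 + 1),                      # ≈ 11.07
      (T0[1] + 10) / (beta1 + 1),                  # ≈ 11.07
      (9 + 7) / (beta1 + 1)]                       # ≈ 8.86
print(round(beta1, 4), [round(t, 2) for t in T1])
```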