A Study of the Bias from Inverse Poisson-Variance Weighting
F. Masci, 4 February 2008
Overview
We have simulated the bias from inverse-variance
weighted averaging of purely Poisson-distributed measurements where
the variances σ_i² are derived from the measurements x_i,
i.e., σ_i² = x_i.
Results are summarized in Tables 1 & 2 for different
sigma limits, and for confidence intervals (total probability)
about the assumed mean that are either symmetric or asymmetric.
The measurements may represent the number of photo-electrons
recorded at a pixel in an imaging detector. It's important to
note that the biases computed below are for pure Poisson
fluctuations. Other contributions to the measurement
variance such as detector read-noise, which is likely to be
independent and Gaussian in nature, will dilute the
bias from inverse Poisson-variance weighting.
Furthermore, artificial pixel responsivity variations
must be corrected before computing any inverse-variance
weighted average using Poisson-derived variances.
Case 1: Asymmetric Confidence Intervals about the Mean
Table 1 summarizes results for the bias in the inverse Poisson-variance weighted
average with the confidence interval (integrated probability) kept asymmetric
about the mean μ (truth).
This bias arises from the non-zero skewness of the Poisson
distribution, whose skew scales as μ^(-1/2).
This case may be encountered
when one blindly uses the same number of standard
deviations about the mean for retaining data after outlier rejection,
i.e., within μ ± nσ, where
σ = μ^(1/2) for Poisson-distributed data.
This ignores the possibility
that the underlying distribution of the
population is asymmetric, e.g., as in the Poisson regime.
We have simulated this scenario by randomly drawing 10,000,000 samples
from a Poisson distribution with assumed mean μ (column 1) and
then retaining only values x_i within
μ ± 2σ and, separately, μ ± 3σ.
These values were then combined to compute an inverse Poisson-variance weighted
average (column 2):
Weighted Avg = ∑_i(x_i/σ_i²) / ∑_i(1/σ_i²) = ∑_i(x_i/x_i) / ∑_i(1/x_i) = N_s / ∑_i(1/x_i),
where N_s is the number of retained samples.
The bias in the weighted average (column 3)
is measured relative to the true mean μ:
Bias (%) = 100 * (Weighted Avg - μ) / μ.
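For illustration, here is a minimal R sketch of this Case-1 procedure. It is
not the original PoissonBias.r; μ and n are example values taken from the
table below, and the sample size is the 10,000,000 draws quoted above.

## Case 1 sketch: draw Poisson samples, clip symmetrically at mu +/- n*sigma,
## and form the inverse Poisson-variance weighted average.
set.seed(1)
mu <- 100                     # true mean (column 1)
n  <- 3                       # retain values within mu +/- n*sigma
Ns <- 1e7                     # number of random draws
sigma <- sqrt(mu)             # Poisson sigma at the true mean

x <- rpois(Ns, mu)
x <- x[x > 0 & x >= mu - n*sigma & x <= mu + n*sigma]   # clip (x > 0 guards against 1/0 weights)

wavg <- length(x) / sum(1/x)            # = N_s / sum(1/x_i), since sigma_i^2 = x_i
bias <- 100 * (wavg - mu) / mu          # bias (%) relative to the true mean
cat("Weighted Avg =", wavg, "  Bias (%) =", bias, "\n")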
μ (truth) | Weighted Avg | Bias (%) | n in ±nσ
---------------------------------------------------------
10 8.963957 -10.360428 2.000
30 29.074249 -3.085837 2.000
50 49.045239 -1.909521 2.000
100 99.056126 -0.943874 2.000
300 299.076390 -0.307870 2.000
500 499.070494 -0.185901 2.000
1000 999.063474 -0.093653 2.000
3000 2999.059874 -0.031338 2.000
5000 4999.058573 -0.018829 2.000
10000 9999.108676 -0.008913 2.000
15000 14999.109218 -0.005939 2.000
20000 19999.021523 -0.004892 2.000
30000 29999.093535 -0.003022 2.000
.....................................................
10 8.830797 -11.692025 3.000
30 28.948477 -3.505076 3.000
50 48.964837 -2.070327 3.000
100 98.973086 -1.026914 3.000
300 298.990371 -0.336543 3.000
500 498.981064 -0.203787 3.000
1000 998.997981 -0.100202 3.000
3000 2998.983203 -0.033893 3.000
5000 4998.994062 -0.020119 3.000
10000 9998.955520 -0.010445 3.000
15000 14998.989938 -0.006734 3.000
20000 19999.000118 -0.004999 3.000
30000 29999.022856 -0.003257 3.000
---------------------------------------------------------
Table 1 - Results for bias in inverse-variance weighted
average assuming asymmetric (unbalanced)
confidence intervals about the mean.
Case 2: Symmetric Confidence Intervals about the Mean
Table 2 summarizes results for the bias in the inverse Poisson-variance
weighted
average with the confidence interval (integrated probability) made symmetric
about the mean (truth).
This is done by first fixing an
upper threshold n_U, then finding the lower
threshold n_L such that:
Prob[ μ - n_L σ < x_i < μ ] = Prob[ μ < x_i < μ + n_U σ ].
10,000,000 random draws were initially made. Then only those values
x_i within the new interval
μ - n_L σ to μ + n_U σ
were used for the weighted averaging (column 2).
All other columns are the same as defined for Table 1 above.
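The following is a minimal R sketch of this balancing condition as written
above (not the original PoissonBias.r). The handling of the discrete Poisson
endpoints and the grid search for n_L are assumptions made here for
illustration, with μ assumed to be an integer as in the tables, so the n_L
found this way need not reproduce the tabulated values exactly.

## Imbalance between the upper and lower integrated probabilities for a trial
## lower threshold nL, given mu and a fixed upper threshold nU (Poisson CDF).
imbalance <- function(nL, mu, nU) {
  sigma  <- sqrt(mu)
  pUpper <- ppois(ceiling(mu + nU*sigma) - 1, mu) - ppois(mu, mu)   # Prob[mu < x < mu + nU*sigma]
  pLower <- ppois(mu - 1, mu) - ppois(floor(mu - nL*sigma), mu)     # Prob[mu - nL*sigma < x < mu]
  pUpper - pLower
}

## Example: search a grid of nL values for the one that best balances the two
## probabilities, then clip and average exactly as in Case 1.
mu <- 100; nU <- 2; Ns <- 1e7
nL.grid <- seq(0.1, nU, by = 0.001)
nL <- nL.grid[which.min(abs(sapply(nL.grid, imbalance, mu = mu, nU = nU)))]

x <- rpois(Ns, mu)
x <- x[x > 0 & x > mu - nL*sqrt(mu) & x < mu + nU*sqrt(mu)]
cat("nL =", nL, "  Bias (%) =", 100 * (length(x)/sum(1/x) - mu)/mu, "\n")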
μ (truth) | Weighted Avg | Bias (%) | n_L | n_U
-----------------------------------------------------------------
10 9.962058 -0.379419 0.949 2.000
30 30.165960 0.553199 1.095 2.000
50 50.132571 0.265143 1.273 2.000
100 100.198941 0.198941 1.400 2.000
300 300.255111 0.085037 1.559 2.000
500 500.158431 0.031686 1.655 2.000
1000 1000.203027 0.020303 1.739 2.000
3000 3000.253909 0.008464 1.826 2.000
5000 5000.205662 0.004113 1.867 2.000
10000 10000.167745 0.001677 1.910 2.000
15000 15000.138307 0.000922 1.919 2.000
20000 20000.289366 0.001447 1.923 2.000
30000 30000.361102 0.001204 1.940 2.000
.............................................................
10 10.083210 0.832102 0.949 3.000
30 30.101412 0.338040 1.278 3.000
50 50.127484 0.254967 1.414 3.000
100 100.137742 0.137742 1.600 3.000
300 300.190409 0.063470 1.848 3.000
500 500.334528 0.066906 1.923 3.000
1000 1000.379039 0.037904 2.055 3.000
3000 3000.423075 0.014102 2.264 3.000
5000 5000.400008 0.008000 2.362 3.000
10000 10000.478232 0.004782 2.460 3.000
15000 15000.419536 0.002797 2.523 3.000
20000 20000.534468 0.002672 2.560 3.000
30000 30000.448127 0.001494 2.610 3.000
-----------------------------------------------------------------
Table 2 - Results for bias in inverse-variance weighted average
assuming symmetric (balanced) confidence intervals
about the mean.
The bias results in Tables 1 & 2 are graphed below.
Note that Log10[ <N> ] on the horizontal axes
is the same as Log10[μ], with μ taken from
column 1 of the above tables.
Figure 1 - Bias (in percent) as a function of assumed input expectation
value (mean count) for 2-sigma fluctuations: asymmetric (blue) and
symmetric (red) confidence intervals.
Figure 2 - Bias (in percent) as a function of assumed input expectation
value (mean count) for 3-sigma fluctuations: asymmetric (blue) and
symmetric (red) confidence intervals.
Conclusions and Discussion
- The results above show that the bias from inverse
Poisson-variance weighted averaging
is negative when the confidence interval (CI) is "blindly" kept
asymmetric about the mean. The magnitude of this bias increases
as the mean of the Poisson counts decreases and the distribution
becomes more asymmetric.
When the CI is made symmetric about the mean (e.g., by tuning
the respective lower and upper σ-clipping thresholds),
the biases become predominantly positive, but their magnitudes
are smaller by at least a factor of 7 below a mean count
of 100. The asymmetric and symmetric bias estimates converge
as the mean increases (see Figures 1 & 2), since
the distributions then approach a Gaussian and the
asymmetry (or skew) is reduced.
- For typical mean backgrounds of 100 to 10,000 photo-electrons
per pixel per 8.8 sec exposure expected for WISE,
the asymmetric CI (or symmetric 3σ-clipped) assumption gives
biases of ~1% and ~0.01% respectively. The symmetric CI (or
asymmetric 3σ-clipped upper tail) assumption gives
biases of ~0.14% and ~0.005% for 100 and 10,000 photo-electrons
respectively. Note that the 2σ-clipped biases are even smaller.
However, these photo-electron counts only pertain to the
background. A typical source with signal-to-noise ratio of ~20 per pixel
will correspond to an additional photo-electron count of ~700 electrons
per pixel per 8.8 sec in WISE bands 1 and 2. Note that a nominal
read-noise of 19 e-/pixel was included in this calculation.
The Poisson bias here is then expected to be <0.2%
under the worst possible case above: asymmetric CI with symmetric
3σ-clipping (Table 1). The bias for WISE bands 3 and 4 will
be minuscule since here the photo-electron counts will number in the
tens of thousands.
- If one is combining data at the raw count level with
mean counts below ~300 and requires
a photometric accuracy better than the biases quoted
for the asymmetric CI case in Table 1, it is
advised that the symmetric CI case (with smaller biases in Table 2)
be used. On the other hand, if one inadvertently keeps the
CI asymmetric by assuming symmetric nσ-clipping,
then a simple (unweighted)
arithmetic average of the measurements will give biases
smaller than those quoted in Table 1 but similar to
those quoted in Table 2 (symmetric CI case).
For example, for mean counts of ≥50, an arithmetic average will
be biased by <0.3% and <0.08% for measurements within 2σ and 3σ
respectively (see the sketch following this bullet). However, there may
be good reasons why one would want to perform an inverse-variance
weighted average, e.g., the application of
instrumental calibrations will also contribute to the error budget.
Inverse-variance weighting provides a way to emphasize the good
over the bad.
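As a quick check of the comparison made in the above bullet, here is a minimal
R sketch (not the original PoissonBias.r) for one case, μ = 50 with symmetric
2σ clipping, computing both the clipped arithmetic average and the inverse
Poisson-variance weighted average.

## Compare the biases of a simple arithmetic average and an inverse
## Poisson-variance weighted average under symmetric n-sigma clipping.
set.seed(3)
mu <- 50; n <- 2; Ns <- 1e7
x <- rpois(Ns, mu)
x <- x[x > 0 & abs(x - mu) <= n*sqrt(mu)]            # retain values within mu +/- n*sigma

bias.arith <- 100 * (mean(x) - mu) / mu              # unweighted arithmetic average
bias.wavg  <- 100 * (length(x)/sum(1/x) - mu) / mu   # inverse Poisson-variance weighting
cat("arithmetic bias (%) =", bias.arith, "  weighted bias (%) =", bias.wavg, "\n")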
- In the previous bullet point, we used the phrase "raw count level".
This is because the above simulations assume
one has a perfect detector: no read-noise and no instrumental
signatures to correct. If present, these will be corrected with
some degree of uncertainty and will therefore
inflate the final variances used for the optimal combination of
pixels in co-addition. Given that these are all likely to be
independent of the Poisson noise component, the
Poisson bias will no doubt be diluted. This is further
discussed in the last bullet point below.
Further Work / Refinements
- One of the caveats of using inverse-variance
weighted averaging when combining measurements is that
this method strictly applies only to data that are
Gaussian distributed, or that are far enough into the
Gaussian (large-N) limit. It is not the optimal
estimation method in the Poisson regime, or
when the underlying distribution is far from Gaussian.
In general, optimal combination of measurements x_i whose parent
probability density functions p_i are known (or assumed) a priori
requires first computing the refined joint density function:
Pr(x | x_i) = ∏_i p_i(x) / ∫_x ∏_i p_i(x) dx,
and then extracting the refined mean and variance.
It turns out that if all the p_i are
Gaussian density functions, then the refined mean from
the above equation boils
down to the classic inverse-variance weighting formula.
The same result can also be obtained using maximum likelihood methods.
This procedure is rarely carried out in practice (let alone the
underlying parent distribution checked), since most people
(myself included, as above) blindly
combine measurements using inverse-variance weighting
even when in the Poisson regime. It would be
interesting to see how the bias behaves under this more
optimal method as one approaches the Poisson limit
(a small numerical sketch of the idea is given below).
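The following minimal R sketch is one way to realize this numerically for
Poisson parents; it is for illustration only and is not the original
PoissonBias.r. The grid range, sample size, and Poisson mean are assumptions,
and no clipping is applied: only the combination rule is compared.

## Refined joint density for Poisson parents: each measurement x_i contributes
## its Poisson parent density evaluated at x_i, viewed as a function of the
## unknown mean x; the product is normalized on a grid and the refined mean
## extracted, then compared with inverse Poisson-variance weighting.
set.seed(2)
mu <- 10
xi <- rpois(50, mu)                            # a small set of Poisson measurements

xgrid <- seq(0.01, 5*mu, length.out = 5000)    # grid for the unknown mean x
dx    <- xgrid[2] - xgrid[1]
## sum of log densities = log of the product of the parent PDFs (avoids underflow)
logprod <- sapply(xgrid, function(x) sum(dpois(xi, lambda = x, log = TRUE)))
dens <- exp(logprod - max(logprod))
dens <- dens / sum(dens * dx)                  # normalized Pr(x | x_i)

refined.mean <- sum(xgrid * dens) * dx
ivw          <- length(xi[xi > 0]) / sum(1/xi[xi > 0])   # inverse Poisson-variance weighting
cat("refined mean =", refined.mean, "  inverse-variance weighted =", ivw, "\n")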
- What is the true impact of this bias in a realistic (practical)
situation? The biases computed above are for pure Poisson
fluctuations. Other contributions to the measurement
variance such as detector read-noise will dilute these
biases. Read-noise is an
independent stochastic component and likely to be Gaussian
in nature. A nice exercise would be to first convolve a
Gaussian representative of the read-noise with a Poisson
distribution and then simulate the bias (a sketch of this
is given at the end of this bullet point).
Also, variances are further modified downstream by the
application of several calibrations: e.g., flat-fielding,
dark subtraction, non-linearity etc.
It is only after all these artificial effects have been
corrected (and variances updated accordingly) that the data
are ready to be
optimally combined for co-addition.
Will the biases reported above persist this far into the
instrumental calibration process?
I would confidently say no unless one is
well into the Poisson regime with a mean count of
≤20 electrons.
However, if read-noise with say σ > 10 electrons
(variance > 100) is also present, then any such
Poisson bias will be substantially diluted.
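Here is a minimal R sketch of the read-noise exercise suggested in the last
bullet point above (not the original PoissonBias.r). The read-noise of 19 e-
is the nominal WISE value quoted earlier; taking σ_i² = x_i + RN² for the
per-measurement variance and clipping at 3σ of the combined noise are
assumptions made for illustration.

## Poisson counts plus independent Gaussian read-noise, followed by a
## symmetric 3-sigma clip and an inverse-variance weighted average.
set.seed(4)
mu <- 100; RN <- 19; n <- 3; Ns <- 1e7
x  <- rpois(Ns, mu) + rnorm(Ns, mean = 0, sd = RN)

sig2 <- x + RN^2                                   # estimated per-measurement variance
keep <- sig2 > 0 & abs(x - mu) <= n*sqrt(mu + RN^2)
wavg <- sum(x[keep]/sig2[keep]) / sum(1/sig2[keep])
cat("Bias (%) =", 100 * (wavg - mu) / mu, "\n")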
The simulations presented in Tables 1 & 2 were generated using the S-Plus/R code:
PoissonBias.r
Last update - 4 February 2008
F. Masci - IPAC