------
Question:
I have a collection of pointing-accuracy measurements from astrometry, each with an associated error from chi-squared fits of the stars.  Pointing accuracy depends on many variables, some of which are unknown, so I don't expect the measurements to be normally distributed.  Under some conditions it will be poor, and under others it will be good.
So my problem is how to report my measurements.  My first thought was to calculate a weighted mean using the inverse square of the measurement errors as the weights.  But I found this is a poor choice as it's not a normal distribution, so one really precise measurement of good pointing can carry all the weight and make all the other measurements irrelevant.
A simple average also has problems, as the measurement error is not taken into account. Thus a value of 0.7 +/- 0.7 is weighted the same as 0.7 +/- 0.01, so outliers dominate this calculation.
Any ideas?  I was thinking the weights need to be some kind of figure of merit, but can't think of a good way to do that.

------
Reply:

From: Frank Masci <fmasci@ipac.caltech.edu>
Date: November 14, 2014 at 9:56:04 AM PST

I need to understand how your measurements and metrics are represented. Are these radial differences between extractions and astrometric reference stars? Or do you have them as deltas per axis (dX,dY) in the detector or maybe sky frame? For the latter, I would simply compute a robust (outlier-resistant) measure of the RMS per axis (collapsed along either dX or dY), e.g., the inter-percentile spread: sigma = 0.5*[84%-tile - 16%-tile]. The 16% and 84% limits contain 68% of the measurements and, as you know, correspond to +/- 1 standard deviation for a Gaussian. However, your data need not be Gaussian and you can pick any limits for sigma, e.g., a 95% confidence level (CL): sigma = 0.5*[97.5%-tile - 2.5%-tile], as long as you say it's a 95% CL when you quote your sigmas. Also, your dX and/or dY may not be zero-mean, so you would also want to quote this bias. A straight median of the measurements (along dX and dY) is fine. Your n-sigma uncertainty in median{dX} is then sigma (from above) / sqrt(N), and similarly for dY. Again, you need to say whether this is a 68% or 95% CL uncertainty. You can also get fancy and compute the uncertainties in median{dX,dY} using bootstrap resampling (remember that?). This would provide a good check.
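
To make the per-axis recipe concrete, here is a minimal Python sketch using NumPy. The function names (robust_sigma, bootstrap_median_unc), the synthetic dX sample and the number of bootstrap resamples are all made up for illustration; swap in your own dX (or dY) array.

```python
import numpy as np

def robust_sigma(x, lo=16.0, hi=84.0):
    """Half the spread between two percentiles (16% and 84% by default)."""
    p_lo, p_hi = np.percentile(x, [lo, hi])
    return 0.5 * (p_hi - p_lo)

def bootstrap_median_unc(x, n_boot=2000, seed=0):
    """Robust spread of the medians of n_boot resamples drawn with replacement."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    idx = rng.integers(0, x.size, size=(n_boot, x.size))
    medians = np.median(x[idx], axis=1)
    return robust_sigma(medians)

# Synthetic stand-in for dX: a 0.3 bias, unit scatter, plus a few gross outliers.
rng = np.random.default_rng(42)
dx = rng.normal(loc=0.3, scale=1.0, size=500)
dx[:10] += 25.0

bias = np.median(dx)                    # robust estimate of the offset
sigma = robust_sigma(dx)                # 68% CL spread per axis
unc_simple = sigma / np.sqrt(dx.size)   # sigma / sqrt(N) uncertainty on the median
unc_boot = bootstrap_median_unc(dx)     # bootstrap cross-check

print(f"median(dX) = {bias:.3f} +/- {unc_simple:.3f} (68% CL; bootstrap {unc_boot:.3f})")
print(f"robust sigma(dX) = {sigma:.3f}")
```

Don't be surprised if the bootstrap number comes out somewhat larger than sigma/sqrt(N): for Gaussian data the standard error of a median is about 1.25 times that of a mean.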

Now, if you simply have radial distances between extracted positions and astrometric references (dR = sqrt[dX^2 + dY^2]), things are trickier because the population distribution is now Rayleigh-like and possibly asymmetric. You can still use the same robust methods as above, i.e., compute median{dR}, but your uncertainties could be asymmetric. I.e., the distance "median{dR} - 16%-tile{dR}" will not be the same as "84%-tile{dR} - median{dR}". This means you'll be quoting something like: median{dR} +unc(hi) / -unc(lo), instead of the symmetric case (in dX,dY) where you'd quote median{dX,dY} +/- unc. Like above, you'd need to say that unc(hi) and unc(lo) span some P% CL. It also doesn't hurt to re-estimate your uncertainties using bootstrap resampling on your "radial" sample.
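
For the radial case, a sketch along the same lines might look like the following. Again, the function names and the synthetic dR sample (built from two unit-Gaussian axes) are only illustrative, and the bootstrap here is the plain percentile flavor, which is one of several ways to do it.

```python
import numpy as np

def median_with_asymmetric_spread(r, lo=16.0, hi=84.0):
    """Return (median, unc_hi, unc_lo) from the lo/50/hi percentiles of r."""
    p_lo, med, p_hi = np.percentile(r, [lo, 50.0, hi])
    return med, p_hi - med, med - p_lo

def bootstrap_median_interval(r, lo=16.0, hi=84.0, n_boot=2000, seed=0):
    """Percentile interval of the median over n_boot resamples with replacement."""
    rng = np.random.default_rng(seed)
    r = np.asarray(r, dtype=float)
    idx = rng.integers(0, r.size, size=(n_boot, r.size))
    medians = np.median(r[idx], axis=1)
    b_lo, b_hi = np.percentile(medians, [lo, hi])
    return b_lo, b_hi

# Synthetic stand-in for dR: two independent unit-Gaussian axes.
rng = np.random.default_rng(1)
dx = rng.normal(0.0, 1.0, size=5000)
dy = rng.normal(0.0, 1.0, size=5000)
dr = np.hypot(dx, dy)

med, unc_hi, unc_lo = median_with_asymmetric_spread(dr)
boot_lo, boot_hi = bootstrap_median_interval(dr)

print(f"median(dR) = {med:.3f} +{unc_hi:.3f} / -{unc_lo:.3f} (16%-84% spread)")
print(f"bootstrap 68% interval on the median: [{boot_lo:.3f}, {boot_hi:.3f}]")
```

Note the two numbers answer different questions: the 16%-84% spread describes the scatter of the dR population, while the bootstrap interval describes how well the median itself is determined.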

Also, if your input 1D (per-axis) distributions (that generated the radial distributions) are known to be close to Gaussian and radially symmetric, you can estimate the 1D sigma from the median{dR} using the simple relation:
sigma{dX or dY} ~ median{dR}/sqrt(2*ln2).
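
The relation follows from the Rayleigh CDF, 1 - exp(-r^2 / (2*sigma^2)): setting it to 1/2 gives median{dR} = sigma*sqrt(2*ln2). A quick numerical check in Python, using an invented per-axis sigma of 0.5 and a made-up function name, could be:

```python
import numpy as np

def sigma_1d_from_median_radius(dr):
    """Per-axis sigma implied by median(dR), assuming Gaussian, radially symmetric axes."""
    return np.median(dr) / np.sqrt(2.0 * np.log(2.0))

# Synthetic check: zero-mean Gaussian axes with the same (assumed) sigma.
true_sigma = 0.5
rng = np.random.default_rng(7)
dx = rng.normal(0.0, true_sigma, size=20000)
dy = rng.normal(0.0, true_sigma, size=20000)
dr = np.hypot(dx, dy)

est_sigma = sigma_1d_from_median_radius(dr)
print(f"input sigma = {true_sigma}, recovered sigma = {est_sigma:.3f}")
```

If the axes have a non-zero bias or unequal scatter, the recovered value will drift from the true per-axis sigma, so this is only a consistency check under the stated assumptions.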