Title: | Smoothness-Penalized Deconvolution for Density Estimation Under Measurement Error |
---|---|
Description: | Implements the Smoothness-Penalized Deconvolution method for estimating a probability density under measurement error of Kent and Ruppert (2023) <doi:10.1080/01621459.2023.2259028>. The estimator is formed by computing a histogram of the error-contaminated data, and then finding an estimate that minimizes a reconstruction error plus a smoothness-inducing penalty term. The primary function, sped(), takes the data and error distribution, and returns the estimator as a function. |
Authors: | David Kent [aut, cre] |
Maintainer: | David Kent <[email protected]> |
License: | GPL-3 |
Version: | 0.1 |
Built: | 2025-03-09 02:55:01 UTC |
Source: | https://github.com/cran/spedecon |
compute_ephemera()
does data-independent pre-computations for sped and can speed up repeated applications
compute_ephemera(gtwid, hn, padding, spline_dim, perknot = 2)
compute_ephemera(gtwid, hn, padding, spline_dim, perknot = 2)
gtwid |
Object of class spedecon_gtwid describing the density of |
hn |
Object of class |
padding |
Support of spline space is extended by |
spline_dim |
Numeric integer, dimension of spline space |
perknot |
Number of positivity constraints per knot |
The computations in sped rely on several matrices and vectors that are determined by the error density, spline space, and histogram bins, but do not depend on the data. Computing these is the most time-intensive element of the process, so if the estimator will be applied several times to different data, but the same error density, spline space, and histogram bins (likely in simulations), gains can be had by pre-computing those matrices and vectors just one time.
For comparison, the sped function internally uses padding = 0.4
, and perknot = 2
.
Object of class spedecon_ephemera
, a list containing the pre-computed values.
Kent D, Ruppert D (2023). “Smoothness-Penalized Deconvolution (SPeD) of a Density Estimate.” Journal of the American Statistical Association, to appear. ISSN 0162-1459, doi:10.1080/01621459.2023.2259028
alpha <- 1e-3; n <- 1e3; s <- 0.3 Y <- rgamma(n,5,2) + rnorm(n,0,s) gtwid <- gaussian_gtwid(sd=s) hn <- hist(Y,breaks="FD",plot=FALSE) ephemera <- compute_ephemera(gtwid=gtwid,hn=hn,padding=0.4,spline_dim=30,perknot=2) sol1 <- sped(Y,gtwid,1e-3,ephemera=ephemera) # fast sol2 <- sped(Y,gtwid,1e-3) # slow attr(sol1,"coef") - attr(sol2,"coef")
alpha <- 1e-3; n <- 1e3; s <- 0.3 Y <- rgamma(n,5,2) + rnorm(n,0,s) gtwid <- gaussian_gtwid(sd=s) hn <- hist(Y,breaks="FD",plot=FALSE) ephemera <- compute_ephemera(gtwid=gtwid,hn=hn,padding=0.4,spline_dim=30,perknot=2) sol1 <- sped(Y,gtwid,1e-3,ephemera=ephemera) # fast sol2 <- sped(Y,gtwid,1e-3) # slow attr(sol1,"coef") - attr(sol2,"coef")
Returns a spedecon_gtwid
object representing the Fourier transform of a mean-zero Gaussian density
gaussian_gtwid(sd)
gaussian_gtwid(sd)
sd |
Standard deviation |
Object of class spedecon_gtwid
gtwid <- gaussian_gtwid(sd = 1)
gtwid <- gaussian_gtwid(sd = 1)
Returns a spedecon_gtwid
object representing the Fourier transform of a mean-zero Laplace density with scale b
laplace_gtwid(b)
laplace_gtwid(b)
b |
Scale parameter |
Object of class spedecon_gtwid
gtwid <- laplace_gtwid(b = 1)
gtwid <- laplace_gtwid(b = 1)
spedecon_gtwid
Constructor for class spedecon_gtwid
. Use helper functions gaussian_gtwid()
, laplace_gtwid()
, and uniform_gtwid()
instead whenever possible.
new_spedecon_gtwid(gtwid, family)
new_spedecon_gtwid(gtwid, family)
gtwid |
Function representing the Fourier transform |
family |
List with at least one entry |
The spedecon_gtwid
class is meant to represent the Fourier transform of a probability density.
The basic type is a function.
It also has a family
attribute which can hold the name and parameters of the family of distributions.
Object of class spedecon_gtwid
Use gaussian_gtwid()
, laplace_gtwid()
, or uniform_gtwid()
instead whenever possible.
spedecon_spline_sped_fit
Internal use only. Constructor for class spedecon_spline_sped_fit
new_spedecon_spline_sped_fit(coef, basis, alpha, constraint)
new_spedecon_spline_sped_fit(coef, basis, alpha, constraint)
coef |
Numeric vector; the coefficients of the spline. |
basis |
Object of class |
alpha |
Positive numeric, the alpha that was used for the fit. |
constraint |
The type of constraint used. |
The basic type of an object of type spedecon_spline_sped_fit
is a function; one can therefore evaluate, plot, etc. and ignore the other attributes if desired.
The function is represented as a spline, and has attributes coef
and basis
, which represents the coefficients and basis respectively.
coef
is a numeric vector, while basis
is an object of class spedecon_spline_basis
, which is essentially just a list holding the knots and order of the spline space.
A spedecon_spline_sped_fit
object also has attributes alpha
and constraint
which record the penalty parameter and constraint method used for the fit.
Object of class spedecon_spline_sped_fit
sped()
computes the Smoothness-Penalized Deconvolution estimate on the provided data and error distribution
sped( Y, gtwid, alpha, constraint = "constrainedQP", spline_dim = 30, hn = NULL, ephemera = NULL, ... )
sped( Y, gtwid, alpha, constraint = "constrainedQP", spline_dim = 30, hn = NULL, ephemera = NULL, ... )
Y |
Numeric vector of data from the model |
gtwid |
Object of class spedecon_gtwid describing the density of |
alpha |
Positive numeric penalty parameter |
constraint |
String, controls whether and how the solution is constrained to be a pdf. One of |
spline_dim |
Numeric integer, dimension of spline space |
hn |
(optional) Object of class |
ephemera |
(optional) Object of class |
... |
(optional) Other arguments |
This function computes the "Smoothness-Penalized Deconvolution" (SPeD) estimate of a density under additive measurement error.
The essential inputs to the function are the data Y
, the Fourier transform gtwid
of the error density, and the penalty parameter alpha
; more details follow here, but for a full description of the estimator please consult Kent and Ruppert (2023).
The data model is that we observe an iid sample distributed like , with
an error independent of
.
We wish to estimate the density
of
.
It is assumed that we know the probability density of the errors
, call it
.
The estimator begins with a density estimate of the density of
, and minimizes the objective function
in , with
ranging over a space of cubic splines with equally-spaced knots; the dimension of this space can be adjusted with the argument
spline_dim
.
The SPeD estimate is not naturally a pdf, so it must be constrained.
When constraint = "constrainedQP"
, the constraint is imposed directly into quadratic program minimizing the objective; when constraint = "projection"
, the unconstrained estimate is computed and then projected onto the space of pdfs.
The preliminary density estimate is computed internally as a histogram using Freedman-Diaconis choice of bin width, but a user-supplied histogram computed with
hist()
may be provided via the hn
argument.
The computations require the Fourier transform of the probability density, and this must be supplied as an object of type
spedecon_gtwid
, which can be produced for common error densities using the helper functions gaussian_gtwid()
, laplace_gtwid()
, and uniform_gtwid()
.
If the estimator will be re-computed many times for many realizations of data, substantial time can be saved by pre-computing all the auxiliary matrices and vectors one time, and supplying them through the ephemera
argument.
This can be done whenever the repeated computations all use the same error density, same histogram bins, and same spline space, as those are what define the required matrices and vectors.
A helper function compute_ephemera()
is provided to pre-compute these.
Object of class spedecon_spline_sped_fit
Kent D, Ruppert D (2023). “Smoothness-Penalized Deconvolution (SPeD) of a Density Estimate.” Journal of the American Statistical Association, to appear. ISSN 0162-1459, doi:10.1080/01621459.2023.2259028
alpha <- 1e-3 n <- 1e3; s <- 0.3 Y <- rgamma(n,5,2) + rnorm(n,0,s) # Data, contaminated with Gaussian errors sol <- sped(Y,gtwid=gaussian_gtwid(sd=s),1e-3) plot(sol,n=1e3) # Plot the resulting estimate curve(dgamma(x,5,2),col=2,n=1e3,add=TRUE) # The target density f() of X sol(c(2,3,4)) # We can evaluate sol; it is a function
alpha <- 1e-3 n <- 1e3; s <- 0.3 Y <- rgamma(n,5,2) + rnorm(n,0,s) # Data, contaminated with Gaussian errors sol <- sped(Y,gtwid=gaussian_gtwid(sd=s),1e-3) plot(sol,n=1e3) # Plot the resulting estimate curve(dgamma(x,5,2),col=2,n=1e3,add=TRUE) # The target density f() of X sol(c(2,3,4)) # We can evaluate sol; it is a function
Returns a spedecon_gtwid
object representing the Fourier transform of a Uniform[-a,a] density
uniform_gtwid(a)
uniform_gtwid(a)
a |
Half-width |
Object of class spedecon_gtwid
gtwid <- uniform_gtwid(a = 1)
gtwid <- uniform_gtwid(a = 1)