tps                 package:funfits                 R Documentation

_T_h_i_n _p_l_a_t_e _s_p_l_i_n_e _r_e_g_r_e_s_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     A thin plate spline is the result of minimizing the residual sum
     of squares subject to a constraint that the function have a
     certain level of smoothness (equivalently, a roughness penalty).
     Roughness is quantified by the integral of squared m^th order
     derivatives. For one dimension and m=2 the roughness penalty is
     the integrated square of the second derivative of the function.
     For two dimensions the roughness penalty is the integral of
     (Dxx(f))^2 + 2(Dxy(f))^2 + (Dyy(f))^2, where Duv denotes the
     second partial derivative with respect to u and v. Besides
     controlling the order of the derivatives, the value of m also
     determines the base polynomial that is fit to the data. The
     degree of this polynomial is (m-1).

     The smoothing parameter controls how much the data are smoothed.
     In the usual form this is denoted by lambda, the Lagrange
     multiplier of the minimization problem. Although this is an
     awkward scale, its limits are easy to interpret: lambda=0
     corresponds to no smoothness constraint, so the data are
     interpolated, while lambda=infinity corresponds to fitting just
     the polynomial base model by ordinary least squares.
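
     The two limits of lambda can be checked with a small base-R
     sketch for the one-dimensional case (m=2, so the radial basis is
     |r|^3 and the base polynomial is linear). The helper tps_sketch
     is hypothetical and written only for illustration; it is not the
     funfits implementation, and its lambda is not guaranteed to be on
     the same scale as the lambda used by tps.

```r
# Minimal 1-d thin plate spline sketch (m = 2, power = 2*m - d = 3).
# tps_sketch() is a hypothetical helper, not the funfits code.
tps_sketch <- function(x, y, lambda) {
  n <- length(x)
  K <- abs(outer(x, x, "-"))^3        # radial basis matrix |x_i - x_j|^3
  Tmat <- cbind(1, x)                 # linear (degree m-1 = 1) base polynomial
  # Solve the block system [K + lambda*I, T; t(T), 0] (c, d) = (y, 0)
  A <- rbind(cbind(K + lambda * diag(n), Tmat),
             cbind(t(Tmat), matrix(0, 2, 2)))
  sol <- solve(A, c(y, 0, 0))
  cc <- sol[1:n]                      # radial basis coefficients
  d <- sol[n + 1:2]                   # polynomial coefficients
  list(c = cc, d = d, fitted.values = drop(K %*% cc + Tmat %*% d))
}

set.seed(1)
x <- seq(0, 1, length.out = 20)
y <- sin(2 * pi * x) + rnorm(20, sd = 0.2)

interp <- tps_sketch(x, y, lambda = 0)    # no smoothing: interpolates the data
stiff  <- tps_sketch(x, y, lambda = 1e4)  # heavy smoothing: nearly a straight line
ols    <- fitted(lm(y ~ x))               # ordinary least squares line

err.interp <- max(abs(interp$fitted.values - y))   # essentially zero
err.stiff  <- max(abs(stiff$fitted.values - ols))  # small, shrinks as lambda grows
```

     With lambda=0 the fitted values reproduce y, and with a large
     lambda they approach the least squares line, matching the two
     limits described above.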

_U_s_a_g_e:

     tps(x, y, lambda=NA, df=NA, cost=1, knots, weights=rep(1, length(y)), m, 
     power,scale.type="unit.sd", x.center, x.scale, return.matrices=T, 
     nstep.cv=80, method="GCV", rmse=NA, link.matrix=NA, verbose=F, 
     subset=NULL, tol=0.0001, print.warning=T)

_A_r_g_u_m_e_n_t_s:

       x: Matrix of independent variables. 

       y: Vector of dependent variables. 

  lambda: Smoothing parameter. If omitted, this is estimated by GCV.
          lambda=0 gives an interpolating model. 

      df: Specifies the effective degrees of freedom associated with
          the spline  estimate. This parameter is an alternative to
          specifying lambda directly.  

    cost: Cost value used in the GCV criterion; it acts as a penalty
          for an increased number of parameters. The default is 1. 

   knots: Subset of data used in the fit. 

 weights: Vector of weights. The default is no weighting, i.e. a
          vector of unit weights. Weights are in units of reciprocal
          variance. 

       m: Order of spline surface, default is 2 corresponding to a
          linear (m-1=1)  base polynomial model. If power is specified
          (m-1) will be the degree of the polynomial null space.  

   power: Power used for the norm in the radial basis functions. The
          default is 2*m-d, where d is the dimension of x, and this
          results in true thin-plate splines. 

scale.type: The independent variables and knots are scaled to the
          specified scale.type. By default the scale type is "unit.sd",
          whereby the data is scaled to have mean 0 and standard
          deviation 1. Scale type of "range" scales the data to the
          interval (0,1) by forming (x-min(x))/range(x) for each x.
          Scale type of "user" allows specification of an x.center and
          x.scale by the user. The default for "user" is mean 0 and
          standard deviation 1. Scale type of "unscaled" does not scale
          the data. 

x.center: Value subtracted from each column of the x matrix. 

 x.scale: Value divided into each column for scaling. 

return.matrices: Matrices from the decompositions are returned. 

nstep.cv: Number of grid points for initial GCV grid search. 

  method: The method for determining the smoothing parameter used to
          evaluate the spline estimate. Choices are "GCV" (the
          default), "RMSE" and "pure error". The method may also be
          set implicitly by the values of other arguments. 

    rmse: Root mean squared error used as a target for choosing
          lambda. 

link.matrix: A matrix that relates the function evaluated at the x
          values to the  mean of y. This option is used when the mean
          of the observed data is a  linear combination of the function
          evaluated at the x values.   

 verbose: If true, prints intermediate calculations. This is mainly
          for troubleshooting. 

  subset: A logical vector indicating the subset of data to use for
          fitting.  

     tol: Tolerance for convergence of the golden section and bisection
          searches  in the GCV function and the df.to.lambda function. 

print.warning: 
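
     The relationship between the df and lambda arguments can be
     sketched in base R for the one-dimensional case: the effective
     degrees of freedom are the trace of the smoothing matrix
     A(lambda), and a root finder recovers the lambda that gives a
     target df. The helper tps_trace is hypothetical, and uniroot is
     used here as a stand-in for the bisection search mentioned under
     tol; this is an illustration under those assumptions, not the
     funfits code.

```r
# Hypothetical helper: effective degrees of freedom tr(A(lambda)) for a
# 1-d thin plate spline with m = 2 (radial basis |r|^3, linear polynomial).
tps_trace <- function(x, lambda) {
  n <- length(x)
  K <- abs(outer(x, x, "-"))^3
  Tmat <- cbind(1, x)
  A <- rbind(cbind(K + lambda * diag(n), Tmat),
             cbind(t(Tmat), matrix(0, 2, 2)))
  # The upper-left n x n block of solve(A) maps y to the coefficients c,
  # and fitted = y - lambda * c, so A(lambda) = I - lambda * C.
  C <- solve(A)[1:n, 1:n]
  n - lambda * sum(diag(C))
}

x <- seq(0, 1, length.out = 20)
target.df <- 6

# Search on the log scale because lambda spans many orders of magnitude.
root <- uniroot(function(loglam) tps_trace(x, exp(loglam)) - target.df,
                interval = log(c(1e-10, 1e4)), tol = 1e-8)
lambda.hat <- exp(root$root)
df.hat <- tps_trace(x, lambda.hat)   # close to the target of 6
```

     The trace falls from n (interpolation) towards 2 (the linear
     base model) as lambda grows, so any df between those limits
     corresponds to a single lambda.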

_V_a_l_u_e:

     A list of class tps. This includes the predicted surface in
     fitted.values and the residuals. The results of the grid search
     minimizing the generalized cross validation function are returned
     in gcv.grid.

    call: Call to the function 

       x: Matrix of independent variables. 

       y: Vector of dependent variables. 

    form: Logical denoting the form of the model. Default is a thin
          plate spline. 

    cost: Cost value used in GCV criterion.  

       m: Order of spline surface. 

   trace: Effective number of parameters in model. 

    trA2: trace of the square of the smoothing matrix, tr(A(lambda)**2) 

   yname: Name of the response. 

 weights: Vector of weights. 

   knots: Subset of data used. 

transform: List of components used in scaling data. 

   power: 2*m-d unless specified explicitly in the call. 

      np: Total number of parameters in the model. 

      nt: Number of parameters in the null space. 

matrices: List of matrices from the decompositions (D, G, u, X, qr.T). 

gcv.grid: Matrix of values used in the GCV grid search. The first
          column is the grid of lambda values used in the search, the
          second column  is the trace of the A matrix, the third column
          is the GCV values and the fourth column is the estimated
          variance. 

  eff.df: Effective degrees of freedom of the model. 

fitted.values: Predicted values from the fit. 

residuals: Residuals from the fit. 

  lambda: Value of the smoothing parameter used in the fit.  

    beta: All coefficients i.e. corresponding to polynomial and radial
          basis functions. 

       d: Parameters of the polynomial null space model. The powers for
          these terms are accessible from the attribute matrix returned
          by make.tmatrix. 

       c: Parameters for the radial basis terms.  

coefficients: Same as beta.  

just.solve: Logical indicating lambda=0 i.e. an interpolating function
          is returned. 

lambda.est: A matrix giving the values  of lambda found for different
          methods and the corresponding estimates of sigma and the GCV
          function. 

    shat: Estimated standard deviation of the errors using the
          specified value of  lambda 

shat.pure.error: Estimate of the standard deviation of the errors using
          replicated points. The value is NA if there are no
          replicates. 

     GCV: Value of Generalized Cross Validation at lambda 

    rmse: Root mean squared error used as a target for choosing lambda. 

      q2: 

   press: 
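
     The layout of gcv.grid described above (lambda, trace of A, GCV,
     estimated variance) can be mimicked with a base-R sketch for the
     one-dimensional case. It assumes the usual GCV function, i.e.
     cost fixed at 1, and an 80-point grid as in the nstep.cv
     default; the helper gcv_row and the column names are
     hypothetical, and the actual tps search may differ in detail.

```r
set.seed(2)
n <- 30
x <- seq(0, 1, length.out = n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

K <- abs(outer(x, x, "-"))^3          # radial basis for d = 1, m = 2
Tmat <- cbind(1, x)                   # linear base polynomial

# Hypothetical helper: one row of the grid for a given lambda.
gcv_row <- function(lambda) {
  A <- rbind(cbind(K + lambda * diag(n), Tmat),
             cbind(t(Tmat), matrix(0, 2, 2)))
  C <- solve(A)[1:n, 1:n]             # maps y to the radial coefficients c
  res <- lambda * drop(C %*% y)       # residuals: y - fitted = lambda * c
  trA <- n - lambda * sum(diag(C))    # trace of the smoothing matrix
  rss <- sum(res^2)
  c(lambda = lambda,
    trA = trA,
    GCV = (rss / n) / (1 - trA / n)^2,   # usual GCV function (cost = 1)
    shat2 = rss / (n - trA))             # estimated error variance
}

lambda.grid <- exp(seq(log(1e-6), log(1e2), length.out = 80))
gcv.grid <- t(sapply(lambda.grid, gcv_row))
best <- gcv.grid[which.min(gcv.grid[, "GCV"]), ]   # grid point minimizing GCV
```

     The grid minimum would then serve as the starting point for a
     finer search, such as the golden section search mentioned under
     the tol argument.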

_R_e_f_e_r_e_n_c_e_s:

     See "Nonparametric Regression and Generalized Linear Models" by
     Green and Silverman.

     See "Generalized Additive Models" by Hastie and Tibshirani.

_S_e_e _A_l_s_o:

     summary.tps, predict.tps, predict.se.tps, plot.tps, surface.tps

_E_x_a_m_p_l_e_s:

     #1-d example

     tps( rat.diet$t, rat.diet$trt) # lambda found by GCV
     tps( rat.diet$t, rat.diet$trt, df=6) # lambda chosen so that spline has 6 
                                          # degrees of freedom

     #2-d example
     tps(ozone$x, ozone$y) -> fit # fits a surface to ozone measurements.
     plot(fit) # plots fit and residuals.

     #4-d example
     tps(BD[,1:4],BD$lnya,scale.type="range") -> fit # fits a surface to
     # DNA strand displacement amplification as a function of various 
     # buffer components.
     surface(fit)  
     # plots fitted surface and contours



     #2-d example using a reduced set of basis functions
     r1 <- range(flame$x[,1])
     r2 <-range( flame$x[,2])
     g.list <- list(seq(r1[1], r1[2], length.out=6),
                    seq(r2[1], r2[2], length.out=6))
     make.surface.grid(g.list) -> knots  # these knots are a 6x6 grid over
     # the ranges of the two flame variables
     tps(flame$x, flame$y, knots=knots, m=4) -> out 

     # here is an example using a link matrix
     #
     x <- seq(0.02, 1, length.out=50)  # 50 points to match the 50x50 matrix M
     f <- 8*(x^2)*(1-x)
     M <- matrix(.02, ncol=50, nrow=50)
     M[col(M) < row(M)] <- 0
     set.seed(123)
     error <- rnorm(50)  # simulated noise
     y <- M%*%f + error*.1 # So y is approximately the integral of f plus error
     ex.out <- tps(x, y, link.matrix=M)
     # Note: predict will give the spline NOT the predicted values for y!
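
     To see what a link matrix of this kind encodes, the following
     base-R sketch compares M %*% f with the exact integral of f from
     each x value to 1. It assumes 50 grid points with spacing 0.02 so
     that the dimensions of M and f agree, and it does not call tps;
     the names Mf and exact are introduced only for this illustration.

```r
x <- seq(0.02, 1, length.out = 50)
f <- 8 * x^2 * (1 - x)

M <- matrix(0.02, nrow = 50, ncol = 50)
M[col(M) < row(M)] <- 0            # keep the diagonal and upper triangle

Mf <- drop(M %*% f)                # mean of y implied by the link matrix

# Exact integral of f from x[i] to 1, using the antiderivative
# F(x) = 8*x^3/3 - 2*x^4, so the integral is F(1) - F(x[i]).
exact <- (8 / 3 - 2) - (8 * x^3 / 3 - 2 * x^4)

disc.err <- max(abs(Mf - exact))   # Riemann-sum discretization error
```

     Row i of M sums 0.02*f over the grid points from x[i] upward, so
     the mean of y is a running integral of f rather than f itself.
     This is why predict returns the spline estimate of f and not
     fitted values for y.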

