Skip to main content

Case Studies

Case study: Ultrasonic Reference Block Study

nistlogo

NIST info: This example illustrates the construction of a non-linear regression model for ultrasonic calibration data. This case study demonstrates fitting a non-linear model and the use of transformations and weighted fits to deal with the violation of the assumption of constant standard deviations for the errors. This assumption is also called homogeneous variances for the errors. Full description can be found here.

Data: The ultrasonic reference block data consist of a response variable and a predictor variable. The response variable is ultrasonic response and the predictor variable is metal distance. These data were provided by the NIST scientist Dan Chwirut (avaiable here).


In this example we will follow the suggested steps, listed at NIST pages.

1) Read in the data: For non-linear regression we'll use tMtxNonLinReg control. First we have to create the control and populate it's X and Y vectors with data.

2) Plot the y=y(x) chart to get an idea about the fit model. For this, you can use the MtxVecTee.DrawValues method. These two steps can be acomplished by the following code class:

project d32

Uses MtxVec, StatTools, Math387, MtxVecTee; 


procedure FitModel; 
var nlr: TMtxNonLinReg; 
begin 
  nlr := TMtxNonLinReg.Create(nil); 
  Chart1.FreeAllSeries; Chart1.AddSeries(TLineSeries.Create(Chart1)); 
  try 
    nlr.X.LoadFromFile('xdata.vec'); 
    nlr.Y.LoadFromFile('ydata.vec'); 
    DrawValues(nlr.X,nlrY,Chart1.Series[0]);

project cpp


#include "MtxVecCpp.h" #include "StatTools.hpp" #include "Math387.hpp" #include "MtxVec.hpp" #include "MtxVecTee.hpp" void FitModel(void); { TMtxNonLinReg* nlr = new TMtxNonLinReg(NULL); Chart1->FreeAllSeries(); Chart1.AddSeries(new TLineSeries(Chart1)); try { nlr->X->LoadFromFile("xdata.vec"); nlr->Y->LoadFromFile("ydata.vec"); DrawValues(nlr->X,nlr->Y,Chart1->Series[0],0,1,false);

project cs

using Dew.Math; 
using Dew.Math.Units;
using Dew.Stats.Units;
using Dew.Stats; 

namespace Dew.Tests 
{ 
   private void FitModel() { 
      TMtxNonLinReg nlr = new TMtxNonLinReg(null);tChart1.Series.Clear(); 
      tChart1.Series.Add(new Steema.TeeChart.Styles.Line(tChart1.Chart));
      try { 
         nlr.X.LoadFromFile("xdata.vec"); 
         nlr.Y.LoadFromFile("ydata.vec"); 
         MtxVecTee.DrawValues(nlr.X, nlr.Y, tChart1[0],0,1,false);
   


The y=y(x) plot shows an exponentially decaying pattern in the data. This suggests that some type of exponential function might be an appropriate model for the data. As suggested, the Exp(-a*x)/(b+c*x) function is a good choice. To perform a non-linear fit, we'll have to define the regression function and starting values for regression routine. We will determine starting values based on a grid of the parameter values. As suggested, in this case te grid is 0.1 to 1.0 in increments of 0.1, where the best parameter values minimize residual sum of squares In this case we will also "vectorize" the regression function so that all y(x) values will be calculated in one go (speed gain for large sample sizes). Below is the implementation:

project d32

// Scalar version => simpler, but slower for multiple x
function RegFun(const b: Array of TSample; x: TSample): TSample; overload;
begin
   Result := Exp(-b[0]*x)/(b[1]+x*b[2]);
end;

// Vector version => more complex, but faster for multiple x
procedure RegFun(B,X,Y: TVec); overload;
var wrk1:Vector;
begin
   // wrk1 = Exp(-B[0]*x)
   wrk1.Normalize(X,0.0,-1.0/B.Values[0]);
   wrk1.Exp;
   // 1/(B[1]+B[2]*x)
   Y.Copy(X);
   Y.Mul(B.Values[2]);
   Y.Add(B.Values[1]);
   Y.Inv;
   // Y = Exp(-B[0]*x)/(B[1]+B[2]*x)
   Y.Mul(wrk1);
end;
procedure InitialEstimates(X,Y: TVec; B: TVec);
var rss, rssmin: TSample;
   i: Integer;
   parval: TSample;
   wrk1,Residuals,YCalc: Vector;
begin
   wrk1.Size(b);
   wrk1.SetZero;
   b.Copy(wrk1);
   rssmin := 1.0E+10;
   // iterate for all parameters
   for i := 0 to b.Length - 1 do
   begin
      // evaluate on interval [0,1], step = 0.1
      parval := 0.0;
      while (parval<=1.0) do
      begin
         wrk1.Values[i] := parval;
         // evaluate function (vectorized version => faster)
         RegFun(wrk1,X,YCalc);
         // calculate RSS
         Residuals.Sub(YCalc,Y);
         rss := Residuals.SumOfSquares;
         parval := parval + 0.1;
         // minimal rss ?
         if (rss<rssmin) then
            begin
               b.Copy(wrk1);
               rssmin := rss;
            end;
      end;
   end;
end;

project cpp

// Scalar version => simpler, but slower for multiple x
void double RegFun(const double* b, double x);
 {
   return Math::Exp(-b[0]*x)/(b[1]+x*b[2]);
 }
// Vector version => more complex, but faster for multiple x
 void RegFun(Vector B,Vector X,Vector Y);
 {
   Vector wrk1;
   // wrk1 = Exp(-B[0]*x)
   wrk1->Normalize(X,0.0,-1.0/B->Values[0]);
   wrk1->Exp();
   // 1/(B[1]+B[2]*x)
   Y->Copy(X);
   Y->Mul(B->Values[2]);
   Y->Add(B->Values[1]);
   Y->Inv();
   // Y = Exp(-B[0]*x)/(B[1]+B[2]*x)
   Y->Mul(wrk1);
 }
void InitialEstimates(Vector X,Vector* Y, Vector B);
 {
   Vector residuals, wrk2, ycalc;
   wrk1->Size(b);
   wrk1->SetZero();
   b->Copy(wrk1);
   double rssmin = 1.0e+10;
   double parval,rss;
   // iterate for all parameters
   (for int i=0; iLength; i++)
   {
      // evaluate on interval [0,1], step = 0.1
      parval = 0.0;
      while (parval<=1.0)
      {
         wrk1->Values[i] = parval;
         // evaluate function (vectorized version => faster)
         RegFun(wrk1,X,ycalc);
         // calculate RSS
         residuals->Sub(ycalc,Y);
         rss = Residuals->SumOfSquares();
         parval += 0.1;
         // minimal rss ?
         if (rss<rssmin)
         {
            b->Copy(wrk1);
            rssmin = rss;
         }
      }
   }
}

project cs

// Scalar version
double RegFun(double[] b, double x)
{
   return Math.Exp(-b[0]*x)/(b[1]+b[2]*x);
}

// Vectorized version
void RegFun(TVec B, TVec X, TVec Y)
{
   Vector wrk1 = new Vector(0);
   // wrk1 = Exp(-B[0]*x)
   wrk1.Normalize(X,0.0,-1.0/B.Values[0]);
   wrk1.Exp();
   // 1/(B[1]+B[2]*x)
   Y.Copy(X);
   Y.Mul(B.Values[2]);
   Y.Add(B.Values[1]);
   Y.Inv();
   // Y = Exp(-B[0]*x)/(B[1]+B[2]*x)
   Y.Mul(wrk1);
}

void InitialEstimates(TVec X, TVec Y, TVec B)
{
   Vector wrk1 = new Vector(0);
   Vector residuals = new Vector(0);
   Vector ycalc = new Vector(0);
   wrk1.Size(B);
   wrk1.SetZero();
   B.Copy(wrk1);
   double rssmin = 1.0E+10;
   double rss, parval;
   for (int i=0; i<B.Length; i++)
   {
      parval = 0.0;
      while (parval<=1.0)
      {
         wrk1.Values[i] = parval;
         // evaluate function (vectorized version => faster)
         RegFun(wrk1,X,ycalc);
         // calculate RSS
         residuals.Sub(ycalc,Y);
         rss = residuals.SumOfSquares();
         parval += 0.1;
         // minimal rss ?
         if (rss<rssmin)
         {
            B.Copy(wrk1);
            rssmin = rss;
         }
      }
   }
}

3) Perform non-linear regression on data: Now that we have the regression function and initial estimates for regression coefficients, we can perform the actual non-linear regression.

project d32

procedure FitModel;
   var nlr: TMtxNonLinReg; 
   residuals: Vector;
   rss: TSample;
begin
   nlr := TMtxNonLinReg.Create(nil);
   try
      // load data
      nlr.X.LoadFromFile('x.vec');
      nlr.Y.LoadFromFile('y.vec');
      // Define and connect to regression function
      nlr.B.Length := 3;
      nlr.RegressFunction := RegFun;
 
      // initial values for coefficients
      InitialEstimates(nlr.X, nlr.Y,nlr.B);

      // Perform non-linear regression
      nlr.Recalc;
      // Calculate residuals
      residuals.Sub(nlr.YCalc,nlr.Y);
      rss := residuals.SumOfSquares;
   finally
      nlr.Free;
   end;
end;

project cpp

void FitModel(void);
{
   Vector residuals;
   TMtxNonLinReg* nlr = new TMtxNonLinReg(NULL);
   try
   {
      nlr->LoadFromFile("x.vec");
      nlr->LoadFromFile("y.vec");
      // connect to regression function
      nlr->RegresFun = RegFun;

      // initial estimates for coefficients
      nlr->B->Length = 3;
      InitialEstimates(nlr->X,nlr->Y,nlr->B);

      // perform non-linear regression
      nlr->Recalc();
      residuals->Sub(nlr->YCalc,nlr->Y);
      double rss = residuals->SumOfSquares();
   }
   __finally
   { 
      nlr->Free();
   }
}

project cs

private void FitModel()
{
   TMtxNonLinReg nlr = new TMtxNonLinReg(this);
   Vector residuals = new Vector(0);
   // load data
   nlr.X.LoadFromFile(@".\Data\x.vec");
   nlr.Y.LoadFromFile(@".\Data\y.vec");
   // connect to regression function
   nlr.RegressFun += new Dew.Stats.TRegressFun(RegFun);
   // initial values for coefficients
   nlr.B.Length = 3;
   InitialEstimates(nlr.X,nlr.Y,nlr.B);

   // Perform non-linear regression
   nlr.Recalc();
   // Calculate residuals
   residuals.Sub(nlr.YCalc,nlr.Y);
   double rss = residuals.SumOfSquares();
} 

Additional things to try :

  • Try transforming the y values before applying the non-linear regression. Suggested transformations: sqrt(y), ln(y), 1.0/y. This can be done with a single line of code class: nlr.Y.Sqrt();
  • Perform a weigted regression on the same data. Define weights as specified at NIST pages.
  • Plot data, fitted data and residuals by using DrawValues routine.
  • Plot normal probability plot of residuals.
  • Plot histogram of residuals.
  • Compare the results with quoted results at NIST pages.