StatProbPlots.StatQQPlot Function

Constructs the Quantile-Quantile probability plot.

Pascal

procedure StatQQPlot(const XData: TVec; const YData: TVec; const XDrawVec: TVec; const YDrawVec: TVec; out MinX: double; out MaxX: double; out MinY: double; out MaxY: double; const XDataSorted: boolean = true; const YDataSorted: boolean = true); overload;

File

StatProbPlots

Parameters

Parameters	Description
XData	X Data (first dataset).
YData	Y Data (second dataset).
XDrawVec	Returns the q-q plot horizontal values to be drawn: the estimated quantiles of the XData vector, which in this case are the sorted XData values.
YDrawVec	Returns the q-q plot vertical values to be drawn: the estimated quantiles of the YData vector, which in this case are the sorted YData values.
MinX	Returns the reference line start X point, the 25th percentile of XDrawVec. This value is used by the Dew.Stats.Tee.ProbabilityPlot series.
MaxX	Returns the reference line end X point, the 75th percentile of XDrawVec. This value is used by the Dew.Stats.Tee.ProbabilityPlot series.
MinY	Returns the reference line start Y point, the 25th percentile of YDrawVec. This value is used by the Dew.Stats.Tee.ProbabilityPlot series.
MaxY	Returns the reference line end Y point, the 75th percentile of YDrawVec. This value is used by the Dew.Stats.Tee.ProbabilityPlot series.
XDataSorted	If true, the algorithm assumes XData is already sorted in ascending order. If XData is not sorted, you must set this parameter to false so that the algorithm sorts the data internally.
YDataSorted	If true, the algorithm assumes YData is already sorted in ascending order. If YData is not sorted, you must set this parameter to false so that the algorithm sorts the data internally.

Description

Constructs the Quantile-Quantile (QQ) chart. Use TStatProbSeries to plot the constructed values. The QQ chart is a graphical technique for determining whether two data sets come from populations with a common distribution. Specifically, the QQ chart is a plot of the quantiles of the first data set against the quantiles of the second data set. By a quantile, we mean the fraction (or percent) of points below the given value. That is, the 0.3 (or 30%) quantile is the point at which 30% of the data fall below and 70% fall above that value. A 45-degree reference line is also plotted. If the two sets come from populations with the same distribution, the points should fall approximately along the reference line. The greater the departure from this reference line, the greater the evidence that the two data sets come from populations with different distributions.
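
The quantile definition above can be illustrated with a small standalone sketch. It uses only the C++ standard library, and fractionBelow is a hypothetical helper name that is not part of StatProbPlots. It counts the fraction of sample values that fall strictly below a given value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a StatProbPlots routine): returns the fraction
// of sample values that lie strictly below `value`.
double fractionBelow(const std::vector<double>& data, double value)
{
    assert(!data.empty());
    std::size_t count = 0;
    for (double v : data) {
        if (v < value) {
            ++count;
        }
    }
    return static_cast<double>(count) / static_cast<double>(data.size());
}
```

For the sample 1, 2, ..., 10, the value 3.5 has three of the ten points below it, so in this sense 3.5 sits at the 0.3 (30%) quantile of that sample.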

The advantages of the q-q plot are:

The sample sizes do not need to be equal.
Many distributional aspects can be simultaneously tested. For example, shifts in location, shifts in scale, changes in symmetry, and the presence of outliers can all be detected from this plot. In particular, if the two data sets come from populations whose distributions differ only by a shift in location, the points should lie along a straight line that is displaced either up or down from the 45-degree reference line.
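
The location-shift case can be checked numerically. The following is a minimal sketch, assuming two samples of equal size. qqOffsets is a hypothetical helper, not a library function. It sorts both samples and returns the vertical distance of each q-q point from the 45-degree line y = x.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for two samples of equal size, pairs the sorted
// values and returns y[i] - x[i], the vertical distance of each q-q
// point from the 45-degree line y = x.
std::vector<double> qqOffsets(std::vector<double> x, std::vector<double> y)
{
    assert(x.size() == y.size());
    std::sort(x.begin(), x.end());
    std::sort(y.begin(), y.end());
    std::vector<double> offsets(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        offsets[i] = y[i] - x[i];
    }
    return offsets;
}
```

If every y value equals an x value plus a constant, all offsets equal that constant, which is the displaced straight line described above. Offsets that grow or shrink along the plot would instead point to a difference in scale or shape.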

The QQ Chart is similar to a probability plot. For a probability plot, the quantiles for one of the data samples are replaced with the quantiles of a theoretical distribution. The QQ chart is used to answer the following questions:

Do two data sets come from populations with a common distribution?
Do two data sets have common location and scale?
Do two data sets have similar distributional shapes?
Do two data sets have similar tail behavior?

When there are two data samples, it is often desirable to know if the assumption of a common distribution is justified. If so, then location and scale estimators can pool both data sets to obtain estimates of the common location and scale. If the two samples do differ, it is also useful to gain some understanding of the differences. The QQ Chart can provide more insight into the nature of the difference than analytical methods such as the Statistics.GOFChi2Test and Statistics.GOFKolmogorov two-sample tests.

If the data sets have the same size, the q-q plot is essentially a plot of sorted X against sorted Y. If the data sets are not of equal size, the quantiles are usually picked to correspond to the sorted values from the smaller data set and then the quantiles for the larger data set are interpolated.
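
The unequal-size case can be sketched as follows. This is an illustration under stated assumptions, not the routine's internal algorithm: the plotting positions p = i/(n-1) and the linear interpolation between order statistics are choices made for this example, and interpolatedQuantile and qqPairs are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Quantile of an ascending-sorted sample at probability p in [0, 1],
// linearly interpolated between neighbouring order statistics
// (assumed convention: p = 0 is the minimum, p = 1 is the maximum).
double interpolatedQuantile(const std::vector<double>& sorted, double p)
{
    assert(!sorted.empty() && p >= 0.0 && p <= 1.0);
    const double pos = p * static_cast<double>(sorted.size() - 1);
    const std::size_t lo = static_cast<std::size_t>(pos);
    const std::size_t hi = std::min(lo + 1, sorted.size() - 1);
    const double frac = pos - static_cast<double>(lo);
    return sorted[lo] + frac * (sorted[hi] - sorted[lo]);
}

// Builds (x quantile, y quantile) pairs. The smaller sample contributes
// its sorted values directly; the larger sample is interpolated at the
// same probabilities.
std::vector<std::pair<double, double>> qqPairs(std::vector<double> x,
                                               std::vector<double> y)
{
    assert(!x.empty() && !y.empty());
    std::sort(x.begin(), x.end());
    std::sort(y.begin(), y.end());
    const std::size_t n = std::min(x.size(), y.size());
    std::vector<std::pair<double, double>> pairs;
    pairs.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double p = (n > 1)
            ? static_cast<double>(i) / static_cast<double>(n - 1)
            : 0.5;
        const double qx = (x.size() == n) ? x[i] : interpolatedQuantile(x, p);
        const double qy = (y.size() == n) ? y[i] : interpolatedQuantile(y, p);
        pairs.emplace_back(qx, qy);
    }
    return pairs;
}
```

When both samples have the same size, the function reduces to pairing sorted X with sorted Y, which matches the equal-size case described above.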

How is a two-dataset Q-Q plot constructed?

If needed, the XData or YData values are sorted (XDataSorted or YDataSorted parameter set to false).
Abscissa drawing values are formed from the estimated XData quantiles, that is, the ordered XData values. After calculation they are copied to XDrawVec. The TMtxVecBase.Length and TMtxVec.Complex properties of XDrawVec are adjusted automatically.
Ordinate drawing values are formed from the estimated YData quantiles, that is, the ordered YData values. After calculation they are copied to YDrawVec. The TMtxVecBase.Length and TMtxVec.Complex properties of YDrawVec are adjusted automatically.
The 25th and 75th percentile points of XDrawVec and YDrawVec are used to construct a reference line. Departures of the drawn points from this straight line indicate that XData and YData do not come from the same distribution.
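
The last step can be sketched in standalone C++. Assuming the reference line passes through (MinX, MinY) and (MaxX, MaxY), the four returned values determine its slope and intercept. ReferenceLine, referenceLine and residual are hypothetical names introduced for this illustration and are not part of the library.

```cpp
#include <cassert>

// Hypothetical type: a straight line y = slope * x + intercept.
struct ReferenceLine {
    double slope;
    double intercept;
};

// Line through the quartile points (minX, minY) and (maxX, maxY).
// The argument order follows the MinX, MaxX, MinY, MaxY output order
// of StatQQPlot.
ReferenceLine referenceLine(double minX, double maxX, double minY, double maxY)
{
    assert(maxX != minX); // a vertical line has no finite slope
    const double slope = (maxY - minY) / (maxX - minX);
    return ReferenceLine{slope, minY - slope * minX};
}

// Signed vertical departure of a drawn point (x, y) from the line.
double residual(const ReferenceLine& line, double x, double y)
{
    return y - (line.slope * x + line.intercept);
}
```

Residuals near zero across the whole plot are consistent with a common distribution up to location and scale. Systematic residuals in the tails suggest differing tail behavior.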

See Also

Dew.Stats.Tee.ProbabilityPlot

Example

The following code constructs a Q-Q plot from two random samples and then plots the calculated values.

Uses MtxExpr, StatProbPlots, StatSeries, Math387, MtxVecTee;
procedure Example(Series1: TStatProbSeries);
var XData, YData, XVec, YVec: Vector;
  X1,Y1,X2,Y2: double;
begin
  // generate random values for both data vectors
  XData.Size(100);
  YData.Size(100);
  XData.RandGauss(0.0,1.0); // standard normal distribution
  YData.RandGauss(0.0,1.0); // standard normal distribution
  // now construct the QQ plot; neither sample is sorted,
  // so both XDataSorted and YDataSorted are false
  StatQQPlot(XData,YData,XVec,YVec,X1,X2,Y1,Y2,false,false);
  With Series1 do
  begin
    MinX := X1;
    MinY := Y1;
    MaxX := X2;
    MaxY := Y2;
  end;
  DrawValues(XVec,YVec,Series1);
end;

#include "Math387.hpp"
#include "MtxExpr.hpp"
#include "StatProbPlots.hpp"
#include "StatSeries.hpp"
#include "MtxVecTee.hpp"
void __fastcall Example(TStatProbSeries * Series1)
{
  sVector xdata,ydata,xvec,yvec;
  double x1,x2,y1,y2;

  xdata.Size(100,false);
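  xdata.RandGauss(0.0,1.0); // standard normal distribution
  ydata.Size(100,false);
  ydata.RandGauss(1.0,2.3); // normal distribution with parameters 1.0 and 2.3 (not standard)

  // neither sample is sorted, so both XDataSorted and YDataSorted are false
  StatQQPlot(xdata,ydata,xvec,yvec,x1,x2,y1,y2,false,false);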
  Series1->MinX = x1;
  Series1->MaxX = x2;
  Series1->MinY = y1;
  Series1->MaxY = y2;
  DrawValues(xvec,yvec,Series1);
}
