Dew Stats for .NET
Statistics.GOFKolmogorov Method (TVec, THypothesisResult, TSample, TVec, TVec, THypothesisType, TSample)

One-sample Kolmogorov-Smirnov goodness-of-fit (GOF) test.

C#
public double GOFKolmogorov(TVec Data, ref THypothesisResult hRes, ref double Signif, TVec CDFx, TVec CDFy, THypothesisType hType, double Alpha);
Parameters
Parameters 
Description 
Data 
Samples to be tested. 
hRes 
Returns the result of testing the null hypothesis (the default assumption is that the data comes from the specified distribution). 
Signif 
Returns the significance probability: the probability of observing the given result by chance, given that the null hypothesis is true. 
CDFx 
Defines set of possible x values. 
CDFy 
Defines set of hypothesized CDF values, evaluated at CDFx. 
hType 
Defines the type of the null hypothesis (left-tailed, right-tailed or two-tailed). 
Alpha 
Defines the desired significance level. If the significance probability (Signif) is below the desired significance level (Alpha), the null hypothesis is rejected. 
Returns

The K-S test statistic.

Performs the one-sample Kolmogorov-Smirnov (K-S) goodness-of-fit test. The K-S test is used to decide whether a sample comes from a population with a specific distribution. The test is based on the empirical cumulative distribution function (ECDF). An attractive feature of this test is that the distribution of the K-S test statistic itself does not depend on the underlying cumulative distribution function being tested. Another advantage is that it is an exact test, whereas the chi-square goodness-of-fit test depends on an adequate sample size for its approximations to be valid. Despite these advantages, the K-S test has several important limitations:

  • It only applies to continuous distributions.
  • It tends to be more sensitive near the center of the distribution than at the tails.
  • Perhaps the most serious limitation is that the distribution must be fully specified. That is, if location, scale, and shape parameters are estimated from the data, the critical region of the K-S test is no longer valid. It typically must be determined by simulation.
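
To make the role of the ECDF concrete, the following is a minimal sketch of how a two-sided one-sample K-S statistic can be computed in plain C#. It does not use the Dew Stats library and is not necessarily how GOFKolmogorov is implemented; the class name KsSketch, the method name Statistic and the cdf delegate are hypothetical names introduced only for this illustration.

```csharp
using System;

public static class KsSketch
{
    // Two-sided one-sample K-S statistic: the largest vertical distance
    // between the ECDF of the data and the hypothesized CDF.
    // 'cdf' is a hypothetical delegate that evaluates the hypothesized CDF.
    public static double Statistic(double[] data, Func<double, double> cdf)
    {
        if (data == null || data.Length == 0)
            throw new ArgumentException("data must be non-empty", nameof(data));

        // Work on a sorted copy so the caller's array is left untouched.
        double[] x = (double[])data.Clone();
        Array.Sort(x);

        int n = x.Length;
        double d = 0.0;
        for (int i = 0; i < n; i++)
        {
            double f = cdf(x[i]);
            // The ECDF jumps from i/n to (i+1)/n at x[i]; check both sides.
            double above = (i + 1.0) / n - f;
            double below = f - (double)i / n;
            d = Math.Max(d, Math.Max(above, below));
        }
        return d;
    }
}
```

For example, KsSketch.Statistic(new[] { 0.7, 0.1, 0.4 }, u => u) compares three points against the uniform distribution on [0, 1] and returns approximately 0.3, the gap between the ECDF value 1 and the hypothesized CDF value 0.7 at the largest sample.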

 

If the CDFx and CDFy vectors are not defined, the Data values are compared with the standard normal distribution. If defined, the CDFx and CDFy vectors hold the x values and the corresponding CDF(x) values of the hypothesized distribution. In this case all Data values must lie within the [Min(CDFx), Max(CDFx)] interval. The K-S test assumes that CDFx and CDFy are specified in advance; the test is not very accurate if the CDFx and CDFy values are calculated from the Data values.
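
The requirement that all Data values lie within the tabulated range can be illustrated with a small sketch in plain C#. It evaluates a tabulated CDF at a data point by linear interpolation and rejects points outside [Min(CDFx), Max(CDFx)]. The page does not say how the library evaluates the hypothesized CDF between tabulated points, so linear interpolation is an assumption, as are the names TabulatedCdf and Evaluate and the requirement that cdfX is sorted in strictly ascending order.

```csharp
using System;

public static class TabulatedCdf
{
    // Evaluates a tabulated CDF at x by linear interpolation (an assumption,
    // not necessarily what the library does). cdfX must be strictly ascending.
    public static double Evaluate(double[] cdfX, double[] cdfY, double x)
    {
        if (cdfX == null || cdfY == null || cdfX.Length != cdfY.Length || cdfX.Length < 2)
            throw new ArgumentException("cdfX and cdfY must have the same length (at least 2)");

        int last = cdfX.Length - 1;
        // Data values outside the tabulated range cannot be evaluated.
        if (x < cdfX[0] || x > cdfX[last])
            throw new ArgumentOutOfRangeException(nameof(x), "x lies outside [Min(CDFx), Max(CDFx)]");

        int idx = Array.BinarySearch(cdfX, x);
        if (idx >= 0)
            return cdfY[idx]; // x coincides with a tabulated point

        int hi = ~idx;        // index of the first tabulated x greater than x
        int lo = hi - 1;
        double t = (x - cdfX[lo]) / (cdfX[hi] - cdfX[lo]);
        return cdfY[lo] + t * (cdfY[hi] - cdfY[lo]);
    }
}
```

With cdfX = { 0, 1, 2 } and cdfY = { 0, 0.5, 1 }, Evaluate returns 0.25 at x = 0.5 and throws for x = 3, which mirrors the range restriction described above.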

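In this example a sample is generated from the Normal(mu=2, sigma=1) distribution. A K-S test is then used to determine whether the sample comes from the standard normal distribution, which is the reference distribution when CDFx and CDFy are not defined. 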

 

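using Dew.Math;
using Dew.Stats;
using Dew.Stats.Units;
namespace Dew.Examples
{
  public class GOFKolmogorovExample
  {
    private void Example()
    {
      Vector d = new Vector(300);
      // Draw 300 samples from Normal(mu=2, sigma=1).
      Statistics.RandomNormal(2, 1, d, -1);
      // ref parameters must be assigned before the call.
      THypothesisResult hRes = default(THypothesisResult);
      double Signif = 0.0;

      // Assumption: passing null for CDFx and CDFy means "not defined",
      // so the data is compared with the standard normal distribution.
      double KS = Statistics.GOFKolmogorov(d, ref hRes, ref Signif, null, null,
        THypothesisType.htTwoTailed, 0.05);

      // Signif should be below 0.05, meaning the data does not come from
      // the standard normal distribution  =>  H0 is therefore rejected.
    }
  }
}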
Copyright (c) 1999-2010 by Dew Research. All rights reserved.