Isis 3 Programmer Reference
This class is used to approximate cumulative probability distributions of a stream of observations without storing the observations or having any a priori knowledge of the range of the data.
#include <StatCumProbDistDynCalc.h>
Public Member Functions | |
StatCumProbDistDynCalc (unsigned int nodes=20, QObject *parent=0) | |
Constructor sets up the class to start receiving data. | |
StatCumProbDistDynCalc (QXmlStreamReader *xmlReader, QObject *parent=0) | |
void | readStatistics (QXmlStreamReader *xmlReader) |
StatCumProbDistDynCalc (const StatCumProbDistDynCalc &other) | |
~StatCumProbDistDynCalc () | |
Destroys StatCumProbDistDynCalc object. | |
StatCumProbDistDynCalc & | operator= (const StatCumProbDistDynCalc &other) |
void | initialize () |
Initializer, resets the class to start its dynamic calculation anew. | |
void | setQuantiles (unsigned int nodes) |
void | validate () |
void | addObs (double obs) |
Values for the estimated quantile positions are updated as observations are added. | |
double | cumProb (double value) |
Provides the cumulative probability, that is, the proportion of the distribution that is less than or equal to the value given (according to the current estimate of the cumulative probability function). | |
double | value (double cumProb) |
Provides the value of the variable that has the given cumulative probability (according to the current estimate of the cumulative probability function). | |
double | max () |
Returns the maximum observation so far included in the dynamic calculation. | |
double | min () |
Returns the minimum observation so far included in the dynamic calculation. | |
void | save (QXmlStreamWriter &stream, const Project *project) const |
QDataStream & | write (QDataStream &stream) const |
QDataStream & | read (QDataStream &stream) |
Public Attributes | |
unsigned int | m_numberCells |
The number of cells or histogram bins that are being used to model the probability density function. | |
unsigned int | m_numberQuantiles |
The number of quantiles being used to model the probability density function. | |
unsigned int | m_numberObservations |
The number of observations. Note that this changes dynamically as observations are added. | |
QList< double > | m_quantiles |
The target quantiles being modeled, between 0 and 1. | |
QList< double > | m_observationValues |
The calculated values of the quantiles. Note that these change dynamically as observations are added. | |
QList< double > | m_idealNumObsBelowQuantile |
The ideal number of observations that should be less than or equal to the value of the corresponding quantiles. Note that this changes dynamically as observations are added. | |
QList< int > | m_numObsBelowQuantile |
The actual number of observations that are less than or equal to the value of the corresponding quantiles. Note that this changes dynamically as observations are added. | |
This class is used to approximate cumulative probability distributions of a stream of observations without storing the observations or having any a priori knowledge of the range of the data. It uses the algorithm described in "The P^2 Algorithm for Dynamic Calculation of Quantiles and Histograms Without Storing Observations," Raj Jain and Imrich Chlamtac, Communications of the ACM, October 1985.
A finite set of evenly spaced quantiles is dynamically updated as more observations are added. The number of quantiles is set in the constructor and defaults to 20. After sufficient data points (number of observations >> number of quantiles to track), the class provides cumulative probability as a function of value, or vice versa. Thus it can be used to build histograms or to find any number of discrete quantiles. Specific points on the function are evaluated by fitting piecewise parabolic functions to the three nearest adjacent nodes.
Performance of the algorithm is within a few percent error for most of the distribution (given sufficient data); however, care should be taken if the points to be queried are within 200/(numberOfQuantiles-1)% of the edges of the distribution. With the default of 20 quantiles, that margin is about 10.5%. Near the edges the individual quantiles are still well calculated, but the piecewise parabolic function doesn't always fit the tails well, so interpolated points are less reliable.
Development note: Two possible ways to improve the fitting of the tails: calculate more densely placed quantiles near the edges, or use exponential regression (or some other alternative, perhaps adaptively selected).
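To make the node layout concrete, here is a minimal sketch. It is not the ISIS implementation. It assumes the tracked quantiles are spaced evenly over [0, 1] with both endpoints included, so that the number of cells is one less than the number of nodes. The names evenlySpacedQuantiles and edgeCautionPercent are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of ISIS: evenly spaced target quantiles on
// [0, 1], assuming both the 0 and 1 quantiles are tracked.
std::vector<double> evenlySpacedQuantiles(unsigned int nodes) {
  assert(nodes >= 2);  // need at least the two end nodes
  const unsigned int cells = nodes - 1;
  std::vector<double> quantiles;
  quantiles.reserve(nodes);
  for (unsigned int i = 0; i < nodes; i++) {
    quantiles.push_back(static_cast<double>(i) / cells);
  }
  return quantiles;
}

// Hypothetical helper: the margin quoted in the class description,
// 200/(numberOfQuantiles-1), expressed in percent.
double edgeCautionPercent(unsigned int nodes) {
  assert(nodes >= 2);
  return 200.0 / (nodes - 1);
}
```

With the default of 20 nodes this gives 19 cells, a spacing of 1/19 between target quantiles, and a caution margin of 200/19 percent.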
2012-03-23 Orrin Thomas - Original Version
2014-07-19 Jeannie Backer - Added QDataStream >> and << operator methods. Brought code closer to ISIS standards. Updated unitTest to include these methods.
2014-09-11 Jeannie Backer - Added XML write/read capabilities. Fixed bug in cumProb() method for previously untested lines (case where the given value is closest to the last quantile value). Renamed member variables for clarity.
Definition at line 63 of file StatCumProbDistDynCalc.h.
Isis::StatCumProbDistDynCalc::StatCumProbDistDynCalc | ( | unsigned int | nodes = 20, |
QObject * | parent = 0 ) |
Constructor sets up the class to start receiving data.
[in] | nodes | – the number of evenly spaced quantiles that will be dynamically tracked |
Definition at line 37 of file StatCumProbDistDynCalc.cpp.
References initialize().
Isis::StatCumProbDistDynCalc::StatCumProbDistDynCalc | ( | QXmlStreamReader * | xmlReader, |
QObject * | parent = 0 ) |
Definition at line 44 of file StatCumProbDistDynCalc.cpp.
Isis::StatCumProbDistDynCalc::StatCumProbDistDynCalc | ( | const StatCumProbDistDynCalc & | other | ) |
Definition at line 115 of file StatCumProbDistDynCalc.cpp.
Isis::StatCumProbDistDynCalc::~StatCumProbDistDynCalc | ( | ) |
Destroys StatCumProbDistDynCalc object.
Definition at line 130 of file StatCumProbDistDynCalc.cpp.
void Isis::StatCumProbDistDynCalc::addObs | ( | double | obs | ) |
Values for the estimated quantile positions are updated as observations are added.
[in] | obs | – the individual observation used to dynamically readjust the cumulative probability distribution |
Definition at line 455 of file StatCumProbDistDynCalc.cpp.
References m_idealNumObsBelowQuantile, m_numberCells, m_numberObservations, m_numberQuantiles, m_numObsBelowQuantile, m_observationValues, and m_quantiles.
Referenced by Isis::BundleResults::addProbabilityDistributionObservation(), and Isis::BundleResults::addResidualsProbabilityDistributionObservation().
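For orientation, the marker adjustment step of the published P^2 algorithm (Jain and Chlamtac) can be sketched as below. This is an illustration of the paper's formulas under stated assumptions, not a copy of the ISIS code, which may differ in detail. Here q stands for the current quantile estimates and n for the observation counts at or below each estimate; these presumably correspond to m_observationValues and m_numObsBelowQuantile, but that mapping is an assumption. The function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Piecewise-parabolic prediction of marker i's height when its position
// moves by d (+1 or -1). q = marker heights, n = marker positions.
double parabolicHeight(const std::vector<double> &q, const std::vector<int> &n,
                       int i, int d) {
  double towardNext = (n[i] - n[i - 1] + d) * (q[i + 1] - q[i]) / (n[i + 1] - n[i]);
  double towardPrev = (n[i + 1] - n[i] - d) * (q[i] - q[i - 1]) / (n[i] - n[i - 1]);
  return q[i] + d * (towardNext + towardPrev) / (n[i + 1] - n[i - 1]);
}

// Linear fallback toward the neighbor in direction d.
double linearHeight(const std::vector<double> &q, const std::vector<int> &n,
                    int i, int d) {
  return q[i] + d * (q[i + d] - q[i]) / (n[i + d] - n[i]);
}

// Moves interior marker i one position in direction d. The parabolic
// prediction is used only if it keeps the heights strictly ordered;
// otherwise the linear prediction is used.
void adjustMarker(std::vector<double> &q, std::vector<int> &n, int i, int d) {
  assert(d == 1 || d == -1);
  assert(q.size() == n.size());
  assert(i > 0 && i + 1 < static_cast<int>(q.size()));
  double candidate = parabolicHeight(q, n, i, d);
  if (candidate <= q[i - 1] || candidate >= q[i + 1]) {
    candidate = linearHeight(q, n, i, d);
  }
  q[i] = candidate;
  n[i] += d;
}
```

In the paper, a marker is moved only when its actual position differs from its ideal position by at least one and the neighboring marker leaves room for the move; that decision logic is omitted from this sketch.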
double Isis::StatCumProbDistDynCalc::cumProb | ( | double | value | ) |
Provides the cumulative probability, that is, the proportion of the distribution that is less than or equal to the value given (according to the current estimate of the cumulative probability function).
[in] | value | – the upper bound of values considered in the cumulative probability calculation |
IsisProgrammerError | – StatCumProbDistDynCalc will return no data until at least m_numberQuantiles observations have been added |
Definition at line 360 of file StatCumProbDistDynCalc.cpp.
References m_numberCells, m_numberQuantiles, m_observationValues, m_quantiles, and value().
Referenced by value().
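The class description states that points between nodes are evaluated by fitting a parabola to the three nearest nodes. A minimal sketch of that idea, using Lagrange interpolation through three (value, cumulative probability) pairs, follows. The function names are hypothetical, the selection of the three nearest nodes is left out, and the final clamp to [0, 1] is an added safeguard that the ISIS code is not known to apply.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: evaluates at x the parabola passing through three
// nodes given as (value[k], prob[k]), using the Lagrange form.
double parabolaThroughNodes(const double value[3], const double prob[3], double x) {
  assert(value[0] < value[1] && value[1] < value[2]);  // distinct, ordered nodes
  double w0 = (x - value[1]) * (x - value[2]) /
              ((value[0] - value[1]) * (value[0] - value[2]));
  double w1 = (x - value[0]) * (x - value[2]) /
              ((value[1] - value[0]) * (value[1] - value[2]));
  double w2 = (x - value[0]) * (x - value[1]) /
              ((value[2] - value[0]) * (value[2] - value[1]));
  return prob[0] * w0 + prob[1] * w1 + prob[2] * w2;
}

// Hypothetical helper: interpolated cumulative probability, clamped to [0, 1]
// (the clamp is an assumption of this sketch).
double interpolatedCumProb(const double value[3], const double prob[3], double x) {
  return std::min(1.0, std::max(0.0, parabolaThroughNodes(value, prob, x)));
}
```

A parabola can overshoot outside the range of its three nodes, which is one way to see why interpolated results near the tails deserve extra care.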
void Isis::StatCumProbDistDynCalc::initialize | ( | ) |
Initializer, resets the class to start its dynamic calculation anew.
Definition at line 158 of file StatCumProbDistDynCalc.cpp.
References m_idealNumObsBelowQuantile, m_numberCells, m_numberObservations, m_numberQuantiles, m_numObsBelowQuantile, m_observationValues, and m_quantiles.
Referenced by StatCumProbDistDynCalc().
double Isis::StatCumProbDistDynCalc::max | ( | ) |
Returns the maximum observation so far included in the dynamic calculation.
IsisProgrammerError | – StatCumProbDistDynCalc will return no data until the number of observations added matches the number of quantiles (i.e. number of nodes) selected. |
Definition at line 204 of file StatCumProbDistDynCalc.cpp.
References m_numberCells, and m_observationValues.
Referenced by Isis::BundleSolutionInfo::outputHeader().
double Isis::StatCumProbDistDynCalc::min | ( | ) |
Returns the minimum observation so far included in the dynamic calculation.
IsisProgrammerError | – StatCumProbDistDynCalc will return no data until the number of observations added matches the number of quantiles (i.e. number of nodes) selected. |
Definition at line 221 of file StatCumProbDistDynCalc.cpp.
References m_observationValues.
Referenced by Isis::BundleSolutionInfo::outputHeader().
StatCumProbDistDynCalc & Isis::StatCumProbDistDynCalc::operator= | ( | const StatCumProbDistDynCalc & | other | ) |
Definition at line 135 of file StatCumProbDistDynCalc.cpp.
QDataStream & Isis::StatCumProbDistDynCalc::read | ( | QDataStream & | stream | ) |
Definition at line 586 of file StatCumProbDistDynCalc.cpp.
void Isis::StatCumProbDistDynCalc::readStatistics | ( | QXmlStreamReader * | xmlReader | ) |
Definition at line 49 of file StatCumProbDistDynCalc.cpp.
void Isis::StatCumProbDistDynCalc::save | ( | QXmlStreamWriter & | stream, |
const Project * | project ) const |
Definition at line 552 of file StatCumProbDistDynCalc.cpp.
void Isis::StatCumProbDistDynCalc::setQuantiles | ( | unsigned int | nodes | ) |
Definition at line 168 of file StatCumProbDistDynCalc.cpp.
void Isis::StatCumProbDistDynCalc::validate | ( | ) |
Definition at line 615 of file StatCumProbDistDynCalc.cpp.
double Isis::StatCumProbDistDynCalc::value | ( | double | cumProb | ) |
Provides the value of the variable that has the given cumulative probability (according to the current estimate of the cumulative probability function).
[in] | cumProb | – cumulative probability, domain [0, 1] |
IsisProgrammerError | – StatCumProbDistDynCalc will return no data until the number of observations added matches the number of quantiles (i.e. number of nodes) selected. |
IsisProgrammerError | – Invalid cumulative probability passed in to StatCumProbDistDynCalc::value(double cumProb). Must be on the domain [0, 1]. |
Definition at line 243 of file StatCumProbDistDynCalc.cpp.
References cumProb(), m_numberCells, m_numberQuantiles, m_observationValues, m_quantiles, Isis::IException::Programmer, and Isis::toString().
Referenced by cumProb(), Isis::BundleSolutionInfo::outputHeader(), and Isis::BundleResults::printMaximumLikelihoodTierInformation().
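The two error conditions documented for value() can be pictured as a pair of precondition checks. The sketch below is an assumption-laden illustration: it uses standard C++ exceptions in place of the ISIS programmer error, the function name checkValueQuery is hypothetical, and the order of the checks simply follows the order in which the errors are listed above.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: rejects a value() query that the documentation says
// would raise a programmer error.
void checkValueQuery(double cumProb, unsigned int numObservations,
                     unsigned int numQuantiles) {
  assert(numQuantiles >= 2);  // a distribution needs at least two nodes
  if (numObservations < numQuantiles) {
    throw std::logic_error("No data available until the number of observations "
                           "reaches the number of quantiles.");
  }
  // Written as a negated range test so that NaN is rejected as well.
  if (!(cumProb >= 0.0 && cumProb <= 1.0)) {
    throw std::domain_error("Invalid cumulative probability " +
                            std::to_string(cumProb) +
                            "; must be on the domain [0, 1].");
  }
}
```

A caller would run such a check before interpolating, so that a query made too early or outside [0, 1] fails loudly instead of returning a meaningless number.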
QDataStream & Isis::StatCumProbDistDynCalc::write | ( | QDataStream & | stream | ) | const |
Definition at line 575 of file StatCumProbDistDynCalc.cpp.
QList<double> Isis::StatCumProbDistDynCalc::m_idealNumObsBelowQuantile |
The ideal number of observations that should be less than or equal to the value of the corresponding quantiles. Note that this changes dynamically as observations are added.
Definition at line 108 of file StatCumProbDistDynCalc.h.
Referenced by addObs(), and initialize().
unsigned int Isis::StatCumProbDistDynCalc::m_numberCells |
The number of cells or histogram bins that are being used to model the probability density function.
Definition at line 93 of file StatCumProbDistDynCalc.h.
Referenced by addObs(), cumProb(), initialize(), max(), and value().
unsigned int Isis::StatCumProbDistDynCalc::m_numberObservations |
The number of observations. Note that this changes dynamically as observations are added.
Definition at line 100 of file StatCumProbDistDynCalc.h.
Referenced by addObs(), and initialize().
unsigned int Isis::StatCumProbDistDynCalc::m_numberQuantiles |
The number of quantiles being used to model the probability density function.
This value is one more than the number of cells (i.e., m_numberQuantiles = m_numberCells + 1).
Definition at line 96 of file StatCumProbDistDynCalc.h.
Referenced by addObs(), cumProb(), initialize(), and value().
QList<int> Isis::StatCumProbDistDynCalc::m_numObsBelowQuantile |
The actual number of observations that are less than or equal to the value of the corresponding quantiles. Note that this changes dynamically as observations are added.
Definition at line 113 of file StatCumProbDistDynCalc.h.
Referenced by addObs(), and initialize().
QList<double> Isis::StatCumProbDistDynCalc::m_observationValues |
The calculated values of the quantiles. Note that these change dynamically as observations are added.
Definition at line 105 of file StatCumProbDistDynCalc.h.
Referenced by addObs(), cumProb(), initialize(), max(), min(), and value().
QList<double> Isis::StatCumProbDistDynCalc::m_quantiles |
The target quantiles being modeled, between 0 and 1.
Definition at line 103 of file StatCumProbDistDynCalc.h.
Referenced by addObs(), cumProb(), initialize(), and value().