Ignored Distribution

The IgnoredDistribution and IgnoredEstimator classes let you either ignore particular features or supply distributions that are known in advance. This makes it possible to hold one part of a model fixed during estimation, for example a single component of a composite distribution.
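To illustrate the idea of holding one component fixed, here is a minimal sketch in plain Python. It does not use the pysp API: Gaussian, Fixed, and fit_composite are hypothetical stand-ins invented for this example, and the real classes may differ in their details.

```python
import math
from statistics import fmean


class Gaussian:
    """Hypothetical stand-in for a one-dimensional distribution."""

    def __init__(self, mu, sigma=1.0):
        self.mu, self.sigma = mu, sigma

    def log_density(self, x):
        z = (x - self.mu) / self.sigma
        return -0.5 * z * z - math.log(self.sigma) - 0.5 * math.log(2.0 * math.pi)

    def fit(self, xs):
        # Re-estimate the mean; sigma is kept as-is for brevity.
        return Gaussian(fmean(xs), self.sigma)


class Fixed:
    """Toy analogue of an ignored component: it evaluates, but never re-fits."""

    def __init__(self, dist):
        self.dist = dist

    def log_density(self, x):
        return self.dist.log_density(x)

    def fit(self, xs):
        # The data is ignored and the wrapper is returned unchanged.
        return self


def fit_composite(components, rows):
    # Fit each component on its own column of the data.
    columns = list(zip(*rows))
    return [c.fit(col) for c, col in zip(components, columns)]


rows = [(5.0, 1.0), (6.0, 3.0), (7.0, 5.0)]
model = [Fixed(Gaussian(0.0)), Gaussian(0.0)]
fitted = fit_composite(model, rows)
print(fitted[0].dist.mu, fitted[1].mu)
```

In this sketch the first component keeps its mean of 0.0 even though its column of data is centered near 6, while the second component's mean moves to 3.0.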

IgnoredDistribution

class pysp.stats.ignored.IgnoredDistribution(dist, name=None, keys=None)

Wraps a distribution so that it can be held fixed (ignored) during estimation.

dist

Distribution to be ignored.

Type:

SequenceEncodableProbabilityDistribution

name

Set name for object instance.

Type:

Optional[str]

keys

Keys for distribution (placeholder only).

Type:

Optional[str]

__init__(dist, name=None, keys=None)

IgnoredDistribution object.

Parameters:
  • dist (Optional[SequenceEncodableProbabilityDistribution]) – Distribution to be ignored.

  • name (Optional[str]) – Set name for object instance.

  • keys (Optional[str]) – Keys for distribution (placeholder only).

density(x)

Evaluate the density of the IgnoredDistribution at x.

Parameters:

x (T) – Type corresponding to attribute ‘dist’.

Returns:

Density of attribute ‘dist’ at x.

Return type:

float

dist_to_encoder()

Create DataSequenceEncoder object for SequenceEncodableProbabilityDistribution instance.

Return type:

IgnoredDataEncoder

Returns:

DataSequenceEncoder

estimator(pseudo_count=None)

Create a ParameterEstimator for corresponding SequenceEncodableProbabilityDistribution.

Parameters:

pseudo_count (Optional[float]) – Regularize sufficient statistics in estimation step.

Return type:

IgnoredEstimator

Returns:

ParameterEstimator

log_density(x)

Evaluate the log-density of the IgnoredDistribution at x.

Parameters:

x (T) – Type corresponding to attribute ‘dist’.

Returns:

log-density of attribute ‘dist’ at x.

Return type:

float

sampler(seed=None)

Create a DistributionSampler object for a given ProbabilityDistribution.

Parameters:

seed (Optional[int]) – Set seed for drawing samples from distribution.

Return type:

IgnoredSampler

seq_log_density(x)

Vectorized evaluation of the log density.

Parameters:

x (EncodedDataSequence) – EncodedDataSequence for corresponding SequenceEncodableProbabilityDistribution.

Return type:

ndarray

Returns:

np.ndarray of log-density values for the encoded observations.

IgnoredEstimator

class pysp.stats.ignored.IgnoredEstimator(dist=NullDistribution(name=None), pseudo_count=None, suff_stat=None, keys=None, name=None)

IgnoredEstimator object for consistency in estimation step.

dist

Distribution to be ignored.

Type:

SequenceEncodableProbabilityDistribution

pseudo_count

Placeholder for consistency.

Type:

Optional[float]

suff_stat

Placeholder for consistency.

Type:

Optional[Any]

keys

Placeholder for consistency.

Type:

Optional[str]

name

Set name for object instance.

Type:

Optional[str]

__init__(dist=NullDistribution(name=None), pseudo_count=None, suff_stat=None, keys=None, name=None)

IgnoredEstimator object.

Parameters:
  • dist (Optional[SequenceEncodableProbabilityDistribution]) – Distribution to be ignored.

  • pseudo_count (Optional[float]) – Placeholder for consistency.

  • suff_stat (Optional[Any]) – Placeholder for consistency.

  • keys (Optional[str]) – Placeholder for consistency.

  • name (Optional[str]) – Set name for object instance.

accumulator_factory()

Create SequenceEncodableStatisticAccumulator object.

estimate(nobs, suff_stat)

Estimate SequenceEncodableProbabilityDistribution for sufficient statistics.

Parameters:
  • nobs (Optional[float]) – Weighted number of observations.

  • suff_stat (Tuple[int, np.ndarray, np.ndarray, np.ndarray]) – Sufficient statistics, accepted for consistency with other estimators.

Return type:

IgnoredDistribution

Returns:

SequenceEncodableProbabilityDistribution
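The accumulate-then-estimate flow for a fixed component can be sketched as follows. This is a toy illustration under stated assumptions, not the pysp implementation: DiscardingAccumulator and FixedEstimator are hypothetical names, the accumulator's update and value methods are invented for the example, and the fixed distribution is represented by a plain dictionary.

```python
class DiscardingAccumulator:
    """Hypothetical accumulator that accepts observations but stores nothing."""

    def update(self, x, weight):
        # Nothing is recorded, because the distribution is not re-estimated.
        pass

    def value(self):
        return None


class FixedEstimator:
    """Toy estimator that hands back the distribution it was constructed with."""

    def __init__(self, dist):
        self.dist = dist

    def accumulator_factory(self):
        # Return a callable that builds a fresh accumulator.
        return DiscardingAccumulator

    def estimate(self, nobs, suff_stat):
        # Both arguments are accepted for interface consistency and ignored.
        return self.dist


known = {"family": "gaussian", "mu": 0.0, "sigma": 1.0}
est = FixedEstimator(known)
acc = est.accumulator_factory()()
for x in [4.2, 5.1, 3.9]:
    acc.update(x, weight=1.0)
result = est.estimate(nobs=3.0, suff_stat=acc.value())
```

Keeping the same method names as a regular estimator is what lets a fixed component sit alongside estimated ones without special handling in the fitting loop.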

IgnoredSampler

class pysp.stats.ignored.IgnoredSampler(dist, seed=None)

IgnoredSampler object for generating samples from an IgnoredDistribution.

dist_sampler

DistributionSampler for ignored distribution.

Type:

DistributionSampler

null_sampler

True if the distribution wrapped by the IgnoredDistribution is the NullDistribution.

Type:

bool

sample(size=None)

Generate samples from distribution.

Parameters:

size (Optional[int]) – Number of samples to generate.

Returns:

Samples from distribution.
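A pass-through sampler of this kind might look roughly like the sketch below. GaussianSampler and PassThroughSampler are hypothetical stand-ins, not pysp classes. In particular, returning None placeholders when there is no underlying sampler is an assumption of this sketch; the behavior of the real IgnoredSampler in the null case is not specified here.

```python
import random


class GaussianSampler:
    """Hypothetical sampler for a one-dimensional Gaussian."""

    def __init__(self, mu, sigma, seed=None):
        self.mu, self.sigma = mu, sigma
        self.rng = random.Random(seed)

    def sample(self, size=None):
        if size is None:
            return self.rng.gauss(self.mu, self.sigma)
        return [self.rng.gauss(self.mu, self.sigma) for _ in range(size)]


class PassThroughSampler:
    """Toy sampler that delegates to the wrapped sampler when one exists."""

    def __init__(self, dist_sampler=None):
        self.dist_sampler = dist_sampler
        self.null_sampler = dist_sampler is None

    def sample(self, size=None):
        if self.null_sampler:
            # Assumption of this sketch: emit None placeholders.
            return None if size is None else [None] * size
        return self.dist_sampler.sample(size)


a = PassThroughSampler(GaussianSampler(0.0, 1.0, seed=7))
b = PassThroughSampler(GaussianSampler(0.0, 1.0, seed=7))
draws_a = a.sample(3)
draws_b = b.sample(3)
empty = PassThroughSampler().sample(2)
```

Because both samplers are seeded identically, draws_a and draws_b contain the same values, which is the usual reason for exposing a seed argument.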