Integer Multinomial Distribution

Data Type: Sequence[Tuple[int, float]]

The integer multinomial distribution is a generalization of the binomial distribution to k classes. It gives the probability of observing \(x_i\) successes of class \(i\), for \(i = 1, \dots, k\), in \(n=\sum_{i} x_i\) trials. The probability mass function is given by

\[f(\boldsymbol{x} \vert n, \boldsymbol{p}) = \frac{n!}{x_1!\dots x_k!} p_1^{x_1}\dots p_k^{x_k},\]

where \(\sum_{i=1}^{k} p_i = 1\) and \(\sum_{i=1}^{k} x_i = n\).
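As an illustration of the formula above, the following is a minimal sketch that evaluates the log of the probability mass function for an observation stored as (category value, count) pairs, matching the Sequence[Tuple[int, float]] data type. It uses only the Python standard library and is not the library's implementation; the function name `multinomial_log_pmf` is hypothetical, all probabilities are assumed to be strictly positive, and any contribution from a distribution on the number of trials is ignored.

```python
import math
from typing import Sequence, Tuple


def multinomial_log_pmf(x: Sequence[Tuple[int, float]],
                        p_vec: Sequence[float],
                        min_val: int = 0) -> float:
    """Log of f(x | n, p) for x given as (category value, count) pairs.

    Hypothetical helper for illustration. Categories missing from x have
    count 0 and contribute nothing, since 0! = 1 and p**0 = 1.
    """
    n = sum(count for _, count in x)
    # log(n!) - sum_i log(x_i!), using lgamma(m + 1) = log(m!)
    log_coef = math.lgamma(n + 1.0) - sum(math.lgamma(c + 1.0) for _, c in x)
    # sum_i x_i * log(p_i); category value v maps to index v - min_val
    log_prob = sum(c * math.log(p_vec[v - min_val]) for v, c in x)
    return log_coef + log_prob


# Two trials over two equally likely categories, one success in each.
x = [(0, 1.0), (1, 1.0)]
p_vec = [0.5, 0.5]
log_f = multinomial_log_pmf(x, p_vec)
density = math.exp(log_f)  # 2!/(1! 1!) * 0.5 * 0.5 = 0.5
```

Working in log space keeps the factorial terms from overflowing when the number of trials is large.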

For more info see Multinomial Distribution.

IntegerMultinomialDistribution

class dmx.stats.intmultinomial.IntegerMultinomialDistribution(min_val=0, p_vec=None, len_dist=NullDistribution(name=None), name=None, keys=None)

IntegerMultinomialDistribution object.

p_vec

Probability of each integer category for a trial.

Type:

ndarray

min_val

Smallest integer value for category range. Defaults to 0.

Type:

int

max_val

Largest value of category range. Set by min_val + len(p_vec) - 1.

Type:

int

log_p_vec

Element-wise log of p_vec.

Type:

ndarray

num_vals

Total number of integer valued categories.

Type:

int

len_dist

Distribution for number of trials. Set to NullDistribution if None.

Type:

SequenceEncodableProbabilityDistribution

keys

Keys for distribution passed when ParameterEstimator is created.

Type:

Optional[str]

name

Name for object instance.

Type:

Optional[str]
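The derived attributes listed above follow from min_val and p_vec. The sketch below reproduces those relationships with plain Python lists rather than ndarray objects; it is an illustration under stated assumptions, not the class itself. The helper `category_prob` and the choice to return 0.0 for values outside the category range are assumptions made for this example.

```python
import math

min_val = 3
p_vec = [0.2, 0.5, 0.3]

# Total number of integer-valued categories.
num_vals = len(p_vec)
# Largest value of the category range: min_val + len(p_vec) - 1.
max_val = min_val + num_vals - 1
# Element-wise log of p_vec; a zero probability maps to -inf.
log_p_vec = [math.log(p) if p > 0.0 else -math.inf for p in p_vec]


def category_prob(value: int) -> float:
    """Probability of one integer category (hypothetical helper).

    Assumption: values outside [min_val, max_val] get probability 0.0.
    """
    if value < min_val or value > max_val:
        return 0.0
    return p_vec[value - min_val]
```

With these settings the categories are the integers 3, 4, and 5, and the probability of category 4 is stored at index 4 - min_val = 1.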

__init__(min_val=0, p_vec=None, len_dist=NullDistribution(name=None), name=None, keys=None)

IntegerMultinomialDistribution object.

Parameters:
  • min_val (int) – Set the minimum value on range of values.

  • p_vec (Union[List[float], np.ndarray]) – Probabilities for values. Length determines number of categories.

  • len_dist (Optional[SequenceEncodableProbabilityDistribution]) – Optional distribution for the number of trials.

  • name (Optional[str]) – Set name for object instance.

  • keys (Optional[str]) – Set key for distribution.

density(x)

Evaluate the density of IntegerMultinomialDistribution at observed value x.

Parameters:

x (Sequence[Tuple[int, float]]) – Sequence of Tuple(s) containing the integer category value and number of successes.

Returns:

Density at x.

Return type:

float

dist_to_encoder()

Create DataSequenceEncoder object for SequenceEncodableProbabilityDistribution instance.

Return type:

IntegerMultinomialDataEncoder

Returns:

DataSequenceEncoder

estimator(pseudo_count=None)

Create a ParameterEstimator for corresponding SequenceEncodableProbabilityDistribution.

Parameters:

pseudo_count (Optional[float]) – Regularize sufficient statistics in estimation step.

Return type:

IntegerMultinomialEstimator

Returns:

ParameterEstimator

log_density(x)

Evaluate the log-density of IntegerMultinomialDistribution at observed value x.

Parameters:

x (Sequence[Tuple[int, float]]) – Sequence of Tuple(s) containing the integer category value and number of successes.

Returns:

Log-density at x.

Return type:

float

sampler(seed=None)

Create a DistributionSampler object for a given ProbabilityDistribution.

Parameters:

seed (Optional[int]) – Set seed for drawing samples from distribution.

Return type:

IntegerMultinomialSampler

seq_log_density(x)

Vectorized evaluation of the log density.

Parameters:

x (EncodedDataSequence) – EncodedDataSequence for corresponding SequenceEncodableProbabilityDistribution.

Return type:

ndarray

Returns:

np.ndarray

IntegerMultinomialEstimator

class dmx.stats.intmultinomial.IntegerMultinomialEstimator(min_val=None, max_val=None, len_estimator=<dmx.stats.null_dist.NullEstimator object>, len_dist=None, name=None, pseudo_count=None, suff_stat=None, keys=None)

IntegerMultinomialEstimator object for estimating integer multinomial distributions from aggregated data.

min_val

Set minimum value for integer multinomial.

Type:

Optional[int]

max_val

Set maximum value for integer multinomial.

Type:

Optional[int]

len_estimator

ParameterEstimator for number of trials, set to NullEstimator() if None is passed as arg.

Type:

ParameterEstimator

len_dist

Optional SequenceEncodableProbabilityDistribution for fixing distribution on number of trials.

Type:

Optional[SequenceEncodableProbabilityDistribution]

name

Set name for object instance.

Type:

Optional[str]

pseudo_count

Used to re-weight sufficient statistics if suff_stat is passed.

Type:

Optional[float]

suff_stat

Set minimum value and counts for categories. If ‘min_val’ and ‘max_val’ are both not None, this is ignored in estimation.

Type:

Optional[Tuple[int, np.ndarray]]

keys

Set key for merging sufficient statistics of objects with matching keys.

Type:

Optional[str]

__init__(min_val=None, max_val=None, len_estimator=<dmx.stats.null_dist.NullEstimator object>, len_dist=None, name=None, pseudo_count=None, suff_stat=None, keys=None)

IntegerMultinomialEstimator object.

Parameters:
  • min_val (Optional[int]) – Set minimum value for integer multinomial.

  • max_val (Optional[int]) – Set maximum value for integer multinomial.

  • len_estimator (Optional[ParameterEstimator]) – Optional ParameterEstimator for number of trials.

  • len_dist (Optional[SequenceEncodableProbabilityDistribution]) – Optional SequenceEncodableProbabilityDistribution for fixing distribution on number of trials.

  • name (Optional[str]) – Set name for object instance.

  • pseudo_count (Optional[float]) – Used to re-weight sufficient statistics if suff_stat is passed.

  • suff_stat (Optional[Tuple[int, np.ndarray]]) – Set minimum value and counts for categories.

  • keys (Optional[str]) – Set key for merging sufficient statistics of objects with matching keys.

accumulator_factory()

Create SequenceEncodableStatisticAccumulator object.

Return type:

IntegerMultinomialAccumulatorFactory

estimate(nobs, suff_stat)

Estimate SequenceEncodableProbabilityDistribution for sufficient statistics.

Parameters:
  • nobs (Optional[float]) – Weighted number of observations.

  • suff_stat (Tuple[int, np.ndarray, np.ndarray, np.ndarray]) – Sufficient statistics for the integer multinomial distribution.

Return type:

IntegerMultinomialDistribution

Returns:

SequenceEncodableProbabilityDistribution
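To make the estimation step concrete, here is a minimal sketch of how category probabilities could be estimated from aggregated (category value, count) observations. It is not the library's estimate() method: the function name `estimate_p_vec` is hypothetical, it tracks only a minimum value and a vector of counts rather than the four-element suff_stat tuple documented above, and spreading pseudo_count uniformly over the categories is an assumption about how the regularization might work.

```python
from typing import List, Optional, Sequence, Tuple

Observation = Sequence[Tuple[int, float]]


def estimate_p_vec(data: Sequence[Observation],
                   pseudo_count: Optional[float] = None
                   ) -> Tuple[int, List[float]]:
    """Return (min_val, p_vec) estimated from aggregated counts.

    Hypothetical helper: the category range is taken from the smallest
    and largest values seen in the data.
    """
    values = [v for obs in data for v, _ in obs]
    min_val, max_val = min(values), max(values)

    # Accumulate the total count observed for each category.
    counts = [0.0] * (max_val - min_val + 1)
    for obs in data:
        for v, c in obs:
            counts[v - min_val] += c

    if pseudo_count is not None:
        # Assumption: the pseudo count is split evenly across categories.
        counts = [c + pseudo_count / len(counts) for c in counts]

    total = sum(counts)
    return min_val, [c / total for c in counts]


data = [[(1, 2.0), (2, 1.0)], [(2, 3.0), (3, 2.0)]]
min_val, p_vec = estimate_p_vec(data)
# Counts are 2, 4, 2 for values 1, 2, 3, so p_vec is [0.25, 0.5, 0.25].
```

Passing a pseudo_count pulls the estimate toward a uniform distribution, which avoids zero probabilities for categories that were never observed inside the range.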

IntegerMultinomialSampler

class dmx.stats.intmultinomial.IntegerMultinomialSampler(dist, seed=None)

Create IntegerMultinomialSampler object for sampling from IntegerMultinomialDistribution object instance.

dist

IntegerMultinomialDistribution object instance to sample from.

Type:

IntegerMultinomialDistribution

rng

RandomState set with seed if passed.

Type:

RandomState

len_sampler

DistributionSampler object for number of trials.

Type:

DistributionSampler

sample(size=None)

Draw independent samples from an integer multinomial distribution.

Parameters:

size (Optional[int]) – Number of samples to draw.

Return type:

Union[List[Tuple[int, float]], List[List[Tuple[int, float]]]]

Returns:

List of length size containing List[Tuple[int, float]] samples. If size is None, returns a single sample of type List[Tuple[int, float]].
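The return shape described above can be sketched as follows. This example uses the standard library random module instead of a RandomState, takes a fixed number of trials instead of drawing it from a length sampler, and omits categories with zero count from each sample; all three choices are assumptions made for illustration, and the function name `sample_int_multinomial` is hypothetical.

```python
import random
from collections import Counter
from typing import List, Optional, Sequence, Tuple, Union

Sample = List[Tuple[int, float]]


def sample_int_multinomial(n_trials: int,
                           p_vec: Sequence[float],
                           min_val: int = 0,
                           size: Optional[int] = None,
                           seed: Optional[int] = None
                           ) -> Union[Sample, List[Sample]]:
    """Draw samples as lists of (category value, count) pairs."""
    rng = random.Random(seed)
    values = list(range(min_val, min_val + len(p_vec)))

    def draw_one() -> Sample:
        # Draw n_trials categories, then aggregate them into counts.
        draws = rng.choices(values, weights=p_vec, k=n_trials)
        counts = Counter(draws)
        return [(v, float(counts[v])) for v in sorted(counts)]

    if size is None:
        return draw_one()
    return [draw_one() for _ in range(size)]


one = sample_int_multinomial(10, [0.2, 0.5, 0.3], min_val=3, seed=1)
many = sample_int_multinomial(10, [0.2, 0.5, 0.3], min_val=3, size=4, seed=1)
```

Here `one` is a single List[Tuple[int, float]] whose counts sum to 10, and `many` is a list of four such samples.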