Integer Multinomial Distribution
Data Type: Sequence[Tuple[int, float]]
The integer multinomial distribution is a generalization of the binomial distribution to k classes. It gives the probability of observing \(x_i\) successes of class i, for each class i = 1, ..., k, in \(n=\sum_{i}x_i\) trials. The probability mass function is given by
\[ P(x_1, \dots, x_k) = \frac{n!}{x_1! \cdots x_k!} \prod_{i=1}^{k} p_i^{x_i}, \]
where \(\sum_{i=1}^{k} p_i = 1\) and \(\sum_{i=1}^{k} x_i = n\).
For more info see Multinomial Distribution.
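As a rough illustration of the probability mass function, the sketch below evaluates it for a dense count vector using only the standard library. The function names multinomial_log_pmf and multinomial_pmf are hypothetical and are not part of pysp; the computation is done in log space to avoid overflow in the factorials.

```python
import math


def multinomial_log_pmf(counts, probs):
    """Log of the multinomial PMF for a dense count vector (illustrative only)."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # log n! - sum_i log x_i!  (log of the multinomial coefficient)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    log_prob = 0.0
    for x, p in zip(counts, probs):
        if x == 0:
            continue  # p**0 == 1; skipping also avoids log(0) when p == 0
        if p == 0.0:
            return float("-inf")  # positive count in a zero-probability class
        log_prob += x * math.log(p)
    return log_coef + log_prob


def multinomial_pmf(counts, probs):
    """PMF obtained by exponentiating the log PMF."""
    return math.exp(multinomial_log_pmf(counts, probs))
```

For example, multinomial_pmf([2, 1, 0], [0.5, 0.3, 0.2]) evaluates 3!/(2! 1! 0!) * 0.5^2 * 0.3, which is 0.225 up to floating-point error.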
IntegerMultinomialDistribution
- class pysp.stats.intmultinomial.IntegerMultinomialDistribution(min_val=0, p_vec=None, len_dist=NullDistribution(name=None), name=None, keys=None)
IntegerMultinomialDistribution object.
- p_vec
Probability of each integer category for a trial.
- Type:
ndarray
- min_val
Smallest integer value for category range. Defaults to 0.
- Type:
int
- max_val
Largest value of category range. Set by min_val + len(p_vec) - 1.
- Type:
int
- log_p_vec
Element-wise log of p_vec.
- Type:
ndarray
- num_vals
Total number of integer valued categories.
- Type:
int
- len_dist
Distribution for number of trials. Set to NullDistribution if None.
- keys
Keys for distribution passed when ParameterEstimator is created.
- Type:
Optional[str]
- name
Name for object instance.
- Type:
Optional[str]
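The relationships among the attributes above can be sketched with plain NumPy. This does not use the pysp class itself; it only mirrors the documented definitions (max_val = min_val + len(p_vec) - 1, num_vals = len(p_vec), log_p_vec = log of p_vec). The helper category_prob is hypothetical, and treating values outside the category range as having probability zero is an assumption.

```python
import numpy as np

# Example parameters: categories are the integers 3, 4, 5.
min_val = 3
p_vec = np.array([0.2, 0.5, 0.3])

num_vals = len(p_vec)             # total number of integer categories
max_val = min_val + num_vals - 1  # largest value in the category range
with np.errstate(divide="ignore"):
    log_p_vec = np.log(p_vec)     # -inf wherever p_vec is exactly 0


def category_prob(value):
    """Probability of integer `value` (assumed 0 outside [min_val, max_val])."""
    if value < min_val or value > max_val:
        return 0.0
    # Category `value` is stored at offset value - min_val.
    return float(p_vec[value - min_val])
```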
- __init__(min_val=0, p_vec=None, len_dist=NullDistribution(name=None), name=None, keys=None)
IntegerMultinomialDistribution object.
- Parameters:
min_val (int) – Set the minimum value on range of values.
p_vec (Union[List[float], np.ndarray]) – Probabilities for values. Length determines number of categories.
len_dist (Optional[SequenceEncodableProbabilityDistribution]) – Optional distribution for the number of trials.
name (Optional[str]) – Set name for object instance.
keys (Optional[str]) – Set key for distribution.
- density(x)
Evaluate the density of IntegerMultinomialDistribution at observed value x.
- Parameters:
x (Sequence[Tuple[int, float]]) – Sequence of Tuple(s) containing the integer category value and number of successes.
- Returns:
Density at x.
- Return type:
float
- dist_to_encoder()
Create DataSequenceEncoder object for SequenceEncodableProbabilityDistribution instance.
- Return type:
IntegerMultinomialDataEncoder
- Returns:
DataSequenceEncoder
- estimator(pseudo_count=None)
Create a ParameterEstimator for corresponding SequenceEncodableProbabilityDistribution.
- Parameters:
pseudo_count (Optional[float]) – Regularize sufficient statistics in estimation step.
- Return type:
IntegerMultinomialEstimator
- Returns:
ParameterEstimator
- log_density(x)
Evaluate the log-density of IntegerMultinomialDistribution at observed value x.
- Parameters:
x (Sequence[Tuple[int, float]]) – Sequence of Tuple(s) containing the integer category value and number of successes.
- Returns:
Log-density at x.
- Return type:
float
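To make the Sequence[Tuple[int, float]] data type concrete, here is a minimal sketch of a log-density evaluated directly on sparse (value, count) pairs. It follows the probability mass function stated at the top of this page and is not the pysp implementation: the function name sparse_log_density is hypothetical, the len_dist term for the number of trials is ignored, each category value is assumed to appear at most once in x, and whether the library includes the multinomial coefficient is not confirmed here.

```python
import math


def sparse_log_density(x, min_val, p_vec):
    """Log PMF for sparse (value, count) pairs; an illustrative sketch."""
    max_val = min_val + len(p_vec) - 1
    n = 0.0
    log_terms = 0.0
    for value, count in x:
        if count == 0:
            continue  # zero counts contribute nothing
        if value < min_val or value > max_val:
            return float("-inf")  # assumed: no mass outside the range
        p = p_vec[value - min_val]  # category offset is value - min_val
        if p == 0.0:
            return float("-inf")
        # x_i * log p_i - log x_i!
        log_terms += count * math.log(p) - math.lgamma(count + 1.0)
        n += count
    # Add log n! to complete the multinomial coefficient.
    return math.lgamma(n + 1.0) + log_terms
```

Under these assumptions the density is math.exp(sparse_log_density(x, min_val, p_vec)); for x = [(3, 2.0), (4, 1.0)], min_val = 3 and p_vec = [0.5, 0.3, 0.2], that is 0.225 up to floating-point error.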
- sampler(seed=None)
Create a DistributionSampler object for a given ProbabilityDistribution.
- Parameters:
seed (Optional[int]) – Set seed for drawing samples from distribution.
- Return type:
IntegerMultinomialSampler
- seq_log_density(x)
Vectorized evaluation of the log density.
- Parameters:
x (EncodedDataSequence) – EncodedDataSequence for corresponding SequenceEncodedProbabilityDistribution.
- Return type:
ndarray
- Returns:
np.ndarray
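One way a vectorized evaluation over many observations could be organized is sketched below. The layout of the real EncodedDataSequence is not documented here, so the flat-array encoding and the names encode and seq_log_kernel are assumptions. The sketch computes only the sum of x_i * log p_i for each observation; it omits the multinomial coefficient and any len_dist term, and it assumes all values lie in the category range with strictly positive probabilities.

```python
import numpy as np


def encode(observations, min_val):
    """Flatten sparse observations into index and count arrays (hypothetical layout)."""
    obs_idx, cat_idx, counts = [], [], []
    for i, obs in enumerate(observations):
        for value, count in obs:
            obs_idx.append(i)                # which observation the pair belongs to
            cat_idx.append(value - min_val)  # offset into p_vec
            counts.append(count)
    return (
        len(observations),
        np.asarray(obs_idx, dtype=int),
        np.asarray(cat_idx, dtype=int),
        np.asarray(counts, dtype=float),
    )


def seq_log_kernel(encoded, log_p_vec):
    """Per-observation sum of count * log p, computed without a Python loop."""
    n_obs, obs_idx, cat_idx, counts = encoded
    # bincount with weights sums the weighted terms belonging to each observation.
    return np.bincount(obs_idx, weights=counts * log_p_vec[cat_idx], minlength=n_obs)
```

Encoding once and reusing the arrays is what makes repeated evaluation cheap, for example across the iterations of an estimation loop.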
IntegerMultinomialEstimator
- class pysp.stats.intmultinomial.IntegerMultinomialEstimator(min_val=None, max_val=None, len_estimator=<pysp.stats.null_dist.NullEstimator object>, len_dist=None, name=None, pseudo_count=None, suff_stat=None, keys=None)
IntegerMultinomialEstimator object for estimating integer multinomial distributions from aggregated data.
- min_val
Set minimum value for integer multinomial.
- Type:
Optional[int]
- max_val
Set maximum value for integer multinomial.
- Type:
Optional[int]
- len_estimator
ParameterEstimator for number of trials, set to NullEstimator() if None is passed as arg.
- Type:
ParameterEstimator
- len_dist
Optional SequenceEncodableProbabilityDistribution for fixing distribution on number of trials.
- Type:
Optional[SequenceEncodableProbabilityDistribution]
- name
Set name for object instance.
- Type:
Optional[str]
- pseudo_count
Used to re-weight sufficient statistics if suff_stat is passed.
- Type:
Optional[float]
- suff_stat
Set minimum value and counts for categories. If ‘min_val’ and ‘max_val’ are both not None, this is ignored in estimation.
- Type:
Optional[Tuple[int, np.ndarray]]
- keys
Set key for merging sufficient statistics of objects with matching keys.
- Type:
Optional[str]
- __init__(min_val=None, max_val=None, len_estimator=<pysp.stats.null_dist.NullEstimator object>, len_dist=None, name=None, pseudo_count=None, suff_stat=None, keys=None)
IntegerMultinomialEstimator object.
- Parameters:
min_val (Optional[int]) – Set minimum value for integer multinomial.
max_val (Optional[int]) – Set maximum value for integer multinomial.
len_estimator (Optional[ParameterEstimator]) – Optional ParameterEstimator for number of trials.
len_dist (Optional[SequenceEncodableProbabilityDistribution]) – Optional SequenceEncodableProbabilityDistribution for fixing distribution on number of trials.
name (Optional[str]) – Set name for object instance.
pseudo_count (Optional[float]) – Used to re-weight sufficient statistics if suff_stat is passed.
suff_stat (Optional[Tuple[int, np.ndarray]]) – Set minimum value and counts for categories.
keys (Optional[str]) – Set key for merging sufficient statistics of objects with matching keys.
- accumulator_factory()
Create SequenceEncodableStatisticAccumulator object.
- Return type:
IntegerMultinomialAccumulatorFactory
- estimate(nobs, suff_stat)
Estimate SequenceEncodableProbabilityDistribution for sufficient statistics.
- Parameters:
nobs (Optional[float]) – Weighted number of observations.
suff_stat (Tuple[int, np.ndarray, np.ndarray, np.ndarray]) – Sufficient statistics for dirichlet distribution.
- Return type:
IntegerMultinomialDistribution
- Returns:
SequenceEncodableProbabilityDistribution
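The estimation step can be pictured as aggregating counts per category and normalizing them. The sketch below is not the pysp implementation: accumulate_counts and estimate_p_vec are hypothetical names, the sufficient statistics are reduced to a (min_val, count_vec) pair, and spreading pseudo_count uniformly over the categories is an assumed regularization scheme.

```python
import numpy as np


def accumulate_counts(observations, weights=None):
    """Aggregate sparse (value, count) observations into (min_val, count_vec)."""
    totals = {}
    for i, obs in enumerate(observations):
        w = 1.0 if weights is None else weights[i]
        for value, count in obs:
            totals[value] = totals.get(value, 0.0) + w * count
    if not totals:
        raise ValueError("no counts to accumulate")
    min_val, max_val = min(totals), max(totals)
    # Dense vector covering every integer between the observed extremes.
    count_vec = np.zeros(max_val - min_val + 1)
    for value, total in totals.items():
        count_vec[value - min_val] = total
    return min_val, count_vec


def estimate_p_vec(count_vec, pseudo_count=None):
    """Normalize counts into probabilities, optionally smoothed (assumed scheme)."""
    count_vec = np.asarray(count_vec, dtype=float)
    if pseudo_count is not None:
        # Assumption: the pseudo count is shared equally among the categories.
        count_vec = count_vec + pseudo_count / len(count_vec)
    total = count_vec.sum()
    if total <= 0:
        raise ValueError("total count must be positive")
    return count_vec / total
```

With the observations [[(3, 2.0), (5, 1.0)], [(4, 1.0)]], the aggregated counts are [2, 1, 1] starting at min_val = 3, so the unsmoothed estimate is [0.5, 0.25, 0.25].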
IntegerMultinomialSampler
- class pysp.stats.intmultinomial.IntegerMultinomialSampler(dist, seed=None)
Create IntegerMultinomialSampler object for sampling from IntegerMultinomialDistribution object instance.
- dist
IntegerMultinomialDistribution object instance to sample from.
- rng
RandomState set with seed if passed.
- Type:
RandomState
- len_sampler
DistributionSampler object for number of trials.
- Type:
DistributionSampler
- sample(size=None)
Draw independent samples from an integer multinomial distribution.
- Parameters:
size (Optional[int]) – Number of samples to draw.
- Return type:
Union[List[Tuple[int,float]],List[List[Tuple[int,float]]]]
- Returns:
A list of length size whose entries are each a List[Tuple[int, float]]. If size is None, a single sample of type List[Tuple[int, float]] is returned.
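The shape of the returned samples can be illustrated with NumPy's RandomState.multinomial. This is a sketch rather than the library's sampler: sample_int_multinomial is a hypothetical function, the number of trials is passed in as a fixed n_trials instead of being drawn from len_sampler, and dropping categories with zero count from the output is an assumed convention.

```python
import numpy as np


def sample_int_multinomial(rng, min_val, p_vec, n_trials, size=None):
    """Draw sparse (value, count) samples; an illustrative sketch."""

    def draw_one():
        counts = rng.multinomial(n_trials, p_vec)
        # Shift offsets back to integer category values; skip empty categories.
        return [(min_val + i, float(c)) for i, c in enumerate(counts) if c > 0]

    if size is None:
        return draw_one()  # a single List[Tuple[int, float]]
    return [draw_one() for _ in range(size)]  # a list of `size` samples


rng = np.random.RandomState(1)
one_sample = sample_int_multinomial(rng, 3, [0.5, 0.3, 0.2], n_trials=10)
many_samples = sample_int_multinomial(rng, 3, [0.5, 0.3, 0.2], n_trials=10, size=4)
```

Each sample's counts sum to n_trials, and every value lies between min_val and min_val + len(p_vec) - 1.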