Bernoulli Set Distribution
Data Type: Sequence[str]
The Bernoulli set distribution is a distribution over the power set of \(V = \{v_0, v_1, ..., v_{n-1}\}\). Each element \(v_i\) is included in the set with probability \(p_i\), independently of the other elements. Note there is no constraint \(\sum_{i} p_i = 1\), as each \(p_i\) simply models the probability that element \(v_i\) is included in the set. Let \(x\) be a subset of \(V\). The probability mass function for a Bernoulli set distribution is given by
\[P(x) = \prod_{v_i \in x} p_i \prod_{v_i \notin x} (1 - p_i).\]
For speed, the user can map observed values \(v_i \rightarrow i\) and use the Integer Categorical Distribution.
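As a rough illustration of the definition above, the log-probability of a subset can be computed by multiplying \(p_i\) for every included element and \(1 - p_i\) for every excluded one. The sketch below is standalone Python and does not call pysp; the function name `bernoulli_set_log_pmf` is hypothetical and chosen only for this example.

```python
import math
from typing import Any, Dict, Iterable


def bernoulli_set_log_pmf(x: Iterable[Any], pmap: Dict[Any, float]) -> float:
    """Log-probability of subset x under independent inclusion probabilities."""
    members = set(x)
    # An element outside the support can never be generated.
    if not members.issubset(pmap):
        return -math.inf
    total = 0.0
    for v, p in pmap.items():
        # Included elements contribute p, excluded elements contribute 1 - p.
        q = p if v in members else 1.0 - p
        if q <= 0.0:
            return -math.inf
        total += math.log(q)
    return total


pmap = {"a": 0.5, "b": 0.25, "c": 1.0}
# 0.5 (a in) * 0.75 (b out) * 1.0 (c in), approximately 0.375
print(math.exp(bernoulli_set_log_pmf(["a", "c"], pmap)))
```

Because the inclusion probabilities are independent, the probabilities of all \(2^n\) subsets sum to one even though the \(p_i\) themselves do not.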
BernoulliSetDistribution
- class pysp.stats.setdist.BernoulliSetDistribution(pmap, min_prob=1e-128, name=None, keys=None)
BernoulliSetDistribution object for creating a Bernoulli set distribution.
- keys
Keys for object instance.
- Type:
Optional[str]
- name
Name of the object instance.
- Type:
Optional[str]
- pmap
Maps elements in support to probabilities.
- Type:
Dict[Any, float]
- required
An observation must contain this subset of elements. Else, return probability 0.0.
- Type:
Set
- nlog_sum
Normalizing term for computing numerically stable likelihood.
- Type:
float
- log_dmap
Map from elements to their corrected log probability of inclusion in the set.
- Type:
Dict[Any, float]
- min_prob
Minimum probability for elements. Corrects for prob = 0.
- Type:
float
- num_required
Number of required elements in a subset. Corrected if min_prob was non-zero.
- Type:
int
- __init__(pmap, min_prob=1e-128, name=None, keys=None)
BernoulliSetDistribution object.
- Parameters:
pmap (Dict[Any, float]) – Maps values to probabilities.
min_prob (float) – Minimum probability for numerical stability in log prob calculations.
name (Optional[str]) – Set name to object instance.
keys (Optional[str]) – Set keys for object instance.
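One plausible reading of the `required`, `nlog_sum`, and `log_dmap` attributes is the log-odds form of the likelihood: the log-probability of the empty set is computed once, and each included element then adds \(\log p_i - \log(1 - p_i)\). The sketch below illustrates that identity in standalone Python. It is an assumption about how such terms could be precomputed, not the library's actual implementation, and the names `precompute_terms`, `log_odds`, and `log_none` are hypothetical.

```python
import math
from typing import Any, Dict, Iterable, Set, Tuple


def precompute_terms(
    pmap: Dict[Any, float], min_prob: float = 1e-128
) -> Tuple[Dict[Any, float], float, Set[Any]]:
    """Return (log-odds per element, log P(empty set), required elements)."""
    log_odds: Dict[Any, float] = {}
    log_none = 0.0
    required: Set[Any] = set()
    for v, p in pmap.items():
        if p >= 1.0:
            # Always present: contributes log(1) = 0 and must appear in x.
            required.add(v)
            continue
        p = max(p, min_prob)  # floor so that log(p) stays finite
        log_q = math.log1p(-p)  # log(1 - p)
        log_odds[v] = math.log(p) - log_q
        log_none += log_q
    return log_odds, log_none, required


def log_density(x: Iterable[Any], log_odds: Dict[Any, float],
                log_none: float, required: Set[Any]) -> float:
    members = set(x)
    if not required.issubset(members):
        return -math.inf  # a required element is missing
    total = log_none
    for v in members - required:
        if v not in log_odds:
            return -math.inf  # element outside the support
        total += log_odds[v]
    return total


terms = precompute_terms({"a": 0.5, "b": 0.25, "c": 1.0, "d": 0.0})
print(log_density(["a", "c"], *terms))  # close to log(0.375)
```

With this layout, scoring an observation costs time proportional to the size of the observed set rather than the size of the support.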
- dist_to_encoder()
Create DataSequenceEncoder object for SequenceEncodableProbabilityDistribution instance.
- Return type:
BernoulliSetDataEncoder
- Returns:
DataSequenceEncoder
- estimator(pseudo_count=None)
Create a ParameterEstimator for corresponding SequenceEncodableProbabilityDistribution.
- Parameters:
pseudo_count (Optional[float]) – Regularize sufficient statistics in estimation step.
- Return type:
- Returns:
ParameterEstimator
- log_density(x)
Evaluate the log-density of distribution.
- Return type:
float
- Returns:
float
- sampler(seed=None)
Create a DistributionSampler object for a given ProbabilityDistribution.
- Parameters:
seed (Optional[int]) – Set seed for drawing samples from distribution.
- Return type:
- seq_log_density(x)
Vectorized evaluation of the log density.
- Parameters:
x (EncodedDataSequence) – EncodedDataSequence for corresponding SequenceEncodedProbabilityDistribution.
- Return type:
ndarray
- Returns:
np.ndarray
BernoulliSetEstimator
- class pysp.stats.setdist.BernoulliSetEstimator(min_prob=1e-128, pseudo_count=None, suff_stat=None, name=None, keys=None)
BernoulliSetEstimator object for estimating Bernoulli set distribution from aggregated sufficient statistics.
- min_prob
Minimum probability for elements estimated with prob = 0.
- Type:
float
- pseudo_count
Used to re-weight suff_stats in estimation.
- Type:
Optional[float]
- suff_stat
Optional dictionary containing value to probability mapping.
- Type:
Optional[Dict[Any, float]]
- name
Set name for object instance.
- Type:
Optional[str]
- keys
Set key for merging sufficient statistics.
- Type:
Optional[str]
- __init__(min_prob=1e-128, pseudo_count=None, suff_stat=None, name=None, keys=None)
BernoulliSetEstimator object.
- Parameters:
min_prob (float) – Minimum probability for elements estimated with prob = 0.
pseudo_count (Optional[float]) – Used to re-weight suff_stats in estimation.
suff_stat (Optional[Dict[Any, float]]) – Optional dictionary containing value to probability mapping.
name (Optional[str]) – Set name for object instance.
keys (Optional[str]) – Set key for merging sufficient statistics.
- accumulator_factory()
Create SequenceEncodableStatisticAccumulator object.
- Return type:
BernoulliSetAccumulatorFactory
- estimate(nobs, suff_stat)
Estimate SequenceEncodableProbabilityDistribution for sufficient statistics.
- Parameters:
nobs (Optional[float]) – Weighted number of observations.
suff_stat (Tuple[int, np.ndarray, np.ndarray, np.ndarray]) – Sufficient statistics aggregated from the observed data.
- Return type:
- Returns:
SequenceEncodableProbabilityDistribution
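To make the estimation step concrete, the sketch below computes inclusion probabilities from weighted inclusion counts, which is the maximum-likelihood estimate for independent Bernoulli inclusions. It is standalone Python rather than a call into pysp. The helper names `count_inclusions` and `estimate_pmap` are hypothetical, and the way `pseudo_count` is applied here (smoothing each estimate toward 0.5) is an assumption made for illustration, not a description of the library's behaviour.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def count_inclusions(data: Iterable[Iterable[Any]]) -> Tuple[float, Dict[Any, float]]:
    """Aggregate sufficient statistics: number of sets and per-element counts."""
    nobs = 0.0
    counts: Dict[Any, float] = {}
    for x in data:
        nobs += 1.0
        for v in set(x):
            counts[v] = counts.get(v, 0.0) + 1.0
    return nobs, counts


def estimate_pmap(nobs: float, counts: Dict[Any, float],
                  pseudo_count: Optional[float] = None,
                  min_prob: float = 1e-128) -> Dict[Any, float]:
    """Estimate inclusion probabilities from aggregated counts."""
    pmap: Dict[Any, float] = {}
    for v, c in counts.items():
        if pseudo_count is not None:
            # Assumed smoothing: pseudo_count observations, half containing v.
            p = (c + 0.5 * pseudo_count) / (nobs + pseudo_count)
        else:
            p = c / nobs if nobs > 0 else 0.0
        pmap[v] = max(p, min_prob)  # keep estimates away from exactly zero
    return pmap


nobs, counts = count_inclusions([{"a", "b"}, {"a"}, {"a", "c"}, set()])
print(estimate_pmap(nobs, counts))  # a: 0.75, b: 0.25, c: 0.25
```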
BernoulliSetSampler
- class pysp.stats.setdist.BernoulliSetSampler(dist, seed=None)
BernoulliSetSampler object for generating samples from BernoulliSetDistribution object instance.
- dist
Object instance to sample from.
- Type:
BernoulliSetDistribution
- seed
Set seed for random number generator.
- Type:
Optional[int]
- sample(size=None)
Generate samples from distribution.
- Parameters:
size (Optional[int]) – Number of samples to generate.
- Return type:
Union[Sequence[Any],List[Sequence[Any]]]
- Returns:
Samples from distribution.
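Sampling from a Bernoulli set distribution amounts to one independent Bernoulli draw per element of the support. The sketch below shows this with the standard-library `random` module as a stand-in for whatever generator the library uses. The function name `sample_sets` is hypothetical, and the convention that `size=None` yields a single set while an integer yields a list of sets is assumed from the return type documented above.

```python
import random
from typing import Any, Dict, List, Optional, Union


def sample_sets(pmap: Dict[Any, float], size: Optional[int] = None,
                seed: Optional[int] = None) -> Union[List[Any], List[List[Any]]]:
    """Draw one set (size=None) or a list of `size` sets."""
    rng = random.Random(seed)

    def draw() -> List[Any]:
        # rng.random() lies in [0, 1), so p = 1 always includes the element
        # and p = 0 never does.
        return [v for v, p in pmap.items() if rng.random() < p]

    if size is None:
        return draw()
    return [draw() for _ in range(size)]


samples = sample_sets({"a": 0.5, "b": 0.25, "c": 1.0, "d": 0.0}, size=5, seed=1)
print(samples)
```

Fixing `seed` makes the draws reproducible, which is useful when comparing estimates across runs.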