Conditional Distribution
The Conditional distribution models the conditional dependency between two random variables. The two variables may have different data types.
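Assume we have observed \((x_i, y_i)\), where \(x_i\) has data type \(T_1\) and \(y_i\) has data type \(T_2\). Choosing a compatible given distribution \(f(x_i \vert \theta_1)\) for \(x_i\) and a distribution \(g(y_i \vert \theta_2)\) for \(y_i\), the density of the observed pair is given by \(p(x_i, y_i) = f(x_i \vert \theta_1)\, g(y_i \vert x_i, \theta_2)\), where the conditional density \(g\) is the one selected by the value of \(x_i\).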
Note that each value of \(x_i\) emits a distribution over the support of y values.
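As a concrete illustration of this factorization, the following minimal sketch evaluates the density by hand, without using pysp. It assumes a categorical given distribution over two keys and one Gaussian conditional per key; the names given_probs, cond_params, gaussian_pdf, and joint_density are hypothetical and are not part of the library.

```python
import math

# Hypothetical given distribution f(x | theta_1): categorical over two keys.
given_probs = {"a": 0.3, "b": 0.7}

# Hypothetical conditionals g(y | x, theta_2): one Gaussian (mean, std) per key.
cond_params = {"a": (0.0, 1.0), "b": (5.0, 2.0)}


def gaussian_pdf(y, mean, std):
    """Density of a normal distribution with the given mean and std at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def joint_density(x, y):
    """Density of the pair (x, y): f(x) times the conditional selected by x."""
    mean, std = cond_params[x]
    return given_probs[x] * gaussian_pdf(y, mean, std)


# The same y value receives a different density depending on the key x.
p_a = joint_density("a", 0.0)
p_b = joint_density("b", 0.0)
```

Here the key "a" selects a Gaussian centered at 0 and the key "b" selects one centered at 5, so the pair ("a", 0.0) is far more likely than ("b", 0.0) even though "b" is the more probable key.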
ConditionalDistribution
- class pysp.stats.conditional.ConditionalDistribution(dmap, default_dist=NullDistribution(name=None), given_dist=NullDistribution(name=None), name=None, keys=None)
ConditionalDistribution object for data types x=Tuple[T0, T1].
- dmap
Mapping from T0 to distributions.
- Type:
Dict[T0, SequenceEncodableProbabilityDistribution]
- default_dist
Default distribution if key not in dmap.
- given_dist
Distribution for given variable.
- has_default
True if default_dist is not NullDistribution.
- Type:
bool
- has_given
True if given_dist is not NullDistribution.
- Type:
bool
- name
Name assigned to object.
- Type:
Optional[str]
- keys
All ConditionalDistribution objects with same keys value are the same distribution.
- Type:
Optional[str]
- __init__(dmap, default_dist=NullDistribution(name=None), given_dist=NullDistribution(name=None), name=None, keys=None)
Initialize ConditionalDistribution.
- Parameters:
dmap (Union[Dict[Any, SequenceEncodableProbabilityDistribution], List[SequenceEncodableProbabilityDistribution]]) – Used to create dictionary of distributions.
default_dist (Optional[SequenceEncodableProbabilityDistribution]) – Distribution for case where x[0] is not a key in dmap.
given_dist (Optional[SequenceEncodableProbabilityDistribution]) – Distribution for the given variable.
name (Optional[str], optional) – Name assigned to object.
keys (Optional[str], optional) – All ConditionalDistribution objects with same keys value are the same distribution.
- density(x)
Evaluate density of ConditionalDistribution at Tuple x.
- Parameters:
x (Tuple[T0, T1]) – T0 must match the key type of dmap; T1 must match the data type of the distribution that dmap stores for that key.
- Returns:
Density of ConditionalDistribution at Tuple x.
- Return type:
float
- dist_to_encoder()
Return a ConditionalDistributionDataEncoder for this distribution.
- Returns:
Encoder object.
- Return type:
ConditionalDistributionDataEncoder
- estimator(pseudo_count=None)
Create ConditionalDistributionEstimator from sufficient statistics.
- Parameters:
pseudo_count (Optional[float], optional) – Used to inflate the sufficient statistics.
- Returns:
Estimator object.
- Return type:
ConditionalDistributionEstimator
- log_density(x)
Evaluate log-density of ConditionalDistribution at Tuple x.
- Parameters:
x (Tuple[T0, T1]) – T0 must match the key type of dmap; T1 must match the data type of the distribution that dmap stores for that key.
- Returns:
Log-density of ConditionalDistribution at Tuple x.
- Return type:
float
- sampler(seed=None)
Create ConditionalDistributionSampler for sampling from this distribution.
- Parameters:
seed (Optional[int], optional) – Seed for random number generator.
- Returns:
Sampler object.
- Return type:
ConditionalDistributionSampler
- seq_log_density(x)
Vectorized log-density for encoded data.
- Parameters:
x (ConditionalEncodedDataSequence) – Encoded data sequence.
- Returns:
Log-density values.
- Return type:
np.ndarray
- Raises:
Exception – If input is not a ConditionalEncodedDataSequence.
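The roles of dmap, default_dist, and given_dist described above can be sketched in plain Python. This is one plausible reading of the documented attributes, not the library's source: the Bernoulli and Categorical classes are toy stand-ins, conditional_log_density is a hypothetical helper, None is used in place of NullDistribution, and returning negative infinity for an unseen key with no default is an assumption.

```python
import math


class Bernoulli:
    """Toy distribution over {0, 1}; a stand-in for a pysp distribution."""

    def __init__(self, p):
        self.p = p

    def log_density(self, y):
        return math.log(self.p if y == 1 else 1.0 - self.p)


class Categorical:
    """Toy distribution over a finite set of keys."""

    def __init__(self, probs):
        self.probs = probs

    def log_density(self, x):
        return math.log(self.probs[x])


def conditional_log_density(pair, dmap, default_dist=None, given_dist=None):
    """Log-density of (x0, x1) using dmap[x0], falling back to default_dist."""
    x0, x1 = pair
    dist = dmap.get(x0, default_dist)
    if dist is None:
        # Assumption: an unseen key with no default has zero density.
        return -math.inf
    rv = dist.log_density(x1)
    if given_dist is not None:
        # Add the log-density of the given variable when it is modeled.
        rv += given_dist.log_density(x0)
    return rv


dmap = {"sunny": Bernoulli(0.9), "rainy": Bernoulli(0.2)}
given = Categorical({"sunny": 0.5, "rainy": 0.25, "foggy": 0.25})

ll_known = conditional_log_density(("sunny", 1), dmap, Bernoulli(0.5), given)
ll_default = conditional_log_density(("foggy", 1), dmap, Bernoulli(0.5), given)
ll_missing = conditional_log_density(("foggy", 1), dmap)
```

The key "foggy" is absent from dmap, so the second call falls back to the default distribution, while the third call has neither a matching key nor a default.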
ConditionalDistributionEstimator
- class pysp.stats.conditional.ConditionalDistributionEstimator(estimator_map, default_estimator=<pysp.stats.null_dist.NullEstimator object>, given_estimator=<pysp.stats.null_dist.NullEstimator object>, name=None, keys=None)
Estimator for ConditionalDistribution.
- estimator_map
Estimators for each conditional distribution.
- Type:
Dict[T0, ParameterEstimator]
- default_estimator
Estimator for default_distribution.
- Type:
ParameterEstimator
- given_estimator
Estimator for given_distribution.
- Type:
ParameterEstimator
- name
Name for object.
- Type:
Optional[str]
- keys
ConditionalDistributionEstimator with matching ‘keys’ will be aggregated.
- Type:
Optional[str]
- __init__(estimator_map, default_estimator=<pysp.stats.null_dist.NullEstimator object>, given_estimator=<pysp.stats.null_dist.NullEstimator object>, name=None, keys=None)
Initialize ConditionalDistributionEstimator.
- Parameters:
estimator_map (Dict[T0, ParameterEstimator]) – Estimators for each conditional distribution.
default_estimator (Optional[ParameterEstimator]) – Estimator for default_distribution.
given_estimator (Optional[ParameterEstimator]) – Estimator for given_distribution.
name (Optional[str], optional) – Name for object.
keys (Optional[str], optional) – ConditionalDistributionEstimator with matching ‘keys’ will be aggregated.
- Raises:
TypeError – If keys is not a string or None.
- accumulator_factory()
Return a ConditionalDistributionAccumulatorFactory for this estimator.
- Returns:
Factory object.
- Return type:
ConditionalDistributionAccumulatorFactory
- estimate(nobs, suff_stat)
Estimate a ConditionalDistribution from aggregated data.
- Parameters:
nobs (Optional[float]) – Not used. Kept for consistency.
suff_stat (Tuple[Dict[T0, SS0], Optional[SS1], Optional[SS2]]) – Sufficient statistics.
- Returns:
Estimated distribution.
- Return type:
ConditionalDistribution
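The idea of estimating from sufficient statistics keyed by the given variable can be sketched without pysp. The example below assumes binary y values and keeps a count and a sum per key; accumulate and estimate are hypothetical helpers, and the statistics layout is a simplification of the documented suff_stat tuple rather than its actual format.

```python
from collections import defaultdict


def accumulate(data):
    """Collect per-key sufficient statistics from (x, y) pairs, y in {0, 1}."""
    per_key = defaultdict(lambda: [0.0, 0.0])  # key -> [count, sum of y]
    for x, y in data:
        per_key[x][0] += 1.0
        per_key[x][1] += y
    return dict(per_key)


def estimate(suff_stat):
    """Turn the statistics into key probabilities and per-key success rates."""
    total = sum(count for count, _ in suff_stat.values())
    given_probs = {k: count / total for k, (count, _) in suff_stat.items()}
    cond_probs = {k: y_sum / count for k, (count, y_sum) in suff_stat.items()}
    return given_probs, cond_probs


data = [("a", 1), ("a", 0), ("a", 1), ("a", 1),
        ("b", 0), ("b", 0), ("b", 1), ("b", 0)]
given_probs, cond_probs = estimate(accumulate(data))
```

Each key gets its own estimate from only the observations that carry that key, while the key counts themselves give the estimate for the given variable.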
ConditionalDistributionSampler
- class pysp.stats.conditional.ConditionalDistributionSampler(dist, seed=None)
Sampler for ConditionalDistribution.
- dist
ConditionalDistribution object to draw samples from.
- Type:
ConditionalDistribution
- default_sampler
Sampler for default_dist.
- Type:
DistributionSampler
- has_default_sampler
True if default sampler is not NullDistribution.
- Type:
bool
- given_sampler
Sampler for given_dist.
- Type:
DistributionSampler
- has_given_sampler
True if given sampler is not NullDistribution.
- Type:
bool
- samplers
Dictionary of samplers for each key.
- Type:
Dict[T0, DistributionSampler]
- sample(size=None)
Sample independent samples from ConditionalDistribution.
- Parameters:
size (Optional[int], optional) – Number of samples to draw. If None, returns a single sample.
- Returns:
A tuple or list of tuples of (T0, T1).
- Return type:
Union[Tuple[Any, Any], List[Tuple[Any, Any]]]
- sample_given(x)
Sample from conditional distribution given value x.
- Parameters:
x (T0) – Value of given/conditional variable.
- Returns:
Single sample from the conditional distribution for x.
- Return type:
Any
- single_sample()
Generate a single sample from the ConditionalDistribution.
- Returns:
(T0, T1) as defined from dmap and given_distribution types in dist.
- Return type:
Tuple[Any, Any]
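The relationship between sample, sample_given, and single_sample can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the library's implementation: ToyConditionalSampler is hypothetical, it uses Python's random module, and it assumes a categorical given variable with a binary conditional variable.

```python
import random


class ToyConditionalSampler:
    """Hypothetical stand-in mirroring sample, sample_given, single_sample."""

    def __init__(self, given_probs, cond_probs, seed=None):
        self.rng = random.Random(seed)
        self.keys = list(given_probs)
        self.weights = [given_probs[k] for k in self.keys]
        self.cond_probs = cond_probs

    def sample_given(self, x):
        """Draw y in {0, 1} from the conditional distribution selected by x."""
        return 1 if self.rng.random() < self.cond_probs[x] else 0

    def single_sample(self):
        """Draw the given variable first, then the conditional one."""
        x = self.rng.choices(self.keys, weights=self.weights, k=1)[0]
        return x, self.sample_given(x)

    def sample(self, size=None):
        """Return one pair when size is None, else a list of size pairs."""
        if size is None:
            return self.single_sample()
        return [self.single_sample() for _ in range(size)]


sampler = ToyConditionalSampler({"a": 0.5, "b": 0.5}, {"a": 1.0, "b": 0.0}, seed=1)
pairs = sampler.sample(size=20)
one = sampler.sample()
```

With these deliberately extreme conditional probabilities, every pair drawn with key "a" has y equal to 1 and every pair with key "b" has y equal to 0, which makes the dependence of y on x easy to see.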