Joint Mixture Distribution
The Joint Mixture Distribution is a mixture of mixtures (see Mixture Distribution). It is particularly useful when observations can belong to multiple latent groups simultaneously, and it can capture multi-level clustering and dependencies between those groups. For \(K_1 = K_2\), the model can be viewed as a single-step Hidden Markov Distribution. In the generative process for a Joint Mixture Model with \(K_1\) outer-states and \(K_2\) inner-states, the initial group membership is drawn with probability \(P(Z_1 = k_1) = \pi_{k_1}\), and the second membership is then drawn with transition probability \(P(Z_2 = k_2 \vert Z_1 = k_1) = \tau_{k_1, k_2}\).
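Written out, and assuming the notation \(F_{k_1}\) and \(G_{k_2}\) (not used elsewhere on this page) for the component distributions of \(X_1\) and \(X_2\), a generative process consistent with these definitions is

\[
\begin{aligned}
Z_1 &\sim \mathrm{Categorical}(\pi_1, \dots, \pi_{K_1}), \\
Z_2 \mid Z_1 = k_1 &\sim \mathrm{Categorical}(\tau_{k_1, 1}, \dots, \tau_{k_1, K_2}), \\
X_1 \mid Z_1 = k_1 &\sim F_{k_1}, \\
X_2 \mid Z_2 = k_2 &\sim G_{k_2}.
\end{aligned}
\]

Under this reading, and writing \(f_{k_1}\) and \(g_{k_2}\) for the corresponding component densities, the density of an observation \((x_1, x_2)\) is obtained by summing over both latent states:

\[
p(x_1, x_2) = \sum_{k_1 = 1}^{K_1} \sum_{k_2 = 1}^{K_2} \pi_{k_1} \, \tau_{k_1, k_2} \, f_{k_1}(x_1) \, g_{k_2}(x_2).
\]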
JointMixtureDistribution
- class dml.stats.jmixture.JointMixtureDistribution(components1, components2, w1, w2, taus12, taus21, keys=(None, None, None), name=None)
JointMixtureDistribution object for defining a joint mixture distribution.
Notes
Data type is Tuple[T0, T1], where all components1 entries and components2 entries are compatible with T0 and T1, respectively.
- components1
Mixture components for mixture of X1.
- Type:
Sequence[SequenceEncodableProbabilityDistribution]
- components2
Mixture components for mixture of X2.
- Type:
Sequence[SequenceEncodableProbabilityDistribution]
- w1
Probability of drawing X1 from component i.
- Type:
np.ndarray
- w2
Probability of drawing X2 from component j.
- Type:
np.ndarray
- num_components1
Number of mixture components for X1.
- Type:
int
- num_components2
Number of mixture components for X2.
- Type:
int
- taus12
2-d Numpy array with probabilities of drawing X2 from comp j given X1 was drawn from comp i. Rows are component X1 state.
- Type:
np.ndarray
- taus21
2-d Numpy array with probabilities of drawing X1 from comp i given X2 was drawn from comp j. Rows are component X1 state.
- Type:
np.ndarray
- log_w1
Log-probability of drawing X1 from component i.
- Type:
np.ndarray
- log_w2
Log-probability of drawing X2 from component j.
- Type:
np.ndarray
- log_taus12
2-d Numpy array with log-probabilities of drawing X2 from comp j given X1 was drawn from comp i. Rows are component X1 state.
- Type:
np.ndarray
- log_taus21
2-d Numpy array with log-probabilities of drawing X1 from comp i given X2 was drawn from comp j. Rows are component X1 state.
- Type:
np.ndarray
- keys
Set keys for weights, mixture components of X1, mixture components of X2.
- Type:
Optional[Tuple[Optional[str], Optional[str], Optional[str]]]
- name
Set name to object.
- Type:
Optional[str]
- __init__(components1, components2, w1, w2, taus12, taus21, keys=(None, None, None), name=None)
JointMixtureDistribution object.
- Parameters:
components1 (Sequence[SequenceEncodableProbabilityDistribution]) – Mixture components for mixture of X1.
components2 (Sequence[SequenceEncodableProbabilityDistribution]) – Mixture components for mixture of X2.
w1 (np.ndarray) – Probability of drawing X1 from component i.
w2 (np.ndarray) – Probability of drawing X2 from component j.
taus12 (np.ndarray) – 2-d Numpy array with probabilities of drawing X2 from comp j given X1 was drawn from comp i. Rows are component X1 state.
taus21 (np.ndarray) – 2-d Numpy array with probabilities of drawing X1 from comp i given X2 was drawn from comp j. Rows are component X1 state.
keys (Optional[Tuple[Optional[str], Optional[str], Optional[str]]]) – Set keys for weights, mixture components of X1, mixture components of X2.
name (Optional[str]) – Set name to object.
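The weight and transition arguments describe one joint distribution over the two latent states, so they are tied together by Bayes' rule. The following is a minimal NumPy sketch of that relationship, not a call into the library: the numeric values are made up for illustration, and the name `p_z1_given_z2` is hypothetical.

```python
import numpy as np

# Hypothetical example values: K1 = 2 states for X1, K2 = 3 states for X2.
w1 = np.array([0.6, 0.4])                   # P(Z1 = i)
taus12 = np.array([[0.7, 0.2, 0.1],         # P(Z2 = j | Z1 = i); rows index Z1
                   [0.1, 0.3, 0.6]])

# Joint state probabilities P(Z1 = i, Z2 = j), shape (K1, K2).
joint = w1[:, None] * taus12

# Marginal over Z2, then the reverse conditional via Bayes' rule.
w2 = joint.sum(axis=0)                      # P(Z2 = j)
p_z1_given_z2 = joint / w2                  # column j holds P(Z1 = . | Z2 = j)
```

Here `p_z1_given_z2` is laid out with rows indexed by the X1 state. Whether `taus21` expects this layout or its transpose is an assumption to confirm against the implementation.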
- dist_to_encoder()
Create DataSequenceEncoder object for SequenceEncodableProbabilityDistribution instance.
- Return type:
- Returns:
DataSequenceEncoder
- estimator(pseudo_count=None)
Create a ParameterEstimator for corresponding SequenceEncodableProbabilityDistribution.
- Parameters:
pseudo_count (Optional[float]) – Regularize sufficient statistics in estimation step.
- Return type:
- Returns:
ParameterEstimator
- log_density(x)
Evaluate the log-density of distribution.
- Return type:
float
- Returns:
float
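As an illustration of what a joint mixture log-density involves, the sketch below sums over both latent states in log space for a single observation `(x1, x2)`. It is plain NumPy and does not use the library: Gaussian components given as `(mean, std)` pairs are an assumption made for the example, and `gauss_logpdf` and `joint_mixture_log_density` are hypothetical helper names.

```python
import numpy as np

def gauss_logpdf(x, mu, sigma):
    """Log-density of a univariate normal distribution."""
    return -0.5 * np.log(2.0 * np.pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2

def joint_mixture_log_density(x, w1, taus12, comps1, comps2):
    """log sum_i sum_j w1[i] * taus12[i, j] * f_i(x1) * g_j(x2)."""
    x1, x2 = x
    ll1 = np.array([gauss_logpdf(x1, m, s) for m, s in comps1])  # shape (K1,)
    ll2 = np.array([gauss_logpdf(x2, m, s) for m, s in comps2])  # shape (K2,)
    # Log of every (i, j) term, shape (K1, K2).
    log_terms = np.log(w1)[:, None] + np.log(taus12) + ll1[:, None] + ll2[None, :]
    # Log-sum-exp: subtract the maximum before exponentiating for stability.
    m = log_terms.max()
    return float(m + np.log(np.exp(log_terms - m).sum()))

# Made-up parameters: two components for X1, three for X2.
w1 = np.array([0.6, 0.4])
taus12 = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6]])
comps1 = [(0.0, 1.0), (5.0, 1.0)]
comps2 = [(-1.0, 1.0), (0.0, 2.0), (3.0, 1.0)]

value = joint_mixture_log_density((0.5, 1.0), w1, taus12, comps1, comps2)
```

Working in log space avoids underflow when individual component densities are very small.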
- sampler(seed=None)
Create a DistributionSampler object for a given ProbabilityDistribution.
- Parameters:
seed (Optional[int]) – Set seed for drawing samples from distribution.
- Return type:
- seq_log_density(x)
Vectorized evaluation of the log density.
- Parameters:
x (EncodedDataSequence) – EncodedDataSequence for corresponding SequenceEncodedProbabilityDistribution.
- Return type:
ndarray
- Returns:
np.ndarray
JointMixtureEstimator
- class dml.stats.jmixture.JointMixtureEstimator(estimators1, estimators2, suff_stat=None, pseudo_count=None, keys=(None, None, None), name=None)
JointMixtureEstimator object for estimating a joint mixture distribution from aggregated sufficient statistics.
- estimators1
Estimators for mixture component of X1.
- Type:
Sequence[ParameterEstimator]
- estimators2
Estimators for mixture component of X2.
- Type:
Sequence[ParameterEstimator]
- suff_stat
- pseudo_count
Used to re-weight the state counts in estimation.
- Type:
Optional[Tuple[float, float, float]]
- keys
Set keys for weights, mixture components of X1, mixture components of X2.
- Type:
Optional[Tuple[Optional[str], Optional[str], Optional[str]]]
- name
Set name to object.
- Type:
Optional[str]
- __init__(estimators1, estimators2, suff_stat=None, pseudo_count=None, keys=(None, None, None), name=None)
JointMixtureEstimator object.
- Parameters:
estimators1 (Sequence[ParameterEstimator]) – Estimators for mixture component of X1.
estimators2 (Sequence[ParameterEstimator]) – Estimators for mixture component of X2.
suff_stat (Optional[Tuple[np.ndarray, np.ndarray, np.ndarray, Tuple[E0, ...], Tuple[E1, ...]]])
pseudo_count (Optional[Tuple[float, float, float]]) – Used to re-weight the state counts in estimation.
keys (Optional[Tuple[Optional[str], Optional[str], Optional[str]]]) – Set keys for weights, mixture components of X1, mixture components of X2.
name (Optional[str]) – Set name to object.
- accumulator_factory()
Create SequenceEncodableStatisticAccumulator object.
- Return type:
JointMixtureEstimatorAccumulatorFactory
- estimate(nobs, suff_stat)
Estimate SequenceEncodableProbabilityDistribution for sufficient statistics.
- Parameters:
nobs (Optional[float]) – Weighted number of observations.
suff_stat (Tuple[int, np.ndarray, np.ndarray, np.ndarray]) – Sufficient statistics for Dirichlet distribution.
- Return type:
- Returns:
SequenceEncodableProbabilityDistribution
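To make the role of state counts concrete, the sketch below turns a matrix of joint state counts into weights and transition probabilities by normalizing rows and columns. It is an illustration under stated assumptions, not the library's estimation code: the `counts` layout, the function name `estimate_state_params`, and the scalar `smoothing` argument are all hypothetical, and `smoothing` is not the same thing as the `pseudo_count` tuple documented above.

```python
import numpy as np

def estimate_state_params(counts, smoothing=0.0):
    """counts[i, j]: (expected) number of observations with Z1 = i and Z2 = j."""
    c = np.asarray(counts, dtype=float) + smoothing
    row = c.sum(axis=1, keepdims=True)   # totals per X1 state, shape (K1, 1)
    col = c.sum(axis=0, keepdims=True)   # totals per X2 state, shape (1, K2)
    total = c.sum()
    w1 = row.ravel() / total             # P(Z1 = i)
    w2 = col.ravel() / total             # P(Z2 = j)
    taus12 = c / row                     # P(Z2 = j | Z1 = i); rows sum to 1
    p_z1_given_z2 = c / col              # P(Z1 = i | Z2 = j); columns sum to 1
    return w1, w2, taus12, p_z1_given_z2

# Made-up counts for a 2 x 2 state space.
w1, w2, taus12, p_z1_given_z2 = estimate_state_params([[30, 10], [5, 55]])
```

A positive `smoothing` value keeps every probability strictly positive, which avoids taking the log of zero when log-probabilities are computed afterwards.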
JointMixtureSampler
- class dml.stats.jmixture.JointMixtureSampler(dist, seed=None)
JointMixtureSampler object for sampling from a joint mixture distribution.
- rng
RandomState for seeding samples.
- Type:
RandomState
- dist
Distribution to sample from.
- Type:
- comp_sampler1
Inner-mixture sampler.
- Type:
- comp_sampler2
Outer-mixture sampler.
- Type:
- sample(size=None)
Generate samples from distribution.
- Parameters:
size (Optional[int]) – Number of samples to generate.
- Return type:
Union[Tuple[Any, Any], Sequence[Tuple[Any, Any]]]
- Returns:
Samples from distribution.
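Drawing from a joint mixture can be pictured as ancestral sampling: draw the X1 state, draw the X2 state conditionally on it, then draw each observation from its selected component. The sketch below shows this in plain NumPy and is not the `JointMixtureSampler` implementation. Gaussian components given as `(mean, std)` pairs are an assumption, and `sample_joint_mixture` is a hypothetical helper name.

```python
import numpy as np

def sample_joint_mixture(rng, w1, taus12, comps1, comps2, size):
    """Return a list of `size` tuples (x1, x2) drawn by ancestral sampling."""
    samples = []
    for _ in range(size):
        z1 = rng.choice(len(w1), p=w1)                    # state for X1
        z2 = rng.choice(taus12.shape[1], p=taus12[z1])    # state for X2 given z1
        x1 = rng.normal(*comps1[z1])                      # draw X1 from component z1
        x2 = rng.normal(*comps2[z2])                      # draw X2 from component z2
        samples.append((float(x1), float(x2)))
    return samples

# Made-up parameters: two components for X1, three for X2.
w1 = np.array([0.6, 0.4])
taus12 = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6]])
comps1 = [(0.0, 1.0), (5.0, 1.0)]
comps2 = [(-1.0, 1.0), (0.0, 2.0), (3.0, 1.0)]

rng = np.random.RandomState(0)   # seeded so that repeated runs give the same draws
draws = sample_joint_mixture(rng, w1, taus12, comps1, comps2, size=5)
```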