tmnt.modeling
Core neural network architectures for topic modeling.
Classes

- BaseSeqBowVED(llm, latent_dist[, ...])
- BaseVAE([vocab_size, latent_distribution, ...])
- BowVAEModel(enc_dim, embedding_size, ...[, ...]): Defines the neural architecture for a bag-of-words topic model.
- CoherenceRegularizer([coherence_pen, ...])
- ContinuousCovariateModel(n_topics, vocab_size)
- CovariateBowVAEModel([covar_net_layers]): Bag-of-words topic model with labels used as covariates.
- CovariateModel(n_topics, n_covars, vocab_size)
- GeneralizedSDMLLoss([smoothing_parameter, ...]): Calculates the batchwise Smoothed Deep Metric Learning (SDML) loss, using unpaired samples in the minibatch as potential negative examples.
- MetricBowVAEModel(*args, **kwargs)
- MetricSeqBowVED(*args, **kwargs)
- MultiNegativeCrossEntropyLoss([...])
- SeqBowVED(*args, **kwargs)
class BaseVAE(vocab_size=2000, latent_distribution=LogisticGaussianDistribution(...), coherence_reg_penalty=0.0, redundancy_reg_penalty=0.0, n_covars=0, device='cpu', **kwargs)
Bases: Module
The default latent_distribution is a LogisticGaussianDistribution whose mu and log-variance encoders map a 100-dimensional encoding to 20 latent dimensions, each followed by batch normalization, with dropout (p=0.1) applied after sampling.
class BowVAEModel(enc_dim, embedding_size, n_encoding_layers, enc_dr, n_labels=0, gamma=1.0, multilabel=False, classifier_dropout=0.1, *args, **kwargs)
Bases: BaseVAE
Defines the neural architecture for a bag-of-words topic model.
Parameters:
- enc_dim (int): Dimension of the input encoder (first fully connected layer)
- embedding_size (int): Number of dimensions for the embedding layer
- n_encoding_layers (int): Number of layers used for the encoder (default = 1)
- enc_dr (float): Dropout applied after each encoder layer (default = 0.1)
- n_covars (int): Number of values for the categorical covariate (0 for a non-covariate BOW model)
- device (str): Device on which to place the model (e.g. 'cpu')
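A minimal construction sketch using the parameters documented above. The vocab_size keyword is assumed to be forwarded to BaseVAE via **kwargs; treat this as illustrative rather than a verified invocation:

    from tmnt.modeling import BowVAEModel

    model = BowVAEModel(
        enc_dim=100,           # first fully connected encoder layer
        embedding_size=200,    # embedding dimensionality
        n_encoding_layers=1,
        enc_dr=0.1,            # dropout after each encoder layer
        n_labels=0,            # unsupervised: no label classifier
        vocab_size=2000,       # assumed to pass through to BaseVAE
        device='cpu',
    )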
encode_data(data, include_bn=True)
Encode data to the mean of the latent distribution defined by the input data.
Parameters:
- data (tensor): Input data of shape (batch_size, vocab_size)
Returns:
- tensor: Result of encoding, with shape (batch_size, n_latent)
predict(data)
Predict the label given the input data (ignoring the VAE reconstruction).
Parameters:
- data (tensor): Input data tensor
Returns:
- tensor: Unnormalized outputs over label values
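A usage sketch for encode_data() and predict(), assuming model is an already-trained BowVAEModel and the bag-of-words shapes follow the docs above:

    import torch

    # Hypothetical batch of 8 documents as bag-of-words count vectors
    bows = torch.rand(8, 2000)      # (batch_size, vocab_size)

    # Mean of the latent (topic) distribution for each document
    z = model.encode_data(bows)     # (batch_size, n_latent)

    # Unnormalized label scores (meaningful only when n_labels > 0)
    logits = model.predict(bows)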
forward(data)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the forward pass must be defined within this function, call the Module instance itself rather than forward() directly; the instance call runs any registered hooks, while a direct forward() call silently ignores them.
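This note is inherited from torch.nn.Module; in practice it means calling the instance, not the method:

    out = model(bows)          # runs registered hooks, then forward()
    out = model.forward(bows)  # works, but silently skips any hooks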
class MetricBowVAEModel(*args, **kwargs)
Bases: BowVAEModel

forward(F, data1, data2)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward() about calling the Module instance rather than forward() directly.
class CovariateBowVAEModel(covar_net_layers=1, *args, **kwargs)
Bases: BowVAEModel
Bag-of-words topic model with labels used as covariates.
encode_data_with_covariates(data, covars, include_bn=False)
Encode data to the mean of the latent distribution defined by the input data and covariates.

get_ordered_terms_with_covar_at_data(data, k, covar)
Uses test/training data points as the input points around which term sensitivity is computed.
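A hedged sketch of the covariate-aware methods, assuming model is a trained CovariateBowVAEModel and covariates are supplied as a tensor aligned row-for-row with the data batch (the exact covariate encoding, e.g. one-hot, is an assumption here):

    import torch

    bows = torch.rand(8, 2000)    # (batch_size, vocab_size)
    covars = torch.zeros(8, 4)    # assumed one-hot categorical covariates
    covars[:, 0] = 1.0

    z = model.encode_data_with_covariates(bows, covars)

    # Top-k terms ranked by sensitivity around these data points
    terms = model.get_ordered_terms_with_covar_at_data(bows, 10, covars)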
forward(F, data, covars)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class CovariateModel(n_topics, n_covars, vocab_size, interactions=False, device='cpu')
Bases: Module
forward(topic_distrib, covars)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class ContinuousCovariateModel(n_topics, vocab_size, total_layers=1, device='device')
Bases: Module
forward(topic_distrib, scalars)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class CoherenceRegularizer(coherence_pen=1.0, redundancy_pen=1.0)
Bases: Module
forward(w, emb)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
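The semantics of forward(w, emb) are not documented above; a plausible reading, consistent with coherence regularization in topic models, is that w is the decoder's topic-term weight matrix and emb a word embedding matrix. A sketch under that assumption:

    import torch
    from tmnt.modeling import CoherenceRegularizer

    reg = CoherenceRegularizer(coherence_pen=1.0, redundancy_pen=1.0)

    w = torch.randn(2000, 20)     # assumed topic-term decoder weights
    emb = torch.randn(2000, 200)  # assumed word embedding matrix
    penalty = reg(w, emb)         # coherence/redundancy penalty term(s)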
class BaseSeqBowVED(llm, latent_dist, num_classes=0, dropout=0.0, vocab_size=2000, kld=0.1, device='cpu', use_pooling=True, entropy_loss_coef=1000.0, redundancy_reg_penalty=0.0, pre_trained_embedding=None)
Bases: BaseVAE
class SeqBowVED(*args, **kwargs)
Bases: BaseSeqBowVED
forward(input_ids, attention_mask, bow=None)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
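A hedged end-to-end sketch for SeqBowVED.forward(), assuming seq_model is an already-constructed SeqBowVED whose llm is a Hugging Face transformer, so its matching tokenizer produces input_ids and attention_mask; the optional bow tensor is assumed to be the bag-of-words view of the same documents:

    import torch
    from transformers import AutoTokenizer

    # "distilbert-base-uncased" is an illustrative checkpoint choice
    tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    batch = tok(["a document about sports", "a document about politics"],
                padding=True, return_tensors="pt")

    bows = torch.rand(2, 2000)  # assumed (batch_size, vocab_size) counts

    out = seq_model(batch["input_ids"], batch["attention_mask"], bow=bows)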
class MetricSeqBowVED(*args, **kwargs)
Bases: BaseSeqBowVED
forward(in1, mask1, bow1, in2, mask2, bow2)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class GeneralizedSDMLLoss(smoothing_parameter=0.3, weight=1.0, batch_axis=0, x2_downweight_idx=-1, **kwargs)
Bases: _Loss
Calculates the batchwise Smoothed Deep Metric Learning (SDML) loss given two input tensors and a smoothing weight. SDML loss learns similarity between paired samples by using unpaired samples in the minibatch as potential negative examples.
The loss is described in greater detail in Bonadiman, Daniele, Anjishnu Kumar, and Arpit Mittal. "Large Scale Question Paraphrase Retrieval with Smoothed Deep Metric Learning." arXiv preprint arXiv:1905.12786 (2019). URL: https://arxiv.org/pdf/1905.12786.pdf
Parameters:
- smoothing_parameter (float): Probability mass to be distributed over the minibatch. Must be < 1.0.
- weight (float or None): Global scalar weight for the loss.
- batch_axis (int, default 0): The axis that represents the mini-batch.
Inputs:
- x1: Minibatch of data points with shape (batch_size, vector_dim)
- x2: Minibatch of data points with shape (batch_size, vector_dim). Each item in x1 is a positive sample for the items with the same label in x2; that is, x1[0] and x2[0] form a positive pair iff label(x1[0]) = label(x2[0]). All data points in different rows should be decorrelated.
Outputs:
- loss: Loss tensor with shape (batch_size,)
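A minimal sketch of the loss call following the Inputs/Outputs contract above; the label tensors l1 and l2 are assumed to be per-row integer class labels used to decide which pairs count as positives:

    import torch
    from tmnt.modeling import GeneralizedSDMLLoss

    loss_fn = GeneralizedSDMLLoss(smoothing_parameter=0.3)

    x1 = torch.randn(16, 128)        # (batch_size, vector_dim)
    x2 = torch.randn(16, 128)
    l1 = torch.randint(0, 4, (16,))  # assumed integer labels per row
    l2 = l1.clone()                  # x1[i], x2[i] form positive pairs

    loss = loss_fn(x1, l1, x2, l2)   # shape (batch_size,)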
forward(x1, l1, x2, l2)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class MultiNegativeCrossEntropyLoss(smoothing_parameter=0.1, metric_loss_temp=0.1, batch_axis=0, **kwargs)
Bases: _Loss
Inputs:
- x1: Minibatch of data points with shape (batch_size, vector_dim)
- x2: Minibatch of data points with shape (batch_size, vector_dim). Each item in x1 is a positive sample for the items with the same label in x2; that is, x1[0] and x2[0] form a positive pair iff label(x1[0]) = label(x2[0]). All data points in different rows should be decorrelated.
Outputs:
- loss: Loss tensor with shape (batch_size,)
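MultiNegativeCrossEntropyLoss shares the same call contract as GeneralizedSDMLLoss; a brief sketch, reusing the tensors from the previous example:

    from tmnt.modeling import MultiNegativeCrossEntropyLoss

    mn_loss_fn = MultiNegativeCrossEntropyLoss(smoothing_parameter=0.1,
                                               metric_loss_temp=0.1)
    loss = mn_loss_fn(x1, l1, x2, l2)  # shape (batch_size,)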
forward(x1, l1, x2, l2)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().