tmnt.modeling
Core neural network architectures for topic modeling.
Classes

- BaseSeqBowVED(llm, latent_dist[, ...])
- BaseVAE([vocab_size, latent_distribution, ...])
- BowVAEModel(enc_dim, embedding_size, ...[, ...]): Defines the neural architecture for a bag-of-words topic model.
- CoherenceRegularizer([coherence_pen, ...])
- ContinuousCovariateModel(n_topics, vocab_size)
- CovariateBowVAEModel([covar_net_layers]): Bag-of-words topic model with labels used as covariates.
- CovariateModel(n_topics, n_covars, vocab_size)
- GeneralizedSDMLLoss([smoothing_parameter, ...]): Calculates the batchwise Smoothed Deep Metric Learning (SDML) loss, using unpaired samples in the minibatch as potential negative examples.
- MetricBowVAEModel(*args, **kwargs)
- MetricSeqBowVED(*args, **kwargs)
- MultiNegativeCrossEntropyLoss([...])
- SeqBowVED(*args, **kwargs)
class BaseVAE(vocab_size=2000, latent_distribution=LogisticGaussianDistribution(...), coherence_reg_penalty=0.0, redundancy_reg_penalty=0.0, n_covars=0, device='cpu', **kwargs)
Bases: Module
The default latent_distribution is a LogisticGaussianDistribution whose mu and log-variance encoders map a 100-dimensional encoding to 20 latent dimensions, each followed by batch normalization, with dropout (p=0.1) applied after sampling.
class BowVAEModel(enc_dim, embedding_size, n_encoding_layers, enc_dr, n_labels=0, gamma=1.0, multilabel=False, classifier_dropout=0.1, *args, **kwargs)
Bases: BaseVAE
Defines the neural architecture for a bag-of-words topic model.
Parameters:
- enc_dim (int): Dimension of the input encoder (first fully connected layer)
- embedding_size (int): Number of dimensions for the embedding layer
- n_encoding_layers (int): Number of layers used for the encoder (default = 1)
- enc_dr (float): Dropout applied after each encoder layer (default = 0.1)
- n_covars (int): Number of values for the categorical covariate (0 for a non-covariate BOW model)
- device (str): Device on which to place the model (e.g. 'cpu')
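A minimal construction sketch using the parameters documented above. The vocab_size keyword is assumed to be forwarded to BaseVAE via **kwargs; treat this as illustrative rather than a verified invocation:

    from tmnt.modeling import BowVAEModel

    model = BowVAEModel(
        enc_dim=100,           # first fully connected encoder layer
        embedding_size=200,    # embedding dimensionality
        n_encoding_layers=1,
        enc_dr=0.1,            # dropout after each encoder layer
        n_labels=0,            # unsupervised: no label classifier
        vocab_size=2000,       # assumed to pass through to BaseVAE
        device='cpu',
    )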
encode_data(data, include_bn=True)
Encode data to the mean of the latent distribution defined by the input data.
Parameters:
- data (tensor): Input data of shape (batch_size, vocab_size)
Returns:
- tensor: Result of encoding, with shape (batch_size, n_latent)
predict(data)
Predict the label given the input data (ignoring the VAE reconstruction).
Parameters:
- data (tensor): Input data tensor
Returns:
- tensor: Unnormalized outputs over label values
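A usage sketch for encode_data() and predict(), assuming model is an already-trained BowVAEModel and the bag-of-words shapes follow the docs above:

    import torch

    # Hypothetical batch of 8 documents as bag-of-words count vectors
    bows = torch.rand(8, 2000)      # (batch_size, vocab_size)

    # Mean of the latent (topic) distribution for each document
    z = model.encode_data(bows)     # (batch_size, n_latent)

    # Unnormalized label scores (meaningful only when n_labels > 0)
    logits = model.predict(bows)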
forward(data)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the forward pass must be defined within this function, call the Module instance itself rather than forward() directly; the instance call runs any registered hooks, while a direct forward() call silently ignores them.
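This note is inherited from torch.nn.Module; in practice it means calling the instance, not the method:

    out = model(bows)          # runs registered hooks, then forward()
    out = model.forward(bows)  # works, but silently skips any hooks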
class MetricBowVAEModel(*args, **kwargs)
Bases: BowVAEModel

forward(F, data1, data2)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward() about calling the Module instance rather than forward() directly.
class CovariateBowVAEModel(covar_net_layers=1, *args, **kwargs)
Bases: BowVAEModel
Bag-of-words topic model with labels used as covariates.
encode_data_with_covariates(data, covars, include_bn=False)
Encode data to the mean of the latent distribution defined by the input data and covariates.

get_ordered_terms_with_covar_at_data(data, k, covar)
Uses test/training data points as the input points around which term sensitivity is computed.
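A hedged sketch of the covariate-aware methods, assuming model is a trained CovariateBowVAEModel and covariates are supplied as a tensor aligned row-for-row with the data batch (the exact covariate encoding, e.g. one-hot, is an assumption here):

    import torch

    bows = torch.rand(8, 2000)    # (batch_size, vocab_size)
    covars = torch.zeros(8, 4)    # assumed one-hot categorical covariates
    covars[:, 0] = 1.0

    z = model.encode_data_with_covariates(bows, covars)

    # Top-k terms ranked by sensitivity around these data points
    terms = model.get_ordered_terms_with_covar_at_data(bows, 10, covars)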
forward(F, data, covars)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class CovariateModel(n_topics, n_covars, vocab_size, interactions=False, device='cpu')
Bases: Module
forward(topic_distrib, covars)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class ContinuousCovariateModel(n_topics, vocab_size, total_layers=1, device='device')
Bases: Module
forward(topic_distrib, scalars)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class CoherenceRegularizer(coherence_pen=1.0, redundancy_pen=1.0)
Bases: Module
forward(w, emb)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
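The semantics of forward(w, emb) are not documented above; a plausible reading, consistent with coherence regularization in topic models, is that w is the decoder's topic-term weight matrix and emb a word embedding matrix. A sketch under that assumption:

    import torch
    from tmnt.modeling import CoherenceRegularizer

    reg = CoherenceRegularizer(coherence_pen=1.0, redundancy_pen=1.0)

    w = torch.randn(2000, 20)     # assumed topic-term decoder weights
    emb = torch.randn(2000, 200)  # assumed word embedding matrix
    penalty = reg(w, emb)         # coherence/redundancy penalty term(s)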
class BaseSeqBowVED(llm, latent_dist, num_classes=0, dropout=0.0, vocab_size=2000, kld=0.1, device='cpu', use_pooling=True, entropy_loss_coef=1000.0, redundancy_reg_penalty=0.0, pre_trained_embedding=None)
Bases: BaseVAE
class SeqBowVED(*args, **kwargs)
Bases: BaseSeqBowVED
forward(input_ids, attention_mask, bow=None)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
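A hedged end-to-end sketch for SeqBowVED.forward(), assuming seq_model is an already-constructed SeqBowVED whose llm is a Hugging Face transformer, so its matching tokenizer produces input_ids and attention_mask; the optional bow tensor is assumed to be the bag-of-words view of the same documents:

    import torch
    from transformers import AutoTokenizer

    # "distilbert-base-uncased" is an illustrative checkpoint choice
    tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    batch = tok(["a document about sports", "a document about politics"],
                padding=True, return_tensors="pt")

    bows = torch.rand(2, 2000)  # assumed (batch_size, vocab_size) counts

    out = seq_model(batch["input_ids"], batch["attention_mask"], bow=bows)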
class MetricSeqBowVED(*args, **kwargs)
Bases: BaseSeqBowVED
forward(in1, mask1, bow1, in2, mask2, bow2)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class GeneralizedSDMLLoss(smoothing_parameter=0.3, weight=1.0, batch_axis=0, x2_downweight_idx=-1, **kwargs)
Bases: _Loss
Calculates the batchwise Smoothed Deep Metric Learning (SDML) loss given two input tensors and a smoothing weight. SDML loss learns similarity between paired samples by using unpaired samples in the minibatch as potential negative examples.
The loss is described in greater detail in Bonadiman, Daniele, Anjishnu Kumar, and Arpit Mittal. "Large Scale Question Paraphrase Retrieval with Smoothed Deep Metric Learning." arXiv preprint arXiv:1905.12786 (2019). URL: https://arxiv.org/pdf/1905.12786.pdf
Parameters:
- smoothing_parameter (float): Probability mass to be distributed over the minibatch. Must be < 1.0.
- weight (float or None): Global scalar weight for the loss.
- batch_axis (int, default 0): The axis that represents the mini-batch.
Inputs:
- x1: Minibatch of data points with shape (batch_size, vector_dim)
- x2: Minibatch of data points with shape (batch_size, vector_dim). Each item in x1 is a positive sample for the items with the same label in x2; that is, x1[0] and x2[0] form a positive pair iff label(x1[0]) = label(x2[0]). All data points in different rows should be decorrelated.
Outputs:
- loss: Loss tensor with shape (batch_size,)
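A minimal sketch of the loss call following the Inputs/Outputs contract above; the label tensors l1 and l2 are assumed to be per-row integer class labels used to decide which pairs count as positives:

    import torch
    from tmnt.modeling import GeneralizedSDMLLoss

    loss_fn = GeneralizedSDMLLoss(smoothing_parameter=0.3)

    x1 = torch.randn(16, 128)        # (batch_size, vector_dim)
    x2 = torch.randn(16, 128)
    l1 = torch.randint(0, 4, (16,))  # assumed integer labels per row
    l2 = l1.clone()                  # x1[i], x2[i] form positive pairs

    loss = loss_fn(x1, l1, x2, l2)   # shape (batch_size,)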
forward(x1, l1, x2, l2)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().
class MultiNegativeCrossEntropyLoss(smoothing_parameter=0.1, metric_loss_temp=0.1, batch_axis=0, **kwargs)
Bases: _Loss
Inputs:
- x1: Minibatch of data points with shape (batch_size, vector_dim)
- x2: Minibatch of data points with shape (batch_size, vector_dim). Each item in x1 is a positive sample for the items with the same label in x2; that is, x1[0] and x2[0] form a positive pair iff label(x1[0]) = label(x2[0]). All data points in different rows should be decorrelated.
Outputs:
- loss: Loss tensor with shape (batch_size,)
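MultiNegativeCrossEntropyLoss shares the same call contract as GeneralizedSDMLLoss; a brief sketch, reusing the tensors from the previous example:

    from tmnt.modeling import MultiNegativeCrossEntropyLoss

    mn_loss_fn = MultiNegativeCrossEntropyLoss(smoothing_parameter=0.1,
                                               metric_loss_temp=0.1)
    loss = mn_loss_fn(x1, l1, x2, l2)  # shape (batch_size,)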
forward(x1, l1, x2, l2)
Defines the computation performed at every call. Should be overridden by all subclasses. See the note under BowVAEModel.forward().