pixyz.losses (Loss API)¶
Loss¶
class pixyz.losses.losses.Loss(p, q=None, input_var=None)[source]¶
Bases: object
Loss class. In Pixyz, all loss classes are required to inherit this class.
Examples
>>> import torch
>>> from torch.nn import functional as F
>>> from pixyz.distributions import Bernoulli, Normal
>>> from pixyz.losses import StochasticReconstructionLoss, KullbackLeibler
...
>>> # Set distributions
>>> class Inference(Normal):
...     def __init__(self):
...         super().__init__(cond_var=["x"], var=["z"], name="q")
...         self.model_loc = torch.nn.Linear(128, 64)
...         self.model_scale = torch.nn.Linear(128, 64)
...     def forward(self, x):
...         return {"loc": self.model_loc(x), "scale": F.softplus(self.model_scale(x))}
...
>>> class Generator(Bernoulli):
...     def __init__(self):
...         super().__init__(cond_var=["z"], var=["x"], name="p")
...         self.model = torch.nn.Linear(64, 128)
...     def forward(self, z):
...         return {"probs": torch.sigmoid(self.model(z))}
...
>>> p = Generator()
>>> q = Inference()
>>> prior = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.),
...                var=["z"], features_shape=[64], name="p_{prior}")
...
>>> # Define a loss function (VAE)
>>> reconst = StochasticReconstructionLoss(q, p)
>>> kl = KullbackLeibler(q, prior)
>>> loss_cls = (reconst - kl).mean()
>>> print(loss_cls)
mean \left(- D_{KL} \left[q(z|x)||p_{prior}(z) \right] - \mathbb{E}_{q(z|x)} \left[\log p(x|z) \right] \right)
>>> # Evaluate this loss function
>>> data = torch.randn(1, 128)  # Pseudo data
>>> loss = loss_cls.eval({"x": data})
>>> print(loss)  # doctest: +SKIP
tensor(65.5939, grad_fn=<MeanBackward0>)
__init__(p, q=None, input_var=None)[source]¶
Parameters:
- p (pixyz.distributions.Distribution) – Distribution.
- q (pixyz.distributions.Distribution, defaults to None) – Distribution.
- input_var (list of str, defaults to None) – Input variables of this loss function. In general, users do not need to set them explicitly because these depend on the given distributions and each loss function.
input_var¶
Input variables of this loss function.
Type: list
loss_text¶
abs()[source]¶
Return an instance of pixyz.losses.losses.AbsLoss.
Returns: An instance of pixyz.losses.losses.AbsLoss
Return type: pixyz.losses.losses.AbsLoss
mean()[source]¶
Return an instance of pixyz.losses.losses.BatchMean.
Returns: An instance of pixyz.losses.losses.BatchMean
Return type: pixyz.losses.losses.BatchMean
sum()[source]¶
Return an instance of pixyz.losses.losses.BatchSum.
Returns: An instance of pixyz.losses.losses.BatchSum
Return type: pixyz.losses.losses.BatchSum
expectation(p, input_var=None, sample_shape=torch.Size([]))[source]¶
Return an instance of pixyz.losses.Expectation.
Parameters:
- p (pixyz.distributions.Distribution) – Distribution for sampling.
- input_var (list) – Input variables of this loss.
- sample_shape (list or NoneType, defaults to torch.Size()) – Shape of generated samples.
Returns: An instance of pixyz.losses.Expectation
Return type: pixyz.losses.Expectation
eval(x_dict={}, return_dict=False, **kwargs)[source]¶
Evaluate the value of the loss function given inputs (x_dict).
Parameters:
- x_dict (dict, defaults to {}) – Input variables.
- return_dict (bool, defaults to False) – Whether to return samples along with the evaluated value of the loss function.
Returns:
- loss (torch.Tensor) – The evaluated value of the loss function.
- x_dict (dict) – All samples generated when evaluating the loss function. If return_dict is False, it is not returned.
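The snippet below is a minimal sketch of calling eval with return_dict=True. It assumes the loss_cls and pseudo data (data) defined in the Loss example above, and that the returned dictionary holds the samples (here z) drawn during evaluation.

>>> # A minimal sketch, assuming `loss_cls` and `data` from the Loss example above
>>> loss, samples = loss_cls.eval({"x": data}, return_dict=True)  # doctest: +SKIP
>>> sorted(samples.keys())  # doctest: +SKIP
['x', 'z']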
Probability density function¶
LogProb¶
class pixyz.losses.LogProb(p, sum_features=True, feature_dims=None)[source]¶
Bases: pixyz.losses.losses.Loss
The log probability density/mass function.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import LogProb
>>> p = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["x"],
...            features_shape=[10])
>>> loss_cls = LogProb(p)  # or p.log_prob()
>>> print(loss_cls)
\log p(x)
>>> sample_x = torch.randn(2, 10)  # Pseudo data
>>> loss = loss_cls.eval({"x": sample_x})
>>> print(loss)  # doctest: +SKIP
tensor([12.9894, 15.5280])
Prob¶
class pixyz.losses.Prob(p, sum_features=True, feature_dims=None)[source]¶
Bases: pixyz.losses.pdf.LogProb
The probability density/mass function.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import Prob
>>> p = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["x"],
...            features_shape=[10])
>>> loss_cls = Prob(p)  # or p.prob()
>>> print(loss_cls)
p(x)
>>> sample_x = torch.randn(2, 10)  # Pseudo data
>>> loss = loss_cls.eval({"x": sample_x})
>>> print(loss)  # doctest: +SKIP
tensor([3.2903e-07, 5.5530e-07])
Expected value¶
Expectation¶
class pixyz.losses.Expectation(p, f, input_var=None, sample_shape=torch.Size([]))[source]¶
Bases: pixyz.losses.losses.Loss
Expectation of a given function (Monte Carlo approximation).

\mathbb{E}_{p(x)}\left[f(x)\right] \approx \frac{1}{L}\sum_{l=1}^L f(x_l),

where x_l \sim p(x).
Note that f doesn't need to be able to sample, which is known as the law of the unconscious statistician (LOTUS).
Therefore, in this class, f is assumed to be an instance of pixyz.losses.Loss.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import LogProb
>>> q = Normal(loc="x", scale=torch.tensor(1.), var=["z"], cond_var=["x"],
...            features_shape=[10])  # q(z|x)
>>> p = Normal(loc="z", scale=torch.tensor(1.), var=["x"], cond_var=["z"],
...            features_shape=[10])  # p(x|z)
>>> loss_cls = LogProb(p).expectation(q)  # equals to Expectation(q, LogProb(p))
>>> print(loss_cls)
\mathbb{E}_{p(z|x)} \left[\log p(x|z) \right]
>>> sample_x = torch.randn(2, 10)  # Pseudo data
>>> loss = loss_cls.eval({"x": sample_x})
>>> print(loss)  # doctest: +SKIP
tensor([-12.8181, -12.6062])
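As a hedged sketch, the sample_shape argument documented in Loss.expectation above can also be passed when building the expectation; this assumes it controls the number of Monte Carlo samples L in the approximation above.

>>> # A minimal sketch, assuming sample_shape sets the number of MC samples
>>> loss_cls_mc = LogProb(p).expectation(q, sample_shape=torch.Size([5]))  # doctest: +SKIP
>>> loss = loss_cls_mc.eval({"x": sample_x})  # doctest: +SKIP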
Entropy¶
CrossEntropy¶
class pixyz.losses.CrossEntropy(p, q, input_var=None)[source]¶
Bases: pixyz.losses.losses.SetLoss
Cross entropy, a.k.a. the negative expected value of the log-likelihood (Monte Carlo approximation).

H(p, q) = -\mathbb{E}_{p(x)}\left[\log q(x)\right] \approx -\frac{1}{L}\sum_{l=1}^L \log q(x_l),

where x_l \sim p(x).
Note
This class is a special case of the Expectation class.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import CrossEntropy
>>> p = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["x"], features_shape=[64], name="p")
>>> q = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["x"], features_shape=[64], name="q")
>>> loss_cls = CrossEntropy(p, q)
>>> print(loss_cls)
- \mathbb{E}_{p(x)} \left[\log q(x) \right]
>>> loss = loss_cls.eval()
Entropy¶
class pixyz.losses.Entropy(p, input_var=None)[source]¶
Bases: pixyz.losses.losses.SetLoss
Entropy (Monte Carlo approximation).

H(p) = -\mathbb{E}_{p(x)}\left[\log p(x)\right] \approx -\frac{1}{L}\sum_{l=1}^L \log p(x_l),

where x_l \sim p(x).
Note
This class is a special case of the Expectation class.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import Entropy
>>> p = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["x"], features_shape=[64])
>>> loss_cls = Entropy(p)
>>> print(loss_cls)
- \mathbb{E}_{p(x)} \left[\log p(x) \right]
>>> loss = loss_cls.eval()
AnalyticalEntropy¶
class pixyz.losses.AnalyticalEntropy(p, q=None, input_var=None)[source]¶
Bases: pixyz.losses.losses.Loss
Entropy (analytical).

H(p) = -\mathbb{E}_{p(x)}\left[\log p(x)\right]
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import AnalyticalEntropy
>>> p = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["x"], features_shape=[64])
>>> loss_cls = AnalyticalEntropy(p)
>>> print(loss_cls)
- \mathbb{E}_{p(x)} \left[\log p(x) \right]
>>> loss = loss_cls.eval()
StochasticReconstructionLoss¶
class pixyz.losses.StochasticReconstructionLoss(encoder, decoder, input_var=None)[source]¶
Bases: pixyz.losses.losses.SetLoss
Reconstruction loss (Monte Carlo approximation).

-\mathbb{E}_{q(z|x)}\left[\log p(x|z)\right] \approx -\frac{1}{L}\sum_{l=1}^L \log p(x|z_l),

where z_l \sim q(z|x).
Note
This class is a special case of the Expectation class.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import StochasticReconstructionLoss
>>> q = Normal(loc="x", scale=torch.tensor(1.), var=["z"], cond_var=["x"], features_shape=[64], name="q")  # q(z|x)
>>> p = Normal(loc="z", scale=torch.tensor(1.), var=["x"], cond_var=["z"], features_shape=[64], name="p")  # p(x|z)
>>> loss_cls = StochasticReconstructionLoss(q, p)
>>> print(loss_cls)
- \mathbb{E}_{q(z|x)} \left[\log p(x|z) \right]
>>> loss = loss_cls.eval({"x": torch.randn(1, 64)})
Lower bound¶
ELBO¶
class pixyz.losses.ELBO(p, q, input_var=None)[source]¶
Bases: pixyz.losses.losses.SetLoss
The evidence lower bound (Monte Carlo approximation).

\mathbb{E}_{q(z|x)}\left[\log \frac{p(x,z)}{q(z|x)}\right] \approx \frac{1}{L}\sum_{l=1}^L \log \frac{p(x, z_l)}{q(z_l|x)},

where z_l \sim q(z|x).
Note
This class is a special case of the Expectation class.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import ELBO
>>> q = Normal(loc="x", scale=torch.tensor(1.), var=["z"], cond_var=["x"], features_shape=[64])  # q(z|x)
>>> p = Normal(loc="z", scale=torch.tensor(1.), var=["x"], cond_var=["z"], features_shape=[64])  # p(x|z)
>>> loss_cls = ELBO(p, q)
>>> print(loss_cls)
\mathbb{E}_{p(z|x)} \left[\log p(x|z) - \log p(z|x) \right]
>>> loss = loss_cls.eval({"x": torch.randn(1, 64)})
Statistical distance¶
KullbackLeibler¶
class pixyz.losses.KullbackLeibler(p, q, input_var=None, dim=None)[source]¶
Bases: pixyz.losses.losses.Loss
Kullback-Leibler divergence (analytical).

D_{KL}\left[p||q\right] = \mathbb{E}_{p(x)}\left[\log \frac{p(x)}{q(x)}\right]
Examples
>>> import torch
>>> from pixyz.distributions import Normal, Beta
>>> from pixyz.losses import KullbackLeibler
>>> p = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["z"], features_shape=[64], name="p")
>>> q = Beta(concentration0=torch.tensor(1.), concentration1=torch.tensor(1.),
...          var=["z"], features_shape=[64], name="q")
>>> loss_cls = KullbackLeibler(p, q)
>>> print(loss_cls)
D_{KL} \left[p(z)||q(z) \right]
>>> loss = loss_cls.eval()
WassersteinDistance¶
class pixyz.losses.WassersteinDistance(p, q, metric=PairwiseDistance(), input_var=None)[source]¶
Bases: pixyz.losses.losses.Loss
Wasserstein distance.

W(p, q) = \inf_{\Gamma \in \mathcal{P}(x_p \sim p, x_q \sim q)} \mathbb{E}_{(x_p, x_q) \sim \Gamma}\left[d(x_p, x_q)\right]

However, instead of the above true distance, this class computes the following one.

W^{upper}(p, q) = \mathbb{E}_{x_p \sim p, x_q \sim q}\left[d(x_p, x_q)\right]

Here, W^{upper} is an upper bound of W (i.e., W \leq W^{upper}), and these are equal when both p and q are degenerate (deterministic) distributions.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import WassersteinDistance
>>> p = Normal(loc="x", scale=torch.tensor(1.), var=["z"], cond_var=["x"], features_shape=[64], name="p")
>>> q = Normal(loc="x", scale=torch.tensor(1.), var=["z"], cond_var=["x"], features_shape=[64], name="q")
>>> loss_cls = WassersteinDistance(p, q)
>>> print(loss_cls)
W^{upper} \left(p(z|x), q(z|x) \right)
>>> loss = loss_cls.eval({"x": torch.randn(1, 64)})
MMD¶
class pixyz.losses.MMD(p, q, input_var=None, kernel='gaussian', **kernel_params)[source]¶
Bases: pixyz.losses.losses.Loss
The Maximum Mean Discrepancy (MMD).

D_{MMD^2}\left[p||q\right] = \mathbb{E}_{p(x), p(x')}\left[k(x, x')\right] + \mathbb{E}_{q(y), q(y')}\left[k(y, y')\right] - 2\mathbb{E}_{p(x), q(y)}\left[k(x, y)\right],

where k(x, x') is any positive definite kernel.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import MMD
>>> p = Normal(loc="x", scale=torch.tensor(1.), var=["z"], cond_var=["x"], features_shape=[64], name="p")
>>> q = Normal(loc="x", scale=torch.tensor(1.), var=["z"], cond_var=["x"], features_shape=[64], name="q")
>>> loss_cls = MMD(p, q, kernel="gaussian")
>>> print(loss_cls)
D_{MMD^2} \left[p(z|x)||q(z|x) \right]
>>> loss = loss_cls.eval({"x": torch.randn(1, 64)})
>>> # Use the inverse (multi-)quadric kernel
>>> loss = MMD(p, q, kernel="inv-multiquadratic").eval({"x": torch.randn(10, 64)})
Adversarial statistical distance¶
AdversarialJensenShannon¶
class pixyz.losses.AdversarialJensenShannon(p, q, discriminator, input_var=None, optimizer=<class 'torch.optim.adam.Adam'>, optimizer_params={}, inverse_g_loss=True)[source]¶
Bases: pixyz.losses.adversarial_loss.AdversarialLoss
Jensen-Shannon divergence (adversarial training).

D_{JS}\left[p(x)||q(x)\right] = \frac{1}{2} D_{KL}\left[p(x)||r(x)\right] + \frac{1}{2} D_{KL}\left[q(x)||r(x)\right],

where r(x) = \frac{p(x) + q(x)}{2}.
This class acts as a metric that evaluates a given distribution (generator). If you want to learn this evaluation metric itself, i.e., the discriminator (critic), use the train method.
Examples
>>> import torch
>>> from torch import nn
>>> from pixyz.distributions import Deterministic, DataDistribution, Normal
>>> from pixyz.losses import AdversarialJensenShannon
>>> # Generator
>>> class Generator(Deterministic):
...     def __init__(self):
...         super(Generator, self).__init__(cond_var=["z"], var=["x"], name="p")
...         self.model = nn.Linear(32, 64)
...     def forward(self, z):
...         return {"x": self.model(z)}
>>> p_g = Generator()
>>> prior = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.),
...                var=["z"], features_shape=[32], name="p_{prior}")
>>> p = (p_g*prior).marginalize_var("z")
>>> print(p)
Distribution:
  p(x) = \int p(x|z)p_{prior}(z)dz
Network architecture:
  Normal(
    name=p_{prior}, distribution_name=Normal,
    var=['z'], cond_var=[], input_var=[], features_shape=torch.Size([32])
    (loc): torch.Size([1, 32])
    (scale): torch.Size([1, 32])
  )
  Generator(
    name=p, distribution_name=Deterministic,
    var=['x'], cond_var=['z'], input_var=['z'], features_shape=torch.Size([])
    (model): Linear(in_features=32, out_features=64, bias=True)
  )
>>> # Data distribution (dummy distribution)
>>> p_data = DataDistribution(["x"])
>>> print(p_data)
Distribution:
  p_{data}(x)
Network architecture:
  DataDistribution(
    name=p_{data}, distribution_name=Data distribution,
    var=['x'], cond_var=[], input_var=['x'], features_shape=torch.Size([])
  )
>>> # Discriminator (critic)
>>> class Discriminator(Deterministic):
...     def __init__(self):
...         super(Discriminator, self).__init__(cond_var=["x"], var=["t"], name="d")
...         self.model = nn.Linear(64, 1)
...     def forward(self, x):
...         return {"t": torch.sigmoid(self.model(x))}
>>> d = Discriminator()
>>> print(d)
Distribution:
  d(t|x)
Network architecture:
  Discriminator(
    name=d, distribution_name=Deterministic,
    var=['t'], cond_var=['x'], input_var=['x'], features_shape=torch.Size([])
    (model): Linear(in_features=64, out_features=1, bias=True)
  )
>>>
>>> # Set the loss class
>>> loss_cls = AdversarialJensenShannon(p, p_data, discriminator=d)
>>> print(loss_cls)
mean(D_{JS}^{Adv} \left[p(x)||p_{data}(x) \right])
>>>
>>> sample_x = torch.randn(2, 64)  # Pseudo data
>>> loss = loss_cls.eval({"x": sample_x})
>>> print(loss)  # doctest: +SKIP
tensor(1.3723, grad_fn=<AddBackward0>)
>>> # For evaluating a discriminator loss, set the `discriminator` option to True.
>>> loss_d = loss_cls.eval({"x": sample_x}, discriminator=True)
>>> print(loss_d)  # doctest: +SKIP
tensor(1.4990, grad_fn=<AddBackward0>)
>>> # When training the evaluation metric (discriminator), use the train method.
>>> train_loss = loss_cls.train({"x": sample_x})
References
[Goodfellow+ 2014] Generative Adversarial Networks
d_loss(y_p, y_q, batch_n)[source]¶
Evaluate a discriminator loss given outputs of the discriminator.
Parameters:
- y_p (torch.Tensor) – Output of discriminator given sample from p.
- y_q (torch.Tensor) – Output of discriminator given sample from q.
- batch_n (int) – Batch size of inputs.
Return type: torch.Tensor
g_loss(y_p, y_q, batch_n)[source]¶
Evaluate a generator loss given outputs of the discriminator.
Parameters:
- y_p (torch.Tensor) – Output of discriminator given sample from p.
- y_q (torch.Tensor) – Output of discriminator given sample from q.
- batch_n (int) – Batch size of inputs.
Return type: torch.Tensor
AdversarialKullbackLeibler¶
class pixyz.losses.AdversarialKullbackLeibler(p, q, discriminator, **kwargs)[source]¶
Bases: pixyz.losses.adversarial_loss.AdversarialLoss
Kullback-Leibler divergence (adversarial training).

D_{KL}\left[p(x)||q(x)\right] = \mathbb{E}_{p(x)}\left[\log \frac{p(x)}{q(x)}\right] \approx \mathbb{E}_{p(x)}\left[\log \frac{d^*(x)}{1 - d^*(x)}\right],

where d^*(x) = \frac{p(x)}{p(x) + q(x)} is the optimal discriminator.
Note that this divergence is minimized to make p close to q.
Examples
>>> import torch
>>> from torch import nn
>>> from pixyz.distributions import Deterministic, DataDistribution, Normal
>>> from pixyz.losses import AdversarialKullbackLeibler
>>> # Generator
>>> class Generator(Deterministic):
...     def __init__(self):
...         super(Generator, self).__init__(cond_var=["z"], var=["x"], name="p")
...         self.model = nn.Linear(32, 64)
...     def forward(self, z):
...         return {"x": self.model(z)}
>>> p_g = Generator()
>>> prior = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.),
...                var=["z"], features_shape=[32], name="p_{prior}")
>>> p = (p_g*prior).marginalize_var("z")
>>> print(p)
Distribution:
  p(x) = \int p(x|z)p_{prior}(z)dz
Network architecture:
  Normal(
    name=p_{prior}, distribution_name=Normal,
    var=['z'], cond_var=[], input_var=[], features_shape=torch.Size([32])
    (loc): torch.Size([1, 32])
    (scale): torch.Size([1, 32])
  )
  Generator(
    name=p, distribution_name=Deterministic,
    var=['x'], cond_var=['z'], input_var=['z'], features_shape=torch.Size([])
    (model): Linear(in_features=32, out_features=64, bias=True)
  )
>>> # Data distribution (dummy distribution)
>>> p_data = DataDistribution(["x"])
>>> print(p_data)
Distribution:
  p_{data}(x)
Network architecture:
  DataDistribution(
    name=p_{data}, distribution_name=Data distribution,
    var=['x'], cond_var=[], input_var=['x'], features_shape=torch.Size([])
  )
>>> # Discriminator (critic)
>>> class Discriminator(Deterministic):
...     def __init__(self):
...         super(Discriminator, self).__init__(cond_var=["x"], var=["t"], name="d")
...         self.model = nn.Linear(64, 1)
...     def forward(self, x):
...         return {"t": torch.sigmoid(self.model(x))}
>>> d = Discriminator()
>>> print(d)
Distribution:
  d(t|x)
Network architecture:
  Discriminator(
    name=d, distribution_name=Deterministic,
    var=['t'], cond_var=['x'], input_var=['x'], features_shape=torch.Size([])
    (model): Linear(in_features=64, out_features=1, bias=True)
  )
>>>
>>> # Set the loss class
>>> loss_cls = AdversarialKullbackLeibler(p, p_data, discriminator=d)
>>> print(loss_cls)
mean(D_{KL}^{Adv} \left[p(x)||p_{data}(x) \right])
>>>
>>> sample_x = torch.randn(2, 64)  # Pseudo data
>>> loss = loss_cls.eval({"x": sample_x})
>>> # The evaluation value might be negative if the discriminator training is incomplete.
>>> print(loss)  # doctest: +SKIP
tensor(-0.8377, grad_fn=<AddBackward0>)
>>> # For evaluating a discriminator loss, set the `discriminator` option to True.
>>> loss_d = loss_cls.eval({"x": sample_x}, discriminator=True)
>>> print(loss_d)  # doctest: +SKIP
tensor(1.9321, grad_fn=<AddBackward0>)
>>> # When training the evaluation metric (discriminator), use the train method.
>>> train_loss = loss_cls.train({"x": sample_x})
References
[Kim+ 2018] Disentangling by Factorising
g_loss(y_p, batch_n)[source]¶
Evaluate a generator loss given an output of the discriminator.
Parameters:
- y_p (torch.Tensor) – Output of discriminator given sample from p.
- batch_n (int) – Batch size of inputs.
Return type: torch.Tensor
d_loss(y_p, y_q, batch_n)[source]¶
Evaluate a discriminator loss given outputs of the discriminator.
Parameters:
- y_p (torch.Tensor) – Output of discriminator given sample from p.
- y_q (torch.Tensor) – Output of discriminator given sample from q.
- batch_n (int) – Batch size of inputs.
Return type: torch.Tensor
AdversarialWassersteinDistance¶
class pixyz.losses.AdversarialWassersteinDistance(p, q, discriminator, clip_value=0.01, **kwargs)[source]¶
Bases: pixyz.losses.adversarial_loss.AdversarialJensenShannon
Wasserstein distance (adversarial training).

W(p, q) = \sup_{\|d\|_{L} \leq 1} \mathbb{E}_{p(x)}\left[d(x)\right] - \mathbb{E}_{q(x)}\left[d(x)\right]
Examples
>>> import torch
>>> from torch import nn
>>> from pixyz.distributions import Deterministic, DataDistribution, Normal
>>> from pixyz.losses import AdversarialWassersteinDistance
>>> # Generator
>>> class Generator(Deterministic):
...     def __init__(self):
...         super(Generator, self).__init__(cond_var=["z"], var=["x"], name="p")
...         self.model = nn.Linear(32, 64)
...     def forward(self, z):
...         return {"x": self.model(z)}
>>> p_g = Generator()
>>> prior = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.),
...                var=["z"], features_shape=[32], name="p_{prior}")
>>> p = (p_g*prior).marginalize_var("z")
>>> print(p)
Distribution:
  p(x) = \int p(x|z)p_{prior}(z)dz
Network architecture:
  Normal(
    name=p_{prior}, distribution_name=Normal,
    var=['z'], cond_var=[], input_var=[], features_shape=torch.Size([32])
    (loc): torch.Size([1, 32])
    (scale): torch.Size([1, 32])
  )
  Generator(
    name=p, distribution_name=Deterministic,
    var=['x'], cond_var=['z'], input_var=['z'], features_shape=torch.Size([])
    (model): Linear(in_features=32, out_features=64, bias=True)
  )
>>> # Data distribution (dummy distribution)
>>> p_data = DataDistribution(["x"])
>>> print(p_data)
Distribution:
  p_{data}(x)
Network architecture:
  DataDistribution(
    name=p_{data}, distribution_name=Data distribution,
    var=['x'], cond_var=[], input_var=['x'], features_shape=torch.Size([])
  )
>>> # Discriminator (critic)
>>> class Discriminator(Deterministic):
...     def __init__(self):
...         super(Discriminator, self).__init__(cond_var=["x"], var=["t"], name="d")
...         self.model = nn.Linear(64, 1)
...     def forward(self, x):
...         return {"t": self.model(x)}
>>> d = Discriminator()
>>> print(d)
Distribution:
  d(t|x)
Network architecture:
  Discriminator(
    name=d, distribution_name=Deterministic,
    var=['t'], cond_var=['x'], input_var=['x'], features_shape=torch.Size([])
    (model): Linear(in_features=64, out_features=1, bias=True)
  )
>>>
>>> # Set the loss class
>>> loss_cls = AdversarialWassersteinDistance(p, p_data, discriminator=d)
>>> print(loss_cls)
mean(W^{Adv} \left(p(x), p_{data}(x) \right))
>>>
>>> sample_x = torch.randn(2, 64)  # Pseudo data
>>> loss = loss_cls.eval({"x": sample_x})
>>> print(loss)  # doctest: +SKIP
tensor(-0.0060, grad_fn=<SubBackward0>)
>>> # For evaluating a discriminator loss, set the `discriminator` option to True.
>>> loss_d = loss_cls.eval({"x": sample_x}, discriminator=True)
>>> print(loss_d)  # doctest: +SKIP
tensor(-0.3802, grad_fn=<NegBackward>)
>>> # When training the evaluation metric (discriminator), use the train method.
>>> train_loss = loss_cls.train({"x": sample_x})
References
[Arjovsky+ 2017] Wasserstein GAN
d_loss(y_p, y_q, *args, **kwargs)[source]¶
Evaluate a discriminator loss given outputs of the discriminator.
Parameters:
- y_p (torch.Tensor) – Output of discriminator given sample from p.
- y_q (torch.Tensor) – Output of discriminator given sample from q.
- batch_n (int) – Batch size of inputs.
Return type: torch.Tensor
g_loss(y_p, y_q, *args, **kwargs)[source]¶
Evaluate a generator loss given outputs of the discriminator.
Parameters:
- y_p (torch.Tensor) – Output of discriminator given sample from p.
- y_q (torch.Tensor) – Output of discriminator given sample from q.
- batch_n (int) – Batch size of inputs.
Return type: torch.Tensor
Loss for sequential distributions¶
IterativeLoss¶
class pixyz.losses.IterativeLoss(step_loss, max_iter=None, input_var=None, series_var=None, update_value={}, slice_step=None, timestep_var=['t'])[source]¶
Bases: pixyz.losses.losses.Loss
Iterative loss.
This class allows implementing an arbitrary model which requires iteration.

\mathcal{L} = \sum_{t=1}^{t_{max}} \mathcal{L}_{step}(x_t, h_t),

where x_t is the slice of the series variable at time step t.
Examples
>>> import torch
>>> from torch.nn import functional as F
>>> from pixyz.distributions import Normal, Bernoulli, Deterministic
>>> from pixyz.losses import IterativeLoss
>>>
>>> # Set distributions
>>> x_dim = 128
>>> z_dim = 64
>>> h_dim = 32
>>>
>>> # p(x|z,h_{prev})
>>> class Decoder(Bernoulli):
...     def __init__(self):
...         super().__init__(cond_var=["z", "h_prev"], var=["x"], name="p")
...         self.fc = torch.nn.Linear(z_dim + h_dim, x_dim)
...     def forward(self, z, h_prev):
...         return {"probs": torch.sigmoid(self.fc(torch.cat((z, h_prev), dim=-1)))}
...
>>> # q(z|x,h_{prev})
>>> class Encoder(Normal):
...     def __init__(self):
...         super().__init__(cond_var=["x", "h_prev"], var=["z"], name="q")
...         self.fc_loc = torch.nn.Linear(x_dim + h_dim, z_dim)
...         self.fc_scale = torch.nn.Linear(x_dim + h_dim, z_dim)
...     def forward(self, x, h_prev):
...         xh = torch.cat((x, h_prev), dim=-1)
...         return {"loc": self.fc_loc(xh), "scale": F.softplus(self.fc_scale(xh))}
...
>>> # f(h|x,z,h_{prev}) (update h)
>>> class Recurrence(Deterministic):
...     def __init__(self):
...         super().__init__(cond_var=["x", "z", "h_prev"], var=["h"], name="f")
...         self.rnncell = torch.nn.GRUCell(x_dim + z_dim, h_dim)
...     def forward(self, x, z, h_prev):
...         return {"h": self.rnncell(torch.cat((z, x), dim=-1), h_prev)}
>>>
>>> p = Decoder()
>>> q = Encoder()
>>> f = Recurrence()
>>>
>>> # Set the loss class
>>> step_loss_cls = p.log_prob().expectation(q * f).mean()
>>> print(step_loss_cls)
mean \left(\mathbb{E}_{p(h,z|x,h_{prev})} \left[\log p(x|z,h_{prev}) \right] \right)
>>> loss_cls = IterativeLoss(step_loss=step_loss_cls,
...                          series_var=["x"], update_value={"h": "h_prev"})
>>> print(loss_cls)
\sum_{t=1}^{t_{max}} mean \left(\mathbb{E}_{p(h,z|x,h_{prev})} \left[\log p(x|z,h_{prev}) \right] \right)
>>>
>>> # Evaluate
>>> x_sample = torch.randn(30, 2, 128)  # (timestep_size, batch_size, feature_size)
>>> h_init = torch.zeros(2, 32)  # (batch_size, h_dim)
>>> loss = loss_cls.eval({"x": x_sample, "h_prev": h_init})
>>> print(loss)  # doctest: +SKIP
tensor(-2826.0906, grad_fn=<AddBackward0>)
Loss for special purpose¶
Parameter¶
class pixyz.losses.losses.Parameter(input_var)[source]¶
Bases: pixyz.losses.losses.Loss
This class defines a single variable as a loss class.
It can be used, for example, as a coefficient parameter of a loss class.
Examples
>>> from pixyz.losses.losses import Parameter
>>> loss_cls = Parameter("x")
>>> print(loss_cls)
x
>>> loss = loss_cls.eval({"x": 2})
>>> print(loss)
2
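As a sketch of the coefficient use case, a Parameter can scale another loss and be fed its value at evaluation time. The snippet below assumes the reconst, kl, and data objects built in the Loss example at the top of this page; the name "beta" is illustrative.

>>> # A minimal sketch, assuming `reconst`, `kl`, and `data` from the Loss example above
>>> beta = Parameter("beta")
>>> loss_cls = (reconst + beta * kl).mean()          # doctest: +SKIP
>>> loss = loss_cls.eval({"x": data, "beta": 4.0})   # doctest: +SKIP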
SetLoss¶
class pixyz.losses.losses.SetLoss(loss)[source]¶
Bases: pixyz.losses.losses.Loss
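SetLoss wraps an already-composed loss expression so that it behaves as a single named loss; the classes above whose base is SetLoss (CrossEntropy, Entropy, StochasticReconstructionLoss, ELBO) follow this pattern. Below is a minimal sketch of a custom subclass, under the assumption that the composed expression is simply passed to SetLoss.__init__; the class name and composition are illustrative, not part of the library.

>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses.losses import SetLoss
>>> class NegativeReconstruction(SetLoss):  # hypothetical subclass
...     def __init__(self, encoder, decoder):
...         loss = -decoder.log_prob().expectation(encoder)  # compose the loss expression
...         super().__init__(loss)
>>> q = Normal(loc="x", scale=torch.tensor(1.), var=["z"], cond_var=["x"], features_shape=[10])  # q(z|x)
>>> p = Normal(loc="z", scale=torch.tensor(1.), var=["x"], cond_var=["z"], features_shape=[10])  # p(x|z)
>>> loss = NegativeReconstruction(q, p).eval({"x": torch.randn(2, 10)})  # doctest: +SKIP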
Operators¶
LossOperator¶
class pixyz.losses.losses.LossOperator(loss1, loss2)[source]¶
Bases: pixyz.losses.losses.Loss
LossSelfOperator¶
AddLoss¶
class pixyz.losses.losses.AddLoss(loss1, loss2)[source]¶
Bases: pixyz.losses.losses.LossOperator
Apply the add operation to the two losses.
Examples
>>> from pixyz.losses.losses import ValueLoss, Parameter
>>> loss_cls_1 = ValueLoss(2)
>>> loss_cls_2 = Parameter("x")
>>> loss_cls = loss_cls_1 + loss_cls_2  # equals to AddLoss(loss_cls_1, loss_cls_2)
>>> print(loss_cls)
x + 2
>>> loss = loss_cls.eval({"x": 3})
>>> print(loss)
5
SubLoss¶
class pixyz.losses.losses.SubLoss(loss1, loss2)[source]¶
Bases: pixyz.losses.losses.LossOperator
Apply the sub operation to the two losses.
Examples
>>> from pixyz.losses.losses import ValueLoss, Parameter
>>> loss_cls_1 = ValueLoss(2)
>>> loss_cls_2 = Parameter("x")
>>> loss_cls = loss_cls_1 - loss_cls_2  # equals to SubLoss(loss_cls_1, loss_cls_2)
>>> print(loss_cls)
2 - x
>>> loss = loss_cls.eval({"x": 4})
>>> print(loss)
-2
>>> loss_cls = loss_cls_2 - loss_cls_1  # equals to SubLoss(loss_cls_2, loss_cls_1)
>>> print(loss_cls)
x - 2
>>> loss = loss_cls.eval({"x": 4})
>>> print(loss)
2
MulLoss¶
class pixyz.losses.losses.MulLoss(loss1, loss2)[source]¶
Bases: pixyz.losses.losses.LossOperator
Apply the mul operation to the two losses.
Examples
>>> from pixyz.losses.losses import ValueLoss, Parameter
>>> loss_cls_1 = ValueLoss(2)
>>> loss_cls_2 = Parameter("x")
>>> loss_cls = loss_cls_1 * loss_cls_2  # equals to MulLoss(loss_cls_1, loss_cls_2)
>>> print(loss_cls)
2 x
>>> loss = loss_cls.eval({"x": 4})
>>> print(loss)
8
DivLoss¶
class pixyz.losses.losses.DivLoss(loss1, loss2)[source]¶
Bases: pixyz.losses.losses.LossOperator
Apply the div operation to the two losses.
Examples
>>> from pixyz.losses.losses import ValueLoss, Parameter
>>> loss_cls_1 = ValueLoss(2)
>>> loss_cls_2 = Parameter("x")
>>> loss_cls = loss_cls_1 / loss_cls_2  # equals to DivLoss(loss_cls_1, loss_cls_2)
>>> print(loss_cls)
\frac{2}{x}
>>> loss = loss_cls.eval({"x": 4})
>>> print(loss)
0.5
>>> loss_cls = loss_cls_2 / loss_cls_1  # equals to DivLoss(loss_cls_2, loss_cls_1)
>>> print(loss_cls)
\frac{x}{2}
>>> loss = loss_cls.eval({"x": 4})
>>> print(loss)
2.0
NegLoss¶
class pixyz.losses.losses.NegLoss(loss1)[source]¶
Bases: pixyz.losses.losses.LossSelfOperator
Apply the neg operation to the loss.
Examples
>>> from pixyz.losses.losses import Parameter
>>> loss_cls_1 = Parameter("x")
>>> loss_cls = -loss_cls_1  # equals to NegLoss(loss_cls_1)
>>> print(loss_cls)
- x
>>> loss = loss_cls.eval({"x": 4})
>>> print(loss)
-4
AbsLoss¶
class pixyz.losses.losses.AbsLoss(loss1)[source]¶
Bases: pixyz.losses.losses.LossSelfOperator
Apply the abs operation to a loss.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import LogProb
>>> p = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["x"],
...            features_shape=[10])
>>> loss_cls = LogProb(p).abs()  # equals to AbsLoss(LogProb(p))
>>> print(loss_cls)
|\log p(x)|
>>> sample_x = torch.randn(2, 10)  # Pseudo data
>>> loss = loss_cls.eval({"x": sample_x})
>>> print(loss)  # doctest: +SKIP
tensor([12.9894, 15.5280])
BatchMean¶
class pixyz.losses.losses.BatchMean(loss1)[source]¶
Bases: pixyz.losses.losses.LossSelfOperator
Average a loss class over given batch data.

\mathbb{E}_{p_{data}(x)}\left[\mathcal{L}(x)\right] \approx \frac{1}{N}\sum_{i=1}^N \mathcal{L}(x_i),

where x_i \sim p_{data}(x) and \mathcal{L} is a loss function.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import LogProb
>>> p = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["x"],
...            features_shape=[10])
>>> loss_cls = LogProb(p).mean()  # equals to BatchMean(LogProb(p))
>>> print(loss_cls)
mean \left(\log p(x) \right)
>>> sample_x = torch.randn(2, 10)  # Pseudo data
>>> loss = loss_cls.eval({"x": sample_x})
>>> print(loss)  # doctest: +SKIP
tensor(-14.5038)
BatchSum¶
class pixyz.losses.losses.BatchSum(loss1)[source]¶
Bases: pixyz.losses.losses.LossSelfOperator
Sum a loss class over given batch data.

\sum_{i=1}^N \mathcal{L}(x_i),

where x_i \sim p_{data}(x) and \mathcal{L} is a loss function.
Examples
>>> import torch
>>> from pixyz.distributions import Normal
>>> from pixyz.losses import LogProb
>>> p = Normal(loc=torch.tensor(0.), scale=torch.tensor(1.), var=["x"],
...            features_shape=[10])
>>> loss_cls = LogProb(p).sum()  # equals to BatchSum(LogProb(p))
>>> print(loss_cls)
sum \left(\log p(x) \right)
>>> sample_x = torch.randn(2, 10)  # Pseudo data
>>> loss = loss_cls.eval({"x": sample_x})
>>> print(loss)  # doctest: +SKIP
tensor(-31.9434)