- The score function estimator: even a single sample DOES give an unbiased (if noisy) estimate of the gradient, because the sample's reward is weighted by the score function (the gradient of the log-probability).
- The actual point of REINFORCE: we are not taking the derivative through the expectation, but through a random variable. The big kicker is that the sampled value is treated as fixed (i.e. once an RV X has been sampled, it has no dependence on the parameters of the distribution governing X). So the problem essentially reduces to maximum likelihood estimation: we just raise or lower the probability of something happening, weighted by the reward function.
- The two views of taking a derivative through a random variable.
- The two views of VAEs (from a probability view, and from a neural network view). We can also discuss all the ELBOs and all formulations of the training. We want to maximize the log pdf.
- The super authoritative guide to neural nets. This one involves all the important and nice quantities: we PRODUCE parameters at the very end, and then we do a form of maximum likelihood on these parameters, trying to maximize the probability of our ORIGINAL input. Simple!
- This HAS the effect of also regenerating our image, for all intents and purposes.
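
To make the score function estimator and the REINFORCE bullets concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption chosen for illustration: the distribution is a unit-variance Gaussian with mean `mu`, the reward is `x * x`, and the function names are hypothetical. The sampled `x` is treated as a constant; only the log-probability is differentiated, which is what makes this look like reward-weighted maximum likelihood.

```python
import random

def reward(x):
    # Hypothetical reward f(x) = x^2, so E[f(X)] = mu^2 + 1 for X ~ N(mu, 1).
    return x * x

def score(x, mu):
    # Score function: d/dmu log N(x; mu, 1) = x - mu.
    return x - mu

def score_function_gradient(mu, n_samples, rng):
    # Average of reward(x) * score(x, mu) over samples.
    # Each term alone is an unbiased but noisy estimate of d/dmu E[f(X)].
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(mu, 1.0)  # the sample is treated as fixed from here on
        total += reward(x) * score(x, mu)
    return total / n_samples

rng = random.Random(0)
mu = 1.5
estimate = score_function_gradient(mu, 200_000, rng)
exact = 2 * mu  # d/dmu (mu^2 + 1)
print(estimate, exact)
```

With `n_samples=1` the same function returns the single-sample estimate, which is unbiased but has high variance; averaging many samples only reduces the noise.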
- Additional articles and details about machine learning:
- Where does cross entropy loss come from? It is COMPLETELY just a loss function of our choosing! But it is one that kind of encodes our intuition about how the loss should be for examples.
- Want to relearn the derivation! (For cross-entropy loss and how it has a linear residual, i.e. with a softmax output, the gradient with respect to the logits is the predicted probabilities minus the one-hot target.)
- Explain the difference between continuous and discrete variables. For instance, when we do classification, are we doing discrete or continuous? (A: we have a discrete number of outputs. But each variable is itself continuous.)
- Follow-up: say we do VAEs, on MNIST. Then, say we have a latent dimension of 10 variables. Technically, these are discrete factors of variation. But they can vary continuously inside.
- But there could be multiple formulations:
- We could predict a single vector, then have Gumbel softmax, as an alternate approach to classification etc.
- https://github.com/vithursant/VAE-Gumbel-Softmax
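
As a rough illustration of the Gumbel softmax idea mentioned above, here is a minimal standard-library sketch. It is not taken from the linked repository; the function name, the example probabilities, and the temperature are all assumptions. It adds Gumbel noise to the logits and applies a temperature-scaled softmax, giving a continuous relaxation of a discrete (one-hot) sample.

```python
import math
import random

def gumbel_softmax_sample(logits, temperature, rng):
    # Perturb each logit with Gumbel(0, 1) noise, g = -log(-log(U)).
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    z = sum(exps)
    return [e / z for e in exps]

rng = random.Random(0)
probs = [0.1, 0.6, 0.3]  # hypothetical class probabilities
logits = [math.log(p) for p in probs]

sample = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)

# The argmax of a relaxed sample follows the original categorical
# distribution (the Gumbel-max trick), so class 1 should win about 60%
# of the time.
n_draws = 20_000
counts = [0, 0, 0]
for _ in range(n_draws):
    s = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)
    counts[s.index(max(s))] += 1
frequencies = [c / n_draws for c in counts]
print(sample, frequencies)
```

Lower temperatures push each sample closer to a one-hot vector, while higher temperatures make it smoother; either way each entry of the sample is continuous even though the number of classes is discrete.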