




value iteration
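
A minimal sketch of value iteration on a toy problem, to pin down what the term means here. The three-state chain, the `transitions` layout, the rewards, and `gamma=0.9` are all assumptions made up for illustration. The update is done in place, so each sweep reuses values already refreshed in that sweep.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Repeat the Bellman optimality backup until the values stop changing.

    transitions[s][a] = (next_state, reward); a state with no actions is terminal.
    This layout is a hypothetical one chosen for the example (deterministic moves).
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:
                continue  # terminal state keeps value 0
            # Best one-step return over the actions available in s
            best = max(r + gamma * values[s2] for s2, r in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


# Toy chain 0 -> 1 -> 2, where entering state 2 pays reward 1 (assumed setup)
chain = {
    0: {"left": (0, 0.0), "right": (1, 0.0)},
    1: {"left": (0, 0.0), "right": (2, 1.0)},
    2: {},
}
values = value_iteration(chain)
print(values)
```

With this chain the values settle at 1.0 for state 1 and 0.9 for state 0, i.e. the reward discounted once per extra step.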








https://math.stackexchange.com/questions/2639577/why-is-the-gradient-of-this-expectation-intractable

turn a high-dimensional integral into an expectation problem?
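
One way to read this note: an integral of the form ∫ f(x) p(x) dx is the expectation of f(x) under p, so it can be estimated by averaging f over samples from p instead of integrating over a grid. The sketch below assumes a 50-dimensional standard normal for p and f(x) = ||x||^2, both chosen only because the exact answer (the dimension `d`) is known. The helper names are hypothetical.

```python
import random


def mc_expectation(f, sample, n, seed=0):
    """Estimate E[f(x)] by averaging f over n samples drawn via sample(rng)."""
    rng = random.Random(seed)
    return sum(f(sample(rng)) for _ in range(n)) / n


d = 50  # dimension (illustrative; a grid over 50 dims would be hopeless)


def sample_gaussian(rng):
    # One draw from a d-dimensional standard normal
    return [rng.gauss(0.0, 1.0) for _ in range(d)]


def squared_norm(x):
    return sum(v * v for v in x)


estimate = mc_expectation(squared_norm, sample_gaussian, n=5000)
print(estimate)  # should land close to d, the exact value of the integral
```

The error of the sample average shrinks like 1/sqrt(n) regardless of the dimension, which is the appeal of the expectation view.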











computational efficiency -> low resolution to high resolution


this hard attention -> a lot of applications -> improves efficiency
but it still needs an RNN -> may be slow
efficiency depends on the case
high-resolution input -> this method is fast
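
A rough sketch of why hard attention can pay off on high-resolution input: the model only reads a few small crops (glimpses), so the pixels it touches do not grow with image size. The glimpse size of 8, the count of 6 glimpses, and the function names are assumptions for illustration, and the cost of the RNN itself is not modelled.

```python
def extract_glimpse(image, center, size):
    """Crop a size x size patch around center=(row, col), clamped to the image.

    image is a list of rows; assumes the image is at least size x size.
    """
    height, width = len(image), len(image[0])
    top = min(max(center[0] - size // 2, 0), height - size)
    left = min(max(center[1] - size // 2, 0), width - size)
    return [row[left:left + size] for row in image[top:top + size]]


def full_image_pixels(height, width):
    # Pixels read when the whole image is processed
    return height * width


def glimpse_pixels(num_glimpses, glimpse_size):
    # Pixels read when only a few square glimpses are processed
    return num_glimpses * glimpse_size * glimpse_size


for side in (64, 256, 1024):
    print(side, full_image_pixels(side, side), glimpse_pixels(6, 8))
```

The glimpse count stays fixed while the full-image count grows with the square of the side length, which matches the "depends on the case" caveat: for small inputs the sequential RNN steps may dominate.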


Q-learning may be harder to tune