value iteration
https://math.stackexchange.com/questions/2639577/why-is-the-gradient-of-this-expectation-intractable
turn a high-dimensional integral into an expectation, then estimate it by sampling???
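A minimal sketch of that idea, assuming the note refers to the score-function (REINFORCE) trick: the gradient of E_{x~p_theta}[f(x)] is rewritten as E[f(x) * d/dtheta log p_theta(x)] and estimated from samples instead of integrating. The Gaussian p_theta = N(theta, 1), the test function f(x) = x^2, and the name `score_function_gradient` are illustrative assumptions, not taken from the linked question.

```python
import numpy as np

def score_function_gradient(f, theta, n_samples, rng):
    """Monte Carlo estimate of d/dtheta E_{x ~ N(theta, 1)}[f(x)].

    Uses the identity grad E[f(x)] = E[f(x) * grad log p_theta(x)].
    For a unit-variance Gaussian, grad_theta log p_theta(x) = x - theta.
    """
    x = rng.normal(loc=theta, scale=1.0, size=n_samples)  # sample x ~ p_theta
    score = x - theta                                     # d/dtheta log p_theta(x)
    return np.mean(f(x) * score)                          # sample average replaces the integral

rng = np.random.default_rng(0)
theta = 1.5
estimate = score_function_gradient(lambda x: x ** 2, theta, 200_000, rng)

# Closed form for comparison: E[x^2] = theta^2 + 1, so the exact gradient is 2 * theta.
exact = 2.0 * theta
print(estimate, exact)
```

The estimate is noisy (this estimator has high variance), which is why a baseline is usually subtracted from f(x) in practice.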
computational efficiency -> low resolution to high resolution
this hard attention -> a lot of applications!!! -> improves efficiency
but still needs an RNN -> may be slow
efficiency depends on the case
high-resolution input -> this method is fast
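A rough sketch of why high-resolution inputs favor this approach, assuming "this method" means glimpse-based hard attention that looks at only a small fixed-size crop per step. The helper `extract_glimpse` and the 8x8 glimpse size are hypothetical choices for illustration; the point is only that the number of pixels processed per step does not grow with image size.

```python
import numpy as np

def extract_glimpse(image, center, size):
    """Return a size x size crop of a 2-D image near `center` (row, col).

    The window is shifted inward at the borders so it always fits.
    Assumes the image is at least size x size.
    """
    height, width = image.shape
    half = size // 2
    # Clamp the top-left corner so the window stays inside the image.
    r0 = min(max(center[0] - half, 0), height - size)
    c0 = min(max(center[1] - half, 0), width - size)
    return image[r0:r0 + size, c0:c0 + size]  # slicing returns a view, no full-image copy

small = np.zeros((64, 64))
large = np.zeros((1024, 1024))

g_small = extract_glimpse(small, (0, 0), 8)
g_large = extract_glimpse(large, (1023, 1023), 8)

# Same number of pixels per glimpse regardless of input resolution.
print(g_small.size, g_large.size)
```

The per-step cost is then set by the glimpse size and the number of steps, not by the image resolution, which is consistent with "efficiency depends on the case".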
Q-learning may be harder to tune