It's pretty slow right now; I suspect value iteration is the bottleneck. It would be good to do some profiling to pin that down. Moving things to the GPU might speed things up, although I suspect the environments we're using so far are small enough that it's not worth the overhead.
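
As a starting point for the profiling, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. The `value_iteration`, `random_mdp`, and `profile_call` functions are hypothetical stand-ins written for illustration, not the project's actual code; the idea is just to wrap the real entry point in `profile_call` and see where the cumulative time goes.

```python
import cProfile
import io
import pstats
import random


def value_iteration(P, R, gamma=0.9, tol=1e-6, max_iters=1000):
    """Toy tabular value iteration (hypothetical stand-in for the real routine).

    P[s][a] is a list of next-state probabilities; R[s][a] is the reward.
    """
    n_states = len(P)
    V = [0.0] * n_states
    for _ in range(max_iters):
        # Bellman optimality backup for every state.
        V_new = [
            max(
                R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
                for a in range(len(P[s]))
            )
            for s in range(n_states)
        ]
        # Stop once the largest change in any state's value is below tol.
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new
        V = V_new
    return V


def random_mdp(n_states, n_actions, seed=0):
    """Build a small random MDP (assumed shape; only for exercising the profiler)."""
    rng = random.Random(seed)
    P, R = [], []
    for _ in range(n_states):
        P_s, R_s = [], []
        for _ in range(n_actions):
            weights = [rng.random() for _ in range(n_states)]
            total = sum(weights)
            P_s.append([w / total for w in weights])  # normalize to a distribution
            R_s.append(rng.random())
        P.append(P_s)
        R.append(R_s)
    return P, R


def profile_call(fn, *args, top=10, **kwargs):
    """Run fn under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = fn(*args, **kwargs)
    finally:
        profiler.disable()
    buf = io.StringIO()
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()


if __name__ == "__main__":
    P, R = random_mdp(n_states=20, n_actions=4)
    V, report = profile_call(value_iteration, P, R)
    print(report)
```

If value iteration really is the bottleneck, it should dominate the cumulative-time column of the report. That would be worth confirming before considering a GPU port, since for small environments the transfer and launch overhead may well outweigh any gain.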