Avoid calling PulseGen::Step() on idle axes by checking for a non-zero
queue size (which is more efficient to compute).
Increase stepTimerQuantum to 128us to ensure acceleration can be
computed in realtime for 3 axes at the same time.
Fix the logic of the static assertion, which was flipped: we need to
create slices larger than the maximal step frequency in order to ensure
no axis is starved while moving.