Wrapping a Keras optimizer to implement gradient accumulation
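
A minimal, framework-free sketch of the idea, assuming the wrapped optimizer exposes an `apply_gradients(grads_and_vars)` method as Keras optimizers do. The `SGD` stand-in and the `GradientAccumulator` class are hypothetical names used for illustration and are not part of Keras. Variables are plain Python lists rather than tensors. The wrapper sums gradients over `accum_steps` calls, then hands the averaged gradient to the inner optimizer once.

```python
class SGD:
    """Hypothetical stand-in for a Keras optimizer (NOT the Keras class).

    Variables are mutable lists of floats, updated in place.
    """

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate

    def apply_gradients(self, grads_and_vars):
        for grad, var in grads_and_vars:
            for i, g in enumerate(grad):
                var[i] -= self.learning_rate * g


class GradientAccumulator:
    """Hypothetical wrapper: buffers gradients and applies their mean
    to the wrapped optimizer once every `accum_steps` calls."""

    def __init__(self, optimizer, accum_steps):
        if accum_steps < 1:
            raise ValueError("accum_steps must be >= 1")
        self.optimizer = optimizer
        self.accum_steps = accum_steps
        self._step = 0      # number of calls seen so far
        self._sums = None   # running gradient sums, one buffer per variable

    def apply_gradients(self, grads_and_vars):
        """Return True if the wrapped optimizer was stepped on this call."""
        grads_and_vars = list(grads_and_vars)
        if self._sums is None:
            self._sums = [[0.0] * len(grad) for grad, _ in grads_and_vars]

        # Add this micro-batch's gradients to the running sums.
        for acc, (grad, _) in zip(self._sums, grads_and_vars):
            for i, g in enumerate(grad):
                acc[i] += g

        self._step += 1
        if self._step % self.accum_steps != 0:
            return False  # still accumulating; variables are untouched

        # Average the sums and delegate one real update to the inner optimizer.
        averaged = [
            ([s / self.accum_steps for s in acc], var)
            for acc, (_, var) in zip(self._sums, grads_and_vars)
        ]
        self.optimizer.apply_gradients(averaged)
        self._sums = None  # reset buffers for the next accumulation window
        return True


# Usage: two micro-batches are combined into a single update.
w = [1.0, 2.0]
opt = GradientAccumulator(SGD(learning_rate=0.5), accum_steps=2)
opt.apply_gradients([([1.0, 0.0], w)])  # buffered, w unchanged
opt.apply_gradients([([3.0, 2.0], w)])  # applies the mean gradient [2.0, 1.0]
```

With a real Keras optimizer the buffers would be variables or tensors rather than lists, and a Python-side step counter like the one here would likely not behave as intended under graph execution; the sketch leaves that aside.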