Sequential modelling with self-attention has achieved cutting-edge performance in natural language processing. With advantages in model flexibility, computational complexity and interpretability, self-attention is gradually becoming a key component in event sequence models. However, like most others