Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling

(1)