The authors present a novel approach for incorporating language information into visual feature extraction by conditioning the Batch Normalization parameters on the language input. They apply Conditional ...
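
To make the mechanism in this summary concrete, here is a minimal NumPy sketch of batch normalization whose scale and shift depend on a language embedding. It is an illustration under stated assumptions, not the authors' implementation: the function name `conditional_batch_norm`, the linear projections `w_gamma` and `w_beta`, and the choice to predict additive offsets to the learned scale and shift are all hypothetical choices made for this example.

```python
import numpy as np


def conditional_batch_norm(x, lang_emb, gamma, beta, w_gamma, w_beta, eps=1e-5):
    """Batch-normalize x, then scale and shift it as a function of lang_emb.

    Shapes (all assumed for this sketch):
      x:        (N, C, H, W) visual feature maps
      lang_emb: (N, D)       one language embedding per example
      gamma:    (C,)         base per-channel scale
      beta:     (C,)         base per-channel shift
      w_gamma:  (D, C)       hypothetical linear map: embedding -> scale offset
      w_beta:   (D, C)       hypothetical linear map: embedding -> shift offset
    """
    # Standard batch-norm statistics: one mean and variance per channel.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # Language-dependent offsets, one pair per example and channel.
    delta_gamma = lang_emb @ w_gamma  # (N, C)
    delta_beta = lang_emb @ w_beta    # (N, C)

    # Add the offsets to the base parameters and broadcast over H and W.
    scale = (gamma[None, :] + delta_gamma)[:, :, None, None]
    shift = (beta[None, :] + delta_beta)[:, :, None, None]
    return scale * x_hat + shift


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(2, 4, 8, 8))   # toy feature maps
    question = rng.normal(size=(2, 16))     # toy language embeddings
    out = conditional_batch_norm(
        feats,
        question,
        gamma=np.ones(4),
        beta=np.zeros(4),
        w_gamma=0.1 * rng.normal(size=(16, 4)),
        w_beta=0.1 * rng.normal(size=(16, 4)),
    )
    print(out.shape)
```

When both projection matrices are zero, the function reduces to ordinary batch normalization, so in this sketch the language input can only perturb the behavior of the unconditioned layer rather than replace it.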