A research team from Stanford University and Google has announced WALT, a diffusion model that generates photorealistic videos from text. The team has also released many sample videos generated with WALT.