Abstract: Despite the widespread adoption of vision sensors in edge applications such as surveillance, video transmission consumes substantial spectrum resources. Semantic communication (SC) offers a ...
Abstract: Video-based commonsense captioning aims to generate captions for video content while providing multiple commonsense descriptions of the underlying event. Existing methods utilize video features to ...