Posts by Tags

Multimodal Generation

Summaries and Thoughts on Multimodal Alignment and Generation

5 minute read

Published:

This multimodal generation post covers models such as LAFITE, CAFE, ARTIST, and LLaVA-Reward, which aim to improve text-to-image generation through stronger generalization, better multimodal alignment, and enhanced text rendering.

Multimodal Understanding