multimodal
Published: January 3, 2024
MAMMOTH: Massive multimodal helper for multi-discipline reasoning
By Robert Kim, Meera Nair, Sofia Rodriguez
Research TL;DR
"Presents a multimodal assistant trained on complex scientific datasets. Shows significant gains in graphical reasoning and visual instruction following."
Abstract
We present MAMMOTH, a multimodal architecture trained on extensive mathematical, scientific, and document-parsing instruction sets. MAMMOTH sets new benchmark results in multi-turn multimodal reasoning and chart understanding.