AI Research Paper Feed
Curated index of breakthrough publications in artificial intelligence. Access synthesized TL;DRs, key technical abstracts, and direct citations.
By Sarah Meade, Alex Johnson, Liam Patel
Proposes an optimization framework to mitigate over-refusal in aligned LLMs. Balances safety bounds against instruction utility.
By DeepSeek-AI, Daya Guo, Dejian Yang et al.
Examines specialized reinforcement learning to incentivize reasoning processes in LLMs. Reports top-tier results on coding and math benchmarks with open-weight models.
By Robert Kim, Meera Nair, Sofia Rodriguez
Presents a multimodal assistant trained on complex scientific datasets. Shows significant gains in graphical reasoning and visual instruction following.
By Aohan Zeng, Ming Ji, Luoxuan Weng et al.
Examines instruction-tuning methodologies optimized for agentic operations. Builds a customized instruction set to teach LLMs specialized planning and tool-use behaviors.
By Rafael Rafailov, Archit Sharma, Eric Mitchell et al.
Proposes Direct Preference Optimization (DPO) as an alternative to PPO-based RLHF. Simplifies alignment by optimizing the policy directly from human preference data.
By Haotian Liu, Chunyuan Li, Qingyang Wu et al.
Pioneers multimodal instruction tuning by connecting a CLIP vision encoder to a LLaMA-based LLM. Introduces LLaVA, laying the groundwork for open-source visual assistants.