RESEARCH FEED

AI Research Paper Feed

Curated index of breakthrough publications in artificial intelligence. Access synthesized TL;DRs, key technical abstracts, and direct citations.

alignment | Feb 4, 2026
Reconciling safety and utility in reinforcement learning alignment

By Sarah Meade, Alex Johnson, Liam Patel

Proposes an optimization framework to mitigate over-refusal in aligned LLMs. Balances safety bounds against instruction utility.

llm | Jan 22, 2025
DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning

By DeepSeek-AI, Daya Guo, Dejian Yang et al.

Uses large-scale reinforcement learning to incentivize reasoning in LLMs. Reports top-tier results on coding and math benchmarks and releases open weights.

multimodal | Jan 3, 2024
MAMMOTH: Massive multimodal helper for multi-discipline reasoning

By Robert Kim, Meera Nair, Sofia Rodriguez

Presents a multimodal assistant trained on complex scientific datasets. Shows significant gains in graphical reasoning and visual instruction following.

agents | Oct 27, 2023
AgentTuning: Enabling generalized agent capabilities in LLMs

By Aohan Zeng, Mingdao Liu, Rui Lu et al.

Examines instruction tuning aimed at agent tasks. Builds AgentInstruct, a dataset of agent interaction trajectories, and mixes it with general-domain instructions to teach LLMs planning and tool-use behaviors without degrading general ability.

alignment | May 29, 2023
Direct preference optimization: Your language model is secretly a reward model

By Rafael Rafailov, Archit Sharma, Eric Mitchell et al.

Proposes Direct Preference Optimization (DPO) as an alternative to PPO-based RLHF. Simplifies alignment by optimizing the policy directly on human preference pairs, with no separately trained reward model and no reinforcement learning loop.
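
For readers who want the objective in concrete form, here is a minimal sketch of the per-pair DPO loss in plain Python. It assumes the summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model have already been computed. The function name, argument names, and the default beta of 0.1 are illustrative choices, not taken from the paper's code.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_logratio - rejected_logratio))).

    All inputs are sequence log-probabilities (floats). The names and the
    default beta are assumptions made for this sketch.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy matches the reference, the margin is zero and the loss is log 2. The loss falls as the policy raises the chosen response's likelihood relative to the rejected one.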

multimodal | Apr 17, 2023
Visual instruction tuning

By Haotian Liu, Chunyuan Li, Qingyang Wu et al.

Pioneers multimodal instruction tuning by connecting a CLIP vision encoder to a LLaMA-based language model. Introduces LLaVA, laying the groundwork for later open-source visual assistants.