AI RESEARCH PAPERS & ACADEMIC SOURCES
- Stress Tests REVEAL Fragile Temporal and Visual Grounding in Video-Language Models
- Advancing Digital Twin Generation Through a Novel Simulation Framework and Quantitative Benchmarking
- Selective Prior Synchronization via SYNC Loss
- MDE-VIO: Enhancing Visual-Inertial Odometry Using Learned Depth Priors
- Exploring Real-Time Super-Resolution: Benchmarking and Fine-Tuning for Streaming Content
- ArtContext: Contextualizing Artworks with Open-Access Art History Articles and Wikidata Knowledge through a LoRA-Tuned CLIP Model
- Ctrl&Shift: High-Quality Geometry-Aware Object Manipulation in Visual Generation
- A Dual-Branch Framework for Semantic Change Detection with Boundary and Temporal Awareness
- Arbitrary Ratio Feature Compression via Next Token Prediction
- What if Agents Could Imagine? Reinforcing Open-Vocabulary HOI Comprehension through Generation
- Vascular anatomy-aware self-supervised pre-training for X-ray angiogram analysis
- Supervise-assisted Multi-modality Fusion Diffusion Model for PET Restoration
- LUVE: Latent-Cascaded Ultra-High-Resolution Video Generation with Dual Frequency Experts
- Move What Matters: Parameter-Efficient Domain Adaptation via Optimal Transport Flow for Collaborative Perception
- A Large Language Model for Disaster Structural Reconnaissance Summarization
- Electrostatics-Inspired Surface Reconstruction (EISR): Recovering 3D Shapes as a Superposition of Poisson's PDE Solutions
- GR-Diffusion: 3D Gaussian Representation Meets Diffusion in Whole-Body PET Reconstruction
- EmoSpace: Fine-Grained Emotion Prototype Learning for Immersive Affective Content Generation
- Clutt3R-Seg: Sparse-view 3D Instance Segmentation for Language-grounded Grasping in Cluttered Scenes
- Egocentric Gaze Estimation via Neck-Mounted Camera
- U-Net with Hadamard Transform and DCT Latent Spaces for Next-day Wildfire Spread Prediction
- RI-Mamba: Rotation-Invariant Mamba for Robust Text-to-Shape Retrieval
- TG-Field: Geometry-Aware Radiative Gaussian Fields for Tomographic Reconstruction
- GSO-SLAM: Bidirectionally Coupled Gaussian Splatting and Direct Visual Odometry
- STVG-R1: Incentivizing Instance-Level Reasoning and Grounding in Videos via Reinforcement Learning
- Adaptive Debiasing Tsallis Entropy for Test-Time Adaptation
- Code2Worlds: Empowering Coding LLMs for 4D World Generation
- Light4D: Training-Free Extreme Viewpoint 4D Video Relighting
- Efficient Segment Anything with Depth-Aware Fusion and Limited Training Data
- JEPA-VLA: Video Predictive Embedding is Needed for VLA Models
- WorldTree: Towards 4D Dynamic Worlds from Monocular Video using Tree-Chains
- DiffPlace: Street View Generation via Place-Controllable Diffusion Model Enhancing Place Recognition
- Spatial Chain-of-Thought: Bridging Understanding and Generation Models for Spatial Reasoning Generation
- Can Local Vision-Language Models improve Activity Recognition over Vision Transformers? – Case Study on Newborn Resuscitation
- Projected Representation Conditioning for High-fidelity Novel View Synthesis
- A DMD-Based Adaptive Modulation Method for High Dynamic Range Imaging in High-Glare Environments
- GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning
- AssetFormer: Modular 3D Assets Generation with Autoregressive Transformer
- PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback
- FAIL: Flow Matching Adversarial Imitation Learning for Image Generation
- TexSpot: 3D Texture Enhancement with Spatially-uniform Point Latent Representation
- DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation
- EO-VAE: Towards A Multi-sensor Tokenizer for Earth Observation Data
- Best of Both Worlds: Multimodal Reasoning and Generation via Unified Discrete Flow Matching
- Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching
- Mitigating Error Accumulation in Continuous Navigation via Memory-Augmented Kalman Filtering
- U-DAVI: Uncertainty-Aware Diffusion-Prior-Based Amortized Variational Inference for Image Reconstruction
- Learning Perceptual Representations for Gaming NR-VQA with Multi-Task FR Signals
- UPDA: Unsupervised Progressive Domain Adaptation for No-Reference Point Cloud Quality Assessment
- Learning A Physical-aware Diffusion Model Based on Transformer for Underwater Image Enhancement
- Scale Contrastive Learning with Selective Attentions for Blind Image Quality Assessment
- Dual Frequency Branch Framework with Reconstructed Sliding Windows Attention for AI-Generated Image Detection
- From Implicit Ambiguity to Explicit Solidity: Diagnosing Interior Geometric Degradation in Neural Radiance Fields for Dense 3D Scene Understanding
- Response-Based Knowledge Distillation for Multilingual Jailbreak Prevention Unwittingly Compromises Safety
- Retrieval Heads are Dynamic
- PRIME: Policy-Reinforced Iterative Multi-agent Execution for Algorithmic Reasoning in Large Language Models
- Synthesizing the Virtual Advocate: A Multi-Persona Speech Generation Framework for Diverse Linguistic Jurisdictions in Indic Languages
- Author-in-the-Loop Response Generation and Evaluation: Integrating Author Expertise and Intent in Responses to Peer Review
- Barriers to Discrete Reasoning with Transformers: A Survey Across Depth, Exactness, and Bandwidth
- Mechanistic Interpretability for Large Language Model Alignment: Progress, Challenges, and Future Directions
- Code Mixologist: A Practitioner's Guide to Building Code-Mixed LLMs
- MetaMem: Evolving Meta-Memory for Knowledge Utilization through Self-Reflective Symbolic Optimization
- Mechanistic Evidence for Faithfulness Decay in Chain-of-Thought Reasoning
- The Automatic Verification of Image-Text Claims (AVerImaTeC) Shared Task
- SurveyLens: A Research Discipline-Aware Benchmark for Automatic Survey Generation
- Are Aligned Large Language Models Still Misaligned?
- Evaluating Alignment of Behavioral Dispositions in LLMs
- Advancing AI Trustworthiness Through Patient Simulation: Risk Assessment of Conversational Agents for Antidepressant Selection
- LoopFormer: Elastic-Depth Looped Transformers for Latent Reasoning via Shortcut Modulation
- ADRD-Bench: A Preliminary LLM Benchmark for Alzheimer's Disease and Related Dementias
- When Audio-LLMs Don't Listen: A Cross-Linguistic Study of Modality Arbitration
- Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm
- SIGHT: Reinforcement Learning with Self-Evidence and Information-Gain Diverse Branching for Search Agent
- PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering
- Scene-Aware Memory Discrimination: Deciding Which Personal Knowledge Stays
- PACE: Prefix-Protected and Difficulty-Aware Compression for Efficient Reasoning
- Which Feedback Works for Whom? Differential Effects of LLM-Generated Feedback Elements Across Learner Profiles
- Finding Sense in Nonsense with Generated Contexts: Perspectives from Humans and Language Models
- Thinking with Drafting: Optical Decompression via Logical Reconstruction
- Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning
- A Subword Embedding Approach for Variation Detection in Luxembourgish User Comments
- LLM-based Triplet Extraction from Financial Reports
- Benchmark Illusion: Disagreement among LLMs and Its Scientific Consequences
- Cross-Modal Robustness Transfer (CMRT): Training Robust Speech Translation Models Using Adversarial Text
- Who is the richest club in the championship? Detecting and Rewriting Underspecified Questions Improve QA Performance
- Do Large Language Models Adapt to Language Variation across Socioeconomic Status?
- Scaling Model and Data for Multilingual Machine Translation with Open Large Language Models
- DHPLT: large-scale multilingual diachronic corpora and word representations for semantic change modelling
- Automatic Simplification of Common Vulnerabilities and Exposures Descriptions
- LaCy: What Small Language Models Can and Should Learn is Not Just a Question of Loss
- Disentangling Ambiguity from Instability in Large Language Models: A Clinical Text-to-SQL Case Study
- Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
- P-GenRM: Personalized Generative Reward Model with Test-time User-based Scaling
- A Rule-based Computational Model for Gaidhlig Morphology
- WavBench: Benchmarking Reasoning, Colloquialism, and Paralinguistics for End-to-End Spoken Dialogue Models
- CitiLink-Minutes: A Multilayer Annotated Dataset of Municipal Meeting Minutes
- Query-focused and Memory-aware Reranker for Long Context Processing
- ExStrucTiny: A Benchmark for Schema-Variable Structured Information Extraction from Document Images
- Detecting Overflow in Compressed Token Representations for Retrieval-Augmented Generation
- On-Policy Context Distillation for Language Models
- Althea: Human-AI Collaboration for Fact-Checking and Critical Reasoning
- Agent-Diff: Benchmarking LLM Agents on Enterprise API Tasks via Code Execution with State-Diff-Based Evaluation
- ABot-M0: VLA Foundation Model for Robotic Manipulation with Action Manifold Learning
- Jailbreaking Leaves a Trace: Understanding and Detecting Jailbreak Attacks from Internal Representations of Large Language Models
- Mask What Matters: Mitigating Object Hallucinations in Multimodal Large Language Models with Object-Aligned Visual Contrastive Decoding
- More Haste, Less Speed: Weaker Single-Layer Watermark Improves Distortion-Free Watermark Ensembles
- Artificial intelligence is creating a new global linguistic hierarchy
- Embodied Agents Meet Personalization: Investigating Challenges and Solutions Through the Lens of Memory Utilization
- DD-MDN: Human Trajectory Forecasting with Diffusion-Based Dual Mixture Density Networks and Uncertainty Self-Calibration
- ReTracing: An Archaeological Approach Through Body, Machine, and Generative Systems
- Aggregate Models, Not Explanations: Improving Feature Importance Estimation
- Decentralized Non-convex Stochastic Optimization with Heterogeneous Variance
- How to Sample High Quality 3D Fractals for Action Recognition Pre-Training?
- A Comparative Study of MAP and LMMSE Estimators for Blind Inverse Problems
- EqDeepRx: Learning a Scalable MIMO Receiver
- Free Lunch for Stabilizing Rectified Flow Inversion
- Scale-Invariant Fast Convergence in Games
- DMAP: A Distribution Map for Text
- Echo: Towards Advanced Audio Comprehension via Audio-Interleaved Reasoning
- TADA! Tuning Audio Diffusion Models through Activation Steering
- Insights on Muon from Simple Quadratics
- Benchmarking Vision-Language Models for French PDF-to-Markdown Conversion
- Calibrated Bayesian Deep Learning for Explainable Decision Support Systems Based on Medical Imaging
- The Implicit Bias of Logit Regularization
- Safety Beyond the Training Data: Robust Out-of-Distribution MPC via Conformalized System Level Synthesis
- Iskra: A System for Inverse Geometry Processing
- Towards Personalized Bangla Book Recommendation: A Large-Scale Multi-Entity Book Graph Dataset
- Convex Markov Games and Beyond: New Proof of Existence, Characterization and Learning Algorithms for Nash Equilibria
- Moonshine v2: Ergodic Streaming Encoder ASR for Latency-Critical Speech Applications
- Is Online Linear Optimization Sufficient for Strategic Robustness?
- T3D: Few-Step Diffusion Language Models via Trajectory Self-Distillation with Direct Discriminative Optimization
- MonarchRT: Efficient Attention for Real-Time Video Generation
- Learning to Control: The iUzawa-Net for Nonsmooth Optimal Control of Linear PDEs
- Accelerating nuclear-norm regularized low-rank matrix optimization through Burer-Monteiro decomposition
- Optimizing Sampling Patterns for Compressed Sensing MRI with Diffusion Generative Models
- SeqRisk: Transformer-augmented latent variable model for robust survival prediction with longitudinal data
- Fine-tuning Quantized Neural Networks with Zeroth-order Optimization
- Hyperparameter Transfer with Mixture-of-Expert Layers
- Accelerating Large Language Model Inference with Self-Supervised Early Exits
- Feature-Based Interpretable Surrogates for Optimization
- Controlling Dynamical Systems into Unseen Target States Using Machine Learning
- SonicSieve: Bringing Directional Speech Extraction to Smartphones Using Acoustic Microstructures
- Empirical Likelihood-Based Fairness Auditing: Distribution-Free Certification and Flagging
- Potential-energy gating for robust state estimation in bistable stochastic systems
- DICE: Diffusion Large Language Models Excel at Generating CUDA Kernels
- Dopamine: Brain Modes, Not Brains
- U-Former ODE: Fast Probabilistic Forecasting of Irregular Time Series
- TUBO: A Tailored ML Framework for Reliable Network Traffic Forecasting
- MUSE: Multi-Tenant Model Serving With Seamless Model Updates
- Temperature as a Meta-Policy: Adaptive Temperature in LLM Reinforcement Learning
- Latent-Variable Learning of SPDEs via Wiener Chaos
- Temporal Difference Learning with Constrained Initial Representations
- SpaTeoGL: Spatiotemporal Graph Learning for Interpretable Seizure Onset Zone Analysis from Intracranial EEG
- TopoFair: Linking Topological Bias to Fairness in Link Prediction Benchmarks
- From Path Signatures to Sequential Modeling: Incremental Signature Contributions for Offline RL
- Deep Kernel Fusion for Transformers
- CAAL: Confidence-Aware Active Learning for Heteroscedastic Atmospheric Regression
- Towards Sustainable Investment Policies Informed by Opponent Shaping
- Robust Optimization Approach and Learning Based Hide-and-Seek Game for Resilient Network Design
- A²V-SLP: Alignment-Aware Variational Modeling for Disentangled Sign Language Production
- In-Context Function Learning in Large Language Models
- Universal Diffusion-Based Probabilistic Downscaling
- Learning Conditional Averages
- Extending Puzzle for Mixture-of-Experts Reasoning Models with Application to GPT-OSS Acceleration
- Temporally Unified Adversarial Perturbations for Time Series Forecasting
- Using predictive multiplicity to measure individual performance within the AI Act
- Are Two LLMs Better Than One? A Student-Teacher Dual-Head LLMs Architecture for Pharmaceutical Content Optimization
- RAM-Net: Expressive Linear Attention with Selectively Addressable Memory
- Momentum LMS Theory beyond Stationarity: Stability, Tracking, and Regret
- FedGRPO: Privately Optimizing Foundation Models with Group-Relative Rewards from Domain Client
- Improved state mixing in higher-order and block diagonal linear recurrent networks
- Protein Circuit Tracing via Cross-layer Transcoders
- PrefillShare: A Shared Prefill Module for KV Reuse in Multi-LLM Disaggregated Serving
- Improving HPC Code Generation Capability of LLMs via Online Reinforcement Learning with Real-Machine Benchmark Rewards
- PathCRF: Ball-Free Soccer Event Detection via Possession Path Inference from Player Trajectories
- Empirical Gaussian Processes
- Geometry of Uncertainty: Learning Metric Spaces for Multimodal State Estimation in RL
- Few-Shot Design Optimization by Exploiting Auxiliary Information
- Capability-Oriented Training Induced Alignment Risk
- Oscillators Are All You Need: Irregular Time Series Modelling via Damped Harmonic Oscillators with Closed-Form Solutions
- It's TIME: Towards the Next Generation of Time Series Forecasting Benchmarks
- SafeNeuron: Neuron-Level Safety Alignment for Large Language Models
- Amortized Molecular Optimization via Group Relative Policy Optimization
- How Sampling Shapes LLM Alignment: From One-Shot Optima to Iterative Dynamics
- WaveFormer: Wavelet Embedding Transformer for Biomedical Signals
- Learning to Forget Attention: Memory Consolidation for Adaptive Compute Reduction
- Diffusion Alignment Beyond KL: Variance Minimisation as Effective Policy Optimiser
- Categorical Flow Maps
- Community Concealment from Unsupervised Graph Learning-Based Clustering
- Self-Supervised Learning via Flow-Guided Neural Operator on Time-Series Data
- Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage
- When and What to Ask: AskBench and Rubric-Guided RLVR for LLM Clarification
- Generative AI-Driven Phase Control for RIS-Aided Cell-Free Massive MIMO Systems
- Active Zero: Self-Evolving Vision-Language Models through Active Environment Exploration
- Unlearnable phases of matter
- Hierarchical Testing of a Hybrid Machine Learning-Physics Global Atmosphere Model
- Amortised and provably-robust simulation-based inference
- Sample-Free Safety Assessment of Neural Network Controllers via Taylor Methods
- Traffic Flow Reconstruction from Limited Collected Data
- Latent Forcing: Reordering the Diffusion Trajectory for Pixel-Space Image Generation
- The Cost of Learning under Multiple Change Points
- Optimizing Agent Planning for Security and Autonomy
- Surface impedance inference via neural fields and sparse acoustic data obtained by a compact array
- Adaptive Power Iteration Method for Differentially Private PCA
- Calibration and Evaluation of Car-Following Models for Autonomous Shuttles Using a Novel Multi-Criteria Framework
- HyperDet: 3D Object Detection with Hyper 4D Radar Point Clouds
- PLESS: Pseudo-Label Enhancement with Spreading Scribbles for Weakly Supervised Segmentation
- Enforcing Reciprocity in Operator Learning for Seismic Wave Propagation
- LAER-MoE: Load-Adaptive Expert Re-layout for Efficient Mixture-of-Experts Training
- Estimation of instrument and noise parameters for inverse problem based on prior diffusion model
- PAC-Bayesian Generalization Guarantees for Fairness on Stochastic and Deterministic Classifiers
- NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control
- STaR: Scalable Task-Conditioned Retrieval for Long-Horizon Multimodal Robot Memory
- GAC-KAN: An Ultra-Lightweight GNSS Interference Classifier for GenAI-Powered Consumer Edge Devices
- Predicting the post-wildfire mudflow onset using machine learning models on multi-parameter experimental data
- AM-FM: A Foundation Model for Ambient Intelligence Through WiFi
- Adaptive Physics Transformer with Fused Global-Local Attention for Subsurface Energy Systems
- Towards Compressive and Scalable Recurrent Memory
- Charting Empirical Laws for LLM Fine-Tuning in Scientific Multi-Discipline Learning
- Protein Language Model Embeddings Improve Generalization of Implicit Transfer Operators
- The Magic Correlations: Understanding Knowledge Transfer from Pretraining to Supervised Fine-Tuning
- Patch the Distribution Mismatch: RL Rewriting Agent for Stable Off-Policy SFT
- Learning Glioblastoma Tumor Heterogeneity Using Brain Inspired Topological Neural Networks
- Evaluating Memory Structure in LLM Agents
- Efficient Analysis of the Distilled Neural Tangent Kernel
- Structured Hybrid Mechanistic Models for Robust Estimation of Time-Dependent Intervention Outcomes
- Toward Adaptive Non-Intrusive Reduced-Order Models: Design and Challenges
- WSBD: Freezing-Based Optimizer for Quantum Neural Networks
- Provably Efficient Algorithms for S- and Non-Rectangular Robust MDPs with General Parameterization
- Sparse Semantic Dimension as a Generalization Certificate for LLMs
- CADET: Context-Conditioned Ads CTR Prediction With a Decoder-Only Transformer
- TimeSynth: A Framework for Uncovering Systematic Biases in Time Series Forecasting
- Multi-Level Strategic Classification: Incentivizing Improvement through Promotion and Relegation Dynamics
- Hierarchical Concept Embedding & Pursuit for Interpretable Image Classification
- Assessing Low Back Movement with Motion Tape Sensor Data Through Deep Learning
- PRISM: A 3D Probabilistic Neural Representation for Interpretable Shape Modeling
- External Division of Two Bregman Proximity Operators for Poisson Inverse Problems
- Exploring Multiple High-Scoring Subspaces in Generative Flow Networks
- Partial GFlowNet: Accelerating Convergence in Large State Spaces via Strategic Partitioning
- A Generic Framework for Fair Consensus Clustering in Streams
- Calibrating an Imperfect Auxiliary Predictor for Unobserved No-Purchase Choice
- Unifying Stable Optimization and Reference Regularization in RLHF
- PASCAL: A Phase-Aware Scheduling Algorithm for Serving Reasoning-based Large Language Models
- Real-Time Proactive Anomaly Detection via Forward and Backward Forecast Modeling
- The Implicit Bias of Steepest Descent with Mini-batch Stochastic Gradient
- Brain4FMs: A Benchmark of Foundation Models for Electrical Brain Signal
- Learn from Your Mistakes: Self-Correcting Masked Diffusion Models
- SkillRater: Untangling Capabilities in Multimodal Data
- How Well Do Large-Scale Chemical Language Models Transfer to Downstream Tasks?
- TreeGrad-Ranker: Feature Ranking via O(L)-Time Gradients for Decision Trees
- GP2F: Cross-Domain Graph Prompting with Adaptive Fusion of Pre-trained Graph Neural Networks
- TIP: Resisting Gradient Inversion via Targeted Interpretable Perturbation in Federated Learning
- Both Topology and Text Matter: Revisiting LLM-guided Out-of-Distribution Detection on Text-attributed Graphs
- UMAP Is Spectral Clustering on the Fuzzy Nearest-Neighbor Graph
- Fully First-Order Algorithms for Online Bilevel Optimization
- Explainable Machine-Learning based Detection of Knee Injuries in Runners
- SpiralFormer: Looped Transformers Can Learn Hierarchical Dependencies via Multi-Resolution Recursion
- Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception
- Towards Fair and Comprehensive Evaluation of Routers in Collaborative LLM Systems
- SynthRAR: Ring Artifacts Reduction in CT with Unrolled Network and Synthetic Data Training
- Where Bits Matter in World Model Planning: A Paired Mixed-Bit Study for Efficient Spatial Reasoning
- Agentic AI for Cybersecurity: A Meta-Cognitive Architecture for Governable Autonomy
- Mitigating Mismatch within Reference-based Preference Optimization
- Leveraging LLMs to support co-evolution between definitions and instances of textual DSLs: A Systematic Evaluation
- DynaHOI: Benchmarking Hand-Object Interaction for Dynamic Target
- Who Does What? Archetypes of Roles Assigned to LLMs During Human-AI Decision-Making
- AdaptEvolve: Improving Efficiency of Evolutionary AI Agents through Adaptive Model Selection
- IncompeBench: A Permissively Licensed, Fine-Grained Benchmark for Music Information Retrieval
- Synthesis of Late Gadolinium Enhancement Images via Implicit Neural Representations for Cardiac Scar Segmentation
- Towards Performance-Enhanced Model-Contrastive Federated Learning using Historical Information in Heterogeneous Scenarios
- TAVAE: A VAE with Adaptable Priors Explains Contextual Modulation in the Visual Cortex
- Manifold-Aware Temporal Domain Generalization for Large Language Models
- Accelerating Robotic Reinforcement Learning with Agent Guidance
- Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?
- On the Sensitivity of Firing Rate-Based Federated Spiking Neural Networks to Differential Privacy
- An Empirical Study of the Imbalance Issue in Software Vulnerability Detection
- Fourier Transformers for Latent Crystallographic Diffusion and Generative Modeling
- ModelWisdom: An Integrated Toolkit for TLA+ Model Visualization, Digest and Repair
- Choose Your Agent: Tradeoffs in Adopting AI Advisors, Coaches, and Delegates in Multi-Party Negotiation
- DeepSight: An All-in-One LM Safety Toolkit
- Multi Graph Search for High-Dimensional Robot Motion Planning
- On the Complexity of Offline Reinforcement Learning with Q*-Approximation and Partial Coverage
- KAN-FIF: Spline-Parameterized Lightweight Physics-based Tropical Cyclone Estimation on Meteorological Satellite
- Meta-Sel: Efficient Demonstration Selection for In-Context Learning via Supervised Meta-Learning
- Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
- On the Adoption of AI Coding Agents in Open-source Android and iOS Development
- dVoting: Fast Voting for dLLMs
- 3DGSNav: Enhancing Vision-Language Model Reasoning for Object Navigation via Active 3D Gaussian Splatting
- SAGEO Arena: A Realistic Environment for Evaluating Search-Augmented Generative Engine Optimization
- Visual Reasoning Benchmark: Evaluating Multimodal LLMs on Classroom-Authentic Visual Problems from Primary Education
- DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing
- VIRENA: Virtual Arena for Research, Education, and Democratic Innovation
- The Observer Effect in World Models: Invasive Adaptation Corrupts Latent Physics
- Towards On-Policy SFT: Distribution Discriminant Theory and its Applications in LLM Training
- Bandit Learning in Matching Markets with Interviews
- Energy-Aware Spike Budgeting for Continual Learning in Spiking Neural Networks for Neuromorphic Vision
- Olmix: A Framework for Data Mixing Throughout LM Development
- Intrinsic-Energy Joint Embedding Predictive Architectures Induce Quasimetric Spaces
- ExtractBench: A Benchmark and Evaluation Methodology for Complex Structured Extraction
- A technical curriculum on language-oriented artificial intelligence in translation and specialised communication
- On the implicit regularization of Langevin dynamics with projected noise
- Creative Ownership in the Age of AI
- AttentionRetriever: Attention Layers are Secretly Long Document Retrievers
- UniT: Unified Multimodal Chain-of-Thought Test-time Scaling
- Scaling Verification Can Be More Effective than Scaling Policy Learning for Vision-Language-Action Alignment
- Phase Transition for Budgeted Multi-Agent Synergy
- Compiling High-Level Neural Network Specifications into VNN-LIB Queries
- NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews
- PBP: Post-training Backdoor Purification for Malware Classifiers
- SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents
- Credal Concept Bottleneck Models: Structural Separation of Epistemic and Aleatoric Uncertainty
- AI-Driven Clinical Decision Support System for Enhanced Diabetes Diagnosis and Management
- Toward Reliable Tea Leaf Disease Diagnosis Using Deep Learning Model: Enhancing Robustness With Explainable AI and Adversarial Training
- How Many Features Can a Language Model Store Under the Linear Representation Hypothesis?
- DeepRed: an architecture for redshift estimation
- HiFloat4 Format for Language Model Inference
- CryptoAnalystBench: Failures in Multi-Tool Long-Form LLM Analysis
- Predictive Associative Memory: Retrieval Beyond Similarity Through Temporal Co-occurrence
- Security Threat Modeling for Emerging AI-Agent Protocols: A Comparative Analysis of MCP, A2A, Agora, and ANP
- MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation
- Situated, Dynamic, and Subjective: Envisioning the Design of Theory-of-Mind-Enabled Everyday AI with Industry Practitioners
- Divide and Learn: Multi-Objective Combinatorial Optimization at Scale
- When Models Examine Themselves: Vocabulary-Activation Correspondence in Self-Referential Processing
- Bootstrapping-based Regularisation for Reducing Individual Prediction Instability in Clinical Risk Prediction Models
- Finding the Cracks: Improving LLMs Reasoning with Paraphrastic Probing and Consistency Verification
- The Energy of Falsehood: Detecting Hallucinations via Diffusion Model Likelihoods
- The Manifold of the Absolute: Religious Perennialism as Generative Inference
- Retrieval-Aware Distillation for Transformer-SSM Hybrids
- General and Efficient Steering of Unconditional Diffusion
- Can We Really Learn One Representation to Optimize All Rewards?
- When Visibility Outpaces Verification: Delayed Verification and Narrative Lock-in in Agentic AI Discourse
- Gradients Must Earn Their Influence: Unifying SFT with Generalized Entropic Objectives
- Fighting MRI Anisotropy: Learning Multiple Cardiac Shapes From a Single Implicit Neural Representation
- Towards Reliable Machine Translation: Scaling LLMs for Critical Error Detection and Safety
- Enhanced Portable Ultra Low-Field Diffusion Tensor Imaging with Bayesian Artifact Correction and Deep Learning-Based Super-Resolution
- From Noise to Order: Learning to Rank via Denoising Diffusion
- EM-Aware Physical Synthesis: Neural Inductor Modeling and Intelligent Placement & Routing for RF Circuits
- Compiler-Guided Inference-Time Adaptation: Improving GPT-5 Programming Performance in Idris
- Understanding Persuasive Interactions between Generative Social Agents and Humans: The Knowledge-based Persuasion Model (KPM)
- RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis
- Multimodal Fact-Level Attribution for Verifiable Reasoning
- Differentially Private and Communication Efficient Large Language Model Split Inference via Stochastic Quantization and Soft Prompt
- How Smart Is Your GUI Agent? A Framework for the Future of Software Interaction
- Locally Interpretable Individualized Treatment Rules for Black-Box Decision Models
- Adaptive Milestone Reward for GUI Agents
- Stop Tracking Me! Proactive Defense Against Attribute Inference Attack in LLMs
- AltTS: A Dual-Path Framework with Alternating Optimization for Multivariate Time Series Forecasting
- Krause Synchronization Transformers
- Native Reasoning Models: Training Language Models to Reason on Unverifiable Data
- TS-Memory: Plug-and-Play Memory for Time Series Foundation Models
- Perception-based Image Denoising via Generative Compression
- ReaDy-Go: Real-to-Sim Dynamic 3D Gaussian Splatting Simulation for Environment-Specific Visual Navigation with Moving Obstacles
- Analytical Search
- Gradient Compression May Hurt Generalization: A Remedy by Synthetic Data Guided Sharpness Aware Minimization
- ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation
- PLOT-CT: Pre-log Voronoi Decomposition Assisted Generation for Low-dose CT Reconstruction
- ArGEnT: Arbitrary Geometry-encoded Transformer for Operator Learning
- ScalSelect: Scalable Training-Free Multimodal Data Selection for Efficient Visual Instruction Tuning
- Variation-aware Flexible 3D Gaussian Editing
- ViTaS: Visual Tactile Soft Fusion Contrastive Learning for Visuomotor Learning
- Brain Tumor Classifiers Under Attack: Robustness of ResNet Variants Against Transferable FGSM and PGD Attacks
- DMind-3: A Sovereign Edge–Local–Cloud AI System with Controlled Deliberation and Correction-Based Tuning for Safe, Low-Latency Transaction Execution
- LoRA-based Parameter-Efficient LLMs for Continuous Learning in Edge-based Malware Detection
- SToRM: Supervised Token Reduction for Multi-modal LLMs toward efficient end-to-end autonomous driving
- Provable Offline Reinforcement Learning for Structured Cyclic MDPs
- PatientHub: A Unified Framework for Patient Simulation
- DRACO: a Cross-Domain Benchmark for Deep Research Accuracy, Completeness, and Objectivity
- ANML: Attribution-Native Machine Learning with Guaranteed Robustness
- OMEGA-Avatar: One-shot Modeling of 360° Gaussian Avatars
- TabSieve: Explicit In-Table Evidence Selection for Tabular Prediction
- Semantically Conditioned Diffusion Models for Cerebral DSA Synthesis
- LLM-Driven 3D Scene Generation of Agricultural Simulation Environments
- Adapting Vision-Language Models for E-commerce Understanding at Scale
- AmbiBench: Benchmarking Mobile GUI Agents Beyond One-Shot Instructions in the Wild
- Cooperation Breakdown in LLM Agents Under Communication Delays
- MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling
- Safe Fairness Guarantees Without Demographics in Classification: Spectral Uncertainty Set Perspective
- Evaluating LLM Safety Under Repeated Inference via Accelerated Prompt Stress Testing
- ULTRA: Urdu Language Transformer-based Recommendation Architecture
- Improving Neural Retrieval with Attribution-Guided Query Rewriting
- Resource-Aware Deployment Optimization for Collaborative Intrusion Detection in Layered Networks
- PhyNiKCE: A Neurosymbolic Agentic Framework for Autonomous Computational Fluid Dynamics
- Benchmark Health Index: A Systematic Framework for Benchmarking the Benchmarks of LLMs
- Right for the Wrong Reasons: Epistemic Regret Minimization for Causal Rung Collapse in LLMs
- Beyond Pixels: Vector-to-Graph Transformation for Reliable Schematic Auditing
- ThinkRouter: Efficient Reasoning via Routing Thinking between Latent and Discrete Spaces
- Beyond Parameter Arithmetic: Sparse Complementary Fusion for Distribution-Aware Model Merging
- Cross-Architecture Model Diffing with Crosscoders: Unsupervised Discovery of Differences Between LLMs
- Text2GQL-Bench: A Text to Graph Query Language Benchmark [Experiment, Analysis & Benchmark]
- AIR: Improving Agent Safety through Incident Response
- TSR: Trajectory-Search Rollouts for Multi-Turn RL of LLM Agents
- How to Optimize Multispecies Set Predictions in Presence-Absence Modeling?
- RELATE: A Reinforcement Learning-Enhanced LLM Framework for Advertising Text Generation
- FlowMind: Execute-Summarize for Structured Workflow Generation from LLM Reasoning
- Beyond End-to-End Video Models: An LLM-Based Multi-Agent System for Educational Video Generation
- Detecting RLVR Training Data via Structural Convergence of Reasoning
- Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation
- PuYun-LDM: A Latent Diffusion Model for High-Resolution Ensemble Weather Forecasts
- Predicting LLM Output Length via Entropy-Guided Representations
- Revis: Sparse Latent Steering to Mitigate Object Hallucination in Large Vision-Language Models
- Prototype Transformer: Towards Language Model Architectures Interpretable by Design
- Talk2DM: Enabling Natural Language Querying and Commonsense Reasoning for Vehicle-Road-Cloud Integrated Dynamic Maps with Large Language Models
- Intelligent AI Delegation
- From Atoms to Trees: Building a Structured Feature Forest with Hierarchical Sparse Autoencoders
- When Should LLMs Be Less Specific? Selective Abstraction for Reliable Long-Form Text Generation
- AlphaPROBE: Alpha Mining via Principled Retrieval and On-graph biased evolution
- MEME: Modeling the Evolutionary Modes of Financial Markets
- Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments
- CSEval: A Framework for Evaluating Clinical Semantics in Text-to-Image Generation
- InjectRBP: Steering Large Language Model Reasoning Behavior via Pattern Injection
- Multi UAVs Preflight Planning in a Shared and Dynamic Airspace
- LawThinker: A Deep Research Legal Agent in Dynamic Environments
- Tiny Recursive Reasoning with Mamba-2 Attention Hybrid
- Differentiable Modal Logic for Multi-Agent Diagnosis, Orchestration and Communication
- The Pensieve Paradigm: Stateful Language Models Mastering Their Own Context
- Stop Unnecessary Reflection: Training LRMs for Efficient Reasoning with Adaptive Reflection and Length Coordinated Penalty
- Commencing-Student Enrolment Forecasting Under Data Sparsity with Time Series Foundation Models
- HLA: Hadamard Linear Attention
- Neutral Prompts, Non-Neutral People: Quantifying Gender and Skin-Tone Bias in Gemini Flash 2.5 Image and GPT Image 1.5
- Value Alignment Tax: Measuring Value Trade-offs in LLM Alignment
- STAR: Bridging Statistical and Agentic Reasoning for Large Model Performance Prediction
- Seq2Seq2Seq: Lossless Data Compression via Discrete Latent Transformers and Reinforcement Learning
- GPT-4o Lacks Core Features of Theory of Mind
- Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision
- Statistical Parsing for Logical Information Retrieval
- Pedagogically-Inspired Data Synthesis for Language Model Knowledge Distillation
- SAM3-LiteText: An Anatomical Study of the SAM3 Text Encoder for Efficient Vision-Language Segmentation
- "Sorry, I Didn't Catch That": How Speech Models Miss What Matters Most
- Think like a Scientist: Physics-guided LLM Agent for Equation Discovery
- CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn and Multi-Step Agentic Tool Use
- Agentic Test-Time Scaling for WebAgents
- HybridRAG: A Practical LLM-based ChatBot Framework based on Pre-Generated Q&A over Raw Unstructured Documents
- Methodological Variation in Studying Staff and Student Perceptions of AI
- BIRD: A Museum Open Dataset Combining Behavior Patterns and Identity Types to Better Model Visitors' Experience
- Nested Named Entity Recognition in Plasma Physics Research Articles
- Automated Optimization Modeling via a Localizable Error-Driven Perspective
- Assessing LLM Reliability on Temporally Recent Open-Domain Questions
- Small Updates, Big Doubts: Does Parameter-Efficient Fine-tuning Enhance Hallucination Detection?
- Visualizing and Benchmarking LLM Factual Hallucination Tendencies via Internal State Analysis and Clustering
- Enhancing SDG-Text Classification with Combinatorial Fusion Analysis and Generative AI
- Disentangling Direction and Magnitude in Transformer Representations: A Double Dissociation Through L2-Matched Perturbation Analysis
- Efficient Hyper-Parameter Search for LoRA via Language-aided Bayesian Optimization
- The Script Tax: Measuring Tokenization-Driven Efficiency and Latency Disparities in Multilingual Language Models
- Evaluating Few-Shot Temporal Reasoning of LLMs for Human Activity Prediction in Smart Environments
- What Do LLMs Know About Alzheimer's Disease? Fine-Tuning, Probing, and Data Synthesis for AD Detection
- From Instruction to Output: The Role of Prompting in Modern NLG
- KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models
- Spectra: Rethinking Optimizers for LLMs Under Spectral Anisotropy
- TDPNavigator-Placer: Thermal- and Wirelength-Aware Chiplet Placement in 2.5D Systems Through Multi-Agent Reinforcement Learning
- MuCO: Generative Peptide Cyclization Empowered by Multi-stage Conformation Optimization
- Time-TK: A Multi-Offset Temporal Interaction Framework Combining Transformer and Kolmogorov-Arnold Networks for Time Series Forecasting
- MELINOE: Fine-Tuning Enables Memory-Efficient Inference for Mixture-of-Experts Models
- Position-Aware Self-supervised Representation Learning for Cross-mode Radar Signal Recognition
- Hybrid operator learning of wave scattering maps in high-contrast media
- DDL2PropBank Agent: Benchmarking Multi-Agent Frameworks' Developer Experience Through a Novel Relational Schema Mapping Task
- interwhen: A Generalizable Framework for Verifiable Reasoning with Test-time Monitors
- Zero-Sacrifice Persistent-Robustness Adversarial Defense for Pre-Trained Encoders
- UltraLIF: Fully Differentiable Spiking Neural Networks via Ultradiscretization and Max-Plus Algebra
- Explaining AI Without Code: A User Study on Explainable AI
- Latent Generative Solvers for Generalizable Long-Term Physics Simulation
- On Decision-Valued Maps and Representational Dependence
- Voxtral Realtime
- The PBSAI Governance Ecosystem: A Multi-Agent AI Reference Architecture for Securing Enterprise AI Estates
- Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation
- Bi-Level Prompt Optimization for Multimodal LLM-as-a-Judge
- AgentNoiseBench: Benchmarking Robustness of Tool-Using LLM Agents Under Noisy Condition
- Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization
- ReplicatorBench: Benchmarking LLM Agents for Replicability in Social and Behavioral Sciences
- Causal-JEPA: Learning World Models through Object-Level Latent Interventions
- GHOST: Unmasking Phantom States in Mamba2 via Grouped Hidden-state Output-aware Selection & Truncation
- TRACER: Trajectory Risk Aggregation for Critical Episodes in Agentic Reasoning
- Distributionally Robust Cooperative Multi-Agent Reinforcement Learning via Robust Value Factorization
- Credit Where It is Due: Cross-Modality Connectivity Drives Precise Reinforcement Learning for MLLM Reasoning
- AgentLeak: A Full-Stack Benchmark for Privacy Leakage in Multi-Agent LLM Systems
- Human-Inspired Continuous Learning of Internal Reasoning Processes: Learning How to Think for Adaptive AI Systems
- CausalAgent: A Conversational Multi-Agent System for End-to-End Causal Inference
- Budget-Constrained Agentic Large Language Models: Intention-Based Planning for Costly Tool Use
- SemaPop: Semantic-Persona Conditioned Population Synthesis
- Learning to Configure Agentic AI Systems
- The Five Ws of Multi-Agent Communication: Who Talks to Whom, When, What, and Why – A Survey from MARL to Emergent Language and LLMs
- MAPLE: Modality-Aware Post-training and Learning Ecosystem
- scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery
- When Agents Disagree With Themselves: Measuring Behavioral Consistency in LLM-Based Agents
- Neuro-Symbolic Multitasking: A Unified Framework for Discovering Generalizable Solutions to PDE Families
- Do MLLMs Really Understand Space? A Mathematical Reasoning Evaluation
- Quark Medical Alignment: A Holistic Multi-Dimensional Alignment and Collaborative Optimization Paradigm
Research Sources: 486 | Generated: 2/13/2026
