AI Research News Feeds for October 1st, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios
Learning an Ensemble Token from Task-driven Priors in Facial Analysis
Taming Diffusion Transformer for Efficient Mobile Video Generation in Seconds
tCURLoRA: Tensor CUR Decomposition Based Low-Rank Parameter Adaptation and Its Application in Medical Image Segmentation
Towards autonomous photogrammetric forest inventory using a lightweight under-canopy robotic drone
PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model
J-NeuS: Joint field optimization for Neural Surface reconstruction in urban scenes with limited image overlap
Binary Diffusion Probabilistic Model
RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
DecAlign: Hierarchical Cross-Modal Alignment for Decoupled Multimodal Representation Learning
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
BoundMatch: Boundary detection applied to semi-supervised segmentation
SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection for SLAM
BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation
Neural Catalog: Scaling Species Recognition with Catalog of Life-Augmented Generation
KDC-Diff: A Latent-Aware Diffusion Model with Knowledge Retention for Memory-Efficient Image Generation
StPR: Spatiotemporal Preservation and Routing for Exemplar-Free Video Class-Incremental Learning
Multi-View Projection for Unsupervised Domain Adaptation in 3D Semantic Segmentation
How Do Large Vision-Language Models See Text in Image? Unveiling the Distinctive Role of OCR Heads
Photography Perspective Composition: Towards Aesthetic Perspective Recommendation
AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding
Fork-Merge Decoding: Enhancing Multimodal Understanding in Audio-Visual Large Language Models
Anatomy-DT: A Cross-Diffusion Digital Twin for Anatomical Evolution
Online Mapping for Autonomous Driving: Addressing Sensor Generalization and Dynamic Map Updates in Campus Environments
LTA-L2S: Lexical Tone-Aware Lip-to-Speech Synthesis for Mandarin with Cross-Lingual Transfer Learning
dVLA: Diffusion Vision-Language-Action Model with Multimodal Chain-of-Thought
Multi-modal Liver Segmentation and Fibrosis Staging Using Real-world MRI Images
MR$^2$-Bench: Going Beyond Matching to Reasoning in Multimodal Retrieval
GastroViT: A Vision Transformer Based Ensemble Learning Approach for Gastrointestinal Disease Classification with Grad CAM & SHAP Visualization
Automated and Scalable SEM Image Analysis of Perovskite Solar Cell Materials via a Deep Segmentation Framework
M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation
RealLiFe: Real-Time Light Field Reconstruction via Hierarchical Sparse Gradient Descent
Developing Generalist Foundation Models from a Multimodal Dataset for 3D Computed Tomography
Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection
LoRA-PT: Low-Rank Adapting UNETR for Hippocampus Segmentation Using Principal Tensor Singular Values and Vectors
Rethinking Weak-to-Strong Augmentation in Source-Free Domain Adaptive Object Detection
Multi-temporal crack segmentation in concrete structures using deep learning approaches
ResGS: Residual Densification of 3D Gaussian for Efficient Detail Recovery
PRISM: Progressive Rain removal with Integrated State-space Modeling
Post-Training Quantization via Residual Truncation and Zero Suppression for Diffusion Models
Multi-View Camera System for Variant-Aware Autonomous Vehicle Inspection and Defect Detection
Stylos: Multi-View 3D Stylization with Single-Forward Gaussian Splatting
CBAM Integrated Attention Driven Model For Betel Leaf Diseases Classification With Explainable AI
Revealing the Power of Post-Training for Small Language Models via Knowledge Distillation
DEPTHOR++: Robust Depth Enhancement from a Real-World Lightweight dToF and RGB Guidance
Stable Cinemetrics: Structured Taxonomy and Evaluation for Professional Video Generation
Autoproof: Automated Segmentation Proofreading for Connectomics
DiffCamera: Arbitrary Refocusing on Images
Video Object Segmentation-Aware Audio Generation
Hy-Facial: Hybrid Feature Extraction by Dimensionality Reduction Methods for Enhanced Facial Expression Classification
DA$^2$: Depth Anything in Any Direction
HART: Human Aligned Reconstruction Transformer
Benchmarking Egocentric Visual-Inertial SLAM at City Scale
Query-Kontext: A Unified Multimodal Model for Image Generation and Editing
TTT3R: 3D Reconstruction as Test-Time Training
Challenges and Solutions in Selecting Optimal Lossless Data Compression Algorithms
Geometric Learning of Canonical Parameterizations of $2D$-curves
EasyOcc: 3D Pseudo-Label Supervision for Fully Self-Supervised Semantic Occupancy Prediction Models
Predicting Penalty Kick Direction Using Multi-Modal Deep Learning with Pose-Guided Attention
EchoGen: Generating Visual Echoes in Any Scene via Feed-Forward Subject-Driven Auto-Regressive Model
Human-MME: A Holistic Evaluation Benchmark for Human-Centric Multimodal Large Language Models
Beyond Overall Accuracy: Pose- and Occlusion-driven Fairness Analysis in Pedestrian Detection for Autonomous Driving
TSalV360: A Method and Dataset for Text-driven Saliency Detection in 360-Degrees Videos
Generalized Fine-Grained Category Discovery with Multi-Granularity Conceptual Experts
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance
Interpret, prune and distill Donut: towards lightweight VLMs for VQA on document
Seeing Space and Motion: Enhancing Latent Actions with Spatial and Dynamic Awareness for VLA
Cat: Post-training quantization error reduction via cluster-based affine transformation
Continuous Space-Time Video Super-Resolution with 3D Fourier Fields
SQUARE: Semantic Query-Augmented Fusion and Efficient Batch Reranking for Training-free Zero-Shot Composed Image Retrieval
Go with Your Gut: Scaling Confidence for Autoregressive Image Generation
PANDA: Towards Generalist Video Anomaly Detection via Agentic AI Engineer
MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation
Image-Difficulty-Aware Evaluation of Super-Resolution Models
LiDAR Point Cloud Colourisation Using Multi-Camera Fusion and Low-Light Image Enhancement
MAPLE: Multi-scale Attribute-enhanced Prompt Learning for Few-shot Whole Slide Image Classification
DeepSketcher: Internalizing Visual Manipulation for Multimodal Reasoning
LLaVAShield: Safeguarding Multimodal Multi-Turn Dialogues in Vision-Language Models
UniMMAD: Unified Multi-Modal and Multi-Class Anomaly Detection via MoE-Driven Feature Decompression
Self-Supervised Anatomical Consistency Learning for Vision-Grounded Medical Report Generation
A Multi-purpose Tracking Framework for Salmon Welfare Monitoring in Challenging Environments
PinPoint3D: Fine-Grained 3D Part Segmentation from a Few Clicks
Towards Reliable and Holistic Visual In-Context Learning Prompt Selection
AgenticIQA: An Agentic Framework for Adaptive and Interpretable Image Quality Assessment
New Fourth-Order Grayscale Indicator-Based Telegraph Diffusion Model for Image Despeckling
SETR: A Two-Stage Semantic-Enhanced Framework for Zero-Shot Composed Image Retrieval
GeoLink: Empowering Remote Sensing Foundation Model with OpenStreetMap Data
PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution
Causally Guided Gaussian Perturbations for Out-Of-Distribution Generalization in Medical Imaging
SGS: Segmentation-Guided Scoring for Global Scene Inconsistencies
DGM4+: Dataset Extension for Global Scene Inconsistency
AIMCoT: Active Information-driven Multimodal Chain-of-Thought for Vision-Language Reasoning
How Diffusion Models Memorize
ProbMed: A Probabilistic Framework for Medical Multimodal Binding
SAGE: Spatial-visual Adaptive Graph Exploration for Visual Place Recognition
LaTo: Landmark-tokenized Diffusion Transformer for Fine-grained Human Face Editing
The 1st Solution for MOSEv1 Challenge on LSVOS 2025: CGFSeg
LieHMR: Autoregressive Human Mesh Recovery with $SO(3)$ Diffusion
Dragging with Geometry: From Pixels to Geometry-Guided Image Editing
IPDRecon: Image-Plane Geometric Decoding for View-Invariant Indoor Scene Reconstruction
ART-VITON: Measurement-Guided Latent Diffusion for Artifact-Free Virtual Try-On
PCPO: Proportionate Credit Policy Optimization for Aligning Image Generation Models
Self-Evolving Vision-Language Models for Image Quality Assessment via Voting and Ranking
EchoingECG: An Electrocardiogram Cross-Modal Model for Echocardiogram Tasks
Adapting SAM with Dynamic Similarity Graphs for Few-Shot Parameter-Efficient Small Dense Object Detection: A Case Study of Chickpea Pods in Field Conditions
Overview of GeoLifeCLEF 2023: Species Composition Prediction with High Spatial Resolution at Continental Scale Using Remote Sensing
MuSLR: Multimodal Symbolic Logical Reasoning
PatchEAD: Unifying Industrial Visual Prompting Frameworks for Patch-Exclusive Anomaly Detection
Seeing Through Deception: Uncovering Misleading Creator Intent in Multimodal News with Vision-Language Models
ReLoop: "Seeing Twice and Thinking Backwards" via Closed-loop Training to Mitigate Hallucinations in Multimodal understanding
Taming the Tri-Space Tension: ARC-Guided Hallucination Modeling and Control for Text-to-Image Generation
LUMA: Low-Dimension Unified Motion Alignment with Dual-Path Anchoring for Text-to-Motion Diffusion Model
Editing Physiological Signals in Videos Using Latent Representations
DepthLM: Metric Depth From Vision Language Models
Infrastructure Sensor-enabled Vehicle Data Generation using Multi-Sensor Fusion for Proactive Safety Applications at Work Zone
Seeing Before Reasoning: A Unified Framework for Generalizable and Explainable Fake Image Detection
Robust Visual Localization in Compute-Constrained Environments by Salient Edge Rendering and Weighted Hamming Similarity
FishNet++: Analyzing the capabilities of Multimodal Large Language Models in marine biology
GaussianLens: Localized High-Resolution Reconstruction via On-Demand Gaussian Densification
LMOD+: A Comprehensive Multimodal Dataset and Benchmark for Developing and Evaluating Multimodal Large Language Models in Ophthalmology
Anchor-free Cross-view Object Geo-localization with Gaussian Position Encoding and Cross-view Association
DescribeEarth: Describe Anything for Remote Sensing Images
OmniDFA: A Unified Framework for Open Set Synthesis Image Detection and Few-Shot Attribution
Cut the Deadwood Out: Backdoor Purification via Guided Module Substitution
SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
Safety is Not Only About Refusal: Reasoning-Enhanced Fine-tuning for Interpretable LLM Safety
A Position Paper on the Automatic Generation of Machine Learning Leaderboards
Wolf Hidden in Sheep's Conversations: Toward Harmless Data-Based Backdoor Attacks for Jailbreaking Large Language Models
Frankentext: Stitching random text fragments into long-form narratives
v1: Learning to Point Visual Tokens for Multimodal Grounded Reasoning
ConfRAG: Confidence-Guided Retrieval-Augmenting Generation
TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning
Medical Red Teaming Protocol of Language Models: On the Importance of User Perspectives in Healthcare Settings
Text-Based Approaches to Item Alignment to Content Standards in Large-Scale Reading & Writing Tests
CreAgentive: An Agent Workflow Driven Multi-Category Creative Generation Engine
dParallel: Learnable Parallel Decoding for dLLMs
BatonVoice: An Operationalist Framework for Enhancing Controllable Speech Synthesis with Linguistic Intelligence from LLMs
Training Matryoshka Mixture-of-Experts for Elastic Inference-Time Expert Utilization
Towards Reliable Benchmarking: A Contamination Free, Controllable Evaluation Framework for Multi-step LLM Function Calling
Generating Difficult-to-Translate Texts
Scaling Spoken Language Models with Syllabic Speech Tokenization
ActorDB: A Unified Database Model Integrating Single-Writer Actors, Incremental View Maintenance, and Zero-Trust Messaging
Fingerprinting LLMs via Prompt Injection
FinCap: Topic-Aligned Captions for Short-Form Financial YouTube Videos
A Multimodal LLM Approach for Visual Question Answering on Multiparametric 3D Brain MRI
VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs
ProfVLM: A Lightweight Video-Language Model for Multi-View Proficiency Estimation
Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages
Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework
Vocabulary Customization for Efficient Domain-Specific LLM Deployment
The Hunger Game Debate: On the Emergence of Over-Competition in Multi-Agent Systems
CliniBench: A Clinical Outcome Prediction Benchmark for Generative and Encoder-Based Language Models
MGen: Millions of Naturally Occurring Generics in Context
Explaining novel senses using definition generation with open language models
VietBinoculars: A Zero-Shot Approach for Detecting Vietnamese LLM-Generated Text
Optimizing Speech Language Models for Acoustic Consistency
One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient
Latent Thinking Optimization: Your Latent Reasoning Language Model Secretly Encodes Reward Signals in its Latent Thoughts
Fast-dLLM v2: Efficient Block-Diffusion LLM
An Annotation Scheme for Factuality and its Application to Parliamentary Proceedings
Automatic Fact-checking in English and Telugu
DyFlow: Dynamic Workflow Framework for Agentic Reasoning
The Silent Judge: Unacknowledged Shortcut Bias in LLM-as-a-Judge
Limited Preference Data? Learning Better Reward Model with Latent Space Synthesis
IMProofBench: Benchmarking AI on Research-Level Mathematical Proof Generation
Reinforced Strategy Optimization for Conversational Recommender Systems via Network-of-Experts
The Flaw of Averages: Quantifying Uniformity of Performance on Benchmarks
Mitigating Biases in Language Models via Bias Unlearning
Atomic Thinking of LLMs: Decoupling and Exploring Mathematical Reasoning Abilities
CATCH: A Novel Data Synthesis Framework for High Therapy Fidelity and Memory-Driven Planning Chain of Thought in AI Counseling
Assessing Algorithmic Bias in Language-Based Depression Detection: A Comparison of DNN and LLM Approaches
ReTAG: Retrieval-Enhanced, Topic-Augmented Graph-Based Global Sensemaking
Personalized Scientific Figure Caption Generation: An Empirical Study on Author-Specific Writing Style Transfer
Believing without Seeing: Quality Scores for Contextualizing Vision-Language Model Explanations
ReFACT: A Benchmark for Scientific Confabulation Detection with Positional Error Annotations
ASR Under Noise: Exploring Robustness for Sundanese and Javanese
Mem-$\alpha$: Learning Memory Construction via Reinforcement Learning
Understanding the Mixture-of-Experts with Nadaraya-Watson Kernel
Bringing Emerging Architectures to Sequence Labeling in NLP
Reliability Crisis of Reference-free Metrics for Grammatical Error Correction
RAGferee: Building Contextual Reward Models for Retrieval-Augmented Generation
RE$^2$: Improving Chinese Grammatical Error Correction via Retrieving Appropriate Examples with Explanation
Unspoken Hints: Accuracy Without Acknowledgement in LLM Reasoning
RE-Searcher: Robust Agentic Search with Goal-oriented Planning and Self-reflection
The Rise of AfricaNLP: Contributions, Contributors, and Community Impact (2005-2025)
Beyond WER: Probing Whisper's Sub-token Decoder Across Diverse Language Resource Levels
Performance and competence intertwined: A computational model of the Null Subject stage in English-speaking children
Don't Sweat the Small Stuff: Segment-Level Meta-Evaluation Based on Pairwise Difference Correlation
Transformers through the lens of support-preserving maps between measures
The Media Bias Detector: A Framework for Annotating and Analyzing the News at Scale
QFrBLiMP: a Quebec-French Benchmark of Linguistic Minimal Pairs
Apple: Toward General Active Perception via Reinforcement Learning
A quantitative analysis of semantic information in deep representations of text and images
GIM: Improved Interpretability for Large Language Models
Discovering and Steering Interpretable Concepts in Large Generative Music Models
tenSVD algorithm for compression
A Copula Based Supervised Filter for Feature Selection in Diabetes Risk Prediction Using Machine Learning
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction
LoLA: Low-Rank Linear Attention With Sparse Caching
Regularizing Learnable Feature Extraction for Automatic Speech Recognition
TADA: Improved Diffusion Sampling with Training-free Augmented Dynamics
Regret Analysis of Posterior Sampling-Based Expected Improvement for Bayesian Optimization
BEDTime: A Unified Benchmark for Automatically Describing Time Series
Asymptotic Classification Error for Heavy-Tailed Renewal Processes
Efficient Fairness-Performance Pareto Front Computation
Brain Tumor Classification on MRI in Light of Molecular Markers
Information Design with Unknown Prior
Unified Cross-Modal Image Synthesis with Hierarchical Mixture of Product-of-Experts
Watermark under Fire: A Robustness Evaluation of LLM Watermarking
LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving
Complexity Analysis of Normalizing Constant Estimation: from Jarzynski Equality to Annealed Importance Sampling and beyond
Surrogate models for diffusion on graphs via sparse polynomials
Scalable Fingerprinting of Large Language Models
Controllable Motion Generation via Diffusion Modal Coupling
A Review on Riemannian Metric Learning: Closer to You than You Imagine
Approximation properties of neural ODEs
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models
Neural Kinematic Bases for Fluids
AutoJudge: Judge Decoding Without Manual Annotation
Fast Likelihood-Free Parameter Estimation for Lévy Processes
Using Knowledge Graphs to harvest datasets for efficient CLIP model training
Detecting Instruction Fine-tuning Attacks on Language Models using Influence Function
Neural Multivariate Regression: Qualitative Insights from the Unconstrained Feature Model
AutoRAN: Automated Hijacking of Safety Reasoning in Large Reasoning Models
What Can RL Bring to VLA Generalization? An Empirical Study
Feature-aware Hypergraph Generation via Next-Scale Prediction
On Fitting Flow Models with Large Sinkhorn Couplings
Flatness After All?
IMPACT: Importance-Aware Activation Space Reconstruction
Towards Robust Surrogate Models: Benchmarking Machine Learning Approaches to Expediting Phase Field Simulations of Brittle Fracture
The Serial Scaling Hypothesis
FlowCast-ODE: Continuous Hourly Weather Forecasting with Dynamic Flow Matching and ODE Solver
Bengali Document Layout Analysis -- A YOLOV8 Based Ensembling Approach
Minimum Description Feature Selection for Complexity Reduction in Machine Learning-based Wireless Positioning
Fast training of accurate physics-informed neural networks without gradient descent
RobustNeuralNetworks.jl: a Package for Machine Learning and Data-Driven Control with Certified Robustness
Efficiently Escaping Saddle Points for Policy Optimization
Model Extraction Attacks Revisited
Collective Counterfactual Explanations: Balancing Individual Goals and Collective Dynamics
Complexity Reduction in Machine Learning-Based Wireless Positioning: Minimum Description Features
From Mean to Extreme: Formal Differential Privacy Bounds on the Success of Real-World Data Reconstruction Attacks
FedGCS: A Generative Framework for Efficient Client Selection in Federated Learning via Gradient-based Optimization
Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference
Amelia: A Large Dataset and Model for Airport Surface Movement Forecasting
Large-Scale Targeted Cause Discovery via Learning from Simulated Data
Towards Convexity in Anomaly Detection: A New Formulation of SSLM with Unique Optimal Solutions
Information-Geometric Barycenters for Bayesian Federated Learning
Learning Theory for Kernel Bilevel Optimization
Teaching Metric Distance to Discrete Autoregressive Language Models
InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences
Understanding Formal Reasoning Failures in LLMs as Abstract Interpreters
Beyond Next Token Probabilities: Learnable, Fast Detection of Hallucinations and Data Contamination on LLM Output Distributions
Identifying and Evaluating Inactive Heads in Pretrained LLMs
TrackCore-F: Deploying Transformer-Based Subatomic Particle Tracking on FPGAs
Are neural scaling laws leading quantum chemistry astray?
TrackFormers Part 2: Enhanced Transformer-Based Models for High-Energy Physics Track Reconstruction
An Orthogonal Learner for Individualized Outcomes in Markov Decision Processes
Stabilization of nonlinear systems with unknown delays via delay-adaptive neural operator approximate predictors
Contrastive Diffusion Guidance for Spatial Inverse Problems
Signal-Aware Workload Shifting Algorithms with Uncertainty-Quantified Predictors
Ferret-UI Lite: Lessons from Building Small On-Device GUI Agents
Towards Verified Code Reasoning by LLMs
Pretrain-Test Task Alignment Governs Generalization in In-Context Learning
Estimating Dimensionality of Neural Representations from Finite Samples
DeepProv: Behavioral Characterization and Repair of Neural Networks via Inference Provenance Graph Analysis
Source Separation for A Cappella Music
DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively
Convergence and Divergence of Language Models under Different Random Seeds
Max-Sliced Wasserstein Distance and its use for GANs
CO3: Contrasting Concepts Compose Better
Scaling Equilibrium Propagation to Deeper Neural Network Architectures
BALLAST: Bayesian Active Learning with Look-ahead Amendment for Sea-drifter Trajectories under Spatio-Temporal Vector Fields
GaussEdit: Adaptive 3D Scene Editing with Text and Image Prompts
Text-to-Scene with Large Reasoning Models
Efficient Distributed Training via Dual Batch Sizes and Cyclic Progressive Learning
EVODiff: Entropy-aware Variance Optimized Diffusion Inference
Ordinal Label-Distribution Learning with Constrained Asymmetric Priors for Imbalanced Retinal Grading
Non-Vacuous Generalization Bounds: Can Rescaling Invariances Help?
Benchmarking Diarization Models
Self-supervised learning for phase retrieval
The silence of the weights: an investigation of structural pruning strategies for attention-based audio signal architectures
Hybrid Quantum-Classical Optimisation of Traveling Salesperson Problem
Why is topology hard to learn?
PRPO: Paragraph-level Policy Optimization for Vision-Language Deepfake Detection
FLOWER: A Flow-Matching Solver for Inverse Problems
Ultra-Reliable Risk-Aggregated Sum Rate Maximization via Model-Aided Deep Learning
TAU: A Benchmark for Cultural Sound Understanding Beyond Semantics
Personalized Auto-Grading and Feedback System for Constructive Geometry Tasks Using Large Language Models on an Online Math Platform
Enhancing Split Learning with Sharded and Blockchain-Enabled SplitFed Approaches
Conservative Decisions with Risk Scores
MetaChest: Generalized few-shot learning of pathologies from chest X-rays
Coupling Generative Modeling and an Autoencoder with the Causal Bridge
RFG: Test-Time Scaling for Diffusion Large Language Model Reasoning with Reward-Free Guidance
When Langevin Monte Carlo Meets Randomization: Non-asymptotic Error Bounds beyond Log-Concavity and Gradient Lipschitzness
Generalized Contrastive Learning for Universal Multimodal Retrieval
Using Images from a Video Game to Improve the Detection of Truck Axles
Importance Sampling for Multi-Negative Multimodal Direct Preference Optimization
Transformer-Based Rate Prediction for Multi-Band Cellular Handsets
Test time training enhances in-context learning of nonlinear functions
Detecting Hope Across Languages: Multiclass Classification for Positive Online Discourse
SAC Flow: Sample-Efficient Reinforcement Learning of Flow-Based Policies via Velocity-Reparameterized Sequential Modeling
Sharpness of Minima in Deep Matrix Factorization: Exact Expressions
Logo-VGR: Visual Grounded Reasoning for Open-world Logo Recognition
RoBiologyDataChoiceQA: A Romanian Dataset for improving Biology understanding of Large Language Models
Better Privilege Separation for Agents by Restricting Data Types
AccidentBench: Benchmarking Multimodal Understanding and Reasoning in Vehicle Accidents and Beyond
SPATA: Systematic Pattern Analysis for Detailed and Transparent Data Cards
Understanding Practitioners Perspectives on Monitoring Machine Learning Systems
Cyclic Ablation: Testing Concept Localization against Functional Regeneration in AI
RANGER -- Repository-Level Agent for Graph-Enhanced Retrieval
Evaluating the Impact of Radiographic Noise on Chest X-ray Semantic Segmentation and Disease Classification Using a Scalable Noise Injection Framework
Position-Blind Ptychography: Viability of image reconstruction via data-driven variational inference
Mechanisms of Matter: Language Inferential Benchmark on Physicochemical Hypothesis in Materials Synthesis
Aspects of holographic entanglement using physics-informed-neural-networks
Bayesian Transformer for Pan-Arctic Sea Ice Concentration Mapping and Uncertainty Estimation using Sentinel-1, RCM, and AMSR2 Data
Neural Optimal Transport Meets Multivariate Conformal Prediction
SimulRAG: Simulator-based RAG for Grounding LLMs in Long-form Scientific QA
Fair Classification by Direct Intervention on Operating Characteristics
Scalable Boltzmann Generators for equilibrium sampling of large-scale materials
One-shot Conditional Sampling: MMD meets Nearest Neighbors
AGNOMIN - Architecture Agnostic Multi-Label Function Name Prediction
Defeating Cerberus: Concept-Guided Privacy-Leakage Mitigation in Multimodal Language Models
LLM-Assisted Emergency Triage Benchmark: Bridging Hospital-Rich and MCI-Like Field Simulation
Data-to-Energy Stochastic Dynamics
Refine Drugs, Don't Complete Them: Uniform-Source Discrete Flows for Fragment-Based Drug Discovery
Extensions of Robbins-Siegmund Theorem with Applications in Reinforcement Learning
fev-bench: A Realistic Benchmark for Time Series Forecasting
DiVeQ: Differentiable Vector Quantization Using the Reparameterization Trick
Equivariance by Local Canonicalization: A Matter of Representation
Entropy After $\langle \texttt{/Think} \rangle$ for reasoning model early exiting
Machine-Learning Driven Load Shedding to Mitigate Instability Attacks in Power Grids
The Loss Kernel: A Geometric Probe for Deep Learning Interpretability
TASP: Topology-aware Sequence Parallelism
Bayesian Influence Functions for Hessian-Free Data Attribution
Importance of localized dilatation and distensibility in identifying determinants of thoracic aortic aneurysm with neural operators
Linking Process to Outcome: Conditional Reward Modeling for LLM Reasoning
Clarification as Supervision: Reinforcement Learning for Vision-Language Interfaces
Uncertainty Quantification for Regression using Proper Scoring Rules
Recursive Self-Aggregation Unlocks Deep Thinking in Large Language Models
Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models
Alignment-Aware Decoding
Neighbor-aware informal settlement mapping with graph convolutional networks
PDE Solvers Should Be Local: Fast, Stable Rollouts with Learned Local Stencils
Marginal Flow: a flexible and efficient framework for density estimation
Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners
Machine Learning Detection of Lithium Plating in Lithium-ion Cells: A Gaussian Process Approach
Beyond Linear Probes: Dynamic Safety Monitoring for Language Models
From Fragile to Certified: Wasserstein Audits of Group Fairness Under Distribution Shift
Wasserstein Distributionally Robust Optimization Through the Lens of Structural Causal Models and Individual Fairness
Reframing Generative Models for Physical Systems using Stochastic Interpolants
Tuning the Tuner: Introducing Hyperparameter Optimization for Auto-Tuning
NeuroTTT: Bridging Pretraining-Downstream Task Misalignment in EEG Foundation Models via Test-Time Training
Attribution-Guided Decoding
A Review on Single-Problem Multi-Attempt Heuristic Optimization
ACE: Adapting sampling for Counterfactual Explanations
A Generalized Information Bottleneck Theory of Deep Learning
FedMuon: Federated Learning with Bias-corrected LMO-based Optimization
Memory-Driven Self-Improvement for Decision Making with Large Language Models
Improving Sampling Efficiency in RLVR through Adaptive Rollout and Response Reuse
Decentralized Asynchronous Multi-player Bandits
Kairos: Towards Adaptive and Generalizable Time Series Foundation Models
MIDAS: Misalignment-based Data Augmentation Strategy for Imbalanced Multimodal Learning
RL-Guided Data Selection for Language Model Finetuning
Federated Learning with Enhanced Privacy via Model Splitting and Random Client Participation
ReNF: Rethinking the Design Space of Neural Long-Term Time Series Forecasters
Reevaluating Convolutional Neural Networks for Spectral Analysis: A Focus on Raman Spectroscopy
Exact Solutions to the Quantum Schrödinger Bridge Problem
CAST: Continuous and Differentiable Semi-Structured Sparsity-Aware Training for Large Language Models
Informed Asymmetric Actor-Critic: Leveraging Privileged Signals Beyond Full-State Access
FITS: Towards an AI-Driven Fashion Information Tool for Sustainability
Stealthy Yet Effective: Distribution-Preserving Backdoor Attacks on Graph Classification
Scaling Up Temporal Domain Generalization via Temporal Experts Averaging
Clip-Low Increases Entropy and Clip-High Decreases Entropy in Reinforcement Learning of Large Language Models
UncertainGen: Uncertainty-Aware Representations of DNA Sequences for Metagenomic Binning
Domain-Aware Hyperdimensional Computing for Edge Smart Manufacturing
Accelerating Transformers in Online RL
Guiding Mixture-of-Experts with Temporal Multimodal Interactions
Minimalist Explanation Generation and Circuit Discovery
A Unified Probabilistic Framework for Dictionary Learning with Parsimonious Activation
Can VLM Pseudo-Labels Train a Time-Series QA Model That Outperforms the VLM?
Physics-Informed Learning for Human Whole-Body Kinematics Prediction via Sparse IMUs
Adaptive Graph Coarsening for Efficient GNN Training
Expert Merging: Model Merging with Unsupervised Expert Alignment and Importance-Guided Layer Chunking
Reweighted Flow Matching via Unbalanced OT for Label-free Long-tailed Generation
MuPlon: Multi-Path Causal Optimization for Claim Verification through Controlling Confounding
Beyond Point Estimates: Likelihood-Based Full-Posterior Wireless Localization
A Physics-Guided Probabilistic Surrogate Modeling Framework for Digital Twins of Underwater Radiated Noise
Less is More: Towards Simple Graph Contrastive Learning
Rotation Control Unlearning: Quantifying and Controlling Continuous Unlearning for LLM with The Cognitive Rotation Space
OPPO: Accelerating PPO-based RLHF via Pipeline Overlap
Online Decision Making with Generative Action Sets
A Hamiltonian driven Geometric Construction of Neural Networks on the Lognormal Statistical Manifold
From Cheap Geometry to Expensive Physics: Elevating Neural Operators via Latent Shape Pretraining
Characterization and Learning of Causal Graphs with Latent Confounders and Post-treatment Selection from Interventional Data
Scalable Disk-Based Approximate Nearest Neighbor Search with Page-Aligned Graph
Can Molecular Foundation Models Know What They Don't Know? A Simple Remedy with Preference Optimization
EEsizer: LLM-Based AI Agent for Sizing of Analog and Mixed Signal Circuit
World Model for AI Autonomous Navigation in Mechanical Thrombectomy
Flow Matching with Semidiscrete Couplings
Meta-Router: Bridging Gold-standard and Preference-based Evaluations in Large Language Model Routing
Lightweight and Robust Federated Data Valuation
Safe In-Context Reinforcement Learning
Machine Learning Algorithms for Improving Black Box Optimization Solvers
Binary Sparse Coding for Interpretability
Effective Model Pruning
Layer-wise dynamic rank for compressing large language models
Swift: An Autoregressive Consistency Model for Efficient Weather Forecasting
How Does Preconditioning Guide Feature Learning in Deep Neural Networks?
Deep set based operator learning with uncertainty quantification
Growing Winning Subnetworks, Not Pruning Them: A Paradigm for Density Discovery in Sparse Neural Networks
Nudging the Boundaries of LLM Reasoning
Norm-Q: Effective Compression Method for Hidden Markov Models in Neuro-Symbolic Applications
Conformal Prediction for Signal Temporal Logic Inference
MSCoD: An Enhanced Bayesian Updating Framework with Multi-Scale Information Bottleneck and Cooperative Attention for Structure-Based Drug Design
Integrated Forecasting of Marine Renewable Power: An Adaptively Bayesian-Optimized MVMD-LSTM Framework for Wind-Solar-Wave Energy
Simple, Fast and Efficient Injective Manifold Density Estimation with Random Projections
WDformer: A Wavelet-based Differential Transformer Model for Time Series Forecasting
Sampling via Gaussian Mixture Approximations
Fine-tuning of Large Language Models for Domain-Specific Cybersecurity Knowledge
Heterogeneous Multi-agent Collaboration in UAV-assisted Mobile Crowdsensing Networks
MAESTRO: Adaptive Sparse Attention and Robust Learning for Multimodal Dynamic Time Series
Optimisation of Resource Allocation in Heterogeneous Wireless Networks Using Deep Reinforcement Learning
Gradient Descent with Large Step Sizes: Chaos and Fractal Convergence Region
Deep Survival Analysis for Competing Risk Modeling with Functional Covariates and Missing Data Imputation
On the Shape of Latent Variables in a Denoising VAE-MoG: A Posterior Sampling-Based Study
Crowdsourcing Without People: Modelling Clustering Algorithms as Experts
Multi-Task Equation Discovery
Leveraging Vulnerabilities in Temporal Graph Neural Networks via Strategic High-Impact Assaults
Feedback Control for Small Budget Pacing
SOLD: SELFIES-based Objective-driven Latent Diffusion
VLHSA: Vision-Language Hierarchical Semantic Alignment for Jigsaw Puzzle Solving with Eroded Gaps
Polynomial Contrastive Learning for Privacy-Preserving Representation Learning on Graphs
Hyperbolic Optimization
DPSformer: A long-tail-aware model for improving heavy rainfall prediction
LEMs: A Primer On Large Execution Models
Anomaly detection by partitioning of multi-variate time series
Evaluating Double Descent in Machine Learning: Insights from Tree-Based Models Applied to a Genomic Prediction Task
On The Dynamic Ensemble Selection for TinyML-based Systems -- a Preliminary Study
Sensor optimization for urban wind estimation with cluster-based probabilistic framework
AMLA: MUL by ADD in FlashAttention Rescaling
Mind the Gap: A Review of Arabic Post-Training Datasets and Their Limitations
The Impact of Language Mixing on Bilingual LLM Reasoning
Measuring the Measures: Discriminative Capacity of Representational Similarity Metrics Across Model Families
Long-Horizon Visual Imitation Learning via Plan and Code Reflection
Multi Layered Autonomy and AI Ecologies in Robotic Art Installations
Negative-Guided Subject Fidelity Optimization for Zero-Shot Subject-Driven Generation
Static Word Embeddings for Sentence Semantic Representation
VITA: Zero-Shot Value Functions via Test-Time Adaptation of Vision-Language Models
QGuard: Question-based Zero-shot Guard for Multi-modal LLM Safety
FeDa4Fair: Client-Level Federated Datasets for Fairness Evaluation
When Does Multimodality Lead to Better Time Series Forecasting?
SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions
DNN-Based Precoding in RIS-Aided mmWave MIMO Systems With Practical Phase Shift
Deep Graph Learning for Industrial Carbon Emission Analysis and Policy Impact
LATTE: Latent Trajectory Embedding for Diffusion-Generated Image Detection
HumanVideo-MME: Benchmarking MLLMs for Human-Centric Video Understanding
On the Effectiveness of Methods and Metrics for Explainable AI in Remote Sensing Image Scene Classification
CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design
Scalable LLM Math Reasoning Acceleration with Low-rank Distillation
TensorRL-QAS: Reinforcement learning with tensor networks for improved quantum architecture search
Modeling Saliency Dataset Bias
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
scSiameseClu: A Siamese Clustering Framework for Interpreting single-cell RNA Sequencing Data
Structured Agent Distillation for Large Language Model
ELEPHANT: Measuring and understanding social sycophancy in LLMs
Silent Leaks: Implicit Knowledge Extraction Attack on RAG Systems through Benign Queries
DEBATE, TRAIN, EVOLVE: Self Evolution of Language Model Reasoning
Causal Interventions Reveal Shared Structure Across English Filler-Gap Constructions
AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models
Find the Fruit: Zero-Shot Sim2Real RL for Occlusion-Aware Plant Manipulation
LLM Agents for Interactive Exploration of Historical Cadastre Data: Framework and Application to Venice
Value-Guided Search for Efficient Chain-of-Thought Reasoning
SelfReflect: Can LLMs Communicate Their Internal Answer Distribution?
Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features
ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models
Personalized Subgraph Federated Learning with Differentiable Auxiliary Projections
Linear Attention for Efficient Bidirectional Sequence Modeling
Adaptive Conformal Guidance for Learning under Uncertainty
Recent Advances in Large Language Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation
Voting or Consensus? Decision-Making in Multi-Agent Debate
FANformer: Improving Large Language Models Through Effective Periodicity Modeling
Answer, Refuse, or Guess? Investigating Risk-Aware Decision Making in Language Models
Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM
Rethinking Diffusion Model in High Dimension
Revisiting semi-supervised learning in the era of foundation models
A Survey on SAR ship classification using Deep Learning
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
Value Profiles for Encoding Human Variation
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Enabling Rapid Shared Human-AI Mental Model Alignment via the After-Action Review
Lobster: A GPU-Accelerated Framework for Neurosymbolic Programming
Adaptive Rectification Sampling for Test-Time Compute Scaling
Fair Uncertainty Quantification for Depression Prediction
Stochastic Layer-wise Learning: Scalable and Efficient Alternative to Backpropagation
Simple yet Effective Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization
SSTP: Efficient Sample Selection for Trajectory Prediction
Composing Global Solutions to Reasoning Tasks via Algebraic Objects in Neural Nets
FAN: Fourier Analysis Networks
pEBR: A Probabilistic Approach to Embedding Based Retrieval
Unlocking Transfer Learning for Open-World Few-Shot Recognition
BianCang: A Traditional Chinese Medicine Large Language Model
Learning Semantic Association Rules from Internet of Things Data
Dynamic Pricing in High-Speed Railways Using Multi-Agent Reinforcement Learning
Dagger Behind Smile: Fool LLMs with a Happy Ending Story
A space-decoupling framework for optimization on bounded-rank matrices with orthogonally invariant constraints
CE-SDWV: Effective and Efficient Concept Erasure for Text-to-Image Diffusion Models via a Semantic-Driven Word Vocabulary
LFTR: Learning-Free Token Reduction for Multimodal Large Language Models
Should You Use Your Large Language Model to Explore or Exploit?
Dual Alignment Maximin Optimization for Offline Model-based RL
iVISPAR -- An Interactive Visual-Spatial Reasoning Benchmark for VLMs
Towards Reasoning Ability of Small Language Models
Solving the Cold Start Problem on One's Own as an End User via Preference Transfer
A physical approach to qualia and the emergence of conscious observers in qualia space
Medical Question Summarization with Entity-driven Contrastive Learning
Tree-Structured Parzen Estimator: Understanding Its Algorithm Components and Their Roles for Better Empirical Performance
Fast Exact Unlearning for In-Context Learning Data for LLMs
scCDCG: Efficient Deep Structural Clustering for single-cell RNA-seq via Deep Cut-informed Graph Embedding
Bird Eye-View to Street-View: A Survey
Pretrained Hybrids with MAD Skills
Preemptive Detection and Correction of Misaligned Actions in LLM Agents
Investigating Long-term Training for Remote Sensing Object Detection
The Quest for the Right Mediator: Surveying Mechanistic Interpretability Through the Lens of Causal Mediation Analysis
MENLO: From Preferences to Proficiency - Evaluating and Modeling Native-like Quality Across 47 Languages
Searching for Difficult-to-Translate Test Examples at Scale
Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training
Learning Generalizable Shape Completion with SIM(3) Equivariance
OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction
Stitch: Training-Free Position Control in Multimodal Diffusion Transformers
Sparse View Tomographic Reconstruction of Elongated Objects using Learned Primal-Dual Networks
Efficient Dynamic Ensembling for Multiple LLM Experts
FastCoder: Accelerating Repository-level Code Generation via Efficient Retrieval and Verification
Memorize or Generalize? Evaluating LLM Code Generation with Code Rewriting
Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment
TAMO: Fine-Grained Root Cause Analysis via Tool-Assisted LLM Agent with Multi-Modality Observation Data in Cloud-Native Systems
AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges
Survey: Multi-Armed Bandits Meet Large Language Models
R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning
VS-Bench: Evaluating VLMs for Strategic Abilities in Multi-Agent Environments
Ascent Fails to Forget
AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size
ACT: Agentic Classification Tree
Adaptive Planning for Multi-Attribute Controllable Summarization with Monte Carlo Tree Search
Attention over Scene Graphs: Indoor Scene Representations Toward CSAI Classification
On Deepfake Voice Detection - It's All in the Presentation
Regression Language Models for Code
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
Indoor/Outdoor Spectrum Sharing Enabled by GNSS-based Classifiers
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
MUSE-Explainer: Counterfactual Explanations for Symbolic Music Graph Classification Models
TAP: Two-Stage Adaptive Personalization of Multi-task and Multi-Modal Foundation Models in Federated Learning
OceanGym: A Benchmark Environment for Underwater Embodied Agents
The Unheard Alternative: Contrastive Explanations for Speech-to-Text Models
Parametric Neural Amp Modeling with Active Learning
AI-assisted Advanced Propellant Development for Electric Propulsion
Are Robust LLM Fingerprints Adversarially Robust?
Deconstructing Self-Bias in LLM-generated Translation Benchmarks
An Experimental Study on Generating Plausible Textual Explanations for Video Summarization
3DiFACE: Synthesizing and Editing Holistic 3D Facial Animation
Sandbagging in a Simple Survival Bandit Problem
Finetune Once: Decoupling General & Domain Learning with Dynamic Boosted Annealing
Point2RBox-v3: Self-Bootstrapping from Point Annotations via Integrated Pseudo-Label Refinement and Utilization
Representation-Based Data Quality Audits for Audio
Noise-Guided Transport for Imitation Learning
QUARTZ: QA-based Unsupervised Abstractive Refinement for Task-oriented Dialogue Summarization
Feedback Forensics: A Toolkit to Measure AI Personality
LLM-MCoX: Large Language Model-based Multi-robot Coordinated Exploration and Search
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing
SoK: Systematic analysis of adversarial threats against deep learning approaches for autonomous anomaly detection systems in SDN-IoT networks
TimeScope: Towards Task-Oriented Temporal Grounding In Long Videos
Vector-Valued Reproducing Kernel Banach Spaces for Neural Networks and Operators
SDA-PLANNER: State-Dependency Aware Adaptive Planner for Embodied Task Planning
Efficient and Transferable Agentic Knowledge Graph RAG via Reinforcement Learning
Game-Time: Evaluating Temporal Dynamics in Spoken Language Models
SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From
Real-time Noise Detection and Classification in Single-Channel EEG: A Lightweight Machine Learning Approach for EMG, White Noise, and EOG Artifacts
On Computing Top-$k$ Simple Shortest Paths from a Single Source
End-to-End Aspect-Guided Review Summarization at Scale
Enhancing PINN Performance Through Lie Symmetry Group
AGOCS -- Accurate Google Cloud Simulator Framework
Leveraging AI modelling for FDS with Simvue: monitor and optimise for more sustainable simulations
OWL: Geometry-Aware Spatial Reasoning for Audio Large Language Models
Bubble, Bubble, AI's Rumble: Why Global Financial Regulatory Incident Reporting is Our Shield Against Systemic Stumbles
EntroPE: Entropy-Guided Dynamic Patch Encoder for Time Series Forecasting
Towards Continual Expansion of Data Coverage: Automatic Text-guided Edge-case Synthesis
Auto-ARGUE: LLM-Based Report Generation Evaluation
AttriGen: Automated Multi-Attribute Annotation for Blood Cell Datasets
Optimizing Indoor Environmental Quality in Smart Buildings Using Deep Learning
Toward an Unbiased Collective Memory for Efficient LLM-Based Agentic 6G Cross-Domain Management
Comparative Analysis of Ant Colony Optimization and Google OR-Tools for Solving the Open Capacitated Vehicle Routing Problem in Logistics
Beyond Pixels: Efficient Dataset Distillation via Sparse Gaussian Representation
Type-Less yet Type-Aware Inductive Link Prediction with Pretrained Language Models
PerQ: Efficient Evaluation of Multilingual Text Personalization Quality
User-Centric Communication Service Provision for Edge-Assisted Mobile Augmented Reality
Accelerating LLM Inference with Precomputed Query Storage
The Impact of Scaling Training Data on Adversarial Robustness
From MNIST to ImageNet: Understanding the Scalability Boundaries of Differentiable Logic Gate Networks
AIM: Adaptive Intervention for Deep Multi-task Learning of Molecular Properties
Data-Free Continual Learning of Server Models in Model-Heterogeneous Federated learning
Reconcile Certified Robustness and Accuracy for DNN-based Smoothed Majority Vote Classifier
R-Log: Incentivizing Log Analysis Capability in LLMs via Reasoning-based Reinforcement Learning
MHINDR - a DSM5 based mental health diagnosis and recommendation framework using LLM
VRWKV-Editor: Reducing quadratic complexity in transformer-based video editing
Learning Egocentric In-Hand Object Segmentation through Weak Supervision from Human Narrations
PFDepth: Heterogeneous Pinhole-Fisheye Joint Depth Estimation via Distortion-aware Gaussian-Splatted Volumetric Fusion
Indirect Attention: Turning Context Misalignment into a Feature
Muon Outperforms Adam in Tail-End Associative Memory Learning
SeMoBridge: Semantic Modality Bridge for Efficient Few-Shot Adaptation of CLIP
CEAID: Benchmark of Multilingual Machine-Generated Text Detection Methods for Central European Languages
Editable Noise Map Inversion: Encoding Target-image into Noise For High-Fidelity Image Manipulation
Point-It-Out: Benchmarking Embodied Reasoning for Vision Language Models in Multi-Stage Visual Grounding
Better with Less: Small Proprietary Models Surpass Large Language Models in Financial Transaction Understanding
CardioForest: An Explainable Ensemble Learning Model for Automatic Wide QRS Complex Tachycardia Diagnosis from ECG
Learning to Reason as Action Abstractions with Scalable Mid-Training RL
VELA: An LLM-Hybrid-as-a-Judge Approach for Evaluating Long Image Captions
Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling
Supporting Creative Ownership through Deep Learning-Based Music Variation
Distillation of Large Language Models via Concrete Score Matching
RAE: A Neural Network Dimensionality Reduction Method for Nearest Neighbors Preservation in Vector Search
S$^2$FS: Spatially-Aware Separability-Driven Feature Selection in Fuzzy Decision Systems
Training-Free Reward-Guided Image Editing via Trajectory Optimal Control
More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
Vector sketch animation generation with differentiable motion trajectories
Efficient On-Policy Reinforcement Learning via Exploration of Sparse Parameter Space
scUnified: An AI-Ready Standardized Resource for Single-Cell RNA Sequencing Analysis
RoleConflictBench: A Benchmark of Role Conflict Scenarios for Evaluating LLMs' Contextual Sensitivity
Capacity-Net-Based RIS Precoding Design without Channel Estimation for mmWave MIMO System
Deep Reinforcement Learning-Based Precoding for Multi-RIS-Aided Multiuser Downlink Systems with Practical Phase Shift
EEG-based AI-BCI Wheelchair Advancement: Hybrid Deep Learning with Motor Imagery for Brain Computer Interface
LD-MoLE: Learnable Dynamic Routing for Mixture of LoRA Experts
Annotation-Efficient Active Test-Time Adaptation with Conformal Prediction
HNote: Extending YNote with Hexadecimal Encoding for Fine-Tuning LLMs in Music Modeling
DeepCodeSeek: Real-Time API Retrieval for Context-Aware Code Generation
The AI Productivity Index (APEX)
Towards A Universally Transferable Acceleration Method for Density Functional Theory
Boundary-to-Region Supervision for Offline Safe Reinforcement Learning
Controlled Generation for Private Synthetic Text
Think Less, Label Better: Multi-Stage Domain-Grounded Synthetic Data Generation for Fine-Tuning Large Language Models in Telecommunications
Dolphin v1.0 Technical Report
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
Free Lunch Alignment of Text-to-Image Diffusion Models without Preference Image Pairs
V-HUB: A Visual-Centric Humor Understanding Benchmark for Video LLMs
Autonomy-Aware Clustering: When Local Decisions Supersede Global Prescriptions
LLM-RG: Referential Grounding in Outdoor Scenarios using Large Language Models
MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources
Calibrating Verbalized Confidence with Self-Generated Distractors
VISOR++: Universal Visual Inputs based Steering for Large Vision Language Models
Self-Rewarding Rubric-Based Reinforcement Learning for Open-Ended Reasoning
Steering an Active Learning Workflow Towards Novel Materials Discovery via Queue Prioritization
Toxicity in Online Platforms and AI Systems: A Survey of Needs, Challenges, Mitigations, and Future Directions
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play
Aligning Multilingual Reasoning with Verifiable Semantics from a High-Resource Expert Model
Hybrid Approach for Enhancing Lesion Segmentation in Fundus Images
Probing the Limits of Stylistic Alignment in Vision-Language Models
AttentionViG: Cross-Attention-Based Dynamic Neighbor Aggregation in Vision GNNs
K-Prism: A Knowledge-Guided and Prompt Integrated Universal Medical Image Segmentation Model
Unsupervised Detection of Spatiotemporal Anomalies in PMU Data Using Transformer-Based BiGAN
Quadratic Programming Approach for Nash Equilibrium Computation in Multiplayer Imperfect-Information Games
STAC: When Innocent Tools Form Dangerous Chains to Jailbreak LLM Agents
BaB-prob: Branch and Bound with Preactivation Splitting for Probabilistic Verification of Neural Networks
YOLO-Based Defect Detection for Metal Sheets
Data-Efficient Multitask DAgger
Discontinuous Epitope Fragments as Sufficient Target Templates for Efficient Binder Design
Translation from Wearable PPG to 12-Lead ECG
EMO-TTA: Improving Test-Time Adaptation of Audio-Language Models for Speech Emotion Recognition
Not Wrong, But Untrue: LLM Overconfidence in Document-Based Queries
DeepFake Detection in Dyadic Video Calls using Point of Gaze Tracking
XR Blocks: Accelerating Human-centered AI + XR Innovation
Economic Competition, EU Regulation, and Executive Orders: A Framework for Discussing AI Policy Implications in CS Courses
FlashOmni: A Unified Sparse Attention Engine for Diffusion Transformers
From Faithfulness to Correctness: Generative Reward Models that Think Critically
Rethinking Parameter Sharing for LLM Fine-Tuning with Multiple LoRAs
Emotion-Aligned Generation in Diffusion Text to Speech Models via Preference-Guided Optimization
Polychromic Objectives for Reinforcement Learning
Beyond Noisy-TVs: Noise-Robust Exploration Via Learning Progress Monitoring
Joint Embeddings Go Temporal
Multi-patch isogeometric neural solver for partial differential equations on computer-aided design domains
PIPer: On-Device Environment Setup via Online Reinforcement Learning
Effectiveness of Large Language Models in Simulating Regional Psychological Structures: An Empirical Examination of Personality and Subjective Well-being
Artificial Authority: From Machine Minds to Political Alignments. An Experimental Analysis of Democratic and Autocratic Biases in Large-Language Models
ClustRecNet: A Novel End-to-End Deep Learning Framework for Clustering Algorithm Recommendation
A Measurement Study of Model Context Protocol
AI in Pakistani Schools: Adoption, Usage, and Perceived Impact among Educators
Learning Relationships Between Separate Audio Tracks for Creative Applications
Automatically Generating Web Applications from Requirements Via Multi-Agent Test-Driven Development
Scaling Behaviors of LLM Reinforcement Learning Post-Training: An Empirical Study in Mathematical Reasoning
Uncertainty-Aware Generative Oversampling Using an Entropy-Guided Conditional Variational Autoencoder
VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes
From Internal Representations to Text Quality: A Geometric Approach to LLM Evaluation
Generative Value Conflicts Reveal LLM Priorities
Cold-Start Active Correlation Clustering
Let Physics Guide Your Protein Flows: Topology-aware Unfolding and Generation
Predicting Training Re-evaluation Curves Enables Effective Data Curriculums for LLMs
SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs
A Deep Learning Approach for Spatio-Temporal Forecasting of InSAR Ground Deformation in Eastern Ireland
A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects
Reinforcement Learning-Guided Chain-of-Draft for Token-Efficient Code Generation
Comprehensive Analysis of VQC for Financial Fraud Detection: A Comparative Study of Quantum Encoding Techniques and Architectural Optimizations
Protocode: Prototype-Driven Interpretability for Code Generation in LLMs
BuildBench: Benchmarking LLM Agents on Compiling Real-World Open-Source Software
BEV-VLM: Trajectory Planning via Unified BEV Abstraction
Knowledge distillation through geometry-aware representational alignment
The Sandbox Configurator: A Framework to Support Technical Assessment in AI Regulatory Sandboxes
Artificial Intelligence-Powered Assessment Framework for Skill-Oriented Engineering Lab Education
How Effective Are Time-Series Models for Rainfall Nowcasting? A Comprehensive Benchmark for Rainfall Nowcasting Incorporating PWV Data
From NL2SQL to NL2GeoSQL: GeoSQL-Eval for automated evaluation of LLMs on PostGIS queries
Cognifying Education: Mapping AI's transformative role in emotional, creative, and collaborative learning
Dynamic Policy Induction for Adaptive Prompt Optimization: Bridging the Efficiency-Accuracy Gap via Lightweight Reinforcement Learning
A Weather Foundation Model for the Power Grid
InfMasking: Unleashing Synergistic Information by Contrastive Multimodal Interactions
DNABERT-2: Fine-Tuning a Genomic Language Model for Colorectal Gene Enhancer Classification
VoiceBridge: Designing Latent Bridge Models for General Speech Restoration at Scale
Devstral: Fine-tuning Language Models for Coding Agent Applications
APRIL: API Synthesis with Automatic Prompt Optimization and Reinforcement Learning
Towards Repository-Level Program Verification with Large Language Models
Generating High-Quality Datasets for Code Editing via Open-Source Language Models
Spectral Logit Sculpting: Adaptive Low-Rank Logit Transformation for Controlled Text Generation
Multi-level Diagnosis and Evaluation for Robust Tabular Feature Engineering with Large Language Models
STCast: Adaptive Boundary Alignment for Global and Regional Weather Forecasting
Six Sigma For Neural Networks: Taguchi-based optimization
On-the-Fly Adaptation to Quantization: Configuration-Aware LoRA for Efficient Fine-Tuning of Quantized LLMs
Learning to Condition: A Neural Heuristic for Scalable MPE Inference
Enhancing Linear Attention with Residual Learning
Energy Guided Geometric Flow Matching
FedCLF - Towards Efficient Participant Selection for Federated Learning in Heterogeneous IoV Networks
Machine Learning for Pattern Detection in Printhead Nozzle Logging
Quantum est in Libris: Navigating Archives with GenAI, Uncovering Tension Between Preservation and Innovation
PALADIN: Self-Correcting Language Model Agents to Cure Tool-Failure Cases
HAMMER: Hamiltonian Curiosity Augmented Large Language Model Reinforcement
A Benchmark for Localizing Code and Non-Code Issues in Software Projects
Zero-Shot Decentralized Federated Learning
Extreme Self-Preference in Language Models
STaR-Attack: A Spatio-Temporal and Narrative Reasoning Attack Framework for Unified Multimodal Understanding and Generation Models
The Average Patient Fallacy
TVS Sidekick: Challenges and Practical Insights from Deploying Large Language Models in the Enterprise
Combining Knowledge Graphs and NLP to Analyze Instant Messaging Data in Criminal Investigations
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!
SCUBA: Salesforce Computer Use Benchmark
Rearchitecting Datacenter Lifecycle for AI: A TCO-Driven Framework
HilbertA: Hilbert Attention for Image Generation with Diffusion Models
Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark
Fairness Testing in Retrieval-Augmented Generation: How Small Perturbations Reveal Bias in Small Language Models
Fine-tuning Behavioral Cloning Policies with Preference-Based Reinforcement Learning
TimeRewarder: Learning Dense Reward from Passive Videos via Frame-wise Temporal Distance
Branching Out: Broadening AI Measurement and Evaluation with Measurement Trees
Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking
FMIP: Joint Continuous-Integer Flow For Mixed-Integer Linear Programming
AdaptCache: KV Cache Native Storage Hierarchy for Low-Delay and High-Quality Language Model Serving
Beyond the Algorithm: A Field Guide to Deploying AI Agents in Clinical Practice
90% Faster, 100% Code-Free: MLLM-Driven Zero-Code 3D Game Development
'Too much alignment; not enough culture': Re-balancing cultural alignment practices in LLMs
LLM Agents for Knowledge Discovery in Atomic Layer Processing
Human-Centered Evaluation of RAG outputs: a framework and questionnaire for human-AI collaboration
Diversity-Incentivized Exploration for Versatile Reasoning
Benchmarking Deep Learning Convolutions on Energy-constrained CPUs
SlimPack: Fine-Grained Asymmetric Packing for Balanced and Efficient Variable-Length LLM Training
ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning
Interactive Learning for LLM Reasoning
AI Playing Business Games: Benchmarking Large Language Models on Managerial Decision-Making in Dynamic Simulations
SafeBehavior: Simulating Human-Like Multistage Reasoning to Mitigate Jailbreak Attacks in Large Language Models
How Far Do Time Series Foundation Models Paint the Landscape of Real-World Benchmarks?
Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents
MC-GNNAS-Dock: Multi-criteria GNN-based Algorithm Selection for Molecular Docking
Communication-Efficient and Accurate Approach for Aggregation in Federated Low-Rank Adaptation
OntoAligner Meets Knowledge Graph Embedding Aligners
Transformer Classification of Breast Lesions: The BreastDCEDL_AMBL Benchmark Dataset and 0.92 AUC Baseline
CIMNAS: A Joint Framework for Compute-In-Memory-Aware Neural Architecture Search
Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs
SafeMind: Benchmarking and Mitigating Safety Risks in Embodied LLM Agents
DeepJSONEval: Benchmarking Complex Nested JSON Data Mining for Large Language Models
KIRETT: Smart Integration of Vital Signs Data for Intelligent Decision Support in Rescue Scenarios
Quantitative Evaluation of KIRETT Wearable Demonstrator for Rescue Operations
Boosting Process-Correct CoT Reasoning by Modeling Solvability of Multiple-Choice QA
NuRisk: A Visual Question Answering Dataset for Agent-Level Risk Assessment in Autonomous Driving
Automated Model Discovery via Multi-modal & Multi-step Pipeline
RoRecomp: Enhancing Reasoning Efficiency via Rollout Response Recomposition in Reinforcement Learning
Scalable and Robust LLM Unlearning by Correcting Responses with Retrieved Exclusions
Towards Unified Multimodal Misinformation Detection in Social Media: A Benchmark Dataset and Baseline
Towards Human Engagement with Realistic AI Combat Pilots
CoLLM-NAS: Collaborative Large Language Models for Efficient Knowledge-Guided Neural Architecture Search
Evaluating the Use of Large Language Models as Synthetic Social Agents in Social Science Research
SafeEvalAgent: Toward Agentic and Self-Evolving Safety Evaluation of LLMs
MEDAKA: Construction of Biomedical Knowledge Graphs Using Large Language Models
LMILAtt: A Deep Learning Model for Depression Detection from Social Media Users Enhanced by Multi-Instance Learning Based on Attention Mechanism
On Explaining Proxy Discrimination and Unfairness in Individual Decisions Made by AI Systems
GroundSight: Augmenting Vision-Language Models with Grounding Information and De-hallucination
SING-SQL: A Synthetic Data Generation Framework for In-Domain Text-to-SQL Translation
Collaborative Compression for Large-Scale MoE Deployment on Edge
ScheduleMe: Multi-Agent Calendar Assistant
Cooperative Autonomous Driving in Diverse Behavioral Traffic: A Heterogeneous Graph Reinforcement Learning Approach
NePTune: A Neuro-Pythonic Framework for Tunable Compositional Reasoning on Vision-Language
Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training
Galton's Law of Mediocrity: Why Large Language Models Regress to the Mean and Fail at Creativity in Advertising
Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs
Deontic Argumentation
PUREVQ-GAN: Defending Data Poisoning Attacks through Vector-Quantized Bottlenecks
Chain-in-Tree: Back to Sequential Reasoning in LLM Tree Search
HiStyle: Hierarchical Style Embedding Predictor for Text-Prompt-Guided Controllable Speech Synthesis
ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack
Aging Decline in Basketball Career Trend Prediction Based on Machine Learning and LSTM Model
RadOnc-GPT: An Autonomous LLM Agent for Real-Time Patient Outcomes Labeling at Scale
Learning to Interact in World Latent for Team Coordination
Evaluating Foundation Models with Pathological Concept Learning for Kidney Cancer
A(I)nimism: Re-enchanting the World Through AI-Mediated Object Interaction
Radiology's Last Exam (RadLE): Benchmarking Frontier Multimodal AI Against Human Experts and a Taxonomy of Visual Reasoning Errors in Radiology
IRIS: Intrinsic Reward Image Synthesis
Skip-It? Theoretical Conditions for Layer Skipping in Vision-Language Models
ATLAS: Constraints-Aware Multi-Agent Collaboration for Real-World Travel Planning
Building the EHR Foundation Model via Next Event Prediction
Causal Autoencoder-like Generation of Feedback Fuzzy Cognitive Maps with an LLM Agent
Hybrid Reward Normalization for Process-supervised Non-verifiable Agentic Tasks
Echoes of Humanity: Exploring the Perceived Humanness of AI Music
A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments
SMS: Self-supervised Model Seeding for Verification of Machine Unlearning
SOCK: A Benchmark for Measuring Self-Replication in Large Language Models
AutoLabs: Cognitive Multi-Agent Systems with Self-Correction for Autonomous Chemical Experimentation
Iterative Residual Cross-Attention Mechanism: An Integrated Approach for Audio-Visual Navigation Tasks
Landmark-Guided Knowledge for Vision-and-Language Navigation
Dive into the Agent Matrix: A Realistic Evaluation of Self-Replication Risk in LLM Agents
Spontaneous High-Order Generalization in Neural Theory-of-Mind Networks
SynthPert: Enhancing LLM Biological Reasoning via Synthetic Reasoning Traces for Cellular Perturbation Prediction
Structural Reward Model: Enhancing Interpretability, Efficiency, and Scalability in Reward Modeling
Where LLM Agents Fail and How They can Learn From Failures
From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models
Saliency Guided Longitudinal Medical Visual Question Answering
Boolean Satisfiability via Imitation Learning
Adaptive Test-Time Reasoning via Reward-Guided Dual-Phase Search
RADAR: Reasoning-Ability and Difficulty-Aware Routing for Reasoning LLMs
The Open Syndrome Definition
GESA: Graph-Enhanced Semantic Allocation for Generalized, Fair, and Explainable Candidate-Role Matching
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
Plug-and-Play Emotion Graphs for Compositional Prompting in Zero-Shot Speech Emotion Recognition
TDHook: A Lightweight Framework for Interpretability
Message passing-based inference in an autoregressive active inference agent
Understanding Generative Recommendation with Semantic IDs from a Model-scaling View
Beyond Static Retrieval: Opportunities and Pitfalls of Iterative Retrieval in GraphRAG
Blueprint-Bench: Comparing spatial intelligence of LLMs, agents and image models
The Causal Abstraction Network: Theory and Learning
A Formal Comparison Between Chain-of-Thought and Latent Thought
Neo-Grounded Theory: A Methodological Innovation Integrating High-Dimensional Vector Clustering and Multi-Agent Collaboration for Qualitative Research
Memory Management and Contextual Consistency for Long-Running Low-Code Agents
Fact Grounded Attention: Eliminating Hallucination in Large Language Models Through Attention Level Knowledge Integration
Language Model Planning from an Information Theoretic Perspective
RADAR: A Risk-Aware Dynamic Multi-Agent Framework for LLM Safety Evaluation via Role-Specialized Collaboration
RL in the Wild: Characterizing RLVR Training in LLM Deployment
Toward Causal-Visual Programming: Enhancing Agentic Reasoning in Low-Code Environments
ID-RAG: Identity Retrieval-Augmented Generation for Long-Horizon Persona Coherence in Generative Agents
Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution

Research Sources: 849 | Generated: 10/1/2025