AI Research News Feeds for September 24th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

3D-ADAM: A Dataset for 3D Anomaly Detection in Additive Manufacturing
DWTGS: Rethinking Frequency Regularization for Sparse-view 3D Gaussian Splatting
Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable
AvatarShield: Visual Reinforcement Learning for Human-Centric Synthetic Video Detection
Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels
Image Segmentation and Classification of E-waste for Training Robots for Waste Segregation
Gaussian Herding across Pens: An Optimal Transport Perspective on Global Gaussian Reduction for 3DGS
WaveFormer: A Lightweight Transformer Model for sEMG-based Gesture Recognition
Uncertainty-Aware Information Pursuit for Interpretable and Reliable Medical Image Analysis
Exploring Image Generation via Mutually Exclusive Probability Spaces and Local Correlation Hypothesis
JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework
HDM: Hybrid Diffusion Model for Unified Image Anomaly Detection
Leveraging Large Models to Evaluate Novel Content: A Case Study on Advertisement Creativity
Latent Beam Diffusion Models for Generating Visual Sequences
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images
SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
Split Matching for Inductive Zero-shot Semantic Segmentation
InstanceBEV: Unifying Instance and BEV Representation for 3D Panoptic Segmentation
REACT: Real-time Efficiency and Accuracy Compromise for Tradeoffs in Scene Graph Generation
Deep Spherical Superpixels
Your Turn: At Home Turning Angle Estimation for Parkinson's Disease Severity Assessment
Variational Bayes Gaussian Splatting
CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation
Superpixel Segmentation: A Long-Lasting Ill-Posed Problem
SparseDiT: Token Sparsification for Efficient Diffusion Transformer
Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation
Injecting Explainability and Lightweight Design into Weakly Supervised Video Anomaly Detection Systems
EventVL: Understand Event Streams via Multimodal Large Language Model
Without Paired Labeled Data: End-to-End Self-Supervised Learning for Drone-view Geo-Localization
COLT: Enhancing Video Large Language Models with Continual Tool Usage
Evaluation Framework of Superpixel Methods with a Global Regularity Measure
ZoDIAC: Zoneout Dropout Injection Attention Calculation
Fix your downsampling ASAP! Be natively more robust via Aliasing and Spectral Artifact free Pooling
Individualized Mapping of Aberrant Cortical Thickness via Stochastic Cortical Self-Reconstruction
Quantum Annealing for Minimum Bisection Problem: A Machine Learning-based Approach for Penalty Parameter Tuning
A Fast Initialization Method for Neural Network Controllers: A Case Study of Image-based Visual Servoing Control for the multicopter Interception
LLM-based Vulnerability Discovery through the Lens of Code Metrics
Circuit Complexity From Physical Constraints: Scaling Limitations of Attention
VoxGuard: Evaluating User and Attribute Privacy in Speech via Membership Inference Attacks
Large-Scale, Longitudinal Study of Large Language Models During the 2024 US Election Season
Robotic Skill Diversification via Active Mutation of Reward Functions in Reinforcement Learning During a Liquid Pouring Task
BRAID: Input-Driven Nonlinear Dynamical Modeling of Neural-Behavioral Data
Scalable Bayesian shadow tomography for quantum property estimation with set transformers
Query-Centric Diffusion Policy for Generalizable Robotic Assembly
Confidential LLM Inference: Performance and Cost Across CPU and GPU TEEs
Integrating Stacked Intelligent Metasurfaces and Power Control for Dynamic Edge Inference via Over-The-Air Neural Networks
Accurate and Efficient Prediction of Wi-Fi Link Quality Based on Machine Learning
A Cost-Benefit Analysis of On-Premise Large Language Model Deployment: Breaking Even with Commercial LLM Services
Energy-convergence trade off for the training of neural networks on bio-inspired hardware
SPADE: A Large Language Model Framework for Soil Moisture Pattern Recognition and Anomaly Detection in Precision Agriculture
Learning Progression-Guided AI Evaluation of Scientific Models To Support Diverse Multi-Modal Understanding in NGSS Classroom
Synthesizing Attitudes, Predicting Actions (SAPA): Behavioral Theory-Guided LLMs for Ridesourcing Mode Choice Modeling
Multimodal Health Risk Prediction System for Chronic Diseases via Vision-Language Fusion and Large Language Models
Towards General Computer Control with Hierarchical Agents and Multi-Level Action Spaces
Gödel Test: Can Large Language Models Solve Easy Conjectures?
A Validation Strategy for Deep Learning Models: Evaluating and Enhancing Robustness
PPG-Distill: Efficient Photoplethysmography Signals Analysis via Foundation Model Distillation
Study Design and Demystification of Physics Informed Neural Networks for Power Flow Simulation
Stability and Generalization of Adversarial Diffusion Training
The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review
EarthquakeNPP: A Benchmark for Earthquake Forecasting with Neural Point Processes
Bayesian Multivariate Density-Density Regression
Single-stream Policy Optimization
Hierarchical Semi-Markov Models with Duration-Aware Dynamics for Activity Sequences
Enhanced Survival Trees
Clapping: Removing Per-sample Storage for Pipeline Parallel Distributed Optimization with Communication Compression
Unveiling the Role of Learning Rate Schedules via Functional Scaling Laws
Linear Regression under Missing or Corrupted Coordinates
A Neural Difference-of-Entropies Estimator for Mutual Information
Propagation of Chaos in One-hidden-layer Neural Networks beyond Logarithmic Time
Demystifying Spectral Feature Learning for Instrumental Variable Regression
Temporal Conformal Prediction (TCP): A Distribution-Free Statistical and Machine Learning Framework for Adaptive Risk Forecasting
Packed-Ensembles for Efficient Uncertainty Estimation
Sum-of-norms regularized Nonnegative Matrix Factorization
Surrogate Modelling of Proton Dose with Monte Carlo Dropout Uncertainty Quantification
Statistical Insight into Meta-Learning via Predictor Subspace Characterization and Quantification of Task Diversity
End-Cut Preference in Survival Trees
Estimating Heterogeneous Causal Effect on Networks via Orthogonal Learning
Consistency of Selection Strategies for Fraud Detection
Neighbor Embeddings Using Unbalanced Optimal Transport Metrics
Recovering Wasserstein Distance Matrices from Few Measurements
A Gradient Flow Approach to Solving Inverse Problems with Latent Diffusion Models
Augmenting Limited and Biased RCTs through Pseudo-Sample Matching-Based Observational Data Fusion Method
Tensor Train Completion from Fiberwise Observations Along a Single Mode
Forest tree species classification and entropy-derived uncertainty mapping using extreme gradient boosting and Sentinel-1/2 data
Reconstruction of Optical Coherence Tomography Images from Wavelength-space Using Deep-learning
Human-Interpretable Uncertainty Explanations for Point Cloud Registration
DexSkin: High-Coverage Conformable Robotic Skin for Learning Contact-Rich Manipulation
Text Slider: Efficient and Plug-and-Play Continuous Concept Control for Image/Video Synthesis via LoRA Adapters
Quantum Random Synthetic Skyrmion Texture Generation, a Qiskit Simulation
One-shot Embroidery Customization via Contrastive LoRA Modulation
Towards Robust LiDAR Localization: Deep Learning-based Uncertainty Estimation
Category-Level Object Shape and Pose Estimation in Less Than a Millisecond
FUNCanon: Learning Pose-Aware Action Primitives via Functional Object Canonicalization for Generalizable Robotic Manipulation
MOIS-SAM2: Exemplar-based Segment Anything Model 2 for multilesion interactive segmentation of neurofibromas in whole-body MRI
CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching
Semantic-Aware Particle Filter for Reliable Vineyard Robot Localisation
Neural Network-Driven Direct CBCT-Based Dose Calculation for Head-and-Neck Proton Treatment Planning
Does Embodiment Matter to Biomechanics and Function? A Comparative Analysis of Head-Mounted and Hand-Held Assistive Devices for Individuals with Blindness and Low Vision
Latent Action Pretraining Through World Modeling
Zero-Shot Visual Deepfake Detection: Can AI Predict and Prevent Fake Content Before It's Created?
Machine learning approach to single-shot multiparameter estimation for the non-linear Schrödinger equation
Differentiable Light Transport with Gaussian Surfels via Adapted Radiosity for Efficient Relighting and Geometry Reconstruction
Dynamical Modeling of Behaviorally Relevant Spatiotemporal Patterns in Neural Imaging Data
Efficient Breast and Ovarian Cancer Classification via ViT-Based Preprocessing and Transfer Learning
VLN-Zero: Rapid Exploration and Cache-Enabled Neurosymbolic Vision-Language Planning for Zero-Shot Transfer in Robot Navigation
Enabling Plant Phenotyping in Weedy Environments using Multi-Modal Imagery via Synthetic and Generated Training Data
HyKid: An Open MRI Dataset with Expert-Annotated Multi-Structure and Choroid Plexus in Pediatric Hydrocephalus
MsFIN: Multi-scale Feature Interaction Network for Traffic Accident Anticipation
DevFD: Developmental Face Forgery Detection by Learning Shared and Orthogonal LoRA Subspaces
Lavida-O: Elastic Masked Diffusion Models for Unified Multimodal Understanding and Generation
ConViS-Bench: Estimating Video Similarity Through Semantic Concepts
Adversarially-Refined VQ-GAN with Dense Motion Tokenization for Spatio-Temporal Heatmaps
Graph-Radiomic Learning (GrRAiL) Descriptor to Characterize Imaging Heterogeneity in Confounding Tumor Pathologies
Moving by Looking: Towards Vision-Driven Avatar Motion Generation
OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction
Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications
Investigating Traffic Accident Detection Using Multimodal Large Language Models
Track-On2: Enhancing Online Point Tracking with Memory
KAMERA: Enhancing Aerial Surveys of Ice-associated Seals in Arctic Environments
NeuCODEX: Edge-Cloud Co-Inference with Spike-Driven Compression and Dynamic Early-Exit
RoSe: Robust Self-supervised Stereo Matching under Adverse Weather Conditions
YOLO-LAN: Precise Polyp Detection via Optimized Loss, Augmentations and Negatives
The 1st Solution for MOSEv2 Challenge 2025: Long-term and Concept-aware Video Segmentation via SeC
Reading Images Like Texts: Sequential Image Understanding in Vision-Language Models
Vision-Free Retrieval: Rethinking Multimodal Search with Textual Scene Descriptions
Long Story Short: Disentangling Compositionality and Long-Caption Understanding in VLMs
Audio-Driven Universal Gaussian Head Avatars
SynapFlow: A Modular Framework Towards Large-Scale Analysis of Dendritic Spines
No Labels Needed: Zero-Shot Image Classification with Collaborative Self-Learning
Seeing Through Reflections: Advancing 3D Scene Reconstruction in Mirror-Containing Environments with Gaussian Splatting
Generative data augmentation for biliary tract detection on intraoperative images
Prompt-DAS: Annotation-Efficient Prompt Learning for Domain Adaptive Semantic Segmentation of Electron Microscopy Images
Unveiling Chain of Step Reasoning for Vision-Language Models with Fine-grained Rewards
Weakly Supervised Food Image Segmentation using Vision Transformers and Segment Anything Model
A DyL-Unet framework based on dynamic learning for Temporally Consistent Echocardiographic Segmentation
WaveletGaussian: Wavelet-domain Diffusion for Sparse-view 3D Gaussian Object Reconstruction
3rd Place Report of LSVOS 2025 MeViS Track: Sa2VA-i: Improving Sa2VA Results with Consistent Training and Inference
Benchmarking Vision-Language and Multimodal Large Language Models in Zero-shot and Few-shot Scenarios: A study on Christian Iconography
ViG-LRGC: Vision Graph Neural Networks with Learnable Reparameterized Graph Construction
Attack for Defense: Adversarial Agents for Point Prompt Optimization Empowering Segment Anything Model
SmartWilds: Multimodal Wildlife Monitoring Dataset
RS3DBench: A Comprehensive Benchmark for 3D Spatial Perception in Remote Sensing
DeblurSplat: SfM-free 3D Gaussian Splatting with Event Camera for Robust Deblurring
MoiréNet: A Compact Dual-Domain Network for Image Demoiréing
Frequency-Domain Decomposition and Recomposition for Robust Audio-Visual Segmentation
xAI-CV: An Overview of Explainable Artificial Intelligence in Computer Vision
LiDAR Point Cloud Image-based Generation Using Denoising Diffusion Probabilistic Models
Advancing Metallic Surface Defect Detection via Anomaly-Guided Pretraining on a Large Industrial Dataset
Knowledge Transfer from Interaction Learning
HyPSAM: Hybrid Prompt-driven Segment Anything Model for RGB-Thermal Salient Object Detection
TriFusion-AE: Language-Guided Depth and LiDAR Fusion for Robust Point Cloud Processing
FixingGS: Enhancing 3D Gaussian Splatting via Training-Free Score Distillation
Bi-VLM: Pushing Ultra-Low Precision Post-Training Quantization Boundaries in Vision-Language Models
DiSSECT: Structuring Transfer-Ready Medical Image Representations through Discrete Self-Supervision
Real-time Deer Detection and Warning in Connected Vehicles via Thermal Sensing and Deep Learning
Towards Application Aligned Synthetic Surgical Image Synthesis
A Kernel Space-based Multidimensional Sparse Model for Dynamic PET Image Denoising
Surgical Video Understanding with Label Interpolation
Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation
Understanding-in-Generation: Reinforcing Generative Capability of Unified Model via Infusing Understanding into Generation
Zero-shot Monocular Metric Depth for Endoscopic Images
LEAF-Mamba: Local Emphatic and Adaptive Fusion State Space Model for RGB-D Salient Object Detection
Lightweight Vision Transformer with Window and Spatial Attention for Food Image Classification
OSDA: A Framework for Open-Set Discovery and Automatic Interpretation of Land-cover in Remote Sensing Imagery
Overview of PlantCLEF 2021: cross-domain plant identification
AGSwap: Overcoming Category Boundaries in Object Fusion via Adaptive Group Swapping
Overview of LifeCLEF Plant Identification task 2019: diving into data deficient tropical countries
RSVG-ZeroOV: Exploring a Training-Free Framework for Zero-Shot Open-Vocabulary Visual Grounding in Remote Sensing Images
What Makes You Unique? Attribute Prompt Composition for Object Re-Identification
Pre-training CLIP against Data Poisoning with Optimal Transport-based Matching and Alignment
GeoRemover: Removing Objects and Their Causal Visual Artifacts
SEGA: A Transferable Signed Ensemble Gaussian Black-Box Attack against No-Reference Image Quality Assessment Models
HadaSmileNet: Hadamard fusion of handcrafted and deep-learning features for enhancing facial emotion recognition of genuine smiles
Event-guided 3D Gaussian Splatting for Dynamic Human and Scene Reconstruction
Live-E2T: Real-time Threat Monitoring in Video via Deduplicated Event Reasoning and Chain-of-Thought
The Photographer Eye: Teaching Multimodal Large Language Models to See and Critique like Photographers
Enhancing Video Object Segmentation in TrackRAD Using XMem Memory Network
SSCM: A Spatial-Semantic Consistent Model for Multi-Contrast MRI Super-Resolution
Training-Free Multi-Style Fusion Through Reference-Based Adaptive Modulation
MLF-4DRCNet: Multi-Level Fusion with 4D Radar and Camera for 3D Object Detection in Autonomous Driving
Prompt-Guided Dual Latent Steering for Inversion Problems
Learning neuroimaging models from health system-scale data
Improving the color accuracy of lighting estimation models
Check Field Detection Agent (CFD-Agent) using Multimodal Large Language and Vision Language Models
Losing the Plot: How VLM responses degrade on imperfect charts
CPT-4DMR: Continuous sPatial-Temporal Representation for 4D-MRI Reconstruction
An Analysis of Kalman Filter based Object Tracking Methods for Fast-Moving Tiny Objects
MoCrop: Training Free Motion Guided Cropping for Efficient Video Action Recognition
Codebook-Based Adaptive Feature Compression With Semantic Enhancement for Edge-Cloud Systems
MK-UNet: Multi-kernel Lightweight CNN for Medical Image Segmentation
BridgeSplat: Bidirectionally Coupled CT and Non-Rigid Gaussian Splatting for Deformable Intraoperative Surgical Navigation
Source-Free Domain Adaptive Semantic Segmentation of Remote Sensing Images with Diffusion-Guided Label Enrichment
Hyperbolic Coarse-to-Fine Few-Shot Class-Incremental Learning
HazeFlow: Revisit Haze Physical Model as ODE and Non-Homogeneous Haze Generation for Real-World Dehazing
TinyEcoWeedNet: Edge Efficient Real-Time Aerial Agricultural Weed Detection
Learning Contrastive Multimodal Fusion with Improved Modality Dropout for Disease Detection and Prediction
Rethinking Pulmonary Embolism Segmentation: A Study of Current Approaches and Challenges with an Open Weight Model
Improving Handshape Representations for Sign Language Processing: A Graph Neural Network Approach
Influence of Classification Task and Distribution Shift Type on OOD Detection in Fetal Ultrasound
OrthoLoC: UAV 6-DoF Localization and Calibration Using Orthographic Geodata
A Single Image Is All You Need: Zero-Shot Anomaly Localization Without Training Data
Align Where the Words Look: Cross-Attention-Guided Patch Alignment with Contrastive and Transport Regularization for Bengali Captioning
TinyBEV: Cross Modal Knowledge Distillation for Efficient Multi Task Bird's Eye View Perception and Planning
BlurBall: Joint Ball and Motion Blur Estimation for Table Tennis Ball Tracking
MVP: Motion Vector Propagation for Zero-Shot Video Object Detection
Self Identity Mapping
MAGIA: Sensing Per-Image Signals from Single-Round Averaged Gradients for Label-Inference-Free Gradient Inversion
A Deep Learning Approach for Spatio-Temporal Forecasting of InSAR Ground Deformation in Eastern Ireland
A Framework for Generating Artificial Datasets to Validate Absolute and Relative Position Concepts
The Describe-Then-Generate Bottleneck: How VLM Descriptions Alter Image Generation Outcomes
AI-Derived Structural Building Intelligence for Urban Resilience: An Application in Saint Vincent and the Grenadines
VLA-LPAF: Lightweight Perspective-Adaptive Fusion for Vision-Language-Action to Enable More Unconstrained Robotic Manipulation
URNet: Uncertainty-aware Refinement Network for Event-based Stereo Depth Estimation
Visionerves: Automatic and Reproducible Hybrid AI for Peripheral Nervous System Recognition Applied to Endometriosis Cases
V-SenseDrive: A Privacy-Preserving Road Video and In-Vehicle Sensor Fusion Framework for Road Safety & Driver Behaviour Modelling
Qianfan-VL: Domain-Enhanced Universal Vision-Language Models
Meta-Semantics Augmented Few-Shot Relational Learning
Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning
Athena: Enhancing Multimodal Reasoning with Data-efficient Process Reward Models
When and How Long Did Therapy Happen? Soft-Supervising Temporal Localization Using Audio-Language Models
RAG+: Enhancing Retrieval-Augmented Generation with Application-Aware Reasoning
LogicGuard: Improving Embodied LLM agents through Temporal Logic based Critics
PolypSeg-GradCAM: Towards Explainable Computer-Aided Gastrointestinal Disease Detection Using U-Net Based Segmentation and Grad-CAM Visualization on the Kvasir Dataset
PerceptronCARE: A Deep Learning-Based Intelligent Teleophthalmology Application for Diabetic Retinopathy Diagnosis
Improving Image Captioning Descriptiveness by Ranking and LLM-based Fusion
Is Pre-training Truly Better Than Meta-Learning?
MediSyn: A Generalist Text-Guided Latent Diffusion Model For Diverse Medical Image Synthesis
DOTA: Distributional Test-Time Adaptation of Vision-Language Models
EMMA: End-to-End Multimodal Model for Autonomous Driving
Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability
Fine-Tuning is Subgraph Search: A New Lens on Learning Dynamics
DeepResonance: Enhancing Multimodal Music Understanding via Music-centric Multi-way Instruction Tuning
Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer
A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
Automating Steering for Safe Multimodal Large Language Models
Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models
JOLT-SQL: Joint Loss Tuning of Text-to-SQL with Confusion-aware Noisy Schema Sampling
Memorization or Reasoning? Exploring the Idiom Understanding of LLMs
Large Language Models Implicitly Learn to See and Hear Just By Reading
Large Language Models Do Multi-Label Classification Differently
Unraveling Misinformation Propagation in LLM Reasoning
LiTEx: A Linguistic Taxonomy of Explanations for Understanding Within-Label Variation in Natural Language Inference
Please Translate Again: Two Simple Experiments on Whether Human-Like Reasoning Helps Translation
Combining Constrained and Unconstrained Decoding via Boosting: BoostCD and Its Application to Information Extraction
A suite of allotaxonometric tools for the comparison of complex systems using rank-turbulence divergence
Think in Safety: Unveiling and Mitigating Safety Alignment Collapse in Multimodal Large Reasoning Model
Language Models Can Predict Their Own Behavior
Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests
LightThinker: Thinking Step-by-Step Compression
Can LLMs Explain Themselves Counterfactually?
Anything Goes? A Crosslinguistic Study of (Im)possible Language Learning in LMs
Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners
LookAhead Tuning: Safer Language Models via Partial Answer Previews
Pandora: A Code-Driven Large Language Model Agent for Unified Reasoning Across Diverse Structured Knowledge
Citrus-V: Advancing Medical Foundation Models with Unified Medical Image Grounding for Clinical Reasoning
Finding My Voice: Generative Reconstruction of Disordered Speech for Automated Clinical Evaluation
Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture
Enhancing Unsupervised Sentence Embeddings via Knowledge-Driven Data Augmentation and Gaussian-Decayed Contrastive Learning
Post-hoc Study of Climate Microtargeting on Social Media Ads with LLMs: Thematic Insights and Fairness Evaluation
Exploring Model Kinship for Merging Large Language Models
Language Models as Causal Effect Generators
Compositional Phoneme Approximation for L1-Grounded L2 Pronunciation Training
Improving Low-Resource Sequence Labeling with Knowledge Fusion and Contextual Label Explanations
VLDBench Evaluating Multimodal Disinformation with Regulatory Alignment
The Illusion of Readiness: Stress Testing Large Frontier Models on Multimodal Medical Benchmarks
Memory-QA: Answering Recall Questions Based on Multimodal Memories
No Verifiable Reward for Prosody: Toward Preference-Guided Prosody Learning in TTS
HarmoniFuse: A Component-Selective and Prompt-Adaptive Framework for Multi-Task Speech Language Modeling
Teaching Audio Models to Reason: A Unified Framework for Source- and Layer-wise Distillation
OraPO: Oracle-educated Reinforcement Learning for Data-efficient and Factual Radiology Report Generation
Agentic AutoSurvey: Let LLMs Survey LLMs
Pay More Attention To Audio: Mitigating Imbalance of Cross-Modal Attention in Large Audio Language Models
Failure Makes the Agent Stronger: Enhancing Accuracy through Structured Reflection for Reliable Tool Interactions
VIR-Bench: Evaluating Geospatial and Temporal Understanding of MLLMs via Travel Video Itinerary Reconstruction
ColorBlindnessEval: Can Vision-Language Models Pass Color Blindness Tests?
Measuring AI "Slop" in Text
Soft Tokens, Hard Truths
Online Process Reward Learning for Agentic Reinforcement Learning
Steering Multimodal Large Language Models Decoding for Context-Aware Safety
Systematic Comparative Analysis of Large Pretrained Language Models on Contextualized Medication Event Extraction
CompLLM: Compression for Long Context Q&A
Reinforcement Learning on Pre-Training Data
Extracting Conceptual Spaces from LLMs Using Prototype Embeddings
SloPalSpeech: A 2,800-Hour Slovak Speech Corpus from Parliamentary Data
WolBanking77: Wolof Banking Speech Intent Classification Dataset
DRISHTIKON: A Multimodal Multilingual Benchmark for Testing Language Models' Understanding on Indian Culture
Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR
Multi-Hierarchical Feature Detection for Large Language Model Generated Text
Diversity Boosts AI-Generated Text Detection
Extractive Fact Decomposition for Interpretable Natural Language Inference in one Forward Pass
DTW-Align: Bridging the Modality Gap in End-to-End Speech Translation with Dynamic Time Warping Alignment
Investigating Test-Time Scaling with Reranking for Machine Translation
Charting a Decade of Computational Linguistics in Italy: The CLiC-it Corpus
Pathways of Thoughts: Multi-Directional Thinking for Long-form Personalized Question Answering
Are most sentences unique? An empirical examination of Chomskyan claims
Human-Annotated NER Dataset for the Kyrgyz Language
Context-Aware Hierarchical Taxonomy Generation for Scientific Papers via LLM-Guided Multi-Aspect Clustering
Anecdoctoring: Automated Red-Teaming Across Language and Place
Consistency-Aware Parameter-Preserving Knowledge Editing Framework for Multi-Hop Question Answering
Analyzing Uncertainty of LLM-as-a-Judge: Interval Evaluations with Conformal Prediction
MemOrb: A Plug-and-Play Verbal-Reinforcement Memory Layer for E-Commerce Customer Service
LOTUSDIS: A Thai far-field meeting corpus for robust conversational ASR
Global-Recent Semantic Reasoning on Dynamic Text-Attributed Graphs with Large Language Models
False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models
When Long Helps Short: How Context Length in Supervised Fine-tuning Affects Behavior of Large Language Models
Financial Risk Relation Identification through Dual-view Adaptation
AECBench: A Hierarchical Benchmark for Knowledge Evaluation of Large Language Models in the AEC Field
Beyond the Leaderboard: Understanding Performance Disparities in Large Language Models via Model Diffing
MAPEX: A Multi-Agent Pipeline for Keyphrase Extraction
Are Smaller Open-Weight LLMs Closing the Gap to Proprietary Models for Biomedical Question Answering?
Developing an AI framework to automatically detect shared decision-making in patient-doctor conversations
CogniLoad: A Synthetic Natural Language Reasoning Benchmark With Tunable Length, Intrinsic Difficulty, and Distractor Density
LAWCAT: Efficient Distillation from Quadratic to Linear Attention with Convolution across Tokens for Long Context Modeling
Actions Speak Louder than Prompts: A Large-Scale Study of LLMs for Graph Inference
A Rhythm-Aware Phrase Insertion for Classical Arabic Poetry Composition
Trace Is In Sentences: Unbiased Lightweight ChatGPT-Generated Text Detector
CCQA: Generating Question from Solution Can Improve Inference-Time Reasoning in SLMs
Prior-based Noisy Text Data Filtering: Fast and Strong Alternative For Perplexity
TsqLoRA: Towards Sensitivity and Quality Low-Rank Adaptation for Efficient Fine-Tuning
UniECG: Understanding and Generating ECG in One Unified Model
A Good Plan is Hard to Find: Aligning Models with Preferences is Misaligned with What Helps Users
Thinking in a Crowd: How Auxiliary Information Shapes LLM Reasoning
SIRAG: Towards Stable and Interpretable RAG with A Process-Supervised Multi-Agent Framework
ERFC: Happy Customers with Emotion Recognition and Forecasting in Conversation in Call Centers
Evaluating Large Language Models for Detecting Antisemitism
Exploiting Tree Structure for Credit Assignment in RL Training of LLMs
Brittleness and Promise: Knowledge Graph Based Reward Modeling for Diagnostic Reasoning
Speculate Deep and Accurate: Lossless and Training-Free Acceleration for Offloaded LLMs via Substitute Speculative Decoding
Speech Vecalign: an Embedding-based Method for Aligning Parallel Speech Documents
Interactive Real-Time Speaker Diarization Correction with Human Feedback
NormGenesis: Multicultural Dialogue Generation via Exemplar-Guided Social Norm Modeling and Violation Recovery
Evaluating the Creativity of LLMs in Persian Literary Text Generation
Towards Practical Multi-label Causal Discovery in High-Dimensional Event Sequences via One-Shot Graph Aggregation
FedFiTS: Fitness-Selected, Slotted Client Scheduling for Trustworthy Federated Learning in Healthcare AI
Analysis on distribution and clustering of weight
PipelineRL: Faster On-policy Reinforcement Learning for Long Sequence Generation
GSTM-HMU: Generative Spatio-Temporal Modeling for Human Mobility Understanding
Efficient Reinforcement Learning by Reducing Forgetting with Elephant Activation Functions
Dynamic Prompt Fusion for Multi-Task and Cross-Domain Adaptation in LLMs
GAUSS: Benchmarking Structured Mathematical Skills for Large Language Models
Event Causality Identification with Synthetic Control
ZERA: Zero-init Instruction Evolving Refinement Agent - From Zero Instructions to Structured Prompts via Principle-based Optimization
Theoretical Foundations of Representation Learning using Unlabeled Data: Statistics and Optimization
Fully Learnable Neural Reward Machines
OmniBridge: Unified Multimodal Understanding, Generation, and Retrieval via Latent Space Alignment
Improving Credit Card Fraud Detection through Transformer-Enhanced GAN Oversampling
Latent Danger Zone: Distilling Unified Attention for Cross-Architecture Black-box Attacks
Beyond Backpropagation: Exploring Innovative Algorithms for Energy-Efficient Deep Neural Network Training
Diffusion Bridge Variational Inference for Deep Gaussian Processes
Graph Neural Networks with Similarity-Navigated Probabilistic Feature Copying
Asymptotically Optimal Problem-Dependent Bandit Policies for Transfer Learning
Algorithms for Adversarially Robust Deep Learning
DRO-REBEL: Distributionally Robust Relative-Reward Regression for Fast and Efficient LLM Alignment
Shared-Weights Extender and Gradient Voting for Neural Network Expansion
NGRPO: Negative-enhanced Group Relative Policy Optimization
Exploring Heterophily in Graph-level Tasks
Enhancing the Effectiveness and Durability of Backdoor Attacks in Federated Learning through Maximizing Task Distinction
Tackling GNARLy Problems: Graph Neural Algorithmic Reasoning Reimagined through Reinforcement Learning
Towards Privacy-Aware Bayesian Networks: A Credal Approach
Lift What You Can: Green Online Learning with Heterogeneous Ensembles
Central Limit Theorems for Asynchronous Averaged Q-Learning
Otters: An Energy-Efficient SpikingTransformer via Optical Time-to-First-Spike Encoding
Learning From Simulators: A Theory of Simulation-Grounded Learning
CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure
Flow marching for a generative PDE foundation model
HyperAdapt: Simple High-Rank Adaptation
Subspace Clustering of Subspaces: Unifying Canonical Correlation Analysis and Subspace Clustering
Towards Rational Pesticide Design with Graph Machine Learning Models for Ecotoxicology
A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications
LLM-Enhanced Self-Evolving Reinforcement Learning for Multi-Step E-Commerce Payment Fraud Risk Detection
Theory of periodic convolutional neural network
MOMEMTO: Patch-based Memory Gate Model in Time Series Foundation Model
Diagonal Linear Networks and the Lasso Regularization Path
Probabilistic Machine Learning for Uncertainty-Aware Diagnosis of Industrial Systems
Training-Free Data Assimilation with GenCast
Graph-based Clustering Revisited: A Relaxation of Kernel $k$-Means Perspective
SimpleFold: Folding Proteins is Simpler than You Think
Physics-informed time series analysis with Kolmogorov-Arnold Networks under Ehrenfest constraints
Hybrid Data can Enhance the Utility of Synthetic Data for Training Anti-Money Laundering Models
APRIL: Active Partial Rollouts in Reinforcement Learning to tame long-tail generation
Reverse-Complement Consistency for DNA Language Models
Symphony-MoE: Harmonizing Disparate Pre-trained Models into a Coherent Mixture-of-Experts
Global Minimizers of Sigmoid Contrastive Loss
Explainable Graph Neural Networks: Understanding Brain Connectivity and Biomarkers in Dementia
Interaction Topological Transformer for Multiscale Learning in Porous Materials
DS-Diffusion: Data Style-Guided Diffusion Model for Time-Series Generation
Reflect before Act: Proactive Error Correction in Language Models
Graph Enhanced Trajectory Anomaly Detection
Towards Provable Emergence of In-Context Reinforcement Learning
Development of Deep Learning Optimizers: Approaches, Concepts, and Update Rules
Explicit Path CGR: Maintaining Sequence Fidelity in Geometric Representations
Diffusion Policies with Offline and Inverse Reinforcement Learning for Promoting Physical Activity in Older Adults Using Wearable Sensors
MeshODENet: A Graph-Informed Neural Ordinary Differential Equation Neural Network for Simulating Mesh-Based Physical Systems
Fast Linear Solvers via AI-Tuned Markov Chain Monte Carlo-based Matrix Inversion
GluMind: Multimodal Parallel Attention and Knowledge Retention for Robust Cross-Population Blood Glucose Forecasting
Probabilistic Geometric Principal Component Analysis with application to neural data
Discrete-time diffusion-like models for speech synthesis
Individualized non-uniform quantization for vector search
DSFT: Inspiring Diffusion Large Language Models to Comprehend Mathematical and Logical Patterns
MobiGPT: A Foundation Model for Mobile Wireless Networks
PiMoE: Token-Level Routing for Integrating High-Precision Computation and Reasoning
FedIA: A Plug-and-Play Importance-Aware Gradient Pruning Aggregation Method for Domain-Robust Federated Graph Learning on Node Classification
SBVR: Summation of BitVector Representation for Efficient LLM Quantization
TurnBack: A Geospatial Route Cognition Benchmark for Large Language Models through Reverse Route
Conversational Orientation Reasoning: Egocentric-to-Allocentric Navigation with Multimodal Chain-of-Thought
Variational Task Vector Composition
MolPILE - large-scale, diverse dataset for molecular representation learning
FastMTP: Accelerating LLM Inference with Enhanced Multi-Token Prediction
Multi-Worker Selection based Distributed Swarm Learning for Edge IoT with Non-i.i.d. Data
GnnXemplar: Exemplars to Explanations - Natural Language Rules for Global GNN Interpretability
KM-GPT: An Automated Pipeline for Reconstructing Individual Patient Data from Kaplan-Meier Plots
AdaSTI: Conditional Diffusion Models with Adaptive Dependency Modeling for Spatio-Temporal Imputation
Early Prediction of Multi-Label Care Escalation Triggers in the Intensive Care Unit Using Electronic Health Records
ConceptFlow: Hierarchical and Fine-grained Concept-Based Explanation for Convolutional Neural Networks
Sparse Training Scheme for Multimodal LLM
HyperNAS: Enhancing Architecture Representation for NAS Predictor via Hypernetwork
WLFM: A Well-Logs Foundation Model for Multi-Task and Cross-Well Geological Interpretation
A deep reinforcement learning platform for antibiotic discovery
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe
Developing Training Procedures for Piecewise-linear Spline Activation Functions in Neural Networks
A Simple and Reproducible Hybrid Solver for a Truck-Drone VRP with Recharge
Safe-SAIL: Towards a Fine-grained Safety Landscape of Large Language Models via Sparse Autoencoder Interpretation Framework
Accounting for Uncertainty in Machine Learning Surrogates: A Gauss-Hermite Quadrature Approach to Reliability Analysis
Research on Metro Transportation Flow Prediction Based on the STL-GRU Combined Model
Two ways to knowledge?
Self-Evolving LLMs via Continual Instruction Tuning
A Weighted Gradient Tracking Privacy-Preserving Method for Distributed Optimization
SDGF: Fusing Static and Multi-Scale Dynamic Correlations for Multivariate Time Series Forecasting
From Parameters to Performance: A Data-Driven Study on LLM Structure and Development
LoRALib: A Standardized Benchmark for Evaluating LoRA-MoE Methods
Rank-Induced PL Mirror Descent: A Rank-Faithful Second-Order Algorithm for Sleeping Experts
Comparative Analysis of FOLD-SE vs. FOLD-R++ in Binary Classification and XGBoost in Multi-Category Classification
A Machine Learning Framework for Pathway-Driven Therapeutic Target Discovery in Metabolic Disorders
A Study of Skews, Imbalances, and Pathological Conditions in LLM Inference Deployment on GPU Clusters detectable from DPU
Towards Scalable and Structured Spatiotemporal Forecasting
Amortized Latent Steering: Low-Cost Alternative to Test-Time Optimization
Robust and continuous machine learning of usage habits to adapt digital interfaces to user needs
Decentor-V: Lightweight ML Training on Low-Power RISC-V Edge Devices
MobileRL: Online Agentic Reinforcement Learning for Mobile GUI Agents
A Coopetitive-Compatible Data Generation Framework for Cross-silo Federated Learning
Prediction of Coffee Ratings Based On Influential Attributes Using SelectKBest and Optimal Hyperparameters
NurseSchedRL: Attention-Guided Reinforcement Learning for Nurse-Patient Assignment
Anomaly Detection in Electric Vehicle Charging Stations Using Federated Learning
Machine Learnability as a Measure of Order in Aperiodic Sequences
Data Valuation and Selection in a Federated Model Marketplace
BULL-ODE: Bullwhip Learning with Neural ODEs and Universal Differential Equations under Stochastic Demand
Model-Based Transfer Learning for Real-Time Damage Assessment of Bridge Networks
AdaMixT: Adaptive Weighted Mixture of Multi-Scale Expert Transformers for Time Series Forecasting
Solve it with EASE
Machine Learning-Based Classification of Vessel Types in Straits Using AIS Tracks
Localized PCA-Net Neural Operators for Scalable Solution Reconstruction of Elliptic PDEs
Prompt Optimization Meets Subspace Representation Learning for Few-shot Out-of-Distribution Detection
Large language models surpass domain-specific architectures for antepartum electronic fetal monitoring analysis

Research Sources: 462 | Generated: 9/24/2025