AI Research News Feeds for December 29th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

Learning Association via Track-Detection Matching for Multi-Object Tracking
ProEdit: Inversion-based Editing From Prompts Done Right
See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning
A Graph-Augmented Knowledge Distillation based Dual-Stream Vision Transformer with Region-Aware Attention for Gastrointestinal Disease Classification with Explainable AI
Modified TSception for Analyzing Driver Drowsiness and Mental Workload from EEG
RT-Focuser: A Real-Time Lightweight Model for Edge-side Image Deblurring
The Color-Clinical Decoupling: Why Perceptual Calibration Fails Clinical Biomarkers in Smartphone Dermatology
SketchPlay: Intuitive Creation of Physically Realistic VR Content with Gesture-Driven Sketching
Co-Teaching for Unsupervised Domain Adaptation and Expansion
Self-Supervised Skeleton-Based Action Representation Learning: A Benchmark and Beyond
AlignFreeNet: Is Cross-Modal Pre-Alignment Necessary? An End-to-End Alignment-Free Lightweight Network for Visible-Infrared Object Detection
Multi-Part Object Representations via Graph Structures and Co-Part Discovery
Total Normal Curvature Regularization and its Minimization for Surface and Image Smoothing
Non-Contrast CT Esophageal Varices Grading through Clinical Prior-Enhanced Multi-Organ Analysis
D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning
Efficient Vision Mamba for MRI Super-Resolution via Hybrid Selective Scanning
AlignAR: Generative Sentence Alignment for Arabic-English Parallel Corpora of Legal and Literary Texts
TimeBill: Time-Budgeted Inference for Large Language Models
Explainable Statute Prediction via Attention-based Model and LLM Prompting
Accelerate Speculative Decoding with Sparse Computation in Verification
SWE-RM: Execution-free Feedback For Software Engineering Agents
Broken Words, Broken Performance: Effect of Tokenization on Performance of LLMs
Self-attention vector output similarities reveal how machines pay attention
Context as a Tool: Context Management for Long-Horizon SWE-Agents
Toward Secure and Compliant AI: Organizational Standards and Protocols for NLP Model Lifecycle Management
MAD: Multi-Alignment MEG-to-Text Decoding
Understanding Virality: A Rubric based Vision-Language Model Framework for Short-Form Edutainment Evaluation
IMA++: ISIC Archive Multi-Annotator Dermoscopic Skin Lesion Segmentation Dataset
Generative Multi-Focus Image Fusion
SVBench: Evaluation of Video Generation Models on Social Reasoning
Fixed-Budget Parameter-Efficient Training with Frozen Encoders Improves Multimodal Chest X-Ray Classification
Fixed-Threshold Evaluation of a Hybrid CNN-ViT for AI-Generated Image Detection Across Photos and Art
MuS-Polar3D: A Benchmark Dataset for Computational Polarimetric 3D Imaging under Multi-Scattering Conditions
Vision Transformers are Circulant Attention Learners
EraseLoRA: MLLM-Driven Foreground Exclusion and Background Subtype Aggregation for Dataset-Free Object Removal
Toward Intelligent Scene Augmentation for Context-Aware Object Placement and Sponsor-Logo Integration
LLM-Free Image Captioning Evaluation in Reference-Flexible Settings
UltraLBM-UNet: Ultralight Bidirectional Mamba-based Model for Skin Lesion Segmentation
From Shallow Humor to Metaphor: Towards Label-Free Harmful Meme Detection via LMM Agent Self-Improvement
GaussianEM: Model compositional and conformational heterogeneity using 3D Gaussians
TAMEing Long Contexts in Personalization: Towards Training-Free and State-Aware MLLM Personalized Assistant
CausalFSFG: Rethinking Few-Shot Fine-Grained Visual Categorization from Causal Perspective
SymDrive: Realistic and Controllable Driving Simulator via Symmetric Auto-regressive Online Restoration
Training-Free Disentangled Text-Guided Image Editing via Sparse Latent Constraints
Omni-Weather: Unified Multimodal Foundation Model for Weather Generation and Understanding
UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture
Contrastive Graph Modeling for Cross-Domain Few-Shot Medical Image Segmentation
SlideChain: Semantic Provenance for Lecture Understanding via Blockchain Registration
Analyzing the Mechanism of Attention Collapse in VGGT from a Dynamics Perspective
ShinyNeRF: Digitizing Anisotropic Appearance in Neural Radiance Fields
Prior-AttUNet: Retinal OCT Fluid Segmentation Based on Normal Anatomical Priors and Attention Gating
FUSE: Unifying Spectral and Semantic Cues for Robust AI-Generated Image Detection
Spatiotemporal-Untrammelled Mixture of Experts for Multi-Person Motion Prediction
RAPTOR: Real-Time High-Resolution UAV Video Prediction with Efficient Video Attention
AstraNav-World: World Model for Foresight Control and Consistency
Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation
SyncAnyone: Implicit Disentanglement via Progressive Self-Correction for Lip-Syncing in the wild
Scene-VLM: Multimodal Video Scene Segmentation via Vision-Language Models
AI for Mycetoma Diagnosis in Histopathological Images: The MICCAI 2024 Challenge
Diffusion Posterior Sampling for Super-Resolution under Gaussian Measurement Noise
End-to-End 3D Spatiotemporal Perception with Multimodal Fusion and V2X Collaboration
Breaking Alignment Barriers: TPS-Driven Semantic Correlation Learning for Alignment-Free RGB-T Salient Object Detection
Fast Inference of Visual Autoregressive Model with Adjacency-Adaptive Dynamical Draft Trees
Training-free Conditional Image Embedding Framework Leveraging Large Vision Language Models
EasyOmnimatte: Taming Pretrained Inpainting Diffusion Models for End-to-End Video Layered Decomposition
DPAR: Dynamic Patchification for Efficient Autoregressive Visual Generation
SLIM-Brain: A Data- and Training-Efficient Foundation Model for fMRI Data Analysis
Reloc-VGGT: Visual Re-localization with Geometry Grounded Transformer
CrownGen: Patient-customized Crown Generation via Point Diffusion Model
High-Fidelity and Long-Duration Human Image Animation with Diffusion Transformer
Patch as Node: Human-Centric Graph Representation Learning for Multimodal Action Recognition
Automated Discovery of Parsimonious Spectral Indices via Normalized Difference Polynomials
Perceive and Calibrate: Analyzing and Enhancing Robustness of Medical Multi-Modal Large Language Models
A Lightweight Multi-Scale Attention Framework for Real-Time Spinal Endoscopic Instance Segmentation
iSHIFT: Lightweight Slow-Fast GUI Agent with Adaptive Perception
Patch-Discontinuity Mining for Generalized Deepfake Detection
Backdoor Attacks on Prompt-Driven Video Segmentation Foundation Models
MAI-UI Technical Report: Real-World Centric Foundation GUI Agents
Yume-1.5: A Text-Controlled Interactive World Generation Model
GQ-VAE: A gated quantized VAE for learning variable length tokens
Exploring the Heterogeneity of Tabular Data: A Diversity-aware Data Generator via LLMs
Hybrid Combinatorial Multi-armed Bandits with Probabilistically Triggered Arms
DuaDeep-SeqAffinity: Dual-Stream Deep Learning Framework for Sequence-Only Antigen-Antibody Affinity Prediction
HWL-HIN: A Hypergraph-Level Hypergraph Isomorphism Network as Powerful as the Hypergraph Weisfeiler-Lehman Test with Application to Higher-Order Network Robustness
Direction Finding with Sparse Arrays Based on Variable Window Size Spatial Smoothing
Why Smooth Stability Assumptions Fail for ReLU Learning
Scaling Adversarial Training via Data Selection
Explainable Multimodal Regression via Information Decomposition
Harnessing Data Spaces to Build Intelligent Smart City Infrastructures Across the Cloud-Edge Continuum
Sensitivity Analysis of the Consistency Assumption
Deep learning-enhanced dual-mode multiplexed optical sensor for point-of-care diagnostics of cardiovascular diseases
Learning to Reconfigure: Using Device Status to Select the Right Constrained Coding Scheme
A Tool Bottleneck Framework for Clinically-Informed and Interpretable Medical Image Understanding
Cerberus: Multi-Agent Reasoning and Coverage-Guided Exploration for Static Detection of Runtime Errors
Scalable Deep Subspace Clustering Network
Dynamic Attention (DynAttn): Interpretable High-Dimensional Spatio-Temporal Forecasting (with Application to Conflict Fatalities)
Fuzzwise: Intelligent Initial Corpus Generation for Fuzzing
An approach to Fisher-Rao metric for infinite dimensional non-parametric information geometry
CCAD: Compressed Global Feature Conditioned Anomaly Detection
Quantum Nondecimated Wavelet Transform: Theory, Circuits, and Applications
nncase: An End-to-End Compiler for Efficient LLM Deployment on Heterogeneous Storage Architectures
Incorporating rank-free coupling and external field via an amplitude-only modulated spatial photonic Ising machine
Quantitative Verification of Omega-regular Properties in Probabilistic Programming
Semantic Codebooks as Effective Priors for Neural Speech Compression
The Deepfake Detective: Interpreting Neural Forensics Through Sparse Features and Manifolds
Assessing the Effectiveness of Membership Inference on Generative Music
BertsWin: Resolving Topological Sparsity in 3D Masked Autoencoders via Component-Balanced Structural Optimization
Few Tokens Matter: Entropy Guided Attacks on Vision-Language Models
Tilt Matching for Scalable Sampling and Fine-Tuning
Scalable Class-Incremental Learning Based on Parametric Neural Collapse
AutoPP: Towards Automated Product Poster Generation and Optimization
Data relativistic uncertainty framework for low-illumination anime scenery image enhancement
Modeling high dimensional point clouds with the spherical cluster model
Look Closer! An Adversarial Parametric Editing Framework for Hallucination Mitigation in VLMs
Prefill vs. Decode Bottlenecks: SRAM-Frequency Tradeoffs and the Memory-Bandwidth Ceiling
A Frobenius-Optimal Projection for Enforcing Linear Conservation in Learned Dynamical Models
Revisiting Bi-Encoder Neural Search: An Encoding–Searching Separation Perspective
HopCast: Calibration of Autoregressive Dynamics Models
Bias-variance decompositions: the exclusive privilege of Bregman divergences
Surrogate Representation Inference for Text and Image Annotations
Robust Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models
Generative Language Models on Nucleotide Sequences of Human Genes
Robust Federated Learning in Unreliable Wireless Networks: A Client Selection Approach
Beyond Heuristics: A Decision-Theoretic Framework for Agent Memory Management
Gamayun's Path to Multilingual Mastery: Cost-Efficient Training of a 1.5B-Parameter LLM
Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards
Heaven-Sent or Hell-Bent? Benchmarking the Intelligence and Defectiveness of LLM Hallucinations
MoRAgent: Parameter Efficient Agent Tuning with Mixture-of-Roles
Ara-HOPE: Human-Centric Post-Editing Evaluation for Dialectal Arabic to Modern Standard Arabic Translation
On The Conceptualization and Societal Impact of Cross-Cultural Bias
Method Decoration (DeMe): A Framework for LLM-Driven Adaptive Method Generation in Dynamic IoT Environments
Knowledge Reasoning of Large Language Models Integrating Graph-Structured Information for Pest and Disease Control in Tobacco
LibContinual: A Comprehensive Library towards Realistic Continual Learning
From In Silico to In Vitro: Evaluating Molecule Generative Models for Hit Generation
StreamAvatar: Streaming Diffusion Models for Real-Time Interactive Human Avatars
Unifying Learning Dynamics and Generalization in Transformers Scaling Law
Introducing TrGLUE and SentiTurca: A Comprehensive Benchmark for Turkish General Language Understanding and Sentiment Analysis
A2P-Vis: an Analyzer-to-Presenter Agentic Pipeline for Visual Insights Generation and Reporting
Agentic Structured Graph Traversal for Root Cause Analysis of Code-related Incidents in Cloud Applications
Creative Agents: Empowering Agents with Imagination for Creative Tasks
Pre-training Vision Transformers with Formula-driven Supervised Learning
SCALA: Split Federated Learning with Concatenated Activations and Logit Adjustments
GroupDebate: Enhancing the Efficiency of Multi-Agent Debate Using Group Discussion
An Exploration of Higher Education Course Evaluation by Large Language Models
A Causal Lens for Evaluating Faithfulness Metrics
Physics-Informed Neural Solvers for Periodic Quantum Eigenproblems
A Reinforcement Learning Approach to Synthetic Data Generation
kooplearn: A Scikit-Learn Compatible Library of Algorithms for Evolution Operator Learning
A Survey of Freshness-Aware Wireless Networking with Reinforcement Learning
DeepCQ: General-Purpose Deep-Surrogate Framework for Lossy Compression Quality Prediction
An Equivariance Toolbox for Learning Dynamics
RLLaVA: An RL-central Framework for Language and Vision Assistants
Statistical vs. Deep Learning Models for Estimating Substance Overdose Excess Mortality in the US
When Bayesian Tensor Completion Meets Multioutput Gaussian Processes: Functional Universality and Rank Learning
Missing Pattern Tree based Decision Grouping and Ensemble for Deep Incomplete Multi-View Clustering
Perplexity-Aware Data Scaling Law: Perplexity Landscapes Predict Performance for Continual Pre-training
Global-Graph Guided and Local-Graph Weighted Contrastive Learning for Unified Clustering on Incomplete and Noise Multi-View Data
First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions
Generative Actor Critic
AVP-Fusion: Adaptive Multi-Modal Fusion and Contrastive Learning for Two-Stage Antiviral Peptide Identification
Discovering Sparse Recovery Algorithms Using Neural Architecture Search
AnchorGK: Anchor-based Incremental and Stratified Graph Learning Framework for Inductive Spatio-Temporal Kriging
RefineBridge: Generative Bridge Models Improve Financial Forecasting by Foundation Models
Videos are Sample-Efficient Supervisions: Behavior Cloning from Videos via Latent Representations
Robustness and Scalability Of Machine Learning for Imbalanced Clinical Data in Emergency and Critical Care
A Data-Driven Multi-Objective Approach for Predicting Mechanical Performance, Flowability, and Porosity in Ultra-High-Performance Concrete (UHPC)
MAD-NG: Meta-Auto-Decoder Neural Galerkin Method for Solving Parametric Partial Differential Equations
Mechanical Strength Prediction of Steel-Polypropylene Fiber-based High-Performance Concrete Using Hybrid Machine Learning Algorithms
Causal-HM: Restoring Physical Generative Logic in Multimodal Anomaly Detection via Hierarchical Modulation
Rethinking Output Alignment For 1-bit Post-Training Quantization of Large Language Models
Dictionary-Transform Generative Adversarial Networks
Dynamic Feedback Engines: Layer-Wise Control for Self-Regulating Continual Learning
Approximation Capabilities of Feedforward Neural Networks with GELU Activations
VAMP-Net: An Interpretable Multi-Path Framework of Genomic Permutation-Invariant Set Attention and Quality-Aware 1D-CNN for MTB Drug Resistance
Synthetic Financial Data Generation for Enhanced Financial Modelling
Smart IoT-Based Leak Forecasting and Detection for Energy-Efficient Liquid Cooling in AI Data Centers
SpatialBench: Can Agents Analyze Real-World Spatial Biology Data?
Pruning as a Game: Equilibrium-Driven Sparsification of Neural Networks
EcoNet: Multiagent Planning and Control Of Household Energy Resources Using Active Inference
Atomistic Simulation Guided Convolutional Neural Networks for Thermal Modeling of Friction Stir Welding
Query Carefully: Detecting the Unanswerables in Text-to-SQL Tasks
Fairness Is Not Just Ethical: Performance Trade-Off via Data Correlation Tuning to Mitigate Bias in ML Software
CosmoCore-Evo: Evolutionary Dream-Replay Reinforcement Learning for Adaptive Code Generation
Multi-Agent LLM Committees for Autonomous Software Beta Testing
Reflection-Driven Control for Trustworthy Code Agents
AInsteinBench: Benchmarking Coding Agents on Scientific Repositories
Safe Path Planning and Observation Quality Enhancement Strategy for Unmanned Aerial Vehicles in Water Quality Monitoring Tasks
LLM-Driven Feature-Level Adversarial Attacks on Android Malware Detectors
Teaching People LLM's Errors and Getting it Right
Morality is Contextual: Learning Interpretable Moral Contexts from Human Data with Probabilistic Clustering and Large Language Models
dUltra: Ultra-Fast Diffusion Language Models via Reinforcement Learning
Intelligent recognition of GPR road hidden defect images based on feature fusion and attention mechanism
GPF-Net: Gated Progressive Fusion Learning for Polyp Re-Identification
Efficient MoE Inference with Fine-Grained Scheduling of Disaggregated Expert Parallelism
Oogiri-Master: Benchmarking Humor Understanding via Oogiri
MotionTeller: Multi-modal Integration of Wearable Time-Series with LLMs for Health and Behavioral Understanding
DiverseGRPO: Mitigating Mode Collapse in Image Generation via Diversity-Aware GRPO
Selective LLM-Guided Regularization for Enhancing Recommendation Models
Hierarchy-Aware Fine-Tuning of Vision-Language Models
Human-AI Interaction Alignment: Designing, Evaluating, and Evolving Value-Centered AI For Reciprocal Human-AI Futures
Bidirectional Human-AI Alignment in Education for Trustworthy Learning Environments
Exploration of Reproducible Generated Image Detection
Towards Long-window Anchoring in Vision-Language Model Distillation
A Unified Definition of Hallucination, Or: It's the World Model, Stupid
Residual Prior Diffusion: A Probabilistic Framework Integrating Coarse Latent Priors with Diffusion Models
LLM-I2I: Boost Your Small Item2Item Recommendation Model with Large Language Model
TrackTeller: Temporal Multimodal 3D Grounding for Behavior-Dependent Object References
Variance-Aware Prior-Based Tree Policies for Monte Carlo Tree Search
Enabling Ultra-Fast Cardiovascular Imaging Across Heterogeneous Clinical Environments with a Generalist Foundation Model and Multimodal Database
Structural Induced Exploration for Balanced and Scalable Multi-Robot Path Planning
Near-Optimal Coalition Structures in Polynomial Time
Comparative Analysis of Deep Learning Models for Perception in Autonomous Vehicles
RIPCN: A Road Impedance Principal Component Network for Probabilistic Traffic Flow Forecasting
BeHGAN: Bengali Handwritten Word Generation from Plain Text Using Generative Adversarial Networks
Zero-Shot to Zero-Lies: Detecting Bengali Deepfake Audio through Transfer Learning
Enabling Conversational Behavior Reasoning Capabilities in Full-Duplex Speech
Detecting AI-Generated Paraphrases in Bengali: A Comparative Study of Zero-Shot and Fine-Tuned Transformers
Do Latent Tokens Think? A Causal and Adversarial Analysis of Chain-of-Continuous-Thought
CATCH: A Controllable Theme Detection Framework with Contextualized Clustering and Hierarchical Generation
Multiconnectivity for SAGIN: Current Trends, Challenges, AI-driven Solutions, and Opportunities
An Information Theoretic Perspective on Agentic System Design
HELP: Hierarchical Embodied Language Planner for Household Tasks
A Model of Causal Explanation on Neural Networks for Tabular Data
How Do Agents Perform Code Optimization? An Empirical Study
A-QCF-Net: An Adaptive Quaternion Cross-Fusion Network for Multimodal Liver Tumor Segmentation from Unpaired Datasets
Inference-based GAN Video Generation
InstructMoLE: Instruction-Guided Mixture of Low-rank Experts for Multi-Conditional Image Generation
Five Years of SciCap: What We Learned and Future Directions for Scientific Figure Captioning
Multi-agent Adaptive Mechanism Design
Applications of synthetic financial data in portfolio and risk modeling
CellMamba: Adaptive Mamba for Accurate and Efficient Cell Detection
S&P 500 Stock's Movement Prediction using CNN
HeartBench: Probing Core Dimensions of Anthropomorphic Intelligence in LLMs
A Comedy of Estimators: On KL Regularization in RL Training of LLMs
MoonBot: Modular and On-Demand Reconfigurable Robot Toward Moon Base Construction
Balancing Accuracy and Efficiency: CNN Fusion Models for Diabetic Retinopathy Screening
Secure and Explainable Fraud Detection in Finance via Hierarchical Multi-source Dataset Distillation
Bridging the Copyright Gap: Do Large Vision-Language Models Recognize and Respect Copyrighted Content?
CricBench: A Multilingual Benchmark for Evaluating LLMs in Cricket Analytics
MASFIN: A Multi-Agent System for Decomposed Financial Reasoning and Forecasting
Optimizing Resource Allocation for Geographically-Distributed Inference by Large Language Models
Aerial World Model for Long-horizon Visual Generation and Navigation in 3D Space
MMCTOP: A Multimodal Textualization and Mixture-of-Experts Framework for Clinical Trial Outcome Prediction
Flexible Multitask Learning with Factorized Diffusion Policy
Semiparametric Preference Optimization: Your Language Model is Secretly a Single-Index Model
Unsupervised Anomaly Detection in Brain MRI via Disentangled Anatomy Learning
LVLM-Aided Alignment of Task-Specific Vision Models
LongFly: Long-Horizon UAV Vision-and-Language Navigation with Spatiotemporal Context Integration
Meta-Learning-Based Handover Management in NextG O-RAN
From Visual Perception to Deep Empathy: An Automated Assessment Framework for House-Tree-Person Drawings Using Multimodal LLMs and Multi-Agent Collaboration
A Study of Solving Life-and-Death Problems in Go Using Relevance-Zone Based Solvers
Three-way conflict analysis based on alliance and conflict functions
Feasible strategies in three-way conflict analysis with three-valued ratings
Three-way decision with incomplete information based on similarity and satisfiability
LogicLens: Visual-Logical Co-Reasoning for Text-Centric Forgery Analysis
Leash: Adaptive Length Penalty and Reward Shaping for Efficient Large Reasoning Model
NEMO-4-PAYPAL: Leveraging NVIDIA's Nemo Framework for empowering PayPal's Commerce Agent
A Medical Multimodal Diagnostic Framework Integrating Vision-Language Models and Logic Tree Reasoning
AMS-IO-Bench and AMS-IO-Agent: Benchmarking and Structured Reasoning for Analog and Mixed-Signal Integrated Circuit Input/Output Design
Democratizing Drug Discovery with an Orchestrated, Knowledge-Driven Multi-Agent Team for User-Guided Therapeutic Design
Multiple-play Stochastic Bandits with Prioritized Arm Capacity Sharing
Towards Responsible and Explainable AI Agents with Consensus-Driven Reasoning
Compliance Rating Scheme: A Data Provenance Framework for Generative AI Datasets
Accelerating Scientific Discovery with Autonomous Goal-evolving Agents

Research Sources: 264 | Generated: 12/29/2025