AI Research News Feeds for August 12th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

A CrossMod-Transformer deep learning framework for multi-modal pain detection through EDA and ECG fusion
Prompt-based bioinformatics: a new interface for multi-omics analysis
AI-driven fusion of multimodal data for Alzheimer’s disease biomarker assessment
AI-assisted cervical cytology precancerous screening for high-risk population in resource-limited regions using a compact microscope
A practical framework for appropriate implementation and review of artificial intelligence (FAIR-AI) in healthcare
Deep learning-based prediction of the selection factors for quantifying selection in immune receptor repertoires
Decomposing Global AUC into Cluster-Level Contributions for Localized Model Diagnostics
A Meta-Learning Method for Estimation of Causal Excursion Effects to Assess Time-Varying Moderation
Stability and performance guarantees for misspecified multivariate score-driven filters
Physics-Informed Generative Modeling of Wireless Channels
Semantic Mapping in Indoor Embodied AI -- A Survey on Advances, Challenges, and Future Directions
DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control
OceanSim: A GPU-Accelerated Underwater Robot Perception Simulation Framework
MeshPad: Interactive Sketch-Conditioned Artist-Reminiscent Mesh Generation and Editing
Dual-domain Modulation Network for Lightweight Image Super-Resolution
Evaluating structural uncertainty in accelerated MRI: are voxelwise measures useful surrogates?
UniCalib: Targetless LiDAR-Camera Calibration via Probabilistic Flow on Unified Depth Representations
Retuve: Automated Multi-Modality Analysis of Hip Dysplasia with Open Source AI
SOPHY: Learning to Generate Simulation-Ready Objects with Physical Materials
Is Single-View Mesh Reconstruction Ready for Robotics?
Efficient RAW Image Deblurring with Adaptive Frequency Modulation
Maximum Dispersion, Maximum Concentration: Enhancing the Quality of MOP Solutions
Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos
A Steel Surface Defect Detection Method Based on Lightweight Convolution Optimization
Learned Regularization for Microwave Tomography
ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks
On Representation Learning with Feedback
EA-KD: Entropy-based Adaptive Knowledge Distillation
Scene Summarization: Clustering Scene Videos into Spatially Diverse Frames
BonnBeetClouds3D: A Dataset Towards Point Cloud-based Organ-level Phenotyping of Sugar Beet Plants under Field Conditions
Towards Customized Knowledge Distillation for Chip-Level Dense Image Predictions
Compact and De-biased Negative Instance Embedding for Multi-Instance Learning on Whole-Slide Image Classification
Spotter+GPT: Turning Sign Spottings into Sentences with LLMs
Goldilocks Test Sets for Face Verification
InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows
Learning Multi-view Anomaly Detection with Efficient Adaptive Selection
FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance
Alignment-free Raw Video Demoireing
Prompt-Softbox-Prompt: A Free-Text Embedding Control for Image Editing
Ethical Challenges in Computer Vision: Ensuring Privacy and Mitigating Bias in Publicly Available Datasets
PainDiffusion: Learning to Express Pain
SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with Mamba
Multimodal Deception in Explainable AI: Concept-Level Backdoor Attacks on Concept Bottleneck Models
SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity
Flow Matching Posterior Sampling: A Training-free Conditional Generation for Flow Matching
Solving Zero-Shot 3D Visual Grounding as Constraint Satisfaction Problems
Quadratic Gaussian Splatting: High Quality Surface Reconstruction with Second-order Geometric Primitives
VideoSAVi: Self-Aligned Video Language Models without Human Supervision
DuoCast: Duo-Probabilistic Diffusion for Precipitation Nowcasting
BadPatch: Diffusion-Based Generation of Physical Adversarial Patches
Street Gaussians without 3D Object Tracker
Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction
TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models
DAViD: Modeling Dynamic Affordance of 3D Objects Using Pre-trained Video Diffusion Models
DWTNeRF: Boosting Few-shot Neural Radiance Fields via Discrete Wavelet Transform
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer
Confidence-Based Annotation Of Brain Tumours In Ultrasound
GaussianFlowOcc: Sparse and Weakly Supervised Occupancy Estimation using Gaussian Splatting and Temporal Flow
X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation
TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification
MambaFlow: A Mamba-Centric Architecture for End-to-End Optical Flow Estimation
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
From Limited Labels to Open Domains: An Efficient Learning Method for Drone-view Geo-Localization
ROODI: Reconstructing Occluded Objects with Denoising Inpainters
VFM-UDA++: Improving Network Architectures and Data Strategies for Unsupervised Domain Adaptive Semantic Segmentation
AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction
Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks
Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models
GAITGen: Disentangled Motion-Pathology Impaired Gait Generative Model -- Bringing Motion Generation to the Clinical Domain
Leveraging Sparse Annotations for Leukemia Diagnosis on the Large Leukemia Dataset
Hierarchical Relation-augmented Representation Generalization for Few-shot Action Recognition
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detection
DamageCAT: A Deep Learning Transformer Framework for Typology-Based Post-Disaster Building Damage Categorization
Interpreting the linear structure of vision-language model embedding spaces
Just Say the Word: Annotation-Free Fine-Grained Object Counting
Decoupled Global-Local Alignment for Improving Compositional Understanding
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback
3D Gaussian Splatting Data Compression with Mixture of Priors
QuickSplat: Fast 3D Surface Reconstruction via Learned Gaussian Initialization
Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language
Unintended Bias in 2D+ Image Segmentation and Its Effect on Attention Asymmetry
Toward Patient-specific Partial Point Cloud to Surface Completion for Pre- to Intra-operative Registration in Image-guided Liver Interventions
NEXT: Multi-Grained Mixture of Experts via Text-Modulation for Multi-Modal Object Re-Identification
EF-VI: Enhancing End-Frame Injection for Video Inbetweening
Rhetorical Text-to-Image Generation via Two-layer Diffusion Policy Optimization
HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image
Video Signature: In-generation Watermarking for Latent Video Diffusion Models
Zoom-Refine: Boosting High-Resolution Multimodal Understanding via Localized Zoom and Self-Refinement
DanceChat: Large Language Model-Guided Music-to-Dance Generation
Simple Radiology VLLM Test-time Scaling with Thought Graph Traversal
3DGS-IEval-15K: A Large-scale Image Quality Evaluation Database for 3D Gaussian-Splatting
CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion
Co-VisiON: Co-Visibility ReasONing on Sparse Image Sets of Indoor Scenes
CLGRPO: Reasoning Ability Enhancement for Small VLMs
UltraAD: Fine-Grained Ultrasound Anomaly Classification via Few-Shot CLIP Adaptation
Temporal Rate Reduction Clustering for Human Motion Segmentation
MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models
PCLVis: Visual Analytics of Process Communication Latency in Large-Scale Simulation
CPKD: Clinical Prior Knowledge-Constrained Diffusion Models for Surgical Phase Recognition in Endoscopic Submucosal Dissection
When Trackers Date Fish: A Benchmark and Framework for Underwater Multiple Fish Tracking
Multimodal Visual Transformer for Sim2real Transfer in Visual Reinforcement Learning
LifelongPR: Lifelong point cloud place recognition based on sample replay and prompt learning
ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding
Efficient Face Image Quality Assessment via Self-training and Knowledge Distillation
FROSS: Faster-than-Real-Time Online 3D Semantic Scene Graph Generation from RGB-D Images
Less is More: Skim Transformer for Light Field Image Super-resolution
A nonlinear elasticity model in computer vision
Rethinking Theoretical Illumination for Efficient Low-Light Image Enhancement
A Plug-and-Play Method for Guided Multi-contrast MRI Reconstruction based on Content/Style Modeling
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
Comparison Reveals Commonality: Customized Image Generation through Contrastive Inversion
Correspondence as Video: Test-Time Adaption on SAM2 for Reference Segmentation in the Wild
Dream4D: Lifting Camera-Controlled I2V towards Spatiotemporally Consistent 4D Generation
Prototype-Guided Curriculum Learning for Zero-Shot Learning
Forecasting Continuous Non-Conservative Dynamical Systems in SO(3)
GaitSnippet: Gait Recognition Beyond Unordered Sets and Ordered Sequences
Anatomy-Aware Low-Dose CT Denoising via Pretrained Vision Models and Semantic-Guided Contrastive Learning
Boosting Active Defense Persistence: A Two-Stage Defense Framework Combining Interruption and Poisoning Against Deepfake
Power Battery Detection
MambaTrans: Multimodal Fusion Image Translation via Large Language Model Priors for Downstream Visual Tasks
Pose-RFT: Enhancing MLLMs for 3D Pose Generation via Hybrid Action Reinforcement Fine-Tuning
DiTVR: Zero-Shot Diffusion Transformer for Video Restoration
Semi-supervised Multiscale Matching for SAR-Optical Image
Segmenting and Understanding: Region-aware Semantic Attention for Fine-grained Image Quality Assessment with Large Language Models
MIMIC: Multimodal Inversion for Model Interpretation and Conceptualization
Effortless Vision-Language Model Specialization in Histopathology without Annotation
CBDES MoE: Hierarchically Decoupled Mixture-of-Experts for Functional Modules in Autonomous Driving
Morphological Analysis of Semiconductor Microstructures using Skeleton Graphs
Tracking Any Point Methods for Markerless 3D Tissue Tracking in Endoscopic Stereo Images
CATP: Contextually Adaptive Token Pruning for Efficient and Enhanced Multimodal In-Context Learning
TAP: Parameter-efficient Task-Aware Prompting for Adverse Weather Removal
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation
CTC Transcription Alignment of the Bullinger Letters: Automatic Improvement of Annotation Quality
Generative Video Matting
Mem4D: Decoupling Static and Dynamic Memory for Dynamic Scene Reconstruction
RSVLM-QA: A Benchmark Dataset for Remote Sensing Vision Language Model-based Question Answering
TAG: A Simple Yet Effective Temporal-Aware Approach for Zero-Shot Video Temporal Grounding
VOIDFace: A Privacy-Preserving Multi-Network Face Recognition With Enhanced Security
TrackOR: Towards Personalized Intelligent Operating Rooms Through Robust Tracking
The Escalator Problem: Identifying Implicit Motion Blindness in AI for Accessibility
Prompt-Guided Relational Reasoning for Social Behavior Understanding with Vision Foundation Models
Sample-aware RandAugment: Search-free Automatic Data Augmentation for Effective Image Recognition
Mitigating Biases in Surgical Operating Rooms with Geometry
TRIDE: A Text-assisted Radar-Image weather-aware fusion network for Depth Estimation
S^2VG: 3D Stereoscopic and Spatial Video Generation via Denoising Frame Matrix
Information Bottleneck-based Causal Attention for Multi-label Medical Image Recognition
ME-TST+: Micro-expression Analysis via Temporal State Transition with ROI Relationship Awareness
Matrix-3D: Omnidirectional Explorable 3D World Generation
3D Plant Root Skeleton Detection and Extraction
TBAC-UniImage: Unified Understanding and Generation by Ladder-Side Diffusion Tuning
A Physics-Driven Neural Network with Parameter Embedding for Generating Quantitative MR Maps from Weighted Images
Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control
FantasyStyle: Controllable Stylized Distillation for 3D Gaussian Splatting
Pindrop it! Audio and Visual Deepfake Countermeasures for Robust Detection and Fine Grained-Localization
ReconDreamer-RL: Enhancing Reinforcement Learning via Diffusion-based Scene Reconstruction
CD-TVD: Contrastive Diffusion for 3D Super-Resolution with Scarce High-Resolution Time-Varying Data
3D Human Mesh Estimation from Single View RGBD
PP-Motion: Physical-Perceptual Fidelity Evaluation for Human Motion Generation
THAT: Token-wise High-frequency Augmentation Transformer for Hyperspectral Pansharpening
KARMA: Efficient Structural Defect Segmentation via Kolmogorov-Arnold Representation Learning
Reinforcement Learning in Vision: A Survey
Spatial-ORMLLM: Improve Spatial Relation Understanding in the Operating Room with Multimodal Large Language Model
SAGOnline: Segment Any Gaussians Online
Learning User Preferences for Image Generation Model
StableAvatar: Infinite-Length Audio-Driven Avatar Video Generation
ReferSplat: Referring Segmentation in 3D Gaussian Splatting
Learning an Implicit Physics Model for Image-based Fluid Simulation
Codebook-enabled Generative End-to-end Semantic Communication Powered by Transformer
Digital generation of the 3-D pore architecture of isotropic membranes using 2-D cross-sectional scanning electron microscopy images
Vibration-Based Energy Metric for Restoring Needle Alignment in Autonomous Robotic Ultrasound
Fading the Digital Ink: A Universal Black-Box Attack Framework for 3DGS Watermarking Systems
KLASSify to Verify: Audio-Visual Deepfake Detection Using SSL-based Audio and Handcrafted Visual Features
Progressive Bird's Eye View Perception for Safety-Critical Autonomous Driving: A Comprehensive Survey
MSPT: A Lightweight Face Image Quality Assessment Method with Multi-stage Progressive Training
AD-AVSR: Asymmetric Dual-stream Enhancement for Robust Audio-Visual Speech Recognition
Sea-Undistort: A Dataset for Through-Water Image Restoration in High Resolution Airborne Bathymetric Mapping
IPBA: Imperceptible Perturbation Backdoor Attack in Federated Self-Supervised Learning
Adaptive Cache Enhancement for Test-Time Adaptation of Vision-Language Models
GAPNet: A Lightweight Framework for Image and Video Salient Object Detection via Granularity-Aware Paradigm
Voice Pathology Detection Using Phonation
From Prediction to Explanation: Multimodal, Explainable, and Interactive Deepfake Detection Framework for Non-Expert Users
LaVieID: Local Autoregressive Diffusion Transformers for Identity-Preserving Video Creation
X2Edit: Revisiting Arbitrary-Instruction Image Editing through Self-Constructed Data and Task-Aware Representation Learning
An Iterative Reconstruction Method for Dental Cone-Beam Computed Tomography with a Truncated Field of View
Enhancing Egocentric Object Detection in Static Environments using Graph-based Spatial Anomaly Detection and Correction
A Trustworthy Method for Multimodal Emotion Recognition
AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning
LaRender: Training-Free Occlusion Control in Image Generation via Latent Rendering
Collaborative Learning of Scattering and Deep Features for SAR Target Recognition with Noisy Labels
Undress to Redress: A Training-Free Framework for Virtual Try-On
DiffVC-OSD: One-Step Diffusion-based Perceptual Neural Video Compression Framework
Make Your MoVe: Make Your 3D Contents by Adapting Multi-View Diffusion Models to External Editing
Multi-view Normal and Distance Guidance Gaussian Splatting for Surface Reconstruction
A Registration-Based Star-Shape Segmentation Model and Fast Algorithms
Enhancing Small-Scale Dataset Expansion with Triplet-Connection-based Sample Re-Weighting
Grouped Speculative Decoding for Autoregressive Image Generation
Med-GRIM: Enhanced Zero-Shot Medical VQA using prompt-embedded Multimodal Graph RAG
DiTalker: A Unified DiT-based Framework for High-Quality and Speaking Styles Controllable Portrait Animation
BigTokDetect: A Clinically-Informed Vision-Language Model Framework for Detecting Pro-Bigorexia Videos on TikTok
Frequency Prior Guided Matching: A Data Augmentation Approach for Generalizable Semi-Supervised Polyp Segmentation
Large Language Models Facilitate Vision Reflection in Image Classification
Benchmarking Deep Learning-Based Object Detection Models on Feature Deficient Astrophotography Imagery Dataset
MILD: Multi-Layer Diffusion Strategy for Complex and Precise Multi-IP Aware Human Erasing
Statistical Confidence Rescoring for Robust 3D Scene Graph Generation from Multi-View Images
Slice or the Whole Pie? Utility Control for AI Models
Static and Plugged: Make Embodied Evaluation Simple
StyleTailor: Towards Personalized Fashion Styling via Hierarchical Negative Feedback
Grounding Emotion Recognition with Visual Prototypes: VEGA -- Revisiting CLIP in MERC
ContextGuard-LVLM: Enhancing News Veracity through Fine-grained Cross-modal Contextual Consistency Verification
VL-MedGuide: A Visual-Linguistic Large Model for Intelligent and Explainable Skin Disease Auxiliary Diagnosis
CycleDiff: Cycle Diffusion Models for Unpaired Image-to-image Translation
Rethinking Key-frame-based Micro-expression Recognition: A Robust and Accurate Framework Against Key-frame Errors
Towards Robust Red-Green Watermarking for Autoregressive Image Generators
Learning More by Seeing Less: Line Drawing Pretraining for Efficient, Transferable, and Human-Aligned Vision
Fourier Optics and Deep Learning Methods for Fast 3D Reconstruction in Digital Holography
Restage4D: Reanimating Deformable 3D Reconstruction from a Single Video
VOccl3D: A Video Benchmark Dataset for 3D Human Pose and Shape Estimation under real Occlusions
DiffUS: Differentiable Ultrasound Rendering from Volumetric Imaging
Edge Detection for Organ Boundaries via Top Down Refinement and SubPixel Upsampling
DualResolution Residual Architecture with Artifact Suppression for Melanocytic Lesion Segmentation
VesselRW: Weakly Supervised Subcutaneous Vessel Segmentation via Learned Random Walk Propagation
Low-Rank Expert Merging for Multi-Source Domain Adaptation in Person Re-Identification
Hybrid Machine Learning Framework for Predicting Geometric Deviations from 3D Surface Metrology
A Joint Sparse Self-Representation Learning Method for Multiview Clustering
LWT-ARTERY-LABEL: A Lightweight Framework for Automated Coronary Artery Identification
Fusion-Based Brain Tumor Classification Using Deep Learning and Explainable AI, and Rule-Based Reasoning
eMotions: A Large-Scale Dataset and Audio-Visual Fusion Network for Emotion Analysis in Short-form Videos
A Simple yet Powerful Instance-Aware Prompting Framework for Training-free Camouflaged Object Segmentation
MultiRef: Controllable Image Generation with Multiple Visual References
Talk2Image: A Multi-Agent System for Multi-Turn Image Generation and Editing
AR-GRPO: Training Autoregressive Image Generation Models via Reinforcement Learning
SLRTP2025 Sign Language Production Challenge: Methodology, Results, and Future Work
Adversarial Video Promotion Against Text-to-Video Retrieval
Evaluating Fisheye-Compatible 3D Gaussian Splatting Methods on Real Images Beyond 180 Degree Field of View
TADoc: Robust Time-Aware Document Image Dewarping
OctreeNCA: Single-Pass 184 MP Segmentation on Consumer Hardware
S2-UniSeg: Fast Universal Agglomerative Pooling for Scalable Segment Anything without Supervision
Spatio-Temporal Conditional Diffusion Models for Forecasting Future Multiple Sclerosis Lesion Masks Conditioned on Treatments
HiMat: DiT-based Ultra-High Resolution SVBRDF Generation
DocRefine: An Intelligent Framework for Scientific Document Understanding and Content Optimization based on Multimodal Large Model Agents
MV-CoRe: Multimodal Visual-Conceptual Reasoning for Complex Visual Question Answering
Large Language Model Evaluated Stand-alone Attention-Assisted Graph Neural Network with Spatial and Structural Information Interaction for Precise Endoscopic Image Segmentation
3DGS-VBench: A Comprehensive Video Quality Evaluation Benchmark for 3DGS Compression
SAGCNet: Spatial-Aware Graph Completion Network for Missing Slice Imputation in Population CMR Imaging
TeSO: Representing and Compressing 3D Point Cloud Scenes with Textured Surfel Octree
ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting
Communication-Efficient Multi-Agent 3D Detection via Hybrid Collaboration
CMAMRNet: A Contextual Mask-Aware Network Enhancing Mural Restoration Through Comprehensive Mask Guidance
Dynamic Pattern Alignment Learning for Pretraining Lightweight Human-Centric Vision Models
SketchAnimator: Animate Sketch via Motion Customization of Text-to-Video Diffusion Models
CoopDiff: Anticipating 3D Human-object Interactions via Contact-consistent Decoupled Diffusion
EventRR: Event Referential Reasoning for Referring Video Object Segmentation
Similarity Matters: A Novel Depth-guided Network for Image Restoration and A New Dataset
Unsupervised Real-World Super-Resolution via Rectified Flow Degradation Modelling
Bridging Semantic Logic Gaps: A Cognition-Inspired Multimodal Boundary-Preserving Network for Image Manipulation Localization
Generic Calibration: Pose Ambiguity/Linear Solution and Parametric-hybrid Pipeline
HaDM-ST: Histology-Assisted Differential Modeling for Spatial Transcriptomics Generation
Landmark Guided Visual Feature Extractor for Visual Speech Recognition with Limited Resource
ASM-UNet: Adaptive Scan Mamba Integrating Group Commonalities and Individual Variations for Fine-Grained Segmentation
Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers
SUIT: Spatial-Spectral Union-Intersection Interaction Network for Hyperspectral Object Tracking
Understanding Dynamic Scenes in Ego Centric 4D Point Clouds
Small-Large Collaboration: Training-efficient Concept Personalization for Large VLM using a Meta Personalized Small VLM
SynMatch: Rethinking Consistency in Medical Image Segmentation with Sparse Annotations
BEVANet: Bilateral Efficient Visual Attention Network for Real-Time Semantic Segmentation
MobileViCLIP: An Efficient Video-Text Model for Mobile Devices
DocR1: Evidence Page-Guided GRPO for Multi-Page Document Understanding
RORPCap: Retrieval-based Objects and Relations Prompt for Image Captioning
Planner-Refiner: Dynamic Space-Time Refinement for Vision-Language Alignment in Videos
CoAR: Concept Injection into Autoregressive Models for Personalized Text-to-Image Generation
SODiff: Semantic-Oriented Diffusion Model for JPEG Compression Artifacts Removal
GS4Buildings: Prior-Guided Gaussian Splatting for 3D Building Reconstruction
Training and Inference within 1 Second -- Tackle Cross-Sensor Degradation of Real-World Pansharpening with Efficient Residual Feature Tailoring
DIP-GS: Deep Image Prior For Gaussian Splatting Sparse View Recovery
LET-US: Long Event-Text Understanding of Scenes
ForensicsSAM: Toward Robust and Unified Image Forgery Detection and Localization Resisting to Adversarial Attack
CharacterShot: Controllable and Consistent 4D Character Animation
CLUE: Leveraging Low-Rank Adaptation to Capture Latent Uncovered Evidence for Image Forgery Localization
Leveraging Learning Bias for Noisy Anomaly Detection
Health Care Waste Classification Using Deep Learning Aligned with Nepal's Bin Color Guidelines
AURA: A Fine-Grained Benchmark and Decomposed Metric for Audio-Visual Reasoning
Novel View Synthesis with Gaussian Splatting: Impact on Photogrammetry Model Accuracy and Resolution
VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding
FormCoach: Lift Smarter, Not Harder
Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing
Enhancing Reliability of Medical Image Diagnosis through Top-rank Learning with Rejection Module
Enhanced Generative Structure Prior for Chinese Text Image Super-resolution
Domain Generalization of Pathological Image Segmentation by Patch-Level and WSI-Level Contrastive Learning
CoT-Pose: Chain-of-Thought Reasoning for 3D Pose Generation from Abstract Prompts
Adaptive Pseudo Label Selection for Individual Unlabeled Data by Positive and Unlabeled Learning
Decoupled Functional Evaluation of Autonomous Driving Models via Feature Map Quality Scoring
Splat4D: Diffusion-Enhanced 4D Gaussian Splatting for Temporally and Spatially Consistent Content Creation
Joint Transcription of Acoustic Guitar Strumming Directions and Chords
Improving Document Retrieval Coherence for Semantically Equivalent Queries
Exploring Procedural Data Generation for Automatic Acoustic Guitar Fingerpicking Transcription
Audio-Thinker: Guiding Audio Language Model When and How to Think via Reinforcement Learning
How Chinese are Chinese Language Models? The Puzzling Lack of Language Policy in China's LLMs
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions
Strengthening False Information Propagation Detection: Leveraging SVM and Sophisticated Text Vectorization Techniques in comparison to BERT
ReGLA: Refining Gated Linear Attention
ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning
URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models
Invisible Walls in Cities: Leveraging Large Language Models to Predict Urban Segregation Experience with Social Media Content
X-EcoMLA: Upcycling Pre-Trained Attention into MLA for Efficient and Extreme KV Compression
Both Direct and Indirect Evidence Contribute to Dative Alternation Preferences in Language Models
Overcoming Vocabulary Constraints with Pixel-level Fallback
NoveltyBench: Evaluating Language Models for Humanlike Diversity
QUDsim: Quantifying Discourse Similarities in LLM-Generated Text
GreenMind: A Next-Generation Vietnamese Large Language Model for Structured and Logical Reasoning
Planning with Diffusion Models for Target-Oriented Dialogue Systems
Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control
RAIR: Retrieval-Augmented Iterative Refinement for Chinese Spelling Correction
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
Rethinking Prompt Optimizers: From Prompt Merits to Optimization
Decoding the Multimodal Mind: Generalizable Brain-to-Text Translation via Multimodal Alignment and Adaptive Routing
The taggedPBC: Annotating a massive parallel corpus for crosslinguistic investigations
WebDancer: Towards Autonomous Information Seeking Agency
Document Valuation in LLM Summaries: A Cluster Shapley Approach
PersianMedQA: Evaluating Large Language Models on a Persian-English Bilingual Medical Question Answering Benchmark
ACCESS DENIED INC: The First Benchmark Environment for Sensitivity Awareness
Structure-Augmented Reasoning Generation
PersonalAI: A Systematic Comparison of Knowledge Graph Storage and Retrieval Approaches for Personalized LLM agents
MDC-R: The Minecraft Dialogue Corpus with Reference
Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning
Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective
EduCoder: An Open-Source Annotation System for Education Transcript Data
Investigating writing style as a contributor to gender gaps in science and technology
SynthVLM: Towards High-Quality and Efficient Synthesis of Image-Caption Datasets for Vision-Language Models
MathScape: Benchmarking Multimodal Large Language Models in Real-World Mathematical Contexts
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
How Far Are We from Generating Missing Modalities with Foundation Models?
GenEscape: Hierarchical Multi-Agent Generation of Escape Room Puzzles
Vec2Summ: Text Summarization via Probabilistic Sentence Embeddings
BharatBBQ: A Multilingual Bias Benchmark for Question Answering in the Indian Context
Gradient Surgery for Safe LLM Fine-Tuning
Omni-SafetyBench: A Benchmark for Safety Evaluation of Audio-Visual Large Language Models
Enhancing Rumor Detection Methods with Propagation Structure Infused Language Model
Prompt Tuning for Few-Shot Continual Learning Named Entity Recognition
The 2D+ Dynamic Articulatory Model DYNARTmo: Tongue-Palate Contact Area Estimation
ARCE: Augmented RoBERTa with Contextualized Elucidations for NER in Automated Rule Checking
CCFQA: A Benchmark for Cross-Lingual and Cross-Modal Speech and Text Factuality Evaluation
Think Before You Talk: Enhancing Meaningful Dialogue Generation in Full-Duplex Speech Language Models with Planning-Inspired Text Guidance
Let's Revise Step-by-Step: A Unified Local Search Framework for Code Generation with LLMs
Positional Biases Shift as Inputs Approach Context Window Limits
Augmenting Bias Detection in LLMs Using Topological Data Analysis
From Trial-and-Error to Improvement: A Systematic Analysis of LLM Exploration Mechanisms in RLVR
Keyword-Centric Prompting for One-Shot Event Detection with Self-Generated Rationale Enhancements
What am I missing here?: Evaluating Large Language Models for Masked Sentence Prediction
Exploring Causal Effect of Social Bias on Faithfulness Hallucinations in Large Language Models
SASST: Leveraging Syntax-Aware Chunking and LLMs for Simultaneous Speech Translation
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
Can You Trick the Grader? Adversarial Persuasion of LLM Judges
Evaluating Compositional Approaches for Focus and Sentiment Analysis
Evaluating Large Language Models as Expert Annotators
LLMs for Law: Evaluating Legal-Specific LLMs on Contract Understanding
Large Language Models for Czech Aspect-Based Sentiment Analysis
Few-shot Cross-lingual Aspect-Based Sentiment Analysis with Sequence-to-Sequence Models
Tailored Emotional LLM-Supporter: Enhancing Cultural Sensitivity
Challenges and opportunities in portraying emotion in generated sign language
Expert Preference-based Evaluation of Automated Related Work Generation
Large Language Models for Subjective Language Understanding: A Survey
Toward Machine Interpreting: Lessons from Human Interpreting Studies
Understanding Syntactic Generalization in Structure-inducing Language Models
The Medical Metaphors Corpus (MCC)
WideSearch: Benchmarking Agentic Broad Info-Seeking
Progressive Depth Up-scaling via Optimal Transport
9th Workshop on Sign Language Translation and Avatar Technologies (SLTAT 2025)
Iterative refinement, not training objective, makes HuBERT behave differently from wav2vec 2.0
Czech Dataset for Complex Aspect-Based Sentiment Analysis Tasks
Data-Efficient Biomedical In-Context Learning: A Diversity-Enhanced Submodular Perspective
REX-RAG: Reasoning Exploration with Policy Correction in Retrieval-Augmented Generation
Efficient Speculative Decoding for Llama at Scale: Challenges and Solutions
Exploring Safety Alignment Evaluation of LLMs in Chinese Mental Health Dialogues via LLM-as-Judge
Jinx: Unlimited LLMs for Probing Alignment Failures
Towards Real-World Rumor Detection: Anomaly Detection Framework with Graph Supervised Contrastive Learning
PrLM: Learning Explicit Reasoning for Personalized RAG via Contrastive Reward Optimization
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent
Train It and Forget It: Merge Lists are Unnecessary for BPE Inference in Language Models
Measuring Stereotype and Deviation Biases in Large Language Models
Testing the Limits of Machine Translation from One Book
SEVADE: Self-Evolving Multi-Agent Analysis with Decoupled Evaluation for Hallucination-Resistant Irony Detection
Annotating Errors in English Learners' Written Language Production: Advancing Automated Written Feedback Systems
The ReQAP System for Question Answering over Personal Information
Score Before You Speak: Improving Persona Consistency in Dialogue Generation using Response Quality Scores
Two-Stage Quranic QA via Ensemble Retrieval and Instruction-Tuned Answer Extraction
Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models
Gradient Descent Finds Over-Parameterized Neural Networks with Sharp Generalization for Nonparametric Regression
TDDBench: A Benchmark for Training data detection
Quantum Policy Gradient in Reproducing Kernel Hilbert Space
Pairwise Markov Chains for Volatility Forecasting
Reconstruction of boosted and resolved multi-Higgs-boson events with symmetry-preserving attention networks
Mamba-based Deep Learning Approach for Sleep Staging on a Wireless Multimodal Wearable System without Electroencephalography
Mean-Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
Accurate and thermodynamically consistent hydrogen equation of state for planetary modeling with flow matching
MatCLIP: Light- and Shape-Insensitive Assignment of PBR Material Models
ScaffoldGPT: A Scaffold-based GPT Model for Drug Optimization
A Practical Introduction to Kernel Discrepancies: MMD, HSIC & KSD
Exploration of Hepatitis B Virus Infection Dynamics through Physics-Informed Deep Learning Approach
EDiT: Efficient Diffusion Transformers with Linear Compressed Attention
Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering
How Relevance Emerges: Interpreting LoRA Fine-Tuning in Reranking LLMs
TFMPathy: Tabular Foundation Model for Privacy-Aware, Generalisable Empathy Detection from Videos
Exploring Video-Based Driver Activity Recognition under Noisy Labels
Interactive Imitation Learning for Dexterous Robotic Manipulation: Challenges and Perspectives -- A Survey
SVarM: Linear Support Varifold Machines for Classification and Regression on Geometric Data
OmniFluids: Physics Pre-trained Modeling of Fluid Dynamics
Inference-Time Gaze Refinement for Micro-Expression Recognition: Enhancing Event-Based Eye Tracking with Motion-Aware Post-Processing
Position: Certified Robustness Does Not (Yet) Imply Model Security
Coupled Entropy: A Goldilocks Generalization for Complex Systems
Optimal and Practical Batched Linear Bandit Algorithm
Phase transition of the Sinkhorn-Knopp algorithm
From Time-series Generation, Model Selection to Transfer Learning: A Comparative Review of Pixel-wise Approaches for Large-scale Crop Mapping
Wasserstein Barycenter Soft Actor-Critic
PAE MobiLLM: Privacy-Aware and Efficient LLM Fine-Tuning on the Mobile Device via Additive Side-Tuning
ZClassifier: Temperature Tuning and Manifold Approximation via KL Divergence on Logit Space
Accurate Measles Rash Detection via Vision Transformer Fine-Tuning
Highly Fast Text Segmentation With Pairwise Markov Chains
Online Learning and Optimization for Queues with Unknown Demand Curve and Service Distribution
Empathy Detection from Text, Audiovisual, Audio or Physiological Signals: A Systematic Review of Task Formulations and Machine Learning Methods
A Deep Learning Based Resource Allocator for Communication Networks with Dynamic User Utility Demands
Training 3D ResNets to Extract BSM Physics Parameters from Simulated Data
Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains
A variational Bayes approach to debiased inference for low-dimensional parameters in high-dimensional linear regression
RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design
Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python
FQGA-single: Towards Fewer Training Epochs and Fewer Model Parameters for Image-to-Image Translation Tasks
Quantum-data-driven dynamical transition in quantum learning
FlatQuant: Flatness Matters for LLM Quantization
Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts
Tensor Decomposition with Unaligned Observations
MS-IMAP -- A Multi-Scale Graph Embedding Approach for Interpretable Manifold Learning
ADAM-SINDy: An Efficient Optimization Framework for Parameterized Nonlinear Dynamical System Identification
An information-matching approach to optimal experimental design and active learning
sbi reloaded: a toolkit for simulation-based inference workflows
$\ell_0$-Regularized Quadratic Surface Support Vector Machines
chebgreen: Learning and Interpolating Continuous Empirical Green's Functions from Data
Covering Multiple Objectives with a Small Set of Solutions Using Bayesian Optimization
On the Emergence of Position Bias in Transformers
Chaos into Order: Neural Framework for Expected Value Estimation of Stochastic Partial Differential Equations
Active Learning of Model Discrepancy with Bayesian Experimental Design
Optimistic Interior Point Methods for Sequential Hypothesis Testing by Betting
Active Advantage-Aligned Online Reinforcement Learning with Offline Data
Fenchel-Young Variational Learning
On the Duality between Gradient Transformations and Adapters
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
Real-Time Moving Flock Detection in Pedestrian Trajectories Using Sequential Deep Learning Models
Robustness to Geographic Distribution Shift Using Location Encoders
Average-DICE: Stationary Distribution Correction by Regression
Gradient Extrapolation for Debiased Representation Learning
Empirical Analysis of Privacy-Fairness-Accuracy Trade-offs in Federated Learning: A Step Towards Responsible AI
Uncertainty propagation in feed-forward neural network models
Model-Agnostic Policy Explanations with Large Language Models
Resource-efficient Inference with Foundation Model Programs
Self-Supervised Autoencoder Network for Robust Heart Rate Extraction from Noisy Photoplethysmogram: Applying Blind Source Separation to Biosignal Analysis
Time Marching Neural Operator FE Coupling: AI Accelerated Physics Modeling
CAOTE: KV Cache Eviction for LLMs via Attention Output Error-Based Token Selection
Unveiling 3D Ocean Biogeochemical Provinces in the North Atlantic: A Systematic Comparison and Validation of Clustering Methods
DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering
Forecasting at Full Spectrum: Holistic Multi-Granular Traffic Modeling under High-Throughput Inference Regimes
FedSDAF: Leveraging Source Domain Awareness for Enhanced Federated Domain Generalization
mAIstro: an open-source multi-agentic system for automated end-to-end development of radiomics and deep learning models for medical imaging
A Square Peg in a Square Hole: Meta-Expert for Long-Tailed Semi-Supervised Learning
Learning to Reason without External Rewards
Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents
MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection
Machine Learning Algorithms for Improving Exact Classical Solvers in Mixed Integer Continuous Optimization
Model-Agnostic Sentiment Distribution Stability Analysis for Robust LLM-Generated Texts Detection
Explainable AI for Curie Temperature Prediction in Magnetic Materials
TerraMAE: Learning Spatial-Spectral Representations from Hyperspectral Earth Observation Data via Adaptive Masked Autoencoders
Statistical Inference for Autoencoder-based Anomaly Detection after Representation Learning-based Domain Adaptation
Taking the Garbage Out of Data-Driven Prediction Across Climate Timescales
Reconstruction of Solar EUV Irradiance Using CaII K Images and SOHO/SEM Data with Bayesian Deep Learning and Uncertainty Quantification
Membership Inference Attacks with False Discovery Rate Control
SEF-MK: Speaker-Embedding-Free Voice Anonymization through Multi-k-means Quantization
QuProFS: An Evolutionary Training-free Approach to Efficient Quantum Feature Map Search
AugLift: Boosting Generalization in Lifting-based 3D Human Pose Estimation
Sensory robustness through top-down feedback and neural stochasticity in recurrent vision models
How Does a Deep Neural Network Look at Lexical Stress?
BIGBOY1.2: Generating Realistic Synthetic Data for Disease Outbreak Modelling and Analytics
Channel Charting in Smart Radio Environments
Nonparametric Reaction Coordinate Optimization with Histories: A Framework for Rare Event Dynamics
Event-Aware Sentiment Factors from LLM-Augmented Financial Tweets: A Transparent Framework for Interpretable Quant Trading
Grounding Multilingual Multimodal LLMs With Cultural Knowledge
Statistical Theory of Multi-stage Newton Iteration Algorithm for Online Continual Learning
Structured Superposition of Autoencoders for UEP Codes at Intermediate Blocklengths
Commentary Generation for Soccer Highlights
Barron Space Representations for Elliptic PDEs with Homogeneous Boundary Conditions
Exploiting Layer Normalization Fine-tuning in Visual Transformer Foundation Models for Classification
Generative Inversion for Property-Targeted Materials Design: Application to Shape Memory Alloys
G-IFT: A Gated Linear Unit adapter with Iterative Fine-Tuning for Low-Resource Children's Speaker Verification
Recommendation Is a Dish Better Served Warm
Being-M0.5: A Real-Time Controllable Vision-Language-Motion Model
Unequal Uncertainty: Rethinking Algorithmic Interventions for Mitigating Discrimination from AI
EFU: Enforcing Federated Unlearning via Functional Encryption
Stochastic dynamics learning with state-space systems
Meta Off-Policy Estimation
Safeguarding Generative AI Applications in Preclinical Imaging through Hybrid Anomaly Detection
Gaussian Approximation for Two-Timescale Linear Stochastic Approximation
Frequency-Domain Analysis of Time-Dependent Multiomic Data in Progressive Neurodegenerative Diseases: A Proposed Quantum-Classical Hybrid Approach with Quaternionic Extensions
Adaptive Source-Channel Coding for Semantic Communications
Likelihood Ratio Tests by Kernel Gaussian Embedding
Sharper Perturbed-Kullback-Leibler Exponential Tail Bounds for Beta and Dirichlet Distributions
Prediction error certification for PINNs: Theory, computation, and application to Stokes flow
Optimizing Federated Learning for Scalable Power-demand Forecasting in Microgrids
Robust Anomaly Detection in O-RAN: Leveraging LLMs against Data Manipulation Attacks
PrIINeR: Towards Prior-Informed Implicit Neural Representations for Accelerated MRI
MDD-Net: Multimodal Depression Detection through Mutual Transformer
Assessing LLM Text Detection in Educational Contexts: Does Human Contribution Affect Detection?
An effective potential for generative modelling with active matter
Integrating Task-Specific and Universal Adapters for Pre-Trained Model-based Class-Incremental Learning
Adaptive Learning for IRS-Assisted Wireless Networks: Securing Opportunistic Communications Against Byzantine Eavesdroppers
Fed-TGAN: Federated Learning Framework for Synthesizing Tabular Data
SOInter: A Novel Deep Energy Based Interpretation Method for Explaining Structured Output Models
AdaBoost is not an Optimal Weak to Strong Learner
Optimal Multi-Distribution Learning
Monte Carlo with kernel-based Gibbs measures: Guarantees for probabilistic herding
Finite-Time Convergence Analysis of ODE-based Generative Models for Stochastic Interpolants
Intrinsic training dynamics of deep neural networks
Tight Bounds for Schrödinger Potential Estimation in Unpaired Image-to-Image Translation Problems
Parity Requires Unified Input Dependence and Negative Eigenvalues in SSMs
Efficient Reward Identification In Max Entropy Reinforcement Learning with Sparsity and Rank Priors
Unsupervised operator learning approach for dissipative equations via Onsager principle
Towards Unveiling Predictive Uncertainty Vulnerabilities in the Context of the Right to Be Forgotten
MOTGNN: Interpretable Graph Neural Networks for Multi-Omics Disease Classification
Online Convex Optimization with Heavy Tails: Old Algorithms, New Regrets, and Applications
N-BEATS-MOE: N-BEATS with a Mixture-of-Experts Layer for Heterogeneous Time Series Forecasting
Enhancing Privacy in Decentralized Min-Max Optimization: A Differentially Private Approach
FairDRL-ST: Disentangled Representation Learning for Fair Spatio-Temporal Mobility Prediction
Physics-Informed Multimodal Bearing Fault Classification under Variable Operating Conditions using Transfer Learning
Multimodal Remote Inference
When and how can inexact generative models still sample from the data manifold?
Extracting Complex Topology from Multivariate Functional Approximation: Contours, Jacobi Sets, and Ridge-Valley Graphs
Beyond Single: A Data Selection Principle for LLM Alignment via Fine-Grained Preference Signals
Multi-Turn Jailbreaks Are Simpler Than They Seem
Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation
Multi-Hop Privacy Propagation for Differentially Private Federated Learning in Social Networks
Semantic-Enhanced Time-Series Forecasting via Large Language Models
Detecting Mislabeled and Corrupted Data via Pointwise Mutual Information
Robust Reinforcement Learning over Wireless Networks with Homomorphic State Representations
Separation and Collaboration: Two-Level Routing Grouped Mixture-of-Experts for Multi-Domain Continual Learning
A Tutorial: An Intuitive Explanation of Offline Reinforcement Learning Theory
Topological Feature Compression for Molecular Graph Neural Networks
EvoCoT: Overcoming the Exploration Bottleneck in Reinforcement Learning
Learning Satellite Attitude Dynamics with Physics-Informed Normalising Flow
Score Augmentation for Diffusion Models
Adaptive Fine-Tuning via Pattern Specialization for Deep Time Series Forecasting
Shapley-Inspired Feature Weighting in $k$-means with No Additional Hyperparameters
A Physics-informed Deep Operator for Real-Time Freeway Traffic State Estimation
Communication-Efficient Zero-Order and First-Order Federated Learning Methods over Wireless Networks
Deep Learning-Based Analysis of Power Consumption in Gasoline, Electric, and Hybrid Vehicles
From Source to Target: Leveraging Transfer Learning for Predictive Process Monitoring in Organizations
ELF: Efficient Logic Synthesis by Pruning Redundancy in Refactoring
Symbolic Quantile Regression for the Interpretable Prediction of Conditional Quantiles
Fast and Generalizable parameter-embedded Neural Operators for Lithium-Ion Battery Simulation
NeuroDx-LM: A Clinical Large-Scale Model for EEG-based Neurological Disorder Detection
OFAL: An Oracle-Free Active Learning Framework
FairFLRep: Fairness aware fault localization and repair of Deep Neural Networks
Federated Learning for Epileptic Seizure Prediction Across Heterogeneous EEG Datasets
Cross-Subject and Cross-Montage EEG Transfer Learning via Individual Tangent Space Alignment and Spatial-Riemannian Feature Fusion
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
Do Streetscapes Still Matter for Customer Ratings of Eating and Drinking Establishments in Car-Dependent Cities?
RMT-PPAD: Real-time Multi-task Learning for Panoptic Perception in Autonomous Driving
What Makes "Good" Distractors for Object Hallucination Evaluation in Large Vision-Language Models?
Transfer Learning with EfficientNet for Accurate Leukemia Cell Classification
Generative Bid Shading in Real-Time Bidding Advertising
Age-Diverse Deepfake Dataset: Bridging the Age Gap in Deepfake Detection
From Label Error Detection to Correction: A Modular Framework and Benchmark for Object Detection Datasets
Communication-Learning Co-Design for Differentially Private Over-the-Air Federated Distillation
On the effectiveness of multimodal privileged knowledge distillation in two vision transformer based diagnostic applications
Bridging Brain Connectomes and Clinical Reports for Early Alzheimer's Disease Diagnosis
ImpliHateVid: A Benchmark Dataset and Two-stage Contrastive Learning Framework for Implicit Hate Speech Detection in Videos
Benchmarking Self-Driving Labs
Federated Online Learning for Heterogeneous Multisource Streaming Data
Machines Learn Number Fields, But How? The Case of Galois Groups
Role of Large Language Models and Retrieval-Augmented Generation for Accelerating Crystalline Material Discovery: A Systematic Review
A Tight Lower Bound for the Approximation Guarantee of Higher-Order Singular Value Decomposition
ClimateSOM: A Visual Analysis Workflow for Climate Ensemble Datasets
Mitigating Distribution Shift in Graph-Based Android Malware Classification via Function Metadata and LLM Embeddings
Story Ribbons: Reimagining Storyline Visualizations with Large Language Models
A Score-based Diffusion Model Approach for Adaptive Learning of Stochastic Partial Differential Equation Solutions
MOCA-HESP: Meta High-dimensional Bayesian Optimization for Combinatorial and Mixed Spaces via Hyper-ellipsoid Partitioning
Energy Efficient Task Offloading in UAV-Enabled MEC Using a Fully Decentralized Deep Reinforcement Learning Approach
Text to Speech System for Meitei Mayek Script
Near-Optimal Convergence of Accelerated Gradient Methods under Generalized and $(L_0, L_1)$-Smoothness
BrainATCL: Adaptive Temporal Brain Connectivity Learning for Functional Link Prediction and Age Estimation
Approaching Maximal Information Extraction in Low-Signal Regimes via Multiple Instance Learning
From Nodes to Narratives: Explaining Graph Neural Networks with LLMs and Graph Context
Multi-Level Service Performance Forecasting via Spatiotemporal Graph Neural Networks
How Effectively Can Large Language Models Connect SNP Variants and ECG Phenotypes for Cardiovascular Risk Prediction?
A Globally Optimal Analytic Solution for Semi-Nonnegative Matrix Factorization with Nonnegative or Mixed Inputs
Strategic Incentivization for Locally Differentially Private Federated Learning
Policy Newton methods for Distortion Riskmetrics
PySeizure: A single machine learning classifier framework to detect seizures in diverse datasets
Self-Organizing Survival Manifolds: A Theory for Unsupervised Discovery of Prognostic Structures in Biological Systems
Semi-Supervised Supply Chain Fraud Detection with Unsupervised Pre-Filtering
GFlowNets for Learning Better Drug-Drug Interaction Representations
Hypergraph Neural Network with State Space Models for Node Classification
Local Diffusion Models and Phases of Data Distributions
Learning to Forget with Information Divergence Reweighted Objectives for Noisy Labels
Privacy-Preserving Tabular Synthetic Data Generation Using TabularARGN
Transferring Social Network Knowledge from Multiple GNN Teachers to Kolmogorov-Arnold Networks
Watermarking Kolmogorov-Arnold Networks for Emerging Networked Applications via Activation Perturbation
Stabilizing Federated Learning under Extreme Heterogeneity with HeteRo-Select
CISO: Species Distribution Modeling Conditioned on Incomplete Species Observations
Fed MobiLLM: Efficient Federated LLM Fine-Tuning over Heterogeneous Mobile Devices via Server Assisted Side-Tuning
Technical Report: Full-Stack Fine-Tuning for the Q Programming Language
Conformal Prediction and Trustworthy AI
QuiZSF: An efficient data-model interaction framework for zero-shot time-series forecasting
BoRA: Towards More Expressive Low-Rank Adaptation with Block Diversity
Structure-Preserving Digital Twins via Conditional Neural Whitney Forms
Discovery Learning accelerates battery design evaluation
UniMove: A Unified Model for Multi-city Human Mobility Prediction
A Comparative Study of Feature Selection in Tsetlin Machines
TLCCSP: A Scalable Framework for Enhancing Time Series Forecasting with Time-Lagged Cross-Correlations
A Stage-Aware Mixture of Experts Framework for Neurodegenerative Disease Progression Modelling
Differentiable Adaptive Kalman Filtering via Optimal Transport
Improving Real-Time Concept Drift Detection using a Hybrid Transformer-Autoencoder Framework
RAPNet: A Receptive-Field Adaptive Convolutional Neural Network for Pansharpening
SystolicAttention: Fusing FlashAttention within a Single Systolic Array
Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?
El Agente: An Autonomous Agent for Quantum Chemistry
Reasoning Capabilities of Large Language Models on Dynamic Tasks
Identification of Probabilities of Causation: A Complete Characterization
SMART: Self-Generating and Self-Validating Multi-Dimensional Assessment for LLMs' Mathematical Problem Solving
Personalized Constitutionally-Aligned Agentic Superego: Secure AI Behavior Aligned to Diverse Human Values
Reinforcement Learning for Hybrid Charging Stations Planning and Operation Considering Fixed and Mobile Chargers
Efficient Contextual Preferential Bayesian Optimization with Historical Examples
Active Policy Improvement from Multiple Black-box Oracles
Blending Imitation and Reinforcement Learning for Robust Policy Improvement
Deep Neural Networks Can Learn Generalizable Same-Different Visual Relations
ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization
Sparse Variational Student-t Processes
SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection
Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring
On the Sample Efficiency of Abstractions and Potential-Based Reward Shaping in Reinforcement Learning
Runtime Monitoring and Enforcement of Conditional Fairness in Generative AIs
Fractured Glass, Failing Cameras: Simulating Physics-Based Adversarial Samples for Autonomous Driving Systems
From Spikes to Heavy Tails: Unveiling the Spectral Evolution of Neural Networks
LVBench: An Extreme Long Video Understanding Benchmark
AI-AI Bias: large language models favor communications generated by large language models
A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
Chain of Thought Still Thinks Fast: APriCoT Helps with Thinking Slow
Reward-Directed Score-Based Diffusion Models via q-Learning
EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping
In-Situ Fine-Tuning of Wildlife Models in IoT-Enabled Camera Traps for Efficient Adaptation
A Closer Look at Machine Unlearning for Large Language Models
Exploring Spatial Representation to Enhance LLM Reasoning in Aerial Vision-Language Navigation
EfficientEQA: An Efficient Approach to Open-Vocabulary Embodied Question Answering
Zero-Shot Voice Conversion via Content-Aware Timbre Ensemble and Conditional Flow Matching
Steering AI-Driven Personalization of Scientific Text for General Audiences
Understanding and Mitigating Memorization in Generative Models via Sharpness of Probability Landscapes
LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation
B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens
POEX: Towards Policy Executable Jailbreak Attacks Against the LLM-based Robots
MomentMix Augmentation with Length-Aware DETR for Temporally Robust Moment Retrieval
Toward Intelligent and Secure Cloud: Large Language Model Empowered Proactive Defense
WebWalker: Benchmarking LLMs in Web Traversal
Ehrenfeucht-Haussler Rank and Chain of Thought
Softplus Attention with Re-weighting Boosts Length Extrapolation in Large Language Models
Improving Your Model Ranking on Chatbot Arena by Vote Rigging
Mitigating Traffic Oscillations in Mixed Traffic Flow with Scalable Deep Koopman Predictive Control
MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization
Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System
Predicting Depression in Screening Interviews from Interactive Multi-Theme Collaboration
Advancing AI-Powered Medical Image Synthesis: Insights from MedVQA-GI Challenge Using CLIP, Fine-Tuned Stable Diffusion, and Dream-Booth + LoRA
Collective Reasoning Among LLMs: A Framework for Answer Validation Without Ground Truth
ElementaryNet: A Non-Strategic Neural Network for Predicting Human Behavior in Normal-Form Games
From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
FunGraph: Functionality Aware 3D Scene Graphs for Language-Prompted Scene Interaction
A Theory of Learning with Autoregressive Chain of Thought
Learning Adaptive Dexterous Grasping from Single Demonstrations
Learning 3D-Gaussian Simulators from RGB Videos
$\mu$KE: Matryoshka Unstructured Knowledge Editing of Large Language Models
How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence
Bidirectional Hierarchical Protein Multi-Modal Representation Learning
Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation
A Multimodal Deep Learning Approach for White Matter Shape Prediction in Diffusion MRI Tractography
Can LLM-based Financial Investing Strategies Outperform the Market in Long Run?
Uniform Loss vs. Specialized Optimization: A Comparative Analysis in Multi-Task Learning
RIDGECUT: Learning Graph Partitioning with Rings and Wedges
Extracting Probabilistic Knowledge from Large Language Models for Bayesian Network Parameterization
Improving LLM Outputs Against Jailbreak Attacks with Expert Model Integration
FP4 All the Way: Fully Quantized Training of LLMs
CADRE: Customizable Assurance of Data Readiness in Privacy-Preserving Federated Learning
Verbal Werewolf: Engage Users with Verbalized Agentic Werewolf Game Framework
HERGC: Heterogeneous Experts Representation and Generative Completion for Multimodal Knowledge Graphs
Winner-takes-all for Multivariate Probabilistic Time Series Forecasting
MLOps with Microservices: A Case Study on the Maritime Domain
Physics-Informed Teleconnection-Aware Transformer for Global Subseasonal-to-Seasonal Forecasting
A Two-stage Optimization Method for Wide-range Single-electron Quantum Magnetic Sensing
MMET: A Multi-Input and Multi-Scale Transformer for Efficient PDEs Solving
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving
Granular-Ball-Induced Multiple Kernel K-Means
Robust Behavior Cloning Via Global Lipschitz Regularization
Robust Anomaly Detection in Network Traffic: Evaluating Machine Learning Models on CICIDS2017
CycleDistill: Bootstrapping Machine Translation using LLMs with Cyclical Distillation
Exploring Adapter Design Tradeoffs for Low Resource Music Generation
ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation
Probabilistic Optimality for Inference-time Scaling
Foundation versus Domain-specific Models: Performance Comparison, Fusion, and Explainability in Face Recognition
Addressing The Devastating Effects Of Single-Task Data Poisoning In Exemplar-Free Continual Learning
LIRA: Inferring Segmentation in Large Multi-modal Models with Local Interleaved Region Assistance
Speckle2Self: Self-Supervised Ultrasound Speckle Reduction Without Clean Data
Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching
Towards Theoretical Understanding of Transformer Test-Time Computing: Investigation on In-Context Linear Regression
IBPS: Indian Bail Prediction System
ShoulderShot: Generating Over-the-Shoulder Dialogue Videos
On the Limits of Selective AI Prediction: A Case Study in Clinical Decision Making
SOFA: Deep Learning Framework for Simulating and Optimizing Atrial Fibrillation Ablation
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
InterChart: Benchmarking Visual Reasoning Across Decomposed and Distributed Chart Information
Efficient Approximate Posterior Sampling with Annealed Langevin Monte Carlo
Attribution Explanations for Deep Neural Networks: A Theoretical Perspective
Grasp-HGN: Grasping the Unexpected
Discovering Spatial Correlations between Earth Observations in Global Atmospheric State Estimation by using Adaptive Graph Structure Learning
GLiClass: Generalist Lightweight Model for Sequence Classification Tasks
AIS-LLM: A Unified Framework for Maritime Trajectory Prediction, Anomaly Detection, and Collision Risk Assessment with Explainable Forecasting
MORE-CLEAR: Multimodal Offline Reinforcement learning for Clinical notes Leveraged Enhanced State Representation
TAR-TVG: Enhancing VLMs with Timestamp Anchor-Constrained Reasoning for Temporal Video Grounding
LoSemB: Logic-Guided Semantic Bridging for Inductive Tool Retrieval
Energy Consumption in Parallel Neural Network Training
Training-Free ANN-to-SNN Conversion for High-Performance Spiking Transformer
DoorDet: Semi-Automated Multi-Class Door Detection Dataset via Object Detection and Large Language Models
CognitiveArm: Enabling Real-Time EEG-Controlled Prosthetic Arm Using Embodied Machine Learning
A Rule-Based Approach to Specifying Preferences over Conflicting Facts and Querying Inconsistent Knowledge Bases
Chimera: Harnessing Multi-Agent LLMs for Automatic Insider Threat Simulation
Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment
Sparse Probabilistic Graph Circuits
UniSVG: A Unified Dataset for Vector Graphic Understanding and Generation with Multimodal Large Language Models
Pareto Multi-Objective Alignment for Language Models
PCA-Guided Autoencoding for Structured Dimensionality Reduction in Active Infrared Thermography
MIND: A Noise-Adaptive Denoising Framework for Medical Images Integrating Multi-Scale Transformer
Architectural Co-Design for Zero-Shot Anomaly Detection: Decoupling Representation and Dynamically Fusing Features in CLIP
Auditory Intelligence: Understanding the World Through Sound
DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts
Deep Space Weather Model: Long-Range Solar Flare Prediction from Multi-Wavelength Images
Vertex Features for Neural Global Illumination
Towards Human-AI Collaboration System for the Detection of Invasive Ductal Carcinoma in Histopathology Images
Selective Contrastive Learning for Weakly Supervised Affordance Grounding
Autonomous Navigation of Cloud-Controlled Quadcopters in Confined Spaces Using Multi-Modal Perception and LLM-Driven High Semantic Reasoning
Not Yet AlphaFold for the Mind: Evaluating Centaur as a Synthetic Participant
NeeCo: Image Synthesis of Novel Instrument States Based on Dynamic and Deformable 3D Gaussian Reconstruction
Diffusing the Blind Spot: Uterine MRI Synthesis with Diffusion Models
SCDF: A Speaker Characteristics DeepFake Speech Dataset for Bias Analysis
Exploring the Challenges and Opportunities of AI-assisted Codebase Generation
WeChat-YATT: A Simple, Scalable and Balanced RLHF Trainer
Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
DIVER: A Multi-Stage Approach for Reasoning-intensive Information Retrieval
Learning to Select MCP Algorithms: From Traditional ML to Dual-Channel GAT-MLP
Advancing Knowledge Tracing by Exploring Follow-up Performance Trends
Bridging ASR and LLMs for Dysarthric Speech Recognition: Benchmarking Self-Supervised and Generative Approaches
Exploring Strategies for Personalized Radiation Therapy: Part III Identifying genetic determinants for Radiation Response with Meta Learning
BadPromptFL: A Novel Backdoor Threat to Prompt-based Federated Learning in Multimodal Models
Multi-modal Adaptive Mixture of Experts for Cold-start Recommendation
Rethinking Self-Replication: Detecting Distributed Selfhood in the Outlier Cellular Automaton
On Understanding of the Dynamics of Model Capacity in Continual Learning
Investigating the Design Space of Visual Grounding in Multimodal Large Language Model
C-MAG: Cascade Multimodal Attributed Graphs for Supply Chain Link Prediction
HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches
Growing Reservoirs with Developmental Graph Cellular Automata
Dual Information Speech Language Models for Emotional Conversations
Grid2Guide: A* Enabled Small Language Model for Indoor Navigation
ChatGPT on the Road: Leveraging Large Language Model-Powered In-vehicle Conversational Agents for Safer and More Enjoyable Driving Experience
Hyperspectral Imaging
GRASPTrack: Geometry-Reasoned Association via Segmentation and Projection for Multi-Object Tracking
Vision-Based Localization and LLM-based Navigation for Indoor Environments
MemoryKT: An Integrative Memory-and-Forgetting Method for Knowledge Tracing
Optimal Transport Regularization for Speech Text Alignment in Spoken Language Models
MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation
Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models
COMponent-Aware Pruning for Accelerated Control Tasks in Latent Space Models
Can AI Explanations Make You Change Your Mind?
LPI-RIT at LeWiDi-2025: Improving Distributional Predictions via Metadata and Loss Reweighting with DisCo
PyVeritas: On Verifying Python via LLM-Based Transpilation and Bounded Model Checking for C
Neural Logic Networks for Interpretable Classification
MedReasoner: Reinforcement Learning Drives Reasoning Grounding from Clinical Thought to Pixel-Level Precision
RedDino: A foundation model for red blood cell analysis
Street-Level AI: Are Large Language Models Ready for Real-World Judgments?
Human-Alignment and Calibration of Inference-Time Uncertainty in Large Language Models
SAEMark: Multi-bit LLM Watermarking with Inference-Time Scaling
Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent
Capabilities of GPT-5 on Multimodal Medical Reasoning
OMGSR: You Only Need One Mid-timestep Guidance for Real-World Image Super-Resolution
LL3M: Large Language 3D Modelers
VGGSounder: Audio-Visual Evaluations for Foundation Models
Cut2Next: Generating Next Shot via In-Context Tuning
Sortability of Time Series Data
Learning How to Vote with Principles: Axiomatic Insights Into the Collective Decisions of Neural Networks
Graph-Powered Defense: Controller Area Network Intrusion Detection for Unmanned Aerial Vehicles
A Research Agenda for Usability and Generalisation in Reinforcement Learning
Observation Interference in Partially Observable Assistance Games
Aligning Instruction Tuning with Pre-training
Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models
Reviewing Clinical Knowledge in Medical Large Language Models: Training and Beyond
Instructor-Worker Large Language Model System for Policy Recommendation: a Case Study on Air Quality Analysis of the January 2025 Los Angeles Wildfires
A Planning Compilation to Reason about Goal Achievement at Planning Time
Extracting Overlapping Microservices from Monolithic Code via Deep Semantic Embeddings and Graph Neural Network-Based Soft Clustering
From Product Hilbert Spaces to the Generalized Koopman Operator and the Nonlinear Fundamental Lemma
VA-Blueprint: Uncovering Building Blocks for Visual Analytics System Design
Intersectoral Knowledge in AI and Urban Studies: A Framework for Transdisciplinary Research
From Field to Drone: Domain Drift Tolerant Automated Multi-Species and Damage Plant Semantic Segmentation for Herbicide Trials
Word Clouds as Common Voices: LLM-Assisted Visualization of Participant-Weighted Themes in Qualitative Interviews
Conversational DNA: A New Visual Language for Understanding Dialogue Structure in Human and AI
A DICOM Image De-identification Algorithm in the MIDI-B Challenge
Uncertainty-Driven Reliability: Selective Prediction and Trustworthy Deployment in Modern Machine Learning
A Small-footprint Acoustic Echo Cancellation Solution for Mobile Full-Duplex Speech Interactions
Retrieval-Augmented Multi-Agent System for Rapid Statement of Work Generation
Offline-to-Online Reinforcement Learning with Classifier-Free Diffusion Generation
Anatomy of a Machine Learning Ecosystem: 2 Million Models on Hugging Face
Who's the Evil Twin? Differential Auditing for Undesired Behavior
Highlight All the Phrases: Enhancing LLM Transparency through Visual Factuality Indicators
Towards Experience-Centered AI: A Framework for Integrating Lived Experience in Design and Development
AGIC: Attention-Guided Image Captioning to Improve Caption Relevance
VSI: Visual Subtitle Integration for Keyframe Selection to enhance Long Video Understanding
Sparsity-Driven Plasticity in Multi-Task Reinforcement Learning
ESNERA: Empirical and semantic named entity alignment for named entity dataset merging
NS-FPN: Improving Infrared Small Target Detection and Segmentation from Noise Suppression Perspective
Maestro-EVC: Controllable Emotional Voice Conversion Guided by References and Explicit Prosody
BASIC: Boosting Visual Alignment with Intrinsic Refined Embeddings in Multimodal Large Language Models
Advancements in Chinese font generation since deep learning era: A survey
MMReID-Bench: Unleashing the Power of MLLMs for Effective and Versatile Person Re-identification
CROP: Integrating Topological and Spatial Structures via Cross-View Prefixes for Molecular LLMs
CannyEdit: Selective Canny Control and Dual-Prompt Guidance for Training-Free Image Editing
CLAP: Coreference-Linked Augmentation for Passage Retrieval
When Prompt Engineering Meets Software Engineering: CNL-P as Natural and Robust "APIs" for Human-AI Interaction
Class Unbiasing for Generalization in Medical Diagnosis
AMFT: Aligning LLM Reasoners by Meta-Learning the Optimal Imitation-Exploration Balance
Neural Beam Field for Spatial Beam RSRP Prediction
Beyond Frequency: Seeing Subtle Cues Through the Lens of Spatial Decomposition for Fine-Grained Visual Classification
Can Multitask Learning Enhance Model Explainability?
WeatherDiffusion: Weather-Guided Diffusion Model for Forward and Inverse Rendering
Conformal Set-based Human-AI Complementarity with Multiple Experts
Consensus-based Decentralized Multi-agent Reinforcement Learning for Random Access Network Optimization
Neural Channel Knowledge Map Assisted Scheduling Optimization of Active IRSs in Multi-User Systems
TurboBias: Universal ASR Context-Biasing powered by GPU-accelerated Phrase-Boosting Tree
Making Effective Decisions: Machine Learning and the Ecogame in 1970
From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
Trustworthy Medical Imaging with Large Language Models: A Study of Hallucinations Across Modalities
Balancing Privacy and Efficiency: Music Information Retrieval via Additive Homomorphic Encryption
Whisfusion: Parallel ASR Decoding via a Diffusion Transformer
ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability
Membership and Memorization in LLM Knowledge Distillation
SEADialogues: A Multilingual Culturally Grounded Multi-turn Dialogue Dataset on Southeast Asian Languages
Surgical Knowledge Rewrite in Compact LLMs: An 'Unlearn-then-Learn' Strategy with (IA³) for Localized Factual Modulation and Catastrophic Forgetting Mitigation
Model Predictive Control for Crowd Navigation via Learning-Based Trajectory Prediction
An Evolutionary Game-Theoretic Merging Decision-Making Considering Social Acceptance for Autonomous Driving
SQL-Exchange: Transforming SQL Queries Across Domains
Hide or Highlight: Understanding the Impact of Factuality Expression on User Trust
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Towards High-Order Mean Flow Generative Models: Feasibility, Expressivity, and Provably Efficient Criteria
Investigating Intersectional Bias in Large Language Models using Confidence Disparities in Coreference Resolution
Pref-GUIDE: Continual Policy Learning from Real-Time Human Feedback via Preference-Based Learning
Perceptual Evaluation of GANs and Diffusion Models for Generating X-rays
Toward AI Matching Policies in Homeless Services: A Qualitative Study with Policymakers
"Draw me a curator" Examining the visual stereotyping of a cultural services profession by generative AI
A Stable and Principled Loss Function for Direct Language Model Alignment
A Real-Time, Self-Tuning Moderator Framework for Adversarial Prompt Detection
SGD Convergence under Stepsize Shrinkage in Low-Precision Training
Fairness of Automatic Speech Recognition: Looking Through a Philosophical Lens
Intention-Aware Diffusion Model for Pedestrian Trajectory Prediction
Integrating Neurosymbolic AI in Advanced Air Mobility: A Comprehensive Survey
Large-scale Multi-sequence Pretraining for Generalizable MRI Analysis in Versatile Clinical Applications
Lightweight Multi-Scale Feature Extraction with Fully Connected LMF Layer for Salient Object Detection
Improved Personalized Headline Generation via Denoising Fake Interests from Implicit Feedback
Schema Lineage Extraction at Scale: Multilingual Pipelines, Composite Evaluation, and Language-Model Benchmarks
Dynamic Benchmark Construction for Evaluating Large Language Models on Real-World Codes
Explainability-in-Action: Enabling Expressive Manipulation and Tacit Understanding by Bending Diffusion Models in ComfyUI
DySK-Attn: A Framework for Efficient, Real-Time Knowledge Updating in Large Language Models via Dynamic Sparse Knowledge Attention
Adapting LLMs to Time Series Forecasting via Temporal Heterogeneity Modeling and Semantic Alignment
Can Smaller Large Language Models Evaluate Research Quality?
Propagation Tree Is Not Deep: Adaptive Graph Contrastive Learning Approach for Rumor Detection
Presburger Functional Synthesis: Complexity and Tractable Normal Forms
What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains
Neural Bridge Processes
LLM-based Agents for Automated Confounder Discovery and Subgroup Analysis in Causal Inference
Selection and Exploitation of High-Quality Knowledge from Large Language Models for Recommendation
EDGE: A Theoretical Framework for Misconception-Aware Adaptive Learning
SocRipple: A Two-Stage Framework for Cold-Start Video Recommendations
Causal Negative Sampling via Diffusion Model for Out-of-Distribution Recommendation
OpenHAIV: A Framework Towards Practical Open-World Learning
Incorporating Contextual Paralinguistic Understanding in Large Speech-Language Models
MAQuA: Adaptive Question-Asking for Multidimensional Mental Health Screening using Item Response Theory
Representation Understanding via Activation Maximization
Fine-Tuning Large Language Models Using EEG Microstate Features for Mental Workload Assessment
"Pull or Not to Pull?'': Investigating Moral Biases in Leading Large Language Models Across Ethical Dilemmas
Revisiting Data Attribution for Influence Functions
When Is Prior Knowledge Helpful? Exploring the Evaluation and Selection of Unsupervised Pretext Tasks from a Neuro-Symbolic Perspective
From Knowledge to Conjectures: A Modal Framework for Reasoning about Hypotheses
DragonFruitQualityNet: A Lightweight Convolutional Neural Network for Real-Time Dragon Fruit Quality Inspection on Mobile Devices
MCITlib: Multimodal Continual Instruction Tuning Library and Benchmark
HealthBranches: Synthesizing Clinically-Grounded Question Answering Datasets via Decision Pathways
FlexCTC: GPU-powered CTC Beam Decoding with advanced Contextual Abilities
ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering
Strategies of Code-switching in Human-Machine Dialogs
Efficient Edge LLMs Deployment via HessianAware Quantization and CPU GPU Collaborative
ProteoKnight: Convolution-based phage virion protein classification and uncertainty analysis
AutoAssert 1: A LoRA Fine-Tuned LLM Model for Efficient Automated Assertion Generation
Urbanite: A Dataflow-Based Framework for Human-AI Interactive Alignment in Urban Visual Analytics
A Spin Glass Characterization of Neural Networks
AgriVLN: Vision-and-Language Navigation for Agricultural Robots
Leveraging GNN to Enhance MEF Method in Predicting ENSO
Real-Time Analysis of Unstructured Data with Machine Learning on Heterogeneous Architectures
Lightning Prediction under Uncertainty: DeepLight with Hazy Loss
Freeze and Reveal: Exposing Modality Bias in Vision-Language Models
Optimizing Districting Plans to Maximize Majority-Minority Districts via IPs and Local Search
Stackelberg Coupling of Online Representation Learning and Reinforcement Learning
Noise-Aware Generative Microscopic Traffic Simulation
ALOPE: Adaptive Layer Optimization for Translation Quality Estimation using Large Language Models
Large Language Models Do Not Simulate Human Psychology
DatasetResearch: Benchmarking Agent Systems for Demand-Driven Dataset Discovery
MASteer: Multi-Agent Adaptive Steer Strategy for End-to-End LLM Trustworthiness Repair
DSperse: A Framework for Targeted Verification in Zero-Knowledge Machine Learning
Simulating Biological Intelligence: Active Inference with Experiment-Informed Generative Model
Efficient and Reliable Hitting-Set Computations for the Implicit Hitting Set Approach
MultiMedEdit: A Scenario-Aware Benchmark for Evaluating Knowledge Editing in Medical VQA
K-Dense Analyst: Towards Fully Automated Scientific Analysis
Towards Safer AI Moderation: Evaluating LLM Moderators Through a Unified Benchmark Dataset and Advocating a Human-First Approach
Designing a Feedback-Driven Decision Support System for Dynamic Student Intervention
Multi-Dimensional Summarization Agents with Context-Aware Reasoning over Enterprise Tables
EndoAgent: A Memory-Guided Reflective Agent for Intelligent Endoscopic Vision-to-Decision Reasoning
Hallucination as a Computational Boundary: A Hierarchy of Inevitability and the Oracle Escape
Rethinking Domain-Specific LLM Benchmark Construction: A Comprehensiveness-Compactness Approach
Pentest-R1: Towards Autonomous Penetration Testing Reasoning Optimized via Two-Stage Reinforcement Learning
Invert4TVG: A Temporal Video Grounding Framework with Inversion Tasks for Enhanced Action Understanding
Generative AI for Strategic Plan Development
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Grounding Natural Language for Multi-agent Decision-Making with Multi-agentic LLMs
CP-Agent: Agentic Constraint Programming
Democratizing Diplomacy: A Harness for Evaluating Any Large Language Model on Full-Press Diplomacy
MCPToolBench++: A Large Scale AI Agent Model Context Protocol MCP Tool Use Benchmark
Optimization of Private Semantic Communication Performance: An Uncooperative Covert Communication Method
HGMF: A Hierarchical Gaussian Mixture Framework for Scalable Tool Invocation within the Model Context Protocol
ThinkTuning: Instilling Cognitive Reflections without Distillation
Multimodal AI Systems for Enhanced Laying Hen Welfare Assessment and Productivity Optimization
Breaking Down and Building Up: Mixture of Skill-Based Vision-and-Language Navigation Agents
Disentangling Multiplex Spatial-Temporal Transition Graph Representation Learning for Socially Enhanced POI Recommendation
1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning
EMPATHIA: Multi-Faceted Human-AI Collaboration for Refugee Integration
Ethics2vec: aligning automatic agents and human preferences
Symmetry-Aware Transformer Training for Automated Planning
Best-Effort Policies for Robust Markov Decision Processes
KIRETT: Knowledge-Graph-Based Smart Treatment Assistant for Intelligent Rescue Operations
X-evolve: Solution space evolution powered by large language models
Deep Reinforcement Learning with anticipatory reward in LSTM for Collision Avoidance of Mobile Robots
FEAT: A Multi-Agent Forensic AI System with Domain-Adapted Large Language Model for Automated Cause-of-Death Analysis
Interpreting Fedspeak with Confidence: A LLM-Based Uncertainty-Aware Framework Guided by Monetary Policy Transmission Paths
Fitting Description Logic Ontologies to ABox and Query Examples
AdaptFlow: Adaptive Workflow Optimization via Meta-Learning
FNBT: Full Negation Belief Transformation for Open-World Information Fusion Based on Dempster-Shafer Theory of Evidence
TeamMedAgents: Enhancing Medical Decision-Making of LLMs Through Structured Teamwork
BlindGuard: Safeguarding LLM-based Multi-Agent Systems under Unknown Attacks
From Natural Language to Solver-Ready Power System Optimization: An LLM-Assisted, Validation-in-the-Loop Framework
UPP: Unified Path Planner with Adaptive Safety and Optimality
AuthPrint: Fingerprinting Generative Models Against Malicious Model Providers
Semi-automated Fact-checking in Portuguese: Corpora Enrichment using Retrieval with Claim extraction
Forecasting Commodity Price Shocks Using Temporal and Semantic Fusion of Prices Signals and Agentic Generative AI Extracted Economic News
Network-Specific Models for Multimodal Brain Response Prediction
Computing with Canonical Microcircuits
Understanding Human Limits in Pattern Recognition: A Computational Model of Sequential Reasoning in Rock, Paper, Scissors
Retrieval augmented generation based dynamic prompting for few-shot biomedical named entity recognition using large language models
CarbonScaling: Extending Neural Scaling Laws for Carbon Footprint in Large Language Models
PiKV: KV Cache Management System for Mixture of Experts
A Framework Combining 3D CNN and Transformer for Video-Based Behavior Recognition
The Art of Breaking Words: Rethinking Multilingual Tokenizer Design
MetAdv: A Unified and Interactive Adversarial Testing Platform for Autonomous Driving
Symbolic Learning of Interpretable Reduced-Order Models for Jumping Quadruped Robots
Factor Augmented Supervised Learning with Text Embeddings
Surformer v1: Transformer-Based Surface Classification Using Tactile and Vision Features
Teaching Introduction to Programming in the times of AI: A case study of a course re-design
Efficient Safety Testing of Autonomous Vehicles via Adaptive Search over Crash-Derived Scenarios
Leveraging LLMs for Privacy-Aware Predictions in Participatory Budgeting
Discerning minds or generic tutors? Evaluating instructional guidance capabilities in Socratic LLMs
Omni Geometry Representation Learning vs Large Language Models for Geospatial Entity Resolution
Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning
A Federated Learning Framework for Handling Subtype Confounding and Heterogeneity in Large-Scale Neuroimaging Diagnosis
Generative Artificial Intelligence Extracts Structure-Function Relationships from Plants for New Materials
Towards Integrated Alignment
LLM Unlearning Without an Expert Curated Dataset
Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
Generative AI for Intent-Driven Network Management in 6G: A Case Study on Hierarchical Learning Approach
Generalizing Scaling Laws for Dense and Sparse Large Language Models
Early Detection of Pancreatic Cancer Using Multimodal Learning on Electronic Health Record
CoDe-NeRF: Neural Rendering via Dynamic Coefficient Decomposition
Using Imperfect Synthetic Data in Downstream Inference Tasks
Segmented Confidence Sequences and Multi-Scale Adaptive Confidence Segments for Anomaly Detection in Nonstationary Time Series
Fractal Language Modelling by Universal Sequence Maps (USM)
In-Context Reinforcement Learning via Communicative World Models
Do Biased Models Have Biased Thoughts?
MMFformer: Multimodal Fusion Transformer Network for Depression Detection
Play Favorites: A Statistical Method to Measure Self-Bias in LLM-as-a-Judge
Large Language Models for Oral History Understanding with Text Classification and Sentiment Analysis
Learning Causal Structure Distributions for Robust Planning
Analysis of Schedule-Free Nonconvex Optimization
Many-Turn Jailbreaking
FoundBioNet: A Foundation-Based Model for IDH Genotyping of Glioma from Multi-Parametric MRI
SafePLUG: Empowering Multimodal LLMs with Pixel-Level Insight and Temporal Grounding for Traffic Accident Understanding
PANAMA: A Network-Aware MARL Framework for Multi-Agent Path Finding in Digital Twin Ecosystems
Zero-Direction Probing: A Linear-Algebraic Framework for Deep Analysis of Large-Language-Model Drift
BiXSE: Improving Dense Retrieval via Probabilistic Graded Relevance Distillation
PROPS: Progressively Private Self-alignment of Large Language Models
Mode-Aware Non-Linear Tucker Autoencoder for Tensor-based Unsupervised Learning
Geometry-Aware Spiking Graph Neural Network
LSDTs: LLM-Augmented Semantic Digital Twins for Adaptive Knowledge-Intensive Infrastructure Planning
Hardness-Aware Dynamic Curriculum Learning for Robust Multimodal Emotion Recognition with Missing Modalities
Solving Pasur Using GPU-Accelerated Counterfactual Regret Minimization
Operationalizing Serendipity: Multi-Agent AI Workflows for Enhanced Materials Characterization with Theory-in-the-Loop
IRL-VLA: Training an Vision-Language-Action Policy via Reward World Model
CountQA: How Well Do MLLMs Count in the Wild?
Formal Concept Analysis: a Structural Framework for Variability Extraction and Analysis
Zero-Shot Cellular Trajectory Map Matching
Probabilistic Circuits for Knowledge Graph Completion with Reduced Rule Sets
GLIDR: Graph-Like Inductive Logic Programming with Differentiable Reasoning
ParBalans: Parallel Multi-Armed Bandits-based Adaptive Large Neighborhood Search
Topology Generation of UAV Covert Communication Networks: A Graph Diffusion Approach with Incentive Mechanism
Pushing the Envelope of LLM Inference on AI-PC
A Fuzzy Logic Prompting Framework for Large Language Models in Adaptive and Uncertain Tasks
Natural Language-Driven Viewpoint Navigation for Volume Exploration via Semantic Block Representation
Remote Sensing Image Intelligent Interpretation with the Language-Centered Perspective: Principles, Methods and Challenges
Multi-level Advantage Credit Assignment for Cooperative Multi-Agent Reinforcement Learning
MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams
MeteorPred: A Meteorological Multimodal Large Model and Dataset for Severe Weather Event Prediction
Pushdown Reward Machines for Reinforcement Learning
GDBA Revisited: Unleashing the Power of Guided Local Search for Distributed Constraint Optimization
Automated Formalization via Conceptual Retrieval-Augmented LLMs
Intrinsic Explainability of Multimodal Learning for Crop Yield Prediction

Research Sources: 1029 | Generated: 8/25/2025