AI Research News Feeds for October 17th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

ScholarBench: A Bilingual Benchmark for Abstraction, Comprehension, and Reasoning Evaluation in Academic Contexts
Multilinguality Does not Make Sense: Investigating Factors Behind Zero-Shot Transfer in Sense-Aware Tasks
Are LLMs Stable Formal Logic Translators in Logical Reasoning Across Linguistically Diversified Texts?
Thunder-DeID: Accurate and Efficient De-identification Framework for Korean Court Judgments
EasyNER: A Customizable Easy-to-Use Pipeline for Deep Learning- and Dictionary-based Named Entity Recognition from Medical and Life Science Text
CAP: Evaluation of Persuasive and Creative Image Generation
Seeing Through Green: Text-Based Classification and the Firm's Returns from Green Patents
MultiFoodhat: A potential new paradigm for intelligent food quality inspection
NAPPure: Adversarial Purification for Robust Image Classification under Non-Additive Perturbations
Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding
Synchronization of Multiple Videos
Capture, Canonicalize, Splat: Zero-Shot 3D Gaussian Avatars from Unstructured Phone Images
cubic: CUDA-accelerated 3D Bioimage Computing
LOTA: Bit-Planes Guided AI-Generated Image Detection
PIA: Deepfake Detection Using Phoneme-Temporal and Identity-Dynamic Analysis
Event Interval Modulation: A Novel Scheme for Event-based Optical Camera Communication
MACE: Mixture-of-Experts Accelerated Coordinate Encoding for Large-Scale Scene Localization and Rendering
Identity-Preserving Image-to-Video Generation via Reward-Guided Optimization
Identity-GRPO: Optimizing Multi-Human Identity-preserving Video Generation via Reinforcement Learning
MatchAttention: Matching the Relative Positions for High-Resolution Cross-View Matching
Experimental Demonstration of Event-based Optical Camera Communication in Long-Range Outdoor Environment
GauSSmart: Enhanced 3D Reconstruction through 2D Foundation Models and Geometric Filtering
CLEAR: Causal Learning Framework For Robust Histopathology Tumor Detection Under Out-Of-Distribution Shifts
A Multi-domain Image Translative Diffusion StyleGAN for Iris Presentation Attack Detection
Vision-Centric Activation and Coordination for Multimodal Large Language Models
Leveraging Cycle-Consistent Anchor Points for Self-Supervised RGB-D Registration
Spatial Preference Rewarding for MLLMs Spatial Understanding
DOS: Directional Object Separation in Text Embeddings for Multi-Object Image Generation
DRBD-Mamba for Robust and Efficient Brain Tumor Segmentation with Analytical Insights
DCMIL: A Progressive Representation Learning Model of Whole Slide Images for Cancer Prognosis Analysis
Real-Time Neural Video Compression with Unified Intra and Inter Coding
Structured Universal Adversarial Attacks on Object Detection for Video Sequences
Unsupervised Deep Generative Models for Anomaly Detection in Neuroimaging: A Systematic Scoping Review
Pruning Overparameterized Multi-Task Networks for Degraded Web Image Restoration
Grazing Detection using Deep Learning and Sentinel-2 Time Series Data
Vision Mamba for Permeability Prediction of Porous Media
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model
Towards Generalist Intelligence in Dentistry: Vision Foundation Models for Oral and Maxillofacial Radiology
Acquisition of interpretable domain information during brain MR image harmonization for content-based image retrieval
Exploring Image Representation with Decoupled Classical Visual Descriptors
Exploring Cross-Modal Flows for Few-Shot Learning
Consistent text-to-image generation via scene de-contextualization
Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video
BalanceGS: Algorithm-System Co-design for Efficient 3D Gaussian Splatting Training on GPU
CALM-Net: Curvature-Aware LiDAR Point Cloud-based Multi-Branch Neural Network for Vehicle Re-Identification
Hierarchical Re-Classification: Combining Animal Classification Models with Vision Transformers
Zero-Shot Wildlife Sorting Using Vision Transformers: Evaluating Clustering and Continuous Similarity Ordering
Shot2Tactic-Caption: Multi-Scale Captioning of Badminton Videos for Tactical Understanding
Efficient Video Sampling: Pruning Temporally Redundant Tokens for Faster VLM Inference
Adapting Self-Supervised Representations as a Latent Space for Efficient Generation
SteeringTTA: Guiding Diffusion Trajectories for Robust Test-Time-Adaptation
EuroMineNet: A Multitemporal Sentinel-2 Benchmark for Spatiotemporal Mining Footprint Analysis in the European Union (2015-2024)
WeCKD: Weakly-supervised Chained Distillation Network for Efficient Multimodal Medical Imaging
VTimeCoT: Thinking by Drawing for Video Temporal Grounding and Reasoning
Leveraging Learned Image Prior for 3D Gaussian Compression
Cross-Layer Feature Self-Attention Module for Multi-Scale Object Detection
Free-Grained Hierarchical Recognition
LightQANet: Quantized and Adaptive Feature Learning for Low-Light Image Enhancement
MoCom: Motion-based Inter-MAV Visual Communication Using Event Vision and Spiking Neural Networks
CoT-PL: Visual Chain-of-Thought Reasoning Meets Pseudo-Labeling for Open-Vocabulary Object Detection
FraQAT: Quantization Aware Training with Fractional bits
Scaling Tumor Segmentation: Best Lessons from Real and Synthetic Data
QDepth-VLA: Quantized Depth Prediction as Auxiliary Supervision for Vision-Language-Action Models
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
Multi-modal video data-pipelines for machine learning with minimal human supervision
TOUCH: Text-guided Controllable Generation of Free-Form Hand-Object Interactions
BADAS: Context Aware Collision Prediction Using Real-World Dashcam Data
ScaleWeaver: Weaving Efficient Controllable T2I Generation with Multi-Scale Reference Attention
Leveraging Multimodal LLM Descriptions of Activity for Explainable Semi-Supervised Video Anomaly Detection
3D Scene Prompting for Scene-Consistent Camera-Controllable Video Generation
OmniMotion: Multimodal Motion Generation with Continuous Masked Autoregression
RainDiff: End-to-end Precipitation Nowcasting Via Token-wise Attention Diffusion
ChangingGrounding: 3D Visual Grounding in Changing Scenes
Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation
Deep Compositional Phase Diffusion for Long Motion Sequence Generation
GOPLA: Generalizable Object Placement Learning via Synthetic Augmentation of Human Arrangement
From Language to Locomotion: Retargeting-free Humanoid Control via Motion Latent Guidance
Neural Plasticity-Inspired Multimodal Foundation Model for Earth Observation
Incomplete Multimodal Industrial Anomaly Detection via Cross-Modal Distillation
AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization
Multi-level Reliable Guidance for Unpaired Multi-view Clustering
Shape of Motion: 4D Reconstruction from a Single Video
GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models
Impact of Regularization on Calibration and Robustness: from the Representation Space Perspective
Perspective-Aware Teaching: Adapting Knowledge for Heterogeneous Distillation
HuGDiffusion: Generalizable Single-Image Human Rendering via 3D Gaussian Diffusion
LinPrim: Linear Primitives for Differentiable Volumetric Rendering
TRACE: Your Diffusion Model is Secretly an Instance Edge Detector
Falcon: A Remote Sensing Vision-Language Foundation Model (Technical Report)
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
OmnimatteZero: Fast Training-free Omnimatte with Pre-trained Video Diffusion Models
Training-Free Personalization via Retrieval and Reasoning on Fingerprints
3DOT: Texture Transfer for 3DGS Objects from a Single Reference Image
On Large Multimodal Models as Open-World Image Classifiers
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization
TinyRS-R1: Compact Multimodal Language Model for Remote Sensing
SphereDrag: Spherical Geometry-Aware Panoramic Image Editing
MetaQAP - A Meta-Learning Approach for Quality-Aware Pretraining in Image Quality Assessment
CL-Splats: Continual Learning of Gaussian Splatting with Local Optimization
ComposeMe: Attribute-Specific Image Prompts for Controllable Human Image Generation
SVAG-Bench: A Large-Scale Benchmark for Multi-Instance Spatio-temporal Video Action Grounding
CymbaDiff: Structured Spatial Diffusion for Sketch-based 3D Semantic Urban Scene Generation
Group-Wise Optimization for Self-Extensible Codebooks in Vector Quantized Models
High Semantic Features for the Continual Learning of Complex Emotions: a Lightweight Solution
OmniGaze: Reward-inspired Generalizable Gaze Estimation In The Wild
Reasoning in Space via Grounding in the World
TABSurfer: a Hybrid Deep Learning Architecture for Subcortical Segmentation
Pruning Sparse Tensor Neural Networks Enables Deep Learning for 3D Ultrasound Localization Microscopy
Deep Few-view High-resolution Photon-counting CT at Halved Dose for Extremity Imaging
From Explainability to Action: A Generative Operational Framework for Integrating XAI in Clinical Mental Health Screening
Quechua Speech Datasets in Common Voice: The Case of Puno Quechua
TextBandit: Evaluating Probabilistic Reasoning in LLMs Through Language-Only Decision Tasks
Too Open for Opinion? Embracing Open-Endedness in Large Language Models for Social Simulation
The Harder The Better: Maintaining Supervised Fine-tuning Generalization with Less but Harder Data
Attribution Quality in AI-Generated Content: Benchmarking Style Embeddings and LLM Judges
RAID: Refusal-Aware and Integrated Decoding for Jailbreaking LLMs
Investigating Political and Demographic Associations in Large Language Models Through Moral Foundations Theory
LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization
Interpreting the Latent Structure of Operator Precedence in Language Models
RAGCap-Bench: Benchmarking Capabilities of LLMs in Agentic Retrieval Augmented Generation Systems
Element2Vec: Build Chemical Element Representation from Text for Property Prediction
Optimal Aggregation of LLM and PRM Signals for Efficient Test-Time Scaling
FACTS: Table Summarization via Offline Template Generation with Agentic Workflows
An LLM-Powered AI Agent Framework for Holistic IoT Traffic Interpretation
BioMedSearch: A Multi-Source Biomedical Retrieval Framework Based on LLMs
Robust or Suggestible? Exploring Non-Clinical Induction in LLM Drug-Safety Decisions
FinDeepResearch: Evaluating Deep Research Agents in Rigorous Financial Analysis
Readers Prefer Outputs of AI Trained on Copyrighted Books over Expert Human Writers
The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models
CRaFT: An Explanation-Based Framework for Evaluating Cultural Reasoning in Multilingual Language Models
Quantifying Phonosemantic Iconicity Distributionally in 6 Languages
ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models
DROID: Dual Representation for Out-of-Scope Intent Detection
Building a Macedonian Recipe Dataset: Collection, Parsing, and Comparative Analysis
RLSR: Reinforcement Learning with Supervised Reward Outperforms SFT in Instruction Following
MoM: Mixtures of Scenario-Aware Document Memories for Retrieval-Augmented Generation Systems
Rewriting History: A Recipe for Interventional Analyses to Study Data Effects on Model Behavior
Retrofitting Small Multilingual Models for Retrieval: Matching 7B Performance with 300M Parameters
Qwen3Guard Technical Report
Rethinking Schema Linking: A Context-Aware Bidirectional Retrieval Approach for Text-to-SQL
MathMist: A Parallel Multilingual Benchmark Dataset for Mathematical Problem Solving and Reasoning
On the Ability of LLMs to Handle Character-Level Perturbations: How Well and How?
Suicidal Comment Tree Dataset: Enhancing Risk Assessment and Prediction Through Contextual Analysis
Your Next Token Prediction: A Multilingual Benchmark for Personalized Response Generation
Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents
Natural Language Tools: A Natural Language Approach to Tool Calling In Large Language Agents
Efficient Seq2seq Coreference Resolution Using Entity Representations
Assessing Socio-Cultural Alignment and Technical Safety of Sovereign LLMs
Intent Clustering with Shared Pseudo-Labels
Semantic Prosody in Machine Translation: the English-Chinese Case of Passive Structures
Speculative Model Risk in Healthcare AI: Using Storytelling to Surface Unintended Harms
AutoRubric-R1V: Rubric-Based Generative Rewards for Faithful Multimodal Reasoning
Pluto: A Benchmark for Evaluating Efficiency of LLM-generated Hardware Code
Supervised Fine-Tuning or Contrastive Learning? Towards Better Multimodal LLM Reranking
Rewiring Experts on the Fly: Continuous Rerouting for Better Online Adaptation in Mixture-of-Expert models
Midtraining Bridges Pretraining and Posttraining Distributions
Harmonizing Diverse Models: A Layer-wise Merging Strategy for Consistent Generation
AI-Powered Early Diagnosis of Mental Health Disorders from Real-World Clinical Conversations
Joint Modeling of Big Five and HEXACO for Multimodal Apparent Personality-trait Recognition
Talking Points: Describing and Localizing Pixels
You May Speak Freely: Improving the Fine-Grained Visual Recognition Capabilities of Multimodal Large Language Models with Answer Extraction
MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning
AAVENUE: Detecting LLM Biases on NLU Tasks in AAVE via a Novel Benchmark
Multi-Perspective Stance Detection
Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs
Women, Infamous, and Exotic Beings: A Comparative Study of Honorific Usages in Wikipedia and LLMs for Bengali and Hindi
Following the Autoregressive Nature of LLM Embeddings via Compression and Alignment
Sentence Smith: Controllable Edits for Evaluating Text Embeddings
Revisiting Prompt Optimization with Large Reasoning Models - A Case Study on Event Extraction
Can Pre-training Indicators Reliably Predict Fine-tuning Outcomes of LLMs?
Offline Reinforcement Learning via Inverse Optimization
Strategyproof Reinforcement Learning from Human Feedback
Just One Layer Norm Guarantees Stable Extrapolation
Electrostatics from Laplacian Eigenbasis for Neural Network Interatomic Potentials
Uni-LoRA: One Vector is All You Need
TimeRecipe: A Time-Series Forecasting Recipe via Benchmarking Module Level Effectiveness
Calibrated Predictive Lower Bounds on Time-to-Unsafe-Sampling in LLMs
Byzantine Failures Harm the Generalization of Robust Distributed Learning Algorithms More Than Data Poisoning
BoltzNCE: Learning Likelihoods for Boltzmann Generation with Stochastic Interpolants and Noise Contrastive Estimation
Flows and Diffusions on the Neural Manifold
The Principle of Uncertain Maximum Entropy
SoK: Adversarial Evasion Attacks Practicality in NIDS Domain and the Impact of Dynamic Learning
An analysis of the derivative-free loss method for solving PDEs
Integrating feature selection and regression methods with technical indicators for predicting Apple Inc. stock prices
Improving Intrusion Detection with Domain-Invariant Representation Learning in Latent Space
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation
Approximation Rates for Shallow ReLU$^k$ Neural Networks on Sobolev Spaces via the Radon Transform
Generating High Dimensional User-Specific Wireless Channels using Diffusion Models
Uncertainty Quantification with the Empirical Neural Tangent Kernel
Minimax Optimal Kernel Two-Sample Tests with Random Features
Technical and legal aspects of federated learning in bioinformatics: applications, challenges and opportunities
ELASTIC: Efficient Once For All Iterative Search for Object Detection on Microcontrollers
SPIRIT: Patching Speech Language Models against Jailbreak Attacks
Adaptive Set-Mass Calibration with Conformal Prediction
Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression
Adversarial Disentanglement by Backpropagation with Physics-Informed Variational Autoencoder
Mapping Farmed Landscapes from Remote Sensing
Online Continual Learning via Spiking Neural Networks with Sleep Enhanced Latent Replay
Real-Time Adaptive Motion Planning via Point Cloud-Guided, Energy-Based Diffusion and Potential Fields
Protenix-Mini+: efficient structure prediction model with scalable pairformer
SimULi: Real-Time LiDAR and Camera Simulation with Unscented Transforms
MimicKit: A Reinforcement Learning Framework for Motion Imitation and Control
Attention-Aided MMSE for OFDM Channel Estimation: Learning Linear Filters with Attention
iQUEST: An Iterative Question-Guided Framework for Knowledge Base Question Answering
When Style Breaks Safety: Defending LLMs Against Superficial Style Alignment
KScope: A Framework for Characterizing the Knowledge Status of Language Models
TAI3: Testing Agent Integrity in Interpreting User Intent
SoK: Evaluating Jailbreak Guardrails for Large Language Models
Subspace-Boosted Model Merging
LLM-guided Chemical Process Optimization with a Multi-Agent Approach
VALID-Mol: a Systematic Framework for Validated LLM-Assisted Molecular Design
A Clinically-Grounded Two-Stage Framework for Renal CT Report Generation
TopoStreamer: Temporal Lane Segment Topology Reasoning in Autonomous Driving
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Prompt Perturbations Reveal Human-Like Biases in Large Language Model Survey Responses
Why is Your Language Model a Poor Implicit Reward Model?
HANS-Net: Hyperbolic Convolution and Adaptive Temporal Attention for Accurate and Generalizable Liver and Tumor Segmentation in CT Imaging
Causal Language Control in Multilingual Transformers via Sparse Feature Steering
Rethinking Purity and Diversity in Multi-Behavior Sequential Recommendation from the Frequency Perspective
Merge-of-Thought Distillation
Chiplet-Based RISC-V SoC with Modular AI Acceleration
Large Language Models for Real-World IoT Device Identification
Multi-View Semi-Supervised Label Distribution Learning with Local Structure Complementarity
Weight Weaving: Parameter Pooling for Data-Free Model Merging
LTR-ICD: A Learning-to-Rank Approach for Automatic ICD Coding
Distributional Consistency Loss: Beyond Pointwise Data Terms in Inverse Problems
BitNet Distillation
Noise-Adaptive Layerwise Learning Rates: Accelerating Geometry-Aware Optimization for Deep Neural Network Training
CausalVerse: Benchmarking Causal Representation Learning with Configurable High-Fidelity Simulations
FedHFT: Efficient Federated Finetuning with Heterogeneous Edge Clients
Neural Network approximation power on homogeneous and heterogeneous reaction-diffusion equations
TENDE: Transfer Entropy Neural Diffusion Estimation
Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets
Bridging Diffusion Posterior Sampling and Monte Carlo methods: a survey
Neural Network-enabled Domain-consistent Robust Optimisation for Global CO$_2$ Reduction Potential of Gas Power Plants
Demystifying the Mechanisms Behind Emergent Exploration in Goal-conditioned RL
Learning Wireless Interference Patterns: Decoupled GNN for Throughput Prediction in Heterogeneous Multi-Hop p-CSMA Networks
On Evaluating Loss Functions for Stock Ranking: An Empirical Analysis With Transformer Model
Data Understanding Survey: Pursuing Improved Dataset Characterization Via Tensor-based Methods
Optimal Control Theoretic Neural Optimizer: From Backpropagation to Dynamic Programming
Contrastive Diffusion Alignment: Learning Structured Latents for Controllable Generation
Incentive-Based Federated Learning
Spectral Analysis of Molecular Kernels: When Richer Features Do Not Guarantee Better Generalization
When Flatness Does (Not) Guarantee Adversarial Robustness
A Physics Prior-Guided Dual-Stream Attention Network for Motion Prediction of Elastic Bragg Breakwaters
Generalist vs Specialist Time Series Foundation Models: Investigating Potential Emergent Behaviors in Assessing Human Health Using PPG Signals
Nonparametric Data Attribution for Diffusion Models
Stable Prediction of Adverse Events in Medical Time-Series Data
Enhancing Time-Series Anomaly Detection by Integrating Spectral-Residual Bottom-Up Attention with Reservoir Computing
Active Measuring in Reinforcement Learning With Delayed Negative Effects
LLM-ERM: Sample-Efficient Program Learning via LLM-Guided Search
DARTS-GT: Differentiable Architecture Search for Graph Transformers with Quantifiable Instance-Specific Interpretability Analysis
Jet Functors and Weil Algebras in Automatic Differentiation: A Geometric Analysis
SHaRe-SSM: An Oscillatory Spiking Neural Network for Target Variable Modeling in Long Sequences
Revisit Modality Imbalance at the Decision Layer
Interaction Concordance Index: Performance Evaluation for Interaction Prediction Methods
MergeMoE: Efficient Compression of MoE Models via Expert Output Merging
Towards geological inference with process-based and deep generative modeling, part 1: training on fluvial deposits
Coder as Editor: Code-driven Interpretable Molecular Optimization
Learning to Undo: Rollback-Augmented Reinforcement Learning with Reversibility Signals
Enhancing Time Series Forecasting through Selective Representation Spaces: A Patch Perspective
On the Identifiability of Tensor Ranks via Prior Predictive Matching
MX+: Pushing the Limits of Microscaling Formats for Efficient Large Language Model Serving
Redundancy-Aware Test-Time Graph Out-of-Distribution Detection
State-Space Models for Tabular Prior-Data Fitted Networks
Matcha: Multi-Stage Riemannian Flow Matching for Accurate and Physically Valid Molecular Docking
Multimodal RAG for Unstructured Data: Leveraging Modality-Aware Knowledge Graphs with Hybrid Retrieval
First Attentions Last: Better Exploiting First Attentions for Efficient Transformer Training
Geometric Moment Alignment for Domain Adaptation via Siegel Embeddings
Online Reliable Anomaly Detection via Neuromorphic Sensing and Communications
Tawa: Automatic Warp Specialization for Modern GPUs with Asynchronous References
The Pursuit of Diversity: Multi-Objective Testing of Deep Reinforcement Learning Agents
Causal Discovery for Linear DAGs with Dependent Latent Variables via Higher-order Cumulants
Active Jammer Localization via Acquisition-Aware Path Planning
Rethinking Hebbian Principle: Low-Dimensional Structural Projection for Unsupervised Learning
Efficient Dynamic Structured Sparse Training with Learned Shuffles
Tackling Time-Series Forecasting Generalization via Mitigating Concept Drift
Programmatic Representation Learning with Language Models
To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models
Intelligent Dynamic Handover via AI-assisted Signal Quality Prediction in 6G Multi-RAT Networks
Reinforcement Learning with Stochastic Reward Machines
Provable Unlearning with Gradient Ascent on Two-Layer ReLU Neural Networks
Backdoor Unlearning by Linear Task Decomposition
Efficient Parallel Samplers for Recurrent-Depth Models and Their Connection to Diffusion Language Models
Identity-Link IRT for Label-Free LLM Evaluation: Preserving Additivity in TVD-MI Scores
Biology-informed neural networks learn nonlinear representations from omics data to improve genomic prediction and interpretability
Joint Active RIS Configuration and User Power Control for Localization: A Neuroevolution-Based Approach
Hybrid Deep Learning Approaches for Classifying Autism from Brain MRI
Language steering in latent space to mitigate unintended code-switching
EvoEdit: Evolving Null-space Alignment for Robust and Efficient Knowledge Editing
R2T: Rule-Encoded Loss Functions for Low-Resource Sequence Tagging
An Overview of the JPEG AI Learning-Based Image Coding Standard
DeepMartingale: Duality of the Optimal Stopping Problem with Expressivity
Post-surgical Endometriosis Segmentation in Laparoscopic Videos
Switchboard-Affect: Emotion Perception Labels from Conversational Speech
Long-Term Spatio-Temporal Forecasting of Monthly Rainfall in West Bengal Using Ensemble Learning Approaches
Classifying and Addressing the Diversity of Errors in Retrieval-Augmented Generation Systems
Signature in Code Backdoor Detection, how far are we?
Dynamic SBI: Round-free Sequential Simulation-Based Inference with Adaptive Datasets
PIShield: Detecting Prompt Injection Attacks via Intrinsic LLM Features
Exact Dynamics of Multi-class Stochastic Gradient Descent
deFOREST: Fusing Optical and Radar satellite data for Enhanced Sensing of Tree-loss
David vs. Goliath: A comparative study of different-sized LLMs for code generation in the domain of automotive scenario generation
High-Dimensional BWDM: A Robust Nonparametric Clustering Validation Index for Large-Scale Data
PoissonNet: A Local-Global Approach for Learning on Surfaces
A novel Information-Driven Strategy for Optimal Regression Assessment
Flip-Flop Consistency: Unsupervised Training for Robustness to Prompt Perturbations in LLMs
Constraint-Driven Small Language Models Based on Agent and OpenAlex Knowledge Graph: Mining Conceptual Pathways and Discovering Innovation Points in Academic Papers
PluriHop: Exhaustive, Recall-Sensitive QA over Distractor-Rich Corpora
BoardVision: Deployment-ready and Robust Motherboard Defect Detection with YOLO+Faster-RCNN Ensemble
Low Power Vision Transformer Accelerator with Hardware-Aware Pruning and Optimized Dataflow
Personalized federated learning, Row-wise fusion regularization, Multivariate modeling, Sparse estimation
Noise Projection: Closing the Prompt-Agnostic Gap Behind Text-to-Image Misalignment in Diffusion Models
A Deep State-Space Model Compression Method using Upper Bound on Output Error
Parameter Identification for Partial Differential Equation with Jump Discontinuities in Coefficients by Markov Switching Model and Physics-Informed Machine Learning
Decorrelation Speeds Up Vision Transformers
Response to Discussions of "Causal and Counterfactual Views of Missing Data Models"
MCbiF: Measuring Topological Autocorrelation in Multiscale Clusterings via 2-Parameter Persistent Homology
Fast and Scalable Score-Based Kernel Calibration Tests
Leveraging Code Cohesion Analysis to Identify Source Code Supply Chain Attacks
Unifying Environment Perception and Route Choice Modeling for Trajectory Representation Learning
A Geometric Approach to Optimal Experimental Design
A Multi-Task Deep Learning Framework for Skin Lesion Classification, ABCDE Feature Quantification, and Evolution Simulation
From Loop Nests to Silicon: Mapping AI Workloads onto AMD NPUs with MLIR-AIR
Prediction-Specific Design of Learning-Augmented Algorithms
Secure Sparse Matrix Multiplications and their Applications to Privacy-Preserving Machine Learning
Learnable Mixed Nash Equilibria are Collectively Rational
Instruction Set Migration at Warehouse Scale
VT-Refine: Learning Bimanual Assembly with Visuo-Tactile Feedback via Simulation Fine-Tuning
DialectGen: Benchmarking and Improving Dialect Robustness in Multimodal Generation
Learning an Image Editing Model without Image Editing Pairs
Lost in the Averages: A New Specific Setup to Evaluate Membership Inference Attacks Against Machine Learning Models
Kernel Neural Operators (KNOs) for Scalable, Memory-efficient, Geometrically-flexible Operator Learning
Boosting Graph Foundation Model from Structural Perspective
Hopfield-Fenchel-Young Networks: A Unified Framework for Associative Memory Retrieval
REX: Causal discovery based on machine learning and explainability techniques
Exploring the Noise Robustness of Online Conformal Prediction
Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Bootstrapping
FairBatching: Fairness-Aware Batch Formation for LLM Inference
MedTrust-RAG: Evidence Verification and Trust Alignment for Biomedical Question Answering
The Role of Social Learning and Collective Norm Formation in Fostering Cooperation in LLM Multi-Agent Systems
Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following
Big Data Approaches to Bovine Bioacoustics: A FAIR-Compliant Dataset and Scalable ML Framework for Precision Livestock Welfare
A Free Lunch in LLM Compression: Revisiting Retraining after Pruning
Feature Selection and Regularization in Multi-Class Classification: An Empirical Study of One-vs-Rest Logistic Regression with Gradient Descent Optimization and L1 Sparsity Constraints
Towards Adaptable Humanoid Control via Adaptive Motion Tracking
Holdout-Loss-Based Data Selection for LLM Finetuning via In-Context Learning
LiRA: Linguistic Robust Anchoring for Cross-lingual Large Language Models
Stealthy Dual-Trigger Backdoors: Attacking Prompt Tuning in LM-Empowered Graph Foundation Models
Semantic representations emerge in biologically inspired ensembles of cross-supervising neural networks
From Guess2Graph: When and How Can Unreliable Experts Safely Boost Causal Discovery in Finite Samples?
E2Edev: Benchmarking Large Language Models in End-to-End Software Development Task
State Your Intention to Steer Your Attention: An AI Assistant for Intentional Digital Living
Real-Time Surgical Instrument Defect Detection via Non-Destructive Testing
Agentic Entropy-Balanced Policy Optimization
Selective Labeling with False Discovery Rate Control
Local Causal Discovery for Statistically Efficient Causal Inference
STANCE: Motion Coherent Video Generation Via Sparse-to-Dense Anchored Encoding
Just-In-Time Objectives: A General Approach for Specialized AI Interactions
Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering
An Active Inference Model of Mouse Point-and-Click Behaviour
Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures
Code-driven Number Sequence Calculation: Enhancing the inductive Reasoning Abilities of Large Language Models
LeapFactual: Reliable Visual Counterfactual Explanation Using Conditional Flow Matching
GemiRec: Interest Quantization and Generation for Multi-Interest Recommendation
RLAIF-SPA: Optimizing LLM-based Emotional Speech Synthesis via RLAIF
Causality Enhancement for Cross-Domain Recommendation
The Bidding Games: Reinforcement Learning for MEV Extraction on Polygon Blockchain
In-Context Learning with Unpaired Clips for Instruction-based Video Editing
Galaxy Morphology Classification with Counterfactual Explanation
An Efficient Rubric-based Generative Verifier for Search-Augmented LLMs
When Planners Meet Reality: How Learned, Reactive Traffic Agents Shift nuPlan Benchmarks
xLLM Technical Report
FedPPA: Progressive Parameter Alignment for Personalized Federated Learning
Where are the Whales: A Human-in-the-loop Detection Method for Identifying Whales in High-resolution Satellite Imagery
Camera Movement Classification in Historical Footage: A Comparative Study of Deep Video Models
Seesaw: Accelerating Training by Balancing Learning Rate and Batch Size Scheduling
DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models
Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries
COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes
Inpainting the Red Planet: Diffusion Models for the Reconstruction of Martian Environments in Virtual Reality
Finding Answers in Thought Matters: Revisiting Evaluation on Large Language Models with Reasoning
Cross-Scenario Unified Modeling of User Interests at Billion Scale
Morphology-Aware Prognostic model for Five-Year Survival Prediction in Colorectal Cancer from H&E Whole Slide Images
Scaling Artificial Intelligence for Multi-Tumor Early Detection with More Reports, Fewer Masks
RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning
Benchmarking Multimodal Large Language Models for Face Recognition
Predicting kernel regression learning curves from only raw data statistics
Learning When Not to Learn: Risk-Sensitive Abstention in Bandits with Unbounded Rewards
Detecting Early and Implicit Suicidal Ideation via Longitudinal and Information Environment Signals on Social Media
Reasoning with Sampling: Your Base Model is Smarter Than You Think
MaskCaptioner: Learning to Jointly Segment and Caption Object Trajectories in Videos
Predicting Task Performance with Context-aware Scaling Laws
Circuit Insights: Towards Interpretability Beyond Activations
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
MetaBench: A Multi-task Benchmark for Assessing LLMs in Metabolomics
Architecture Is All You Need: Diversity-Enabled Sweet Spots for Robust Humanoid Locomotion
RealDPO: Real or Not Real, that is the Preference
CBF-RL: Safety Filtering Reinforcement Learning in Training with Control Barrier Functions
C4D: 4D Made from 3D through Dual Correspondences
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents
RDD: Retrieval-Based Demonstration Decomposer for Planner Alignment in Long-Horizon Tasks
LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training
TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar
Attention Is All You Need for KV Cache in Diffusion LLMs
pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation
WithAnyone: Towards Controllable and ID Consistent Image Generation
Terra: Explorable Native 3D World Model with Point Latents
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale
Coupled Diffusion Sampling for Training-Free Multi-View Image Editing
Domain-Independent Dynamic Programming
TriQXNet: Forecasting Dst Index from Solar Wind Data Using an Interpretable Parallel Classical-Quantum Framework with Uncertainty Quantification
DELE: Deductive $\mathcal{EL}^{++}$ Embeddings for Knowledge Base Completion
Robust Counterfactual Inference in Markov Decision Processes
PoE-World: Compositional World Modeling with Products of Programmatic Experts
MAFA: A multi-agent framework for annotation
EMAC+: Embodied Multimodal Agent for Collaborative Planning with VLM+LLM
Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance
Interpretability and Transparency-Driven Detection and Transformation of Textual Adversarial Examples (IT-DT)
Quantum Polar Metric Learning: Efficient Classically Learned Quantum Embeddings
Natural Language Processing RELIES on Linguistics
Why do explanations fail? A typology and discussion on failures in XAI
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
Visual Stereotypes of Autism Spectrum in Janus-Pro-7B, DALL-E, Stable Diffusion, SDXL, FLUX, and Midjourney
Say My Name: a Model's Bias Discovery Framework
The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition
MIO: A Foundation Model on Multimodal Tokens
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe
Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision
VoxelPrompt: A Vision Agent for End-to-End Medical Image Analysis
CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment
AI-generated Essays: Characteristics and Implications on Automated Scoring and Academic Integrity
Disentangled and Self-Explainable Node Representation Learning
An AI-Driven Multimodal Smart Home Platform for Continuous Monitoring and Assistance in Post-Stroke Motor Impairment
Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
VERITAS: Verifying the Performance of AI-native Transceiver Actions in Base-Stations
The Last Dependency Crusade: Solving Python Dependency Conflicts with LLMs
FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling
The simulation of judgment in LLMs
Evaluating Sakana's AI Scientist: Bold Claims, Mixed Results, and a Promising Future?
Never too Prim to Swim: An LLM-Enhanced RL-based Adaptive S-Surface Controller for AUVs under Extreme Sea Conditions
A Neural Symbolic Model for Space Physics
Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning
EDIT: Enhancing Vision Transformers by Mitigating Attention Sink through an Encoder-Decoder Architecture
Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge
On Equivariance and Fast Sampling in Video Diffusion Models Trained with Warped Noise
Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
LLMs' Suitability for Network Security: A Case Study of STRIDE Threat Modeling
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Internet of Agents: Fundamentals, Applications, and Challenges
ConDiSim: Conditional Diffusion Models for Simulation Based Inference
APEX: Empowering LLMs with Physics-Based Task Planning for Real-time Insight
Checkpoint-GCG: Auditing and Attacking Fine-Tuning-Based Prompt Injection Defenses
KL-regularization Itself is Differentially Private in Bandits and RLHF
Thinker: Learning to Think Fast and Slow
Adaptive Budget Allocation for Orthogonal-Subspace Adapter Tuning in LLMs Continual Learning
ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling
SimKO: Simple Pass@K Policy Optimization
Agentic NL2SQL to Reduce Computational Costs
RoboGPT-R1: Enhancing Robot Planning with Reinforcement Learning
Boosting Instruction Following at Scale
Where to Search: Measure the Prior-Structured Search Space of LLM Agents
LabOS: The AI-XR Co-Scientist That Sees and Works With Humans
The Gatekeeper Knows Enough
Mapping Smarter, Not Harder: A Test-Time Reinforcement Learning Agent That Improves Without Labels or Model Updates
Budget-aware Test-time Scaling via Discriminative Verification
TRI-DEP: A Trimodal Comparative Study for Depression Detection Using Speech, Text, and EEG
Stable but Miscalibrated: A Kantian View on Overconfidence from Filters to Large Language Models
GroundedPRM: Tree-Guided and Fidelity-Aware Process Reward Modeling for Step-Level Reasoning
Agentic Design of Compositional Machines
Generative AI in Heritage Practice: Improving the Accessibility of Heritage Guidance
Reversing the Lens: Using Explainable AI to Understand Human Expertise
GQVis: A Dataset of Genomics Data Questions and Visualizations for Generative AI
Leveraging Wireless Sensor Networks for Real-Time Monitoring and Control of Industrial Environments
A2AS: Agentic AI Runtime Security and Self-Defense
Towards Neurocognitive-Inspired Intelligence: From AI's Structural Mimicry to Human-Like Functional Cognition
Bridging the Semantic Gap: Contrastive Rewards for Multilingual Text-to-SQL
A Linguistics-Aware LLM Watermarking via Syntactic Predictability
Users as Annotators: LLM Preference Learning from Comparison Mode
Informed Routing in LLMs: Smarter Token-Level Computation for Faster Inference
Entropy Meets Importance: A Unified Head Importance-Entropy Score for Stable and Efficient Transformer Pruning
ConDABench: Interactive Evaluation of Language Models for Data Analysis
SIMBA UQ: Similarity-Based Aggregation for Uncertainty Quantification in Large Language Models
Seeing Hate Differently: Hate Subspace Modeling for Culture-Aware Hate Speech Detection
Meronymic Ontology Extraction via Large Language Models
ADMIT: Few-shot Knowledge Poisoning Attacks on RAG-based Fact Checking
Serialized EHR make for good text representations
Information flow in multilayer perceptrons: an in-depth analysis
DynaSpec: Context-aware Dynamic Speculative Sampling for Large-Vocabulary Language Models
On-device System of Compositional Multi-tasking in Large Language Models
Revisiting the UID Hypothesis in LLM Reasoning Traces
ConsistencyAI: A Benchmark to Assess LLMs' Factual Consistency When Responding to Different Demographic Groups
BenchPress: A Human-in-the-Loop Annotation System for Rapid Text-to-SQL Benchmark Curation
Harnessing Consistency for Robust Test-Time LLM Ensemble
Multimodal Retrieval-Augmented Generation with Large Language Models for Medical VQA
From Craft to Constitution: A Governance-First Paradigm for Principled Agent Engineering
Benchmarking Correctness and Security in Multi-Turn Code Generation
ShishuLM: Lightweight Language Model with Hybrid Decoder-MLP Architecture and Paired Weight Sharing
Ensembling Large Language Models to Characterize Affective Dynamics in Student-AI Tutor Dialogues
Self-Training with Dynamic Weighting for Robust Gradual Domain Adaptation
Deep Edge Filter: Return of the Human-Crafted Layer in Deep Learning
FFT-Accelerated Auxiliary Variable MCMC for Fermionic Lattice Models: A Determinant-Free Approach with $O(N\log N)$ Complexity
CoLoR-GAN: Continual Few-Shot Learning with Low-Rank Adaptation in Generative Adversarial Networks
Unlocking the Potential of Diffusion Language Models through Template Infilling
Joint Discriminative-Generative Modeling via Dual Adversarial Training
FRACCO: A gold-standard annotated corpus of oncological entities with ICD-O-3.1 normalisation
What Layers When: Learning to Skip Compute in LLMs with Residual Gates
Catch Your Breath: Adaptive Computation for Self-Paced Sequence Production
PAGE: Prompt Augmentation for text Generation Enhancement
Order from Chaos: Comparative Study of Ten Leading LLMs on Unstructured Data Categorization
Physics-Informed autoencoder for DSC-MRI Perfusion post-processing: application to glioma grading
Incomplete Multi-view Clustering via Hierarchical Semantic Alignment and Cooperative Completion
Reliable Fine-Grained Evaluation of Natural Language Math Proofs
A Survey on Collaborating Small and Large Language Models for Performance, Cost-effectiveness, Cloud-edge Privacy, and Trustworthiness
K-frames: Scene-Driven Any-k Keyframe Selection for long video understanding
Guarding the Guardrails: A Taxonomy-Driven Approach to Jailbreak Detection
Bayes or Heisenberg: Who(se) Rules?
GenCellAgent: Generalizable, Training-Free Cellular Image Segmentation via Large Language Model Agents
Dual-attention ResNet outperforms transformers in HER2 prediction on DCE-MRI
Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences
Benefits and Limitations of Communication in Multi-Agent Reasoning
Schema for In-Context Learning
Knowledge Reasoning Language Model: Unifying Knowledge and Language for Inductive Knowledge Graph Reasoning
AI Debaters are More Persuasive when Arguing in Alignment with Their Own Beliefs
Synthesizing Agentic Data for Web Agents with Progressive Difficulty Enhancement Mechanisms
Readability $\ne$ Learnability: Rethinking the Role of Simplicity in Training Small Language Models
LLMs Can Get "Brain Rot"!
Big Reasoning with Small Models: Instruction Retrieval at Inference Time
Less is More: Improving LLM Reasoning with Minimal Test-Time Intervention
Static Sandboxes Are Inadequate: Modeling Societal Complexity Requires Open-Ended Co-Evolution in LLM-Based Multi-Agent Simulations
Efficient Few-Shot Learning in Remote Sensing: Fusing Vision and Vision-Language Models
Finding Holes: Pathologist Level Performance Using AI for Cribriform Morphology Detection in Prostate Cancer
REAP the Experts: Why Pruning Prevails for One-Shot MoE compression
Conditional Clifford-Steerable CNNs with Complete Kernel Basis for PDE Modeling
Context-Selective State Space Models: Feedback is All You Need
Think Globally, Group Locally: Evaluating LLMs Using Multi-Lingual Word Grouping Games
One Bug, Hundreds Behind: LLMs for Large-Scale Bug Discovery
Cyber-Resilient System Identification for Power Grid through Bayesian Integration
Optical Computation-in-Communication enables low-latency, high-fidelity perception in telesurgery
On the expressivity of sparse maxout networks
Exploratory Causal Inference in SAEnce
DiffOPF: Diffusion Solver for Optimal Power Flow
Every Language Model Has a Forgery-Resistant Signature
Unlocking Out-of-Distribution Generalization in Transformers via Recursive Latent Space Reasoning
Extracting latent representations from X-ray spectra. Classification, regression, and accretion signatures of Chandra sources
Toward Cybersecurity-Expert Small Language Models
Inferred global dense residue transition graphs from primary structure sequences enable protein interaction prediction via directed graph convolutional neural networks
FinAI Data Assistant: LLM-based Financial Database Query Processing with the OpenAI Function Calling API
Towards Reversible Model Merging For Low-rank Weights
Virtually Being: Customizing Camera-Controllable Video Diffusion Models with Multi-View Performance Captures
MAFA: A Multi-Agent Framework for Enterprise-Scale Annotation with Configurable Task Adaptation
DPRF: A Generalizable Dynamic Persona Refinement Framework for Optimizing Behavior Alignment Between Personalized LLM Role-Playing Agents and Humans
LiteStage: Latency-aware Layer Skipping for Multi-stage Reasoning
Large Scale Retrieval for the LinkedIn Feed using Causal Language Models
Scaling Test-Time Compute to Achieve IOI Gold Medal with Open-Weight Models
Spatial Computing Communications for Multi-User Virtual Reality in Distributed Mobile Edge Computing Network
Reinforcement Learning for Unsupervised Domain Adaptation in Spatio-Temporal Echocardiography Segmentation
Policy Regularized Distributionally Robust Markov Decision Processes with Linear Function Approximation
Do Joint Language-Audio Embeddings Encode Perceptual Timbre Semantics?
CAST: Compositional Analysis via Spectral Tracking for Understanding Transformer Layer Functions
Less is More: Denoising Knowledge Graphs For Retrieval Augmented Generation
PRISM: Agentic Retrieval with LLMs for Multi-Hop Question Answering
Beyond a Single Perspective: Towards a Realistic Evaluation of Website Fingerprinting Attacks
Learning Human-Humanoid Coordination for Collaborative Object Carrying
TED++: Submanifold-Aware Backdoor Detection via Layerwise Tubular-Neighbourhood Screening
Expertise need not monopolize: Action-Specialized Mixture of Experts for Vision-Language-Action Learning
Watermarking for Factuality: Guiding Vision-Language Models Toward Truth via Tri-layer Contrastive Decoding
MERLIN: A Testbed for Multilingual Multimodal Entity Recognition and Linking
Column Generation Using Domain-Independent Dynamic Programming
Evaluating & Reducing Deceptive Dialogue From Language Models with Multi-turn RL
A Robust Classification Method using Hybrid Word Embedding for Early Diagnosis of Alzheimer's Disease
Stop-RAG: Value-Based Retrieval Control for Iterative RAG
A Density-Informed Multimodal Artificial Intelligence Framework for Improving Breast Cancer Detection Across All Breast Densities
BinCtx: Multi-Modal Representation Learning for Robust Android App Behavior Detection
Beyond One World: Benchmarking Super Heros in Role-Playing Across Multiversal Contexts
CURE: Confidence-driven Unified Reasoning Ensemble Framework for Medical Question Answering
SUM-AgriVLN: Spatial Understanding Memory for Agricultural Vision-and-Language Navigation
From Binary to Bilingual: How the National Weather Service is Using Artificial Intelligence to Develop a Comprehensive Translation Program
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
Beat Detection as Object Detection
Decision Oriented Technique (DOTechnique): Finding Model Validity Through Decision-Maker Context
Do Slides Help? Multi-modal Context for Automatic Transcription of Conference Talks
Do Large Language Models Show Biases in Causal Learning? Insights from Contingency Judgment
GammaZero: Learning To Guide POMDP Belief Space Search With Graph Representations
Position: Require Frontier AI Labs To Release Small "Analog" Models
Generating Fair Consensus Statements with Social Choice on Token-Level MDPs
STEMS: Spatial-Temporal Enhanced Safe Multi-Agent Coordination for Building Energy Management
Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems
A Multimodal Approach to Heritage Preservation in the Context of Climate Change
CodeEvolve: An open source evolutionary coding agent for algorithm discovery and optimization
Combining Reinforcement Learning and Behavior Trees for NPCs in Video Games with AMD Schola
JEDA: Query-Free Clinical Order Search from Ambient Dialogues
ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning
Implementation of AI in Precision Medicine
Echoes of Human Malice in Agents: Benchmarking LLMs for Multi-Turn Online Harassment Attacks
LiveResearchBench: A Live Benchmark for User-Centric Deep Research in the Wild
Towards Agentic Self-Learning LLMs in Search Environment
MorphoBench: A Benchmark with Difficulty Adaptive to Model Reasoning
A Guardrail for Safety Preservation: When Safety-Sensitive Subspace Meets Harmful-Resistant Null-Space
Terrarium: Revisiting the Blackboard for Multi-Agent Safety, Privacy, and Security Studies
Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution Reconstruction
AI for Service: Proactive Assistance with AI Glasses
Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch?
Hi-Agent: Hierarchical Vision-Language Agents for Mobile Device Control
IMAGINE: Integrating Multi-Agent System into One Model for Complex Reasoning and Planning
Eliminating Negative Occurrences of Derived Predicates from PDDL Axioms
Helmsman: Autonomous Synthesis of Federated Learning Systems via Multi-Agent Collaboration
JSPLIT: A Taxonomy-based Solution for Prompt Bloating in Model Context Protocol
Symbol Grounding in Neuro-Symbolic AI: A Gentle Introduction to Reasoning Shortcuts
LLM Agents Beyond Utility: An Open-Ended Perspective
ColorBench: Benchmarking Mobile Agents with Graph-Structured Framework for Complex Long-Horizon Tasks
Beyond Hallucinations: The Illusion of Understanding in Large Language Models
Machine Learning and Public Health: Identifying and Mitigating Algorithmic Bias through a Systematic Review
TITAN: Graph-Executable Reasoning for Cyber Threat Intelligence
NAEL: Non-Anthropocentric Ethical Logic
Practical, Utilitarian Algorithm Configuration
Purifying Task Vectors in Knowledge-Aware Subspace for Model Merging
Cognitive-Aligned Spatio-Temporal Large Language Models For Next Point-of-Interest Prediction

Research Sources: 620 | Generated: 10/18/2025