AI Research News Feeds for October 16th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

EduDial: Constructing a Large-scale Multi-turn Teacher-Student Dialogue Corpus
Who's Asking? Evaluating LLM Robustness to Inquiry Personas in Factual Question Answering
The Curious Case of Curiosity across Human Cultures and LLMs
3-Model Speculative Decoding
A Multilingual, Large-Scale Study of the Interplay between LLM Safeguards, Personalisation, and Disinformation
OPLoRA: Orthogonal Projection LoRA Prevents Catastrophic Forgetting during Parameter-Efficient Fine-Tuning
On the Role of Preference Variance in Preference Optimization
GatePro: Parameter-Free Expert Selection Optimization for Mixture-of-Experts Models
I Am Aligned, But With Whom? MENA Values Benchmark for Evaluating Cultural Alignment and Multilingual Bias in LLMs
Mirror Speculative Decoding: Breaking the Serial Barrier in LLM Inference
A Matter of Representation: Towards Graph-Based Abstract Code Generation
CoT-Evo: Evolutionary Distillation of Chain-of-Thought for Scientific Reasoning
Putting on the Thinking Hats: A Survey on Chain of Thought Fine-tuning from the Perspective of Human Reasoning Mechanism
DSCD: Large Language Model Detoxification with Self-Constrained Decoding
SHIELD: Classifier-Guided Prompting for Robust and Safer LVLMs
Grounding Long-Context Reasoning with Contextual Normalization for Retrieval-Augmented Generation
Text Anomaly Detection with Simplified Isolation Kernel
A fully automated and scalable Parallel Data Augmentation for Low Resource Languages using Image and Text Analytics
Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain
Do You Get the Hint? Benchmarking LLMs on the Board Game Concept
Beyond Correctness: Rewarding Faithful Reasoning in Retrieval-Augmented Generation
In-Distribution Steering: Balancing Control and Coherence in Language Model Generation
Mismatch Aware Guidance for Robust Emotion Control in Auto-Regressive TTS Models
ChatR1: Reinforcement Learning for Conversational Reasoning and Retrieval Augmented Question Answering
Embedding-Based Context-Aware Reranker
Taming the Fragility of KV Cache Eviction in LLM Inference
Are Proverbs the New Pythian Oracles? Exploring Sentiment in Greek Sayings
D-SMART: Enhancing LLM Dialogue Consistency via Dynamic Structured Memory And Reasoning Tree
Make an Offer They Can't Refuse: Grounding Bayesian Persuasion in Real-World Dialogues without Pre-Commitment
Doing Things with Words: Rethinking Theory of Mind Simulation in Large Language Models
Investigating Lexical Change through Cross-Linguistic Colexification Patterns
Evaluating Arabic Large Language Models: A Survey of Benchmarks, Methods, and Gaps
Beyond Single-Reward: Multi-Pair, Multi-Perspective Preference Optimization for Machine Translation
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization
Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models
FreshTab: Sourcing Fresh Data for Table-to-Text Generation Evaluation
MemoTime: Memory-Augmented Temporal Knowledge Graph Enhanced Large Language Model Reasoning
How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study
GAPS: A Clinically Grounded, Automated Benchmark for Evaluating AI Clinicians
Assessing Web Search Credibility and Response Groundedness in Chat Assistants
Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation
The Mechanistic Emergence of Symbol Grounding in Language Models
Breadcrumbs Reasoning: Memory-Efficient Reasoning with Compression Beacons
BRIEF-Pro: Universal Context Compression with Short-to-Long Synthesis for Fast and Accurate Multi-Hop Reasoning
Toward LLM-Supported Automated Assessment of Critical Thinking Subskills
Unifying Vision-Language Latents for Zero-label Image Caption Enhancement
UNCAP: Uncertainty-Guided Planning Using Natural Language Communication for Cooperative Autonomous Vehicles
Addressing the alignment problem in transportation policy making: an LLM approach
MMLongCite: A Benchmark for Evaluating Fidelity of Long-Context Vision-Language Models
Two Heads Are Better Than One: Audio-Visual Speech Error Correction with Dual Hypotheses
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE
LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models
Towards Region-aware Bias Evaluation Metrics
ICA-RAG: Information Completeness Guided Adaptive Retrieval-Augmented Generation for Disease Diagnosis
Teaching Models to Understand (but not Generate) High-risk Data
What Does Neuro Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs
RPM: Reasoning-Level Personalization for Black-Box Large Language Models
RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
A Linguistically Motivated Analysis of Intonational Phrasing in Text-to-Speech Systems: Revealing Gaps in Syntactic Sensitivity
The Landscape of Arabic Large Language Models (ALLMs): A New Era for Arabic Language Technology
MMD-Flagger: Leveraging Maximum Mean Discrepancy to Detect Hallucinations
SemVink: Advancing VLMs' Semantic Understanding of Optical Illusions via Visual Global Thinking
KG2QA: Knowledge Graph-enhanced Retrieval-augmented Generation for Communication Standards Question Answering
Aligning Large Language Models to Low-Resource Languages through LLM-Based Selective Translation: A Systematic Study
Assessing the Latent Automated Program Repair Capabilities of Large Language Models using Round-Trip Translation
Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning
Benchmarking LLMs' Swarm intelligence
MERIT: Multilingual Semantic Retrieval with Interleaved Multi-Condition Query
Cross-modal Associations in Vision and Language Models: Revisiting the Bouba-Kiki Effect
Learning to Explore in Diverse Reward Settings via Temporal-Difference-Error Maximization
HEAL: An Empirical Study on Hallucinations in Embodied Agents Driven by Large Language Models
Riemannian generative decoder
Padé Approximant Neural Networks for Enhanced Electric Motor Fault Diagnosis Using Vibration and Acoustic Data
Context-Action Embedding Learning for Off-Policy Evaluation in Contextual Bandits
PPA-Game: Characterizing and Learning Competitive Dynamics Among Online Content Creators
Learning to sample fibers for goodness-of-fit testing
Beyond the noise: intrinsic dimension estimation with optimal neighbourhood identification
Quantum Circuit Synthesis and Compilation Optimization: Overview and Prospects
Persistent Homology via Ellipsoids
Computing Systemic Risk Measures with Graph Neural Networks
LLMBridge: Reducing Costs in a Prompt-Centric Internet
Enhancing the reliability of machine learning for gravitational wave parameter estimation with attention-based models
A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics
Gen-C: Populating Virtual Worlds with Generative Crowds
BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals
A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random
Physics-Based Machine Learning Closures and Wall Models for Hypersonic Transition-Continuum Boundary Layer Predictions
Efficient Restarts in Non-Stationary Model-Free Reinforcement Learning
On efficiently computable functions, deep networks and sparse compositionality
QLENS: Towards A Quantum Perspective of Language Transformers
Learning by Steering the Neural Dynamics: A Statistical Mechanics Perspective
Nonlinear discretizations and Newton's method: characterizing stationary points of regression objectives
Mamba Can Learn Low-Dimensional Targets In-Context via Test-Time Feature Learning
Influence Dynamics and Stagewise Data Attribution
GraphShaper: Geometry-aware Alignment for Improving Transfer Learning in Text-Attributed Graphs
H4G: Unlocking Faithful Inference for Zero-Shot Graph Learning in Hyperbolic Space
Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning
nuGPR: GPU-Accelerated Gaussian Process Regression with Iterative Algorithms and Low-Rank Approximations
Graph Few-Shot Learning via Adaptive Spectrum Experts and Cross-Set Distribution Calibration
Fairness-Constrained Optimization Attack in Federated Learning
Self-Verifying Reflection Helps Transformers with CoT Reasoning
Hierarchical Koopman Diffusion: Fast Generation with Interpretable Diffusion Trajectory
Unveiling the Vulnerability of Graph-LLMs: An Interpretable Multi-Dimensional Adversarial Attack on TAGs
Optimal Regularization for Performative Learning
FedMMKT: Co-Enhancing a Server Text-to-Image Model and Client Task Models in Multi-Modal Federated Learning
Multi-Action Self-Improvement for Neural Combinatorial Optimization
General Fourier Feature Physics-Informed Extreme Learning Machine (GFF-PIELM) for High-Frequency PDEs
Leveraging Teleconnections with Physics-Informed Graph Attention Networks for Long-Range Extreme Rainfall Forecasting in Thailand
Traveling Salesman-Based Token Ordering Improves Stability in Homomorphically Encrypted Language Models
Towards Cross-Modal Error Detection with Tables and Images
Enhanced Pre-training of Graph Neural Networks for Million-Scale Heterogeneous Graphs
Cautious Weight Decay
Continuous Uniqueness and Novelty Metrics for Generative Modeling of Inorganic Crystals
Bayesian Optimization for Dynamic Pricing and Learning
Time-Correlated Video Bridge Matching
CrossAD: Time Series Anomaly Detection with Cross-scale Associations and Cross-window Modeling
Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance
Multi-Armed Bandits with Minimum Aggregated Revenue Constraints
Research in Collaborative Learning Does Not Serve Cross-Silo Federated Learning in Practice
Towards Fast Coarse-graining and Equation Discovery with Foundation Inference Models
Expert or not? assessing data quality in offline reinforcement learning
On Foundation Models for Temporal Point Processes to Accelerate Scientific Discovery
Towards Foundation Inference Models that Learn ODEs In-Context
Structured Sparsity and Weight-adaptive Pruning for Memory and Compute efficient Whisper models
Structure-Aware Spectral Sparsification via Uniform Edge Sampling
Keep Calm and Avoid Harmful Content: Concept Alignment and Latent Manipulation Towards Safer Answers
CoRA: Covariate-Aware Adaptation of Time Series Foundation Models
Few Shot Semi-Supervised Learning for Abnormal Stop Detection from Sparse GPS Trajectories
Multitask finetuning and acceleration of chemical pretrained models for small molecule drug property prediction
CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression
Improving Decision Trees through the Lens of Parameterized Local Search
Doctor Rashomon and the UNIVERSE of Madness: Variable Importance with Unobserved Confounding and the Rashomon Effect
KoALA: KL-L0 Adversarial Detector via Label Agreement
Sample-Efficient Omniprediction for Proper Losses
scPPDM: A Diffusion Model for Single-Cell Drug-Response Prediction
Multi-objective Bayesian Optimization with Human-in-the-Loop for Flexible Neuromorphic Electronics Fabrication
Quantum Kernel Methods: Convergence Theory, Separation Bounds and Applications to Marketing Analytics
PRISM: Enhancing Protein Inverse Folding through Fine-Grained Retrieval on Structure-Sequence Multimodal Representations
Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models
On Thompson Sampling and Bilateral Uncertainty in Additive Bayesian Optimization
Active Subspaces in Infinite Dimension
High-Probability Bounds For Heterogeneous Local Differential Privacy
Simplifying Optimal Transport through Schatten-p Regularization
Enhancing Diffusion-Based Sampling with Molecular Collective Variables
Statistical Guarantees for High-Dimensional Stochastic Gradient Descent
Embedding the Teacher: Distilling vLLM Preferences for Scalable Image Retrieval
MIARec: Mutual-influence-aware Heterogeneous Network Embedding for Scientific Paper Recommendation
Compressibility Measures Complexity: Minimum Description Length Meets Singular Learning Theory
FedLoDrop: Federated LoRA with Dropout for Generalized LLM Fine-tuning
Locket: Robust Feature-Locking Technique for Language Models
Probabilistic Super-Resolution for Urban Micrometeorology via a Schrödinger Bridge
Follow-the-Perturbed-Leader for Decoupled Bandits: Best-of-Both-Worlds and Practicality
Learning Mean-Field Games through Mean-Field Actor-Critic Flow
Controllable Collision Scenario Generation via Collision Pattern Prediction
A Gradient Guided Diffusion Framework for Chance Constrained Programming
The Living Forecast: Evolving Day-Ahead Predictions into Intraday Reality
Heterogeneous RBCs via deep multi-agent reinforcement learning
DeepTrust: Multi-Step Classification through Dissimilar Adversarial Representations for Robust Android Malware Detection
Learning Latent Energy-Based Models via Interacting Particle Langevin Dynamics
Pretraining in Actor-Critic Reinforcement Learning for Robot Motion Control
Constrained Sensing and Reliable State Estimation with Shallow Recurrent Decoders on a TRIGA Mark II Reactor
Improved Central Limit Theorem and Bootstrap Approximations for Linear Stochastic Approximation
Improving Generative Behavior Cloning via Self-Guidance and Adaptive Chunking
Robot Learning: A Tutorial
Geopolitics, Geoeconomics and Risk: A Machine Learning Approach
Neural Guided Sampling for Quantum Circuit Optimization
Formal Models and Convergence Analysis for Context-Aware Security Verification
Diff-XYZ: A Benchmark for Evaluating Diff Understanding
Why the noise model matters: A performance gap in learned regularization
Universal Adaptive Environment Discovery
Same model, better performance: the impact of shuffling on DNA Language Models benchmarking
Adapting Noise to Data: Generative Flows from 1D Processes
Contraction and entropy production in continuous-time Sinkhorn dynamics
Data-Model Co-Evolution: Growing Test Sets to Refine LLM Behavior
Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps
Wavefront Coding for Accommodation-Invariant Near-Eye Displays
Posterior Sampling for Continuing Environments
WW-FL: Secure and Private Large-Scale Federated Learning
IBCL: Zero-shot Model Generation under Stability-Plasticity Trade-offs
Competitive Advantage Attacks to Decentralized Federated Learning
Optimistic Multi-Agent Policy Gradient
Toward Fair Graph Neural Networks Via Dual-Teacher Knowledge Distillation
AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws
Dimension Reduction with Locally Adjusted Graphs
Resource-Constrained Federated Continual Learning: What Does Matter?
Evaluating multiple models using labeled and unlabeled data
RIGNO: A Graph-based framework for robust and accurate operator learning for PDEs on arbitrary domains
Mirror Descent Actor Critic via Bounded Advantage Learning
Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation
Wasserstein-based Kernel Principal Component Analysis for Clustering Applications
MooseAgent: A LLM Based Multi-agent Framework for Automating Moose Simulation
Generative Deep Learning Framework for Inverse Design of Fuels
Newton-Puiseux Analysis for Interpretability and Calibration of Complex-Valued Neural Networks
Variational Rank Reduction Autoencoders
Panda: A pretrained forecast model for chaotic dynamics
Narrow Operator Models of Stellarator Equilibria in Fourier Zernike Basis
K-Merge: Online Continual Merging of Adapters for On-device Large Language Models
In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
Modeling Cultural Bias in Facial Expression Recognition with Adaptive Agents
OpenDerisk: An Industrial Framework for AI-Driven SRE, with Design, Implementation, and Case Studies
Deflanderization for Game Dialogue: Balancing Character Authenticity with Task Execution in LLM-based NPCs
Subject Roles in the EU AI Act: Mapping and Regulatory Implications
NOSA: Native and Offloadable Sparse Attention
Message Passing on the Edge: Towards Scalable and Expressive GNNs
The Role of Computing Resources in Publishing Foundation Model Research
Unlocking Public Catalogues: Instruction-Tuning LLMs for ICD Coding of German Tumor Diagnoses
Closing the Gap Between Text and Speech Understanding in LLMs
Time Series Foundation Models: Benchmarking Challenges and Requirements
Axial Neural Networks for Dimension-Free Foundation Models
CanvasMAR: Improving Masked Autoregressive Video Generation With Canvas
MVCustom: Multi-View Customized Diffusion via Geometric Latent Rendering and Completion
Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents
Dedelayed: Deleting remote inference delay via on-device correction
NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching
FIRST: Federated Inference Resource Scheduling Toolkit for Scientific AI Model Access
Multi-Scale High-Resolution Logarithmic Grapher Module for Efficient Vision GNNs
RECODE: Reasoning Through Code Generation for Visual Question Answering
Scaling Vision Transformers for Functional MRI with Flat Maps
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
The Art of Scaling Reinforcement Learning Compute for LLMs
Provably Invincible Adversarial Attacks on Reinforcement Learning Systems: A Rate-Distortion Information-Theoretic Approach
Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs
Generative Universal Verifier as Multimodal Meta-Reasoner
Quantile Markov Decision Process
Translating Regulatory Clauses into Executable Codes for Building Design Checking via Large Language Model Driven Function Matching and Composing
Improving Planning with Large Language Models: A Modular Agentic Architecture
Sentiment and Emotion-aware Multi-criteria Fuzzy Group Decision Making System
Reinforcing Competitive Multi-Agents for Playing 'So Long Sucker'
AI Realtor: Towards Grounded Persuasive Language Generation for Automated Copywriting
LLM-Enabled In-Context Learning for Data Collection Scheduling in UAV-assisted Sensor Networks
Deep Generative Prior for First Order Inverse Optimization
GUARDIAN: Safeguarding LLM Multi-Agent Collaborations with Temporal Graph Modeling
MSEarth: A Multimodal Scientific Dataset and Benchmark for Phenomena Uncovering in Earth Science
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning
Nash Equilibria, Regularization and Computation in Optimal Transport-Based Distributionally Robust Optimization
MULTI: Multimodal Understanding Leaderboard with Text and Images
Do LLM Agents Have Regret? A Case Study in Online Learning and Games
A Comprehensive Survey on Data Augmentation
Extreme Compression of Adaptive Neural Images
Beyond Visual Appearances: Privacy-sensitive Objects Identification via Hybrid Graph Reasoning
Hi-Drive: Hierarchical POMDP Planning for Safe Autonomous Driving in Diverse Urban Environments
Temporal-Difference Variational Continual Learning
Optimal Quantization for Matrix Multiplication
ProReason: Multi-Modal Proactive Reasoning with Decoupled Eyesight and Wisdom
On the Limits of Language Generation: Trade-Offs Between Hallucination and Mode Collapse
Semantically Guided Action Anticipation
SoundnessBench: A Soundness Benchmark for Neural Network Verifiers
BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery
Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process
FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model
PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection
Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding
Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
A Personalized Data-Driven Generative Model of Human Repetitive Motion
On the Consistency of Multilingual Context Utilization in Retrieval-Augmented Generation
Statistical post-processing yields accurate probabilistic forecasts from Artificial Intelligence weather models
MIRROR: Multimodal Cognitive Reframing Therapy for Rolling with Resistance
FineScope: Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
Flattening Hierarchies with Policy Bootstrapping
Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning
R²ec: Towards Large Recommender Models with Reasoning
ReasoningShield: Safety Detection over Reasoning Traces of Large Reasoning Models
Multi-Scale Probabilistic Generation Theory: A Unified Information-Theoretic Framework for Hierarchical Structure in Large Language Models
Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding
The quest for the GRAph Level autoEncoder (GRALE)
FLEX: A Largescale Multimodal, Multiview Dataset for Learning Structured Representations for Fitness Action Quality Assessment
Decentralizing Multi-Agent Reinforcement Learning with Temporal Causal Information
Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning
PAL: Probing Audio Encoders via LLMs - Audio Information Transfer into LLMs
Investigating the interaction of linguistic and mathematical reasoning in language models using multilingual number puzzles
A Brain-to-Population Graph Learning Framework for Diagnosing Brain Disorders
LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
Orthogonal Finetuning Made Scalable
Early Signs of Steganographic Capabilities in Frontier LLMs
Adversarial Distilled Retrieval-Augmented Guarding Model for Online Malicious Intent Detection
Think as a Doctor: An Interpretable AI Approach for ICU Mortality Prediction
Schrödinger bridge for generative AI: Soft-constrained formulation and convergence analysis
Z0-Inf: Zeroth Order Approximation for Data Influence
WaveletDiff: Multilevel Wavelet Diffusion For Time Series Generation
Evaluating Open-Source Vision-Language Models for Multimodal Sarcasm Detection
Actor-Enriched Time Series Forecasting of Process Performance
Improving Knowledge Graph Embeddings through Contrastive Learning with Negative Statements
Robust Adversarial Reinforcement Learning in Stochastic Games via Sequence Modeling
ADARL: Adaptive Low-Rank Structures for Robust Policy Learning under Uncertainty
Variational Mixture of Graph Neural Experts for Alzheimer's Disease Biomarker Recognition in EEG Brain Networks
From Literal to Liberal: A Meta-Prompting Framework for Eliciting Human-Aligned Exception Handling in Large Language Models
DeepPlanner: Scaling Planning Capability for Deep Research Agents via Advantage Shaping
SENTINEL: A Multi-Level Formal Framework for Safety Evaluation of LLM-based Embodied Agents
From Narratives to Probabilistic Reasoning: Predicting and Interpreting Drivers' Hazardous Actions in Crashes Using Large Language Model
Toward Reasoning-Centric Time-Series Analysis
Repairing Reward Functions with Human Feedback to Mitigate Reward Hacking
Emotional Cognitive Modeling Framework with Desire-Driven Objective Optimization for LLM-empowered Agent in Social Simulation
Adaptive Reasoning Executor: A Collaborative Agent System for Efficient Reasoning
Personalized Learning Path Planning with Goal-Driven Learner State Modeling
EvoTest: Evolutionary Test-Time Learning for Self-Improving Agentic Systems
An Analytical Framework to Enhance Autonomous Vehicle Perception for Smart Cities
SAJA: A State-Action Joint Attack Framework on Multi-Agent Deep Reinforcement Learning
Learnable Game-theoretic Policy Optimization for Data-centric Self-explanation Rationalization
Assessing LLM Reasoning Through Implicit Causal Chain Discovery in Climate Discourse
Mobile Coverage Analysis using Crowdsourced Data
Confidence as a Reward: Transforming LLMs into Reward Models
A Methodology for Assessing the Risk of Metric Failure in LLMs Within the Financial Domain
Tandem Training for Language Models
A Modal Logic for Temporal and Jurisdictional Classifier Models
Training LLM Agents to Empower Humans
From Refusal to Recovery: A Control-Theoretic Approach to Generative AI Guardrails
Hard2Verify: A Step-Level Verification Benchmark for Open-Ended Frontier Math
AutoCode: LLMs as Problem Setters for Competitive Programming
Benchmarking Open-Source Large Language Models for Persian in Zero-Shot and Few-Shot Learning
Cancer Diagnosis Categorization in Electronic Health Records Using Large Language Models and BioBERT: Model Performance Evaluation Study
From Noise to Signal to Selbstzweck: Reframing Human Label Variation in the Era of Post-training in NLP
MEDEQUALQA: Evaluating Biases in LLMs with Counterfactual Reasoning
Beyond Discrete Categories: Multi-Task Valence-Arousal Modeling for Pet Vocalization Analysis
Evidence Without Injustice: A New Counterfactual Test for Fair Algorithms
Classifier-Augmented Generation for Structured Workflow Prediction
Scheming Ability in LLM-to-LLM Strategic Interactions
Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
Mathematics with large language models as provers and verifiers
Gobernanza y trazabilidad "a prueba de AI Act" para casos de uso legales: un marco técnico-jurídico, métricas forenses y evidencias auditables
MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training
Coherent Load Profile Synthesis with Conditional Diffusion for LV Distribution Network Scenario Generation
Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction
Repurposing Annotation Guidelines to Instruct LLM Annotators: A Case Study
Semantic knowledge guides innovation and drives cultural evolution
A²FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning
FaStFACT: Faster, Stronger Long-Form Factuality Evaluations in LLMs
VLURes: Benchmarking VLM Visual and Linguistic Understanding in Low-Resource Languages
Ethic-BERT: An Enhanced Deep Learning Model for Ethical and Non-Ethical Content Classification
Efficient Adaptive Transformer: An Empirical Study and Reproducible Framework
Adaptive Generation of Bias-Eliciting Questions for LLMs
A Critical Review of the Need for Knowledge-Centric Evaluation of Quranic Recitation
Three Lenses on the AI Revolution: Risk, Transformation, Continuity
KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems
InferA: A Smart Assistant for Cosmological Ensemble Data
HyWA: Hypernetwork Weight Adapting Personalized Voice Activity Detection
SpareCodeSearch: Searching for Code Context When You Have No Spare GPU
Epistemic-aware Vision-Language Foundation Model for Fetal Ultrasound Interpretation
A Multimodal XAI Framework for Trustworthy CNNs and Bias Detection in Deep Representation Learning
Max It or Miss It: Benchmarking LLM On Solving Extremal Problems
CurLL: A Developmental Framework to Evaluate Continual Learning in Language Models
Developing and Validating the Arabic Version of the Attitudes Toward Large Language Models Scale
Deliberate Lab: A Platform for Real-Time Human-AI Social Experiments
Randomness and Interpolation Improve Gradient Descent
SeqBench: Benchmarking Sequential Narrative Generation in Text-to-Video Models
SceneAdapt: Scene-aware Adaptation of Human Motion Diffusion
Time-Varying Optimization for Streaming Data Via Temporal Weighting
VLA-0: Building State-of-the-Art VLAs with Zero Modification
Towards Human-Centric Intelligent Treatment Planning for Radiation Therapy
True Self-Supervised Novel View Synthesis is Transferable
NeuroRVQ: Multi-Scale EEG Tokenization for Generative Large Brainwave Models
Transformer-based Scalable Beamforming Optimization via Deep Residual Learning
Agentic Discovery: Closing the Loop with Cooperative Agents
A Multi-dimensional Semantic Surprise Framework Based on Low-Entropy Semantic Manifolds for Fine-Grained Out-of-Distribution Detection
ESI: Epistemic Uncertainty Quantification via Semantic-preserving Intervention for Large Language Models
TRUSTVIS: A Multi-Dimensional Trustworthiness Evaluation Framework for Large Language Models
DriveCritic: Towards Context-Aware, Human-Aligned Evaluation for Autonomous Driving with Vision-Language Models
Multi-Label Clinical Text Eligibility Classification and Summarization System
On the Reasoning Abilities of Masked Diffusion Language Models
Stable LLM Ensemble: Interaction between Example Representativeness and Diversity
Program of Thoughts for Financial Reasoning: Leveraging Dynamic In-Context Examples and Generative Retrieval
Behavioral Embeddings of Programs: A Quasi-Dynamic Approach for Optimization Prediction
StressTransfer: Stress-Aware Speech-to-Speech Translation with Emphasis Preservation
Paper Copilot: Tracking the Evolution of Peer Review in AI Conferences
LLM-Guided Synthetic Augmentation (LGSA) for Mitigating Bias in AI Systems
CleverCatch: A Knowledge-Guided Weak Supervision Model for Fraud Detection
MimicParts: Part-aware Style Injection for Speech-Driven 3D Motion Generation
What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging
MotionBeat: Motion-Aligned Music Representation via Embodied Contrastive Learning and Bar-Equivariant Contact-Aware Encoding
Real-Time Crowd Counting for Embedded Systems with Lightweight Architecture
A Ratio-Based Shapley Value for Collaborative Machine Learning - Extended Version
To Steer or Not to Steer? Mechanistic Error Reduction with Abstention for Language Models
Higher Satisfaction, Lower Cost: A Technical Report on How LLMs Revolutionize Meituan's Intelligent Interaction Systems
LLM one-shot style transfer for Authorship Attribution and Verification
Self-Augmented Visual Contrastive Decoding
Injection, Attack and Erasure: Revocable Backdoor Attacks via Machine Unlearning
Thompson Sampling via Fine-Tuning of LLMs
AOAD-MAT: Transformer-based multi-agent deep reinforcement learning model considering agents' order of action decisions
Protect: Towards Robust Guardrailing Stack for Trustworthy Enterprise LLM Systems
Personal Attribute Leakage in Federated Speech Models
Adversarial Fine-tuning in Offline-to-Online Reinforcement Learning for Robust Robot Control
Generalist++: A Meta-learning Framework for Mitigating Trade-off in Adversarial Training
Language as a Label: Zero-Shot Multimodal Classification of Everyday Postures under Data Scarcity
Document Intelligence in the Era of Large Language Models: A Survey
A New Perspective on Transformers in Online Reinforcement Learning for Continuous Control
MADREC: A Multi-Aspect Driven LLM Agent for Explainable and Adaptive Recommendation
From Minimal Existence to Human Definition: The CES-IMU-HSG Theoretical Framework
Semantic Communication Enabled Holographic Video Processing and Transmission
Rectify and Align GPS Points to Parking Spots via Rank-1 Constraint
Neural Sum-of-Squares: Certifying the Nonnegativity of Polynomials with Transformers
LiteraryQA: Towards Effective Evaluation of Long-document Narrative QA
DistilCLIP-EEG: Enhancing Epileptic Seizure Detection Through Multi-modal Learning and Knowledge Distillation
ConsintBench: Evaluating Language Models on Real-World Consumer Intent Understanding
MedREK: Retrieval-Based Editing for Medical LLMs with Key-Aware Prompts
Offline and Online KL-Regularized RLHF under Differential Privacy
UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning

Research Sources: 396 | Generated: 10/16/2025