AI Research News Feeds for October 13th, 2025

AI RESEARCH PAPERS & ACADEMIC SOURCES

SQ-GAN: Semantic Image Communications Using Masked Vector Quantization
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection
ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition
Differentially Private 2D Human Pose Estimation
Deep Learning for Sports Video Event Detection: Tasks, Datasets, Methods, and Challenges
The Role of Video Generation in Enhancing Data-Limited Action Understanding
HoliTom: Holistic Token Merging for Fast Video Large Language Models
SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images
Solving Inverse Problems with FLAIR
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Contour Errors: An Ego-Centric Metric for Reliable 3D Multi-Object Tracking
Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model
DiffMark: Diffusion-based Robust Watermark Against Deepfakes
Event-RGB Fusion for Spacecraft Pose Estimation Under Harsh Lighting
SMF: Template-free and Rig-free Animation Transfer using Kinetic Codes
Mask Tokens as Prophet: Fine-Grained Cache Eviction for Efficient dLLM Inference
LLP: LLM-based Product Pricing in E-commerce
ReTraceQA: Evaluating Reasoning Traces of Small Language Models in Commonsense Question Answering
Logit Arithmetic Elicits Long Reasoning Capabilities Without Training
NL2GenSym: Natural Language to Generative Symbolic Rules for SOAR Cognitive Architecture via Large Language Models
Understanding the Effects of Domain Finetuning on LLMs
Token-Level Policy Optimization: Linking Group-Level Rewards to Token-Level Aggregation via Markov Likelihood
KORMo: Korean Open Reasoning Model for Everyone
Domain-Adapted Pre-trained Language Models for Implicit Information Extraction in Crash Narratives
Getting Your Indices in a Row: Full-Text Search for LLM Training Data for Real World
StatEval: A Comprehensive Benchmark for Large Language Models in Statistics
Can We Reliably Rank Model Performance across Domains without Labeled Data?
Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking
Evaluating Robustness of Large Language Models Against Multilingual Typographical Errors
Beyond Surface Reasoning: Unveiling the True Long Chain-of-Thought Capacity of Diffusion Large Language Models
Hierarchical Indexing with Knowledge Enrichment for Multilingual Video Corpus Retrieval
A Comprehensive Evaluation of Multilingual Chain-of-Thought Reasoning: Performance, Consistency, and Faithfulness Across Languages
WUGNECTIVES: Novel Entity Inferences of Language Models from Discourse Connectives
AutoPR: Let's Automate Your Academic Promotion!
Mind-Paced Speaking: A Dual-Brain Approach to Real-Time Reasoning in Spoken Language Models
Unleashing Perception-Time Scaling to Multimodal Reasoning Models
Exploiting Web Search Tools of AI Agents for Data Exfiltration
Unsupervised lexicon learning from speech is limited by representations rather than clustering
Target speaker anonymization in multi-speaker recordings
Privacy-Preserving Parameter-Efficient Fine-Tuning for Large Language Model Services
On the Reliability of Large Language Models for Causal Discovery
Augmenting Compliance-Guaranteed Customer Service Chatbots: Context-Aware Knowledge Expansion with Large Language Models
Improbable Bigrams Expose Vulnerabilities of Incomplete Tokens in Byte-Level Tokenizers
RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation
AnyEdit: Edit Any Knowledge Encoded in Language Models
LightMamba: Efficient Mamba Acceleration on FPGA with Quantization and Hardware Co-design
What Are They Filtering Out? An Experimental Benchmark of Filtering Strategies for Harm Reduction in Pretraining Datasets
RAISE: Reinforced Adaptive Instruction Selection For Large Language Models
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Learning to Reason Across Parallel Samples for LLM Reasoning
RedDebate: Safer Responses through Multi-Agent Red Teaming Debates
Phonikud: Hebrew Grapheme-to-Phoneme Conversion for Real-Time Text-to-Speech
Learning to Disentangle Latent Reasoning Rules with Language VAEs: A Systematic Study
Chain-of-Retrieval Augmented Generation
Adjusting Initial Noise to Mitigate Memorization in Text-to-Image Diffusion Models
The Digital Mirror: Gender Bias and Occupational Stereotypes in AI-Generated Images
Dynamic Mixture-of-Experts for Visual Autoregressive Model
Detection of high-frequency oscillations using time-frequency analysis
PhyDAE: Physics-Guided Degradation-Adaptive Experts for All-in-One Remote Sensing Image Restoration
Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
BEAR: Benchmarking and Enhancing Multimodal Language Models for Atomic Embodied Capabilities
LinearSR: Unlocking Linear Attention for Stable and Efficient Image Super-Resolution
Q-Router: Agentic Video Quality Assessment with Expert Model Routing and Artifact Localization
Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering
FOLK: Fast Open-Vocabulary 3D Instance Segmentation via Label-guided Knowledge Distillation
Modeling Time-Lapse Trajectories to Characterize Cranberry Growth
SegTrans: Transferable Adversarial Examples for Segmentation Models
Defense against Unauthorized Distillation in Image Restoration via Feature Space Perturbation
mmJoints: Expanding Joint Representations Beyond (x,y,z) in mmWave-Based 3D Pose Estimation
Hierarchical Scheduling for Multi-Vector Image Retrieval
HandEval: Taking the First Step Towards Hand Quality Evaluation in Generated Images
Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation
Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy
Lesion-Aware Post-Training of Latent Diffusion Models for Synthesizing Diffusion MRI from CT Perfusion
Visual Anomaly Detection for Reliable Robotic Implantation of Flexible Microelectrode Array
MambaH-Fit: Rethinking Hyper-surface Fitting-based Point Cloud Normal Estimation via State Space Modelling
GL-DT: Multi-UAV Detection and Tracking with Global-Local Integration
Dense2MoE: Restructuring Diffusion Transformer to MoE for Efficient Text-to-Image Generation
Polar Separable Transform for Efficient Orthogonal Rotation-Invariant Image Representation
Online Topological Localization for Navigation Assistance in Bronchoscopy
Instance-Level Generation for Representation Learning
TARO: Toward Semantically Rich Open-World Object Detection
Online Video Depth Anything: Temporally-Consistent Depth Prediction with Low Memory Consumption
Cattle-CLIP: A Multimodal Framework for Cattle Behaviour Recognition
3D Reconstruction from Transient Measurements with Time-Resolved Transformer
Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
Tag-Enriched Multi-Attention with Large Language Models for Cross-Domain Sequential Recommendation
Hallucination Filtering in Radiology Vision-Language Models Using Discrete Semantic Entropy
MomentSeg: Moment-Centric Sampling for Enhanced Video Pixel Understanding
Spotlight on Token Perception for Multimodal Reinforcement Learning
Foraging with the Eyes: Dynamics in Human Visual Gaze and Deep Predictive Modeling
RadioFlow: Efficient Radio Map Construction Framework with Flow Matching
Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation
Instance-Aware Robust Consistency Regularization for Semi-Supervised Nuclei Instance Segmentation
Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark
Boosting Multi-modal Keyphrase Prediction with Dynamic Chain-of-Thought in Vision-Language Models
BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception
Visibility-Aware Densification for 3D Gaussian Splatting in Dynamic Urban Scenes
Minkowski-MambaNet: A Point Cloud Framework with Selective State Space Models for Forest Biomass Quantification
Utilizing dynamic sparsity on pretrained DETR
Mono4DEditor: Text-Driven 4D Scene Editing from Monocular Video via Point-Level Localization of Language-Embedded Gaussians
Dynamic Weight-based Temporal Aggregation for Low-light Video Enhancement
PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs
Diagonal Artifacts in Samsung Images: PRNU Challenges and Solutions
PRNet: Original Information Is All You Have
FLOWING: Implicit Neural Flows for Structure-Preserving Morphing
TC-LoRA: Temporally Modulated Conditional LoRA for Adaptive Diffusion Control
FSP-DETR: Few-Shot Prototypical Parasitic Ova Detection
Vision Language Models: A Survey of 26K Papers
SpaceVista: All-Scale Visual Spatial Reasoning from mm to km
VITA-VLA: Efficiently Teaching Vision-Language Models to Act via Action Expert Distillation
Look before Transcription: End-to-End SlideASR with Visually-Anchored Policy Optimization
Interlaced dynamic XCT reconstruction with spatio-temporal implicit neural representations
Generating Sizing Fields for Mesh Generation via GCN-based Simplification of Adaptive Background Grids
Progressive Uncertainty-Guided Evidential U-KAN for Trustworthy Medical Image Segmentation
FS-RWKV: Leveraging Frequency Spatial-Aware RWKV for 3T-to-7T MRI Translation
SAM2-3dMed: Empowering SAM2 for 3D Medical Image Segmentation
Rewiring Development in Brain Segmentation: Leveraging Adult Brain Priors for Enhancing Infant MRI Segmentation
FFT-based Selection and Optimization of Statistics for Robust Recognition of Severely Corrupted Images
Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors
SkipClick: Combining Quick Responses and Low-Level Features for Interactive Segmentation in Winter Sports Contexts
Enhancing Biomedical Named Entity Recognition using GLiNER-BioMed with Targeted Dictionary-Based Post-processing for BioASQ 2025 task 6
Systematic Diagnosis of Brittle Reasoning in Large Language Models
Confidence, Not Perplexity: A Better Metric for the Creative Era of LLMs
YpathRAG: A Retrieval-Augmented Generation Framework and Benchmark for Pathology
GraphGhost: Tracing Structures Behind Large Language Models
Gender Bias in Large Language Models for Healthcare: Assignment Consistency and Clinical Implications
Iterative LLM-Based Generation and Refinement of Distracting Conditions in Math Word Problems
LLMs Show Surface-Form Brittleness Under Paraphrase Stress Tests
JAI-1: A Thai-Centric Large Language Model
From Simulation to Strategy: Automating Personalized Interaction Planning for Conversational Agents
Text2Stories: Evaluating the Alignment Between Stakeholder Interviews and Generated User Stories
Do LLMs Know They Are Being Tested? Evaluation Awareness and Incentive-Sensitive Failures in GPT-OSS-20B
ExPO-HM: Learning to Explain-then-Detect for Hateful Meme Detection
Scaling Laws for Code: A More Data-Hungry Regime
Thinking Longer, Not Always Smarter: Evaluating LLM Capabilities in Hierarchical Legal Reasoning
How Many Code and Test Cases Are Enough? Evaluating Test Cases Generation from a Binary-Matrix Perspective
Learning What to Remember: Adaptive Probabilistic Memory Retention for Memory-Efficient Language Models
MOSAIC: Multi-agent Orchestration for Task-Intelligent Scientific Coding
The Model's Language Matters: A Comparative Privacy Analysis of LLMs
Search-on-Graph: Iterative Informed Navigation for Large Language Model Reasoning on Knowledge Graphs
Quality Estimation Reranking for Document-Level Translation
FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs
Autoencoding-Free Context Compression for LLMs via Contextual Semantic Anchors
Artificial Impressions: Evaluating Large Language Model Behavior Through the Lens of Trait Impressions
SOP-Maze: Evaluating Large Language Models on Complicated Business Standard Operating Procedures
Creation of the Chinese Adaptive Policy Communication Corpus
MASA: LLM-Driven Multi-Agent Systems for Autoformalization
DARO: Difficulty-Aware Reweighting Policy Optimization
Decoupling Safety into Orthogonal Subspace: Cost-Efficient and Performance-Preserving Alignment for Large Language Models
LitE-SQL: A Lightweight and Efficient Text-to-SQL Framework with Vector-based Schema Linking and Execution-Guided Self-Correction
Automated Refinement of Essay Scoring Rubrics for Language Models via Reflect-and-Revise
Exploring Cross-Lingual Knowledge Transfer via Transliteration-Based MLM Fine-Tuning for Critically Low-resource Chakma Language
Large Language Models Do NOT Really Know What They Don't Know
ReFIne: A Framework for Trustworthy Large Reasoning Models with Reliability, Faithfulness, and Interpretability
FrameEOL: Semantic Frame Induction using Causal Language Models
When Retrieval Succeeds and Fails: Rethinking Retrieval-Augmented Generation for LLMs
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation
Augmenting Dialog with Think-Aloud Utterances for Modeling Individual Personality Traits by LLM
Stronger Re-identification Attacks through Reasoning and Aggregation
LLaMAX2: Your Translation-Enhanced Model also Performs Well in Reasoning
DSPO: Stable and Efficient Policy Optimization for Agentic Search and Reasoning
CFVBench: A Comprehensive Video Benchmark for Fine-grained Multimodal Retrieval-Augmented Generation
One Sentence, Two Embeddings: Contrastive Learning of Explicit and Implicit Semantic Representations
MaP: A Unified Framework for Reliable Evaluation of Pre-training Dynamics
ShiZhi: A Chinese Lightweight Large Language Model for Court View Generation
A Biophysically-Conditioned Generative Framework for 3D Brain Tumor MRI Synthesis
A Multimodal Approach to SME Credit Scoring Integrating Transaction and Ownership Networks
Estimating Brain Activity with High Spatial and Temporal Resolution using a Naturalistic MEG-fMRI Encoding Model
Active Model Selection for Large Language Models
Hybrid Models for Natural Language Reasoning: The Case of Syllogistic Logic
D-TPT: Dimensional Entropy Maximization for Calibrating Test-Time Prompt Tuning in Vision-Language Models
Few-shot multi-token DreamBooth with LoRA for style-consistent character generation
Efficient Autoregressive Inference for Transformer Probabilistic Models
Unsupervised full-field Bayesian inference of orthotropic hyperelasticity from a single biaxial test: a myocardial case study
Interpretable Generative and Discriminative Learning for Multimodal and Incomplete Clinical Data
Conditional Flow Matching for Bayesian Posterior Inference
Three Birds with One Stone: Improving Performance, Convergence, and System Throughput with Nest
From Contextual Data to Newsvendor Decisions: On the Actual Performance of Data-Driven Algorithms
Fair Graph Machine Learning under Adversarial Missingness Processes
FREE: The Foundational Semantic Recognition for Modeling Environmental Ecosystems
Towards Natural Machine Unlearning
An Imitative Reinforcement Learning Framework for Pursuit-Lock-Launch Missions
Direct Quantized Training of Language Models with Stochastic Rounding
A Digital Twin for Diesel Engines: Operator-infused Physics-Informed Neural Networks with Transfer Learning for Engine Health Monitoring
Covariances for Free: Exploiting Mean Distributions for Training-free Federated Learning
Network Dynamics-Based Framework for Understanding Deep Neural Networks
Filtering out mislabeled training instances using black-box optimization and quantum annealing
Orthogonal Representation Learning for Estimating Causal Quantities
Causal Additive Models with Unobserved Causal Paths and Backdoor Paths
Exploring Neural Granger Causality with xLSTMs: Unveiling Temporal Dependencies in Complex Data
Detecting and Filtering Unsafe Training Data via Data Attribution with Denoised Representation
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
On the Interpolation Effect of Score Smoothing in Diffusion Models
Aggregation on Learnable Manifolds for Asynchronous Federated Optimization
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
What's Inside Your Diffusion Model? A Score-Based Riemannian Metric to Explore the Data Manifold
CausalDynamics: A large-scale benchmark for structural discovery of dynamical causal models
Automated Capability Evaluation of Foundation Models
Partition Generative Modeling: Masked Modeling Without Masks
PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects
Epistemic Errors of Imperfect Multitask Learners When Distributions Shift
Training-free AI for Earth Observation Change Detection using Physics Aware Neuromorphic Networks
NIMO: a Nonlinear Interpretable MOdel
Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
Discrete Compositional Generation via General Soft Operators and Robust Reinforcement Learning
Towards the Training of Deeper Predictive Coding Neural Networks
Large Language Model Agent for Modular Task Execution in Drug Discovery
Networked Information Aggregation via Machine Learning
ROC-n-reroll: How verifier imperfection affects test-time scaling
DQS: A Low-Budget Query Strategy for Enhancing Unsupervised Data-driven Anomaly Detection Approaches
K-ASTRO: Structure-Aware Adaptation of LLMs for Code Vulnerability Detection
Multiparameter regularization and aggregation in the context of polynomial functional regression
SolNet: Open-source deep learning models for photovoltaic power forecasting across the globe
Mini-batch Estimation for Deep Cox Models: Statistical Foundations and Practical Guidance
Frequency-Guided Posterior Sampling for Diffusion-Based Image Restoration
NLP-ADBench: NLP Anomaly Detection Benchmark
Making Bias Amplification in Balanced Datasets Directional and Interpretable
Convergence of two-timescale gradient descent ascent dynamics: finite-dimensional and mean-field perspectives
SWE-Arena: An Interactive Platform for Evaluating Foundation Models in Software Engineering
How a Bilingual LM Becomes Bilingual: Tracing Internal Representations with Sparse Autoencoders
Hierarchical autoregressive neural networks in three-dimensional statistical system
Understanding and Improving Information Preservation in Prompt Compression for LLMs
Gradient-based Sample Selection for Faster Bayesian Optimization
What Makes LLMs Effective Sequential Recommenders? A Study on Preference Intensity and Temporal Context
Generalized Probabilistic Approximate Optimization Algorithm
Lizard: An Efficient Linearization Framework for Large Language Models
Velocity and Density-Aware RRI Analysis and Optimization for AoI Minimization in IoV SPS
Simple and Robust Forecasting of Spatiotemporally Correlated Small Earth Data with A Tabular Foundation Model
AB-PINNs: Adaptive-Basis Physics-Informed Neural Networks for Residual-Driven Domain Decomposition
MATT-CTR: Unleashing a Model-Agnostic Test-Time Paradigm for CTR Prediction with Confidence-Guided Inference Paths
Bi-level Meta-Policy Control for Dynamic Uncertainty Calibration in Evidential Deep Learning
Variability Aware Recursive Neural Network (VARNN): A Residual-Memory Model for Capturing Temporal Deviation in Sequence Regression Modeling
When LLM Agents Meet Graph Optimization: An Automated Data Quality Improvement Approach
HiBBO: HiPPO-based Space Consistency for High-dimensional Bayesian Optimisation
Diagnosing and Mitigating System Bias in Self-Rewarding RL
FedL2T: Personalized Federated Learning with Two-Teacher Distillation for Seizure Prediction
Constraints-of-Thought: A Framework for Constrained Reasoning in Language-Model-Guided Search
LLM Unlearning on Noisy Forget Sets: A Study of Incomplete, Rewritten, and Watermarked Data
Slim Scheduler: A Runtime-Aware RL and Scheduler System for Efficient CNN Inference
MagicDock: Toward Docking-oriented De Novo Ligand Design via Gradient Inversion
The Environmental Impacts of Machine Learning Training Keep Rising Evidencing Rebound Effect
The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against LLM Jailbreaks and Prompt Injections
Convergence of optimizers implies eigenvalues filtering at equilibrium
Spatio-Temporal Graph Convolutional Networks for EV Charging Demand Forecasting Using Real-World Multi-Modal Data Integration
Improving Anomaly Detection in Industrial Time Series: The Role of Segmentation and Heterogeneous Ensemble
FLToP CTC: Frame-Level Token Pruning via Relative Threshold for Efficient and Memory-Saving Decoding on Diverse Platforms
Neural Codecs as Biosignal Tokenizers
AdaPM: a Partial Momentum Algorithm for LLM Training
Regret Bounds for Adversarial Contextual Bandits with General Function Approximation and Delayed Feedback
Score-Based Density Estimation from Pairwise Comparisons
Logits Replay + MoClip: Stabilized, Low-Cost Post-Training with Minimal Forgetting
Agentic-KGR: Co-evolutionary Knowledge Graph Construction through Multi-Agent Reinforcement Learning
Efficient Resource-Constrained Training of Vision Transformers via Subspace Optimization
Robustness and Regularization in Hierarchical Re-Basin
Beyond Pairwise Connections: Extracting High-Order Functional Brain Network Structures under Global Constraints
RepDL: Bit-level Reproducible Deep Learning Training and Inference
FM-IRL: Flow-Matching for Reward Modeling and Policy Regularization in Reinforcement Learning
Prime Implicant Explanations for Reaction Feasibility Prediction
Incentivizing Time-Aware Fairness in Data Sharing
A PCA-based Data Prediction Method
Mitigating Model Drift in Developing Economies Using Synthetic Data and Outliers
Large Language Model Prompt Datasets: An In-depth Analysis and Insights
Residual-Informed Learning of Solutions to Algebraic Loops
Safety Game: Balancing Safe and Informative Conversations with Blackbox Agentic AI using LP Solvers
Efficient Bayesian Inference from Noisy Pairwise Comparisons
Deep Learning to Identify the Spatio-Temporal Cascading Effects of Train Delays in a High-Density Network
CHUCKLE -- When Humans Teach AI To Learn Emotions The Easy Way
HINT: Helping Ineffective Rollouts Navigate Towards Effectiveness
Cross-Receiver Generalization for RF Fingerprint Identification via Feature Disentanglement and Adversarial Training
What Do Temporal Graph Learning Models Learn?
Weight Initialization and Variance Dynamics in Deep Neural Networks and Large Language Models
Cross-attention Secretly Performs Orthogonal Alignment in Recommendation Models
On Uniformly Scaling Flows: A Density-Aligned Approach to Deep One-Class Classification
Interpretable Machine Learning for Predicting Startup Funding, Patenting, and Exits
Geodesic Calculus on Latent Spaces
CRPS-LAM: Regional ensemble weather forecasting from matching marginals
Locally Optimal Private Sampling: Beyond the Global Minimax
Near-Optimal Second-Order Guarantees for Model-Based Adversarial Imitation Learning
Geo-Aware Models for Stream Temperature Prediction across Different Spatial Regions and Scales
Automated Evolutionary Optimization for Resource-Efficient Neural Network Training
STaTS: Structure-Aware Temporal Sequence Summarization via Statistical Window Merging
MODE: Learning compositional representations of complex systems with Mixtures Of Dynamical Experts
Evolutionary Computation as Natural Generative AI
Human Texts Are Outliers: Detecting LLM-generated Texts via Out-of-distribution Detection
Which Is Better For Reducing Outdated and Vulnerable Dependencies: Pinning or Floating?
Reproducible Evaluation of Data Augmentation and Loss Functions for Brain Tumor Segmentation
PARSE: LLM Driven Schema Optimization for Reliable Entity Extraction
Out-of-Distribution Detection in LiDAR Semantic Segmentation Using Epistemic Uncertainty from Hierarchical GMMs
Next Semantic Scale Prediction via Hierarchical Diffusion Language Models
QuIRK: Quantum-Inspired Re-uploading KAN
Decoding Positive Selection in Mycobacterium tuberculosis with Phylogeny-Guided Graph Attention Models
Neptune: Advanced ML Operator Fusion for Locality and Parallelism on GPUs
Structured Output Regularization: a framework for few-shot transfer learning
How Reliable is Language Model Micro-Benchmarking?
A Design-based Solution for Causal Inference with Text: Can a Language Model Be Too Large?
Understanding Exoplanet Habitability: A Bayesian ML Framework for Predicting Atmospheric Absorption Spectra
Prioritizing Latency with Profit: A DRL-Based Admission Control for 5G Network Slices
Detecting spills using thermal imaging, pretrained deep learning models, and a robotic platform
Man-Made Heuristics Are Dead. Long Live Code Generators!
Humanoid Everyday: A Comprehensive Robotic Dataset for Open-World Humanoid Manipulation
Gradient-Guided Furthest Point Sampling for Robust Training Set Selection
A Representer Theorem for Hawkes Processes via Penalized Least Squares Minimization
PHyCLIP: $\ell_1$-Product of Hyperbolic Factors Unifies Hierarchy and Compositionality in Vision-Language Representation Learning
Mirror Flow Matching with Heavy-Tailed Priors for Generative Modeling on Convex Domains
Physically Valid Biomolecular Interaction Modeling with Gauss-Seidel Projection
Denoised Diffusion for Object-Focused Image Augmentation
Uncolorable Examples: Preventing Unauthorized AI Colorization via Perception-Aware Chroma-Restrictive Perturbation
Exploring Single Domain Generalization of LiDAR-based Semantic Segmentation under Imperfect Labels
MAKO: Meta-Adaptive Koopman Operators for Learning-based Model Predictive Control of Parametrically Uncertain Nonlinear Systems
MMAudioSep: Taming Video-to-Audio Generative Model Towards Video/Text-Queried Sound Separation
MCMC: Bridging Rendering, Optimization and Generative AI
A Novel Multi-branch ConvNeXt Architecture for Identifying Subtle Pathological Features in CT Scans
Training Feature Attribution for Vision Models
Distributionally robust approximation property of neural networks
Augmented data and neural networks for robust epidemic forecasting: application to COVID-19 in Italy
Flow-Opt: Scalable Centralized Multi-Robot Trajectory Optimization with Flow Matching and Differentiable Optimization
Provable Watermarking for Data Poisoning Attacks
IRIS: An Iterative and Integrated Framework for Verifiable Causal Discovery in the Absence of Tabular Data
Characterizing 5G User Throughput via Uncertainty Modeling and Crowdsourced Measurements
Investigating the Impact of Rational Dilated Wavelet Transform on Motor Imagery EEG Decoding with Deep Learning Models
Application of Deep Reinforcement Learning to At-the-Money S&P 500 Options Hedging
Smart navigation of a gravity-driven glider with adjustable centre-of-mass
Zero-shot image privacy classification with Vision-Language Models
GREAT: Generalizable Backdoor Attacks in RLHF via Emotion-Aware Trigger Synthesis
Placeit! A Framework for Learning Robot Object Placement Skills
Goal-oriented Backdoor Attack against Vision-Language-Action Models via Physical Objects
A unified Bayesian framework for adversarial robustness
Reliability Sensitivity with Response Gradient
A methodology for clinically driven interactive segmentation evaluation
Mitigating Overthinking through Reasoning Shaping
SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models
Dyna-Mind: Learning to Simulate from Experience for Better AI Agents
BaNEL: Exploration Posteriors for Generative Modeling Using Only Negative Rewards
Prompting Test-Time Scaling Is A Strong LLM Reasoning Data Augmentation
StreamingVLM: Real-Time Understanding for Infinite Video Streams
A Method to Improve the Performance of Reinforcement Learning Based on the Y Operator for a Class of Stochastic Differential Equation-Based Child-Mother Systems
Training-Free Safe Denoisers for Safe Use of Diffusion Models
HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions
A Knowledge-Informed Deep Learning Paradigm for Generalizable and Stability-Optimized Car-Following Models
GUI-Shift: Enhancing VLM-Based GUI Agents through Self-supervised Reinforcement Learning
AdaReasoner: Adaptive Reasoning Enables More Flexible Thinking in Large Language Models
ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection
Continual Adapter Tuning with Semantic Shift Compensation for Class-Incremental Learning
E-ICL: Enhancing Fine-Grained Emotion Recognition through the Lens of Prototype Theory
Confidence-weighted integration of human and machine judgments for superior decision-making
UltraSeP: Sequence-aware Pre-training for Echocardiography Probe Movement Guidance
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
Medchain: Bridging the Gap Between LLM Agents and Clinical Practice with Interactive Sequence
Generate Any Scene: Scene Graph Driven Data Synthesis for Visual Generation Training
SwarmGPT: Combining Large Language Models with Safe Motion Planning for Drone Swarm Choreography
Preference Discerning with LLM-Enhanced Generative Retrieval
AD-LLM: Benchmarking Large Language Models for Anomaly Detection
Enabling Population-Level Parallelism in Tree-Based Genetic Programming for GPU Acceleration
OrcaLoca: An LLM Agent Framework for Software Issue Localization
IG-MCTS: Human-in-the-Loop Cooperative Navigation under Incomplete Information
RadVLM: A Multitask Conversational Vision-Language Model for Radiology
Solving Linear-Gaussian Bayesian Inverse Problems with Decoupled Diffusion Sequential Monte Carlo
WyckoffDiff -- A Generative Diffusion Model for Crystal Symmetry
Contrastive Learning Augmented Social Recommendations
Measuring directional bias amplification in image captions using predictability
CCDP: Composition of Conditional Diffusion Policies with Guided Sampling
Issue Localization via LLM-Driven Iterative Code Graph Searching
Brain2Text Decoding Model Reveals the Neural Mechanisms of Visual Semantic Processing
DeepOHeat-v1: Efficient Operator Learning for Fast and Trustworthy Thermal Simulation and Optimization in 3D-IC Design
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
Diffusion Generative Recommendation with Continuous Tokens
Exploring human-SAV interaction using LLMs: The impact of psychological factors on user experience
Multimodal Language Models See Better When They Look Shallower
Cognitio Emergens: Agency, Dimensions, and Dynamics in Human-AI Knowledge Co-Creation
System Prompt Optimization with Meta-Learning
Collaborative Unlabeled Data Optimization
Game of Trust: How Trustworthy Does Your Blockchain Think You Are?
Sequential Monte Carlo for Policy Optimization in Continuous POMDPs
DDO: Dual-Decision Optimization for LLM-Based Medical Consultation via Multi-Agent Collaboration
Beyond Demonstrations: Dynamic Vector Construction from Latent Representations
FinTagging: Benchmarking LLMs for Extracting and Structuring Financial Information
An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks
Elicit and Enhance: Advancing Multimodal Reasoning in Medical Scenarios
Robustness in Both Domains: CLIP Needs a Robust Text Encoder
AD-EE: Early Exiting for Fast and Reliable Vision-Language Models in Autonomous Driving
Flattery in Motion: Benchmarking and Analyzing Sycophancy in Video-LLMs
CausalVLBench: Benchmarking Visual Causal Reasoning in Large Vision-Language Models
Symmetry in Neural Network Parameter Spaces
Bures-Wasserstein Flow Matching for Graph Generation
Interpretable and Granular Video-Based Quantification of Motor Characteristics from the Finger Tapping Test in Parkinson Disease
Mem4Nav: Boosting Vision-and-Language Navigation in Urban Environments with a Hierarchical Spatial-Cognition Long-Short Memory System
Tuning without Peeking: Provable Privacy and Generalization Bounds for LLM Post-Training
AirScape: An Aerial Generative World Model with Motion Controllability
Site-Level Fine-Tuning with Progressive Layer Freezing: Towards Robust Prediction of Bronchopulmonary Dysplasia from Day-1 Chest Radiographs in Extremely Preterm Infants
Preprint: Poster: Did I Just Browse A Website Written by LLMs?
A Mega-Study of Digital Twins Reveals Strengths, Weaknesses and Opportunities for Further Improvement
How Scale Breaks "Normalized Stress" and KL Divergence: Rethinking Quality Metrics
Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting
Counterfactually Fair Conformal Prediction
Transmuting prompts into weights
SHAP-Based Supervised Clustering for Sample Classification and the Generalized Waterfall Plot
Faithful and Interpretable Explanations for Complex Ensemble Time Series Forecasts using Surrogate Models and Forecastability Analysis
RFOD: Random Forest-based Outlier Detection for Tabular Data
Conformal Risk Training: End-to-End Optimization of Conformal Risk Control
Exploring Cross-Client Memorization of Training Data in Large Language Models for Federated Learning
LOTION: Smoothing the Optimization Landscape for Quantized Training
Spatial Deconfounder: Interference-Aware Deconfounding for Spatial Causal Inference
Reinforcement Learning-Based Optimization of CT Acquisition and Reconstruction Parameters Through Virtual Imaging Trials
Zero-Shot Policy Transfer in Reinforcement Learning using Buckingham's Pi Theorem
Weights initialization of neural networks for function approximation
PO-CKAN: Physics Informed Deep Operator Kolmogorov Arnold Networks with Chunk Rational Structure
TAPAS: Datasets for Learning the Learning with Errors Problem
Edu-EmotionNet: Cross-Modality Attention Alignment with Temporal Feedback Loops
TinyGraphEstimator: Adapting Lightweight Language Models for Graph Structure Inference
Long-Tailed Recognition via Information-Preservable Two-Stage Learning
The Boundaries of Fair AI in Medical Image Prognosis: A Causal Perspective
On the Alignment Between Supervised and Self-Supervised Contrastive Learning
Sparse components distinguish visual pathways & their alignment to neural networks
Multi-fidelity Batch Active Learning for Gaussian Process Classifiers
An Improved Model-Free Decision-Estimation Coefficient with Applications in Adversarial MDPs
MLLM as a UI Judge: Benchmarking Multimodal LLMs for Predicting Human Perception of User Interfaces
Deceptive Exploration in Multi-armed Bandits
SkipSR: Faster Super Resolution with Token Skipping
Benchmarking Chinese Commonsense Reasoning with a Multi-hop Reasoning Perspective
Adaptive Science Operations in Deep Space Missions Using Offline Belief State Planning
$\mathsf{P} \neq \mathsf{NP}$: A Non-Relativizing Proof via Quantale Weakness and Geometric Complexity
D-CoDe: Scaling Image-Pretrained VLMs to Video via Dynamic Compression and Question Decomposition
McMining: Automated Discovery of Misconceptions in Student Code
CommandSans: Securing AI Agents with Surgical Precision Prompt Sanitization
Reinforcement Learning-Driven Edge Management for Reliable Multi-view 3D Reconstruction
Repository-Aware File Path Retrieval via Fine-Tuned LLMs
Time-Aware Feature Selection: Adaptive Temporal Masking for Stable Sparse Autoencoder Training
Pattern Enhanced Multi-Turn Jailbreaking: Exploiting Structural Vulnerabilities in Large Language Models
Slicing Is All You Need: Towards A Universal One-Sided Algorithm for Distributed Matrix Multiplication
Vector Graph-Based Repository Understanding for Issue-Driven File Retrieval
ControlAudio: Tackling Text-Guided, Timing-Indicated and Intelligible Audio Generation via Progressive Diffusion Modeling
Designing and Evaluating an AI-driven Immersive Multidisciplinary Simulation (AIMS) for Interprofessional Education
Exploring Multi-Temperature Strategies for Token- and Rollout-Level Control in RLVR
HES-SQL: Hybrid Reasoning for Efficient Text-to-SQL with Structural Skeleton Guidance
Pinpointing crucial steps: Attribution-based Credit Assignment for Verifiable Reinforcement Learning
A Unified Biomedical Named Entity Recognition Framework with Large Language Models
A Frequency-Domain Analysis of the Multi-Armed Bandit Problem: A New Perspective on the Exploration-Exploitation Trade-off
Co-Authoring the Self: A Human-AI Interface for Interest Reflection in Recommenders
RO-Bench: Large-scale robustness evaluation of MLLMs with text-driven counterfactual videos
SHERLOCK: Towards Dynamic Knowledge Adaptation in LLM-enhanced E-commerce Risk Management
A Human Behavioral Baseline for Collective Governance in Software Projects
Analytical Survey of Learning with Low-Resource Data: From Analysis to Investigation
Learning Regularizers: Learning Optimizers that can Regularize
SEER: Sustainability Enhanced Engineering of Software Requirements
PlatformX: An End-to-End Transferable Platform for Energy-Efficient Neural Architecture Search
Saving SWE-Bench: A Benchmark Mutation Approach for Realistic Agent Evaluation
SQS: Bayesian DNN Compression through Sparse Quantized Sub-distributions
On Epistemic Uncertainty of Visual Tokens for Object Hallucinations in Large Vision-Language Models
DiTSinger: Scaling Singing Voice Synthesis with Diffusion Transformer and Implicit Alignment
Value-State Gated Attention for Mitigating Extreme-Token Phenomena in Transformers
Déréverbération non-supervisée de la parole par modèle hybride
Robust Driving Control for Autonomous Vehicles: An Intelligent General-sum Constrained Adversarial Reinforcement Learning Approach
Cost-Efficient Long Code Translation using LLMs while Leveraging Identifier Replacements
Alif: Advancing Urdu Large Language Models via Multilingual Synthetic Data Distillation
Emotion-Disentangled Embedding Alignment for Noise-Robust and Cross-Corpus Speech Emotion Recognition
Training Models to Detect Successive Robot Errors from Human Reactions
AI and Human Oversight: A Risk-Based Framework for Alignment
When a Robot is More Capable than a Human: Learning from Constrained Demonstrators
MemLoss: Enhancing Adversarial Training with Recycling Adversarial Examples
SOS: Synthetic Object Segments Improve Detection, Segmentation, and Grounding
On the Fairness of Privacy Protection: Measuring and Mitigating the Disparity of Group Privacy Risks for Differentially Private Machine Learning
MSDM: Generating Task-Specific Pathology Images with a Multimodal Conditioned Diffusion Model for Cell and Nuclei Segmentation
Controlled Personalization in Legacy Media Online Services: A Case Study in News Recommendation
Federated Data Analytics for Cancer Immunotherapy: A Privacy-Preserving Collaborative Platform for Patient Management
Cross-Representation Benchmarking in Time-Series Electronic Health Records for Clinical Outcome Prediction
On the Implicit Adversariality of Catastrophic Forgetting in Deep Continual Learning
Modern Deep Learning Approaches for Cricket Shot Classification: A Comprehensive Baseline Study
Towards Safer and Understandable Driver Intention Prediction
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs
DICE: Structured Reasoning in LLMs through SLM-Guided Chain-of-Thought Correction
Clear Roads, Clear Vision: Advancements in Multi-Weather Restoration for Smart Transportation
Diagnosing Shoulder Disorders Using Multimodal Large Language Models and Consumer-Grade Cameras
CrisiText: A dataset of warning messages for LLM training in emergency communication
Obstacle Avoidance using Dynamic Movement Primitives and Reinforcement Learning
Detecting Data Contamination from Reinforcement Learning Post-training for Large Language Models
SynthID-Image: Image watermarking at internet scale
Inflated Excellence or True Performance? Rethinking Medical Diagnostic Benchmarks with Dynamic Evaluation
CLARity: Reasoning Consistency Alone Can Teach Reinforced Experts
CapGeo: A Caption-Assisted Approach to Geometric Reasoning
A Model-Driven Engineering Approach to AI-Powered Healthcare Platforms
Verifying Chain-of-Thought Reasoning via Its Computational Graph
Rate optimal learning of equilibria from data
Randomized HyperSteiner: A Stochastic Delaunay Triangulation Heuristic for the Hyperbolic Steiner Minimal Tree
FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference
deep-REMAP: Probabilistic Parameterization of Stellar Spectra Using Regularized Multi-Task Learning
The Potential of Second-Order Optimization for LLMs: A Study with Full Gauss-Newton
Task-Level Insights from Eigenvalues across Sequence Models
Design Principles for Sequence Models via Coefficient Dynamics
Identifying & Interactively Refining Ambiguous User Goals for Data Visualization Code Generation
ChoirRec: Semantic User Grouping via LLMs for Conversion Rate Prediction of Low-Activity Users
Beyond Single-Granularity Prompts: A Multi-Scale Chain-of-Thought Prompt Learning for Graph
On the Representations of Entities in Auto-regressive Large Language Models
The Speech-LLM Takes It All: A Truly Fully End-to-End Spoken Dialogue State Tracking Approach
Bandits with Single-Peaked Preferences and Limited Resources
SilvaScenes: Tree Segmentation and Species Classification from Under-Canopy Images in Natural Forests
Failure Prediction at Runtime for Generative Robot Policies
Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols
Scalable Multi-Agent Path Finding using Collision-Aware Dynamic Alert Mask and a Hybrid Execution Strategy
Multimodal Policy Internalization for Conversational Agents
Performance Analysis of Machine Learning Algorithms in Chronic Kidney Disease Prediction
Precoder Design in Multi-User FDD Systems with VQ-VAE and GNN
Autonomous Soft Robotic Guidewire Navigation via Imitation Learning
FATHOMS-RAG: A Framework for the Assessment of Thinking and Observation in Multimodal Systems that use Retrieval Augmented Generation
EcphoryRAG: Re-Imagining Knowledge-Graph RAG via Human Associative Memory
DualResearch: Entropy-Gated Dual-Graph Retrieval for Answer Reconstruction
Semantic-Condition Tuning: Fusing Graph Context with Large Language Models for Knowledge Graph Completion
Tiny-R1V: Lightweight Multimodal Unified Reasoning Model via Model Merging
TripScore: Benchmarking and rewarding real-world travel planning with fine-grained evaluation
RefGrader: Automated Grading of Mathematical Competition Proofs using Agentic Workflows
Repairing Regex Vulnerabilities via Localization-Guided Instructions
Auto-scaling Continuous Memory for GUI Agent
Humanoid Artificial Consciousness Designed with Large Language Model Based on Psychoanalysis and Personality Theory
MEC$^3$O: Multi-Expert Consensus for Code Time Complexity Prediction
OSCAR: Orthogonal Stochastic Control for Alignment-Respecting Diversity in Flow Matching
Physics-Informed High-order Graph Dynamics Identification Learning for Predicting Complex Networks Long-term Dynamics
Leading the Follower: Learning Persuasive Agents in Social Deduction Games
PAC Reasoning: Controlling the Performance Loss for Efficient Reasoning
Dr. Bias: Social Disparities in AI-Powered Medical Guidance
Comparing Knowledge Source Integration Methods for Optimizing Healthcare Knowledge Fusion in Rescue Operation
RegexPSPACE: A Benchmark for Evaluating LLM Reasoning on PSPACE-complete Regex Problems
Fundamentals of Building Autonomous LLM Agents
Localist LLMs -- A Mathematical Framework for Dynamic Locality Control
Toward Mechanistic Explanation of Deductive Reasoning in Language Models
Sequence Variables: A Constraint Programming Computational Domain for Routing and Sequencing
Agentic Systems in Radiology: Design, Applications, Evaluation, and Challenges
Titans Revisited: A Lightweight Reimplementation and Critical Analysis of a Test-Time Memory Model
Safe, Untrusted, "Proof-Carrying" AI Agents: toward the agentic lakehouse
GraphMERT: Efficient and Scalable Distillation of Reliable Knowledge Graphs from Unstructured Data
LiveOIBench: Can Large Language Models Outperform Human Contestants in Informatics Olympiads?
Deep Multimodal Subspace Clustering Networks
Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition with Multimodal Training
Deep Sparse Representation-based Classification
PyNoetic: A modular python framework for no-code development of EEG brain-computer interfaces
Comparative Analysis of Large Language Models for the Machine-Assisted Resolution of User Intentions
AgenticAD: A Specialized Multiagent System Framework for Holistic Alzheimer Disease Management
LadderSym: A Multimodal Interleaved Transformer for Music Practice Error Detection
Evaluating Hallucinations in Multimodal LLMs with Spoken Queries under Diverse Acoustic Conditions
Articulation-Informed ASR: Integrating Articulatory Features into ASR via Auxiliary Speech Inversion and Cross-Attention Fusion
Dynamic Stress Detection: A Study of Temporal Progression Modelling of Stress in Speech
EGSTalker: Real-Time Audio-Driven Talking Head Generation with Efficient Gaussian Deformation
Beyond CNNs: Efficient Fine-Tuning of Multi-Modal LLMs for Object Detection on Low-Data Regimes
The Enduring Dominance of Deep Neural Networks: A Critical Analysis of the Fundamental Limitations of Quantum Machine Learning and Spiking Neural Networks
Less Diverse, Less Safe: The Indirect But Pervasive Risk of Test-Time Scaling in Large Language Models
Hierarchical Self-Supervised Representation Learning for Depression Detection from Speech
BaldWhisper: Faster Whisper with Head Shearing and Layer Merging
Recover-LoRA: Data-Free Accuracy Recovery of Degraded Language Models via Low-Rank Adaptation
Mnemosyne: An Unsupervised, Human-Inspired Long-Term Memory Architecture for Edge-Based LLMs
LatentBreak: Jailbreaking Large Language Models through Latent Space Feedback
Toward a Safer Web: Multilingual Multi-Agent LLMs for Mitigating Adversarial Misinformation Attacks
Centering Emotion Hotspots: Multimodal Local-Global Fusion and Cross-Modal Alignment for Emotion Recognition in Conversations
MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation
Relative Positioning Based Code Chunking Method For Rich Context Retrieval In Repository Level Code Completion Task With Code Language Model
Impact of LLMs on Team Collaboration in Software Development
From What to Why: Thought-Space Recommendation with Small Language Models
Hi-OSCAR: Hierarchical Open-set Classifier for Human Activity Recognition
Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry
Automating Android Build Repair: Bridging the Reasoning-Execution Gap in LLM Agents with Domain-Specific Tools
Energy-Driven Steering: Reducing False Refusals in Large Language Models
Upfront Chain-of-Thought: A Cooperative Framework for Chain-of-Thought Compression
Inverse-Free Wilson Loops for Transformers: A Practical Diagnostic for Invariance and Order Sensitivity
Formalizing Style in Personal Narratives
Knowledge Graph Sparsification for GNN-based Rare Disease Diagnosis
A 3D Generation Framework from Cross Modality to Parameterized Primitive
Inner-Instance Normalization for Time Series Forecasting
Provably Robust Adaptation for Language-Empowered Foundation Models
CATS-Linear: Classification Auxiliary Linear Model for Time Series Forecasting
DPCformer: An Interpretable Deep Learning Model for Genomic Prediction in Crops
A Novel Framework for Augmenting Rating Scale Tests with LLM-Scored Text Data
Faver: Boosting LLM-based RTL Generation with Function Abstracted Verifiable Middleware
RA-Gen: A Controllable Code Generation Framework Using ReAct for Multi-Agent Task Execution
dInfer: An Efficient Inference Framework for Diffusion Language Models
RAG4Tickets: AI-Powered Ticket Resolution via Retrieval-Augmented Generation on JIRA and GitHub Data
FreqCa: Accelerating Diffusion Models via Frequency-Aware Caching
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
ConPoSe: LLM-Guided Contact Point Selection for Scalable Cooperative Object Pushing
In-Context Learning for Non-Stationary MIMO Equalization
Enhancing Self-Supervised Learning with Semantic Pairs: A New Dataset and Empirical Study
When to Reason: Semantic Router for vLLM
Coordinates from Context: Using LLMs to Ground Complex Location References
Graph Diffusion Transformers are In-Context Molecular Designers
SAFER-AiD: Saccade-Assisted Foveal-peripheral vision Enhanced Reconstruction for Adversarial Defense
Struc-EMB: The Potential of Structure-Aware Encoding in Language Embeddings
Re-Identifying Kākā with AI-Automated Video Key Frame Extraction
Measuring Moral LLM Responses in Multilingual Capacities
Guiding Exploration in Reinforcement Learning Through LLM-Augmented Observations
Hypothesis Hunting with Evolving Networks of Autonomous Scientific Agents
Optimizing delivery for quick commerce factoring qualitative assessment of generated routes
Unified World Models: Memory-Augmented Planning and Foresight for Visual Navigation
Robust Heuristic Algorithm Design with LLMs
COMPASS: Enhancing Agent Long-Horizon Reasoning with Evolving Context
Everyone prefers human writers, including AI
What Is Your Agent's GPA? A Framework for Evaluating Agent Goal-Plan-Action Alignment
ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review
GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare
LM Fight Arena: Benchmarking Large Multimodal Models via Game Competition
RADAR: Mechanistic Pathways for Detecting Data Contamination in LLM Evaluation

Research Sources: 599 | Generated: 10/13/2025