AI RESEARCH PAPERS & ACADEMIC SOURCES
- EduDial: Constructing a Large-scale Multi-turn Teacher-Student Dialogue Corpus
- Who's Asking? Evaluating LLM Robustness to Inquiry Personas in Factual Question Answering
- The Curious Case of Curiosity across Human Cultures and LLMs
- 3-Model Speculative Decoding
- A Multilingual, Large-Scale Study of the Interplay between LLM Safeguards, Personalisation, and Disinformation
- OPLoRA: Orthogonal Projection LoRA Prevents Catastrophic Forgetting during Parameter-Efficient Fine-Tuning
- On the Role of Preference Variance in Preference Optimization
- GatePro: Parameter-Free Expert Selection Optimization for Mixture-of-Experts Models
- I Am Aligned, But With Whom? MENA Values Benchmark for Evaluating Cultural Alignment and Multilingual Bias in LLMs
- Mirror Speculative Decoding: Breaking the Serial Barrier in LLM Inference
- A Matter of Representation: Towards Graph-Based Abstract Code Generation
- CoT-Evo: Evolutionary Distillation of Chain-of-Thought for Scientific Reasoning
- Putting on the Thinking Hats: A Survey on Chain of Thought Fine-tuning from the Perspective of Human Reasoning Mechanism
- DSCD: Large Language Model Detoxification with Self-Constrained Decoding
- SHIELD: Classifier-Guided Prompting for Robust and Safer LVLMs
- Grounding Long-Context Reasoning with Contextual Normalization for Retrieval-Augmented Generation
- Text Anomaly Detection with Simplified Isolation Kernel
- A fully automated and scalable Parallel Data Augmentation for Low Resource Languages using Image and Text Analytics
- Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain
- Do You Get the Hint? Benchmarking LLMs on the Board Game Concept
- Beyond Correctness: Rewarding Faithful Reasoning in Retrieval-Augmented Generation
- In-Distribution Steering: Balancing Control and Coherence in Language Model Generation
- Mismatch Aware Guidance for Robust Emotion Control in Auto-Regressive TTS Models
- ChatR1: Reinforcement Learning for Conversational Reasoning and Retrieval Augmented Question Answering
- Embedding-Based Context-Aware Reranker
- Taming the Fragility of KV Cache Eviction in LLM Inference
- Are Proverbs the New Pythian Oracles? Exploring Sentiment in Greek Sayings
- D-SMART: Enhancing LLM Dialogue Consistency via Dynamic Structured Memory And Reasoning Tree
- Make an Offer They Can't Refuse: Grounding Bayesian Persuasion in Real-World Dialogues without Pre-Commitment
- Doing Things with Words: Rethinking Theory of Mind Simulation in Large Language Models
- Investigating Lexical Change through Cross-Linguistic Colexification Patterns
- Evaluating Arabic Large Language Models: A Survey of Benchmarks, Methods, and Gaps
- Beyond Single-Reward: Multi-Pair, Multi-Perspective Preference Optimization for Machine Translation
- Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization
- Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models
- FreshTab: Sourcing Fresh Data for Table-to-Text Generation Evaluation
- MemoTime: Memory-Augmented Temporal Knowledge Graph Enhanced Large Language Model Reasoning
- How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study
- GAPS: A Clinically Grounded, Automated Benchmark for Evaluating AI Clinicians
- Assessing Web Search Credibility and Response Groundedness in Chat Assistants
- Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation
- The Mechanistic Emergence of Symbol Grounding in Language Models
- Breadcrumbs Reasoning: Memory-Efficient Reasoning with Compression Beacons
- BRIEF-Pro: Universal Context Compression with Short-to-Long Synthesis for Fast and Accurate Multi-Hop Reasoning
- Toward LLM-Supported Automated Assessment of Critical Thinking Subskills
- Unifying Vision-Language Latents for Zero-label Image Caption Enhancement
- UNCAP: Uncertainty-Guided Planning Using Natural Language Communication for Cooperative Autonomous Vehicles
- Addressing the alignment problem in transportation policy making: an LLM approach
- MMLongCite: A Benchmark for Evaluating Fidelity of Long-Context Vision-Language Models
- Two Heads Are Better Than One: Audio-Visual Speech Error Correction with Dual Hypotheses
- UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE
- LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models
- Towards Region-aware Bias Evaluation Metrics
- ICA-RAG: Information Completeness Guided Adaptive Retrieval-Augmented Generation for Disease Diagnosis
- Teaching Models to Understand (but not Generate) High-risk Data
- What Does Neuro Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs
- RPM: Reasoning-Level Personalization for Black-Box Large Language Models
- RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
- A Linguistically Motivated Analysis of Intonational Phrasing in Text-to-Speech Systems: Revealing Gaps in Syntactic Sensitivity
- The Landscape of Arabic Large Language Models (ALLMs): A New Era for Arabic Language Technology
- MMD-Flagger: Leveraging Maximum Mean Discrepancy to Detect Hallucinations
- SemVink: Advancing VLMs' Semantic Understanding of Optical Illusions via Visual Global Thinking
- KG2QA: Knowledge Graph-enhanced Retrieval-augmented Generation for Communication Standards Question Answering
- Aligning Large Language Models to Low-Resource Languages through LLM-Based Selective Translation: A Systematic Study
- Assessing the Latent Automated Program Repair Capabilities of Large Language Models using Round-Trip Translation
- Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning
- Benchmarking LLMs' Swarm intelligence
- MERIT: Multilingual Semantic Retrieval with Interleaved Multi-Condition Query
- Cross-modal Associations in Vision and Language Models: Revisiting the Bouba-Kiki Effect
- Learning to Explore in Diverse Reward Settings via Temporal-Difference-Error Maximization
- HEAL: An Empirical Study on Hallucinations in Embodied Agents Driven by Large Language Models
- Riemannian generative decoder
- Padé Approximant Neural Networks for Enhanced Electric Motor Fault Diagnosis Using Vibration and Acoustic Data
- Context-Action Embedding Learning for Off-Policy Evaluation in Contextual Bandits
- PPA-Game: Characterizing and Learning Competitive Dynamics Among Online Content Creators
- Learning to sample fibers for goodness-of-fit testing
- Beyond the noise: intrinsic dimension estimation with optimal neighbourhood identification
- Quantum Circuit Synthesis and Compilation Optimization: Overview and Prospects
- Persistent Homology via Ellipsoids
- Computing Systemic Risk Measures with Graph Neural Networks
- LLMBridge: Reducing Costs in a Prompt-Centric Internet
- Enhancing the reliability of machine learning for gravitational wave parameter estimation with attention-based models
- A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics
- Gen-C: Populating Virtual Worlds with Generative Crowds
- BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals
- A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random
- Physics-Based Machine Learning Closures and Wall Models for Hypersonic Transition-Continuum Boundary Layer Predictions
- Efficient Restarts in Non-Stationary Model-Free Reinforcement Learning
- On efficiently computable functions, deep networks and sparse compositionality
- QLENS: Towards A Quantum Perspective of Language Transformers
- Learning by Steering the Neural Dynamics: A Statistical Mechanics Perspective
- Nonlinear discretizations and Newton's method: characterizing stationary points of regression objectives
- Mamba Can Learn Low-Dimensional Targets In-Context via Test-Time Feature Learning
- Influence Dynamics and Stagewise Data Attribution
- GraphShaper: Geometry-aware Alignment for Improving Transfer Learning in Text-Attributed Graphs
- H4G: Unlocking Faithful Inference for Zero-Shot Graph Learning in Hyperbolic Space
- Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning
- nuGPR: GPU-Accelerated Gaussian Process Regression with Iterative Algorithms and Low-Rank Approximations
- Graph Few-Shot Learning via Adaptive Spectrum Experts and Cross-Set Distribution Calibration
- Fairness-Constrained Optimization Attack in Federated Learning
- Self-Verifying Reflection Helps Transformers with CoT Reasoning
- Hierarchical Koopman Diffusion: Fast Generation with Interpretable Diffusion Trajectory
- Unveiling the Vulnerability of Graph-LLMs: An Interpretable Multi-Dimensional Adversarial Attack on TAGs
- Optimal Regularization for Performative Learning
- FedMMKT: Co-Enhancing a Server Text-to-Image Model and Client Task Models in Multi-Modal Federated Learning
- Multi-Action Self-Improvement for Neural Combinatorial Optimization
- General Fourier Feature Physics-Informed Extreme Learning Machine (GFF-PIELM) for High-Frequency PDEs
- Leveraging Teleconnections with Physics-Informed Graph Attention Networks for Long-Range Extreme Rainfall Forecasting in Thailand
- Traveling Salesman-Based Token Ordering Improves Stability in Homomorphically Encrypted Language Models
- Towards Cross-Modal Error Detection with Tables and Images
- Enhanced Pre-training of Graph Neural Networks for Million-Scale Heterogeneous Graphs
- Cautious Weight Decay
- Continuous Uniqueness and Novelty Metrics for Generative Modeling of Inorganic Crystals
- Bayesian Optimization for Dynamic Pricing and Learning
- Time-Correlated Video Bridge Matching
- CrossAD: Time Series Anomaly Detection with Cross-scale Associations and Cross-window Modeling
- Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance
- Multi-Armed Bandits with Minimum Aggregated Revenue Constraints
- Research in Collaborative Learning Does Not Serve Cross-Silo Federated Learning in Practice
- Towards Fast Coarse-graining and Equation Discovery with Foundation Inference Models
- Expert or not? assessing data quality in offline reinforcement learning
- On Foundation Models for Temporal Point Processes to Accelerate Scientific Discovery
- Towards Foundation Inference Models that Learn ODEs In-Context
- Structured Sparsity and Weight-adaptive Pruning for Memory and Compute efficient Whisper models
- Structure-Aware Spectral Sparsification via Uniform Edge Sampling
- Keep Calm and Avoid Harmful Content: Concept Alignment and Latent Manipulation Towards Safer Answers
- CoRA: Covariate-Aware Adaptation of Time Series Foundation Models
- Few Shot Semi-Supervised Learning for Abnormal Stop Detection from Sparse GPS Trajectories
- Multitask finetuning and acceleration of chemical pretrained models for small molecule drug property prediction
- CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression
- Improving Decision Trees through the Lens of Parameterized Local Search
- Doctor Rashomon and the UNIVERSE of Madness: Variable Importance with Unobserved Confounding and the Rashomon Effect
- KoALA: KL-L0 Adversarial Detector via Label Agreement
- Sample-Efficient Omniprediction for Proper Losses
- scPPDM: A Diffusion Model for Single-Cell Drug-Response Prediction
- Multi-objective Bayesian Optimization with Human-in-the-Loop for Flexible Neuromorphic Electronics Fabrication
- Quantum Kernel Methods: Convergence Theory, Separation Bounds and Applications to Marketing Analytics
- PRISM: Enhancing Protein Inverse Folding through Fine-Grained Retrieval on Structure-Sequence Multimodal Representations
- Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models
- On Thompson Sampling and Bilateral Uncertainty in Additive Bayesian Optimization
- Active Subspaces in Infinite Dimension
- High-Probability Bounds For Heterogeneous Local Differential Privacy
- Simplifying Optimal Transport through Schatten-p Regularization
- Enhancing Diffusion-Based Sampling with Molecular Collective Variables
- Statistical Guarantees for High-Dimensional Stochastic Gradient Descent
- Embedding the Teacher: Distilling vLLM Preferences for Scalable Image Retrieval
- MIARec: Mutual-influence-aware Heterogeneous Network Embedding for Scientific Paper Recommendation
- Compressibility Measures Complexity: Minimum Description Length Meets Singular Learning Theory
- FedLoDrop: Federated LoRA with Dropout for Generalized LLM Fine-tuning
- Locket: Robust Feature-Locking Technique for Language Models
- Probabilistic Super-Resolution for Urban Micrometeorology via a Schrödinger Bridge
- Follow-the-Perturbed-Leader for Decoupled Bandits: Best-of-Both-Worlds and Practicality
- Learning Mean-Field Games through Mean-Field Actor-Critic Flow
- Controllable Collision Scenario Generation via Collision Pattern Prediction
- A Gradient Guided Diffusion Framework for Chance Constrained Programming
- The Living Forecast: Evolving Day-Ahead Predictions into Intraday Reality
- Heterogeneous RBCs via deep multi-agent reinforcement learning
- DeepTrust: Multi-Step Classification through Dissimilar Adversarial Representations for Robust Android Malware Detection
- Learning Latent Energy-Based Models via Interacting Particle Langevin Dynamics
- Pretraining in Actor-Critic Reinforcement Learning for Robot Motion Control
- Constrained Sensing and Reliable State Estimation with Shallow Recurrent Decoders on a TRIGA Mark II Reactor
- Improved Central Limit Theorem and Bootstrap Approximations for Linear Stochastic Approximation
- Improving Generative Behavior Cloning via Self-Guidance and Adaptive Chunking
- Robot Learning: A Tutorial
- Geopolitics, Geoeconomics and Risk: A Machine Learning Approach
- Neural Guided Sampling for Quantum Circuit Optimization
- Formal Models and Convergence Analysis for Context-Aware Security Verification
- Diff-XYZ: A Benchmark for Evaluating Diff Understanding
- Why the noise model matters: A performance gap in learned regularization
- Universal Adaptive Environment Discovery
- Same model, better performance: the impact of shuffling on DNA Language Models benchmarking
- Adapting Noise to Data: Generative Flows from 1D Processes
- Contraction and entropy production in continuous-time Sinkhorn dynamics
- Data-Model Co-Evolution: Growing Test Sets to Refine LLM Behavior
- Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps
- Wavefront Coding for Accommodation-Invariant Near-Eye Displays
- Posterior Sampling for Continuing Environments
- WW-FL: Secure and Private Large-Scale Federated Learning
- IBCL: Zero-shot Model Generation under Stability-Plasticity Trade-offs
- Competitive Advantage Attacks to Decentralized Federated Learning
- Optimistic Multi-Agent Policy Gradient
- Toward Fair Graph Neural Networks Via Dual-Teacher Knowledge Distillation
- AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws
- Dimension Reduction with Locally Adjusted Graphs
- Resource-Constrained Federated Continual Learning: What Does Matter?
- Evaluating multiple models using labeled and unlabeled data
- RIGNO: A Graph-based framework for robust and accurate operator learning for PDEs on arbitrary domains
- Mirror Descent Actor Critic via Bounded Advantage Learning
- Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation
- Wasserstein-based Kernel Principal Component Analysis for Clustering Applications
- MooseAgent: A LLM Based Multi-agent Framework for Automating Moose Simulation
- Generative Deep Learning Framework for Inverse Design of Fuels
- Newton-Puiseux Analysis for Interpretability and Calibration of Complex-Valued Neural Networks
- Variational Rank Reduction Autoencoders
- Panda: A pretrained forecast model for chaotic dynamics
- Narrow Operator Models of Stellarator Equilibria in Fourier Zernike Basis
- K-Merge: Online Continual Merging of Adapters for On-device Large Language Models
- In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
- Modeling Cultural Bias in Facial Expression Recognition with Adaptive Agents
- OpenDerisk: An Industrial Framework for AI-Driven SRE, with Design, Implementation, and Case Studies
- Deflanderization for Game Dialogue: Balancing Character Authenticity with Task Execution in LLM-based NPCs
- Subject Roles in the EU AI Act: Mapping and Regulatory Implications
- NOSA: Native and Offloadable Sparse Attention
- Message Passing on the Edge: Towards Scalable and Expressive GNNs
- The Role of Computing Resources in Publishing Foundation Model Research
- Unlocking Public Catalogues: Instruction-Tuning LLMs for ICD Coding of German Tumor Diagnoses
- Closing the Gap Between Text and Speech Understanding in LLMs
- Time Series Foundation Models: Benchmarking Challenges and Requirements
- Axial Neural Networks for Dimension-Free Foundation Models
- CanvasMAR: Improving Masked Autoregressive Video Generation With Canvas
- MVCustom: Multi-View Customized Diffusion via Geometric Latent Rendering and Completion
- Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents
- Dedelayed: Deleting remote inference delay via on-device correction
- NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching
- FIRST: Federated Inference Resource Scheduling Toolkit for Scientific AI Model Access
- Multi-Scale High-Resolution Logarithmic Grapher Module for Efficient Vision GNNs
- RECODE: Reasoning Through Code Generation for Visual Question Answering
- Scaling Vision Transformers for Functional MRI with Flat Maps
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
- The Art of Scaling Reinforcement Learning Compute for LLMs
- Provably Invincible Adversarial Attacks on Reinforcement Learning Systems: A Rate-Distortion Information-Theoretic Approach
- Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs
- Generative Universal Verifier as Multimodal Meta-Reasoner
- Quantile Markov Decision Process
- Translating Regulatory Clauses into Executable Codes for Building Design Checking via Large Language Model Driven Function Matching and Composing
- Improving Planning with Large Language Models: A Modular Agentic Architecture
- Sentiment and Emotion-aware Multi-criteria Fuzzy Group Decision Making System
- Reinforcing Competitive Multi-Agents for Playing 'So Long Sucker'
- AI Realtor: Towards Grounded Persuasive Language Generation for Automated Copywriting
- LLM-Enabled In-Context Learning for Data Collection Scheduling in UAV-assisted Sensor Networks
- Deep Generative Prior for First Order Inverse Optimization
- GUARDIAN: Safeguarding LLM Multi-Agent Collaborations with Temporal Graph Modeling
- MSEarth: A Multimodal Scientific Dataset and Benchmark for Phenomena Uncovering in Earth Science
- Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
- When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning
- Nash Equilibria, Regularization and Computation in Optimal Transport-Based Distributionally Robust Optimization
- MULTI: Multimodal Understanding Leaderboard with Text and Images
- Do LLM Agents Have Regret? A Case Study in Online Learning and Games
- A Comprehensive Survey on Data Augmentation
- Extreme Compression of Adaptive Neural Images
- Beyond Visual Appearances: Privacy-sensitive Objects Identification via Hybrid Graph Reasoning
- Hi-Drive: Hierarchical POMDP Planning for Safe Autonomous Driving in Diverse Urban Environments
- Temporal-Difference Variational Continual Learning
- Optimal Quantization for Matrix Multiplication
- ProReason: Multi-Modal Proactive Reasoning with Decoupled Eyesight and Wisdom
- On the Limits of Language Generation: Trade-Offs Between Hallucination and Mode Collapse
- Semantically Guided Action Anticipation
- SoundnessBench: A Soundness Benchmark for Neural Network Verifiers
- BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery
- Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process
- FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model
- PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection
- Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding
- Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
- A Personalized Data-Driven Generative Model of Human Repetitive Motion
- On the Consistency of Multilingual Context Utilization in Retrieval-Augmented Generation
- Statistical post-processing yields accurate probabilistic forecasts from Artificial Intelligence weather models
- MIRROR: Multimodal Cognitive Reframing Therapy for Rolling with Resistance
- FineScope: Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
- Flattening Hierarchies with Policy Bootstrapping
- Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning
- R²ec: Towards Large Recommender Models with Reasoning
- ReasoningShield: Safety Detection over Reasoning Traces of Large Reasoning Models
- Multi-Scale Probabilistic Generation Theory: A Unified Information-Theoretic Framework for Hierarchical Structure in Large Language Models
- Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding
- The quest for the GRAph Level autoEncoder (GRALE)
- FLEX: A Largescale Multimodal, Multiview Dataset for Learning Structured Representations for Fitness Action Quality Assessment
- Decentralizing Multi-Agent Reinforcement Learning with Temporal Causal Information
- Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning
- PAL: Probing Audio Encoders via LLMs - Audio Information Transfer into LLMs
- Investigating the interaction of linguistic and mathematical reasoning in language models using multilingual number puzzles
- A Brain-to-Population Graph Learning Framework for Diagnosing Brain Disorders
- LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
- Orthogonal Finetuning Made Scalable
- Early Signs of Steganographic Capabilities in Frontier LLMs
- Adversarial Distilled Retrieval-Augmented Guarding Model for Online Malicious Intent Detection
- Think as a Doctor: An Interpretable AI Approach for ICU Mortality Prediction
- Schrödinger bridge for generative AI: Soft-constrained formulation and convergence analysis
- Z0-Inf: Zeroth Order Approximation for Data Influence
- WaveletDiff: Multilevel Wavelet Diffusion For Time Series Generation
- Evaluating Open-Source Vision-Language Models for Multimodal Sarcasm Detection
- Actor-Enriched Time Series Forecasting of Process Performance
- Improving Knowledge Graph Embeddings through Contrastive Learning with Negative Statements
- Robust Adversarial Reinforcement Learning in Stochastic Games via Sequence Modeling
- ADARL: Adaptive Low-Rank Structures for Robust Policy Learning under Uncertainty
- Variational Mixture of Graph Neural Experts for Alzheimer's Disease Biomarker Recognition in EEG Brain Networks
- From Literal to Liberal: A Meta-Prompting Framework for Eliciting Human-Aligned Exception Handling in Large Language Models
- DeepPlanner: Scaling Planning Capability for Deep Research Agents via Advantage Shaping
- SENTINEL: A Multi-Level Formal Framework for Safety Evaluation of LLM-based Embodied Agents
- From Narratives to Probabilistic Reasoning: Predicting and Interpreting Drivers' Hazardous Actions in Crashes Using Large Language Model
- Toward Reasoning-Centric Time-Series Analysis
- Repairing Reward Functions with Human Feedback to Mitigate Reward Hacking
- Emotional Cognitive Modeling Framework with Desire-Driven Objective Optimization for LLM-empowered Agent in Social Simulation
- Adaptive Reasoning Executor: A Collaborative Agent System for Efficient Reasoning
- Personalized Learning Path Planning with Goal-Driven Learner State Modeling
- EvoTest: Evolutionary Test-Time Learning for Self-Improving Agentic Systems
- An Analytical Framework to Enhance Autonomous Vehicle Perception for Smart Cities
- SAJA: A State-Action Joint Attack Framework on Multi-Agent Deep Reinforcement Learning
- Learnable Game-theoretic Policy Optimization for Data-centric Self-explanation Rationalization
- Assessing LLM Reasoning Through Implicit Causal Chain Discovery in Climate Discourse
- Mobile Coverage Analysis using Crowdsourced Data
- Confidence as a Reward: Transforming LLMs into Reward Models
- A Methodology for Assessing the Risk of Metric Failure in LLMs Within the Financial Domain
- Tandem Training for Language Models
- A Modal Logic for Temporal and Jurisdictional Classifier Models
- Training LLM Agents to Empower Humans
- From Refusal to Recovery: A Control-Theoretic Approach to Generative AI Guardrails
- Hard2Verify: A Step-Level Verification Benchmark for Open-Ended Frontier Math
- AutoCode: LLMs as Problem Setters for Competitive Programming
- Benchmarking Open-Source Large Language Models for Persian in Zero-Shot and Few-Shot Learning
- Cancer Diagnosis Categorization in Electronic Health Records Using Large Language Models and BioBERT: Model Performance Evaluation Study
- From Noise to Signal to Selbstzweck: Reframing Human Label Variation in the Era of Post-training in NLP
- MEDEQUALQA: Evaluating Biases in LLMs with Counterfactual Reasoning
- Beyond Discrete Categories: Multi-Task Valence-Arousal Modeling for Pet Vocalization Analysis
- Evidence Without Injustice: A New Counterfactual Test for Fair Algorithms
- Classifier-Augmented Generation for Structured Workflow Prediction
- Scheming Ability in LLM-to-LLM Strategic Interactions
- Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
- Mathematics with large language models as provers and verifiers
- Gobernanza y trazabilidad "a prueba de AI Act" para casos de uso legales: un marco técnico-jurídico, métricas forenses y evidencias auditables
- MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training
- Coherent Load Profile Synthesis with Conditional Diffusion for LV Distribution Network Scenario Generation
- Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction
- Repurposing Annotation Guidelines to Instruct LLM Annotators: A Case Study
- Semantic knowledge guides innovation and drives cultural evolution
- A²FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning
- FaStFACT: Faster, Stronger Long-Form Factuality Evaluations in LLMs
- VLURes: Benchmarking VLM Visual and Linguistic Understanding in Low-Resource Languages
- Ethic-BERT: An Enhanced Deep Learning Model for Ethical and Non-Ethical Content Classification
- Efficient Adaptive Transformer: An Empirical Study and Reproducible Framework
- Adaptive Generation of Bias-Eliciting Questions for LLMs
- A Critical Review of the Need for Knowledge-Centric Evaluation of Quranic Recitation
- Three Lenses on the AI Revolution: Risk, Transformation, Continuity
- KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems
- InferA: A Smart Assistant for Cosmological Ensemble Data
- HyWA: Hypernetwork Weight Adapting Personalized Voice Activity Detection
- SpareCodeSearch: Searching for Code Context When You Have No Spare GPU
- Epistemic-aware Vision-Language Foundation Model for Fetal Ultrasound Interpretation
- A Multimodal XAI Framework for Trustworthy CNNs and Bias Detection in Deep Representation Learning
- Max It or Miss It: Benchmarking LLM On Solving Extremal Problems
- CurLL: A Developmental Framework to Evaluate Continual Learning in Language Models
- Developing and Validating the Arabic Version of the Attitudes Toward Large Language Models Scale
- Deliberate Lab: A Platform for Real-Time Human-AI Social Experiments
- Randomness and Interpolation Improve Gradient Descent
- SeqBench: Benchmarking Sequential Narrative Generation in Text-to-Video Models
- SceneAdapt: Scene-aware Adaptation of Human Motion Diffusion
- Time-Varying Optimization for Streaming Data Via Temporal Weighting
- VLA-0: Building State-of-the-Art VLAs with Zero Modification
- Towards Human-Centric Intelligent Treatment Planning for Radiation Therapy
- True Self-Supervised Novel View Synthesis is Transferable
- NeuroRVQ: Multi-Scale EEG Tokenization for Generative Large Brainwave Models
- Transformer-based Scalable Beamforming Optimization via Deep Residual Learning
- Agentic Discovery: Closing the Loop with Cooperative Agents
- A Multi-dimensional Semantic Surprise Framework Based on Low-Entropy Semantic Manifolds for Fine-Grained Out-of-Distribution Detection
- ESI: Epistemic Uncertainty Quantification via Semantic-preserving Intervention for Large Language Models
- TRUSTVIS: A Multi-Dimensional Trustworthiness Evaluation Framework for Large Language Models
- DriveCritic: Towards Context-Aware, Human-Aligned Evaluation for Autonomous Driving with Vision-Language Models
- Multi-Label Clinical Text Eligibility Classification and Summarization System
- On the Reasoning Abilities of Masked Diffusion Language Models
- Stable LLM Ensemble: Interaction between Example Representativeness and Diversity
- Program of Thoughts for Financial Reasoning: Leveraging Dynamic In-Context Examples and Generative Retrieval
- Behavioral Embeddings of Programs: A Quasi-Dynamic Approach for Optimization Prediction
- StressTransfer: Stress-Aware Speech-to-Speech Translation with Emphasis Preservation
- Paper Copilot: Tracking the Evolution of Peer Review in AI Conferences
- LLM-Guided Synthetic Augmentation (LGSA) for Mitigating Bias in AI Systems
- CleverCatch: A Knowledge-Guided Weak Supervision Model for Fraud Detection
- MimicParts: Part-aware Style Injection for Speech-Driven 3D Motion Generation
- What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging
- MotionBeat: Motion-Aligned Music Representation via Embodied Contrastive Learning and Bar-Equivariant Contact-Aware Encoding
- Real-Time Crowd Counting for Embedded Systems with Lightweight Architecture
- A Ratio-Based Shapley Value for Collaborative Machine Learning - Extended Version
- To Steer or Not to Steer? Mechanistic Error Reduction with Abstention for Language Models
- Higher Satisfaction, Lower Cost: A Technical Report on How LLMs Revolutionize Meituan's Intelligent Interaction Systems
- LLM one-shot style transfer for Authorship Attribution and Verification
- Self-Augmented Visual Contrastive Decoding
- Injection, Attack and Erasure: Revocable Backdoor Attacks via Machine Unlearning
- Thompson Sampling via Fine-Tuning of LLMs
- AOAD-MAT: Transformer-based multi-agent deep reinforcement learning model considering agents' order of action decisions
- Protect: Towards Robust Guardrailing Stack for Trustworthy Enterprise LLM Systems
- Personal Attribute Leakage in Federated Speech Models
- Adversarial Fine-tuning in Offline-to-Online Reinforcement Learning for Robust Robot Control
- Generalist++: A Meta-learning Framework for Mitigating Trade-off in Adversarial Training
- Language as a Label: Zero-Shot Multimodal Classification of Everyday Postures under Data Scarcity
- Document Intelligence in the Era of Large Language Models: A Survey
- A New Perspective on Transformers in Online Reinforcement Learning for Continuous Control
- MADREC: A Multi-Aspect Driven LLM Agent for Explainable and Adaptive Recommendation
- From Minimal Existence to Human Definition: The CES-IMU-HSG Theoretical Framework
- Semantic Communication Enabled Holographic Video Processing and Transmission
- Rectify and Align GPS Points to Parking Spots via Rank-1 Constraint
- Neural Sum-of-Squares: Certifying the Nonnegativity of Polynomials with Transformers
- LiteraryQA: Towards Effective Evaluation of Long-document Narrative QA
- DistilCLIP-EEG: Enhancing Epileptic Seizure Detection Through Multi-modal Learning and Knowledge Distillation
- ConsintBench: Evaluating Language Models on Real-World Consumer Intent Understanding
- MedREK: Retrieval-Based Editing for Medical LLMs with Key-Aware Prompts
- Offline and Online KL-Regularized RLHF under Differential Privacy
- UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning
Research Sources: 396 | Generated: 10/16/2025
