AI RESEARCH PAPERS & ACADEMIC SOURCES
- RESCUE: Crowd Evacuation Simulation via Controlling SDM-United Characters
- Segmenting Bi-Atrial Structures Using ResNext Based Framework
- Poisson multi-Bernoulli mixture filter for trajectory measurements
- Filling of incomplete sinograms from sparse PET detector configurations using a residual U-Net
- Depth-Sequence Transformer (DST) for Segment-Specific ICA Calcification Mapping on Non-Contrast CT
- A Neurosymbolic Agent System for Compositional Visual Reasoning
- RoboSwap: A GAN-driven Video Diffusion Framework For Unsupervised Robot Arm Swapping
- Attention, Please! Revisiting Attentive Probing Through the Lens of Efficiency
- SIRI-Bench: Challenging VLMs' Spatial Intelligence through Complex Reasoning Tasks
- VisualChef: Generating Visual Aids in Cooking via Mask Inpainting
- ImplicitQA: Going beyond frames towards Implicit Video Reasoning
- VSRM: A Robust Mamba-Based Framework for Video Super-Resolution
- Comprehensive Evaluation of Large Multimodal Models for Nutrition Analysis: A New Benchmark Enriched with Contextual Metadata
- SurfDist: Interpretable Three-Dimensional Instance Segmentation Using Curved Surface Patches
- UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks
- DiViD: Disentangled Video Diffusion for Static-Dynamic Factorization
- FVQ: A Large-Scale Dataset and an LMM-based Method for Face Video Quality Assessment
- From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation
- MambaMoE: Mixture-of-Spectral-Spatial-Experts State Space Model for Hyperspectral Image Classification
- FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors
- Enhancing Transformers Through Conditioned Embedded Tokens
- RGB-to-Polarization Estimation: A New Task and Benchmark Study
- GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization
- DS-VTON: An Enhanced Dual-Scale Coarse-to-Fine Framework for Virtual Try-On
- Resolving Task Objective Conflicts in Unified Model via Task-Aware Mixture-of-Experts
- TimeFormer: Capturing Temporal Relationships of Deformable 3D Gaussians for Robust Reconstruction
- Grounding-IQA: Grounding Multimodal Language Model for Image Quality Assessment
- Autonomous Imagination: Closed-Loop Decomposition of Visual-to-Textual Conversion in Visual Reasoning for Multimodal Large Language Models
- DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes
- Leveraging Confident Image Regions for Source-Free Domain-Adaptive Object Detection
- Fast-RF-Shimming: Accelerate RF Shimming in 7T MRI using Deep Learning
- UniUIR: Considering Underwater Image Restoration as An All-in-One Learner
- Generative Human Geometry Distribution
- Explaining Human Preferences via Metrics for Structured 3D Reconstruction
- DecompDreamer: A Composition-Aware Curriculum for Structured 3D Asset Generation
- Sliding Window Attention for Learned Video Compression
- Super-resolution image projection over an extended depth of field using a diffractive decoder
- Use of Quadcopter Wakes to Supplement Strawberry Pollination
- MoME: Mixture of Matryoshka Experts for Audio-Visual Speech Recognition
- The method of the approximate inverse for limited-angle CT
- Adaptive double-phase Rudin-Osher-Fatemi denoising model
- QGFace: Quality-Guided Joint Training For Mixed-Quality Face Recognition
- Foveated Retinotopy Improves Classification and Localization in CNNs
- Interactive Test-Time Adaptation with Reliable Spatial-Temporal Voxels for Multi-Modal Segmentation
- Evaluating Perceptual Distance Models by Fitting Binomial Distributions to Two-Alternative Forced Choice Data
- Reconstructing Topology-Consistent Face Mesh by Volume Rendering from Multi-View Images
- Capsule Network Projectors are Equivariant and Invariant Learners
- PanDORA: Casual HDR Radiance Acquisition for Indoor Scenes
- Law of Vision Representation in MLLMs
- CoralSCOP-LAT: Labeling and Analyzing Tool for Coral Reef Images with Dense Mask
- SSDD: Single-Step Diffusion Decoder for Efficient Image Tokenization
- Exploring the Efficacy of Modified Transfer Learning in Identifying Parkinson's Disease Through Drawn Image Patterns
- Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models
- SegMASt3R: Geometry Grounded Segment Matching
- No-reference Quality Assessment of Contrast-distorted Images using Contrast-enhanced Pseudo Reference
- Neuroplastic Modular Framework: Cross-Domain Image Classification of Garbage and Industrial Surfaces
- Factuality Matters: When Image Generation and Editing Meet Structured Visuals
- Character Mixing for Video Generation
- VChain: Chain-of-Visual-Thought for Reasoning in Video Generation
- How We Won BraTS-SSA 2025: Brain Tumor Segmentation in the Sub-Saharan African Population Using Segmentation-Aware Data Augmentation and Model Ensembling
- Model-Guided Microstimulation Steers Primate Visual Behavior
- Towards Robust and Generalizable Continuous Space-Time Video Super-Resolution with Events
- ExposureEngine: Oriented Logo Detection and Sponsor Visibility Analytics in Sports Broadcasts
- Anomaly-Aware YOLO: A Frugal yet Robust Approach to Infrared Small Target Detection
- Beyond Appearance: Transformer-based Person Identification from Conversational Dynamics
- Hands-Free Heritage: Automated 3D Scanning for Cultural Heritage Digitization
- A Comparative Study of Vision Transformers and CNNs for Few-Shot Rigid Transformation and Fundamental Matrix Estimation
- AvatarVTON: 4D Virtual Try-On for Animatable Avatars
- Flow Matching for Conditional MRI-CT and CBCT-CT Image Synthesis
- Detailed Aerial Mapping of Photovoltaic Power Plants Through Semantically Significant Keypoints
- From Actions to Kinesics: Extracting Human Psychological States through Bodily Movements
- Read the Room: Inferring Social Context Through Dyadic Interaction Recognition in Cyber-physical-social Infrastructure Systems
- μDeepIQA: deep learning-based fast and robust image quality assessment with local predictions for optical microscopy
- In-Field Mapping of Grape Yield and Quality with Illumination-Invariant Deep Learning
- A Semantics-Aware Hierarchical Self-Supervised Approach to Classification of Remote Sensing Images
- TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling
- Conditional Representation Learning for Customized Tasks
- Pathology-CoT: Learning Visual Chain-of-Thought Agent from Expert Whole Slide Image Diagnosis Behavior
- A Spatial-Spectral-Frequency Interactive Network for Multimodal Remote Sensing Classification
- Do Superpixel Segmentation Methods Influence Deforestation Image Classification?
- EduPersona: Benchmarking Subjective Ability Boundaries of Virtual Student Agents
- MoME: Estimating Psychological Traits from Gait with Multi-Stage Mixture of Movement Experts
- ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement
- Label-Efficient Cross-Modality Generalization for Liver Segmentation in Multi-Phase MRI
- ID-Consistent, Precise Expression Generation with Blendshape-Guided Diffusion
- Object-Centric Representation Learning for Enhanced 3D Scene Graph Prediction
- Benchmark on Monocular Metric Depth Estimation in Wildlife Setting
- Let Features Decide Their Own Solvers: Hybrid Feature Caching for Diffusion Transformers
- A Recursive Pyramidal Algorithm for Solving the Image Registration Problem
- Scaling Sequence-to-Sequence Generative Neural Rendering
- The best performance in the CARE 2025 - Liver Task (LiSeg-Contrast): Contrast-Aware Semi-Supervised Segmentation with Domain Generalization and Test-Time Adaptation
- Flexible and Efficient Spatio-Temporal Transformer for Sequential Visual Place Recognition
- ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
- CARE-PD: A Multi-Site Anonymized Clinical Dataset for Parkinson's Disease Gait Assessment
- GenAR: Next-Scale Autoregressive Generation for Spatial Gene Expression Prediction
- Diffusion^2: Dual Diffusion Model with Uncertainty-Aware Adaptive Noise for Momentary Trajectory Prediction
- CodeFormer++: Blind Face Restoration Using Deformable Registration and Deep Metric Learning
- A.I.R.: Enabling Adaptive, Iterative, and Reasoning-based Frame Selection For Video Question Answering
- REAR: Rethinking Visual Autoregressive Models via Generator-Tokenizer Consistency Regularization
- VaseVQA-3D: Benchmarking 3D VLMs on Ancient Greek Pottery
- TBStar-Edit: From Image Editing Pattern Shifting to Consistency Enhancement
- Asynchronous Denoising Diffusion Models for Aligning Text-to-Image Generation
- Zero-Shot Fine-Grained Image Classification Using Large Vision-Language Models
- From Filters to VLMs: Benchmarking Defogging Methods through Object Detection and Segmentation Performance
- Generating Human Motion Videos using a Cascaded Text-to-Video Framework
- Harnessing Synthetic Preference Data for Enhancing Temporal Understanding of Video-LLMs
- Fit Pixels, Get Labels: Meta-learned Implicit Networks for Image Segmentation
- Video-in-the-Loop: Span-Grounded Long Video QA with Interleaved Reasoning
- Enhancing Fake News Video Detection via LLM-Driven Creative Process Simulation
- Ordinal Encoding as a Regularizer in Binary Loss for Solar Flare Prediction
- QuantDemoire: Quantization with Outlier Aware for Image Demoiréing
- Diffusion Low Rank Hybrid Reconstruction for Sparse View Medical Imaging
- Learning Efficient Meshflow and Optical Flow from Event Cameras
- Joint Learning of Pose Regression and Denoising Diffusion with Score Scaling Sampling for Category-level 6D Pose Estimation
- BLADE: Bias-Linked Adaptive DEbiasing
- Contrastive-SDE: Guiding Stochastic Differential Equations with Contrastive Learning for Unpaired Image-to-Image Translation
- Mirage: Unveiling Hidden Artifacts in Synthetic Images with Large Vision-Language Models
- UGround: Towards Unified Visual Grounding with Unrolled Transformers
- Optimized Minimal 4D Gaussian Splatting
- Cross-View Open-Vocabulary Object Detection in Aerial Imagery
- Exploring the Challenge and Value of Deep Learning in Automated Skin Disease Diagnosis
- SDAKD: Student Discriminator Assisted Knowledge Distillation for Super-Resolution Generative Adversarial Networks
- DHQA-4D: Perceptual Quality Assessment of Dynamic 4D Digital Human
- Skin Lesion Classification Based on ResNet-50 Enhanced With Adaptive Spatial Feature Fusion
- Exploring Instruction Data Quality for Explainable Image Quality Assessment
- Real-Time Assessment of Bystander Situation Awareness in Drone-Assisted First Aid
- FrameOracle: Learning What to See and How Much to See in Videos
- Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL
- A Novel Cloud-Based Diffusion-Guided Hybrid Model for High-Accuracy Accident Detection in Intelligent Transportation Systems
- SAMSOD: Rethinking SAM Optimization for RGB-T Salient Object Detection
- LoRA Patching: Exposing the Fragility of Proactive Defenses against Deepfakes
- The Overlooked Value of Test-time Reference Sets in Visual Place Recognition
- CoPA: Hierarchical Concept Prompting and Aggregating Network for Explainable Diagnosis
- Efficiency vs. Efficacy: Assessing the Compression Ratio-Dice Score Relationship through a Simple Benchmarking Framework for Cerebrovascular 3D Segmentation
- MambaCAFU: Hybrid Multi-Scale and Multi-Attention Model with Mamba-Based Fusion for Medical Image Segmentation
- Domain-Robust Marine Plastic Detection Using Vision Models
- Advances in Medical Image Segmentation: A Comprehensive Survey with a Focus on Lumbar Spine Applications
- Error correction in multiclass image classification of facial emotion on unbalanced samples
- OpusAnimation: Code-Based Dynamic Chart Generation
- Visual Odometry with Transformers
- Sonar Image Datasets: A Comprehensive Survey of Resources, Challenges, and Applications
- Learned Display Radiance Fields with Lensless Cameras
- Visual Language Model as a Judge for Object Detection in Industrial Diagrams
- Denoising of Two-Phase Optically Sectioned Structured Illumination Reconstructions Using Encoder-Decoder Networks
- PEaRL: Pathway-Enhanced Representation Learning for Gene and Pathway Expression Prediction from Histology
- Domain Generalization for Semantic Segmentation: A Survey
- From Scope to Script: An Automated Report Generation Model for Gastrointestinal Endoscopy
- Streaming Drag-Oriented Interactive Video Manipulation: Drag Anything, Anytime!
- Improve MLLM Benchmark Efficiency through Interview
- SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning
- Towards Enforcing Company Policy Adherence in Agentic Workflows
- MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly
- Identity resolution of software metadata using Large Language Models
- SoC-DT: Standard-of-Care Aligned Digital Twins for Patient-Specific Tumor Dynamics
- Visualizing Celebrity Dynamics in Video Content: A Proposed Approach Using Face Recognition Timestamp Data
- Geometry of orofacial neuromuscular signals: speech articulation decoding using surface electromyography
- HP-BERT: A framework for longitudinal study of Hinduphobia on social media via language models
- Can Large Language Models Outperform Non-Experts in Poetry Evaluation? A Comparative Study Using the Consensual Assessment Technique
- Praxis-VLM: Vision-Grounded Decision Making via Text-Driven Reinforcement Learning
- Testing Low-Resource Language Support in LLMs Using Language Proficiency Exams: the Case of Luxembourgish
- AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery
- Forecasting Conversation Derailments Through Generation
- SCAN: Structured Capability Assessment and Navigation for LLMs
- DACL-RAG: Data Augmentation Strategy with Curriculum Learning for Retrieval-Augmented Generation
- TACO: Enhancing Multimodal In-context Learning via Task Mapping-Guided Sequence Configuration
- Social Good or Scientific Curiosity? Uncovering the Research Framing Behind NLP Artefacts
- StressTest: Can YOUR Speech LM Handle the Stress?
- Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
- No Tokens Wasted: Leveraging Long Context in Biomedical Vision-Language Models
- Enhancing OCR for Sino-Vietnamese Language Processing via Fine-tuned PaddleOCRv5
- Visual Representations inside the Language Model
- Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches
- Semantic Journeys: Quantifying Change in Emoji Meaning from 2012-2018
- Understanding Retrieval Augmentation for Long-Form Question Answering
- Rowen: Adaptive Retrieval-Augmented Generation for Hallucination Mitigation in LLMs
- Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse Reinforcement Learning
- Can LLMs Detect Ambiguous Plural Reference? An Analysis of Split-Antecedent and Mereological Reference
- Robustness assessment of large audio language models in multiple-choice evaluation
- FedSRD: Sparsify-Reconstruct-Decompose for Communication-Efficient Federated Large Language Models Fine-Tuning
- FT-MDT: Extracting Decision Trees from Medical Texts via a Novel Low-rank Adaptation Method
- Multi-Agent Tool-Integrated Policy Optimization
- JSON Whisperer: Efficient JSON Editing with LLMs
- A Low-Resource Speech-Driven NLP Pipeline for Sinhala Dyslexia Assistance
- ModernBERT + ColBERT: Enhancing biomedical RAG through an advanced re-ranking retriever
- Are BabyLMs Deaf to Gricean Maxims? A Pragmatic Evaluation of Sample-efficient Language Models
- Hybrid Architectures for Language Models: Systematic Analysis and Design Insights
- How I Built ASR for Endangered Languages with a Spoken Dictionary
- Instability in Downstream Task Performance During LLM Pretraining
- When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA
- A Set of Quebec-French Corpus of Regional Expressions and Terms
- Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization
- COLE: a Comprehensive Benchmark for French Language Understanding Evaluation
- Simulating and Understanding Deceptive Behaviors in Long-Horizon Interactions
- AgriGPT-VL: Agricultural Vision-Language Understanding Suite
- LLM Microscope: What Model Internals Reveal About Answer Correctness and Context Utilization
- What Makes Diffusion Language Models Super Data Learners?
- PoLi-RL: A Point-to-List Reinforcement Learning Framework for Conditional Semantic Textual Similarity
- Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning
- Sri Lanka Document Datasets: A Large-Scale, Multilingual Resource for Law, News, and Policy (v20251005)
- Self Speculative Decoding for Diffusion Large Language Models
- Teaching LLM to be Persuasive: Reward-Enhanced Policy Optimization for Alignment from Heterogeneous Rewards
- Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought
- Equipping Retrieval-Augmented Large Language Models with Document Structure Awareness
- Measuring Language Model Hallucinations Through Distributional Correctness
- Evaluation of Clinical Trials Reporting Quality using Large Language Models
- Good Intentions Beyond ACL: Who Does NLP for Social Good, and Where?
- On the Role of Unobserved Sequences on Sample-based Uncertainty Quantification for LLMs
- Mitigating Forgetting Between Supervised and Reinforcement Learning Yields Stronger Reasoners
- Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance
- What is a protest anyway? Codebook conceptualization is still a first-order concern in LLM-era classification
- CCD-Bench: Probing Cultural Conflict in Large Language Model Decision-Making
- Decoupling Task-Solving and Output Formatting in LLM Generation
- UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG
- Fine-Tuning Large Language Models with QLoRA for Offensive Language Detection in Roman Urdu-English Code-Mixed Text
- Cross-Lingual Multi-Granularity Framework for Interpretable Parkinson's Disease Diagnosis from Speech
- Prompt Balance Matters: Understanding How Imbalanced Few-Shot Learning Affects Multilingual Sense Disambiguation in LLMs
- Annotate Rhetorical Relations with INCEpTION: A Comparison with Automatic Approaches
- Read Between the Lines: A Benchmark for Uncovering Political Bias in Bangla News Articles
- PsycholexTherapy: Simulating Reasoning in Psychotherapy with Small Language Models in Persian
- Mapping Patient-Perceived Physician Traits from Nationwide Online Reviews with LLMs
- Graph-S3: Enhancing Agentic Textual Graph Retrieval with Synthetic Stepwise Supervision
- Morpheme Induction for Emergent Language
- Omni-Embed-Nemotron: A Unified Multimodal Retrieval Model for Text, Image, Audio, and Video
- Searching for the Most Human-like Emergent Language
- Sample, Align, Synthesize: Graph-Based Response Synthesis with ConGrs
- Divergence Minimization Preference Optimization for Diffusion Model Alignment
- Conformal Fields from Neural Networks
- Fast constrained sampling in pre-trained diffusion models
- Probabilistic Language-Image Pre-Training
- Don't Pay Attention, PLANT It: Pretraining Attention via Learning-to-Rank
- SEE-DPO: Self Entropy Enhanced Direct Preference Optimization
- Evolutionary Optimization of Physics-Informed Neural Networks: Evo-PINN Frontiers and Opportunities
- Sample Complexity of Linear Quadratic Regulator Without Initial Stability
- On Pruning State-Space LLMs
- Seeded Poisson Factorization: leveraging domain knowledge to fit topic models
- PRO-VPT: Distribution-Adaptive Visual Prompt Tuning via Prompt Relocation
- Benchmark Dataset for Pore-Scale CO2-Water Interaction
- MAD: A Magnitude And Direction Policy Parametrization for Stability Constrained Reinforcement Learning
- Conformalized Generative Bayesian Imaging: An Uncertainty Quantification Framework for Computational Imaging
- A Hybrid Strategy for Probabilistic Forecasting and Trading of Aggregated Wind-Solar Power: Design and Analysis in HEFTCom2024
- Do We Need All the Synthetic Data? Targeted Synthetic Image Augmentation via Diffusion Models
- UniSim: A Unified Simulator for Time-Coarsened Dynamics of Biomolecules
- RhoDARTS: Differentiable Quantum Architecture Search with Density Matrix Simulations
- Making Logic a First-Class Citizen in Network Data Generation with ML
- Sampling-aware Adversarial Attacks Against Large Language Models
- Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling
- GUIDE: Towards Scalable Advising for Research Ideas
- Owen Sampling Accelerates Contribution Estimation in Federated Learning
- RealKIE: Five Novel Datasets for Enterprise Key Information Extraction
- Data-Driven Performance Guarantees for Classical and Learned Optimizers
- On-Demand Growth of Semiconductor Heterostructures Guided by Physics-Informed Machine Learning
- How to build a consistency model: Learning flow maps via self-distillation
- A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning
- Rethinking Probabilistic Circuit Parameter Learning
- Spurious Privacy Leakage in Neural Networks
- LUMION: Fast Fault Recovery for ML Jobs Using Programmable Optical Fabrics
- Cascading Adversarial Bias from Injection to Distillation in Language Models
- Learning Semantics, Not Addresses: Runtime Neural Prefetching for Far Memory
- Reliably Detecting Model Failures in Deployment Without Labels
- Measurement-Aligned Sampling for Inverse Problem
- Curating art exhibitions using machine learning
- Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test
- Comparative Performance of Collaborative Bandit Algorithms: Effect of Sparsity and Exploration Intensity
- The Persistence of Neural Collapse Despite Low-Rank Bias
- Joint Diffusion models in Continual Learning
- Detecting LLM Hallucination Through Layer-wise Information Deficiency: Analysis of Ambiguous Prompts and Unanswerable Questions
- SyMerge: From Non-Interference to Synergistic Merging via Single-Layer Adaptation
- Chameleon2++: An Efficient and Scalable Variant Of Chameleon Clustering
- From Restless to Contextual: A Thresholding Bandit Reformulation For Finite-horizon Performance
- Mamba base PKD for efficient knowledge compression
- Exact and Linear Convergence for Federated Learning under Arbitrary Client Participation is Attainable
- TRA: Better Length Generalisation with Threshold Relative Attention
- Learning and Transferring Physical Models through Derivatives
- Taming OOD Actions for Offline Reinforcement Learning: An Advantage-Based Approach
- Learning Penalty for Optimal Partitioning via Automatic Feature Extraction
- A Case for Library-Level k-Means Binning in Histogram Gradient-Boosted Trees
- Towards Coordinate- and Dimension-Agnostic Machine Learning for Partial Differential Equations
- Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers
- Data-Driven Adaptive PID Control Based on Physics-Informed Neural Networks
- Evaluating LLMs for Demographic-Targeted Social Bias Detection: A Comprehensive Benchmark Study
- A Study on the Data Distribution Gap in Music Emotion Recognition
- Predictive economics: Rethinking economic methodology with machine learning
- Beyond the Seen: Bounded Distribution Estimation for Open-Vocabulary Learning
- Federated Learning for Surgical Vision in Appendicitis Classification: Results of the FedSurg EndoVis 2024 Challenge
- Beyond Random: Automatic Inner-loop Optimization in Dataset Distillation
- ERDE: Entropy-Regularized Distillation for Early-exit
- BenthiCat: An opti-acoustic dataset for advancing benthic classification and habitat mapping
- Comparative Analysis of YOLOv5, Faster R-CNN, SSD, and RetinaNet for Motorbike Detection in Kigali Autonomous Driving Context
- Pivotal CLTs for Pseudolikelihood via Conditional Centering in Dependent Random Fields
- Latent Uncertainty Representations for Video-based Driver Action and Intention Recognition
- A Unified Optimization Framework for Multiclass Classification with Structured Hyperplane Arrangements
- Diffusion Approximations for Thompson Sampling in the Small Gap Regime
- Generalization of LiNGAM that allows confounding
- Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays
- Federated Continual Learning Goes Online: Uncertainty-Aware Memory Management for Vision Tasks and Beyond
- Optimal Bound for PCA with Outliers using Higher-Degree Voronoi Diagrams
- Temporal Source Recovery for Time-Series Source-Free Unsupervised Domain Adaptation
- Learning-Augmented Robust Algorithmic Recourse
- Low-Dimension-to-High-Dimension Generalization And Its Implications for Length Generalization
- Towards Fast Option Pricing PDE Solvers Powered by PIELM
- Environment-Aware Indoor LoRaWAN Path Loss: Parametric Regression Comparisons, Shadow Fading, and Calibrated Fade Margins
- Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models
- Quantizer Design for Finite Model Approximations, Model Learning, and Quantized Q-Learning for MDPs with Unbounded Spaces
- TCR-EML: Explainable Model Layers for TCR-pMHC Prediction
- Time Is Effort: Estimating Human Post-Editing Time for Grammar Error Correction Tool Evaluation
- Scale-Invariant Regret Matching and Online Learning with Optimal Convergence: Bridging Theory and Practice in Zero-Sum Games
- Zeroth-Order Methods for Stochastic Nonconvex Nonsmooth Composite Optimization
- Perspectives on Stochastic Localization
- Benchmarking atmospheric circulation variability in an AI emulator, ACE2, and a hybrid model, NeuralGCM
- Deep vs. Shallow: Benchmarking Physics-Informed Neural Architectures on the Biharmonic Equation
- Quantum generative model on bicycle-sharing system and an application
- Fast Witness Persistence for MRI Volumes via Hybrid Landmarking
- Keep It on a Leash: Controllable Pseudo-label Generation Towards Realistic Long-Tailed Semi-Supervised Learning
- Multi-Modal Multi-Task Semantic Communication: A Distributed Information Bottleneck Perspective
- Exploring Chain-of-Thought Reasoning for Steerable Pluralistic Alignment
- Sharp Lower Bounds for Linearized ReLU^k Approximation on the Sphere
- Fine Tuning Methods for Low-resource Languages
- Drax: Speech Recognition with Discrete Flow Matching
- From Segments to Concepts: Interpretable Image Classification via Concept-Guided Segmentation
- A Universal Deep Learning Force Field for Molecular Dynamic Simulation and Vibrational Spectra Prediction
- Detection of retinal diseases using an accelerated reused convolutional network
- PABSA: Hybrid Framework for Persian Aspect-Based Sentiment Analysis
- Read the Scene, Not the Script: Outcome-Aware Safety for LLMs
- MICROTRIPS: MICRO-geography TRavel Intelligence and Pattern Synthesis
- Improving S&P 500 Volatility Forecasting through Regime-Switching Methods
- Multimodal Arabic Captioning with Interpretable Visual Concept Integration
- Machine Learning and Control: Foundations, Advances, and Perspectives
- DECOR: Deep Embedding Clustering with Orientation Robustness
- Assessing the impact of contact time on leachate chemistry from recycled concrete aggregates
- Is it Bigger than a Breadbox: Efficient Cardinality Estimation for Real World Workloads
- Quantum feature-map learning with reduced resource overhead
- Exploring the Hierarchical Reasoning Model for Small Natural-Image Classification Without Augmentation
- Unsupervised Transformer Pre-Training for Images: Self-Distillation, Mean Teachers, and Random Crops
- Person-Centric Annotations of LAION-400M: Auditing Bias and Its Transfer to Models
- Mapping Rio de Janeiro's favelas: general-purpose vs. satellite-specific neural networks
- Lightweight and Generalizable Acoustic Scene Representations via Contrastive Fine-Tuning and Distillation
- A Benchmark Study of Deep Learning Methods for Multi-Label Pediatric Electrocardiogram-Based Cardiovascular Disease Classification
- Road Damage and Manhole Detection using Deep Learning for Smart Cities: A Polygonal Annotation Approach
- Cellular Learning: Scattered Data Regression in High Dimensions via Voronoi Cells
- A Trustworthy Industrial Fault Diagnosis Architecture Integrating Probabilistic Models and Large Language Models
- Fair Minimum Labeling: Efficient Temporal Network Activations for Reachability and Equity
- Optimal Computation from Fluctuation Responses
- Compressed Concatenation of Small Embedding Models
- IMLP: An Energy-Efficient Continual Learning Method for Tabular Data Streams
- Counterfactual Credit Guided Bayesian Optimization
- Parameter-free Algorithms for the Stochastically Extended Adversarial Model
- ViTs: Teaching Machines to See Time Series Anomalies Like Human Experts
- Directional Sheaf Hypergraph Networks: Unifying Learning on Directed and Undirected Hypergraphs
- EVaR-Optimal Arm Identification in Bandits
- Provable Affine Identifiability of Nonlinear CCA under Latent Distributional Priors
- ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs
- MetaMP: Seamless Metadata Enrichment and AI Application Framework for Enhanced Membrane Protein Visualization and Analysis
- On the Hardness of Learning Regular Expressions
- Synthesising Counterfactual Explanations via Label-Conditional Gaussian Mixture Variational Autoencoders
- A Clinical-grade Universal Foundation Model for Intraoperative Pathology
- Flow-Matching Based Refiner for Molecular Conformer Generation
- Benchmarking M-LTSF: Frequency and Noise-Based Evaluation of Multivariate Long Time Series Forecasting Models
- DP-HYPE: Distributed Differentially Private Hyperparameter Search
- How Different from the Past? Spatio-Temporal Time Series Forecasting with Self-Supervised Deviation Learning
- Egalitarian Gradient Descent: A Simple Approach to Accelerated Grokking
- StructuralDecompose: A Modular Framework for Robust Time Series Decomposition in R
- Adaptive Memory Momentum via a Model-Based Framework for Deep Learning Optimization
- Power Transform Revisited: Numerically Stable, and Federated
- Inoculation Prompting: Instructing LLMs to misbehave at train-time improves test-time alignment
- KEEP: Integrating Medical Ontologies with Clinical Data for Robust Code Embeddings
- Modeling Student Learning with 3.8 Million Program Traces
- Boomerang Distillation Enables Zero-Shot Model Size Interpolation
- Fractional Heat Kernel for Semi-Supervised Graph Learning with Small Training Sample Size
- Forking-Sequences
- Expand Neurons, Not Parameters
- Wavelet Predictive Representations for Non-Stationary Reinforcement Learning
- Real-time Prediction of Urban Sound Propagation with Conditioned Normalizing Flows
- Post-training quantization of vision encoders needs prefixing registers
- Tail-Safe Hedging: Explainable Risk-Sensitive Reinforcement Learning with a White-Box CBF-QP Safety Layer in Arbitrage-Free Markets
- Challenger-Based Combinatorial Bandits for Subcarrier Selection in OFDM Systems
- Stochastic Approximation Methods for Distortion Risk Measure Optimization
- Improved probabilistic regression using diffusion models
- Forecasting-Based Biomedical Time-series Data Synthesis for Open Data and Robust AI
- HoRA: Cross-Head Low-Rank Adaptation with Joint Hypernetworks
- Wave-PDE Nets: Trainable Wave-Equation Layers as an Alternative to Attention
- Activation Steering with a Feedback Controller
- Crash Severity Prediction Using Deep Learning Approaches: A Hybrid CNN-RNN Framework
- FoilDiff: A Hybrid Transformer Backbone for Diffusion-based Modelling of 2D Airfoil Flow Fields
- DoRAN: Stabilizing Weight-Decomposed Low-Rank Adaptation via Noise Injection and Auxiliary Networks
- Learning to Predict Chaos: Curriculum-Driven Training for Robust Forecasting of Chaotic Dynamics
- From News to Returns: A Granger-Causal Hypergraph Transformer on the Sphere
- Quantifying Ambiguity in Categorical Annotations: A Measure and Statistical Inference Framework
- Categorical Invariants of Learning Dynamics
- Score-based Greedy Search for Structure Identification of Partially Observed Linear Causal Models
- SSM-CGM: Interpretable State-Space Forecasting Model of Continuous Glucose Monitoring for Personalized Diabetes Management
- Achieve Performatively Optimal Policy for Performative Reinforcement Learning
- Trade-off in Estimating the Number of Byzantine Clients in Federated Learning
- On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection
- What Is The Performance Ceiling of My Classifier? Utilizing Category-Wise Influence Functions for Pareto Frontier Analysis
- Optimizing Resources for On-the-Fly Label Estimation with Multiple Unknown Medical Experts
- Early-Warning of Thunderstorm-Driven Power Outages with a Two-Stage Machine Learning Model
- Beyond Softmax: A New Perspective on Gradient Bandits
- ICEPool: Enhancing Graph Pooling Networks with Inter-cluster Connectivity
- Incorporating Multivariate Consistency in ML-Based Weather Forecasting with Latent-space Constraints
- Adaptive kernel-density approach for imbalanced binary classification
- Variational Diffusion Unlearning: A Variational Inference Framework for Unlearning in Diffusion Models under Data Constraints
- Rethinking Consistent Multi-Label Classification under Inexact Supervision
- Why Cannot Neural Networks Master Extrapolation? Insights from Physical Laws
- Can Linear Probes Measure LLM Uncertainty?
- Wasserstein projection distance for fairness testing of regression models
- On the Statistical Query Complexity of Learning Semiautomata: a Random Walk Approach
- Modeling Time Series Dynamics with Fourier Ordinary Differential Equations
- Efficient Manifold-Constrained Neural ODE for High-Dimensional Datasets
- Spectral Alignment as Predictor of Loss Explosion in Neural Network Training
- Adaptive Federated Learning via Dynamical System Model
- Truncated Kernel Stochastic Gradient Descent with General Losses and Spherical Radial Basis Functions
- Influence branching for learning to solve mixed-integer programs online
- Efficient Test-Time Scaling for Small Vision-Language Models
- FieldFormer: Physics-Informed Transformers for Spatio-Temporal Field Reconstruction from Sparse Sensors
- MECKD: Deep Learning-Based Fall Detection in Multilayer Mobile Edge Computing With Knowledge Distillation
- In-Vivo Training for Deep Brain Stimulation
- SAFA-SNN: Sparsity-Aware On-Device Few-Shot Class-Incremental Learning with Fast-Adaptive Structure of Spiking Neural Network
- Optimising Battery Energy Storage System Trading via Energy Market Operator Price Forecast
- Token Hidden Reward: Steering Exploration-Exploitation in Group Relative Deep Reinforcement Learning
- Personalized federated prototype learning in mixed heterogeneous data scenarios
- Optimizing Fine-Tuning through Advanced Initialization Strategies for Low-Rank Adaptation
- Neural Low-Discrepancy Sequences
- Merge and Guide: Unifying Model Merging and Guided Decoding for Controllable Multi-Objective Generation
- Curriculum-Augmented GFlowNets For mRNA Sequence Generation
- On Using Large Language Models to Enhance Clinically-Driven Missing Data Recovery Algorithms in Electronic Health Records
- BONSAI: Structure-exploiting robust Bayesian optimization for networked black-box systems under uncertainty
- LLM as an Algorithmist: Enhancing Anomaly Detectors via Programmatic Synthesis
- THEMIS: Unlocking Pretrained Knowledge with Foundation Model Embeddings for Anomaly Detection in Time Series
- Generalized Fitted Q-Iteration with Clustered Data
- Transductive and Learning-Augmented Online Regression
- Conditional Pseudo-Supervised Contrast for Data-Free Knowledge Distillation
- Studying the Korean Word-Chain Game with RLVR: Mitigating Reward Conflicts via Curriculum Learning
- Training Variation of Physically-Informed Deep Learning Models
- Memory-Efficient Backpropagation for Fine-Tuning LLMs on Resource-Constrained Mobile Devices
- LHGEL: Large Heterogeneous Graph Ensemble Learning using Batch View Aggregation
- How to Set $\beta_1, \beta_2$ in Adam: An Online Learning Perspective
- D2 Actor Critic: Diffusion Actor Meets Distributional Critic
- Task-Level Contrastiveness for Cross-Domain Few-Shot Learning
- RAPID: An Efficient Reinforcement Learning Algorithm for Small Language Models
- CrossLag: Predicting Major Dengue Outbreaks with a Domain Knowledge Informed Transformer
- Light Differentiable Logic Gate Networks
- Data-Driven Temperature Modelling of Machine Tools by Neural Networks: A Benchmark
- Variational Autoencoders-based Detection of Extremes in Plant Productivity in an Earth System Model
- Discovering Transformer Circuits via a Hybrid Attribution and Pruning Framework
- Single-Core Superscalar Optimization of Clifford Neural Layers
- CAFL-L: Constraint-Aware Federated Learning with Lagrangian Dual Optimization for On-Device Language Models
- Revoking Amnesia: RL-based Trajectory Optimization to Resurrect Erased Concepts in Diffusion Models
- Thin Bridges for Drug Text Alignment: Lightweight Contrastive Learning for Target Specific Drug Retrieval
- Scaling Laws Revisited: Modeling the Role of Data Quality in Language Model Pretraining
- Fast frequency reconstruction using Deep Learning for event recognition in ring laser data
- Constant in an Ever-Changing World
- Semantic-Aware Scheduling for GPU Clusters with Large Language Models
- Matching the Optimal Denoiser in Point Cloud Diffusion with (Improved) Rotational Alignment
- High Cycle S-N curve prediction for Al 7075-T6 alloy using Recurrent Neural Networks (RNNs)
- Psychometric Item Validation Using Virtual Respondents with Trait-Response Mediators
- MapIQ: Evaluating Multimodal Large Language Models for Map Question Answering
- Do Vision-Language Models See Urban Scenes as People Do? An Urban Perception Benchmark
- Computing Exact Shapley Values in Polynomial Time for Product-Kernel Methods
- From Compression to Expression: A Layerwise Analysis of In-Context Learning
- ViP$^2$-CLIP: Visual-Perception Prompting with Unified Alignment for Zero-Shot Anomaly Detection
- COUNTDOWN: Contextually Sparse Activation Filtering Out Unnecessary Weights in Down Projection
- AutoMiSeg: Automatic Medical Image Segmentation via Test-Time Adaptation of Foundation Models
- How Many Parameters Does Your Task Really Need? Task Specific Pruning with LLM-Sieve
- From Word to World: Evaluate and Mitigate Culture Bias in LLMs via Word Association Test
- Robust Stability Analysis of Positive Lure System with Neural Network Feedback
- Behavior Injection: Preparing Language Models for Reinforcement Learning
- MoESD: Unveil Speculative Decoding's Potential for Accelerating Sparse MoE
- VibE-SVC: Vibrato Extraction with High-frequency F0 Contour for Singing Voice Conversion
- What Has Been Lost with Synthetic Evaluation?
- Local Stability and Region of Attraction Analysis for Neural Network Feedback Systems under Positivity Constraints
- RFCAudit: An LLM Agent for Functional Bug Detection in Network Protocols
- In-Context Learning for Pure Exploration
- MedAgentGym: A Scalable Agentic Training Environment for Code-Centric Reasoning in Biomedical Data Science
- SSA-COMET: Do LLMs Outperform Learned Metrics in Evaluating MT for Under-Resourced African Languages?
- Micro-Act: Mitigating Knowledge Conflict in LLM-based RAG via Actionable Self-Reasoning
- PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation
- Refactoring Codebases through Library Design
- Using cognitive models to reveal value trade-offs in language models
- Self-Correction Bench: Uncovering and Addressing the Self-Correction Blind Spot in Large Language Models
- Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards
- Who's the Mole? Modeling and Detecting Intention-Hiding Malicious Agents in LLM-Based Multi-Agent Systems
- Empowering Healthcare Practitioners with Language Models: Structuring Speech Transcripts in Two Real-World Clinical Applications
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
- VIPO: Value Function Inconsistency Penalized Offline Reinforcement Learning
- Recursive Deep Inverse Reinforcement Learning
- New Recipe for Semi-supervised Community Detection: Clique Annealing under Crystallization Kinetics
- From Dialect Gaps to Identity Maps: Tackling Variability in Speaker Verification
- J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
- Safety Subspaces are Not Linearly Distinct: A Fine-Tuning Case Study
- Self-GIVE: Associative Thinking from Limited Structured Knowledge for Enhanced Large Language Model Reasoning
- Constructing a 3D Scene from a Single Image
- Circle-RoPE: Cone-like Decoupled Rotary Positional Embedding for Large Vision-Language Models
- ZSMerge: Zero-Shot KV Cache Compression for Memory-Efficient Long-Context LLMs
- Periodontal Bone Loss Analysis via Keypoint Detection With Heuristic Post-Processing
- Mapping the Trust Terrain: LLMs in Software Engineering -- Insights and Perspectives
- Causality-Based Scores Alignment in Explainable Data Management
- Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation
- MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse
- Scaling Laws of Synthetic Data for Language Models
- Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models
- Understanding R1-Zero-Like Training: A Critical Perspective
- Large EEG-U-Transformer for Time-Step Level Detection Without Pre-Training
- Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task
- Speculative Automated Refactoring of Imperative Deep Learning Programs to Graph Execution
- On the Effectiveness and Generalization of Race Representations for Debiasing High-Stakes Decisions
- Unlocking In-Context Learning for Natural Datasets Beyond Language Modelling
- Longitudinal Abuse and Sentiment Analysis of Hollywood Movie Dialogues using Language Models
- Characterizing Mobile SoC for Accelerating Heterogeneous LLM Inference
- AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science
- Controllable Video Generation with Provable Disentanglement
- Taming Latency-Memory Trade-Off in MoE-Based LLM Serving via Fine-Grained Expert Offloading
- QuIC: Quantum-Inspired Compound Adapters for Parameter Efficient Fine-Tuning
- TANTE: Time-Adaptive Operator Learning via Neural Taylor Expansion
- Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning
- DISC: Dynamic Decomposition Improves LLM Inference Scaling
- Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs
- League: Leaderboard Generation on Demand
- SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking
- Evolutionary Guided Decoding: Iterative Value Refinement for LLMs
- TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models
- Can GPT models Follow Human Summarization Guidelines? A Study for Targeted Communication Goals
- Robust MRI Reconstruction by Smoothed Unrolling (SMUG)
- Tabular Data: Is Deep Learning all you need?
- Synthesizing Sentiment-Controlled Feedback For Multimodal Text and Image Data
- When "Competency" in Reasoning Opens the Door to Vulnerability: Jailbreaking LLMs via Novel Complex Ciphers
- PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization
- Unified ODE Analysis of Smooth Q-Learning Algorithms
- Characteristic Learning for Provable One Step Generation
- Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models
- Deep Learning without Weight Symmetry
- Demystifying Higher-Order Graph Neural Networks
- How Far are AI-generated Videos from Simulating the 3D Visual World: A Learned 3D Evaluation Approach
- ClinicRealm: Re-evaluating Large Language Models with Conventional Machine Learning for Non-Generative Clinical Prediction Tasks
- Elastic On-Device LLM Service
- Advanced Clustering Techniques for Speech Signal Enhancement: A Review and Metanalysis of Fuzzy C-Means, K-Means, and Kernel Fuzzy C-Means Methods
- The Hive Mind is a Single Reinforcement Learning Agent
- H3Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs
- MALT: Improving Reasoning with Multi-Agent LLM Training
- STIV: Scalable Text and Image Conditioned Video Generation
- Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions
- ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering
- Machine Learning as Iterated Belief Change a la Darwiche and Pearl
- FloorplanQA: A Benchmark for Spatial Reasoning in LLMs using Structured Representations
- Cross-Modal Distillation For Widely Differing Modalities
- Autonomous Data Agents: A New Opportunity for Smart Data
- Algorithmic pricing with independent learners and relative experience replay
- Tutorial on amortized optimization
- Sum-of-Parts: Self-Attributing Neural Networks with End-to-End Learning of Feature Groups
- CHARME: A chain-based reinforcement learning approach for the minor embedding problem
- Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment
- Optimizing Agricultural Order Fulfillment Systems: A Hybrid Tree Search Approach
- Efficiently Learning Probabilistic Logical Models by Cheaply Ranking Mined Rules
- MAD-Sherlock: Multi-Agent Debate for Visual Misinformation Detection
- Data clustering: a fundamental method in data science and management
- Neural Deconstruction Search for Vehicle Routing Problems
- Energy-Conscious LLM Decoding: Impact of Text Generation Strategies on GPU Energy Consumption
- TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios
- RECAST: Expanding the Boundaries of LLMs' Complex Instruction Following with Multi-Constraint Data
- PatentMind: A Multi-Aspect Reasoning Graph for Patent Similarity Evaluation
- Contextual Integrity in LLMs via Reasoning and Reinforcement Learning
- Beyond RLHF and NLHF: Population-Proportional Alignment under an Axiomatic Framework
- Resource-Efficient Fine-Tuning of LLaMA-3.2-3B for Medical Chain-of-Thought Reasoning
- Large Language Models Achieve Gold Medal Performance at International Astronomy & Astrophysics Olympiad
- Graph-Aware Diffusion for Signal Generation
- Test-Time Scaling in Diffusion LLMs via Hidden Semi-Autoregressive Experts
- HybridFlow: Quantification of Aleatoric and Epistemic Uncertainty with a Single Hybrid Model
- SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs
- SLM-MUX: Orchestrating Small Language Models for Reasoning
- TeachLM: Post-Training LLMs for Education Using Authentic Learning Data
- Finish First, Perfect Later: Test-Time Token-Level Cross-Validation for Diffusion Large Language Models
- Learning to Interpret Weight Differences in Language Models
- From Noisy Traces to Stable Gradients: Bias-Variance Optimized Preference Optimization for Aligning Large Reasoning Models
- Paper2Video: Automatic Video Generation from Scientific Papers
- AgentBench: Evaluating LLMs as Agents
- Graph Generation Powered with LLMs for Boosting Multivariate Time-Series Representation Learning
- Glocal Information Bottleneck for Time Series Imputation
- Do LLMs Align with My Task? Evaluating Text-to-SQL via Dataset Alignment
- REN: Anatomically-Informed Mixture-of-Experts for Interstitial Lung Disease Diagnosis
- Federated Self-Supervised Learning for Automatic Modulation Classification under Non-IID and Class-Imbalanced Data
- The Geometry of Truth: Layer-wise Semantic Dynamics for Hallucination Detection in Large Language Models
- AURA Score: A Metric For Holistic Audio Question Answering Evaluation
- ONNX-Net: Towards Universal Representations and Instant Performance Prediction for Neural Architectures
- Unsupervised Active Learning via Natural Feature Progressive Framework
- A First Context-Free Grammar Applied to Nawatl Corpora Augmentation
- Bidirectional Mammogram View Translation with Column-Aware and Implicit 3D Conditional Diffusion
- Mind Your Tone: Investigating How Prompt Politeness Affects LLM Accuracy (short paper)
- Feasibility-Aware Decision-Focused Learning for Predicting Parameters in the Constraints
- MuFFIN: Multifaceted Pronunciation Feedback Model with Interactive Hierarchical Neural Modeling
- ActiveMark: on watermarking of visual foundation models via massive activations
- AWARE, Beyond Sentence Boundaries: A Contextual Transformer Framework for Identifying Cultural Capital in STEM Narratives
- AutoEmpirical: LLM-Based Automated Research for Empirical Software Fault Analysis
- On Predicting Post-Click Conversion Rate via Counterfactual Inference
- Bond-Centered Molecular Fingerprint Derivatives: A BBBP Dataset Study
- Distributionally Robust Causal Abstractions
- Detecting Distillation Data from Reasoning Models
- FreshBrew: A Benchmark for Evaluating AI Agents on Java Code Migration
- Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails
- Model Predictive Control-Guided Reinforcement Learning for Implicit Balancing
- Less is More: Recursive Reasoning with Tiny Networks
- Revealing Interconnections between Diseases: from Statistical Methods to Large Language Models
- SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests
- Focused Skill Discovery: Learning to Control Specific State Variables while Minimizing Side Effects
- Multilingual Routing in Mixture-of-Experts
- The Bayesian Origin of the Probability Weighting Function in Human Representation of Probabilities
- AtomWorld: A Benchmark for Evaluating Spatial Reasoning in Large Language Models on Crystalline Materials
- Curved Boolean Logic: A Contextual Generalization of Propositional Logic with Algorithmic Consequences
- Speak, Edit, Repeat: High-Fidelity Voice Editing and Zero-Shot TTS with Cross-Attentive Mamba
- A New Digital Divide? Coder Worldviews, the Slop Economy, and Democracy in the Age of AI
- Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction
- Agile Software Effort Estimation using Regression Techniques
- Distribution Preference Optimization: A Fine-grained Perspective for LLM Unlearning
- Learning on the Job: Test-Time Curricula for Targeted Reinforcement Learning
- Trade in Minutes! Rationality-Driven Agentic System for Quantitative Financial Trading
- DiT-VTON: Diffusion Transformer Framework for Unified Multi-Category Virtual Try-On and Virtual Try-All with Integrated Image Editing
- Did you just see that? Arbitrary view synthesis for egocentric replay of operating room workflows from ambient sensors
- Toward a Unified Geometry Understanding: Riemannian Diffusion Framework for Graph Generation and Prediction
- GILT: An LLM-Free, Tuning-Free Graph Foundational Model for In-Context Learning
- LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning
- Deep learning framework for predicting stochastic take-off and die-out of early spreading
- A Case for Declarative LLM-friendly Interfaces for Improved Efficiency of Computer-Use Agents
- Accountability Capture: How Record-Keeping to Support AI Transparency and Accountability (Re)shapes Algorithmic Oversight
- Design Process of a Self Adaptive Smart Serious Games Ecosystem
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
- Fairness in Repeated Matching: A Maximin Perspective
- SFANet: Spatial-Frequency Attention Network for Deepfake Detection
- Predictive Feature Caching for Training-free Acceleration of Molecular Geometry Generation
- Noise or Signal? Deconstructing Contradictions and An Adaptive Remedy for Reversible Normalization in Time Series Forecasting
- FocusMed: A Large Language Model-based Framework for Enhancing Medical Question Summarization with Focus Identification
- Semantic Channel Equalization Strategies for Deep Joint Source-Channel Coding
- TiTok: Transfer Token-level Knowledge via Contrastive Excess to Transplant LoRA
- How does the optimizer implicitly bias the model merging loss landscape?
- MacroBench: A Novel Testbed for Web Automation Scripts via Large Language Models
- NegotiationGym: Self-Optimizing Agents in a Multi-Agent Social Simulation Environment
- GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks
- Adaptive Weighted Loss for Sequential Recommendations on Sparse Domains
- MorphoSim: An Interactive, Controllable, and Editable Language-guided 4D World Simulator
- Improving Consistency in Retrieval-Augmented Systems with Group Similarity Rewards
- Large Language Models Preserve Semantic Isotopies in Story Continuations
- Your Vision-Language Model Can't Even Count to 20: Exposing the Failures of VLMs in Compositional Counting
- Partial Information Decomposition via Normalizing Flows in Latent Gaussian Distributions
- SPEGNet: Synergistic Perception-Guided Network for Camouflaged Object Detection
- Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space
- MedCLM: Learning to Localize and Reason via a CoT-Curriculum in Medical Vision-Language Models
- Psychological Steering in LLMs: An Evaluation of Effectiveness and Trustworthiness
- GenQuest: An LLM-based Text Adventure Game for Language Learners
- Physics-Inspired All-Pair Interaction Learning for 3D Dynamics Modeling
- Diffusion-Assisted Distillation for Self-Supervised Graph Representation Learning with MLPs
- Concept-Based Masking: A Patch-Agnostic Defense Against Adversarial Patch Attacks
- Efficient Latent Variable Causal Discovery: Combining Score Search and Targeted Testing
- LongTail-Swap: benchmarking language models' abilities on rare words
- SliceMoE: Routing Embedding Slices Instead of Tokens for Fine-Grained and Balanced Transformer Scaling
- Audit the Whisper: Detecting Steganographic Collusion in Multi-Agent LLMs
- FairAgent: Democratizing Fairness-Aware Machine Learning with LLM-Powered Agents
- Pitch-Conditioned Instrument Sound Synthesis From an Interactive Timbre Latent Space
- Inoculation Prompting: Eliciting traits from LLMs during training can suppress them at test-time
- Critical appraisal of artificial intelligence for rare-event recognition: principles and pharmacovigilance case studies
- Challenge on Optimization of Context Collection for Code Completion
- GA4GC: Greener Agent for Greener Code via Multi-Objective Configuration Optimization
- Learning from All: Concept Alignment for Autonomous Distillation from Multiple Drifting MLLMs
- Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models
- Multi Language Models for On-the-Fly Syntax Highlighting
- Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization
- A Complement to Neural Networks for Anisotropic Inelasticity at Finite Strains
- Finite Time Analysis of Constrained Natural Critic-Actor Algorithm with Improved Sample Complexity
- Cooperative Flexibility Exchange: Fair and Comfort-Aware Decentralized Resource Allocation
- World-To-Image: Grounding Text-to-Image Generation with Agent-Driven World Knowledge
- CALM Before the STORM: Unlocking Native Reasoning for Optimization Modeling
- PolyKAN: A Polyhedral Analysis Framework for Provable and Minimal KAN Compression
- Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention
- MLLMEraser: Achieving Test-Time Unlearning in Multimodal Large Language Models through Activation Steering
- MASC: Boosting Autoregressive Image Generation with a Manifold-Aligned Semantic Clustering
- Zoom-In to Sort AI-Generated Images Out
- GUI-Spotlight: Adaptive Iterative Focus Refinement for Enhanced GUI Visual Grounding
- Quantization Range Estimation for Convolutional Neural Networks
- MetaFind: Scene-Aware 3D Asset Retrieval for Coherent Metaverse Scene Generation
- What Scales in Cross-Entropy Scaling Law?
- A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling
- Using predefined vector systems as latent space configuration for neural network supervised training on data with arbitrarily large number of classes
- Efficient Training of Spiking Neural Networks by Spike-aware Data Pruning
- TOPO-Bench: An Open-Source Topological Mapping Evaluation Framework with Quantifiable Perceptual Aliasing
- Unveiling LLMs' Metaphorical Understanding: Exploring Conceptual Irrelevance, Context Leveraging and Syntactic Influence
- Attending on Multilevel Structure of Proteins enables Accurate Prediction of Cold-Start Drug-Target Interactions
- On the Limitations and Capabilities of Position Embeddings for Length Generalization
- PhaseFormer: From Patches to Phases for Efficient and Effective Time Series Forecasting
- Towards Carbon-Aware Container Orchestration: Predicting Workload Energy Consumption with Federated Learning
- What Can You Do When You Have Zero Rewards During RL?
- Distilling Reasoning into Student LLMs: Local Naturalness for Selecting Teacher Data
- A Mathematical Explanation of Transformers for Large Language Models and GPTs
- Named Entity Recognition in COVID-19 tweets with Entity Knowledge Augmentation
- Replacing Softmax Similarity with a Sharpened Angular Similarity: Theory and Practice of Scaling To Billion-Context Attention
- Thai Semantic End-of-Turn Detection for Real-Time Voice Agents
- Principled and Tractable RL for Reasoning with Diffusion Language Models
- Spatiotemporal Forecasting as Planning: A Model-Based Reinforcement Learning Approach with Generative World Models
- The Debate on RLVR Reasoning Capability Boundary: Shrinkage, Expansion, or Both? A Two-Stage Dynamic View
- Does Using Counterfactual Help LLMs Explain Textual Importance in Classification?
- Small Language Models for Emergency Departments Decision Support: A Benchmark Study
- Prompt-to-Prompt: Text-Based Image Editing Via Cross-Attention Mechanisms -- The Research of Hyperparameters and Novel Mechanisms to Enhance Existing Frameworks
- Unlocking Reasoning Capabilities in LLMs via Reinforcement Learning Exploration
- AI Adoption Across Mission-Driven Organizations
- PoseGaze-AHP: A Knowledge-Based 3D Dataset for AI-Driven Ocular and Postural Diagnosis
- Multi-Modal Oral Cancer Detection Using Weighted Ensemble Convolutional Neural Networks
- Adversarial Agent Collaboration for C to Rust Translation
- Refactoring with LLMs: Bridging Human Expertise and Machine Understanding
- On the Convergence and Size Transferability of Continuous-depth Graph Neural Networks
- LLM Chemistry Estimation for Multi-LLM Recommendation
- Strategy Logic, Imperfect Information, and Hyperproperties
- SPEAR: Soft Prompt Enhanced Anomaly Recognition for Time Series Data
- HydroFusion-LMF: Semi-Supervised Multi-Network Fusion with Large-Model Adaptation for Long-Term Daily Runoff Forecasting
- TreePrompt: Leveraging Hierarchical Few-Shot Example Selection for Improved English-Persian and English-German Translation
- Code4MeV2: a Research-oriented Code-completion Platform
- EvoEngineer: Mastering Automated CUDA Kernel Code Evolution with Large Language Models
- Adaptively Sampling-Reusing-Mixing Decomposed Gradients to Speed Up Sharpness Aware Minimization
- Rezwan: Leveraging Large Language Models for Comprehensive Hadith Text Processing: A 1.2M Corpus Development
- Lightweight and Data-Efficient Multivariate Time Series Forecasting using Residual-Stacked Gaussian (RS-GLinear) Architecture
- Mechanistic Interpretability of Socio-Political Frames in Language Models
- Beyond Token Length: Step Pruner for Efficient and Accurate Reasoning in Large Language Models
- 6G-Enabled Digital Twin Framework for Real-Time Cyber-Physical Systems: An Experimental Validation with Industrial Bearing Fault Detection
- ReTiDe: Real-Time Denoising for Energy-Efficient Motion Picture Processing with FPGAs
- Detecting Invariant Manifolds in ReLU-Based RNNs
- A4FN: an Agentic AI Architecture for Autonomous Flying Networks
- AI-Assisted Pleural Effusion Volume Estimation from Contrast-Enhanced CT Images
- Designing Empirical Studies on LLM-Based Code Generation: Towards a Reference Framework
- Can an LLM Induce a Graph? Investigating Memory Drift and Context Length
- Predicting Stock Price Movement with LLM-Enhanced Tweet Emotion Analysis
- Towards Unsupervised Speech Recognition at the Syllable-Level
- LLM-Guided Evolutionary Program Synthesis for Quasi-Monte Carlo Design
- Operationalizing Data Minimization for Privacy-Preserving LLM Prompting
- MonitorVLM: A Vision Language Framework for Safety Violation Detection in Mining Operations
- MedReflect: Teaching Medical LLMs to Self-Improve via Reflective Correction
- REG: A Regularization Optimizer for Robust Training Dynamics
- Dissecting Larval Zebrafish Hunting using Deep Reinforcement Learning Trained RNN Agents
- Referring Expression Comprehension for Small Objects
- Artery-Vein Segmentation from Fundus Images using Deep Learning
- TS-Reasoner: Aligning Time Series Foundation Models with LLM Reasoning
- Certifiable Safe RLHF: Fixed-Penalty Constraint Optimization for Safer Language Models
- Identifying Financial Risk Information Using RAG with a Contrastive Insight
- TriMediQ: A Triplet-Structured Approach for Interactive Medical Question Answering
- Unmasking Puppeteers: Leveraging Biometric Leakage to Disarm Impersonation in AI-based Videoconferencing
- GAS-MIL: Group-Aggregative Selection Multi-Instance Learning for Ensemble of Foundation Models in Digital Pathology Image Analysis
- Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models
- Evaluating OCR performance on food packaging labels in South Africa
- Generalization of Graph Neural Network Models for Distribution Grid Fault Detection
- Deep learning the sources of MJO predictability: a spectral view of learned features
- A Hybrid Co-Finetuning Approach for Visual Bug Detection in Video Games
- Deep Domain Adaptation for Turbofan Engine Remaining Useful Life Prediction: Methodologies, Evaluation and Future Trends
- Report of the 2025 Workshop on Next-Generation Ecosystems for Scientific Computing: Harnessing Community, Software, and AI for Cross-Disciplinary Team Science
- PLSemanticsBench: Large Language Models As Programming Language Interpreters
- Generalized Orders of Magnitude for Scalable, Parallel, High-Dynamic-Range Computation
- Application of a Virtual Imaging Framework for Investigating a Deep Learning-Based Reconstruction Method for 3D Quantitative Photoacoustic Computed Tomography
- Scalable Ground Station Selection for Large LEO Constellations
- Spatial-ViLT: Enhancing Visual Spatial Reasoning through Multi-Task Learning
- The Argument is the Explanation: Structured Argumentation for Trust in Agents
- ALMAS: an Autonomous LLM-based Multi-Agent Software Engineering Framework
- DuPLUS: Dual-Prompt Vision-Language Framework for Universal Medical Image Segmentation and Prognosis
- Reasoning-based Anomaly Detection Framework: A Real-time, Scalable, and Automated Approach to Anomaly Detection Across Domains
- SEER: The Span-based Emotion Evidence Retrieval Benchmark
- AgentHub: A Research Agenda for Agent Sharing Infrastructure
- ALHD: A Large-Scale and Multigenre Benchmark Dataset for Arabic LLM-Generated Text Detection
- Platonic Transformers: A Solid Choice For Equivariance
- Red Lines and Grey Zones in the Fog of War: Benchmarking Legal Risk, Moral Harm, and Regional Bias in Large Language Model Military Decision-Making
- Provenance Networks: End-to-End Exemplar-Based Explainability
- Unified Unsupervised Anomaly Detection via Matching Cost Filtering
- Diffusion-Based, Data-Assimilation-Enabled Super-Resolution of Hub-height Winds
- Disentangling Recall and Reasoning in Transformer Models through Layer-wise Attention and Activation Analysis
- An Adaptive Responsible AI Governance Framework for Decentralized Organizations
- TriQuest: An AI Copilot-Powered Platform for Interdisciplinary Curriculum Design
- InstructPLM-mu: 1-Hour Fine-Tuning of ESM2 Beats ESM3 in Protein Mutation Predictions
- Distributed Low-Communication Training with Decoupled Momentum Optimization
- Real-time nonlinear inversion of magnetic resonance elastography with operator learning
- Lightweight Prompt Engineering for Cognitive Alignment in Educational AI: A OneClickQuiz Case Study
- Can an AI-Powered Presentation Platform Based On The Game "Just a Minute" Be Used To Improve Students' Public Speaking Skills?
- A Robust Clustered Federated Learning Approach for Non-IID Data with Quantity Skew
- Cross-Modal Reconstruction Pretraining for Ramp Flow Prediction at Highway Interchanges
- Implicit Values Embedded in How Humans and LLMs Complete Subjective Everyday Tasks
- Linguistic and Audio Embedding-Based Machine Learning for Alzheimer's Dementia and Mild Cognitive Impairment Detection: Insights from the PROCESS Challenge
- Pool Me Wisely: On the Effect of Pooling in Transformer-Based Models
- Learning Pareto-Optimal Pandemic Intervention Policies with MORL
- Defining a Strategic Action Plan for AI in Higher Education
- Pilot selection in the era of Virtual reality: algorithms for accurate and interpretable machine learning models
- KVComm: Enabling Efficient LLM Communication through Selective KV Sharing
- AgentCaster: Reasoning-Guided Tornado Forecasting
- Interpretable Neuropsychiatric Diagnosis via Concept-Guided Graph Neural Networks
- Inference-Time Search using Side Information for Diffusion-based Image Reconstruction
- Understanding Transformers for Time Series: Rank Structure, Flow-of-ranks, and Compressibility
- Physics-informed Neural-operator Predictive Control for Drag Reduction in Turbulent Flows
- Why mask diffusion does not work
- UniPruning: Unifying Local Metric and Global Feedback for Scalable Sparse LLMs
- From Score Distributions to Balance: Plug-and-Play Mixture-of-Experts Routing
- Convolutional Neural Nets vs Vision Transformers: A SpaceNet Case Study with Balanced vs Imbalanced Regimes
- Dynamic Meta-Learning for Adaptive XGBoost-Neural Ensembles
- Atlas-free Brain Network Transformer
- Predicting Effects, Missing Distributions: Evaluating LLMs as Human Behavior Simulators in Operations Management
- A Comprehensive Review on Artificial Intelligence Empowered Solutions for Enhancing Pedestrian and Cyclist Safety
- Decomposing Attention To Find Context-Sensitive Neurons
- The View From Space: Navigating Instrumentation Differences with EOFMs
- Photorealistic Inpainting for Perturbation-based Explanations in Ecological Monitoring
- NS-Pep: De novo Peptide Design with Non-Standard Amino Acids
- Intelligent Healthcare Ecosystems: Optimizing the Iron Triangle of Healthcare (Access, Cost, Quality)
- Decision Potential Surface: A Theoretical and Practical Approximation of LLM's Decision Boundary
- PDE-Transformer: A Continuous Dynamical Systems Approach to Sequence Modeling
- Learning without Global Backpropagation via Synergistic Information Distillation
- Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
- SDQ-LLM: Sigma-Delta Quantization for 1-bit LLMs of any size
- QuadEnhancer: Leveraging Quadratic Transformations to Enhance Deep Neural Networks
- Quantifying constraint hierarchies in Bayesian PINNs via per-constraint Hessian decomposition
- MemMamba: Rethinking Memory Patterns in State Space Model
- Training Optimal Large Diffusion Language Models
- MACE: A Hybrid LLM Serving System with Colocated SLO-aware Continuous Retraining Alignment
- Edge-FIT: Federated Instruction Tuning of Quantized LLMs for Privacy-Preserving Smart Home Environments
- A Biologically Interpretable Cognitive Architecture for Online Structuring of Episodic Memories into Cognitive Maps
- LogAction: Consistent Cross-system Anomaly Detection through Logs via Active Domain
- Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
- Semantic-Inductive Attribute Selection for Zero-Shot Learning
- Rethinking Inter-LoRA Orthogonality in Adapter Merging: Insights from Orthogonal Monte Carlo Dropout
- Memory Self-Regeneration: Uncovering Hidden Knowledge in Unlearned Models
- Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data
- MindCraft: How Concept Trees Take Shape In Deep Models
- PT$^2$-LLM: Post-Training Ternarization for Large Language Models
- Decrypt Modality Gap in Multimodal Contrastive Learning: From Convergent Representation to Pair Alignment
- General Exploratory Bonus for Optimistic Exploration in RLHF
- CoDA: Coding LM via Diffusion Adaptation
- ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization
- PARS: Low-Latency LLM Serving via Pairwise Learning-to-Rank
- VIFO: Visual Feature Empowered Multivariate Time Series Forecasting with Cross-Modal Fusion
- Frequency-Aware Model Parameter Explorer: A new attribution method for improving explainability
- StructPrune: Structured Global Pruning asymptotics with $\mathcal{O}(\sqrt{N})$ GPU Memory
- Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data
- Real-Time Brain Biomechanics Prediction with Neural Operators: Toward Clinically Deployable Traumatic Brain Injury Models
- Numerion: A Multi-Hypercomplex Model for Time Series Forecasting
- Universal Multi-Domain Translation via Diffusion Routers
- Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents
- SciTS: Scientific Time Series Understanding and Generation with LLMs
- Triple-BERT: Do We Really Need MARL for Order Dispatch on Ride-Sharing Platforms?
- POEM: Explore Unexplored Reliable Samples to Enhance Test-Time Adaptation
- MARS: Optimizing Dual-System Deep Research via Multi-Agent Reinforcement Learning
- Safe and Compliant Cross-Market Trade Execution via Constrained RL and Zero-Knowledge Audits
- Aligning Perception, Reasoning, Modeling and Interaction: A Survey on Physical AI
- LLM-Hanabi: Evaluating Multi-Agent Gameplays with Theory-of-Mind and Rationale Inference in Imperfect Information Collaboration Game
- Think Then Embed: Generative Context Improves Multimodal Embedding
- Look-ahead Reasoning with a Learned Model in Imperfect Information Games
- Staircase Streaming for Low-Latency Multi-Agent Inference
- A Modular Conditional Diffusion Framework for Image Reconstruction
- Perfect AI Mimicry and the Epistemology of Consciousness: A Solipsistic Dilemma
- Making Mathematical Reasoning Adaptive
- MedPAO: A Protocol-Driven Agent for Structuring Medical Reports
- QuantAgents: Towards Multi-agent Financial System via Simulated Trading
- Improving Multimodal Brain Encoding Model with Dynamic Subject-awareness Routing
- Watch and Learn: Learning to Use Computers from Online Videos
- Beyond Outcome Reward: Decoupling Search and Answering Improves LLM Agents
- BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
- LMM-Incentive: Large Multimodal Model-based Incentive Design for User-Generated Content in Web 3.0
- Hybrid-Balance GFlowNet for Solving Vehicle Routing Problems
- Natural Language Edge Labelling: Decoupling Intent from Execution in Structured LM Reasoning
- LEGOMem: Modular Procedural Memory for Multi-agent LLM Systems for Workflow Automation
- Video Game Level Design as a Multi-Agent Reinforcement Learning Problem
- Where Did It All Go Wrong? A Hierarchical Look into Multi-Agent Error Attribution
- Human Behavior Atlas: Benchmarking Unified Psychological and Social Behavior Understanding
- Utility-Learning Tension in Self-Modifying Agents
- DRPO: Efficient Reasoning via Decoupled Reward Policy Optimization
- On Continuous Optimization for Constraint Satisfaction Problems
- Multi-Agent Collaborative Intelligence: Dual-Dial Control for Reliable LLM Reasoning
- Impatient Users Confuse AI Agents: High-fidelity Simulations of Human Traits for Testing Agents
- ChartAgent: A Multimodal Agent for Visually Grounded Reasoning in Complex Chart Question Answering
- Aria: An Agent For Retrieval and Iterative Auto-Formalization via Dependency Graph
- Code World Models for General Game Playing
- TRAJECT-Bench: A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use
- ContextNav: Towards Agentic Multimodal In-Context Learning
- COSMIR: Chain Orchestrated Structured Memory for Iterative Reasoning over Long Context
- Strongly Solving 2048 4x3
- Searching Meta Reasoning Skeleton to Guide LLM Reasoning
- Internal states before wait modulate reasoning patterns
- Selective Expert Guidance for Effective and Diverse Exploration in Reinforcement Learning of LLMs
- The Artificial Intelligence Cognitive Examination: A Survey on the Evolution of Multimodal Evaluation from Recognition to Reasoning
- Open Agent Specification (Agent Spec) Technical Report
- Constructing coherent spatial memory in LLM agents through graph rectification
- COSMO-RL: Towards Trustworthy LMRMs via Joint Safety and Stability
- AgentRL: Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework
- Closing the Loop: Coordinating Inventory and Recommendation via Deep Reinforcement Learning on Multiple Timescales
- GROK: From Quantitative Biomarkers to Qualitative Diagnosis via a Grounded MLLM with Knowledge-Guided Instruction
- Doctor-R1: Mastering Clinical Inquiry with Experiential Agentic Reinforcement Learning
- On the Importance of Task Complexity in Evaluating LLM-Based Multi-Agent Systems
- Speculative Actions: A Lossless Framework for Faster Agentic Systems
- Just-in-time Episodic Feedback Hinter: Leveraging Offline Knowledge to Improve LLM Agents Adaptation
- LLM Based Bayesian Optimization for Prompt Search
- Internal World Models as Imagination Networks in Cognitive Agents
- Algorithm Generation via Creative Ideation
- Adaptive and Explainable AI Agents for Anomaly Detection in Critical IoT Infrastructure using LLM-Enhanced Contextual Reasoning
- Rare Text Semantics Were Always There in Your Diffusion Transformer
- Kantian-Utilitarian XAI: Meta-Explained
- What Shapes a Creative Machine Mind? Comprehensively Benchmarking Creativity in Foundation Models
- Zephyrus: An Agentic Framework for Weather Science
- LLM-Based Data Science Agents: A Survey of Capabilities, Challenges, and Future Directions
- A global log for medical AI
- FaithCoT-Bench: Benchmarking Instance-Level Faithfulness of Chain-of-Thought Reasoning
- Increasing LLM response trustworthiness using voting ensembles
- Toward a unified framework for data-efficient evaluation of large language models
- Decoding Emotion in the Deep: A Systematic Study of How LLMs Represent, Retain, and Express Emotion
- Moral Anchor System: A Predictive Framework for AI Value Alignment and Drift Prevention
- SPOGW: a Score-based Preference Optimization method via Group-Wise comparison for workflows
- Harnessing LLM for Noise-Robust Cognitive Diagnosis in Web-Based Intelligent Education Systems
- WebRenderBench: Enhancing Web Interface Generation through Layout-Style Consistency and Reinforcement Learning
- Know Thyself? On the Incapability and Implications of AI Self-Recognition
- ContraGen: A Multi-Agent Generation Framework for Enterprise Contradictions Detection
- A Qualitative Comparative Evaluation of Cognitive and Generative Theories
- Bridging LLM Planning Agents and Formal Methods: A Case Study in Plan Verification
- Towards Policy-Compliant Agents: Learning Efficient Guardrails For Policy Violation Detection
- OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows
- MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information
- Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs
- Mind the Goal: Data-Efficient Goal-Oriented Evaluation of Conversational Agents and Chatbots using Teacher Models
- H-DDx: A Hierarchical Evaluation Framework for Differential Diagnosis
- Bridging the Gap Between Multimodal Foundation Models and World Models
- OptAgent: Optimizing Query Rewriting for E-commerce via Multi-Agent Simulation
- GuidedSampling: Steering LLMs Towards Diverse Candidate Solutions at Inference-Time
- Small Language Models for Agentic Systems: A Survey of Architectures, Capabilities, and Deployment Trade offs
- Refined Iterated Pareto Greedy for Energy-aware Hybrid Flowshop Scheduling with Blocking Constraints
- A General Quantum Duality for Representations of Groups with Applications to Quantum Money, Lightning, and Fire
- Cooperative Decentralized Backdoor Attacks on Vertical Federated Learning
- What Lurks Within? Concept Auditing for Shared Diffusion Models at Scale
- Rethinking Exact Unlearning under Exposure: Extracting Forgotten Data under Exact Unlearning in Large Language Model
- Can We Infer Confidential Properties of Training Data from LLMs?
- LexiMark: Robust Watermarking via Lexical Substitutions to Enhance Membership Verification of an LLM's Textual Training Data
- Thought Purity: A Defense Framework For Chain-of-Thought Attack
- VDDP: Verifiable Distributed Differential Privacy under the Client-Server-Verifier Setup
- System Prompt Poisoning: Persistent Attacks on Large Language Models Beyond User Injection
- The Security Threat of Compressed Projectors in Large Vision-Language Models
- Backdoors in Code Summarizers: How Bad Is It?
- LLMalMorph: On The Feasibility of Generating Variant Malware using Large-Language-Models
- Contrastive-KAN: A Semi-Supervised Intrusion Detection Framework for Cybersecurity with scarce Labeled Data
- TopicAttack: An Indirect Prompt Injection Attack via Topic Transition
- PP-STAT: An Efficient Privacy-Preserving Statistical Analysis Framework using Homomorphic Encryption
- From Protest to Power Plant: Interpreting the Role of Escalatory Hacktivism in Cyber Conflict
- HIPAAChecker: The Comprehensive Solution for HIPAA Compliance in Android mHealth Apps
- Position Paper: Assessing Robustness, Privacy, and Fairness in Federated Learning Integrated with Foundation Models
- SteerDiff: Steering towards Safe Text-to-Image Diffusion Models
- CountCrypt: Quantum Cryptography between QCMA and PP
- Quantum Cryptography and Hardness of Non-Collapsing Measurements
- Collusion-Resistant Quantum Secure Key Leasing Beyond Decryption
- Federated Computation of ROC and PR Curves
- What your brain activity says about you: A review of neuropsychiatric disorders identified in resting-state and sleep EEG data
- Less is More: On Copy Complexity in Quantum Cryptography
- Imperceptible Jailbreaking against Large Language Models
- On Cryptography and Distribution Verification, with Applications to Quantum Advantage
- Multi-Agent Distributed Optimization With Feasible Set Privacy
- SCART: Simulation of Cyber Attacks for Real-Time
- Impact of Dataset Properties on Membership Inference Vulnerability of Deep Transfer Learning
- Retrofitting XoM for Stripped Binaries without Embedded Data Relocation
- Proof-of-Data: A Consensus Protocol for Collaborative Intelligence
- Can Indirect Prompt Injection Attacks Be Detected and Removed?
- AttackSeqBench: Benchmarking Large Language Models in Analyzing Attack Sequences within Cyber Threat Intelligence
- From Cyber Security Incident Management to Cyber Security Crisis Management in the European Union
- DualBreach: Efficient Dual-Jailbreaking via Target-Driven Initialization and Multi-Target Optimization
- RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection
- NatGVD: Natural Adversarial Example Attack towards Graph-based Vulnerability Detection
- Proactive defense against LLM Jailbreak
- Adversarial training with restricted data manipulation
- WAREX: Web Agent Reliability Evaluation on Existing Benchmarks
- LegalSim: Multi-Agent Simulation of Legal Systems for Discovering Procedural Exploits
- Repairing Leaks in Resource Wrappers
- A Quantum-Secure Voting Framework Using QKD, Dual-Key Symmetric Encryption, and Verifiable Receipts
- A Lightweight Federated Learning Approach for Privacy-Preserving Botnet Detection in IoT
- Machine Unlearning Meets Adversarial Robustness via Constrained Interventions on LLMs
- Cross-Modal Content Optimization for Steering Web Agent Preferences
- From Theory to Practice: Evaluating Data Poisoning Attacks and Defenses in In-Context Learning on Social Media Health Discourse
- Spatial CAPTCHA: Generatively Benchmarking Spatial Reasoning for Human-Machine Differentiation
- Quantifying Risks in Multi-turn Conversation with Large Language Models
- Strategic Communication Protocols for Interstellar Objects Using a Threat-Communication Viability Index and the Information-Communication Paradox
- Multi-Class Support Vector Machine with Differential Privacy
- Proofs of quantum memory
- SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations
- AgentTypo: Adaptive Typographic Prompt Injection Attacks against Black-box Multimodal Agents
- VortexPIA: Indirect Prompt Injection Attack against LLMs for Efficient Extraction of User Privacy
- MulVuln: Enhancing Pre-trained LMs with Shared and Language-Specific Knowledge for Multilingual Vulnerability Detection
- P2P: A Poison-to-Poison Remedy for Reliable Backdoor Defense in LLMs
- Unified Threat Detection and Mitigation Framework (UTDMF): Combating Prompt Injection, Deception, and Bias in Enterprise-Scale Transformers
- Computational Certified Deletion Property of Magic Square Game and its Application to Classical Secure Key Leasing
- PoS-CoPOR: Proof-of-Stake Consensus Protocol with Native Onion Routing Providing Scalability and DoS-Resistance
- Backing the Wrong Horse: How Bit-Level Netlist Augmentation can Counter Power Side Channel Attacks
- Modeling and Managing Temporal Obligations in GUCON Using SPARQL-star and RDF-star
- Enhancing TreePIR for a Single-Server Setting via Resampling
- Securing Operating Systems Through Fine-grained Kernel Access Limitation for IoT Systems
- Public-Key Encryption from the MinRank Problem
- You Have Been LaTeXpOsEd: A Systematic Analysis of Information Leakage in Preprint Archives Using Large Language Models
- Complex Domain Approach for Reversible Data Hiding and Homomorphic Encryption: General Framework and Application to Dispersed Data
- Security Analysis of Ponzi Schemes in Ethereum Smart Contracts
- Pilot Contamination Attacks Detection with Machine Learning for Multi-User Massive MIMO
- Quantifying Distributional Robustness of Agentic Tool-Selection
- PrivSpike: Employing Homomorphic Encryption for Private Inference of Deep Spiking Neural Networks
- FHEON: A Configurable Framework for Developing Privacy-Preserving Neural Networks Using Homomorphic Encryption
- Real-VulLLM: An LLM Based Assessment Framework in the Wild
- Gluing Random Unitaries with Inverses and Applications to Strong Pseudorandom Unitaries
- Cyber Warfare During Operation Sindoor: Malware Campaign Analysis and Detection Framework
- ObCLIP: Oblivious CLoud-Device Hybrid Image Generation with Privacy Preservation
- RAG over Tables: Hierarchical Memory Index, Multi-Stage Retrieval, and Benchmarking
- SVDefense: Effective Defense against Gradient Inversion Attacks via Singular Value Decomposition
- Attack logics, not outputs: Towards efficient robustification of deep neural networks by falsifying concept-based properties
- Security Analysis and Threat Modeling of Research Management Applications [Extended Version]
- NEXUS: Network Exploration for eXploiting Unsafe Sequences in Multi-Turn LLM Jailbreaks
- A Multi-Layer Electronic and Cyber Interference Model for AI-Driven Cruise Missiles: The Case of Khuzestan Province
- CryptOracle: A Modular Framework to Characterize Fully Homomorphic Encryption
- PentestMCP: A Toolkit for Agentic Penetration Testing
- Explainable but Vulnerable: Adversarial Attacks on XAI Explanation in Cybersecurity Applications
- On the Limits of Consensus under Dynamic Availability and Reconfiguration
- QPADL: Post-Quantum Private Spectrum Access with Verified Location and DoS Resilience
- A Time-Bound Signature Scheme for Blockchains
- Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods
- Shrinking the Kernel Attack Surface Through Static and Dynamic Syscall Limitation
- Topic-Specific Classifiers are Better Relevance Judges than Prompted LLMs
- LLM, Reporting In! Medical Information Extraction Across Prompting, Fine-tuning and Post-correction
- Automating construction safety inspections using a multi-modal vision-language RAG framework
- Epistemic Diversity and Knowledge Collapse in Large Language Models
- Evaluating Keyframe Layouts for Visual Known-Item Search in Homogeneous Collections
- GRACE: Generative Representation Learning via Contrastive Policy Optimization
- Fine-grained auxiliary learning for real-world product recommendation
- Contrastive Learning Using Graph Embeddings for Domain Adaptation of Language Models in the Process Industry
- Exploring Applications of State Space Models and Advanced Training Techniques in Sequential Recommendations: A Comparative Study on Efficiency and Performance
- Prompt Tuning as User Inherent Profile Inference Machine
- Learning Refined Document Representations for Dense Retrieval via Deliberate Thinking
- LLM-CoT Enhanced Graph Neural Recommendation with Harmonized Group Policy Optimization
- Scientific Paper Retrieval with LLM-Guided Semantic-Based Ranking
- Query Drift Compensation: Enabling Compatibility in Continual Learning of Retrieval Embedding Models
- Towards Understanding Bias in Synthetic Data for Evaluation
- Investigating LLM Variability in Personalized Conversational Information Retrieval
- Beyond Static Evaluation: Rethinking the Assessment of Personalized Agent Adaptability in Information Retrieval
- Visual Lifelog Retrieval through Captioning-Enhanced Interpretation
- The LCLStream Ecosystem for Multi-Institutional Dataset Exploration
- RLRF: Competitive Search Agent Design via Reinforcement Learning from Ranker Feedback
- Learning-Based Hashing for ANN Search: Foundations and Early Advances
- Empowering Denoising Sequential Recommendation with Large Language Model Embeddings
- Causality-aware Graph Aggregation Weight Estimator for Popularity Debiasing in Top-K Recommendation
- MARCO: A Cooperative Knowledge Transfer Framework for Personalized Cross-domain Recommendations
- Enhancing Foveated Rendering with Weighted Reservoir Sampling
- 3Dify: a Framework for Procedural 3D-CG Generation Assisted by LLMs Using MCP and RAG
- C3Editor: Achieving Controllable Consistency in 2D Model for 3D Editing
- Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents
- Bridging Text and Video Generation: A Survey
- SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder
- Pulp Motion: Framing-aware multimodal camera and human motion generation
- Textured Gaussians for Enhanced 3D Scene Appearance Modeling
- ProcTex: Consistent and Interactive Text-to-texture Synthesis for Part-based Procedural Models
- Mixture of Contexts for Long Video Generation
- WonderHuman: Hallucinating Unseen Parts in Dynamic 3D Human Reconstruction
- TCDiff++: An End-to-end Trajectory-Controllable Diffusion Model for Harmonious Music-Driven Group Choreography
- Evaluating High-Resolution Piano Sustain Pedal Depth Estimation with Musically Informed Metrics
- "It felt more real": Investigating the User Experience of the MiWaves Personalizing JITAI Pilot Study
- Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models
- Human Empathy as Encoder: AI-Assisted Depression Assessment in Special Education
- MURMR: A Multimodal Sensing Framework for Automated Group Behavior Analysis in Mixed Reality
- "Think about it like you're a firefighter": Understanding How Reddit Moderators Use the Modqueue
- Can Large Language Models generalize analogy solving like children can?
- Student-AI Interaction in an LLM-Empowered Learning Environment: A Cluster Analysis of Engagement Profiles
- The Ultimate Configuration Management Tool? Lessons from a Mixed Methods Study of Ansible's Challenges
- Advancing Brainwave Modeling with a Codebook-Based Foundation Model
- The Narcissus Hypothesis: Descending to the Rung of Illusion
- Creative synthesis of kinematic mechanisms
- Universal Beta Splatting
- Style Brush: Guided Style Transfer for 3D Objects
- Paris: A Decentralized Trained Open-Weight Diffusion Model
- Neon: Negative Extrapolation From Self-Training Improves Image Generation
- Diverse Text-to-Image Generation via Contrastive Noise Optimization
- Joint Neural SDF Reconstruction and Semantic Segmentation for CAD Models
- Investigating mixed traffic dynamics of pedestrians and non-motorized vehicles at urban intersections: Observation experiments and modelling
- A survey on the impact of emotions on the productivity among software developers
- ReactDiff: Fundamental Multiple Appropriate Facial Reaction Diffusion Model
- Social bias is prevalent in user reports of hate and abuse online
- Understanding User Perception and Intention to Use Smart Homes for Energy Efficiency: A Survey
- Ads that Talk Back: Implications and Perceptions of Injecting Personalized Advertising into LLM Chatbots
- Towards a Better Modqueue: Designing for Diversity Across Moderator Objectives and Workflows
- EEG-based AI-BCI Wheelchair Advancement: A Brain-Computer Interfacing Wheelchair System Using Deep Learning Approach
- Privacy Leakage Overshadowed by Views of AI: A Study on Human Oversight of Privacy in Language Model Agent
- Understanding User Mental Models in AI-Driven Code Completion Tools: Insights from an Elicitation Study
- "When I lost it, they dragged me out": How Care Encounters Empower Marginalized Young Adults' Aspiration and Mental Health Care-Seeking
- Beyond Training: How Workers Discover Value in Enterprise AI
- NaturalEdit: Code Modification through Direct Interaction with Adaptive Natural Language Representation
- What Do We Mean When We Talk About Data Storytelling?
- Trust in Transparency: How Explainable AI Shapes User Perceptions
- NERVIS: An Interactive System for Graph-Based Exploration and Editing of Named Entities
- Observing Without Doing: Pseudo-Apprenticeship Patterns in Student LLM Use
- CAG: Chunked Augmented Generation for Google Chrome's Built-in Gemini Nano
- PrivacyMotiv: Speculative Persona Journeys for Empathic and Motivating Privacy Reviews in UX Design
- A Survey of LLM-Based Applications in Programming Education: Balancing Automation and Human Oversight
- Smart Paste: Automatically Fixing Copy/Paste for Google Developers
- Talking Tennis: Language Feedback from 3D Biomechanical Action Recognition
- Reconsidering Requirements Engineering: Human-AI Collaboration in AI-Native Software Development
- Mixed Reality Guidance of a Surgical Scalpel Using Magic Leap: Evaluation on a 3D-Printed Liver Phantom
- Invisible Saboteurs: Sycophantic LLMs Mislead Novices in Problem-Solving Tasks
- Bridging the Gap: Enhancing Gaze-Performance Link in Children with ASD through Dual-Level Visual Guidance in MR-DMT
- Teaching with AI: A Systematic Review of Chatbots, Generative Tools, and Tutoring Systems in Programming Education
- AI-Driven Grading and Moderation for Collaborative Projects in Computer Science Education
- Wrist2Finger: Sensing Fingertip Force for Force-Aware Hand Interaction with a Ring-Watch Wearable
- Pedestrian collision avoidance in hemianopia during natural walking in immersive virtual reality
- When AI Gets Persuaded, Humans Follow: Inducing the Conformity Effect in Persuasive Dialogue
- Reflection Before Action: Designing a Framework for Quantifying Thought Patterns for Increased Self-awareness in Personal Decision Making
- Beyond the Benefits: A Systematic Review of the Harms and Consequences of Generative AI in Computing Education
- AgentBuilder: Exploring Scaffolds for Prototyping User Experiences of Interface Agents
- Autonomy Matters: A Study on Personalization-Privacy Dilemma in LLM Agents
- Multi-Hop Question Answering: When Can Humans Help, and Where do They Struggle?
- A Hierarchical Control Architecture for Space Robots in On-Orbit Servicing Operations
- In-Hand Manipulation of Articulated Tools with Dexterous Robot Hands with Sim-to-Real Transfer
- KiVi: Kinesthetic-Visuospatial Integration for Dynamic and Safe Egocentric Legged Locomotion
- HeLoM: Hierarchical Learning for Whole-Body Loco-Manipulation in Hexapod Robot
- ShapeICP: Iterative Category-level Object Pose and Shape Estimation from Depth
- CHOICE: Coordinated Human-Object Interaction in Cluttered Environments for Pick-and-Place Actions
- RowDetr: End-to-End Crop Row Detection Using Polynomials
- Universal Adversarial Perturbation Attacks On Modern Behavior Cloning Policies
- Rough Stochastic Pontryagin Maximum Principle and an Indirect Shooting Method
- Motion Blender Gaussian Splatting for Dynamic Scene Reconstruction
- Learning Closed-Loop Parametric Nash Equilibria of Multi-Agent Collaborative Field Coverage
- LIAM: Multimodal Transformer for Language Instructions, Images, Actions and Semantic Maps
- AutoDrive-QA: A Multiple-Choice Benchmark for Vision-Language Evaluation in Urban Autonomous Driving
- Physics-Based Motion Imitation with Adversarial Differential Discriminators
- Immersive Mixed Reality Simulator for CT Scan Preparation: Enhancing Patient Emotional and Physical Readiness
- ReLI: A Language-Agnostic Approach to Human-Robot Interaction
- Digital-physical testbed for ship autonomy studies in the Marine Cybernetics Laboratory basin
- Neural Brain: A Neuroscience-inspired Framework for Embodied Agents
- Learning a Unified Policy for Position and Force Control in Legged Loco-Manipulation
- FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
- Viewpoint-Agnostic Manipulation Policies with Strategic Vantage Selection
- ATK: Automatic Task-driven Keypoint Selection for Robust Policy Learning
- Time-Optimized Safe Navigation in Unstructured Environments through Learning Based Depth Completion
- LERa: Replanning with Visual Feedback in Instruction Following
- Learning Geometry-Aware Nonprehensile Pushing and Pulling with Dexterous Hands
- Cyber Racing Coach: A Haptic Shared Control Framework for Teaching Advanced Driving Skills
- RT-GuIDE: Real-Time Gaussian Splatting for Information-Driven Exploration
- Dynamic Neural Potential Field: Online Trajectory Optimization in the Presence of Moving Obstacles
- NDOB-Based Control of a UAV with Delta-Arm Considering Manipulator Dynamics
- Online Hybrid-Belief POMDP with Coupled Semantic-Geometric Models
- ToddlerBot: Open-Source ML-Compatible Humanoid Platform for Loco-Manipulation
- Training-free Task-oriented Grasp Generation
- Nonparametric adaptive payload tracking for an offshore crane
- Humanoid Policy ~ Human Policy
- Learning to Play Piano in the Real World
- A Corrector-aided Look-ahead Distance-based Guidance for Online Reference Path Following with an Efficient Mid-course Guidance Strategy
- BiDexHand: Design and Evaluation of an Open-Source 16-DoF Biomimetic Dexterous Hand
- Distributed Area Coverage with High Altitude Balloons Using Multi-Agent Reinforcement Learning
- LIBERO-PRO: Towards Robust and Fair Evaluation of Vision-Language-Action Models Beyond Memorization
- Bridge Thinking and Acting: Unleashing Physical Potential of VLM with Generalizable Action Expert
- OpenFLAME: Federated Visual Positioning System to Enable Large-Scale Augmented Reality Applications
- A KL-regularization framework for learning to plan with adaptive priors
- RAP: 3D Rasterization Augmented End-to-End Planning
- More Than Meets the Eye? Uncovering the Reasoning-Planning Disconnect in Training Vision-Language Driving Models
- Learning a Shape-adaptive Assist-as-needed Rehabilitation Policy from Therapist-informed Input
- Efficient Probabilistic Planning with Maximum-Coverage Distributionally Robust Backward Reachable Trees
- A Benchmarking Study of Vision-Based Robotic Grasping Algorithms: A Comparative Analysis
- Code Generation and Conic Constraints for Model-Predictive Control on Microcontrollers with Conic-TinyMPC
- Dual-arm Motion Generation for Repositioning Care based on Deep Predictive Learning with Somatosensory Attention Mechanism
- Pragmatic Embodied Spoken Instruction Following in Human-Robot Collaboration with Theory of Mind
- CLEAR-IR: Clarity-Enhanced Active Reconstruction of Infrared Imagery
- HyperVLA: Efficient Inference in Vision-Language-Action Models via Hypernetworks
- Efficient Navigation in Unknown Indoor Environments with Vision-Language Models
- Walking, Rolling, and Beyond: First-Principles and RL Locomotion on a TARS-Inspired Robot
- StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation
- Automaton Constrained Q-Learning
- ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning
- Adaptive Cruise Control in Autonomous Vehicles: Challenges, Gaps, Comprehensive Review, and Future Directions
- Viability-Preserving Passive Torque Control
- Real-Time Threaded Houbara Detection and Segmentation for Wildlife Conservation using Mobile Platforms
- Agile Tradespace Exploration for Space Rendezvous Mission Design via Transformers
- SketchPlan: Diffusion Based Drone Planning From Human Sketches
- Deep Reinforcement Learning for Multi-Agent Coordination
- ContextVLA: Vision-Language-Action Model with Amortized Multi-Frame Context
- Integrated Planning and Control on Manifolds: Factor Graph Representation and Toolkit
- Stability-Aware Retargeting for Humanoid Multi-Contact Teleoperation
- Reliable and Scalable Robot Policy Evaluation with Imperfect Simulators
- PAD-TRO: Projection-Augmented Diffusion for Direct Trajectory Optimization
- Velocity-Form Data-Enabled Predictive Control of Soft Robots under Unknown External Payloads
- Everything-Grasping (EG) Gripper: A Universal Gripper with Synergistic Suction-Grasping Capabilities for Cross-Scale and Cross-State Manipulation
- MobRT: A Digital Twin-Based Framework for Scalable Learning in Mobile Manipulation
- OKVIS2-X: Open Keyframe-based Visual-Inertial SLAM Configurable with Dense Depth or LiDAR, and GNSS
- Bio-Inspired Robotic Houbara: From Development to Field Deployment for Behavioral Studies
- Building Gradient by Gradient: Decentralised Energy Functions for Bimanual Robot Assembly
- Performance-guided Task-specific Optimization for Multirotor Design
- Online automatic code generation for robot swarms: LLMs and self-organizing hierarchy
- TAG-K: Tail-Averaged Greedy Kaczmarz for Computationally Efficient and Performant Online Inertial Parameter Estimation
- Model-Based Adaptive Precision Control for Tabletop Planar Pushing Under Uncertain Dynamics
- Trajectory prediction for heterogeneous agents: A performance analysis on small and imbalanced datasets
- COVER: COverage-VErified Roadmaps for Fixed-time Motion Planning in Continuous Semi-Static Environments
- Seeing the Bigger Picture: 3D Latent Mapping for Mobile Manipulation Policy Learning
- NoTVLA: Narrowing of Dense Action Trajectories for Generalizable Robot Manipulation
- WAFFLE: A Wearable Approach to Bite Timing Estimation in Robot-Assisted Feeding
- TCB-VIO: Tightly-Coupled Focal-Plane Binary-Enhanced Visual Inertial Odometry
- A Real-Time Framework for Intermediate Map Construction and Kinematically Feasible Off-Road Planning Without OSM
- SITCOM: Scaling Inference-Time COMpute for VLAs
- Feedback Matters: Augmenting Autonomous Dissection with Visual and Topological Feedback
- From Shadow to Light: Toward Safe and Efficient Policy Learning Across MPC, DeePC, RL, and LLM Agents
- HEHA: Hierarchical Planning for Heterogeneous Multi-Robot Exploration of Unknown Environments
- Learning to Capture Rocks using an Excavator: A Reinforcement Learning Approach with Guiding Reward Formulation
- VBM-NET: Visual Base Pose Learning for Mobile Manipulation using Equivariant TransporterNet and GNNs
- Using Robotics to Improve Transcatheter Edge-to-Edge Repair of the Mitral Valve
- Zenbo Patrol: A Social Assistive Robot Based on Multimodal Deep Learning for Real-time Illegal Parking Recognition and Notification
- Flexible Locomotion Learning with Diffusion Model Predictive Control
- A Simulation Evaluation Suite for Robust Adaptive Quadcopter Control
- Destination-to-Chutes Task Mapping Optimization for Multi-Robot Coordination in Robotic Sorting Systems
- Robust Permissive Controller Synthesis for Interval MDPs
- Digital-Twin Evaluation for Proactive Human-Robot Collision Avoidance via Prediction-Guided A-RRT*
- Distributed Connectivity Maintenance and Recovery for Quadrotor Motion Planning
- LapSurgie: Humanoid Robots Performing Surgery via Teleoperated Handheld Laparoscopy
- Efficient Surgical Robotic Instrument Pose Reconstruction in Real World Conditions Using Unified Feature Detection
- Shape-Space Graphs: Fast and Collision-Free Path Planning for Soft Robots
- Learning to Act Through Contact: A Unified View of Multi-Task Robot Learning
- Safety-Oriented Dynamic Path Planning for Automated Vehicles
- Geometrically Exact Hard Magneto-Elastic Cosserat Shells: Static Formulation for Shape Morphing
- An Amphibious Untethered Inchworm Soft Robot for Fast Crawling Locomotion
- Robust Visual Embodiment: How Robots Discover Their Bodies in Real Environments
- EmbodiSwap for Zero-Shot Robot Imitation Learning
- Summaries as Centroids for Interpretable and Scalable Text Clustering
- Interpretable Visualizations of Data Spaces for Classification Problems
- Dependency-aware Maximum Likelihood Estimation for Active Learning
- Density Ratio-based Causal Discovery from Bivariate Continuous-Discrete Data
- Directional Convergence, Benign Overfitting of Gradient Descent in leaky ReLU two-layer Neural Networks
- Gradient-flow SDEs have unique transient population dynamics
- Inference-time Scaling of Diffusion Models through Classical Search
- Gemini Robotics 1.5: Pushing the Frontier of Generalist Robots with Advanced Embodied Reasoning, Thinking, and Motion Transfer
- Optimal swimming with body compliance in an overdamped medium
- Warm-Starting Optimization-Based Motion Planning for Robotic Manipulators via Point Cloud-Conditioned Flow Matching
- Vector Copula Variational Inference and Dependent Block Posterior Approximations
- Graph Alignment via Birkhoff Relaxation
- Critical Points of Random Neural Networks
- Uniform convergence of the smooth calibration error and its relationship with functional gradient
- Probably Approximately Correct Labels
- Conformal Prediction for Long-Tailed Classification
- On amortizing convex conjugates for optimal transport
- Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models
- What is Intelligence? A Cycle Closure Perspective
- A Deterministic Information Bottleneck Method for Clustering Mixed-Type Data
- Maximum mean discrepancies of Farey sequences
- Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis
- Learning to Bid in Non-Stationary Repeated First-Price Auctions
- Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training
- Rethinking Langevin Thompson Sampling from A Stochastic Approximation Perspective
- ResCP: Reservoir Conformal Prediction for Time Series Forecasting
- TopInG: Topologically Interpretable Graph Learning via Persistent Rationale Filtration
- Mixtures of Gaussian Process Experts with SMC$^2$
- Fitted value iteration methods for bicausal optimal transport
- Universality of Kernel Random Matrices and Kernel Regression in the Quadratic Regime
- Approximation Bounds for Recurrent Neural Networks with Application to Regression
- Machine Learning for Inverse Problems and Data Assimilation
- Counterfactual explainability and analysis of variance
- Another look at inference after prediction
- A Statistical Hypothesis Testing Framework for Data Misappropriation Detection in Large Language Models
- Efficient Sparsification of Simplicial Complexes via Local Densities of States
- Inverse Mixed-Integer Programming: Learning Constraints then Objective Functions
- Two new approaches to multiple canonical correlation analysis for repeated measures data
- Demystifying MaskGIT Sampler and Beyond: Adaptive Order Selection in Masked Diffusion
- Graph-based Tabular Deep Learning Should Learn Feature Interactions, Not Just Make Predictions
- Learning Linear Regression with Low-Rank Tasks in-Context
- SONA: Learning Conditional, Unconditional, and Mismatching-Aware Discriminator
- Busemann Functions in the Wasserstein Space: Existence, Closed-Forms, and Applications to Slicing
- Closed-Form Last Layer Optimization
- On decomposability and subdifferential of the tensor nuclear norm
- When Do Credal Sets Stabilize? Fixed-Point Theorems for Credal Set Updates
- Discrete scalar curvature as a weighted sum of Ollivier-Ricci curvatures
- On Structured State-Space Duality
- Technical note on Fisher Information for Robust Federated Cross-Validation
- Technical note on Sequential Test-Time Adaptation via Martingale-Driven Fisher Prompting
- The Hidden Game Problem
- On Provable Benefits of Muon in Federated Learning
- Optimal Scaling Needs Optimal Norm
- Analysis of kinetic Langevin Monte Carlo under the stochastic exponential Euler discretization from underdamped all the way to overdamped
- Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning
- Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees
- Don't Pass$\mathtt{@}k$: A Bayesian Framework for Large Language Model Evaluation
- Probing Geometry of Next Token Prediction Using Cumulant Expansion of the Softmax Entropy
- Arithmetic-Mean $\mu$P for Modern Architectures: A Unified Learning-Rate Scale for CNNs and ResNets
- Score-based generative emulation of impact-relevant Earth system model outputs
- spd-metrics-id: A Python Package for SPD-Aware Distance Metrics in Connectome Fingerprinting and Beyond
- Domain Generalization: A Tale of Two ERMs
- Towards Sampling Data Structures for Tensor Products in Turnstile Streams
- Group Policy Gradient
- From Moments to Models: Graphon Mixture-Aware Mixup and Contrastive Learning
- Balancing Interpretability and Performance in Reinforcement Learning: An Adaptive Spectral Based Linear Approach
- Cost Efficient Fairness Audit Under Partial Feedback
- Allocation of Parameters in Transformers
- Robust Batched Bandits
- TROLL: Trust Regions improve Reinforcement Learning for Large Language Models
- Proximal Diffusion Neural Sampler
- HOFLON: Hybrid Offline Learning and Online Optimization for Process Start-Up and Grade-Transition Control
- Longitudinal Flow Matching for Trajectory Modeling
- BEKAN: Boundary condition-guaranteed evolutionary Kolmogorov-Arnold networks with radial basis functions for solving PDE problems
- Latent Mixture of Symmetries for Sample-Efficient Dynamic Learning
- Exact and Approximate MCMC for Doubly-intractable Probabilistic Graphical Models Leveraging the Underlying Independence Model
- Understanding the Role of Training Data in Test-Time Scaling
- Explore the Loss space with Hill-ADAM
- Neural Bayesian Filtering
- Handling Missing Data in Probabilistic Regression Trees: Methods and Implementation in R
- Implicit Models: Expressive Power Scales with Test-Time Compute
- Does higher interpretability imply better utility? A Pairwise Analysis on Sparse Autoencoders
- Bias and Coverage Properties of the WENDy-IRLS Algorithm
- Multi-task neural diffusion processes for uncertainty-quantified wind power prediction
- Consistent Kernel Change-Point Detection under m-Dependence for Text Segmentation
- Optimal Regularization Under Uncertainty: Distributional Robustness and Convexity Constraints
- On residual network depth
- Trajectory Data Suffices for Statistically Efficient Policy Evaluation in Finite-Horizon Offline RL with Linear $q^\pi$-Realizability and Concentrability
- Composite Optimization with Error Feedback: the Dual Averaging Approach
- Long-Term Mapping of the Douro River Plume with Multi-Agent Reinforcement Learning
- Sequential decoder training for improved latent space dynamics identification
- Learning Survival Models with Right-Censored Reporting Delays
- Divergence Phase Index: A Riesz-Transform Framework for Multidimensional Phase Difference Analysis
- Gini-based Model Monitoring: A General Framework with an Application to Non-life Insurance Pricing
- Computing Wasserstein Barycenters through Gradient Flows
- Fisher-Bingham-like normalizing flows on the sphere
- Kernel ridge regression under power-law data: spectrum and generalization
- A Noise Resilient Approach for Robust Hurst Exponent Estimation
- Set to Be Fair: Demographic Parity Constraints for Set-Valued Classification
- Embracing Discrete Search: A Reasonable Approach to Causal Structure Learning
- Curiosity-Driven Co-Development of Action and Language in Robots Through Self-Exploration
- Causal Abstractions, Categorically Unified
- Machine Learning Workflows in Climate Modeling: Design Patterns and Insights from Case Studies
- Estimating link level traffic emissions: enhancing MOVES with open-source data
- Quantile-Scaled Bayesian Optimization Using Rank-Only Feedback
- Mathematically rigorous proofs for Shapley explanations
- Transformed $\ell_1$ Regularizations for Robust Principal Component Analysis: Toward a Fine-Grained Understanding
- The analogy theorem in Hoare logic
- Spectral Thresholds for Identifiability and Stability: Finite-Sample Phase Transitions in High-Dimensional Learning
- Self-Speculative Masked Diffusions
- Simulation-based inference via telescoping ratio estimation for trawl processes
- Scalable Causal Discovery from Recursive Nonlinear Data via Truncated Basis Function Scores and Tests
- Relative Information Gain and Gaussian Process Regression
- Adaptive Coverage Policies in Conformal Prediction
- Modular and Adaptive Conformal Prediction for Sequential Models via Residual Decomposition
Research Sources: 1320 | Generated: 10/7/2025