AI RESEARCH PAPERS & ACADEMIC SOURCES
- Optimized 3D Gaussian Splatting using Coarse-to-Fine Image Frequency Modulation
- Advancing Image Super-resolution Techniques in Remote Sensing: A Comprehensive Survey
- Lost in the Maze: Overcoming Context Limitations in Long-Horizon Agentic Search
- Dynamic Evaluation for Oversensitivity in LLMs
- Are they lovers or friends? Evaluating LLMs' Social Reasoning in English and Korean Dialogues
- Re:Member: Emotional Question Generation from Personal Memories
- When Can We Trust LLMs in Mental Health? Large-Scale Benchmarks for Reliable LLM Evaluation
- From Memorization to Generalization: Fine-Tuning Large Language Models for Biomedical Term-to-Identifier Normalization
- Tibetan Language and AI: A Comprehensive Survey of Resources, Methods and Challenges
- "You Are Rejected!": An Empirical Study of Large Language Models Taking Hiring Evaluations
- Think Straight, Stop Smart: Structured Reasoning for Efficient Multi-Hop RAG
- Multi-Faceted Evaluation of Tool-Augmented Dialogue Systems
- DiSRouter: Distributed Self-Routing for LLM Selections
- Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+
- SheetBrain: A Neuro-Symbolic Agent for Accurate Reasoning over Complex and Large Spreadsheets
- Difficulty-Controllable Multiple-Choice Question Generation Using Large Language Models and Direct Preference Optimization
- TheMCPCompany: Creating General-purpose Agents with Task-specific Tools
- JointCQ: Improving Factual Hallucination Detection with Joint Claim and Query Generation
- KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints
- HAD: HAllucination Detection Language Models Based on a Comprehensive Hallucination Taxonomy
- Slot Filling as a Reasoning Task for SpeechLLMs
- Algorithmic Fairness in NLP: Persona-Infused LLMs for Human-Centric Hate Speech Detection
- Local Obfuscation by GLINER for Impartial Context Aware Lineage: Development and evaluation of PII Removal system
- Modeling Turn-Taking with Semantically Informed Gestures
- LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
- Sign Language Translation with Sentence Embedding Supervision
- SONAR-SLT: Multilingual Sign Language Translation via Language-Agnostic Sentence Embedding Supervision
- Spatio-temporal Sign Language Representation and Translation
- BLiSS 1.0: Evaluating Bilingual Learner Competence in Second Language Small Language Models
- MINED: Probing and Updating with Multimodal Time-Sensitive Knowledge for Large Multimodal Models
- Machine Text Detectors are Membership Inference Attacks
- What is the Best Sequence Length for BABYLM?
- Lookahead Routing for Large Language Models
- Which Evaluation for Which Model? A Taxonomy for Speech Model Assessment
- Conditions for Catastrophic Forgetting in Multilingual Translation
- PBBQ: A Persian Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models
- CrossNews-UA: A Cross-lingual News Semantic Similarity Benchmark for Ukrainian, Polish, Russian, and English
- LLavaCode: Compressed Code Representations for Retrieval-Augmented Code Generation
- DiffAdapt: Difficulty-Adaptive Reasoning for Token-Efficient LLM Inference
- CoSense-LLM: Semantics at the Edge with Cost- and Uncertainty-Aware Cloud-Edge Cooperation
- From Answers to Guidance: A Proactive Dialogue System for Legal Documents
- Adapting Multilingual Models to Code-Mixed Tasks via Model Merging
- ToolDreamer: Instilling LLM Reasoning Into Tool Retrievers
- The Art of Asking: Multilingual Prompt Optimization for Synthetic Data
- Towards Better Health Conversations: The Benefits of Context-seeking
- OpenGuardrails: An Open-Source Context-Aware AI Guardrails Platform
- Aligning Multilingual News for Stock Return Prediction
- [De|Re]constructing VLMs' Reasoning in Counting
- olmOCR 2: Unit Test Rewards for Document OCR
- LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K
- CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation
- Can Large Language Models be Effective Online Opinion Miners?
- metaTextGrad: Automatically optimizing language model optimizers
- ETT: Expanding the Long Context Understanding Capability of LLMs at Test-Time
- PixelWorld: How Far Are We from Perceiving Everything as Pixels?
- ScholaWrite: A Dataset of End-to-End Scholarly Writing Process
- WikiVideo: Article Generation from Multiple Videos
- Efficient Interleaved Speech Modeling through Knowledge Distillation
- Dimensionality Reduction for Remote Sensing Data Analysis: A Systematic Review of Methods and Applications
- Ninja Codes: Neurally Generated Fiducial Markers for Stealthy 6-DoF Tracking
- MoAlign: Motion-Centric Representation Alignment for Video Diffusion Models
- UniHPR: Unified Human Pose Representation via Singular Value Contrastive Learning
- Advancing Brain Tumor Segmentation via Attention-based 3D U-Net Architecture and Digital Image Processing
- FootFormer: Estimating Stability from Visual Input
- Malaria Detection from Blood Cell Images Using XceptionNet
- Video Consistency Distance: Enhancing Temporal Consistency for Image-to-Video Generation via Reward-Based Fine-Tuning
- MoE-GS: Mixture of Experts for Dynamic Gaussian Splatting
- SFGFusion: Surface Fitting Guided 3D Object Detection with 4D Radar and Camera Fusion
- Space Object Detection using Multi-frame Temporal Trajectory Completion Method
- Background Fades, Foreground Leads: Curriculum-Guided Background Pruning for Efficient Foreground-Centric Collaborative Perception
- Advances in 4D Representation: Geometry, Motion, and Interaction
- SCEESR: Semantic-Control Edge Enhancement for Diffusion-Based Super-Resolution
- MobiAct: Efficient MAV Action Recognition Using MobileNetV4 with Contrastive Learning and Knowledge Distillation
- D2D: Detector-to-Differentiable Critic for Improved Numeracy in Text-to-Image Generation
- Vision-Based Mistake Analysis in Procedural Activities: A Review of Advances and Challenges
- Unified Reinforcement and Imitation Learning for Vision-Language Models
- Exploring Scale Shift in Crowd Localization under the Context of Domain Generalization
- BrainMCLIP: Brain Image Decoding with Multi-Layer feature Fusion of CLIP
- A Training-Free Framework for Open-Vocabulary Image Segmentation and Recognition with EfficientNet and CLIP
- DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents
- DARE: A Deformable Adaptive Regularization Estimator for Learning-Based Medical Image Registration
- AegisRF: Adversarial Perturbations Guided with Sensitivity for Protecting Intellectual Property of Neural Radiance Fields
- Seeing Across Views: Benchmarking Spatial Reasoning of Vision-Language Models in Robotic Scenes
- Multi-Camera Worker Tracking in Logistics Warehouse Considering Wide-Angle Distortion
- Reasoning Like Experts: Leveraging Multimodal Large Language Models for Drawing-based Psychoanalysis
- Predicting before Reconstruction: A generative prior framework for MRI acceleration
- PRGCN: A Graph Memory Network for Cross-Sequence Pattern Reuse in 3D Human Pose Estimation
- Mitigating representation bias caused by missing pixels in methane plume detection
- Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts
- PoseCrafter: Extreme Pose Estimation with Hybrid Video Synthesis
- The Intricate Dance of Prompt Complexity, Quality, Diversity, and Consistency in T2I Models
- HAD: Hierarchical Asymmetric Distillation to Bridge Spatio-Temporal Gaps in Event-Based Object Tracking
- Can You Trust What You See? Alpha Channel No-Box Attacks on Video Object Detection
- VGD: Visual Geometry Gaussian Splatting for Feed-Forward Surround-view Driving Reconstruction
- Addressing the Depth-of-Field Constraint: A New Paradigm for High Resolution Multi-Focus Image Fusion
- Digitizing Paper ECGs at Scale: An Open-Source Algorithm for Clinical Research
- Decomposed Attention Fusion in MLLMs for Training-Free Video Reasoning Segmentation
- CBDiff: Conditional Bernoulli Diffusion Models for Image Forgery Localization
- Beyond sparse denoising in frames: minimax estimation with a scattering transform
- Pragmatic Heterogeneous Collaborative Perception via Generative Communication Mechanism
- Augmenting Moment Retrieval: Zero-Dependency Two-Stage Learning
- MedReason-R1: Learning to Reason for CT Diagnosis with Reinforcement Learning and Local Zoom
- Re-Activating Frozen Primitives for 3D Gaussian Splatting
- Curvilinear Structure-preserving Unpaired Cross-domain Medical Image Translation
- Explainable Face Presentation Attack Detection via Ensemble-CAM
- LyTimeT: Towards Robust and Interpretable State-Variable Discovery
- Adaptive Distribution-aware Quantization for Mixed-Precision Neural Networks
- OmniMotion-X: Versatile Multimodal Whole-Body Motion Generation
- Class-Aware Prototype Learning with Negative Contrast for Test-Time Adaptation of Vision-Language Models
- How to Evaluate Monocular Depth Estimation?
- Is This Tracker On? A Benchmark Protocol for Dynamic Tracking
- GRASPLAT: Enabling dexterous grasping through novel view synthesis
- GigaBrain-0: A World Model-Powered Vision-Language-Action Model
- Automated Morphological Analysis of Neurons in Fluorescence Microscopy Using YOLOv8
- LBL: Logarithmic Barrier Loss Function for One-class Classification
- LookUp3D: Data-Driven 3D Scanning
- Brain3D: Generating 3D Objects from fMRI
- ComDrive: Comfort-Oriented End-to-End Autonomous Driving
- Adversarial Attacks on LiDAR-Based Tracking Across Road Users: Robustness Evaluation and Target-Aware Black-Box Method
- Learning Differential Pyramid Representation for Tone Mapping
- VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models
- ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding
- FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views
- Video-R1: Reinforcing Video Reasoning in MLLMs
- MMLA: Multi-Environment, Multi-Species, Low-Altitude Drone Dataset
- 3D Visual Illusion Depth Estimation
- Spiking Neural Networks Need High Frequency Information
- See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction
- Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition
- Doctor Approved: Generating Medically Accurate Skin Disease Images through AI-Expert Feedback
- Towards foundational LiDAR world models with efficient latent flow matching
- Latent Diffusion Models with Masked AutoEncoders
- Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation
- Breaking the Discretization Barrier of Continuous Physics Simulation Learning
- Discretized Gaussian Representation for Tomographic Reconstruction
- Vectorization of Persistence Diagrams for Topological Data Analysis in R and Python Using TDAvec Package
- When Do Transformers Learn Heuristics for Graph Connectivity?
- CONFEX: Uncertainty-Aware Counterfactual Explanations with Conformal Guarantees
- The Tail Tells All: Estimating Model-Level Membership Inference Vulnerability Without Reference Models
- GaLLoP: Gradient-based Sparse Learning on Low-Magnitude Parameters
- Environment Inference for Learning Generalizable Dynamical System
- Blackbox Model Provenance via Palimpsestic Membership Inference
- Transformers are almost optimal metalearners for linear classification
- The Feasibility of Training Sovereign Language Models in the Global South: A Study of Brazil and Mexico
- CityAQVis: Integrated ML-Visualization Sandbox Tool for Pollutant Estimation in Urban Regions Using Multi-Source Data (Software Article)
- When Models Can't Follow: Testing Instruction Adherence Across 256 LLMs
- Foundation Models for Discovery and Exploration in Chemical Space
- Evaluating LLM Story Generation through Large-scale Network Analysis of Social Structures
- Impartial Selection with Predictions
- Calibrated Principal Component Regression
- Learning noisy tissue dynamics across time scales
- Signature Kernel Scoring Rule as Spatio-Temporal Diagnostic for Probabilistic Forecasting
- A Graph Signal Processing Framework for Hallucination Detection in Large Language Models
- Training-Free Spectral Fingerprints of Voice Processing in Transformers
- HAMLOCK: HArdware-Model LOgically Combined attacK
- Extreme Event Aware ($\eta$-) Learning
- Transfer Learning Beyond the Standard Model
- RLBoost: Harvesting Preemptible Resources for Cost-Efficient Reinforcement Learning on LLMs
- Synthesizability Prediction of Crystalline Structures with a Hierarchical Transformer and Uncertainty Quantification
- Hierarchical DLO Routing with Reinforcement Learning and In-Context Vision-language Models
- Magnetic field estimation using Gaussian process regression for interactive wireless power system design
- Topology of Currencies: Persistent Homology for FX Co-movements: A Comparative Clustering Study
- Transformers are Inherently Succinct
- Nonmonotone subgradient methods based on a local descent lemma
- Autobidding Arena: unified evaluation of the classical and RL-based autobidding algorithms
- MoE-Prism: Disentangling Monolithic Experts for Elastic MoE Services via Model-System Co-Designs
- AMAuT: A Flexible and Efficient Multiview Audio Transformer Framework Trained from Scratch
- On the hardness of RL with Lookahead
- Using Temperature Sampling to Effectively Train Robot Learning Policies on Imbalanced Datasets
- Square root Cox's survival analysis by the fittest linear and neural networks model
- A Derandomization Framework for Structure Discovery: Applications in Neural Networks and Beyond
- From See to Shield: ML-Assisted Fine-Grained Access Control for Visual Data
- Exploring "Many in Few" and "Few in Many" Properties in Long-Tailed, Highly-Imbalanced IC Defect Classification
- PCP-GAN: Property-Constrained Pore-scale image reconstruction via conditional Generative Adversarial Networks
- Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition
- Online Two-Stage Submodular Maximization
- Learning Upper Lower Value Envelopes to Shape Online RL: A Principled Approach
- Uncertainty evaluation of segmentation models for Earth observation
- Remarks on a recent preprint of Chernikov and Towsner
- Bridging Earth and Space: A Survey on HAPS for Non-Terrestrial Networks
- Zhyper: Factorized Hypernetworks for Conditioned LLM Fine-Tuning
- Exploring the Effect of DNN Depth on Adversarial Attacks in Network Intrusion Detection Systems
- Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing
- Hubble: a Model Suite to Advance the Study of LLM Memorization
- DR-VIDAL -- Doubly Robust Variational Information-theoretic Deep Adversarial Learning for Counterfactual Prediction and Treatment Effect Estimation on Real World Data
- Source-Free Domain Adaptation for SSVEP-based Brain-Computer Interfaces
- AtomSurf: Surface Representation for Learning on Protein Structures
- Phase-driven Domain Generalizable Learning for Nonstationary Time Series
- Survey of Graph Neural Network for Internet of Things and NextG Networks
- Bootstrap Sampling Rate Greater than 1.0 May Improve Random Forest Performance
- Learning to Learn with Contrastive Meta-Objective
- AdaptGrad: Adaptive Sampling to Reduce Noise
- Deep Linear Probe Generators for Weight Space Learning
- LoRA vs Full Fine-tuning: An Illusion of Equivalence
- DNN Modularization via Activation-Driven Training
- by Using Less: Distributed Learning with Energy-Constrained Devices
- Joint Hierarchical Representation Learning of Samples and Features via Informed Tree-Wasserstein Distance
- Finite Sample Identification of Partially Observed Bilinear Dynamical Systems
- A recursive Bayesian neural network for constitutive modeling of sands under monotonic and cyclic loading
- Learning Reward Machines from Partially Observed Policies
- Training-Free Constrained Generation With Stable Diffusion Models
- Optimizing Asynchronous Federated Learning: A Delicate Trade-Off Between Model-Parameter Staleness and Update Frequency
- Learning to Add, Multiply, and Execute Algorithmic Instructions Exactly with Neural Networks
- Long-term Causal Inference via Modeling Sequential Latent Confounding
- Using (Not-so) Large Language Models to Generate Simulation Models in a Formal DSL: A Study on Reaction Networks
- Learning Spatially Adaptive $\ell_1$-Norms Weights for Convolutional Synthesis Regularization
- Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?
- Customizing Spider Silk: Generative Models with Mechanical Property Conditioning for Protein Engineering
- TunnElQNN: A Hybrid Quantum-classical Neural Network for Efficient Learning
- CATransformers: Carbon Aware Transformers Through Joint Model-Hardware Optimization
- InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models
- Discrete Neural Flow Samplers with Locally Equivariant Transformer
- Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking
- AmorLIP: Efficient Language-Image Pretraining via Amortization
- Diffusion-Based Hierarchical Graph Neural Networks for Simulating Nonlinear Solid Mechanics
- SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks
- Heavy-Ball Momentum Method in Continuous Time and Discretization Error Analysis
- Generating Directed Graphs with Dual Attention and Asymmetric Encoding
- Unfolding Generative Flows with Koopman Operators: Fast and Interpretable Sampling
- Online Conformal Prediction with Efficiency Guarantees
- Semi-off-Policy Reinforcement Learning for Vision-Language Slow-Thinking Reasoning
- What Expressivity Theory Misses: Message Passing Complexity for GNNs
- Rebalancing with Calibrated Sub-classes (RCS): A Statistical Fusion-based Framework for Robust Imbalanced Classification across Modalities
- LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits
- VisDiff: SDF-Guided Polygon Generation for Visibility Reconstruction and Recognition
- Stable Matching with Ties: Approximation Ratios and Learning
- The Streaming Batch Model for Efficient and Fault-Tolerant Heterogeneous Execution
- Advancing Carbon Capture using AI: Design of permeable membrane and estimation of parameters for Carbon Capture using linear regression and membrane-based equations
- MsEdF: A Multi-stream Encoder-decoder Framework for Remote Sensing Image Captioning
- Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits
- Fast, Modular, and Differentiable Framework for Machine Learning-Enhanced Molecular Simulations
- A Comprehensive Benchmark for RNA 3D Structure-Function Modeling
- Smoothed Distance Kernels for MMDs and Applications in Wasserstein Gradient Flows
- Sub-optimality of the Separation Principle for Quadratic Control from Bilinear Observations
- Rank-One Modified Value Iteration
- Non-Stationary Lipschitz Bandits
- Nearly-Linear Time Private Hypothesis Selection with the Optimal Approximation Factor
- QiMeng-MuPa: Mutual-Supervised Learning for Sequential-to-Parallel Code Translation
- QPPG: Quantum-Preconditioned Policy Gradient for Link Adaptation in Rayleigh Fading Channels
- Stable Minima of ReLU Neural Networks Suffer from the Curse of Dimensionality: The Neural Shattering Phenomenon
- Trajectory learning for ensemble forecasts via the continuous ranked probability score: a Lorenz '96 case study
- Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks?
- Transformer-Based Low-Resource Language Translation: A Study on Standard Bengali to Sylheti
- Integrating Transparent Models, LLMs, and Practitioner-in-the-Loop: A Case of Nonprofit Program Evaluation
- Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning
- Semantic World Models
- Hummer: Towards Limited Competitive Preference Dataset
- Reasoning Models Better Express Their Confidence
- Follow the STARs: Dynamic $\omega$-Regular Shielding of Learned Policies
- ROTATE: Regret-driven Open-ended Training for Ad Hoc Teamwork
- Measuring Data Science Automation: A Survey of Evaluation Tools for AI Assistants and Agents
- The Endless Tuning. An Artificial Intelligence Design To Avoid Human Replacement and Trace Back Responsibilities
- IM-Chat: A Multi-agent LLM Framework Integrating Tool-Calling and Diffusion Modeling for Knowledge Transfer in Injection Molding Industry
- Data Efficient Adaptation in Large Language Models via Continuous Low-Rank Fine-Tuning
- FP-IRL: Fokker-Planck Inverse Reinforcement Learning -- A Physics-Constrained Approach to Markov Decision Processes
- Embedding in Recommender Systems: A Survey
- Towards Context-Aware Domain Generalization: Understanding the Benefits and Limits of Marginal Transfer Learning
- LICO: Large Language Models for In-Context Molecular Optimization
- Estimating Long-term Heterogeneous Dose-response Curve: Generalization Bound Leveraging Optimal Transport Weights
- PARCO: Parallel AutoRegressive Models for Multi-Agent Combinatorial Optimization
- Unveiling Transformer Perception by Exploring Input Manifolds
- Learning Linear Attention in Polynomial Time
- Model-based Large Language Model Customization as Service
- Benchmarking Large Language Models with Integer Sequence Generation Tasks
- Fast MRI for All: Bridging Access Gaps by Training without Raw Data
- Open-World Drone Active Tracking with Goal-Centered Rewards
- Explainable fault and severity classification for rolling element bearings using Kolmogorov-Arnold networks
- Graph Representation Learning with Diffusion Generative Models
- ModServe: Modality- and Stage-Aware Resource Disaggregation for Scalable Multimodal Model Serving
- An Efficient Local Search Approach for Polarized Community Discovery in Signed Networks
- Probing Perceptual Constancy in Large Vision-Language Models
- Towards Enhanced Image Generation Via Multi-modal Chain of Thought in Unified Generative Models
- FairGen: Controlling Sensitive Attributes for Fair Generations in Diffusion Models via Adaptive Latent Guidance
- Merge then Realign: Simple and Effective Modality-Incremental Continual Learning for Multimodal LLMs
- EgoBlind: Towards Egocentric Visual Assistance for the Blind
- PTFA: An LLM-based Agent that Facilitates Online Consensus Building through Parallel Thinking
- GeoBenchX: Benchmarking LLMs in Agent Solving Multistep Geospatial Tasks
- NAACL2025 Tutorial: Adaptation of Large Language Models
- Merging Embedded Topics with Optimal Transport for Online Topic Modeling on Data Streams
- Quantum Natural Language Processing: A Comprehensive Review of Models, Methods, and Applications
- LongCodeBench: Evaluating Coding LLMs at 1M Context Windows
- Memorization-Compression Cycles Improve Generalization
- A Multi-Task Foundation Model for Wireless Channel Representation Using Contrastive and Masked Autoencoder Learning
- CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block Prediction and Controllable Generation
- Improving planning and MBRL with temporally-extended actions
- An Analysis of Concept Bottleneck Models: Measuring, Understanding, and Mitigating the Impact of Noisy Annotations
- Evaluating NLP Embedding Models for Handling Science-Specific Symbolic Expressions in Student Texts
- High-order Equivariant Flow Matching for Density Functional Theory Hamiltonian Prediction
- MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
- Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors
- QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training
- Horizon Reduction Makes RL Scalable
- TimeWak: Temporal Chained-Hashing Watermark for Time Series Data
- GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models
- Flexible-length Text Infilling for Discrete Diffusion Models
- One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution
- Improved Exploration in GFlownets via Enhanced Epistemic Neural Networks
- With Limited Data for Multimodal Alignment, Let the STRUCTURE Guide You
- Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search
- Understanding Reasoning in Thinking Language Models via Steering Vectors
- Pay Attention to Small Weights
- PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs
- Where are we with calibration under dataset shift in image classification?
- Position: Many generalization measures for deep learning are fragile
- Towards Universal Solvers: Using PGD Attack in Active Learning to Increase Generalizability of Neural Operators as Knowledge Distillation from Numerical PDE Solvers
- An Encode-then-Decompose Approach to Unsupervised Time Series Anomaly Detection on Contaminated Training Data--Extended Version
- Category learning in deep neural networks: Information content and geometry of internal representations
- Empowering Decision Trees via Shape Function Branching
- POLAR: Policy-based Layerwise Reinforcement Learning Method for Stealthy Backdoor Attacks in Federated Learning
- Weight Decay may matter more than muP for Learning Rate Transfer in Practice
- MetaCluster: Enabling Deep Compression of Kolmogorov-Arnold Network
- Learning Peer Influence Probabilities with Linear Contextual Bandits
- Subliminal Corruption: Mechanisms, Thresholds, and Interpretability
- Feature Space Adaptation for Robust Model Fine-Tuning
- Instance-Dependent Regret Bounds for Nonstochastic Linear Partial Monitoring
- Preliminary Use of Vision Language Model Driven Extraction of Mouse Behavior Towards Understanding Fear Expression
- Natural Gradient VI: Guarantees for Non-Conjugate Models
- A Communication-Efficient Decentralized Actor-Critic Algorithm
- Enhancing Graph Neural Networks: A Mutual Learning Approach
- Controllable Machine Unlearning via Gradient Pivoting
- Brain-Inspired Perspective on Configurations: Unsupervised Similarity and Early Cognition
- Understanding the Implicit Biases of Design Choices for Time Series Foundation Models
- Interpret Policies in Deep Reinforcement Learning using SILVER with RL-Guided Labeling: A Model-level Approach to High-dimensional and Multi-action Environments
- Mixing Configurations for Downstream Prediction
- Data Efficient Any Transformer-to-Mamba Distillation via Attention Bridge
- Knowledge Distillation of Uncertainty using Deep Latent Factor Model
- QiMeng-SALV: Signal-Aware Learning for Verilog Code Generation
- Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall
- FrogDeepSDM: Improving Frog Counting and Occurrence Prediction Using Multimodal Data and Pseudo-Absence Imputation
- Calibration and Discrimination Optimization Using Clusters of Learned Representation
- A Markov Decision Process for Variable Selection in Branch & Bound
- Scalable LinUCB: Low-Rank Design Matrix Updates for Recommenders with Large Action Spaces
- ConvXformer: Differentially Private Hybrid ConvNeXt-Transformer for Inertial Navigation
- Optimization Benchmark for Diffusion Models on Dynamical Systems
- LMFD: Latent Monotonic Feature Discovery
- Learning Noise-Resilient and Transferable Graph-Text Alignment via Dynamic Quality Assessment
- CPSVD: Enhancing Large Language Model Compression via Column-Preserving Singular Value Decomposition
- ARA: Adaptive Rank Allocation for Efficient Large Language Model SVD Compression
- Iterative Training of Physics-Informed Neural Networks with Fourier-enhanced Features
- LLM Unlearning with LLM Beliefs
- Revisiting the Relation Between Robustness and Universality
- g-DPO: Scalable Preference Optimization for Protein Language Models
- ELUTQ: Efficient LUT-Aware Quantization for Deploying Large Language Models on Edge Devices
- Energy-Efficient and Dequantization-Free Q-LLMs: A Spiking Neural Network Approach to Salient Value Mitigation
- Teaming LLMs to Detect and Mitigate Hallucinations
- Bi-Level Decision-Focused Causal Learning for Large-Scale Marketing Optimization: Bridging Observational and Experimental Data
- The Confusing Instance Principle for Online Linear Quadratic Control
- A Climate-Aware Deep Learning Framework for Generalizable Epidemic Forecasting
- Learning and Simulating Building Evacuation Patterns for Enhanced Safety Design Using Generative Models
- Matrix-Free Least Squares Solvers: Values, Gradients, and What to Do With Them
- Latent Space Factorization in LoRA
- Overlap-weighted orthogonal meta-learner for treatment effect estimation over time
- Policy Learning with Abstention
- Fast Inference via Hierarchical Speculative Decoding
- SEMPO: Lightweight Foundation Models for Time Series Forecasting
- Statistical Inference for Linear Functionals of Online Least-squares SGD when $t \gtrsim d^{1+\delta}$
- BATIS: Bayesian Approaches for Targeted Improvement of Species Distribution Models
- A Graph Engine for Guitar Chord-Tone Soloing Education
- Explainable e-sports win prediction through Machine Learning classification in streaming
- RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models
- Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning
- Misalignment Bounty: Crowdsourcing AI Agent Misbehavior
- Beyond Reactivity: Measuring Proactive Problem Solving in LLM Agents
- Benchmarking World-Model Learning
- A Unified Formal Theory on the Logical Limits of Symbol Grounding
- What is Implementation Science; and Why It Matters for Bridging the Artificial Intelligence Innovation-to-Application Gap in Medical Imaging
- LLM Bazaar: A Service Design for Supporting Collaborative Learning with an LLM-Powered Multi-Party Collaboration Infrastructure
- Contextual Augmentation for Entity Linking using Large Language Models
- Small Language Models Offer Significant Potential for Science Community
- CodeCRDT: Observation-Driven Coordination for Multi-Agent LLM Code Generation
- CosmoCore Affective Dream-Replay Reinforcement Learning for Code Generation
- AI for Distributed Systems Design: Scalable Cloud Optimization Through Repeated LLMs Sampling And Simulators
- Evaluating LLMs for Career Guidance: Comparative Analysis of Computing Competency Recommendations Across Ten African Countries
- DuoLens: A Framework for Robust Detection of Machine-Generated Multilingual Text and Code
- 3D Optimization for AI Inference Scaling: Balancing Accuracy, Cost, and Latency
- Improving Topic Modeling of Social Media Short Texts with Rephrasing: A Case Study of COVID-19 Related Tweets
- Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection
- Large Connectome Model: An fMRI Foundation Model of Brain Connectomes Empowered by Brain-Environment Interaction in Multitask Learning Landscape
- Prospects for Using Artificial Intelligence to Understand Intrinsic Kinetics of Heterogeneous Catalytic Reactions
- ADPO: Anchored Direct Preference Optimization
- Context-aware Fairness Evaluation and Mitigation in LLMs
- MMAO-Bench: MultiModal All in One Benchmark Reveals Compositional Law between Uni-modal and Omni-modal in OmniModels
- Misinformation Detection using Large Language Models with Explainability
- Benchmarking On-Device Machine Learning on Apple Silicon with MLX
- Noise-corrected GRPO: From Noisy Rewards to Unbiased Gradients
- Application of Reduced-Order Models for Temporal Multiscale Representations in the Prediction of Dynamical Systems
- BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping
- A Justice Lens on Fairness and Ethics Courses in Computing Education: LLM-Assisted Multi-Perspective and Thematic Evaluation
- StutterZero and StutterFormer: End-to-End Speech Conversion for Stuttering Transcription and Correction
- NeuroAda: Activating Each Neuron's Potential for Parameter-Efficient Fine-Tuning
- ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge
- $\nabla$-SDF: Learning Euclidean Signed Distance Functions Online with Gradient-Augmented Octree Interpolation and Neural Residual
- Robust Driving QA through Metadata-Grounded Context and Task-Specific Prompts
- $\Delta$t-Mamba3D: A Time-Aware Spatio-Temporal State-Space Model for Breast Cancer Risk Prediction
- Plural Voices, Single Agent: Towards Inclusive AI in Multi-User Domestic Spaces
- Prior-informed optimization of treatment recommendation via bandit algorithms trained on large language model-processed historical records
- FlexiDataGen: An Adaptive LLM Framework for Dynamic Semantic Dataset Generation in Sensitive Domains
- CLiVR: Conversational Learning System in Virtual Reality with AI-Powered Patients
- "Over-the-Hood" AI Inclusivity Bugs and How 3 AI Product Teams Found and Fixed Them
- REPAIR Approach for Social-based City Reconstruction Planning in case of natural disasters
- PoSh: Using Scene Graphs To Guide LLMs-as-a-Judge For Detailed Image Descriptions
- Local Guidance for Configuration-Based Multi-Agent Pathfinding
- What Makes a Good Curriculum? Disentangling the Effects of Data Ordering on LLM Mathematical Reasoning
- That's Deprecated! Understanding, Detecting, and Steering Knowledge Conflicts in Language Models for Code Generation
- A Novel Approach to Breast Cancer Segmentation using U-Net Model with Attention Mechanisms and FedProx
- Steering Autoregressive Music Generation with Recursive Feature Machines
- A Cross-Environment and Cross-Embodiment Path Planning Framework via a Conditional Diffusion Model
- InvarGC: Invariant Granger Causality for Heterogeneous Interventional Time Series under Latent Confounding
- X-Ego: Acquiring Team-Level Tactical Situational Awareness via Cross-Egocentric Contrastive Video Representation Learning
- When Facts Change: Probing LLMs on Evolving Knowledge with evolveQA
- News-Aware Direct Reinforcement Trading for Financial Markets
- Imbalanced Gradients in RL Post-Training of Multi-Task LLMs
- Interpretable Question Answering with Knowledge Graphs
- PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning
- Rethinking Driving World Model as Synthetic Data Generator for Perception Tasks
- An Active Diffusion Neural Network for Graphs
- No Intelligence Without Statistics: The Invisible Backbone of Artificial Intelligence
- SPOT: Scalable Policy Optimization with Trees for Markov Decision Processes
- See, Think, Act: Online Shopper Behavior Simulation with VLM Agents
- FnRGNN: Distribution-aware Fairness in Graph Neural Network
- LAPRAD: LLM-Assisted PRotocol Attack Discovery
- Social World Model-Augmented Mechanism Design Policy Learning
- Enhancing Early Alzheimer Disease Detection through Big Data and Ensemble Few-Shot Learning
- Knowledge and Common Knowledge of Strategies
- Collaborative penetration testing suite for emerging generative AI algorithms
- Online Handwritten Signature Verification Based on Temporal-Spatial Graph Attention Transformer
- Enabling Reconfiguration-Communication Overlap for Collective Communication in Optical Networks
- Balancing Rewards in Text Summarization: Multi-Objective Reinforcement Learning via HyperVolume Optimization
- SORA-ATMAS: Adaptive Trust Management and Multi-LLM Aligned Governance for Future Smart Cities
- Seabed-Net: A multi-task network for joint bathymetry estimation and seabed classification from remote sensing imagery in shallow waters
- Metadata Extraction Leveraging Large Language Models
- Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
- To Use or to Refuse? Re-Centering Student Agency with Generative AI in Engineering Design Education
- Foundation Model Forecasts: Form and Function
- A New Type of Adversarial Examples
- Learning To Defer To A Population With Limited Demonstrations
- M3-SLU: Evaluating Speaker-Attributed Reasoning in Multimodal Large Language Models
- AgenticMath: Enhancing LLM Reasoning via Agentic-based Math Data Generation
- The Massive Legal Embedding Benchmark (MLEB)
- ColorAgent: Building A Robust, Personalized, and Interactive OS Agent
- ToMMeR -- Efficient Entity Mention Detection from Large Language Models
- EchoFake: A Replay-Aware Dataset for Practical Speech Deepfake Detection
- Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation
- FairNet: Dynamic Fairness Correction without Performance Loss via Contrastive Conditional LoRA
- Neural Variational Dropout Processes
- Universal Quantitative Abstraction: Categorical Duality and Logical Completeness for Probabilistic Systems
- HybridEP: Scaling Expert Parallelism to Cross-Datacenter Scenario via Hybrid Expert/Data Transmission
- A Concrete Roadmap towards Safety Cases based on Chain-of-Thought Monitoring
- Graph Unlearning Meets Influence-aware Negative Preference Optimization
- KnowMol: Advancing Molecular Large Language Models with Multi-Level Chemical Knowledge
- VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos
- Using Non-Expert Data to Robustify Imitation Learning via Offline Reinforcement Learning
- CARES: Context-Aware Resolution Selector for VLMs
- Modeling realistic human behavior using generative agents in a multimodal transport system: Software architecture and Application to Toulouse
- From Prototypes to Sparse ECG Explanations: SHAP-Driven Counterfactuals for Multivariate Time-Series Multi-class Classification
- Optimizing the Unknown: Black Box Bayesian Optimization with Energy-Based Model and Reinforcement Learning
- Insights into the Unknown: Federated Data Diversity Analysis on Molecular Data
- Demonstrating Real Advantage of Machine-Learning-Enhanced Monte Carlo for Combinatorial Optimization
- A Matter of Time: Revealing the Structure of Time in Vision-Language Models
- Multi-modal Co-learning for Earth Observation: Enhancing single-modality models via modality collaboration
- Detecting Latin in Historical Books with Large Language Models: A Multimodal Benchmark
- A Goal-Driven Survey on Root Cause Analysis
- XBench: A Comprehensive Benchmark for Visual-Language Explanations in Chest Radiography
- Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1
- Style Attack Disguise: When Fonts Become a Camouflage for Adversarial Intent
- From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction
- Unraveling Emotions with Pre-Trained Models
- Study of Training Dynamics for Memory-Constrained Fine-Tuning
- I Spy With My Model's Eye: Visual Search as a Behavioural Test for MLLMs
- Directive, Metacognitive or a Blend of Both? A Comparison of AI-Generated Feedback Types on Student Engagement, Confidence, and Outcomes
- Are Large Language Models Sensitive to the Motives Behind Communication?
- Serverless GPU Architecture for Enterprise HR Analytics: A Production-Scale BDaaS Implementation
- Toward Agentic Software Engineering Beyond Code: Framing Vision, Values, and Vocabulary
- Do Prompts Reshape Representations? An Empirical Study of Prompting Effects on Embeddings
- Enabling Granular Subgroup Level Model Evaluations by Generating Synthetic Medical Time Series
- Learning Affordances at Inference-Time for Vision-Language-Action Models
- A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation
- SmartSwitch: Advancing LLM Reasoning by Overcoming Underthinking via Promoting Deeper Thought Exploration
- AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
- On Controlled Change: Generative AI's Impact on Professional Authority in Journalism
- Test-time Verification via Optimal Transport: Coverage, ROC, & Sub-optimality
- Timely Clinical Diagnosis through Active Test Selection
- Rectifying Shortcut Behaviors in Preference-based Reward Learning
- The MUSE Benchmark: Probing Music Perception and Auditory Relational Reasoning in Audio LLMs
- A Multi-faceted Analysis of Cognitive Abilities: Evaluating Prompt Methods with Large Language Models on the CONSORT Checklist
- The Zero-Step Thinking: An Empirical Study of Mode Selection as Harder Early Exit in Reasoning Models
- WebGraphEval: Multi-Turn Trajectory Evaluation for Web Agents using Graph Representation
- ChatGPT Unveils Its Limits: Principles of Law Deliver Checkmate
- An Argumentative Explanation Framework for Generalized Reason Model with Inconsistent Precedents
- Learning to Make Friends: Coaching LLM Agents toward Emergent Social Ties
- Continual Knowledge Adaptation for Reinforcement Learning
- MSC-Bench: A Rigorous Benchmark for Multi-Server Tool Orchestration
- NeSyPr: Neurosymbolic Proceduralization For Efficient Embodied Reasoning
- DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning
- HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application
- AgentSense: LLMs Empower Generalizable and Explainable Web-Based Participatory Urban Sensing
Research Sources: 496 | Generated: 10/23/2025
