AI RESEARCH PAPERS & ACADEMIC SOURCES
- Large VLM-based Vision-Language-Action Models for Robotic Manipulation: A Survey
- Real-Time Per-Garment Virtual Try-On with Temporal Consistency for Loose-Fitting Garments
- Grid-Reg: Detector-Free Gridized Feature Learning and Matching for Large-Scale SAR-Optical Image Registration
- GS-TG: 3D Gaussian Splatting Accelerator with Tile Grouping for Reducing Redundant Sorting while Preserving Rasterization Efficiency
- HodgeFormer: Transformers for Learnable Operators on Triangular Meshes through Data-Driven Hodge Matrices
- Calibration Prediction Interval for Non-parametric Regression and Neural Networks
- Inference on covariance structure in high-dimensional multi-view data
- A Composite-Loss Graph Neural Network for the Multivariate Post-Processing of Ensemble Weather Forecasts
- A proximal augmented Lagrangian method for nonconvex optimization with equality and inequality constraints
- Convergence for adaptive resampling of random Fourier features
- Feedback-Enhanced Online Multiple Testing with Applications to Conformal Selection
- Markov Missing Graph: A Graphical Approach for Missing Data Imputation
- The Broader Landscape of Robustness in Algorithmic Statistics
- The case for and against fixed step-size: Stochastic approximation algorithms in optimization and machine learning
- Principled model selection for stochastic dynamics
- How many simulations do we need for simulation-based inference in cosmology?
- AstroClearNet: Deep image prior for multi-frame astronomical image restoration
- Dexonomy: Synthesizing All Dexterous Grasp Types in a Grasp Taxonomy
- Point Cloud Recombination: Systematic Real Data Augmentation Using Robotic Targets for LiDAR Perception Validation
- Uncertainty-aware Test-Time Training (UT$^3$) for Efficient On-the-fly Domain Adaptive Dense Regression
- Prompt-Guided Patch UNet-VAE with Adversarial Supervision for Adrenal Gland Segmentation in Computed Tomography Medical Images
- Efficient Active Training for Deep LiDAR Odometry
- Generalist versus Specialist Vision Foundation Models for Ocular Disease and Oculomics
- EclipseTouch: Touch Segmentation on Ad Hoc Surfaces using Worn Infrared Shadow Casting
- SmartPoser: Arm Pose Estimation with a Smartphone and Smartwatch Using UWB and IMU Data
- Repurposing SAM for User-Defined Semantics Aware Segmentation
- Bridging the Domain Gap for Flight-Ready Spaceborne Vision
- Refinement of Monocular Depth Maps via Multi-View Differentiable Rendering
- Learning a Neural Association Network for Self-supervised Multi-Object Tracking
- Towards a Universal Synthetic Video Detector: From Face or Background Manipulations to Fully AI-Generated Content
- ViDDAR: Vision Language Model-Based Task-Detrimental Content Detection for Augmented Reality
- Comparing Next-Day Wildfire Predictability of MODIS and VIIRS Satellite Data
- Deeply Supervised Flow-Based Generative Models
- On the representation of stack operators by mathematical morphology
- Mitigating Hallucination in Large Vision-Language Models through Aligning Attention Distribution to Information Flow
- Sequential keypoint density estimator: an overlooked baseline of skeleton-based video anomaly detection
- Enhancing Diffusion Model Stability for Image Restoration via Gradient Management
- Faster and Better: Reinforced Collaborative Distillation and Self-Learning for Infrared-Visible Image Fusion
- Beyond Feature Mapping GAP: Integrating Real HDRTV Priors for Superior SDRTV-to-HDRTV Conversion
- Embedding Similarity Guided License Plate Super Resolution
- SynBT: High-quality Tumor Synthesis for Breast Tumor Segmentation by 3D Diffusion Model
- PointAD+: Learning Hierarchical Representations for Zero-shot 3D Anomaly Detection
- Empowering Lightweight MLLMs with Reasoning via Long CoT SFT
- InfraDiffusion: zero-shot depth map restoration with diffusion models and prompted segmentation from sparse infrastructure point clouds
- Transformer-Guided Content-Adaptive Graph Learning for Hyperspectral Unmixing
- Human Preference-Aligned Concept Customization Benchmark via Decomposed Evaluation
- Time-Scaling State-Space Models for Dense Video Captioning
- Decoding Visual Neural Representations by Multimodal with Dynamic Balancing
- Joint Training of Image Generator and Detector for Road Defect Detection
- Parameter-Efficient Adaptation of mPLUG-Owl2 via Pixel-Level Visual Prompts for NR-IQA
- OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation
- DeepSea MOT: A benchmark dataset for multi-object tracking on deep-sea video
- A comprehensive Persian offline handwritten database for investigating the effects of heritability and family relationships on handwriting
- Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play?
- Application of Quantum Convolutional Neural Networks for MRI-Based Brain Tumor Detection and Classification
- Pan-Cancer mitotic figures detection and domain generalization: MIDOG 2025 Challenge
- Sequential Hard Mining: a data-centric approach for Mitosis Detection
- ConvNeXt with Histopathology-Specific Augmentations for Mitotic Figure Classification
- Solutions for Mitotic Figure Detection and Atypical Classification in MIDOG 2025
- RF-DETR for Robust Mitotic Figure Detection: A MIDOG 2025 Track 1 Approach
- Team Westwood Solution for MIDOG 2025 Challenge
- Foundation Model-Driven Classification of Atypical Mitotic Figures with Domain-Aware Training Strategies
- Challenges and Lessons from MIDOG 2025: A Two-Stage Approach to Domain-Robust Mitotic Figure Detection
- Ensemble YOLO Framework for Multi-Domain Mitotic Figure Detection in Histopathology Images
- DUViN: Diffusion-Based Underwater Visual Navigation via Knowledge-Transferred Depth Features
- 2nd Place Solution for CVPR2024 E2E Challenge: End-to-End Autonomous Driving Using Vision Language Model
- PixFoundation 2.0: Do Video Multi-Modal LLMs Use Motion in Visual Grounding?
- PRECISE-AS: Personalized Reinforcement Learning for Efficient Point-of-Care Echocardiography in Aortic Stenosis Diagnosis
- LiGuard: A Streamlined Open-Source Framework for Rapid & Interactive Lidar Research
- PercepTwin: Modeling High-Fidelity Digital Twins for Sim2Real LiDAR-based Perception for Intelligent Transportation Systems
- High-Fidelity Digital Twins for Bridging the Sim2Real Gap in LiDAR-Based ITS Perception
- STAR: A Fast and Robust Rigid Registration Framework for Serial Histopathological Images
- Resilient Multimodal Industrial Surface Defect Detection with Uncertain Sensors Availability
- EdgeAttNet: Towards Barb-Aware Filament Segmentation
- VQualA 2025 Challenge on Engagement Prediction for Short Videos: Methods and Results
- InstaDA: Augmenting Instance Segmentation Data with Dual-Agent System
- SPENet: Self-guided Prototype Enhancement Network for Few-shot Medical Image Segmentation
- SOPSeg: Prompt-based Small Object Instance Segmentation in Remote Sensing Imagery
- Enhancing Robustness in Post-Processing Watermarking: An Ensemble Attack Network Using CNNs and Transformers
- Background Matters Too: A Language-Enhanced Adversarial Framework for Person Re-Identification
- DCDB: Dynamic Conditional Dual Diffusion Bridge for Ill-posed Multi-Tasks
- Isolated Bangla Handwritten Character Classification using Transfer Learning
- High Cursive Complex Character Recognition using GAN External Classifier
- Backdoor Poisoning Attack Against Face Spoofing Attack Detection Methods
- Towards Realistic Hand-Object Interaction with Gravity-Field Based Diffusion Bridge
- Preserving instance continuity and length in segmentation through connectivity-aware loss computation
- PPORLD-EDNetLDCT: A Proximal Policy Optimization-Based Reinforcement Learning Framework for Adaptive Low-Dose CT Denoising
- AIVA: An AI-based Virtual Companion for Emotion-aware Interaction
- RTGMFF: Enhanced fMRI-based Brain Disorder Diagnosis via ROI-driven Text Generation and Multimodal Feature Fusion
- PI3DETR: Parametric Instance Detection of 3D Point Cloud Edges with a Geometry-Aware 3DETR
- LatPhon: Lightweight Multilingual G2P for Romance Languages and English
- AgenTracer: Who Is Inducing Failure in the LLM Agentic Systems?
- LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
- Learning Mechanism Underlying NLP Pre-Training and Fine-Tuning
- Curse of Knowledge: When Complex Evaluation Context Benefits yet Biases LLM Judges
- Design and Optimization of Reinforcement Learning-Based Agents in Text-Based Games
- Speech DF Arena: A Leaderboard for Speech DeepFake Detection Models
- Identifiability and minimality bounds of quantum and post-quantum models of classical stochastic processes
- Mitigating Multimodal Hallucinations via Gradient-based Self-Reflection
- SESGO: Spanish Evaluation of Stereotypical Generative Outputs
- Attacking Misinformation Detection Using Adversarial Examples Generated by Language Models
- FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching
- Texture or Semantics? Vision-Language Models Get Lost in Font Recognition
- Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification
- Demystifying optimized prompts in language models
- QualBench: Benchmarking Chinese LLMs with Localized Professional Qualifications for Vertical Domain Evaluation
- On the class of coding optimality of human languages and the origins of Zipf's law
- RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling
- That is Unacceptable: the Moral Foundations of Canceling
- You Sound a Little Tense: L2 Tailored Clear TTS Using Durational Vowel Properties
- Transformer-Based Power Optimization for Max-Min Fairness in Cell-Free Massive MIMO
- LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization
- GAEA: A Geolocation Aware Conversational Assistant
- Anchors no more: Using peculiar velocities to constrain $H_0$ and the primordial Universe without calibrators
- Model-based learning for joint channel estimation and hybrid MIMO precoding
- Insertion Language Models: Sequence Generation with Arbitrary-Position Insertions
- Statistical Test for Saliency Maps of Graph Neural Networks via Selective Inference
- DeepTopoNet: A Framework for Subglacial Topography Estimation on the Greenland Ice Sheets
- SSVD: Structured SVD for Parameter-Efficient Fine-Tuning and Benchmarking under Domain Shift in ASR
- IDEAlign: Comparing Large Language Models to Human Experts in Open-ended Interpretive Annotations
- Advancing Minority Stress Detection with Transformers: Insights from the Social Media Datasets
- English Pronunciation Evaluation without Complex Joint Training: LoRA Fine-tuned Speech Multimodal LLM
- Decoding the Rule Book: Extracting Hidden Moderation Criteria from Reddit Communities
- ProMQA-Assembly: Multimodal Procedural QA Dataset on Assembly
- DiaCBT: A Long-Periodic Dialogue Corpus Guided by Cognitive Conceptualization Diagram for CBT-based Psychological Counseling
- Training LLMs to be Better Text Embedders through Bidirectional Reconstruction
- Structure-Learnable Adapter Fine-Tuning for Parameter-Efficient Large Language Models
- A Long Short-Term Memory (LSTM) Model for Business Sentiment Analysis Based on Recurrent Neural Network
- Measuring Scalar Constructs in Social Science with LLMs
- An experimental and computational study of an Estonian single-person word naming
- Expanding the WMT24++ Benchmark with Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader
- SinhalaMMLU: A Comprehensive Benchmark for Evaluating Multitask Language Understanding in Sinhala
- Comparison of End-to-end Speech Assessment Models for the NOCASA 2025 Challenge
- Bayesian Active Learning for Multi-Criteria Comparative Judgement in Educational Assessment
- FlowKac: An Efficient Neural Fokker-Planck solver using Temporal Normalizing Flows and the Feynman-Kac Formula
- A State Alignment-Centric Approach to Federated System Identification: The FedAlign Framework
- MPCritic: A plug-and-play MPC architecture for reinforcement learning
- Improving Bayesian Optimization for Portfolio Management with an Adaptive Scheduling
- Explaining Anomalies with Tensor Networks
- Unsupervised Learning of Local Updates for Maximum Independent Set in Dynamic Graphs
- Learning and Interpreting Gravitational-Wave Features from CNNs with a Random Forest Approach
- CRISP-NAM: Competing Risks Interpretable Survival Prediction with Neural Additive Models
- RNE: plug-and-play diffusion inference-time control and energy-based training
- Non-Asymptotic Stability and Consistency Guarantees for Physics-Informed Neural Networks via Coercive Operator Analysis
- Neural Canonical Polyadic Factorization for Traffic Analysis
- Convergence of regularized agent-state-based Q-learning in POMDPs
- Hierarchical Multi-Interest Co-Network For Coarse-Grained Ranking
- An Exponentially Converging Particle Method for the Mixed Nash Equilibrium of Continuous Games
- Single-seed generation of Brownian paths and integrals for adaptive and high order SDE solvers
- Learn and Unlearn: Addressing Misinformation in Multilingual LLMs
- A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators
- A Lorentz-Equivariant Transformer for All of the LHC
- LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting
- Dial-In LLM: Human-Aligned LLM-in-the-loop Intent Clustering for Customer Service Dialogues
- Quantum Data Encoding and Variational Algorithms: A Framework for Hybrid Quantum Classical Machine Learning
- Learning sparse generalized linear models with binary outcomes via iterative hard thresholding
- Dynamical Decoupling of Generalization and Overfitting in Large Two-Layer Networks
- Beyond Words: Interjection Classification for Improved Human-Computer Interaction
- Enhancing Interpretability and Effectiveness in Recommendation with Numerical Features via Learning to Contrast the Counterfactual samples
- The Role of Embodiment in Intuitive Whole-Body Teleoperation for Mobile Manipulation
- NeurStore: Efficient In-database Deep Learning Model Management System
- Machine Learning-Driven Anomaly Detection for 5G O-RAN Performance Metrics
- Improving Perceptual Audio Aesthetic Assessment via Triplet Loss and Self-Supervised Embeddings
- Bayesian Additive Regression Trees for functional ANOVA model
- Temporal social network modeling of mobile connectivity data with graph neural networks
- Generative Auto-Bidding in Large-Scale Competitive Auctions via Diffusion Completer-Aligner
- An Effective Strategy for Modeling Score Ordinality and Non-uniform Intervals in Automated Speaking Assessment
- Understanding and Improving the Shampoo Optimizer via Kullback-Leibler Minimization
- CloudFormer: An Attention-based Performance Prediction for Public Clouds with Unknown Workload
- Scalable and Loosely-Coupled Multimodal Deep Learning for Breast Cancer Subtyping
- Non-Linear Counterfactual Aggregate Optimization
- Off-Policy Learning in Large Action Spaces: Optimization Matters More Than Estimation
- From Image Denoisers to Regularizing Imaging Inverse Problems: An Overview
- Learning AC Power Flow Solutions using a Data-Dependent Variational Quantum Circuit
- Cost-Driven Representation Learning for Linear Quadratic Gaussian Control: Part I
- Correcting Auto-Differentiation in Neural-ODE Training
- Deep Variational Multivariate Information Bottleneck -- A Framework for Variational Losses
- INCPrompt: Task-Aware incremental Prompting for Rehearsal-Free Class-incremental Learning
- The Nah Bandit: Modeling User Non-compliance in Recommendation Systems
- FedGraph: A Research Library and Benchmark for Federated Graph Learning
- Recursive Gaussian Process State Space Model
- Pareto-frontier Entropy Search with Variational Lower Bound Maximization
- Structure-preserving contrastive learning for spatial time series
- Use ADAS Data to Predict Near-Miss Events: A Group-Based Zero-Inflated Poisson Approach
- Gaussian process surrogate with physical law-corrected prior for multi-coupled PDEs defined on irregular geometry
- Towards Performatively Stable Equilibria in Decision-Dependent Games for Arbitrary Data Distribution Maps
- Optimizing Prognostic Biomarker Discovery in Pancreatic Cancer Through Hybrid Ensemble Feature Selection and Multi-Omics Data
- Fast kernel methods: Sobolev, physics-informed, and additive models
- Quantifying Clinician Bias and its Effects on Schizophrenia Diagnosis in the Emergency Department of the Mount Sinai Health System
- Quantifying the Social Costs of Power Outages and Restoration Disparities Across Four U.S. Hurricanes
- Toward a robust lesion detection model in breast DCE-MRI: adapting foundation models to high-risk women
- Multi-Embodiment Locomotion at Scale with extreme Embodiment Randomization
- Fast and Accurate SVD-Type Updating in Streaming Data
- Multi-Scale Deep Learning for Colon Histopathology: A Hybrid Graph-Transformer Approach
- Managing Correlations in Data and Privacy Demand
- A Data-Driven RetinaNet Model for Small Object Detection in Aerial Images
- Faster Gradient Methods for Highly-smooth Stochastic Bilevel Optimization
- RankGraph: Unified Heterogeneous Graph Learning for Cross-Domain Recommendation
- Scale-Adaptive Generative Flows for Multiscale Scientific Data
- Mitigating Data Imbalance in Automated Speaking Assessment
- Mycroft: Tracing Dependencies in Collective Communication Towards Reliable LLM Training
- SurGBSA: Learning Representations From Molecular Dynamics Simulations
- TRELLIS-Enhanced Surface Features for Comprehensive Intracranial Aneurysm Analysis
- RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation
- Temporally-Aware Diffusion Model for Brain Progression Modelling with Bidirectional Temporal Regularisation
- Count2Density: Crowd Density Estimation without Location-level Annotations
- Deep Self-knowledge Distillation: A hierarchical supervised learning for coronary artery segmentation
- Multimodal learning of melt pool dynamics in laser powder bed fusion
- Population-aware Online Mirror Descent for Mean-Field Games with Common Noise by Deep Reinforcement Learning
- Discrete Functional Geometry of ReLU Networks via ReLU Transition Graphs
- LSAM: Asynchronous Distributed Training with Landscape-Smoothed Sharpness-Aware Minimization
- Systematic Evaluation of Attribution Methods: Eliminating Threshold Bias and Revealing Method-Dependent Performance Patterns
- Tabular foundation model for GEOAI benchmark problems BM/AirportSoilProperties/2/2025
- Exploring the Design Space of Fair Tree Learning Algorithms
- TeRA: Vector-based Random Tensor Network for High-Rank Adaptation of Large Language Models
- Unsupervised Learning based Element Resource Allocation for Reconfigurable Intelligent Surfaces in mmWave Network
- TopoMap: A Feature-based Semantic Discriminator of the Topographical Regions in the Test Input Space
- Meta-Imputation Balanced (MIB): An Ensemble Approach for Handling Missing Data in Biomedical Machine Learning
- EvolveSignal: A Large Language Model Powered Coding Agent for Discovering Traffic Signal Control Algorithms
- Some patterns of sleep quality and Daylight Saving Time across countries: a predictive and exploratory analysis
- The distribution of calibrated likelihood functions on the probability-likelihood Aitchison simplex
- Cluster and then Embed: A Modular Approach for Visualization
- Exploring a Graph-based Approach to Offline Reinforcement Learning for Sepsis Treatment
- Initialization Schemes for Kolmogorov-Arnold Networks: An Empirical Study
- LINKER: Learning Interactions Between Functional Groups and Residues With Chemical Knowledge-Enhanced Reasoning and Explainability
- Graph neural networks for learning liquid simulations in dynamic scenes containing kinematic objects
- Geometric Foundations of Tuning without Forgetting in Neural ODEs
- Invariant Features for Global Crop Type Classification
- Can LLMs Lie? Investigation beyond Hallucination
- EEG-MSAF: An Interpretable Microstate Framework uncovers Default-Mode Decoherence in Early Neurodegeneration
- Gaussian Process Regression of Steering Vectors With Physics-Aware Deep Composite Kernels for Augmented Listening
- Lessons Learned from Deploying Adaptive Machine Learning Agents with Limited Data for Real-time Cell Culture Process Monitoring
- Locus: Agentic Predicate Synthesis for Directed Fuzzing
- Preference Robustness for DPO with Applications to Public Health
- Imitate Optimal Policy: Prevail and Induce Action Collapse in Policy Gradient
- LExI: Layer-Adaptive Active Experts for Efficient MoE Model Inference
- Structured Basis Function Networks: Loss-Centric Multi-Hypothesis Ensembles with Controllable Diversity
- Learning Laplacian Eigenvectors: a Pre-training Method for Graph Neural Networks
- Challenges in Understanding Modality Conflict in Vision-Language Models
- Unlearning That Lasts: Utility-Preserving, Robust, and Almost Irreversible Forgetting in LLMs
- Towards Reasoning for PDE Foundation Models: A Reward-Model-Driven Inference-Time-Scaling Algorithm
- Power Grid Control with Graph-Based Distributed Reinforcement Learning
- Improving Generative Methods for Causal Evaluation via Simulation-Based Inference
- Event Detection and Classification for Long Range Sensing of Elephants Using Seismic Signal
- A Narrative Review of Clinical Decision Support Systems in Offloading Footwear for Diabetes-Related Foot Ulcers
- PDRL: Post-hoc Descriptor-based Residual Learning for Uncertainty-Aware Machine Learning Potentials
- Delayed Momentum Aggregation: Communication-efficient Byzantine-robust Federated Learning with Partial Participation
- AdaGrad Meets Muon: Adaptive Stepsizes for Orthogonal Updates
- On Developers' Self-Declaration of AI-Generated Code: An Analysis of Practices
- LawFlow: Collecting and Simulating Lawyers' Thought Processes on Business Formation Case Studies
- Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer
- Group-in-Group Policy Optimization for LLM Agent Training
- When a Reinforcement Learning Agent Encounters Unknown Unknowns
- NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
- FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation
- Securing AI Agents with Information-Flow Control
- A theoretical framework for self-supervised contrastive learning for continuous dependent data
- LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis
- Revisiting Clustering of Neural Bandits: Selective Reinitialization for Mitigating Loss of Plasticity
- Multimodal Medical Image Binding via Shared Text Embeddings
- HERCULES: Hierarchical Embedding-based Recursive Clustering Using LLMs for Efficient Summarization
- IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech
- GroundingDINO-US-SAM: Text-Prompted Multi-Organ Segmentation in Ultrasound with LoRA-Tuned Vision-Language Models
- LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling
- Rethinking Data Protection in the (Generative) Artificial Intelligence Era
- Covering a Few Submodular Constraints and Applications
- Efficiently Editing Mixture-of-Experts Models with Compressed Experts
- Impoola: The Power of Average Pooling for Image-Based Deep Reinforcement Learning
- TruthLens: Visual Grounding for Universal DeepFake Reasoning
- HDVIO2.0: Wind and Disturbance Estimation with Hybrid Dynamics VIO
- Towards Cardiac MRI Foundation Models: Comprehensive Visual-Tabular Representations for Whole-Heart Assessment and Beyond
- WildFireCan-MMD: A Multimodal Dataset for Classification of User-Generated Content During Wildfires in Canada
- Explainable Machine Learning-Based Security and Privacy Protection Framework for Internet of Medical Things Systems
- MF-OML: Online Mean-Field Reinforcement Learning with Occupation Measures for Large Population Games
- SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention
- Banishing LLM Hallucinations Requires Rethinking Generalization
- Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving
- Towards Agentic AI on Particle Accelerators
- Aligning Machine and Human Visual Representations across Abstraction Levels
- Domain Consistency Representation Learning for Lifelong Person Re-Identification
- TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
- Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios
- Soft-TransFormers for Continual Learning
- GalaxAlign: Mimicking Citizen Scientists' Multimodal Guidance for Galaxy Morphology Analysis
- RouteNet-Gauss: Hardware-Enhanced Network Modeling with Machine Learning
- Attention-guided Self-reflection for Zero-shot Hallucination Detection in Large Language Models
- Predict, Cluster, Refine: A Joint Embedding Predictive Self-Supervised Framework for Graph Representation Learning
- FedP$^2$EFT: Federated Learning to Personalize PEFT for Multilingual LLMs
- Rapid Word Learning Through Meta In-Context Learning
- Investigating a Model-Agnostic and Imputation-Free Approach for Irregularly-Sampled Multivariate Time-Series Modeling
- Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs
- LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence
- Can the Waymo Open Motion Dataset Support Realistic Behavioral Modeling? A Validation Study with Naturalistic Trajectories
- JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents
- A Survey on Human-AI Collaboration with Large Foundation Models
- On Generating Monolithic and Model Reconciling Explanations in Probabilistic Scenarios
- Frugal inference for control
- MorphAgent: Empowering Agents through Self-Evolving Profiles and Decentralized Collaboration
- PadChest-GR: A Bilingual Chest X-ray Dataset for Grounded Radiology Report Generation
- CyberBOT: Towards Reliable Cybersecurity Education via Ontology-Grounded Retrieval Augmented Generation
- Shutdownable Agents through POST-Agency
- ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research
- Gradients: When Markets Meet Fine-tuning -- A Distributed Approach to Model Optimisation
- Deep Research Agents: A Systematic Examination And Roadmap
- ChordPrompt: Orchestrating Cross-Modal Prompt Synergy for Multi-Domain Incremental Learning in CLIP
- AHELM: A Holistic Evaluation of Audio-Language Models
- L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search
- P2DT: Mitigating Forgetting in task-incremental Learning with progressive prompt Decision Transformer
- From Metrics to Meaning: Time to Rethink Evaluation in Human-AI Collaborative Design
- Autonomous Learning From Success and Failure: Goal-Conditioned Supervised Learning with Negative Feedback
- LGBP-OrgaNet: Learnable Gaussian Band Pass Fusion of CNN and Transformer Features for Robust Organoid Segmentation and Tracking
- Evaluation of Stress Detection as Time Series Events -- A Novel Window-Based F1-Metric
- FoMEMO: Towards Foundation Models for Expensive Multi-objective Optimization
- Structure Transfer: an Inference-Based Calculus for the Transformation of Representations
- HyPV-LEAD: Proactive Early-Warning of Cryptocurrency Anomalies through Data-Driven Structural-Temporal Modeling
- Estudio de la eficiencia en la escalabilidad de GPUs para el entrenamiento de Inteligencia Artificial [Study of GPU Scalability Efficiency for Artificial Intelligence Training]
- A Comprehensive Guide to Differential Privacy: From Theory to User Expectations
- Automatic Differentiation of Agent-Based Models
- Heatmap Guided Query Transformers for Robust Astrocyte Detection across Immunostains and Resolutions
- Equivariant Flow Matching for Symmetry-Breaking Bifurcation Problems
- On the MIA Vulnerability Gap Between Private GANs and Diffusion Models
- epiGPTope: A machine learning-based epitope generator and classifier
- Fair Resource Allocation for Fleet Intelligence
- Neural Field Turing Machine: A Differentiable Spatial Computer
- TinyDrop: Tiny Model Guided Token Dropping for Vision Transformers
- Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
- Multi-level SSL Feature Gating for Audio Deepfake Detection
- Continuous Saudi Sign Language Recognition: A Vision Transformer Approach
- DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling
- Robult: Leveraging Redundancy and Modality Specific Features for Robust Multimodal Learning
- SafeProtein: Red-Teaming Framework and Benchmark for Protein Foundation Models
- On Entropy Control in LLM-RL Algorithms
- Real-Time Instrument Planning and Perception for Novel Measurements of Dynamic Phenomena
- Strefer: Empowering Video LLMs with Space-Time Referring and Reasoning via Synthetic Instruction Data
- Warming Up for Zeroth-Order Federated Pre-Training with Low Resource Clients
- VendiRL: A Framework for Self-Supervised Reinforcement Learning of Diversely Diverse Skills
- Lattice Annotated Temporal (LAT) Logic for Non-Markovian Reasoning
- KEPT: Knowledge-Enhanced Prediction of Trajectories from Consecutive Driving Frames with Vision-Language Models
- AR-KAN: Autoregressive-Weight-Enhanced Kolmogorov-Arnold Network for Time Series Forecasting
- StableSleep: Source-Free Test-Time Adaptation for Sleep Staging with Lightweight Safety Rails
- Lesion-Aware Visual-Language Fusion for Automated Image Captioning of Ulcerative Colitis Endoscopic Examinations
- Efficient Privacy-Preserving Recommendation on Sparse Data using Fully Homomorphic Encryption
- Unveiling the Response of Large Vision-Language Models to Visually Absent Tokens
- Knowledge Integration for Physics-informed Symbolic Regression Using Pre-trained Large Language Models
- MedLiteNet: Lightweight Hybrid Medical Image Segmentation Model
- FlashRecovery: Fast and Low-Cost Recovery from Failures for Large-Scale Training of LLMs
- Binary Quantization For LLMs Through Dynamic Grouping
- Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers
- S2M2ECG: Spatio-temporal bi-directional State Space Model Enabled Multi-branch Mamba for ECG
- Are We SOLID Yet? An Empirical Study on Prompting LLMs to Detect Design Principle Violations
- Information transmission: Inferring change area from change moment in time series remote sensing images
- A Hierarchical Deep Reinforcement Learning Framework for Traffic Signal Control with Predictable Cycle Planning
- From Evaluation to Defense: Constructing Persistent Edit-Based Fingerprints for Large Language Models
- Adaptive KV-Cache Compression without Manually Setting Budget
- A Neural Network Approach to Multi-radionuclide TDCR Beta Spectroscopy
- Decentralised self-organisation of pivoting cube ensembles using geometric deep learning
- Domain Adaptation of LLMs for Process Data
- Rashomon in the Streets: Explanation Ambiguity in Scene Understanding
- AutoDetect: Designing an Autoencoder-based Detection Method for Poisoning Attacks on Object Detection Applications in the Military Domain
- Is Synthetic Image Augmentation Useful for Imbalanced Classification Problems? Case-Study on the MIDOG2025 Atypical Cell Detection Competition
- Radio Astronomy in the Era of Vision-Language Models: Prompt Sensitivity and Adaptation
- IS$^3$: Generic Impulsive--Stationary Sound Separation in Acoustic Scenes using Deep Filtering
- Who Owns The Robot?: Four Ethical and Socio-technical Questions about Wellbeing Robots in the Real World through Community Engagement
- A Two-Stage Strategy for Mitosis Detection Using Improved YOLO11x Proposals and ConvNeXt Classification
- A Single Detect Focused YOLO Framework for Robust Mitotic Figure Detection
- Enhanced Single-Cell RNA-seq Embedding through Gene Expression and Data-Driven Gene-Gene Interaction Integration
- Adaptive Learning Strategies for Mitotic Figure Classification in MIDOG2025 Challenge
- BioMD: All-atom Generative Model for Biomolecular Dynamics Simulation
- BioBlue: Notable runaway-optimiser-like LLM failure modes on biologically and economically aligned AI safety benchmarks for LLMs with simplified observation format
- Mentality: A Mamba-based Approach towards Foundation Models for EEG
- Optimizing Geometry Problem Sets for Skill Development
- The Transparent Earth: A Multimodal Foundation Model for the Earth's Subsurface
- DrDiff: Dynamic Routing Diffusion with Hierarchical Attention for Breaking the Efficiency-Quality Trade-off
- Improving the Resilience of Quadrotors in Underground Environments by Combining Learning-based and Safety Controllers
- Ensemble Learning for Healthcare: A Comparative Analysis of Hybrid Voting and Ensemble Stacking in Obesity Risk Prediction
- Clustering Discourses: Racial Biases in Short Stories about Women Generated by Large Language Models
- HF-RAG: Hierarchical Fusion-based RAG with Multiple Sources and Rankers
- Conformal Prediction for Time-series Forecasting with Change Points
- The Architecture of AI Transformation: Four Strategic Patterns and an Emerging Frontier
- Enhancing Machine Learning for Imbalanced Medical Data: A Quantum-Inspired Approach to Synthetic Oversampling (QI-SMOTE)
- A-SEA3L-QA: A Fully Automated Self-Evolving, Adversarial Workflow for Arabic Long-Context Question-Answer Generation
- Grocery to General Merchandise: A Cross-Pollination Recommender using LLMs and Real-Time Cart Context
- Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees
- The Basic B*** Effect: The Use of LLM-based Agents Reduces the Distinctiveness and Diversity of People's Choices
- Single Domain Generalization in Diabetic Retinopathy: A Neuro-Symbolic Learning Approach
- Simulacra Naturae: Generative Ecosystem driven by Agent-Based Simulations and Brain Organoid Collective Intelligence
- Deep Research is the New Analytics System: Towards Building the Runtime for AI-Driven Analytics
- Do LLM Modules Generalize? A Study on Motion Generation for Autonomous Driving
- Plan Verification for LLM-Based Embodied Task Completion Agents
- Key Principles in Cross-Domain Hyper-Heuristic Performance
- Learning General Policies From Examples
- Uncertainty-driven Adaptive Exploration
- Accountability Framework for Healthcare AI Systems: Towards Joint Accountability in Decision Making
- app.build: A Production Framework for Scaling Agentic Prompt-to-App Generation with Environment Scaffolding
- Language Models Do Not Follow Occam's Razor: A Benchmark for Inductive and Abductive Reasoning
- Situating AI Agents in their World: Aspective Agentic AI for Dynamic Partially Observable Information Systems
- ANNIE: Be Careful of Your Robots
- SAM-LLM: Interpretable Lane Change Trajectory Prediction via Parametric Finetuning
- The Lifecycle Principle: Stabilizing Dynamic Neural Networks with State Memory
- Latent Variable Modeling in Multi-Agent Reinforcement Learning via Expectation-Maximization for UAV-Based Wildlife Protection
- Charting the Future of Scholarly Knowledge with AI: A Community Perspective
- MitoDetect++: A Domain-Robust Pipeline for Mitosis Detection and Atypical Subtyping
- Normal and Atypical Mitosis Image Classifier using Efficient Vision Transformer
- Ensemble of Pathology Foundation Models for MIDOG 2025 Track 2: Atypical Mitosis Classification
- Beyond Synthetic Augmentation: Group-Aware Threshold Calibration for Robust Balanced Accuracy in Imbalanced Learning
- Robust Pan-Cancer Mitotic Figure Detection with YOLOv12
- OpenAI's HealthBench in Action: Evaluating an LLM-Based Medical Assistant on Realistic Clinical Queries
- MIDOG 2025: Mitotic Figure Detection with Attention-Guided False Positive Correction
- Synthetic Founders: AI-Generated Social Simulations for Startup Validation Research in Computational Social Science
- Towards Digital Twins for Optimal Radioembolization
- Contrastive clustering based on regular equivalence for influential node identification in complex networks
- Resilient Biosecurity in the Era of AI-Enabled Bioweapons
- Can Media Act as a Soft Regulator of Safe AI Development? A Game Theoretical Analysis
- The Future of Artificial Intelligence and the Mathematical and Physical Sciences (AI+MPS)
- Planning with Reasoning using Vision Language World Model
Research Sources: 415 | Generated: 9/4/2025