AI RESEARCH PAPERS & ACADEMIC SOURCES
- Steering Vision-Language Pre-trained Models for Incremental Face Presentation Attack Detection
- Decoding Predictive Inference in Visual Language Processing via Spatiotemporal Neural Coherence
- Transductive Visual Programming: Evolving Tool Libraries from Experience for Spatial Reasoning
- Beyond Context: Large Language Models Failure to Grasp Users Intent
- ReaSeq: Unleashing World Knowledge via Reasoning for Sequential Modeling
- Improving Neural Question Generation using World Knowledge
- Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback
- VL4Gaze: Unleashing Vision-Language Models for Gaze Following
- OccuFly: A 3D Vision Benchmark for Semantic Scene Completion from the Aerial Perspective
- Learning to Sense for Driving: Joint Optics-Sensor-Model Co-Design for Semantic Segmentation
- Input-Adaptive Visual Preprocessing for Efficient Fast Vision-Language Model Inference
- ALIVE: An Avatar-Lecture Interactive Video Engine with Content-Aware Retrieval for Real-Time Interaction
- Lightweight framework for underground pipeline recognition and spatial localization based on multi-view 2D GPR images
- NeRV360: Neural Representation for 360-Degree Videos with a Viewport Decoder
- Beyond Weight Adaptation: Feature-Space Domain Injection for Cross-Modal Ship Re-Identification
- DGSAN: Dual-Graph Spatiotemporal Attention Network for Pulmonary Nodule Malignancy Prediction
- Benchmarking and Enhancing VLM for Compressed Image Understanding
- PanoGrounder: Bridging 2D and 3D with Panoramic Scene Representations for VLM-based 3D Visual Grounding
- Self-supervised Multiplex Consensus Mamba for General Image Fusion
- Quantile Rendering: Efficiently Embedding High-dimensional Feature on 3D Gaussian Splatting
- Reasoning-Driven Amodal Completion: Collaborative Agents and Perceptual Evaluation
- Beyond Artifacts: Real-Centric Envelope Modeling for Reliable AI-Generated Image Detection
- SPOT!: Map-Guided LLM Agent for Unsupervised Multi-CCTV Dynamic Object Tracking
- XGrid-Mapping: Explicit Implicit Hybrid Grid Submaps for Efficient Incremental Neural LiDAR Mapping
- X-ray Insights Unleashed: Pioneering the Enhancement of Multi-Label Long-Tail Data
- PUFM++: Point Cloud Upsampling via Enhanced Flow Matching
- MVInverse: Feed-forward Multi-view Inverse Rendering in Seconds
- Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations
- Granular-ball Guided Masking: Structure-aware Data Augmentation
- FluencyVE: Marrying Temporal-Aware Mamba with Bypass Attention for Video Editing
- Efficient and Robust Video Defense Framework against 3D-field Personalized Talking Face
- Multi-Attribute guided Thermal Face Image Translation based on Latent Diffusion Model
- Next-Scale Prediction: A Self-Supervised Approach for Real-World Image Denoising
- A Large-Depth-Range Layer-Based Hologram Dataset for Machine Learning-Based 3D Computer-Generated Holography
- Matrix Completion Via Reweighted Logarithmic Norm Minimization
- Optical Flow-Guided 6DoF Object Pose Tracking with an Event Camera
- Beyond Pixel Simulation: Pathology Image Generation via Diagnostic Semantic Tokens and Prototype Control
- Multimodal Skeleton-Based Action Representation Learning via Decomposition and Composition
- UniPR-3D: Towards Universal Visual Place Recognition with Visual Geometry Grounded Transformer
- T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation
- UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters
- FreeInpaint: Tuning-free Prompt Alignment and Visual Rationality Enhancement in Image Inpainting
- MarineEval: Assessing the Marine Intelligence of Vision-Language Models
- TGC-Net: A Structure-Aware and Semantically-Aligned Framework for Text-Guided Medical Image Segmentation
- ORCA: Object Recognition and Comprehension for Archiving Marine Species
- A Turn Toward Better Alignment: Few-Shot Generative Adaptation with Equivariant Feature Rotation
- Towards Arbitrary Motion Completing via Hierarchical Continuous Representation
- UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement
- VisRes Bench: On Evaluating the Visual Reasoning Capabilities of VLMs
- Human Motion Estimation with Everyday Wearables
- Latent Implicit Visual Reasoning
- Leveraging Lightweight Entity Extraction for Scalable Event-Based Image Retrieval
- SegMo: Segment-aligned Text to 3D Human Motion Generation
- DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation
- AnyAD: Unified Any-Modality Anomaly Detection in Incomplete Multi-Sequence MRI
- ACD: Direct Conditional Control for Video Diffusion Models via Attention Supervision
- GriDiT: Factorized Grid-Based Diffusion for Efficient Long Image Sequence Generation
- Surgical Scene Segmentation using a Spike-Driven Video Transformer with Real-Time Potential
- Post-Processing Mask-Based Table Segmentation for Structural Coordinate Extraction
- AndroidLens: Long-latency Evaluation with Nested Sub-targets for Android GUI Agents
- TICON: A Slide-Level Tile Contextualizer for Histopathology Representation Learning
- Fast SAM2 with Text-Driven Token Pruning
- Streaming Video Instruction Tuning
- Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models
- HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming
- Flow Gym
- Language-Guided Grasp Detection with Coarse-to-Fine Learning for Robotic Manipulation
- TexAvatars: Hybrid Texel-3D Representations for Stable Rigging of Photorealistic Gaussian Head Avatars
- Equivariant Multiscale Learned Invertible Reconstruction for Cone Beam CT: From Simulated to Real Data
- Schrödinger's Navigator: Imagining an Ensemble of Futures for Zero-Shot Object Navigation
- RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic
- Automatic Replication of LLM Mistakes in Medical Conversations
- Learning from Neighbors with PHIBP: Predicting Infectious Disease Dynamics in Data-Sparse Environments
- Enhancing diffusion models with Gaussianization preprocessing
- Towards Better Search with Domain-Aware Text Embeddings for C2C Marketplaces
- Critical Points of Degenerate Metrics on Algebraic Varieties: A Tale of Overparametrization
- Agentic Multi-Persona Framework for Evidence-Aware Fake News Detection
- zkFL-Health: Blockchain-Enabled Zero-Knowledge Federated Learning for Medical AI Privacy
- DexAvatar: 3D Sign Language Reconstruction with Hand and Body Pose Priors
- Blurb-Refined Inference from Crowdsourced Book Reviews using Hierarchical Genre Mining with Dual-Path Graph Convolutions
- LLM Personas as a Substitute for Field Experiments in Method Benchmarking
- Dyna-Style Reinforcement Learning Modeling and Control of Non-linear Dynamics
- Hierarchical Modeling Approach to Fast and Accurate Table Recognition
- Semantic Refinement with LLMs for Graph Representations
- Semi-Supervised Learning for Large Language Models Safety and Content Moderation
- AutoBaxBuilder: Bootstrapping Code Security Benchmarking
- ElfCore: A 28nm Neural Processor Enabling Dynamic Structured Sparse Training and Online Self-Supervised Learning with Activity-Dependent Weight Update
- A Community-Enhanced Graph Representation Model for Link Prediction
- Causal-driven attribution (CDA): Estimating channel influence without user-level data
- Assessing the Software Security Comprehension of Large Language Models
- LookPlanGraph: Embodied Instruction Following Method with VLM Graph Augmentation
- Variationally correct operator learning: Reduced basis neural operator with a posteriori error estimation
- Parallel Token Prediction for Language Models
- Autonomous Uncertainty Quantification for Computational Point-of-care Sensors
- Optimizing Decoding Paths in Masked Diffusion Models by Quantifying Uncertainty
- Explicit Group Sparse Projection with Applications to Deep Learning and NMF
- DATTA: Domain Diversity Aware Test-Time Adaptation for Dynamic Domain Shift Data Streams
- Optimal Control with Natural Images: Efficient Reinforcement Learning using Overcomplete Sparse Codes
- Predicting Metabolic Dysfunction-Associated Steatotic Liver Disease using Machine Learning Methods
- Deep Kronecker Network
- Eliciting Risk Aversion with Inverse Reinforcement Learning via Interactive Questioning
- Agnostic Process Tomography
- MatchMiner-AI: An Open-Source Solution for Cancer Clinical Trial Matching
- SA-DiffuSeq: Addressing Computational and Scalability Challenges in Long-Document Generation with Sparse Attention
- Adversarial Training for Failure-Sensitive User Simulation in Mental Health Dialogue Optimization
- Large Language Models Approach Expert Pedagogical Quality in Math Tutoring but Differ in Instructional and Linguistic Profiles
- Investigating Model Editing for Unlearning in Large Language Models
- Measuring Mechanistic Independence: Can Bias Be Removed Without Erasing Demographics?
- Semantic Deception: When Reasoning Models Can't Compute an Addition
- EssayCBM: Rubric-Aligned Concept Bottleneck Models for Transparent Essay Grading
- MediEval: A Unified Medical Benchmark for Patient-Contextual and Knowledge-Grounded Reasoning in LLMs
- How important is Recall for Measuring Retrieval Quality?
- Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation
- Foundation Model-based Evaluation of Neuropsychiatric Disorders: A Lifespan-Inclusive, Multi-Modal, and Multi-Lingual Study
- Neural Probe-Based Hallucination Detection for Large Language Models
- Reflection Pretraining Enables Token-Level Self-Correction in Biological Sequence Models
- Distilling the Essence: Efficient Reasoning Distillation via Sequence Truncation
- Rethinking Supervised Fine-Tuning: Emphasizing Key Answer Tokens for Improved LLM Accuracy
- ClarifyMT-Bench: Benchmarking and Improving Multi-Turn Clarification for Conversational Large Language Models
- SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation
- SMART SLM: Structured Memory and Reasoning Transformer, A Small Language Model for Accurate Document Assistance
- Your Reasoning Benchmark May Not Test Reasoning: Revealing Perception Bottleneck in Abstract Reasoning Benchmarks
- C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling
- MegaRAG: Multimodal Knowledge Graph-Based Retrieval Augmented Generation
- Automated Red-Teaming Framework for Large Language Model Security Assessment: A Comprehensive Attack Generation and Detection System
- Zero-Training Temporal Drift Detection for Transformer Sentiment Models: A Comprehensive Analysis on Authentic Social Media Streams
- Enhancing Lung Cancer Treatment Outcome Prediction through Semantic Feature Engineering Using Large Language Models
- Real Time Detection and Quantitative Analysis of Spurious Forgetting in Continual Learning
- SHRP: Specialized Head Routing and Pruning for Efficient Encoder Compression
- Data-Free Pruning of Self-Attention Layers in LLMs
- Forecasting N-Body Dynamics: A Comparative Study of Neural Ordinary Differential Equations and Universal Differential Equations
- Q-RUN: Quantum-Inspired Data Re-uploading Networks
- MaskOpt: A Large-Scale Mask Optimization Dataset to Advance AI in Integrated Circuit Manufacturing
- Managing the Stochastic: Foundations of Learning in Neuro-Symbolic Systems for Software Engineering
- Dominating vs. Dominated: Generative Collapse in Diffusion Models
- Forward Only Learning for Orthogonal Neural Networks of any Depth
- Improving Cardiac Risk Prediction Using Data Generation Techniques
- Disentangling Fact from Sentiment: A Dynamic Conflict-Consensus Framework for Multimodal Fake News Detection
- HyDRA: Hierarchical and Dynamic Rank Adaptation for Mobile Vision Language Model
- Revisiting the Learning Objectives of Vision-Language Reward Models
- PHOTON: Hierarchical Autoregressive Modeling for Lightspeed and Memory-Efficient Language Generation
- FEM-Bench: A Structured Scientific Reasoning Benchmark for Evaluating Code-Generating LLMs
- Stabilizing Multimodal Autoencoders: A Theoretical and Empirical Analysis of Fusion Strategies
- Bridging Efficiency and Safety: Formal Verification of Neural Networks with Early Exits
- Generalization of RLVR Using Causal Reasoning as a Testbed
- TS-Arena Technical Report – A Pre-registered Live Forecasting Platform
- Subgroup Discovery with the Cox Model
- Improving Matrix Exponential for Generative AI Flows: A Taylor-Based Approach Beyond Paterson–Stockmeyer
- Symbolic regression for defect interactions in 2D materials
- GraphFire-X: Physics-Informed Graph Attention Networks and Structural Gradient Boosting for Building-Scale Wildfire Preparedness at the Wildland-Urban Interface
- FedMPDD: Communication-Efficient Federated Learning with Privacy Preservation Attributes via Projected Directional Derivative
- Defending against adversarial attacks using mixture of experts
- Memory-Efficient Acceleration of Block Low-Rank Foundation Models on Resource Constrained GPUs
- Robustness Certificates for Neural Networks against Adversarial Attacks
- From GNNs to Symbolic Surrogates via Kolmogorov-Arnold Networks for Delay Prediction
- Time-Efficient Evaluation and Enhancement of Adversarial Robustness in Deep Neural Networks
- DiEC: Diffusion Embedded Clustering
- Towards a General Framework for Predicting and Explaining the Hardness of Graph-based Combinatorial Optimization Problems using Machine Learning and Association Rule Mining
- RevFFN: Memory-Efficient Full-Parameter Fine-Tuning of Mixture-of-Experts LLMs with Reversible Blocks
- Guardrailed Elasticity Pricing: A Churn-Aware Forecasting Playbook for Subscription Strategy
- A Multi-fidelity Double-Delta Wing Dataset and Empirical Scaling Laws for GNN-based Aerodynamic Field Surrogate
- Solving Functional PDEs with Gaussian Processes and Applications to Functional Renormalization Group Equations
- ReACT-Drug: Reaction-Template Guided Reinforcement Learning for de novo Drug Design
- Can Agentic AI Match the Performance of Human Data Scientists?
- Generalization of Diffusion Models Arises with a Balanced Representation Space
- Generalised Linear Models in Deep Bayesian RL with Learnable Basis Functions
- CoSeNet: A Novel Approach for Optimal Segmentation of Correlation Matrices
- LLM Swiss Round: Aggregating Multi-Benchmark Performance via Competitive Swiss-System Dynamics
- Understanding Scaling Laws in Deep Neural Networks via Feature Learning Dynamics
- Shared Representation Learning for High-Dimensional Multi-Task Forecasting under Resource Contention in Cloud-Native Backends
- A Mechanistic Analysis of Transformers for Dynamical Systems
- STLDM: Spatio-Temporal Latent Diffusion Model for Precipitation Nowcasting
- MODE: Multi-Objective Adaptive Coreset Selection
- BALLAST: Bandit-Assisted Learning for Latency-Aware Stable Timeouts in Raft
- A Unified Framework for EEG Seizure Detection Using Universum-Integrated Generalized Eigenvalues Proximal Support Vector Machine
- Analytic and Variational Stability of Deep Learning Systems
- MiST: Understanding the Role of Mid-Stage Scientific Training in Developing Chemical Reasoning Models
- Improving the Convergence Rate of Ray Search Optimization for Query-Efficient Hard-Label Attacks
- Model Merging via Multi-Teacher Knowledge Distillation
- Transcriptome-Conditioned Personalized De Novo Drug Generation for AML Using Metaheuristic Assembly and Target-Driven Filtering
- Learning to Solve PDEs on Neural Shape Representations
- Does the Data Processing Inequality Reflect Practice? On the Utility of Low-Level Tasks
- Measuring all the noises of LLM Evals
- Uncovering Patterns of Brain Activity from EEG Data Consistently Associated with Cybersickness Using Neural Network Interpretability Maps
- Uncovering Competency Gaps in Large Language Models and Their Benchmarks
- Graph Neural Networks for Source Detection: A Review and Benchmark Study
- Fast and Exact Least Absolute Deviations Line Fitting via Piecewise Affine Lower-Bounding
- Diffusion Models in Simulation-Based Inference: A Tutorial Review
- Mechanism-Based Intelligence (MBI): Differentiable Incentives for Rational Coordination and Guaranteed Alignment in Multi-Agent Systems
- Real-World Adversarial Attacks on RF-Based Drone Detectors
- AI-Driven Green Cognitive Radio Networks for Sustainable 6G Communication
- AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent
- TrashDet: Iterative Neural Architecture Search for Efficient Waste Detection
- A Physics Informed Neural Network For Deriving MHD State Vectors From Global Active Regions Observations
- TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior
- NULLBUS: Multimodal Mixed-Supervision for Breast Ultrasound Segmentation via Nullable Global-Local Prompts
- Weighted MCC: A Robust Measure of Multiclass Classifier Performance for Observations with Individual Weights
- Context-Sensitive Abstractions for Reinforcement Learning with Parameterized Actions
- CHAMMI-75: pre-training multi-channel models with heterogeneous microscopy images
- Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
- NVIDIA Nemotron 3: Efficient and Open Intelligence
- Better Call Graphs: A New Dataset of Function Call Graphs for Malware Classification
- Architectural Trade-offs in Small Language Models Under Compute Constraints
- Clever Hans in Chemistry: Chemist Style Signals Confound Activity Prediction on Public Benchmarks
- AirGS: Real-Time 4D Gaussian Streaming for Free-Viewpoint Video Experiences
- MultiMind at SemEval-2025 Task 7: Crosslingual Fact-Checked Claim Retrieval via Multi-Source Alignment
- Deadline-Aware Online Scheduling for LLM Fine-Tuning with Spot Market Predictions
- GenTSE: Enhancing Target Speaker Extraction via a Coarse-to-Fine Generative Language Model
- Parameter-Efficient Neural CDEs via Implicit Function Jacobians
- Learning Evolving Latent Strategies for Multi-Agent Language Systems without Model Fine-Tuning
Research Sources: 210 | Generated: 12/25/2025
