AI RESEARCH PAPERS & ACADEMIC SOURCES
- Data-driven approximation of transfer operators for mean-field stochastic differential equations
- A Smooth Computational Transition in Tensor PCA
- A Computable Measure of Suboptimality for Entropy-Regularised Variational Objectives
- Zero-inflation in the Multivariate Poisson Lognormal Family
- Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models?
- Talk2PC: Enhancing 3D Visual Grounding through LiDAR and Radar Point Clouds Fusion for Autonomous Driving
- Dynamic Motion Blending for Versatile Motion Editing
- GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill
- MedM-VL: What Makes a Good Medical LVLM?
- Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation
- Earth Observation Foundation Model PhilEO: Pretraining on the MajorTOM and FastTOM Datasets
- Geometry and Perception Guided Gaussians for Multiview-consistent 3D Generation from a Single Image
- Integrative Variational Autoencoders for Generative Modeling of an Image Outcome with Multiple Input Images
- Out-Of-Distribution Detection for Audio-visual Generalized Zero-Shot Learning: A General Framework
- Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation
- Orientation Scores should be a Piece of Cake
- Taccel: Scaling Up Vision-based Tactile Robotics via High-performance GPU Simulation
- Uncovering Neuroimaging Biomarkers of Brain Tumor Surgery with AI-Driven Methods
- GAMMA: Generalizable Alignment via Multi-task and Manipulation-Augmented Training for AI-Generated Image Detection
- Robustness and Diagnostic Performance of Super-Resolution Fetal Brain MRI
- Mask Consistency Regularization in Object Removal
- MagicMirror: A Large-Scale Dataset and Benchmark for Fine-Grained Artifacts Assessment in Text-to-Image Generation
- Detecting Text Manipulation in Images using Vision Language Models
- Adversarial robustness through Lipschitz-Guided Stochastic Depth in Neural Networks
- A Stochastic Birth-and-Death Approach for Street Furniture Geolocation in Urban Environments
- Compute Only 16 Tokens in One Timestep: Accelerating Diffusion Transformers with Cluster-Driven Feature Caching
- GARD: Gamma-based Anatomical Restoration and Denoising for Retinal OCT
- Immunizing Images from Text to Image Editing via Adversarial Cross-Attention
- Efficient Learned Image Compression Through Knowledge Distillation
- Ordinality of Visible-Thermal Image Intensities for Intrinsic Image Decomposition
- Compressed Video Quality Enhancement: Classifying and Benchmarking over Standards
- InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis
- Chord: Chain of Rendering Decomposition for PBR Material Estimation from Generated Texture Images
- HHI-Assist: A Dataset and Benchmark of Human-Human Interaction in Physical Assistance Scenario
- Polarization Denoising and Demosaicking: Dataset and Baseline Method
- GC-VLN: Instruction as Graph Constraints for Training-free Vision-and-Language Navigation
- The Weighting Game: Evaluating Quality of Explainability Methods
- GeoDE: a Geographically Diverse Evaluation Dataset for Object Recognition
- LoFi: Vision-Aided Label Generator for Wi-Fi Localization and Tracking
- Afford-X: Generalizable and Slim Affordance Reasoning for Task-oriented Manipulation
- Privacy-Preserving Automated Rosacea Detection Based on Medically Inspired Region of Interest Selection
- Investigating the Impact of Various Loss Functions and Learnable Wiener Filter for Laparoscopic Image Desmoking
- SCoDA: Self-supervised Continual Domain Adaptation
- Segment Anything for Cell Tracking
- Online 3D Multi-Camera Perception through Robust 2D Tracking and Depth-based Late Aggregation
- Augment to Segment: Tackling Pixel-Level Imbalance in Wheat Disease and Pest Segmentation
- An HMM-based framework for identity-aware long-term multi-object tracking from sparse and uncertain identification: use case on long-term tracking in livestock
- ISTASTrack: Bridging ANN and SNN via ISTA Adapter for RGB-Event Tracking
- FLARE-SSM: Deep State Space Models with Influence-Balanced Loss for 72-Hour Solar Flare Prediction
- TUNI: Real-time RGB-T Semantic Segmentation with Unified Multi-Modal Feature Extraction and Cross-Modal Feature Fusion
- Few-Part-Shot Font Generation
- Efficient and Accurate Downfacing Visual Inertial Odometry
- Hierarchical MLANet: Multi-level Attention for 3D Face Reconstruction From Single Images
- LaV-CoT: Language-Aware Visual CoT with Multi-Aspect Reward Optimization for Real-World Multilingual VQA
- Color Me Correctly: Bridging Perceptual Color Spaces and Text Embeddings for Improved Diffusion Generation
- BEVTraj: Map-Free End-to-End Trajectory Prediction in Bird's-Eye View with Deformable Attention and Sparse Goal Proposals
- Leveraging Multi-View Weak Supervision for Occlusion-Aware Multi-Human Parsing
- A Lightweight Ensemble-Based Face Image Quality Assessment Method with Correlation-Aware Loss
- Grad-CL: Source Free Domain Adaptation with Gradient Guided Feature Disalignment
- Scalable Training for Vector-Quantized Networks with 100% Codebook Utilization
- LayerLock: Non-collapsing Representation Learning with Progressive Freezing
- On the Geometric Accuracy of Implicit and Primitive-based Representations Derived from View Rendering Constraints
- Atomic Fact Decomposition Helps Attributed Question Answering
- Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance
- A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls
- FinMTEB: Finance Massive Text Embedding Benchmark
- Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification
- Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts
- NileChat: Towards Linguistically Diverse and Culturally Aware LLMs for Local Communities
- Faster and Better LLMs via Latency-Aware Test-Time Scaling
- Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
- Australian Supermarket Object Set (ASOS): A Benchmark Dataset of Physical Objects and 3D Models for Robotics and Computer Vision
- Decomposing Visual Classification: Assessing Tree-Based Reasoning in VLMs
- Images in Motion?: A First Look into Video Leakage in Collaborative Deep Learning
- Purge-Gate: Backpropagation-Free Test-Time Adaptation for Point Clouds Classification via Token Purging
- Fine-Grained Cross-View Localization via Local Feature Matching and Monocular Depth Priors
- Patch-based Automatic Rosacea Detection Using the ResNet Deep Learning Framework
- Discrimination by LLMs: Cross-lingual Bias Assessment and Mitigation in Decision-Making and Summarisation
- Pragmatic Frames Evoked by Gestures: A FrameNet Brasil Approach to Multimodality in Turn Organization
- Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization
- CMHG: A Dataset and Benchmark for Headline Generation of Minority Languages in China
- Multi-Intent Recognition in Dialogue Understanding: A Comparison Between Smaller Open-Source LLMs
- Linguistic trajectories of bipolar disorder on social media
- !MSA at BAREC Shared Task 2025: Ensembling Arabic Transformers for Readability Assessment
- Querying Climate Knowledge: Semantic Retrieval for Scientific Discovery
- Arabic Large Language Models for Medical Text Generation
- Scaling Arabic Medical Chatbots Using Synthetic Data: Enhancing Generative AI with Synthetic Patient Records
- Prominence-aware automatic speech recognition for conversational speech
- Towards Reliable and Interpretable Document Question Answering via VLMs
- Incongruent Positivity: When Miscalibrated Positivity Undermines Online Supportive Conversations
- Beyond Token Limits: Assessing Language Model Performance on Long Text Classification
- Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs
- Long Context Automated Essay Scoring with Language Models
- RefactorCoderQA: Benchmarking LLMs for Multi-Domain Coding Question Solutions in Cloud and Edge Deployment
- DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL
- Whisper Has an Internal Word Aligner
- VARCO-VISION-2.0 Technical Report
- UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs
- Direct Judgement Preference Optimization
- Optimal Multi-Task Learning at Regularization Horizon for Speech Translation Task
- BIBERT-Pipe on Biomedical Nested Named Entity Linking at BioASQ 2025
- Natural Language Translation of Formal Proofs through Informalization of Proof Steps and Recursive Summarization along Proof Structure
- A Role-Aware Multi-Agent Framework for Financial Education Question Answering with LLMs
- Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning
- Task-Oriented Multimodal Token Transmission in Resource-Constrained Multiuser Networks
- Diffusion Buffer: Online Diffusion-based Speech Enhancement with Sub-Second Latency
- Attacking Attention of Foundation Models Disrupts Downstream Tasks
- Malware Classification Leveraging NLP & Machine Learning for Enhanced Accuracy
- Constructive Universal Approximation and Sure Convergence for Multi-Layer Neural Networks
- Constraint Guided Model Quantization of Neural Networks
- Bayesian Sheaf Neural Networks
- When and How Does CLIP Enable Domain and Compositional Generalization?
- Local-Cloud Inference Offloading for LLMs in Multi-Modal, Multi-Task, Multi-Dialogue Settings
- A Unified Framework for Diffusion Bridge Problems: Flow Matching and Schrödinger Matching into One
- Space Group Informed Transformer for Crystalline Materials Generation
- Efficient transformer adaptation for analog in-memory computing via low-rank adapters
- MoPD: Mixture-of-Prompts Distillation for Vision-Language Models
- Soft Diamond Regularizers for Deep Learning
- Early Detection of Visual Impairments at Home Using a Smartphone Red-Eye Reflex Test
- An Information-Theoretic Framework for Credit Risk Modeling: Unifying Industry Practice with Statistical Theory for Fair and Interpretable Scorecards
- Off Policy Lyapunov Stability in Reinforcement Learning
- FetalSleepNet: A Transfer Learning Framework with Spectral Equalisation Domain Adaptation for Fetal Sleep Stage Classification
- Model-agnostic post-hoc explainability for recommender systems
- MCL-AD: Multimodal Collaboration Learning for Zero-Shot 3D Anomaly Detection
- Robot guide with multi-agent control and automatic scenario generation with LLM
- Why does your graph neural network fail on some graphs? Insights from exact generalisation error
- Characterizing the Efficiency of Distributed Training: A Power, Performance, and Thermal Perspective
- WhisTLE: Deeply Supervised, Text-Only Domain Adaptation for Pretrained Speech Recognition Transformers
- Is Adversarial Training with Compressed Datasets Effective?
- Unveiling Group-Specific Distributed Concept Drift: A Fairness Imperative in Federated Learning
- Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations
- Federated Multi-Agent Reinforcement Learning for Privacy-Preserving and Energy-Aware Resource Management in 6G Edge Networks
- A Certifiable Machine Learning-Based Pipeline to Predict Fatigue Life of Aircraft Structures
- Prompt Injection Attacks on LLM Generated Reviews of Scientific Publications
- Property prediction for ionic liquids without prior structural knowledge using limited experimental data: A data-driven neural recommender system leveraging transfer learning
- Proof of AutoML: SDN based Secure Energy Trading with Blockchain in Disaster Case
- GraphCSVAE: Graph Categorical Structured Variational Autoencoder for Spatiotemporal Auditing of Physical Vulnerability Towards Sustainable Post-Disaster Risk Reduction
- ARMA Block: A CNN-Based Autoregressive and Moving Average Module for Long-Term Time Series Forecasting
- Physics-informed sensor coverage through structure preserving machine learning
- Flow Straight and Fast in Hilbert Space: Functional Rectified Flow
- Vendi Information Gain for Active Learning and its Application to Ecology
- Inpainting-Guided Policy Optimization for Diffusion Large Language Models
- Multipole Semantic Attention: A Fast Approximation of Softmax Attention for Pretraining
- Powering Job Search at Scale: LLM-Enhanced Query Understanding in Job Matching Systems
- DCHO: A Decomposition-Composition Framework for Predicting Higher-Order Brain Connectivity to Enhance Diverse Downstream Applications
- Improving MLLM Historical Record Extraction with Test-Time Image
- Hybrid Adaptive Conformal Offline Reinforcement Learning for Fair Population Health Management
- One Head, Many Models: Cross-Attention Routing for Cost-Aware LLM Selection
- Variational Neural Networks for Observable Thermodynamics (V-NOTS)
- LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios
- SciML Agents: Write the Solver, Not the Solution
- DyKen-Hyena: Dynamic Kernel Generation via Cross-Modal Attention for Multimodal Intent Recognition
- Data-Driven Energy Estimation for Virtual Servers Using Combined System Metrics and Machine Learning
- Neural Scaling Laws for Deep Regression
- Uncertainty-Aware Tabular Prediction: Evaluating VBLL-Enhanced TabPFN in Safety-Critical Medical Data
- AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion models
- Tokens, the oft-overlooked appetizer: Large language models, the distributional hypothesis, and meaning
- Can LLM Prompting Serve as a Proxy for Static Analysis in Vulnerability Detection
- A Novel Approach to Balance Convenience and Nutrition in Meals With Long-Term Group Recommendations and Reasoning on Multimodal Recipes and its Implementation in BEACON
- Auxiliary Discriminator Sequence Generative Adversarial Networks (ADSeqGAN) for Few Sample Molecule Generation
- Prompt Programming: A Platform for Dialogue-based Computational Problem Solving with Generative AI Models
- JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse
- A Framework for Testing and Adapting REST APIs as LLM Tools
- The Precautionary Principle and the Innovation Principle: Incompatible Guides for AI Innovation Governance?
- AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
- HiLight: A Hierarchical Reinforcement Learning Framework with Global Adversarial Guidance for Large-Scale Traffic Signal Control
- Atherosclerosis through Hierarchical Explainable Neural Network Analysis
- SignClip: Leveraging Mouthing Cues for Sign Language Translation by Multimodal Contrastive Fusion
- We Need a New Ethics for a World of AI Agents
- Generalizing Beyond Suboptimality: Offline Reinforcement Learning Learns Effective Scheduling through Random Data
- I-Segmenter: Integer-Only Vision Transformer for Efficient Semantic Segmentation
- GLAM: Geometry-Guided Local Alignment for Multi-View VLP in Mammography
- Data distribution impacts the performance and generalisability of contrastive learning-based foundation models of electrocardiograms
- Multimodal SAM-adapter for Semantic Segmentation
- Standards in the Preparation of Biomedical Research Metadata: A Bridge2AI Perspective
- Multi-Turn Human-LLM Interaction Through the Lens of a Two-Way Intelligibility Protocol
- QuantX: A Framework for Hardware-Aware Quantization of Generative AI Workloads
- Analyzing the Impact of Adversarial Examples on Explainable Machine Learning
- Slaves to the Law of Large Numbers: An Asymptotic Equipartition Property for Perplexity in Generative Language Models
- A Conflicts-free, Speed-lossless KAN-based Reinforcement Learning Decision System for Interactive Driving in Roundabouts
- A Survey on Group Fairness in Federated Learning: Challenges, Taxonomy of Solutions and Directions for Future Research
- Tackling One Health Risks: How Large Language Models are leveraged for Risk Negotiation and Consensus-building
- An Autoencoder and Vision Transformer-based Interpretability Analysis of the Differences in Automated Staging of Second and Third Molars
- WALL: A Web Application for Automated Quality Assurance using Large Language Models
- SmartCoder-R1: Towards Secure and Explainable Smart Contract Generation with Security-Aware Group Relative Policy Optimization
- Adaptive Token Merging for Efficient Transformer Semantic Communication at the Edge
- Securing LLM-Generated Embedded Firmware through AI Agent-Driven Validation and Patching
- Drone-Based Multispectral Imaging and Deep Learning for Timely Detection of Branched Broomrape in Tomato Farms
- Exploring Expert Specialization through Unsupervised Training in Sparse Mixture of Experts
- Reinforcement learning for spin torque oscillator tasks
- Multimodal Mathematical Reasoning Embedded in Aerial Vehicle Imagery: Benchmarking, Analysis, and Exploration
- Predictive Spike Timing Enables Distributed Shortest Path Computation in Spiking Neural Networks
- Established Psychometric vs. Ecologically Valid Questionnaires: Rethinking Psychological Assessments in Large Language Models
- Realism Control One-step Diffusion for Real-World Image Super-Resolution
- Population-Aligned Persona Generation for LLM-based Social Simulation
- BenchECG and xECG: a benchmark and baseline for ECG foundation models
- Benchmark of stylistic variation in LLM-generated texts
- SI-FACT: Mitigating Knowledge Conflict via Self-Improving Faithfulness-Aware Contrastive Tuning
- Openness in AI and downstream governance: A global value chain approach
- A Multimodal RAG Framework for Housing Damage Assessment: Collaborative Optimization of Image Encoding and Policy Vector Retrieval
- ALIGNS: Unlocking nomological networks in psychological measurement through a large language model
- DiTTO-LLM: Framework for Discovering Topic-based Technology Opportunities via Large Language Model
- MultimodalHugs: Enabling Sign Language Processing in Hugging Face
- MITS: A Large-Scale Multimodal Benchmark Dataset for Intelligent Traffic Surveillance
- Structure Matters: Brain Graph Augmentation via Learnable Edge Masking for Data-efficient Psychiatric Diagnosis
- D-CAT: Decoupled Cross-Attention Transfer between Sensor Modalities for Unimodal Inference
- A Co-Training Semi-Supervised Framework Using Faster R-CNN and YOLO Networks for Object Detection in Densely Packed Retail Images
- Meta-Learning Reinforcement Learning for Crypto-Return Prediction
- Revisiting Actor-Critic Methods in Discrete Action Off-Policy Reinforcement Learning
- SWE-Effi: Re-Evaluating Software AI Agent System Effectiveness Under Resource Constraints
- Vibe Check: Understanding the Effects of LLM-Based Conversational Agents' Personality and Alignment on User Perceptions in Goal-Oriented Tasks
- Emulating Public Opinion: A Proof-of-Concept of AI-Generated Synthetic Survey Responses for the Chilean Case
- From Hugging Face to GitHub: Tracing License Drift in the Open-Source AI Ecosystem
- Automated Tuning for Diffusion Inverse Problem Solvers without Generative Prior Retraining
- Mutual Information Tracks Policy Coherence in Reinforcement Learning
- Forecasting Clicks in Digital Advertising: Multimodal Inputs and Interpretable Outputs
- Text-to-SQL Oriented to the Process Mining Domain: A PT-EN Dataset for Query Translation
- TalkPlayData 2: An Agentic Synthetic Data Pipeline for Multimodal Conversational Music Recommendation
- GeoGPT.RAG Technical Report
- AI-Powered Assistant for Long-Term Access to RHIC Knowledge
- Structured Information Matters: Explainable ICD Coding with Patient-Level Knowledge Graphs
- Cross-Layer Attention Probing for Fine-Grained Hallucination Detection
- Creativity Benchmark: A benchmark for marketing creativity for LLM models
- CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor
- The Non-Determinism of Small LLMs: Evidence of Low Answer Consistency in Repetition Trials of Standard Multiple-Choice Benchmarks
- Differential Robustness in Transformer Language Models: Empirical Evaluation Under Adversarial Text Attacks
- LLM-Based Instance-Driven Heuristic Bias In the Context of a Biased Random Key Genetic Algorithm
- Beyond I'm Sorry, I Can't: Dissecting Large Language Model Refusal
- Assisting Research Proposal Writing with Large Language Models: Evaluation and Refinement
- Generating Individual Travel Diaries Using Large Language Models Informed by Census and Land-Use Data
- Psychiatry-Bench: A Multi-Task Benchmark for LLMs in Psychiatry
- The Thinking Therapist: Training Large Language Models to Deliver Acceptance and Commitment Therapy using Supervised Fine-Tuning and Odds Ratio Policy Optimization
- HANRAG: Heuristic Accurate Noise-resistant Retrieval-Augmented Generation for Multi-hop Question Answering
- Human-AI Collaboration Increases Efficiency in Regulatory Writing
- How well can LLMs provide planning feedback in grounded environments?
- A Modular and Multimodal Generative AI Framework for Urban Building Energy Data: Generating Synthetic Homes
- Towards an AI-based knowledge assistant for goat farmers based on Retrieval-Augmented Generation
- LLMs as Agentic Cooperative Players in Multiplayer UNO
- The (R)evolution of Scientific Workflows in the Agentic AI Era: Towards Autonomous Science
- Evaluation of Black-Box XAI Approaches for Predictors of Values of Boolean Formulae
- GAMA: A General Anonymizing Multi-Agent System for Privacy Preservation Enhanced by Domain Rules and Disproof Method
- XAgents: A Unified Framework for Multi-Agent Cooperation via IF-THEN Rules and Multipolar Task Processing Graph
- AI Harmonics: a human-centric and harms severity-adaptive AI risk assessment framework
- Online Robust Planning under Model Uncertainty: A Sample-Based Approach
- The Morality of Probability: How Implicit Moral Biases in LLMs May Shape the Future of Human-AI Symbiosis
- Abduct, Act, Predict: Scaffolding Causal Inference for Automated Failure Attribution in Multi-Agent Systems
Research Sources: 246 | Generated: 9/15/2025