AI RESEARCH PAPERS & ACADEMIC SOURCES
- Noise-Robustness Through Noise: Asymmetric LoRA Adaption with Poisoning Expert
- Perception-R1: Advancing Multimodal Reasoning Capabilities of MLLMs via Visual Perception Reward
- Algorithmic Fairness: Not a Purely Technical but Socio-Technical Property
- Towards Interactive and Learnable Cooperative Driving Automation: a Large Language Model-Driven Decision-Making Framework
- SeCodePLT: A Unified Platform for Evaluating the Security of Code GenAI
- Negotiative Alignment: Embracing Disagreement to Achieve Fairer Outcomes -- Insights from Urban Studies
- Who is Responsible When AI Fails? Mapping Causes, Entities, and Consequences of AI Privacy and Ethical Incidents
- Hybrid Temporal Differential Consistency Autoencoder for Efficient and Sustainable Anomaly Detection in Cyber-Physical Systems
- MigGPT: Harnessing Large Language Models for Automated Migration of Out-of-Tree Linux Kernel Patches Across Versions
- Space Group Equivariant Crystal Diffusion
- World Modelling Improves Language Model Agents
- Beyond the Average: Distributional Causal Inference under Imperfect Compliance
- Transfer learning under latent space model
- Training More Robust Classification Model via Discriminative Loss and Gaussian Noise Injection
- Gradient-Free Sequential Bayesian Experimental Design via Interacting Particle Systems
- Causal inference for the expected number of recurrent events in the presence of a terminal event
- Improved learning theory for kernel distribution regression with two-stage sampling
- Permutation recovery of spikes in noisy high-dimensional tensor estimation
- Double descent in quantum kernel methods
- Copyright and Competition: Estimating Supply and Demand with Unstructured Data
- Deep Reinforcement Learning with Gradient Eligibility Traces
- CLIPTTA: Robust Contrastive Vision-Language Test-Time Adaptation
- MapAnything: Universal Feed-Forward Metric 3D Reconstruction
- AToken: A Unified Tokenizer for Vision
- Sea-ing Through Scattered Rays: Revisiting the Image Formation Model for Realistic Underwater Image Generation
- Set Phasers to Stun: Beaming Power and Control to Mobile Robots with Laser Light
- A more efficient method for large-sample model-free feature screening via multi-armed bandits
- Subset Selection for Stratified Sampling in Online Controlled Experiments
- cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning
- OSPO: Object-centric Self-improving Preference Optimization for Text-to-Image Generation
- Spatial Understanding from Videos: Structured Prompts Meet Simulation Data
- OptiScene: LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization
- DualEdit: Dual Editing for Knowledge Updating in Vision-Language Models
- Classification of Tents in Street Bazaars Using CNN
- RePIC: Reinforced Post-Training for Personalizing Multi-Modal Language Models
- Cross-Resolution SAR Target Detection Using Structural Hierarchy Adaptation and Reliable Adjacency Alignment
- MCGA: Mixture of Codebooks Hyperspectral Reconstruction via Grayscale-Aware Attention
- Deformable Dynamic Convolution for Accurate yet Efficient Spatio-Temporal Traffic Prediction
- VLA-Mark: A cross modal watermark for large vision-language alignment model
- Training A Neural Network For Partially Occluded Road Sign Identification In The Context Of Autonomous Vehicles
- scSplit: Bringing Severity Cognizance to Image Decomposition in Fluorescence Microscopy
- AttentionDrop: A Novel Regularization Method for Transformer Models
- DSDNet: Raw Domain Demoiréing via Dual Color-Space Synergy
- The Moon's Many Faces: A Single Unified Transformer for Multimodal Lunar Reconstruction
- Temperature-Driven Robust Disease Detection in Brain and Gastrointestinal Disorders via Context-Aware Adaptive Knowledge Distillation
- TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection
- Examining Deployment and Refinement of the VIOLA-AI Intracranial Hemorrhage Model Using an Interactive NeoMedSys Platform
- Semantic Change Detection of Roads and Bridges: A Fine-grained Dataset and Multimodal Frequency-driven Detector
- RETRO: REthinking Tactile Representation Learning with Material PriOrs
- GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains
- CADSpotting: Robust Panoptic Symbol Spotting on Large-Scale CAD Drawings
- NFL-BA: Near-Field Light Bundle Adjustment for SLAM in Dynamic Lighting
- iCBIR-Sli: Interpretable Content-Based Image Retrieval with 2D Slice Embeddings
- Experimenting with Affective Computing Models in Video Interviews with Spanish-speaking Older Adults
- Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models
- Screener: Self-supervised Pathology Segmentation in Medical CT Images
- Integrating Spatiotemporal Vision Transformer into Digital Twins for High-Resolution Heat Stress Forecasting in Campus Environments
- SCoT: Straight Consistent Trajectory for Pre-Trained Diffusion Model Distillations
- Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models
- ISP-AD: A Large-Scale Real-World Dataset for Advancing Industrial Anomaly Detection with Synthetic and Real Defects
- Pruning the Paradox: How CLIP's Most Informative Heads Enhance Performance While Amplifying Bias
- Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning
- Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural Collapse
- A re-calibration method for object detection with multi-modal alignment bias in autonomous driving
- Assessing invariance to affine transformations in image quality metrics
- Img2CAD: Reverse Engineering 3D CAD Models from Images through VLM-Assisted Conditional Factorization
- FOVAL: Calibration-Free and Subject-Invariant Fixation Depth Estimation Across Diverse Eye-Tracking Datasets
- Combo: Co-speech holistic 3D human motion generation and efficient customizable adaptation in harmony
- CrackSCF: Lightweight Cascaded Fusion Network for Robust and Efficient Structural Crack Segmentation
- Diffusion-Based Depth Inpainting for Transparent and Reflective Objects
- G2D2: Gradient-Guided Discrete Diffusion for Inverse Problem Solving
- AcT2I: Evaluating and Improving Action Depiction in Text-to-Image Models
- Pointing to a Llama and Call it a Camel: On the Sycophancy of Multimodal Large Language Models
- UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation
- Analysis Plug-and-Play Methods for Imaging Inverse Problems
- Prostate Capsule Segmentation from Micro-Ultrasound Images using Adaptive Focal Loss
- Uncertainty-Gated Deformable Network for Breast Tumor Segmentation in MR Images
- DPC-QA Net: A No-Reference Dual-Stream Perceptual and Cellular Quality Assessment Network for Histopathology Images
- QWD-GAN: Quality-aware Wavelet-driven GAN for Unsupervised Medical Microscopy Images Denoising
- CoReVLA: A Dual-Stage End-to-End Autonomous Driving Framework for Long-Tail Scenarios via Collect-and-Refine
- SLaM-DiMM: Shared Latent Modeling for Diffusion Based Missing Modality Synthesis in MRI
- FMD-TransUNet: Abdominal Multi-Organ Segmentation Based on Frequency Domain Multi-Axis Representation Learning and Dual Attention Mechanisms
- Beyond Pixels: Enhancing LIME with Hierarchical Features and Segmentation Foundation Models
- Towards Robust Visual Continual Learning with Multi-Prototype Supervision
- DistillMatch: Leveraging Knowledge Distillation from Vision Foundation Model for Multimodal Image Matching
- Generalized Deep Multi-view Clustering via Causal Learning with Partially Aligned Cross-view Correspondence
- GLip: A Global-Local Integrated Progressive Framework for Robust Visual Speech Recognition
- Graph-based Point Cloud Surface Reconstruction using B-Splines
- Language-Instructed Reasoning for Group Activity Detection via Multimodal Large Language Model
- Blind-Spot Guided Diffusion for Self-supervised Real-World Denoising
- AdaSports-Traj: Role- and Domain-Aware Adaptation for Multi-Agent Trajectory Modeling in Sports
- SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D Features
- RadarGaussianDet3D: An Efficient and Effective Gaussian-based 3D Detector with 4D Automotive Radars
- BaseReward: A Strong Baseline for Multimodal Reward Model
- Recovering Parametric Scenes from Very Few Time-of-Flight Pixels
- Boosting Active Learning with Knowledge Transfer
- LC-SLab -- An Object-based Deep Learning Framework for Large-scale Land Cover Classification from Satellite Imagery and Sparse In-situ Labels
- Zero-Shot Visual Grounding in 3D Gaussians via View Retrieval
- ENSAM: an efficient foundation model for interactive segmentation of 3D medical images
- RangeSAM: Leveraging Visual Foundation Models for Range-View represented LiDAR segmentation
- Global Regulation and Excitation via Attention Tuning for Stereo Matching
- Deep Feedback Models
- Sparse Multiview Open-Vocabulary 3D Detection
- PAN: Pillars-Attention-Based Network for 3D Object Detection
- A multi-temporal multi-spectral attention-augmented deep convolution neural network with contrastive learning for crop yield prediction
- CoPAD: Multi-source Trajectory Fusion and Cooperative Trajectory Prediction with Anchor-oriented Decoder in V2X Scenarios
- DAFTED: Decoupled Asymmetric Fusion of Tabular and Echocardiographic Data for Cardiac Hypertension Diagnosis
- Toward Medical Deepfake Detection: A Comprehensive Dataset and Novel Method
- TrueMoE: Dual-Routing Mixture of Discriminative Experts for Synthetic Image Detection
- Hybrid Lie semi-group and cascade structures for the generalized Gaussian derivative model for visual receptive fields
- Simulated Cortical Magnification Supports Self-Supervised Object Learning
- MCOD: The First Challenging Benchmark for Multispectral Camouflaged Object Detection
- Overview of PlantCLEF 2024: multi-species plant identification in vegetation plot images
- Vision-Language Models as Differentiable Semantic and Spatial Rewards for Text-to-3D Generation
- Enriched Feature Representation and Motion Prediction Module for MOSEv2 Track of 7th LSVOS Challenge: 3rd Place Solution
- FoBa: A Foreground-Background co-Guided Method and New Benchmark for Remote Sensing Semantic Change Detection
- Minimal Semantic Sufficiency Meets Unsupervised Domain Generalization
- TASAM: Terrain-and-Aware Segment Anything Model for Temporal-Scale Remote Sensing Segmentation
- TennisTV: Do Multimodal Large Language Models Understand Tennis Rallies?
- Enhancing WSI-Based Survival Analysis with Report-Auxiliary Self-Distillation
- PCSR: Pseudo-label Consistency-Guided Sample Refinement for Noisy Correspondence Learning
- pFedSAM: Personalized Federated Learning of Segment Anything Model for Medical Image Segmentation
- UNIV: Unified Foundation Model for Infrared and Visible Modalities
- GS-Scale: Unlocking Large-Scale 3D Gaussian Splatting Training via Host Offloading
- FingerSplat: Contactless Fingerprint 3D Reconstruction and Generation based on 3D Gaussian Splatting
- A PCA Based Model for Surface Reconstruction from Incomplete Point Clouds
- Camera Splatting for Continuous View Optimization
- Layout Stroke Imitation: A Layout Guided Handwriting Stroke Generation for Style Imitation with Diffusion Model
- SCENEFORGE: Enhancing 3D-text alignment with Structured Scene Compositions
- Training-Free Pyramid Token Pruning for Efficient Large Vision-Language Models via Region, Token, and Instruction-Guided Importance
- NeuroRAD-FM: A Foundation Model for Neuro-Oncology with Distributionally Robust Training
- Efficient Multimodal Dataset Distillation via Generative Models
- OpenViGA: Video Generation for Automotive Driving Scenes by Streamlining and Fine-Tuning Open Source Models with Public Data
- Lynx: Towards High-Fidelity Personalized Video Generation
- Backdoor Mitigation via Invertible Pruning Masks
- MEC-Quant: Maximum Entropy Coding for Extremely Low Bit Quantization-Aware Training
- SAMPO: Scale-wise Autoregression with Motion PrOmpt for generative world models
- Enhancing Sa2VA for Referent Video Object Segmentation: 2nd Solution for 7th LSVOS RVOS Track
- MS-GS: Multi-Appearance Sparse-View 3D Gaussian Splatting in the Wild
- From Development to Deployment of AI-assisted Telehealth and Screening for Vision- and Hearing-threatening diseases in resource-constrained settings: Field Observations, Challenges and Way Forward
- DC-Mamba: Bi-temporal deformable alignment and scale-sparse enhancement for remote sensing change detection
- EyePCR: A Comprehensive Benchmark for Fine-Grained Perception, Knowledge Comprehension and Clinical Reasoning in Ophthalmic Surgery
- CoDoL: Conditional Domain Prompt Learning for Out-of-Distribution Generalization
- LowDiff: Efficient Diffusion Sampling with Low-Resolution Condition
- RaceGAN: A Framework for Preserving Individuality while Converting Racial Information for Image-to-Image Translation
- Causal Fingerprints of AI Generative Models
- Mind the Gap: Data Rewriting for Stable Off-Policy Supervised Fine-Tuning
- Exploring the Capabilities of LLM Encoders for Image-Text Retrieval in Chest X-rays
- ProFusion: 3D Reconstruction of Protein Complex Structures from Multi-view AFM Images
- Multi-Modal Interpretability for Enhanced Localization in Vision-Language Models
- Which Direction to Choose? An Analysis on the Representation Power of Self-Supervised ViTs in Downstream Tasks
- How Good are Foundation Models in Step-by-Step Embodied Reasoning?
- SPaRC: A Spatial Pathfinding Reasoning Challenge
- P2VA: Converting Persona Descriptions into Voice Attributes for Fair and Controllable Text-to-Speech
- Can Large Language Models Infer Causal Relationships from Real-World Text?
- Beyond Linear Steering: Unified Multi-Attribute Control for Language Models
- LLMs Can Compensate for Deficiencies in Visual Representations
- Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
- Tag&Tab: Pretraining Data Detection in Large Language Models Using Keyword-Based Membership Inference Attack
- AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
- AgentA/B: Automated and Scalable Web A/B Testing with Interactive LLM Agents
- StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant
- LESS: Large Language Model Enhanced Semi-Supervised Learning for Speech Foundational Models Using in-the-wild Data
- Benchmarking Debiasing Methods for LLM-based Parameter Estimates
- From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring
- A Rigorous Evaluation of LLM Data Generation Strategies for Low-Resource Languages
- IGD: Token Decisiveness Modeling via Information Gain in LLMs for Personalized Recommendation
- The Impact of Automatic Speech Transcription on Speaker Attribution
- Translationese-index: Using Likelihood Ratios for Graded and Generalizable Measurement of Translationese
- Are Vision-Language Models Safe in the Wild? A Meme-Based Benchmark Study
- MuseScorer: Idea Originality Scoring At Scale
- CLEAR: A Clinically-Grounded Tabular Framework for Radiology Report Evaluation
- Measuring Lexical Diversity of Synthetic Data Generated through Fine-Grained Persona Prompting
- AmpleHate: Amplifying the Attention for Versatile Implicit Hate Detection
- SEMMA: A Semantic Aware Knowledge Graph Foundation Model
- Calibrating LLM Confidence by Probing Perturbed Representation Stability
- MEDAL: A Framework for Benchmarking LLMs as Multilingual Open-Domain Dialogue Evaluators
- Cross-Attention Speculative Decoding
- Emergent Abilities of Large Language Models under Continued Pretraining for Language Adaptation
- reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs
- MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling
- Personalized Language Models via Privacy-Preserving Evolutionary Model Merging
- A Benchmark for End-to-End Zero-Shot Biomedical Relation Extraction with LLMs: Experiments with OpenAI Models
- UXAgent: A System for Simulating Usability Testing of Web Design with LLM Agents
- Natural Fingerprints of Large Language Models
- Enhancing LLM Language Adaption through Cross-lingual In-Context Pre-training
- Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning
- The Effect of Language Diversity When Fine-Tuning Large Language Models for Translation
- Are LLMs Better Formalizers than Solvers on Complex Problems?
- MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language
- Creative Preference Optimization
- Efficient Real-time Refinement of Language Model Text Generation
- A Layered Multi-Expert Framework for Long-Context Mental Health Assessments
- Bias Beware: The Impact of Cognitive Biases on LLM-Driven Product Recommendations
- Adaptive Self-improvement LLM Agentic System for ML Library Development
- Where Fact Ends and Fairness Begins: Redefining AI Bias Evaluation through Cognitive Biases
- FSLI: An Interpretable Formal Semantic System for One-Dimensional Ordering Inference
- Sparsity May Be All You Need: Sparse Random Parameter Adaptation
- Harnessing Multiple Large Language Models: A Survey on LLM Ensemble
- KatFishNet: Detecting LLM-Generated Korean Text through Linguistic Feature Analysis
- DP-GTR: Differentially Private Prompt Protection via Group Text Rewriting
- Video2Roleplay: A Multimodal Dataset and Framework for Video-Guided Role-playing Agents
- ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding
- M-PACE: Mother Child Framework for Multimodal Compliance
- Beyond Words: Enhancing Desire, Emotion, and Sentiment Recognition with Non-Verbal Cues
- Direct Simultaneous Translation Activation for Large Audio-Language Models
- Automatic Lexical Simplification for Turkish
- BBScoreV2: Learning Time-Evolution and Latent Alignment from Stochastic Representation
- Database-Augmented Query Representation for Information Retrieval
- The Great AI Witch Hunt: Reviewers Perception and (Mis)Conception of Generative AI in Research Writing
- ConfReady: A RAG based Assistant and Dataset for Conference Checklist Responses
- DynamicNER: A Dynamic, Multilingual, and Fine-Grained Dataset for LLM-based Named Entity Recognition
- Disentangling Latent Shifts of In-Context Learning with Weak Supervision
- REFER: Mitigating Bias in Opinion Summarisation via Frequency Framed Prompting
- Can LLMs Judge Debates? Evaluating Non-Linear Reasoning via Argumentation Theory Semantics
- UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression
- RAVE: Retrieval and Scoring Aware Verifiable Claim Detection
- The Curious Case of Visual Grounding: Different Effects for Speech- and Text-based Language Encoders
- Multi-Physics: A Comprehensive Benchmark for Multimodal LLMs Reasoning on Chinese Multi-Subject Physics Problems
- The Psychology of Falsehood: A Human-Centric Survey of Misinformation Detection
- DiEP: Adaptive Mixture-of-Experts Compression through Differentiable Expert Pruning
- It Depends: Resolving Referential Ambiguity in Minimal Contexts with Commonsense Knowledge
- CodeRAG: Finding Relevant and Necessary Knowledge for Retrieval-Augmented Repository-Level Code Completion
- Learning Analytics from Spoken Discussion Dialogs in Flipped Classroom
- LLM Cache Bandit Revisited: Addressing Query Heterogeneity for Cost-Effective LLM Inference
- A method for improving multilingual quality and diversity of instruction fine-tuning datasets
- DNA-DetectLLM: Unveiling AI-Generated Text via a DNA-Inspired Mutation-Repair Paradigm
- How important is language for human-like intelligence?
- Chunk Based Speech Pre-training with High Resolution Finite Scalar Quantization
- SciEvent: Benchmarking Multi-domain Scientific Event Extraction
- Multilingual LLM Prompting Strategies for Medical English-Vietnamese Machine Translation
- Layer-wise Minimal Pair Probing Reveals Contextual Grammatical-Conceptual Hierarchy in Speech Representations
- VOX-KRIKRI: Unifying Speech and Language through Continuous Fusion
- Fine-Tuning Large Multimodal Models for Automatic Pronunciation Assessment
- Comparative Analysis of Tokenization Algorithms for Low-Resource Language Dzongkha
- Toxicity Red-Teaming: Benchmarking LLM Safety in Singapore's Low-Resource Languages
- PolBiX: Detecting LLMs' Political Bias in Fact-Checking through X-phemisms
- Quantifying Self-Awareness of Knowledge in Large Language Models
- Real, Fake, or Manipulated? Detecting Machine-Influenced Text
- Speech Language Models for Under-Represented Languages: Insights from Wolof
- Frustratingly Easy Data Augmentation for Low-Resource ASR
- BiRQ: Bi-Level Self-Labeling Random Quantization for Self-Supervised Speech Recognition
- Evaluating Multimodal Large Language Models on Spoken Sarcasm Understanding
- Red Teaming Multimodal Language Models: Evaluating Harm Across Prompt Modalities and Models
- No Black Box Anymore: Demystifying Clinical Predictive Modeling with Temporal-Feature Cross Attention Mechanism
- SPACE: SPike-Aware Consistency Enhancement for Test-Time Adaptation in Spiking Neural Networks
- Compound Fault Diagnosis for Train Transmission Systems Using Deep Learning with Fourier-enhanced Representation
- Revealing Human Internal Attention Patterns from Gameplay Analysis for Reinforcement Learning
- TSCAN: Context-Aware Uplift Modeling via Two-Stage Training for Online Merchant Business Diagnosis
- Gaussian process policy iteration with additive Schwarz acceleration for forward and inverse HJB and mean field game problems
- ConCISE: Confidence-guided Compression in Step-by-step Efficient Reasoning
- Schreier-Coset Graph Propagation
- Context parroting: A simple but tough-to-beat baseline for foundation models in scientific machine learning
- Automating Versatile Time-Series Analysis with Tiny Transformers on Embedded FPGAs
- A Survey of Large Language Models for Data Challenges in Graphs
- Deep Learning Foundation and Pattern Models: Challenges in Hydrological Time Series
- Bayesian Concept Bottleneck Models with LLM Priors
- A Data-Driven Review of Remote Sensing-Based Data Fusion in Precision Agriculture from Foundational to Transformer-Based Techniques
- Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
- Entropy-Regularized Process Reward Model
- Domain-invariant feature learning in brain MR imaging for content-based image retrieval
- Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
- Analog In-memory Training on General Non-ideal Resistive Elements: The Impact of Response Functions
- Neural Networks for Learnable and Scalable Influence Estimation of Instruction Fine-Tuning Data
- Cache-of-Thought: Master-Apprentice Framework for Cost-Effective Vision Language Model Reasoning
- GIN-Graph: A Generative Interpretation Network for Model-Level Explanation of Graph Neural Networks
- StFT: Spatio-temporal Fourier Transformer for Long-term Dynamics Prediction
- Improving the forecast accuracy of wind power by leveraging multiple hierarchical structure
- Spatio-Temporal Anomaly Detection with Graph Networks for Data Quality Monitoring of the Hadron Calorimeter
- Negotiated Representations to Prevent Overfitting in Machine Learning Applications
- Estimating Model Performance Under Covariate Shift Without Labels
- A Unified Theory of Exact Inference and Learning in Exponential Family Latent Variable Models
- Towards Communication-efficient Federated Learning via Sparse and Aligned Adaptive Optimization
- Two Is Better Than One: Aligned Representation Pairs for Anomaly Detection
- Modeling Temporal Dependencies within the Target for Long-Term Time Series Forecasting
- FRIDA: Free-Rider Detection using Privacy Attacks
- A noise-corrected Langevin algorithm and sampling by half-denoising
- Localmax dynamics for attention in transformers and its asymptotic behavior
- VoXtream: Full-Stream Text-to-Speech with Extremely Low Latency
- Model-free algorithms for fast node clustering in SBM type graphs and application to social role inference in animals
- Quantum Enhanced Anomaly Detection for ADS-B Data using Hybrid Deep Learning
- Quantum Reinforcement Learning with Dynamic-Circuit Qubit Reuse and Grover-Based Trajectory Optimization
- What is a good matching of probability measures? A counterfactual lens on transport maps
- PRISM: Probabilistic and Robust Inverse Solver with Measurement-Conditioned Diffusion Prior for Blind Inverse Problems
- When Bugs Linger: A Study of Anomalous Resolution Time Outliers and Their Themes
- Query-Efficient Locally Private Hypothesis Selection via the Scheffe Graph
- Quantum Generative Adversarial Autoencoders: Learning latent representations for quantum data generation
- MatchFixAgent: Language-Agnostic Autonomous Repository-Level Code Translation Validation and Repair
- MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer
- Interpretable Network-assisted Random Forest+
- Concept Unlearning in Large Language Models via Self-Constructed Knowledge Triplets
- Sparse-Autoencoder-Guided Internal Representation Unlearning for Large Language Models
- ORIC: Benchmarking Object Recognition in Incongruous Context for Large Vision-Language Models
- Triplet Loss Based Quantum Encoding for Class Separability
- Impact of Single Rotations and Entanglement Topologies in Quantum Neural Networks
- Training Variational Quantum Circuits Using Particle Swarm Optimization
- UPRPRC: Unified Pipeline for Reproducing Parallel Resources -- Corpus from the United Nations
- Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities
- Optimizing Product Deduplication in E-Commerce with Multimodal Embeddings
- A Flow-rate-conserving CNN-based Domain Decomposition Method for Blood Flow Simulations
- Beyond the Score: Uncertainty-Calibrated LLMs for Automated Essay Assessment
- The Missing Piece: A Case for Pre-Training in 3D Medical Object Detection
- Subject Matter Expertise vs Professional Management in Collective Sequential Decision Making
- Copycat vs. Original: Multi-modal Pretraining and Variable Importance in Box-office Prediction
- Training thermodynamic computers by gradient descent
- MaskAttn-SDXL: Controllable Region-Level Text-To-Image Generation
- Exploring Fine-Tuning of Large Audio Language Models for Spoken Language Understanding under Limited Speech data
- Quantifying Uncertainty in Natural Language Explanations of Large Language Models for Question Answering
- Neural Architecture Search Algorithms for Quantum Autoencoders
- Breathing and Semantic Pause Detection and Exertion-Level Classification in Post-Exercise Speech
- (SP)$^2$-Net: A Neural Spatial Spectrum Method for DOA Estimation
- Geometric Integration for Neural Control Variates
- Hybrid Deep Learning-Federated Learning Powered Intrusion Detection System for IoT/5G Advanced Edge Computing Network
- SETrLUSI: Stochastic Ensemble Multi-Source Transfer Learning Using Statistical Invariant
- Rethinking Molecule Synthesizability with Chain-of-Reaction
- Randomized Smoothing Meets Vision-Language Models
- Personalized Federated Learning with Heat-Kernel Enhanced Tensorized Multi-View Clustering
- Dynamic Classifier-Free Diffusion Guidance via Online Feedback
- Spatio-temporal, multi-field deep learning of shock propagation in meso-structured media
- Automated Cyber Defense with Generalizable Graph-based Reinforcement Learning Agents
- DIVEBATCH: Accelerating Model Training Through Gradient-Diversity Aware Batch Size Adaptation
- Latent learning: episodic memory complements parametric learning by enabling flexible reuse of experiences
- Inverting Trojans in LLMs
- Deep Gaussian Process-based Cost-Aware Batch Bayesian Optimization for Complex Materials Design Campaigns
- Kernel Model Validation: How To Do It, And Why You Should Care
- RespoDiff: Dual-Module Bottleneck Transformation for Responsible & Faithful T2I Generation
- SAGE: Semantic-Aware Shared Sampling for Efficient Diffusion
- Improving Monte Carlo Tree Search for Symbolic Regression
- Bayesian Physics Informed Neural Networks for Reliable Transformer Prognostics
- UniTac2Pose: A Unified Approach Learned in Simulation for Category-level Visuotactile In-hand Pose Estimation
- Targeted Fine-Tuning of DNN-Based Receivers via Influence Functions
- Adversarial Graph Fusion for Incomplete Multi-view Semi-supervised Learning with Tensorial Imputation
- Inverse Optimization Latent Variable Models for Learning Costs Applied to Route Problems
- Predicting the descent into extremism and terrorism
- Time-adaptive SympNets for separable Hamiltonian systems
- Automated Constitutive Model Discovery by Pairing Sparse Regression Algorithms with Model Selection Criteria
- SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection
- MTS-DMAE: Dual-Masked Autoencoder for Unsupervised Multivariate Time Series Representation Learning
- Incremental Multistep Forecasting of Battery Degradation Using Pseudo Targets
- Learning to Optimize Capacity Planning in Semiconductor Manufacturing
- Generalization and Optimization of SGD with Lookahead
- ThermalGuardian: Temperature-Aware Testing of Automotive Deep Learning Frameworks
- On the Convergence of Muon and Beyond
- SolarCrossFormer: Improving day-ahead Solar Irradiance Forecasting by Integrating Satellite Imagery and Ground Sensors
- HyP-ASO: A Hybrid Policy-based Adaptive Search Optimization Framework for Large-Scale Integer Linear Programs
- Tsururu: A Python-based Time Series Forecasting Strategies Library
- FedHK-MVFC: Federated Heat Kernel Multi-View Clustering
- Efficient Long-Tail Learning in Latent Space by sampling Synthetic Data
- ToFU: Transforming How Federated Learning Systems Forget User Data
- Universal Learning of Stochastic Dynamics for Exact Belief Propagation using Bernstein Normalizing Flows
- Nonconvex Decentralized Stochastic Bilevel Optimization under Heavy-Tailed Noises
- PolyJuice Makes It Real: Black-Box, Universal Red Teaming for Synthetic Image Detectors
- The Multi-Query Paradox in Zeroth-Order Optimization
- Small LLMs with Expert Blocks Are Good Enough for Hyperparameter Tuning
- How many classes do we need to see for novel class discovery?
- Personalized Prediction By Learning Halfspace Reference Classes Under Well-Behaved Distribution
- Efficient Extractive Text Summarization for Online News Articles Using Machine Learning
- Nonconvex Regularization for Feature Selection in Reinforcement Learning
- RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation
- EigenTrack: Spectral Activation Feature Tracking for Hallucination and Out-of-Distribution Detection in LLMs and VLMs
- Aircraft Fuel Flow Modelling with Ageing Effects: From Parametric Corrections to Neural Networks
- GUI-ReWalk: Massive Data Generation for GUI Agent via Stochastic Exploration and Intent-Aware Reasoning
- Top-$k$ Feature Importance Ranking
- Random Matrix Theory-guided sparse PCA for single-cell RNA-seq data
- Computing Linear Regions in Neural Networks with Skip Connections
- IMPQ: Interaction-Aware Layerwise Mixed Precision Quantization for LLMs
- Temporal Reasoning with Large Language Models Augmented by Evolving Knowledge Graphs
- Solar Forecasting with Causality: A Graph-Transformer Approach to Spatiotemporal Dependencies
- FRAUDGUESS: Spotting and Explaining New Types of Fraud in Million-Scale Financial Data
- Detail Across Scales: Multi-Scale Enhancement for Full Spectrum Neural Representations
- Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers
- Policy Gradient Optimization for Bayesian-Risk MDPs with General Convex Losses
- KoopCast: Trajectory Forecasting via Koopman Operators
- Fully Decentralized Cooperative Multi-Agent Reinforcement Learning is A Context Modeling Problem
- A Weak Supervision Approach for Monitoring Recreational Drug Use Effects in Social Media
- Fleming-R1: Toward Expert-Level Medical Reasoning via Reinforcement Learning
- Hybrid unary-binary design for multiplier-less printed Machine Learning classifiers
- Kuramoto Orientation Diffusion Models
- Global Pre-fixing, Local Adjusting: A Simple yet Effective Contrastive Strategy for Continual Learning
- Probabilistic Conformal Coverage Guarantees in Small-Data Settings
- Predicting Language Models' Success at Zero-Shot Probabilistic Prediction
- Stochastic Sample Approximations of (Local) Moduli of Continuity
- Adversarial generalization of unfolding (model-based) networks
- Learning in Stackelberg Mean Field Games: A Non-Asymptotic Analysis
- VMDNet: Time Series Forecasting with Leakage-Free Samplewise Variational Mode Decomposition and Multibranch Decoding
- Adaptive Algorithms with Sharp Convergence Rates for Stochastic Hierarchical Optimization
- Enhancing Interpretability in Deep Reinforcement Learning through Semantic Clustering
- Dynamic Policy Fusion for User Alignment Without Re-Interaction
- FLARE: Faithful Logic-Aided Reasoning and Exploration
- Watson: A Cognitive Observability Framework for the Reasoning of LLM-Powered Agents
- Decomposing Interventional Causality into Synergistic, Redundant, and Unique Components
- Exploring the Impact of Personality Traits on LLM Bias and Toxicity
- Activation Space Interventions Can Be Transferred Between Large Language Models
- DebFlow: Automating Agent Creation via Agent Debate
- Towards deployment-centric multimodal AI beyond vision and language
- Robust Vision-Language Models via Tensor Decomposition: A Defense Against Adversarial Attacks
- Fast OTSU Thresholding Using Bisection Method
- Accelerating Atomic Fine Structure Determination with Graph Reinforcement Learning
- CultureScope: A Dimensional Lens for Probing Cultural Understanding in LLMs
- FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation
- RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation
- Action is the primary key: a categorical framework for episodic memories and logical reasoning
- Communications to Circulations: 3D Wind Field Retrieval and Real-Time Prediction Using 5G GNSS Signals and Deep Learning
- See&Trek: Training-Free Spatial Prompting for Multimodal Large Language Model
- Beyond Pointwise Scores: Decomposed Criteria-Based Evaluation of LLM Responses
- DiffusionNFT: Online Diffusion Reinforcement with Forward Process
- Network-Based Detection of Autism Spectrum Disorder Using Sustainable and Non-invasive Salivary Biomarkers
- MoE-CE: Enhancing Generalization for Deep Learning based Channel Estimation via a Mixture-of-Experts Framework
- RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation
- BEFT: Bias-Efficient Fine-Tuning of Language Models
- Shedding Light on Depth: Explainability Assessment in Monocular Depth Estimation
- Uncertainty-Based Smooth Policy Regularisation for Reinforcement Learning with Few Demonstrations
- EmoHeal: An End-to-End System for Personalized Therapeutic Music Retrieval from Fine-grained Emotions
- Towards Sharper Object Boundaries in Self-Supervised Depth Estimation
- Fed-PISA: Federated Voice Cloning via Personalized Identity-Style Adaptation
- AI Methods for Permutation Circuit Synthesis Across Generic Topologies
- Session-Level Spoken Language Assessment with a Multimodal Foundation Model via Multi-Target Learning
- Think, Verbalize, then Speak: Bridging Complex Thoughts and Comprehensible Speech
- Compose by Focus: Scene Graph-based Atomic Skills
- Distribution-Aligned Decoding for Efficient LLM Task Adaptation
- MoAngelo: Motion-Aware Neural Surface Reconstruction for Dynamic Scenes
- From Data to Diagnosis: A Large, Comprehensive Bone Marrow Dataset and AI Methods for Childhood Leukemia Prediction
- Re-FRAME the Meeting Summarization SCOPE: Fact-Based Summarization and Personalization via Questions
- An Equivariant Graph Network for Interpretable Nanoporous Materials Design
- Foundation Models as World Models: A Foundational Study in Text-Based GridWorlds
- Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search
- The Alignment Bottleneck
- A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning
- ArchesClimate: Probabilistic Decadal Ensemble Generation With Flow Matching
- Compose Yourself: Average-Velocity Flow Matching for One-Step Speech Enhancement
- Explainable AI for Maritime Autonomous Surface Ships (MASS): Adaptive Interfaces and Trustworthy Human-AI Collaboration
- CBPNet: A Continual Backpropagation Prompt Network for Alleviating Plasticity Loss on Edge Devices
- Monte Carlo Tree Diffusion with Multiple Experts for Protein Design
- Hierarchical Reinforcement Learning with Low-Level MPC for Multi-Agent Control
- ChronoForge-RL: Chronological Forging through Reinforcement Learning for Enhanced Video Understanding
- CIDER: A Causal Cure for Brand-Obsessed Text-to-Image Models
- Instance Generation for Meta-Black-Box Optimization through Latent Space Reverse Engineering
- Best-of-L: Cross-Lingual Reward Modeling for Mathematical Reasoning
- Diversity of Structured Domains via k-Kemeny Scores
- EvoBrain: Dynamic Multi-channel EEG Graph Modeling for Time-evolving Brain Network
- DeepMech: A Machine Learning Framework for Chemical Reaction Mechanism Prediction
- Self-Supervised Cross-Modal Learning for Image-to-Point Cloud Registration
- RACap: Relation-Aware Prompting for Lightweight Retrieval-Augmented Image Captioning
- Chunk Knowledge Generation Model for Enhanced Information Retrieval: A Multi-task Learning Approach
- SightSound-R1: Cross-Modal Reasoning Distillation from Vision to Audio Language Models
- TISDiSS: A Training-Time and Inference-Time Scalable Framework for Discriminative Source Separation
- Inference Offloading for Cost-Sensitive Binary Classification at the Edge
- KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning
- Saccadic Vision for Fine-Grained Visual Classification
- SGMAGNet: A Baseline Model for 3D Cloud Phase Structure Reconstruction on a New Passive Active Satellite Benchmark
- Once Upon a Time: Interactive Learning for Storytelling with Small Language Models
- GP3: A 3D Geometry-Aware Policy with Multi-View Images for Robotic Manipulation
- FloorSAM: SAM-Guided Floorplan Reconstruction with Semantic-Geometric Fusion
- On Optimal Steering to Achieve Exact Fairness
- Ideal Registration? Segmentation is All You Need
- BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent
- LiteLong: Resource-Efficient Long-Context Data Synthesis for LLMs
- Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection
- Towards Size-invariant Salient Object Detection: A Generic Evaluation and Optimization Approach
- Relevance to Utility: Process-Supervised Rewrite for RAG
- Multimodal Learning for Fake News Detection in Short Videos Using Linguistically Verified Data and Heterogeneous Modality Fusion
- Momentum-constrained Hybrid Heuristic Trajectory Optimization Framework with Residual-enhanced DRL for Visually Impaired Scenarios
- DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models
- CFDA & CLIP at TREC iKAT 2025: Enhancing Personalized Conversational Search via Query Reformulation and Rank Fusion
- Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification
- Information Geometry of Variational Bayes
- Toward Efficient Influence Function: Dropout as a Compression Tool
- Incorporating Visual Cortical Lateral Connection Properties into CNN: Recurrent Activation and Excitatory-Inhibitory Separation
- Self-supervised learning of imaging and clinical signatures using a multimodal joint-embedding predictive architecture
- Comparing Computational Pathology Foundation Models using Representational Similarity Analysis
- mucAI at BAREC Shared Task 2025: Towards Uncertainty Aware Arabic Readability Assessment
- SmolRGPT: Efficient Spatial Reasoning for Warehouse Environments with 600M Parameters
- The (Short-Term) Effects of Large Language Models on Unemployment and Earnings
- How do Language Models Generate Slang: A Systematic Comparison between Human and Machine-Generated Slang Usages
- GUI-ARP: Enhancing Grounding with Adaptive Region Perception for GUI Agents
- Diffusion-Based Cross-Modal Feature Extraction for Multi-Label Classification
- Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining
- Reward Hacking Mitigation using Verifiable Composite Rewards
- Generating Part-Based Global Explanations Via Correspondence
- Exploring multimodal implicit behavior learning for vehicle navigation in simulated cities
- Deep learning and abstractive summarisation for radiological reports: an empirical study for adapting the PEGASUS models' family with scarce data
- ORCA: Agentic Reasoning For Hallucination and Adversarial Robustness in Vision-Language Models
- Region-Aware Deformable Convolutions
- Impact of Phonetics on Speaker Identity in Adversarial Voice Attack
- Dual-Mode Visual System for Brain-Computer Interfaces: Integrating SSVEP and P300 Responses
- Where Do I 'Add the Egg'?: Exploring Agency and Ownership in AI Creative Co-Writing Systems
- Implicit Kinodynamic Motion Retargeting for Human-to-humanoid Imitation Learning
- PILOT: Steering Synthetic Data Generation with Psychological & Linguistic Output Targeting
- Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems
- CAGE: Continuity-Aware edGE Network Unlocks Robust Floorplan Reconstruction
- Generative AI Meets Wireless Sensing: Towards Wireless Foundation Model
- IEFS-GMB: Gradient Memory Bank-Guided Feature Selection Based on Information Entropy for EEG Classification of Neurological Disorders
- Autoguided Online Data Curation for Diffusion Model Training
- Modeling Transformers as complex networks to analyze learning dynamics
- PRISM: Phase-enhanced Radial-based Image Signature Mapping framework for fingerprinting AI-generated images
- Large Vision Models Can Solve Mental Rotation Problems
- Partial Column Generation with Graph Neural Networks for Team Formation and Routing
- Evaluating the Limitations of Local LLMs in Solving Complex Programming Challenges
- Collective Voice: Recovered-Peer Support Mediated by An LLM-Based Chatbot for Eating Disorder Recovery
- Emulating Human-like Adaptive Vision for Efficient and Flexible Machine Visual Perception
- Beyond Spurious Signals: Debiasing Multimodal Large Language Models via Counterfactual Inference and Adaptive Expert Routing
- Efficient and Versatile Model for Multilingual Information Retrieval of Islamic Text: Development and Deployment in Real-World Scenarios
- EHR-MCP: Real-world Evaluation of Clinical Information Retrieval by Large Language Models via Model Context Protocol
- Structured Information for Improving Spatial Relationships in Text-to-Image Generation
- Attention Schema-based Attention Control (ASAC): A Cognitive-Inspired Approach for Attention Management in Transformers
- Pre-Forgettable Models: Prompt Learning as a Native Mechanism for Unlearning
- ChannelFlow-Tools: A Standardized Dataset Creation Pipeline for 3D Obstructed Channel Flows
- Generating Plans for Belief-Desire-Intention (BDI) Agents Using Alternating-Time Temporal Logic (ATL)
- GenCAD-3D: CAD Program Generation using Multimodal Latent Space Alignment and Synthetic Dataset Balancing
- Synthetic bootstrapped pretraining
- Causal Reasoning Elicits Controllable 3D Scene Generation
- Walk and Read Less: Improving the Efficiency of Vision-and-Language Navigation via Tuning-Free Multimodal Token Pruning
- Emotion-Aware Speech Generation with Character-Specific Voices for Comics
- A Multi-Scale Graph Neural Process with Cross-Drug Co-Attention for Drug-Drug Interactions Prediction
- The Distribution Shift Problem in Transportation Networks using Reinforcement Learning and AI
- An Artificial Intelligence Driven Semantic Similarity-Based Pipeline for Rapid Literature
- Knowledge-Driven Hallucination in Large Language Models: An Empirical Study on Process Modeling
- Diagnostics of cognitive failures in multi-agent expert systems using dynamic evaluation protocols and subsequent mutation of the processing context
- FragmentRetro: A Quadratic Retrosynthetic Method Based on Fragmentation Algorithms
- Stress Testing Deliberative Alignment for Anti-Scheming Training
- MicroRCA-Agent: Microservice Root Cause Analysis Method Based on Large Language Model Agents
- CCrepairBench: A High-Fidelity Benchmark and Reinforcement Learning Framework for C++ Compilation Repair
- A Nascent Taxonomy of Machine Learning in Intelligent Robotic Process Automation
- Ontology Creation and Management Tools: the Case of Anatomical Connectivity
- Building Data-Driven Occupation Taxonomies: A Bottom-Up Multi-Stage Approach via Semantic Clustering and Multi-Agent Collaboration
- A Comparative Study of Rule-Based and Data-Driven Approaches in Industrial Monitoring
- MICA: Multi-Agent Industrial Coordination Assistant
- KNARsack: Teaching Neural Algorithmic Reasoners to Solve Pseudo-Polynomial Problems
Research Sources: 528 | Generated: 9/22/2025