AI RESEARCH PAPERS & ACADEMIC SOURCES
- A CrossMod-Transformer deep learning framework for multi-modal pain detection through EDA and ECG fusion
- Prompt-based bioinformatics: a new interface for multi-omics analysis
- AI-driven fusion of multimodal data for Alzheimer’s disease biomarker assessment
- AI-assisted cervical cytology precancerous screening for high-risk population in resource-limited regions using a compact microscope
- A practical framework for appropriate implementation and review of artificial intelligence (FAIR-AI) in healthcare
- Deep learning-based prediction of the selection factors for quantifying selection in immune receptor repertoires
- Decomposing Global AUC into Cluster-Level Contributions for Localized Model Diagnostics
- A Meta-Learning Method for Estimation of Causal Excursion Effects to Assess Time-Varying Moderation
- Stability and performance guarantees for misspecified multivariate score-driven filters
- Physics-Informed Generative Modeling of Wireless Channels
- Semantic Mapping in Indoor Embodied AI -- A Survey on Advances, Challenges, and Future Directions
- DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control
- OceanSim: A GPU-Accelerated Underwater Robot Perception Simulation Framework
- MeshPad: Interactive Sketch-Conditioned Artist-Reminiscent Mesh Generation and Editing
- Dual-domain Modulation Network for Lightweight Image Super-Resolution
- Evaluating structural uncertainty in accelerated MRI: are voxelwise measures useful surrogates?
- UniCalib: Targetless LiDAR-Camera Calibration via Probabilistic Flow on Unified Depth Representations
- Retuve: Automated Multi-Modality Analysis of Hip Dysplasia with Open Source AI
- SOPHY: Learning to Generate Simulation-Ready Objects with Physical Materials
- Is Single-View Mesh Reconstruction Ready for Robotics?
- Efficient RAW Image Deblurring with Adaptive Frequency Modulation
- Maximum Dispersion, Maximum Concentration: Enhancing the Quality of MOP Solutions
- Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos
- A Steel Surface Defect Detection Method Based on Lightweight Convolution Optimization
- Learned Regularization for Microwave Tomography
- ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks
- On Representation Learning with Feedback
- EA-KD: Entropy-based Adaptive Knowledge Distillation
- Scene Summarization: Clustering Scene Videos into Spatially Diverse Frames
- BonnBeetClouds3D: A Dataset Towards Point Cloud-based Organ-level Phenotyping of Sugar Beet Plants under Field Conditions
- Towards Customized Knowledge Distillation for Chip-Level Dense Image Predictions
- Compact and De-biased Negative Instance Embedding for Multi-Instance Learning on Whole-Slide Image Classification
- Spotter+GPT: Turning Sign Spottings into Sentences with LLMs
- Goldilocks Test Sets for Face Verification
- InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows
- Learning Multi-view Anomaly Detection with Efficient Adaptive Selection
- FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance
- Alignment-free Raw Video Demoireing
- Prompt-Softbox-Prompt: A Free-Text Embedding Control for Image Editing
- Ethical Challenges in Computer Vision: Ensuring Privacy and Mitigating Bias in Publicly Available Datasets
- PainDiffusion: Learning to Express Pain
- SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with Mamba
- Multimodal Deception in Explainable AI: Concept-Level Backdoor Attacks on Concept Bottleneck Models
- SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity
- Flow Matching Posterior Sampling: A Training-free Conditional Generation for Flow Matching
- Solving Zero-Shot 3D Visual Grounding as Constraint Satisfaction Problems
- Quadratic Gaussian Splatting: High Quality Surface Reconstruction with Second-order Geometric Primitives
- VideoSAVi: Self-Aligned Video Language Models without Human Supervision
- DuoCast: Duo-Probabilistic Diffusion for Precipitation Nowcasting
- BadPatch: Diffusion-Based Generation of Physical Adversarial Patches
- Street Gaussians without 3D Object Tracker
- Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction
- TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
- Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models
- DAViD: Modeling Dynamic Affordance of 3D Objects Using Pre-trained Video Diffusion Models
- DWTNeRF: Boosting Few-shot Neural Radiance Fields via Discrete Wavelet Transform
- LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer
- Confidence-Based Annotation Of Brain Tumours In Ultrasound
- GaussianFlowOcc: Sparse and Weakly Supervised Occupancy Estimation using Gaussian Splatting and Temporal Flow
- X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation
- TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification
- MambaFlow: A Mamba-Centric Architecture for End-to-End Optical Flow Estimation
- Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
- From Limited Labels to Open Domains: An Efficient Learning Method for Drone-view Geo-Localization
- ROODI: Reconstructing Occluded Objects with Denoising Inpainters
- VFM-UDA++: Improving Network Architectures and Data Strategies for Unsupervised Domain Adaptive Semantic Segmentation
- AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction
- Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks
- Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models
- GAITGen: Disentangled Motion-Pathology Impaired Gait Generative Model -- Bringing Motion Generation to the Clinical Domain
- Leveraging Sparse Annotations for Leukemia Diagnosis on the Large Leukemia Dataset
- Hierarchical Relation-augmented Representation Generalization for Few-shot Action Recognition
- Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
- Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detection
- DamageCAT: A Deep Learning Transformer Framework for Typology-Based Post-Disaster Building Damage Categorization
- Interpreting the linear structure of vision-language model embedding spaces
- Just Say the Word: Annotation-Free Fine-Grained Object Counting
- Decoupled Global-Local Alignment for Improving Compositional Understanding
- CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback
- 3D Gaussian Splatting Data Compression with Mixture of Priors
- QuickSplat: Fast 3D Surface Reconstruction via Learned Gaussian Initialization
- Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language
- Unintended Bias in 2D+ Image Segmentation and Its Effect on Attention Asymmetry
- Toward Patient-specific Partial Point Cloud to Surface Completion for Pre- to Intra-operative Registration in Image-guided Liver Interventions
- NEXT: Multi-Grained Mixture of Experts via Text-Modulation for Multi-Modal Object Re-Identification
- EF-VI: Enhancing End-Frame Injection for Video Inbetweening
- Rhetorical Text-to-Image Generation via Two-layer Diffusion Policy Optimization
- HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image
- Video Signature: In-generation Watermarking for Latent Video Diffusion Models
- Zoom-Refine: Boosting High-Resolution Multimodal Understanding via Localized Zoom and Self-Refinement
- DanceChat: Large Language Model-Guided Music-to-Dance Generation
- Simple Radiology VLLM Test-time Scaling with Thought Graph Traversal
- 3DGS-IEval-15K: A Large-scale Image Quality Evaluation Database for 3D Gaussian-Splatting
- CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion
- Co-VisiON: Co-Visibility ReasONing on Sparse Image Sets of Indoor Scenes
- CLGRPO: Reasoning Ability Enhancement for Small VLMs
- UltraAD: Fine-Grained Ultrasound Anomaly Classification via Few-Shot CLIP Adaptation
- Temporal Rate Reduction Clustering for Human Motion Segmentation
- MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models
- PCLVis: Visual Analytics of Process Communication Latency in Large-Scale Simulation
- CPKD: Clinical Prior Knowledge-Constrained Diffusion Models for Surgical Phase Recognition in Endoscopic Submucosal Dissection
- When Trackers Date Fish: A Benchmark and Framework for Underwater Multiple Fish Tracking
- Multimodal Visual Transformer for Sim2real Transfer in Visual Reinforcement Learning
- LifelongPR: Lifelong point cloud place recognition based on sample replay and prompt learning
- ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding
- Efficient Face Image Quality Assessment via Self-training and Knowledge Distillation
- FROSS: Faster-than-Real-Time Online 3D Semantic Scene Graph Generation from RGB-D Images
- Less is More: Skim Transformer for Light Field Image Super-resolution
- A nonlinear elasticity model in computer vision
- Rethinking Theoretical Illumination for Efficient Low-Light Image Enhancement
- A Plug-and-Play Method for Guided Multi-contrast MRI Reconstruction based on Content/Style Modeling
- CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
- Comparison Reveals Commonality: Customized Image Generation through Contrastive Inversion
- Correspondence as Video: Test-Time Adaption on SAM2 for Reference Segmentation in the Wild
- Dream4D: Lifting Camera-Controlled I2V towards Spatiotemporally Consistent 4D Generation
- Prototype-Guided Curriculum Learning for Zero-Shot Learning
- Forecasting Continuous Non-Conservative Dynamical Systems in SO(3)
- GaitSnippet: Gait Recognition Beyond Unordered Sets and Ordered Sequences
- Anatomy-Aware Low-Dose CT Denoising via Pretrained Vision Models and Semantic-Guided Contrastive Learning
- Boosting Active Defense Persistence: A Two-Stage Defense Framework Combining Interruption and Poisoning Against Deepfake
- Power Battery Detection
- MambaTrans: Multimodal Fusion Image Translation via Large Language Model Priors for Downstream Visual Tasks
- Pose-RFT: Enhancing MLLMs for 3D Pose Generation via Hybrid Action Reinforcement Fine-Tuning
- DiTVR: Zero-Shot Diffusion Transformer for Video Restoration
- Semi-supervised Multiscale Matching for SAR-Optical Image
- Segmenting and Understanding: Region-aware Semantic Attention for Fine-grained Image Quality Assessment with Large Language Models
- MIMIC: Multimodal Inversion for Model Interpretation and Conceptualization
- Effortless Vision-Language Model Specialization in Histopathology without Annotation
- CBDES MoE: Hierarchically Decoupled Mixture-of-Experts for Functional Modules in Autonomous Driving
- Morphological Analysis of Semiconductor Microstructures using Skeleton Graphs
- Tracking Any Point Methods for Markerless 3D Tissue Tracking in Endoscopic Stereo Images
- CATP: Contextually Adaptive Token Pruning for Efficient and Enhanced Multimodal In-Context Learning
- TAP: Parameter-efficient Task-Aware Prompting for Adverse Weather Removal
- Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation
- CTC Transcription Alignment of the Bullinger Letters: Automatic Improvement of Annotation Quality
- Generative Video Matting
- Mem4D: Decoupling Static and Dynamic Memory for Dynamic Scene Reconstruction
- RSVLM-QA: A Benchmark Dataset for Remote Sensing Vision Language Model-based Question Answering
- TAG: A Simple Yet Effective Temporal-Aware Approach for Zero-Shot Video Temporal Grounding
- VOIDFace: A Privacy-Preserving Multi-Network Face Recognition With Enhanced Security
- TrackOR: Towards Personalized Intelligent Operating Rooms Through Robust Tracking
- The Escalator Problem: Identifying Implicit Motion Blindness in AI for Accessibility
- Prompt-Guided Relational Reasoning for Social Behavior Understanding with Vision Foundation Models
- Sample-aware RandAugment: Search-free Automatic Data Augmentation for Effective Image Recognition
- Mitigating Biases in Surgical Operating Rooms with Geometry
- TRIDE: A Text-assisted Radar-Image weather-aware fusion network for Depth Estimation
- S^2VG: 3D Stereoscopic and Spatial Video Generation via Denoising Frame Matrix
- Information Bottleneck-based Causal Attention for Multi-label Medical Image Recognition
- ME-TST+: Micro-expression Analysis via Temporal State Transition with ROI Relationship Awareness
- Matrix-3D: Omnidirectional Explorable 3D World Generation
- 3D Plant Root Skeleton Detection and Extraction
- TBAC-UniImage: Unified Understanding and Generation by Ladder-Side Diffusion Tuning
- A Physics-Driven Neural Network with Parameter Embedding for Generating Quantitative MR Maps from Weighted Images
- Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control
- FantasyStyle: Controllable Stylized Distillation for 3D Gaussian Splatting
- Pindrop it! Audio and Visual Deepfake Countermeasures for Robust Detection and Fine-Grained Localization
- ReconDreamer-RL: Enhancing Reinforcement Learning via Diffusion-based Scene Reconstruction
- CD-TVD: Contrastive Diffusion for 3D Super-Resolution with Scarce High-Resolution Time-Varying Data
- 3D Human Mesh Estimation from Single View RGBD
- PP-Motion: Physical-Perceptual Fidelity Evaluation for Human Motion Generation
- THAT: Token-wise High-frequency Augmentation Transformer for Hyperspectral Pansharpening
- KARMA: Efficient Structural Defect Segmentation via Kolmogorov-Arnold Representation Learning
- Reinforcement Learning in Vision: A Survey
- Spatial-ORMLLM: Improve Spatial Relation Understanding in the Operating Room with Multimodal Large Language Model
- SAGOnline: Segment Any Gaussians Online
- Learning User Preferences for Image Generation Model
- StableAvatar: Infinite-Length Audio-Driven Avatar Video Generation
- ReferSplat: Referring Segmentation in 3D Gaussian Splatting
- Learning an Implicit Physics Model for Image-based Fluid Simulation
- Codebook-enabled Generative End-to-end Semantic Communication Powered by Transformer
- Digital generation of the 3-D pore architecture of isotropic membranes using 2-D cross-sectional scanning electron microscopy images
- Vibration-Based Energy Metric for Restoring Needle Alignment in Autonomous Robotic Ultrasound
- Fading the Digital Ink: A Universal Black-Box Attack Framework for 3DGS Watermarking Systems
- KLASSify to Verify: Audio-Visual Deepfake Detection Using SSL-based Audio and Handcrafted Visual Features
- Progressive Bird's Eye View Perception for Safety-Critical Autonomous Driving: A Comprehensive Survey
- MSPT: A Lightweight Face Image Quality Assessment Method with Multi-stage Progressive Training
- AD-AVSR: Asymmetric Dual-stream Enhancement for Robust Audio-Visual Speech Recognition
- Sea-Undistort: A Dataset for Through-Water Image Restoration in High Resolution Airborne Bathymetric Mapping
- IPBA: Imperceptible Perturbation Backdoor Attack in Federated Self-Supervised Learning
- Adaptive Cache Enhancement for Test-Time Adaptation of Vision-Language Models
- GAPNet: A Lightweight Framework for Image and Video Salient Object Detection via Granularity-Aware Paradigm
- Voice Pathology Detection Using Phonation
- From Prediction to Explanation: Multimodal, Explainable, and Interactive Deepfake Detection Framework for Non-Expert Users
- LaVieID: Local Autoregressive Diffusion Transformers for Identity-Preserving Video Creation
- X2Edit: Revisiting Arbitrary-Instruction Image Editing through Self-Constructed Data and Task-Aware Representation Learning
- An Iterative Reconstruction Method for Dental Cone-Beam Computed Tomography with a Truncated Field of View
- Enhancing Egocentric Object Detection in Static Environments using Graph-based Spatial Anomaly Detection and Correction
- A Trustworthy Method for Multimodal Emotion Recognition
- AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning
- LaRender: Training-Free Occlusion Control in Image Generation via Latent Rendering
- Collaborative Learning of Scattering and Deep Features for SAR Target Recognition with Noisy Labels
- Undress to Redress: A Training-Free Framework for Virtual Try-On
- DiffVC-OSD: One-Step Diffusion-based Perceptual Neural Video Compression Framework
- Make Your MoVe: Make Your 3D Contents by Adapting Multi-View Diffusion Models to External Editing
- Multi-view Normal and Distance Guidance Gaussian Splatting for Surface Reconstruction
- A Registration-Based Star-Shape Segmentation Model and Fast Algorithms
- Enhancing Small-Scale Dataset Expansion with Triplet-Connection-based Sample Re-Weighting
- Grouped Speculative Decoding for Autoregressive Image Generation
- Med-GRIM: Enhanced Zero-Shot Medical VQA using prompt-embedded Multimodal Graph RAG
- DiTalker: A Unified DiT-based Framework for High-Quality and Speaking Styles Controllable Portrait Animation
- BigTokDetect: A Clinically-Informed Vision-Language Model Framework for Detecting Pro-Bigorexia Videos on TikTok
- Frequency Prior Guided Matching: A Data Augmentation Approach for Generalizable Semi-Supervised Polyp Segmentation
- Large Language Models Facilitate Vision Reflection in Image Classification
- Benchmarking Deep Learning-Based Object Detection Models on Feature Deficient Astrophotography Imagery Dataset
- MILD: Multi-Layer Diffusion Strategy for Complex and Precise Multi-IP Aware Human Erasing
- Statistical Confidence Rescoring for Robust 3D Scene Graph Generation from Multi-View Images
- Slice or the Whole Pie? Utility Control for AI Models
- Static and Plugged: Make Embodied Evaluation Simple
- StyleTailor: Towards Personalized Fashion Styling via Hierarchical Negative Feedback
- Grounding Emotion Recognition with Visual Prototypes: VEGA -- Revisiting CLIP in MERC
- ContextGuard-LVLM: Enhancing News Veracity through Fine-grained Cross-modal Contextual Consistency Verification
- VL-MedGuide: A Visual-Linguistic Large Model for Intelligent and Explainable Skin Disease Auxiliary Diagnosis
- CycleDiff: Cycle Diffusion Models for Unpaired Image-to-image Translation
- Rethinking Key-frame-based Micro-expression Recognition: A Robust and Accurate Framework Against Key-frame Errors
- Towards Robust Red-Green Watermarking for Autoregressive Image Generators
- Learning More by Seeing Less: Line Drawing Pretraining for Efficient, Transferable, and Human-Aligned Vision
- Fourier Optics and Deep Learning Methods for Fast 3D Reconstruction in Digital Holography
- Restage4D: Reanimating Deformable 3D Reconstruction from a Single Video
- VOccl3D: A Video Benchmark Dataset for 3D Human Pose and Shape Estimation under real Occlusions
- DiffUS: Differentiable Ultrasound Rendering from Volumetric Imaging
- Edge Detection for Organ Boundaries via Top Down Refinement and SubPixel Upsampling
- DualResolution Residual Architecture with Artifact Suppression for Melanocytic Lesion Segmentation
- VesselRW: Weakly Supervised Subcutaneous Vessel Segmentation via Learned Random Walk Propagation
- Low-Rank Expert Merging for Multi-Source Domain Adaptation in Person Re-Identification
- Hybrid Machine Learning Framework for Predicting Geometric Deviations from 3D Surface Metrology
- A Joint Sparse Self-Representation Learning Method for Multiview Clustering
- LWT-ARTERY-LABEL: A Lightweight Framework for Automated Coronary Artery Identification
- Fusion-Based Brain Tumor Classification Using Deep Learning and Explainable AI, and Rule-Based Reasoning
- eMotions: A Large-Scale Dataset and Audio-Visual Fusion Network for Emotion Analysis in Short-form Videos
- A Simple yet Powerful Instance-Aware Prompting Framework for Training-free Camouflaged Object Segmentation
- MultiRef: Controllable Image Generation with Multiple Visual References
- Talk2Image: A Multi-Agent System for Multi-Turn Image Generation and Editing
- AR-GRPO: Training Autoregressive Image Generation Models via Reinforcement Learning
- SLRTP2025 Sign Language Production Challenge: Methodology, Results, and Future Work
- Adversarial Video Promotion Against Text-to-Video Retrieval
- Evaluating Fisheye-Compatible 3D Gaussian Splatting Methods on Real Images Beyond 180 Degree Field of View
- TADoc: Robust Time-Aware Document Image Dewarping
- OctreeNCA: Single-Pass 184 MP Segmentation on Consumer Hardware
- S2-UniSeg: Fast Universal Agglomerative Pooling for Scalable Segment Anything without Supervision
- Spatio-Temporal Conditional Diffusion Models for Forecasting Future Multiple Sclerosis Lesion Masks Conditioned on Treatments
- HiMat: DiT-based Ultra-High Resolution SVBRDF Generation
- DocRefine: An Intelligent Framework for Scientific Document Understanding and Content Optimization based on Multimodal Large Model Agents
- MV-CoRe: Multimodal Visual-Conceptual Reasoning for Complex Visual Question Answering
- Large Language Model Evaluated Stand-alone Attention-Assisted Graph Neural Network with Spatial and Structural Information Interaction for Precise Endoscopic Image Segmentation
- 3DGS-VBench: A Comprehensive Video Quality Evaluation Benchmark for 3DGS Compression
- SAGCNet: Spatial-Aware Graph Completion Network for Missing Slice Imputation in Population CMR Imaging
- TeSO: Representing and Compressing 3D Point Cloud Scenes with Textured Surfel Octree
- ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting
- Communication-Efficient Multi-Agent 3D Detection via Hybrid Collaboration
- CMAMRNet: A Contextual Mask-Aware Network Enhancing Mural Restoration Through Comprehensive Mask Guidance
- Dynamic Pattern Alignment Learning for Pretraining Lightweight Human-Centric Vision Models
- SketchAnimator: Animate Sketch via Motion Customization of Text-to-Video Diffusion Models
- CoopDiff: Anticipating 3D Human-object Interactions via Contact-consistent Decoupled Diffusion
- EventRR: Event Referential Reasoning for Referring Video Object Segmentation
- Similarity Matters: A Novel Depth-guided Network for Image Restoration and A New Dataset
- Unsupervised Real-World Super-Resolution via Rectified Flow Degradation Modelling
- Bridging Semantic Logic Gaps: A Cognition-Inspired Multimodal Boundary-Preserving Network for Image Manipulation Localization
- Generic Calibration: Pose Ambiguity/Linear Solution and Parametric-hybrid Pipeline
- HaDM-ST: Histology-Assisted Differential Modeling for Spatial Transcriptomics Generation
- Landmark Guided Visual Feature Extractor for Visual Speech Recognition with Limited Resource
- ASM-UNet: Adaptive Scan Mamba Integrating Group Commonalities and Individual Variations for Fine-Grained Segmentation
- Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers
- SUIT: Spatial-Spectral Union-Intersection Interaction Network for Hyperspectral Object Tracking
- Understanding Dynamic Scenes in Ego Centric 4D Point Clouds
- Small-Large Collaboration: Training-efficient Concept Personalization for Large VLM using a Meta Personalized Small VLM
- SynMatch: Rethinking Consistency in Medical Image Segmentation with Sparse Annotations
- BEVANet: Bilateral Efficient Visual Attention Network for Real-Time Semantic Segmentation
- MobileViCLIP: An Efficient Video-Text Model for Mobile Devices
- DocR1: Evidence Page-Guided GRPO for Multi-Page Document Understanding
- RORPCap: Retrieval-based Objects and Relations Prompt for Image Captioning
- Planner-Refiner: Dynamic Space-Time Refinement for Vision-Language Alignment in Videos
- CoAR: Concept Injection into Autoregressive Models for Personalized Text-to-Image Generation
- SODiff: Semantic-Oriented Diffusion Model for JPEG Compression Artifacts Removal
- GS4Buildings: Prior-Guided Gaussian Splatting for 3D Building Reconstruction
- Training and Inference within 1 Second -- Tackle Cross-Sensor Degradation of Real-World Pansharpening with Efficient Residual Feature Tailoring
- DIP-GS: Deep Image Prior For Gaussian Splatting Sparse View Recovery
- LET-US: Long Event-Text Understanding of Scenes
- ForensicsSAM: Toward Robust and Unified Image Forgery Detection and Localization Resisting to Adversarial Attack
- CharacterShot: Controllable and Consistent 4D Character Animation
- CLUE: Leveraging Low-Rank Adaptation to Capture Latent Uncovered Evidence for Image Forgery Localization
- Leveraging Learning Bias for Noisy Anomaly Detection
- Health Care Waste Classification Using Deep Learning Aligned with Nepal's Bin Color Guidelines
- AURA: A Fine-Grained Benchmark and Decomposed Metric for Audio-Visual Reasoning
- Novel View Synthesis with Gaussian Splatting: Impact on Photogrammetry Model Accuracy and Resolution
- VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding
- FormCoach: Lift Smarter, Not Harder
- Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing
- Enhancing Reliability of Medical Image Diagnosis through Top-rank Learning with Rejection Module
- Enhanced Generative Structure Prior for Chinese Text Image Super-resolution
- Domain Generalization of Pathological Image Segmentation by Patch-Level and WSI-Level Contrastive Learning
- CoT-Pose: Chain-of-Thought Reasoning for 3D Pose Generation from Abstract Prompts
- Adaptive Pseudo Label Selection for Individual Unlabeled Data by Positive and Unlabeled Learning
- Decoupled Functional Evaluation of Autonomous Driving Models via Feature Map Quality Scoring
- Splat4D: Diffusion-Enhanced 4D Gaussian Splatting for Temporally and Spatially Consistent Content Creation
- Joint Transcription of Acoustic Guitar Strumming Directions and Chords
- Improving Document Retrieval Coherence for Semantically Equivalent Queries
- Exploring Procedural Data Generation for Automatic Acoustic Guitar Fingerpicking Transcription
- Audio-Thinker: Guiding Audio Language Model When and How to Think via Reinforcement Learning
- How Chinese are Chinese Language Models? The Puzzling Lack of Language Policy in China's LLMs
- CLAIR-A: Leveraging Large Language Models to Judge Audio Captions
- Strengthening False Information Propagation Detection: Leveraging SVM and Sophisticated Text Vectorization Techniques in comparison to BERT
- ReGLA: Refining Gated Linear Attention
- ALFA: Aligning LLMs to Ask Good Questions -- A Case Study in Clinical Reasoning
- URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models
- Invisible Walls in Cities: Leveraging Large Language Models to Predict Urban Segregation Experience with Social Media Content
- X-EcoMLA: Upcycling Pre-Trained Attention into MLA for Efficient and Extreme KV Compression
- Both Direct and Indirect Evidence Contribute to Dative Alternation Preferences in Language Models
- Overcoming Vocabulary Constraints with Pixel-level Fallback
- NoveltyBench: Evaluating Language Models for Humanlike Diversity
- QUDsim: Quantifying Discourse Similarities in LLM-Generated Text
- GreenMind: A Next-Generation Vietnamese Large Language Model for Structured and Logical Reasoning
- Planning with Diffusion Models for Target-Oriented Dialogue Systems
- Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control
- RAIR: Retrieval-Augmented Iterative Refinement for Chinese Spelling Correction
- WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
- Rethinking Prompt Optimizers: From Prompt Merits to Optimization
- Decoding the Multimodal Mind: Generalizable Brain-to-Text Translation via Multimodal Alignment and Adaptive Routing
- The taggedPBC: Annotating a massive parallel corpus for crosslinguistic investigations
- WebDancer: Towards Autonomous Information Seeking Agency
- Document Valuation in LLM Summaries: A Cluster Shapley Approach
- PersianMedQA: Evaluating Large Language Models on a Persian-English Bilingual Medical Question Answering Benchmark
- ACCESS DENIED INC: The First Benchmark Environment for Sensitivity Awareness
- Structure-Augmented Reasoning Generation
- PersonalAI: A Systematic Comparison of Knowledge Graph Storage and Retrieval Approaches for Personalized LLM agents
- MDC-R: The Minecraft Dialogue Corpus with Reference
- Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning
- Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective
- EduCoder: An Open-Source Annotation System for Education Transcript Data
- Investigating writing style as a contributor to gender gaps in science and technology
- SynthVLM: Towards High-Quality and Efficient Synthesis of Image-Caption Datasets for Vision-Language Models
- MathScape: Benchmarking Multimodal Large Language Models in Real-World Mathematical Contexts
- Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
- How Far Are We from Generating Missing Modalities with Foundation Models?
- GenEscape: Hierarchical Multi-Agent Generation of Escape Room Puzzles
- Vec2Summ: Text Summarization via Probabilistic Sentence Embeddings
- BharatBBQ: A Multilingual Bias Benchmark for Question Answering in the Indian Context
- Gradient Surgery for Safe LLM Fine-Tuning
- Omni-SafetyBench: A Benchmark for Safety Evaluation of Audio-Visual Large Language Models
- Enhancing Rumor Detection Methods with Propagation Structure Infused Language Model
- Prompt Tuning for Few-Shot Continual Learning Named Entity Recognition
- The 2D+ Dynamic Articulatory Model DYNARTmo: Tongue-Palate Contact Area Estimation
- ARCE: Augmented RoBERTa with Contextualized Elucidations for NER in Automated Rule Checking
- CCFQA: A Benchmark for Cross-Lingual and Cross-Modal Speech and Text Factuality Evaluation
- Think Before You Talk: Enhancing Meaningful Dialogue Generation in Full-Duplex Speech Language Models with Planning-Inspired Text Guidance
- Let's Revise Step-by-Step: A Unified Local Search Framework for Code Generation with LLMs
- Positional Biases Shift as Inputs Approach Context Window Limits
- Augmenting Bias Detection in LLMs Using Topological Data Analysis
- From Trial-and-Error to Improvement: A Systematic Analysis of LLM Exploration Mechanisms in RLVR
- Keyword-Centric Prompting for One-Shot Event Detection with Self-Generated Rationale Enhancements
- What am I missing here?: Evaluating Large Language Models for Masked Sentence Prediction
- Exploring Causal Effect of Social Bias on Faithfulness Hallucinations in Large Language Models
- SASST: Leveraging Syntax-Aware Chunking and LLMs for Simultaneous Speech Translation
- Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
- Can You Trick the Grader? Adversarial Persuasion of LLM Judges
- Evaluating Compositional Approaches for Focus and Sentiment Analysis
- Evaluating Large Language Models as Expert Annotators
- LLMs for Law: Evaluating Legal-Specific LLMs on Contract Understanding
- Large Language Models for Czech Aspect-Based Sentiment Analysis
- Few-shot Cross-lingual Aspect-Based Sentiment Analysis with Sequence-to-Sequence Models
- Tailored Emotional LLM-Supporter: Enhancing Cultural Sensitivity
- Challenges and opportunities in portraying emotion in generated sign language
- Expert Preference-based Evaluation of Automated Related Work Generation
- Large Language Models for Subjective Language Understanding: A Survey
- Toward Machine Interpreting: Lessons from Human Interpreting Studies
- Understanding Syntactic Generalization in Structure-inducing Language Models
- The Medical Metaphors Corpus (MCC)
- WideSearch: Benchmarking Agentic Broad Info-Seeking
- Progressive Depth Up-scaling via Optimal Transport
- 9th Workshop on Sign Language Translation and Avatar Technologies (SLTAT 2025)
- Iterative refinement, not training objective, makes HuBERT behave differently from wav2vec 2.0
- Czech Dataset for Complex Aspect-Based Sentiment Analysis Tasks
- Data-Efficient Biomedical In-Context Learning: A Diversity-Enhanced Submodular Perspective
- REX-RAG: Reasoning Exploration with Policy Correction in Retrieval-Augmented Generation
- Efficient Speculative Decoding for Llama at Scale: Challenges and Solutions
- Exploring Safety Alignment Evaluation of LLMs in Chinese Mental Health Dialogues via LLM-as-Judge
- Jinx: Unlimited LLMs for Probing Alignment Failures
- Towards Real-World Rumor Detection: Anomaly Detection Framework with Graph Supervised Contrastive Learning
- PrLM: Learning Explicit Reasoning for Personalized RAG via Contrastive Reward Optimization
- BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent
- Train It and Forget It: Merge Lists are Unnecessary for BPE Inference in Language Models
- Measuring Stereotype and Deviation Biases in Large Language Models
- Testing the Limits of Machine Translation from One Book
- SEVADE: Self-Evolving Multi-Agent Analysis with Decoupled Evaluation for Hallucination-Resistant Irony Detection
- Annotating Errors in English Learners' Written Language Production: Advancing Automated Written Feedback Systems
- The ReQAP System for Question Answering over Personal Information
- Score Before You Speak: Improving Persona Consistency in Dialogue Generation using Response Quality Scores
- Two-Stage Quranic QA via Ensemble Retrieval and Instruction-Tuned Answer Extraction
- Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models
- Gradient Descent Finds Over-Parameterized Neural Networks with Sharp Generalization for Nonparametric Regression
- TDDBench: A Benchmark for Training data detection
- Quantum Policy Gradient in Reproducing Kernel Hilbert Space
- Pairwise Markov Chains for Volatility Forecasting
- Reconstruction of boosted and resolved multi-Higgs-boson events with symmetry-preserving attention networks
- Mamba-based Deep Learning Approach for Sleep Staging on a Wireless Multimodal Wearable System without Electroencephalography
- Mean-Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
- Accurate and thermodynamically consistent hydrogen equation of state for planetary modeling with flow matching
- MatCLIP: Light- and Shape-Insensitive Assignment of PBR Material Models
- ScaffoldGPT: A Scaffold-based GPT Model for Drug Optimization
- A Practical Introduction to Kernel Discrepancies: MMD, HSIC & KSD
- Exploration of Hepatitis B Virus Infection Dynamics through Physics-Informed Deep Learning Approach
- EDiT: Efficient Diffusion Transformers with Linear Compressed Attention
- Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering
- How Relevance Emerges: Interpreting LoRA Fine-Tuning in Reranking LLMs
- TFMPathy: Tabular Foundation Model for Privacy-Aware, Generalisable Empathy Detection from Videos
- Exploring Video-Based Driver Activity Recognition under Noisy Labels
- Interactive Imitation Learning for Dexterous Robotic Manipulation: Challenges and Perspectives -- A Survey
- SVarM: Linear Support Varifold Machines for Classification and Regression on Geometric Data
- OmniFluids: Physics Pre-trained Modeling of Fluid Dynamics
- Inference-Time Gaze Refinement for Micro-Expression Recognition: Enhancing Event-Based Eye Tracking with Motion-Aware Post-Processing
- Position: Certified Robustness Does Not (Yet) Imply Model Security
- Coupled Entropy: A Goldilocks Generalization for Complex Systems
- Optimal and Practical Batched Linear Bandit Algorithm
- Phase transition of the Sinkhorn-Knopp algorithm
- From Time-series Generation, Model Selection to Transfer Learning: A Comparative Review of Pixel-wise Approaches for Large-scale Crop Mapping
- Wasserstein Barycenter Soft Actor-Critic
- PAE MobiLLM: Privacy-Aware and Efficient LLM Fine-Tuning on the Mobile Device via Additive Side-Tuning
- ZClassifier: Temperature Tuning and Manifold Approximation via KL Divergence on Logit Space
- Accurate Measles Rash Detection via Vision Transformer Fine-Tuning
- Highly Fast Text Segmentation With Pairwise Markov Chains
- Online Learning and Optimization for Queues with Unknown Demand Curve and Service Distribution
- Empathy Detection from Text, Audiovisual, Audio or Physiological Signals: A Systematic Review of Task Formulations and Machine Learning Methods
- A Deep Learning Based Resource Allocator for Communication Networks with Dynamic User Utility Demands
- Training 3D ResNets to Extract BSM Physics Parameters from Simulated Data
- Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains
- A variational Bayes approach to debiased inference for low-dimensional parameters in high-dimensional linear regression
- RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design
- Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python
- FQGA-single: Towards Fewer Training Epochs and Fewer Model Parameters for Image-to-Image Translation Tasks
- Quantum-data-driven dynamical transition in quantum learning
- FlatQuant: Flatness Matters for LLM Quantization
- Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts
- Tensor Decomposition with Unaligned Observations
- MS-IMAP -- A Multi-Scale Graph Embedding Approach for Interpretable Manifold Learning
- ADAM-SINDy: An Efficient Optimization Framework for Parameterized Nonlinear Dynamical System Identification
- An information-matching approach to optimal experimental design and active learning
- sbi reloaded: a toolkit for simulation-based inference workflows
- $\ell_0$-Regularized Quadratic Surface Support Vector Machines
- chebgreen: Learning and Interpolating Continuous Empirical Green's Functions from Data
- Covering Multiple Objectives with a Small Set of Solutions Using Bayesian Optimization
- On the Emergence of Position Bias in Transformers
- Chaos into Order: Neural Framework for Expected Value Estimation of Stochastic Partial Differential Equations
- Active Learning of Model Discrepancy with Bayesian Experimental Design
- Optimistic Interior Point Methods for Sequential Hypothesis Testing by Betting
- Active Advantage-Aligned Online Reinforcement Learning with Offline Data
- Fenchel-Young Variational Learning
- On the Duality between Gradient Transformations and Adapters
- Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
- Real-Time Moving Flock Detection in Pedestrian Trajectories Using Sequential Deep Learning Models
- Robustness to Geographic Distribution Shift Using Location Encoders
- Average-DICE: Stationary Distribution Correction by Regression
- Gradient Extrapolation for Debiased Representation Learning
- Empirical Analysis of Privacy-Fairness-Accuracy Trade-offs in Federated Learning: A Step Towards Responsible AI
- Uncertainty propagation in feed-forward neural network models
- Model-Agnostic Policy Explanations with Large Language Models
- Resource-efficient Inference with Foundation Model Programs
- Self-Supervised Autoencoder Network for Robust Heart Rate Extraction from Noisy Photoplethysmogram: Applying Blind Source Separation to Biosignal Analysis
- Time Marching Neural Operator FE Coupling: AI Accelerated Physics Modeling
- CAOTE: KV Cache Eviction for LLMs via Attention Output Error-Based Token Selection
- Unveiling 3D Ocean Biogeochemical Provinces in the North Atlantic: A Systematic Comparison and Validation of Clustering Methods
- DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering
- Forecasting at Full Spectrum: Holistic Multi-Granular Traffic Modeling under High-Throughput Inference Regimes
- FedSDAF: Leveraging Source Domain Awareness for Enhanced Federated Domain Generalization
- mAIstro: an open-source multi-agentic system for automated end-to-end development of radiomics and deep learning models for medical imaging
- A Square Peg in a Square Hole: Meta-Expert for Long-Tailed Semi-Supervised Learning
- Learning to Reason without External Rewards
- Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents
- MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection
- Machine Learning Algorithms for Improving Exact Classical Solvers in Mixed Integer Continuous Optimization
- Model-Agnostic Sentiment Distribution Stability Analysis for Robust LLM-Generated Texts Detection
- Explainable AI for Curie Temperature Prediction in Magnetic Materials
- TerraMAE: Learning Spatial-Spectral Representations from Hyperspectral Earth Observation Data via Adaptive Masked Autoencoders
- Statistical Inference for Autoencoder-based Anomaly Detection after Representation Learning-based Domain Adaptation
- Taking the Garbage Out of Data-Driven Prediction Across Climate Timescales
- Reconstruction of Solar EUV Irradiance Using CaII K Images and SOHO/SEM Data with Bayesian Deep Learning and Uncertainty Quantification
- Membership Inference Attacks with False Discovery Rate Control
- SEF-MK: Speaker-Embedding-Free Voice Anonymization through Multi-k-means Quantization
- QuProFS: An Evolutionary Training-free Approach to Efficient Quantum Feature Map Search
- AugLift: Boosting Generalization in Lifting-based 3D Human Pose Estimation
- Sensory robustness through top-down feedback and neural stochasticity in recurrent vision models
- How Does a Deep Neural Network Look at Lexical Stress?
- BIGBOY1.2: Generating Realistic Synthetic Data for Disease Outbreak Modelling and Analytics
- Channel Charting in Smart Radio Environments
- Nonparametric Reaction Coordinate Optimization with Histories: A Framework for Rare Event Dynamics
- Event-Aware Sentiment Factors from LLM-Augmented Financial Tweets: A Transparent Framework for Interpretable Quant Trading
- Grounding Multilingual Multimodal LLMs With Cultural Knowledge
- Statistical Theory of Multi-stage Newton Iteration Algorithm for Online Continual Learning
- Structured Superposition of Autoencoders for UEP Codes at Intermediate Blocklengths
- Commentary Generation for Soccer Highlights
- Barron Space Representations for Elliptic PDEs with Homogeneous Boundary Conditions
- Exploiting Layer Normalization Fine-tuning in Visual Transformer Foundation Models for Classification
- Generative Inversion for Property-Targeted Materials Design: Application to Shape Memory Alloys
- G-IFT: A Gated Linear Unit adapter with Iterative Fine-Tuning for Low-Resource Children's Speaker Verification
- Recommendation Is a Dish Better Served Warm
- Being-M0.5: A Real-Time Controllable Vision-Language-Motion Model
- Unequal Uncertainty: Rethinking Algorithmic Interventions for Mitigating Discrimination from AI
- EFU: Enforcing Federated Unlearning via Functional Encryption
- Stochastic dynamics learning with state-space systems
- Meta Off-Policy Estimation
- Safeguarding Generative AI Applications in Preclinical Imaging through Hybrid Anomaly Detection
- Gaussian Approximation for Two-Timescale Linear Stochastic Approximation
- Frequency-Domain Analysis of Time-Dependent Multiomic Data in Progressive Neurodegenerative Diseases: A Proposed Quantum-Classical Hybrid Approach with Quaternionic Extensions
- Adaptive Source-Channel Coding for Semantic Communications
- Likelihood Ratio Tests by Kernel Gaussian Embedding
- Sharper Perturbed-Kullback-Leibler Exponential Tail Bounds for Beta and Dirichlet Distributions
- Prediction error certification for PINNs: Theory, computation, and application to Stokes flow
- Optimizing Federated Learning for Scalable Power-demand Forecasting in Microgrids
- Robust Anomaly Detection in O-RAN: Leveraging LLMs against Data Manipulation Attacks
- PrIINeR: Towards Prior-Informed Implicit Neural Representations for Accelerated MRI
- MDD-Net: Multimodal Depression Detection through Mutual Transformer
- Assessing LLM Text Detection in Educational Contexts: Does Human Contribution Affect Detection?
- An effective potential for generative modelling with active matter
- Integrating Task-Specific and Universal Adapters for Pre-Trained Model-based Class-Incremental Learning
- Adaptive Learning for IRS-Assisted Wireless Networks: Securing Opportunistic Communications Against Byzantine Eavesdroppers
- Fed-TGAN: Federated Learning Framework for Synthesizing Tabular Data
- SOInter: A Novel Deep Energy Based Interpretation Method for Explaining Structured Output Models
- AdaBoost is not an Optimal Weak to Strong Learner
- Optimal Multi-Distribution Learning
- Monte Carlo with kernel-based Gibbs measures: Guarantees for probabilistic herding
- Finite-Time Convergence Analysis of ODE-based Generative Models for Stochastic Interpolants
- Intrinsic training dynamics of deep neural networks
- Tight Bounds for Schrödinger Potential Estimation in Unpaired Image-to-Image Translation Problems
- Parity Requires Unified Input Dependence and Negative Eigenvalues in SSMs
- Efficient Reward Identification In Max Entropy Reinforcement Learning with Sparsity and Rank Priors
- Unsupervised operator learning approach for dissipative equations via Onsager principle
- Towards Unveiling Predictive Uncertainty Vulnerabilities in the Context of the Right to Be Forgotten
- MOTGNN: Interpretable Graph Neural Networks for Multi-Omics Disease Classification
- Online Convex Optimization with Heavy Tails: Old Algorithms, New Regrets, and Applications
- N-BEATS-MOE: N-BEATS with a Mixture-of-Experts Layer for Heterogeneous Time Series Forecasting
- Enhancing Privacy in Decentralized Min-Max Optimization: A Differentially Private Approach
- FairDRL-ST: Disentangled Representation Learning for Fair Spatio-Temporal Mobility Prediction
- Physics-Informed Multimodal Bearing Fault Classification under Variable Operating Conditions using Transfer Learning
- Multimodal Remote Inference
- When and how can inexact generative models still sample from the data manifold?
- Extracting Complex Topology from Multivariate Functional Approximation: Contours, Jacobi Sets, and Ridge-Valley Graphs
- Beyond Single: A Data Selection Principle for LLM Alignment via Fine-Grained Preference Signals
- Multi-Turn Jailbreaks Are Simpler Than They Seem
- Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation
- Multi-Hop Privacy Propagation for Differentially Private Federated Learning in Social Networks
- Semantic-Enhanced Time-Series Forecasting via Large Language Models
- Detecting Mislabeled and Corrupted Data via Pointwise Mutual Information
- Robust Reinforcement Learning over Wireless Networks with Homomorphic State Representations
- Separation and Collaboration: Two-Level Routing Grouped Mixture-of-Experts for Multi-Domain Continual Learning
- A Tutorial: An Intuitive Explanation of Offline Reinforcement Learning Theory
- Topological Feature Compression for Molecular Graph Neural Networks
- EvoCoT: Overcoming the Exploration Bottleneck in Reinforcement Learning
- Learning Satellite Attitude Dynamics with Physics-Informed Normalising Flow
- Score Augmentation for Diffusion Models
- Adaptive Fine-Tuning via Pattern Specialization for Deep Time Series Forecasting
- Shapley-Inspired Feature Weighting in $k$-means with No Additional Hyperparameters
- A Physics-informed Deep Operator for Real-Time Freeway Traffic State Estimation
- Communication-Efficient Zero-Order and First-Order Federated Learning Methods over Wireless Networks
- Deep Learning-Based Analysis of Power Consumption in Gasoline, Electric, and Hybrid Vehicles
- From Source to Target: Leveraging Transfer Learning for Predictive Process Monitoring in Organizations
- ELF: Efficient Logic Synthesis by Pruning Redundancy in Refactoring
- Symbolic Quantile Regression for the Interpretable Prediction of Conditional Quantiles
- Fast and Generalizable parameter-embedded Neural Operators for Lithium-Ion Battery Simulation
- NeuroDx-LM: A Clinical Large-Scale Model for EEG-based Neurological Disorder Detection
- OFAL: An Oracle-Free Active Learning Framework
- FairFLRep: Fairness aware fault localization and repair of Deep Neural Networks
- Federated Learning for Epileptic Seizure Prediction Across Heterogeneous EEG Datasets
- Cross-Subject and Cross-Montage EEG Transfer Learning via Individual Tangent Space Alignment and Spatial-Riemannian Feature Fusion
- Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
- Do Streetscapes Still Matter for Customer Ratings of Eating and Drinking Establishments in Car-Dependent Cities?
- RMT-PPAD: Real-time Multi-task Learning for Panoptic Perception in Autonomous Driving
- What Makes "Good" Distractors for Object Hallucination Evaluation in Large Vision-Language Models?
- Transfer Learning with EfficientNet for Accurate Leukemia Cell Classification
- Generative Bid Shading in Real-Time Bidding Advertising
- Age-Diverse Deepfake Dataset: Bridging the Age Gap in Deepfake Detection
- From Label Error Detection to Correction: A Modular Framework and Benchmark for Object Detection Datasets
- Communication-Learning Co-Design for Differentially Private Over-the-Air Federated Distillation
- On the effectiveness of multimodal privileged knowledge distillation in two vision transformer based diagnostic applications
- Bridging Brain Connectomes and Clinical Reports for Early Alzheimer's Disease Diagnosis
- ImpliHateVid: A Benchmark Dataset and Two-stage Contrastive Learning Framework for Implicit Hate Speech Detection in Videos
- Benchmarking Self-Driving Labs
- Federated Online Learning for Heterogeneous Multisource Streaming Data
- Machines Learn Number Fields, But How? The Case of Galois Groups
- Role of Large Language Models and Retrieval-Augmented Generation for Accelerating Crystalline Material Discovery: A Systematic Review
- A Tight Lower Bound for the Approximation Guarantee of Higher-Order Singular Value Decomposition
- ClimateSOM: A Visual Analysis Workflow for Climate Ensemble Datasets
- Mitigating Distribution Shift in Graph-Based Android Malware Classification via Function Metadata and LLM Embeddings
- Story Ribbons: Reimagining Storyline Visualizations with Large Language Models
- A Score-based Diffusion Model Approach for Adaptive Learning of Stochastic Partial Differential Equation Solutions
- MOCA-HESP: Meta High-dimensional Bayesian Optimization for Combinatorial and Mixed Spaces via Hyper-ellipsoid Partitioning
- Energy Efficient Task Offloading in UAV-Enabled MEC Using a Fully Decentralized Deep Reinforcement Learning Approach
- Text to Speech System for Meitei Mayek Script
- Near-Optimal Convergence of Accelerated Gradient Methods under Generalized and $(L_0, L_1)$-Smoothness
- BrainATCL: Adaptive Temporal Brain Connectivity Learning for Functional Link Prediction and Age Estimation
- Approaching Maximal Information Extraction in Low-Signal Regimes via Multiple Instance Learning
- From Nodes to Narratives: Explaining Graph Neural Networks with LLMs and Graph Context
- Multi-Level Service Performance Forecasting via Spatiotemporal Graph Neural Networks
- How Effectively Can Large Language Models Connect SNP Variants and ECG Phenotypes for Cardiovascular Risk Prediction?
- A Globally Optimal Analytic Solution for Semi-Nonnegative Matrix Factorization with Nonnegative or Mixed Inputs
- Strategic Incentivization for Locally Differentially Private Federated Learning
- Policy Newton methods for Distortion Riskmetrics
- PySeizure: A single machine learning classifier framework to detect seizures in diverse datasets
- Self-Organizing Survival Manifolds: A Theory for Unsupervised Discovery of Prognostic Structures in Biological Systems
- Semi-Supervised Supply Chain Fraud Detection with Unsupervised Pre-Filtering
- GFlowNets for Learning Better Drug-Drug Interaction Representations
- Hypergraph Neural Network with State Space Models for Node Classification
- Local Diffusion Models and Phases of Data Distributions
- Learning to Forget with Information Divergence Reweighted Objectives for Noisy Labels
- Privacy-Preserving Tabular Synthetic Data Generation Using TabularARGN
- Transferring Social Network Knowledge from Multiple GNN Teachers to Kolmogorov-Arnold Networks
- Watermarking Kolmogorov-Arnold Networks for Emerging Networked Applications via Activation Perturbation
- Stabilizing Federated Learning under Extreme Heterogeneity with HeteRo-Select
- CISO: Species Distribution Modeling Conditioned on Incomplete Species Observations
- Fed MobiLLM: Efficient Federated LLM Fine-Tuning over Heterogeneous Mobile Devices via Server Assisted Side-Tuning
- Technical Report: Full-Stack Fine-Tuning for the Q Programming Language
- Conformal Prediction and Trustworthy AI
- QuiZSF: An efficient data-model interaction framework for zero-shot time-series forecasting
- BoRA: Towards More Expressive Low-Rank Adaptation with Block Diversity
- Structure-Preserving Digital Twins via Conditional Neural Whitney Forms
- Discovery Learning accelerates battery design evaluation
- UniMove: A Unified Model for Multi-city Human Mobility Prediction
- A Comparative Study of Feature Selection in Tsetlin Machines
- TLCCSP: A Scalable Framework for Enhancing Time Series Forecasting with Time-Lagged Cross-Correlations
- A Stage-Aware Mixture of Experts Framework for Neurodegenerative Disease Progression Modelling
- Differentiable Adaptive Kalman Filtering via Optimal Transport
- Improving Real-Time Concept Drift Detection using a Hybrid Transformer-Autoencoder Framework
- RAPNet: A Receptive-Field Adaptive Convolutional Neural Network for Pansharpening
- SystolicAttention: Fusing FlashAttention within a Single Systolic Array
- Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?
- El Agente: An Autonomous Agent for Quantum Chemistry
- Reasoning Capabilities of Large Language Models on Dynamic Tasks
- Identification of Probabilities of Causation: A Complete Characterization
- SMART: Self-Generating and Self-Validating Multi-Dimensional Assessment for LLMs' Mathematical Problem Solving
- Personalized Constitutionally-Aligned Agentic Superego: Secure AI Behavior Aligned to Diverse Human Values
- Reinforcement Learning for Hybrid Charging Stations Planning and Operation Considering Fixed and Mobile Chargers
- Efficient Contextual Preferential Bayesian Optimization with Historical Examples
- Active Policy Improvement from Multiple Black-box Oracles
- Blending Imitation and Reinforcement Learning for Robust Policy Improvement
- Deep Neural Networks Can Learn Generalizable Same-Different Visual Relations
- ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization
- Sparse Variational Student-t Processes
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection
- Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring
- On the Sample Efficiency of Abstractions and Potential-Based Reward Shaping in Reinforcement Learning
- Runtime Monitoring and Enforcement of Conditional Fairness in Generative AIs
- Fractured Glass, Failing Cameras: Simulating Physics-Based Adversarial Samples for Autonomous Driving Systems
- From Spikes to Heavy Tails: Unveiling the Spectral Evolution of Neural Networks
- LVBench: An Extreme Long Video Understanding Benchmark
- AI-AI Bias: large language models favor communications generated by large language models
- A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
- Chain of Thought Still Thinks Fast: APriCoT Helps with Thinking Slow
- Reward-Directed Score-Based Diffusion Models via q-Learning
- EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping
- In-Situ Fine-Tuning of Wildlife Models in IoT-Enabled Camera Traps for Efficient Adaptation
- A Closer Look at Machine Unlearning for Large Language Models
- Exploring Spatial Representation to Enhance LLM Reasoning in Aerial Vision-Language Navigation
- EfficientEQA: An Efficient Approach to Open-Vocabulary Embodied Question Answering
- Zero-Shot Voice Conversion via Content-Aware Timbre Ensemble and Conditional Flow Matching
- Steering AI-Driven Personalization of Scientific Text for General Audiences
- Understanding and Mitigating Memorization in Generative Models via Sharpness of Probability Landscapes
- LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation
- B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens
- POEX: Towards Policy Executable Jailbreak Attacks Against the LLM-based Robots
- MomentMix Augmentation with Length-Aware DETR for Temporally Robust Moment Retrieval
- Toward Intelligent and Secure Cloud: Large Language Model Empowered Proactive Defense
- WebWalker: Benchmarking LLMs in Web Traversal
- Ehrenfeucht-Haussler Rank and Chain of Thought
- Softplus Attention with Re-weighting Boosts Length Extrapolation in Large Language Models
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging
- Mitigating Traffic Oscillations in Mixed Traffic Flow with Scalable Deep Koopman Predictive Control
- MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization
- Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System
- Predicting Depression in Screening Interviews from Interactive Multi-Theme Collaboration
- Advancing AI-Powered Medical Image Synthesis: Insights from MedVQA-GI Challenge Using CLIP, Fine-Tuned Stable Diffusion, and Dream-Booth + LoRA
- Collective Reasoning Among LLMs: A Framework for Answer Validation Without Ground Truth
- ElementaryNet: A Non-Strategic Neural Network for Predicting Human Behavior in Normal-Form Games
- From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
- FunGraph: Functionality Aware 3D Scene Graphs for Language-Prompted Scene Interaction
- A Theory of Learning with Autoregressive Chain of Thought
- Learning Adaptive Dexterous Grasping from Single Demonstrations
- Learning 3D-Gaussian Simulators from RGB Videos
- $\mu$KE: Matryoshka Unstructured Knowledge Editing of Large Language Models
- How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence
- Bidirectional Hierarchical Protein Multi-Modal Representation Learning
- Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation
- A Multimodal Deep Learning Approach for White Matter Shape Prediction in Diffusion MRI Tractography
- Can LLM-based Financial Investing Strategies Outperform the Market in Long Run?
- Uniform Loss vs. Specialized Optimization: A Comparative Analysis in Multi-Task Learning
- RIDGECUT: Learning Graph Partitioning with Rings and Wedges
- Extracting Probabilistic Knowledge from Large Language Models for Bayesian Network Parameterization
- Improving LLM Outputs Against Jailbreak Attacks with Expert Model Integration
- FP4 All the Way: Fully Quantized Training of LLMs
- CADRE: Customizable Assurance of Data Readiness in Privacy-Preserving Federated Learning
- Verbal Werewolf: Engage Users with Verbalized Agentic Werewolf Game Framework
- HERGC: Heterogeneous Experts Representation and Generative Completion for Multimodal Knowledge Graphs
- Winner-takes-all for Multivariate Probabilistic Time Series Forecasting
- MLOps with Microservices: A Case Study on the Maritime Domain
- Physics-Informed Teleconnection-Aware Transformer for Global Subseasonal-to-Seasonal Forecasting
- A Two-stage Optimization Method for Wide-range Single-electron Quantum Magnetic Sensing
- MMET: A Multi-Input and Multi-Scale Transformer for Efficient PDEs Solving
- DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving
- Granular-Ball-Induced Multiple Kernel K-Means
- Robust Behavior Cloning Via Global Lipschitz Regularization
- Robust Anomaly Detection in Network Traffic: Evaluating Machine Learning Models on CICIDS2017
- CycleDistill: Bootstrapping Machine Translation using LLMs with Cyclical Distillation
- Exploring Adapter Design Tradeoffs for Low Resource Music Generation
- ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation
- Probabilistic Optimality for Inference-time Scaling
- Foundation versus Domain-specific Models: Performance Comparison, Fusion, and Explainability in Face Recognition
- Addressing The Devastating Effects Of Single-Task Data Poisoning In Exemplar-Free Continual Learning
- LIRA: Inferring Segmentation in Large Multi-modal Models with Local Interleaved Region Assistance
- Speckle2Self: Self-Supervised Ultrasound Speckle Reduction Without Clean Data
- Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching
- Towards Theoretical Understanding of Transformer Test-Time Computing: Investigation on In-Context Linear Regression
- IBPS: Indian Bail Prediction System
- ShoulderShot: Generating Over-the-Shoulder Dialogue Videos
- On the Limits of Selective AI Prediction: A Case Study in Clinical Decision Making
- SOFA: Deep Learning Framework for Simulating and Optimizing Atrial Fibrillation Ablation
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
- InterChart: Benchmarking Visual Reasoning Across Decomposed and Distributed Chart Information
- Efficient Approximate Posterior Sampling with Annealed Langevin Monte Carlo
- Attribution Explanations for Deep Neural Networks: A Theoretical Perspective
- Grasp-HGN: Grasping the Unexpected
- Discovering Spatial Correlations between Earth Observations in Global Atmospheric State Estimation by using Adaptive Graph Structure Learning
- GLiClass: Generalist Lightweight Model for Sequence Classification Tasks
- AIS-LLM: A Unified Framework for Maritime Trajectory Prediction, Anomaly Detection, and Collision Risk Assessment with Explainable Forecasting
- MORE-CLEAR: Multimodal Offline Reinforcement learning for Clinical notes Leveraged Enhanced State Representation
- TAR-TVG: Enhancing VLMs with Timestamp Anchor-Constrained Reasoning for Temporal Video Grounding
- LoSemB: Logic-Guided Semantic Bridging for Inductive Tool Retrieval
- Energy Consumption in Parallel Neural Network Training
- Training-Free ANN-to-SNN Conversion for High-Performance Spiking Transformer
- DoorDet: Semi-Automated Multi-Class Door Detection Dataset via Object Detection and Large Language Models
- CognitiveArm: Enabling Real-Time EEG-Controlled Prosthetic Arm Using Embodied Machine Learning
- A Rule-Based Approach to Specifying Preferences over Conflicting Facts and Querying Inconsistent Knowledge Bases
- Chimera: Harnessing Multi-Agent LLMs for Automatic Insider Threat Simulation
- Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment
- Sparse Probabilistic Graph Circuits
- UniSVG: A Unified Dataset for Vector Graphic Understanding and Generation with Multimodal Large Language Models
- Pareto Multi-Objective Alignment for Language Models
- PCA-Guided Autoencoding for Structured Dimensionality Reduction in Active Infrared Thermography
- MIND: A Noise-Adaptive Denoising Framework for Medical Images Integrating Multi-Scale Transformer
- Architectural Co-Design for Zero-Shot Anomaly Detection: Decoupling Representation and Dynamically Fusing Features in CLIP
- Auditory Intelligence: Understanding the World Through Sound
- DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts
- Deep Space Weather Model: Long-Range Solar Flare Prediction from Multi-Wavelength Images
- Vertex Features for Neural Global Illumination
- Towards Human-AI Collaboration System for the Detection of Invasive Ductal Carcinoma in Histopathology Images
- Selective Contrastive Learning for Weakly Supervised Affordance Grounding
- Autonomous Navigation of Cloud-Controlled Quadcopters in Confined Spaces Using Multi-Modal Perception and LLM-Driven High Semantic Reasoning
- Not Yet AlphaFold for the Mind: Evaluating Centaur as a Synthetic Participant
- NeeCo: Image Synthesis of Novel Instrument States Based on Dynamic and Deformable 3D Gaussian Reconstruction
- Diffusing the Blind Spot: Uterine MRI Synthesis with Diffusion Models
- SCDF: A Speaker Characteristics DeepFake Speech Dataset for Bias Analysis
- Exploring the Challenges and Opportunities of AI-assisted Codebase Generation
- WeChat-YATT: A Simple, Scalable and Balanced RLHF Trainer
- Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL
- Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
- DIVER: A Multi-Stage Approach for Reasoning-intensive Information Retrieval
- Learning to Select MCP Algorithms: From Traditional ML to Dual-Channel GAT-MLP
- Advancing Knowledge Tracing by Exploring Follow-up Performance Trends
- Bridging ASR and LLMs for Dysarthric Speech Recognition: Benchmarking Self-Supervised and Generative Approaches
- Exploring Strategies for Personalized Radiation Therapy: Part III Identifying genetic determinants for Radiation Response with Meta Learning
- BadPromptFL: A Novel Backdoor Threat to Prompt-based Federated Learning in Multimodal Models
- Multi-modal Adaptive Mixture of Experts for Cold-start Recommendation
- Rethinking Self-Replication: Detecting Distributed Selfhood in the Outlier Cellular Automaton
- On Understanding of the Dynamics of Model Capacity in Continual Learning
- Investigating the Design Space of Visual Grounding in Multimodal Large Language Model
- C-MAG: Cascade Multimodal Attributed Graphs for Supply Chain Link Prediction
- HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches
- Growing Reservoirs with Developmental Graph Cellular Automata
- Dual Information Speech Language Models for Emotional Conversations
- Grid2Guide: A* Enabled Small Language Model for Indoor Navigation
- ChatGPT on the Road: Leveraging Large Language Model-Powered In-vehicle Conversational Agents for Safer and More Enjoyable Driving Experience
- Hyperspectral Imaging
- GRASPTrack: Geometry-Reasoned Association via Segmentation and Projection for Multi-Object Tracking
- Vision-Based Localization and LLM-based Navigation for Indoor Environments
- MemoryKT: An Integrative Memory-and-Forgetting Method for Knowledge Tracing
- Optimal Transport Regularization for Speech Text Alignment in Spoken Language Models
- MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation
- Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models
- COMponent-Aware Pruning for Accelerated Control Tasks in Latent Space Models
- Can AI Explanations Make You Change Your Mind?
- LPI-RIT at LeWiDi-2025: Improving Distributional Predictions via Metadata and Loss Reweighting with DisCo
- PyVeritas: On Verifying Python via LLM-Based Transpilation and Bounded Model Checking for C
- Neural Logic Networks for Interpretable Classification
- MedReasoner: Reinforcement Learning Drives Reasoning Grounding from Clinical Thought to Pixel-Level Precision
- RedDino: A foundation model for red blood cell analysis
- Street-Level AI: Are Large Language Models Ready for Real-World Judgments?
- Human-Alignment and Calibration of Inference-Time Uncertainty in Large Language Models
- SAEMark: Multi-bit LLM Watermarking with Inference-Time Scaling
- Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent
- Capabilities of GPT-5 on Multimodal Medical Reasoning
- OMGSR: You Only Need One Mid-timestep Guidance for Real-World Image Super-Resolution
- LL3M: Large Language 3D Modelers
- VGGSounder: Audio-Visual Evaluations for Foundation Models
- Cut2Next: Generating Next Shot via In-Context Tuning
- Sortability of Time Series Data
- Learning How to Vote with Principles: Axiomatic Insights Into the Collective Decisions of Neural Networks
- Graph-Powered Defense: Controller Area Network Intrusion Detection for Unmanned Aerial Vehicles
- A Research Agenda for Usability and Generalisation in Reinforcement Learning
- Observation Interference in Partially Observable Assistance Games
- Aligning Instruction Tuning with Pre-training
- Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models
- Reviewing Clinical Knowledge in Medical Large Language Models: Training and Beyond
- Instructor-Worker Large Language Model System for Policy Recommendation: a Case Study on Air Quality Analysis of the January 2025 Los Angeles Wildfires
- A Planning Compilation to Reason about Goal Achievement at Planning Time
- Extracting Overlapping Microservices from Monolithic Code via Deep Semantic Embeddings and Graph Neural Network-Based Soft Clustering
- From Product Hilbert Spaces to the Generalized Koopman Operator and the Nonlinear Fundamental Lemma
- VA-Blueprint: Uncovering Building Blocks for Visual Analytics System Design
- Intersectoral Knowledge in AI and Urban Studies: A Framework for Transdisciplinary Research
- From Field to Drone: Domain Drift Tolerant Automated Multi-Species and Damage Plant Semantic Segmentation for Herbicide Trials
- Word Clouds as Common Voices: LLM-Assisted Visualization of Participant-Weighted Themes in Qualitative Interviews
- Conversational DNA: A New Visual Language for Understanding Dialogue Structure in Human and AI
- A DICOM Image De-identification Algorithm in the MIDI-B Challenge
- Uncertainty-Driven Reliability: Selective Prediction and Trustworthy Deployment in Modern Machine Learning
- A Small-footprint Acoustic Echo Cancellation Solution for Mobile Full-Duplex Speech Interactions
- Retrieval-Augmented Multi-Agent System for Rapid Statement of Work Generation
- Offline-to-Online Reinforcement Learning with Classifier-Free Diffusion Generation
- Anatomy of a Machine Learning Ecosystem: 2 Million Models on Hugging Face
- Who's the Evil Twin? Differential Auditing for Undesired Behavior
- Highlight All the Phrases: Enhancing LLM Transparency through Visual Factuality Indicators
- Towards Experience-Centered AI: A Framework for Integrating Lived Experience in Design and Development
- AGIC: Attention-Guided Image Captioning to Improve Caption Relevance
- VSI: Visual Subtitle Integration for Keyframe Selection to enhance Long Video Understanding
- Sparsity-Driven Plasticity in Multi-Task Reinforcement Learning
- ESNERA: Empirical and semantic named entity alignment for named entity dataset merging
- NS-FPN: Improving Infrared Small Target Detection and Segmentation from Noise Suppression Perspective
- Maestro-EVC: Controllable Emotional Voice Conversion Guided by References and Explicit Prosody
- BASIC: Boosting Visual Alignment with Intrinsic Refined Embeddings in Multimodal Large Language Models
- Advancements in Chinese font generation since deep learning era: A survey
- MMReID-Bench: Unleashing the Power of MLLMs for Effective and Versatile Person Re-identification
- CROP: Integrating Topological and Spatial Structures via Cross-View Prefixes for Molecular LLMs
- CannyEdit: Selective Canny Control and Dual-Prompt Guidance for Training-Free Image Editing
- CLAP: Coreference-Linked Augmentation for Passage Retrieval
- When Prompt Engineering Meets Software Engineering: CNL-P as Natural and Robust "APIs" for Human-AI Interaction
- Class Unbiasing for Generalization in Medical Diagnosis
- AMFT: Aligning LLM Reasoners by Meta-Learning the Optimal Imitation-Exploration Balance
- Neural Beam Field for Spatial Beam RSRP Prediction
- Beyond Frequency: Seeing Subtle Cues Through the Lens of Spatial Decomposition for Fine-Grained Visual Classification
- Can Multitask Learning Enhance Model Explainability?
- WeatherDiffusion: Weather-Guided Diffusion Model for Forward and Inverse Rendering
- Conformal Set-based Human-AI Complementarity with Multiple Experts
- Consensus-based Decentralized Multi-agent Reinforcement Learning for Random Access Network Optimization
- Neural Channel Knowledge Map Assisted Scheduling Optimization of Active IRSs in Multi-User Systems
- TurboBias: Universal ASR Context-Biasing powered by GPU-accelerated Phrase-Boosting Tree
- Making Effective Decisions: Machine Learning and the Ecogame in 1970
- From Imitation to Optimization: A Comparative Study of Offline Learning for Autonomous Driving
- Trustworthy Medical Imaging with Large Language Models: A Study of Hallucinations Across Modalities
- Balancing Privacy and Efficiency: Music Information Retrieval via Additive Homomorphic Encryption
- Whisfusion: Parallel ASR Decoding via a Diffusion Transformer
- ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability
- Membership and Memorization in LLM Knowledge Distillation
- SEADialogues: A Multilingual Culturally Grounded Multi-turn Dialogue Dataset on Southeast Asian Languages
- Surgical Knowledge Rewrite in Compact LLMs: An 'Unlearn-then-Learn' Strategy with (IA³) for Localized Factual Modulation and Catastrophic Forgetting Mitigation
- Model Predictive Control for Crowd Navigation via Learning-Based Trajectory Prediction
- An Evolutionary Game-Theoretic Merging Decision-Making Considering Social Acceptance for Autonomous Driving
- SQL-Exchange: Transforming SQL Queries Across Domains
- Hide or Highlight: Understanding the Impact of Factuality Expression on User Trust
- Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
- Towards High-Order Mean Flow Generative Models: Feasibility, Expressivity, and Provably Efficient Criteria
- Investigating Intersectional Bias in Large Language Models using Confidence Disparities in Coreference Resolution
- Pref-GUIDE: Continual Policy Learning from Real-Time Human Feedback via Preference-Based Learning
- Perceptual Evaluation of GANs and Diffusion Models for Generating X-rays
- Toward AI Matching Policies in Homeless Services: A Qualitative Study with Policymakers
- "Draw me a curator" Examining the visual stereotyping of a cultural services profession by generative AI
- A Stable and Principled Loss Function for Direct Language Model Alignment
- A Real-Time, Self-Tuning Moderator Framework for Adversarial Prompt Detection
- SGD Convergence under Stepsize Shrinkage in Low-Precision Training
- Fairness of Automatic Speech Recognition: Looking Through a Philosophical Lens
- Intention-Aware Diffusion Model for Pedestrian Trajectory Prediction
- Integrating Neurosymbolic AI in Advanced Air Mobility: A Comprehensive Survey
- Large-scale Multi-sequence Pretraining for Generalizable MRI Analysis in Versatile Clinical Applications
- Lightweight Multi-Scale Feature Extraction with Fully Connected LMF Layer for Salient Object Detection
- Improved Personalized Headline Generation via Denoising Fake Interests from Implicit Feedback
- Schema Lineage Extraction at Scale: Multilingual Pipelines, Composite Evaluation, and Language-Model Benchmarks
- Dynamic Benchmark Construction for Evaluating Large Language Models on Real-World Codes
- Explainability-in-Action: Enabling Expressive Manipulation and Tacit Understanding by Bending Diffusion Models in ComfyUI
- DySK-Attn: A Framework for Efficient, Real-Time Knowledge Updating in Large Language Models via Dynamic Sparse Knowledge Attention
- Adapting LLMs to Time Series Forecasting via Temporal Heterogeneity Modeling and Semantic Alignment
- Can Smaller Large Language Models Evaluate Research Quality?
- Propagation Tree Is Not Deep: Adaptive Graph Contrastive Learning Approach for Rumor Detection
- Presburger Functional Synthesis: Complexity and Tractable Normal Forms
- What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains
- Neural Bridge Processes
- LLM-based Agents for Automated Confounder Discovery and Subgroup Analysis in Causal Inference
- Selection and Exploitation of High-Quality Knowledge from Large Language Models for Recommendation
- EDGE: A Theoretical Framework for Misconception-Aware Adaptive Learning
- SocRipple: A Two-Stage Framework for Cold-Start Video Recommendations
- Causal Negative Sampling via Diffusion Model for Out-of-Distribution Recommendation
- OpenHAIV: A Framework Towards Practical Open-World Learning
- Incorporating Contextual Paralinguistic Understanding in Large Speech-Language Models
- MAQuA: Adaptive Question-Asking for Multidimensional Mental Health Screening using Item Response Theory
- Representation Understanding via Activation Maximization
- Fine-Tuning Large Language Models Using EEG Microstate Features for Mental Workload Assessment
- "Pull or Not to Pull?": Investigating Moral Biases in Leading Large Language Models Across Ethical Dilemmas
- Revisiting Data Attribution for Influence Functions
- When Is Prior Knowledge Helpful? Exploring the Evaluation and Selection of Unsupervised Pretext Tasks from a Neuro-Symbolic Perspective
- From Knowledge to Conjectures: A Modal Framework for Reasoning about Hypotheses
- DragonFruitQualityNet: A Lightweight Convolutional Neural Network for Real-Time Dragon Fruit Quality Inspection on Mobile Devices
- MCITlib: Multimodal Continual Instruction Tuning Library and Benchmark
- HealthBranches: Synthesizing Clinically-Grounded Question Answering Datasets via Decision Pathways
- FlexCTC: GPU-powered CTC Beam Decoding with advanced Contextual Abilities
- ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering
- Strategies of Code-switching in Human-Machine Dialogs
- Efficient Edge LLMs Deployment via HessianAware Quantization and CPU GPU Collaborative
- ProteoKnight: Convolution-based phage virion protein classification and uncertainty analysis
- AutoAssert 1: A LoRA Fine-Tuned LLM Model for Efficient Automated Assertion Generation
- Urbanite: A Dataflow-Based Framework for Human-AI Interactive Alignment in Urban Visual Analytics
- A Spin Glass Characterization of Neural Networks
- AgriVLN: Vision-and-Language Navigation for Agricultural Robots
- Leveraging GNN to Enhance MEF Method in Predicting ENSO
- Real-Time Analysis of Unstructured Data with Machine Learning on Heterogeneous Architectures
- Lightning Prediction under Uncertainty: DeepLight with Hazy Loss
- Freeze and Reveal: Exposing Modality Bias in Vision-Language Models
- Optimizing Districting Plans to Maximize Majority-Minority Districts via IPs and Local Search
- Stackelberg Coupling of Online Representation Learning and Reinforcement Learning
- Noise-Aware Generative Microscopic Traffic Simulation
- ALOPE: Adaptive Layer Optimization for Translation Quality Estimation using Large Language Models
- Large Language Models Do Not Simulate Human Psychology
- DatasetResearch: Benchmarking Agent Systems for Demand-Driven Dataset Discovery
- MASteer: Multi-Agent Adaptive Steer Strategy for End-to-End LLM Trustworthiness Repair
- DSperse: A Framework for Targeted Verification in Zero-Knowledge Machine Learning
- Simulating Biological Intelligence: Active Inference with Experiment-Informed Generative Model
- Efficient and Reliable Hitting-Set Computations for the Implicit Hitting Set Approach
- MultiMedEdit: A Scenario-Aware Benchmark for Evaluating Knowledge Editing in Medical VQA
- K-Dense Analyst: Towards Fully Automated Scientific Analysis
- Towards Safer AI Moderation: Evaluating LLM Moderators Through a Unified Benchmark Dataset and Advocating a Human-First Approach
- Designing a Feedback-Driven Decision Support System for Dynamic Student Intervention
- Multi-Dimensional Summarization Agents with Context-Aware Reasoning over Enterprise Tables
- EndoAgent: A Memory-Guided Reflective Agent for Intelligent Endoscopic Vision-to-Decision Reasoning
- Hallucination as a Computational Boundary: A Hierarchy of Inevitability and the Oracle Escape
- Rethinking Domain-Specific LLM Benchmark Construction: A Comprehensiveness-Compactness Approach
- Pentest-R1: Towards Autonomous Penetration Testing Reasoning Optimized via Two-Stage Reinforcement Learning
- Invert4TVG: A Temporal Video Grounding Framework with Inversion Tasks for Enhanced Action Understanding
- Generative AI for Strategic Plan Development
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
- Grounding Natural Language for Multi-agent Decision-Making with Multi-agentic LLMs
- CP-Agent: Agentic Constraint Programming
- Democratizing Diplomacy: A Harness for Evaluating Any Large Language Model on Full-Press Diplomacy
- MCPToolBench++: A Large Scale AI Agent Model Context Protocol MCP Tool Use Benchmark
- Optimization of Private Semantic Communication Performance: An Uncooperative Covert Communication Method
- HGMF: A Hierarchical Gaussian Mixture Framework for Scalable Tool Invocation within the Model Context Protocol
- ThinkTuning: Instilling Cognitive Reflections without Distillation
- Multimodal AI Systems for Enhanced Laying Hen Welfare Assessment and Productivity Optimization
- Breaking Down and Building Up: Mixture of Skill-Based Vision-and-Language Navigation Agents
- Disentangling Multiplex Spatial-Temporal Transition Graph Representation Learning for Socially Enhanced POI Recommendation
- 1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning
- EMPATHIA: Multi-Faceted Human-AI Collaboration for Refugee Integration
- Ethics2vec: aligning automatic agents and human preferences
- Symmetry-Aware Transformer Training for Automated Planning
- Best-Effort Policies for Robust Markov Decision Processes
- KIRETT: Knowledge-Graph-Based Smart Treatment Assistant for Intelligent Rescue Operations
- X-evolve: Solution space evolution powered by large language models
- Deep Reinforcement Learning with anticipatory reward in LSTM for Collision Avoidance of Mobile Robots
- FEAT: A Multi-Agent Forensic AI System with Domain-Adapted Large Language Model for Automated Cause-of-Death Analysis
- Interpreting Fedspeak with Confidence: A LLM-Based Uncertainty-Aware Framework Guided by Monetary Policy Transmission Paths
- Fitting Description Logic Ontologies to ABox and Query Examples
- AdaptFlow: Adaptive Workflow Optimization via Meta-Learning
- FNBT: Full Negation Belief Transformation for Open-World Information Fusion Based on Dempster-Shafer Theory of Evidence
- TeamMedAgents: Enhancing Medical Decision-Making of LLMs Through Structured Teamwork
- BlindGuard: Safeguarding LLM-based Multi-Agent Systems under Unknown Attacks
- From Natural Language to Solver-Ready Power System Optimization: An LLM-Assisted, Validation-in-the-Loop Framework
- UPP: Unified Path Planner with Adaptive Safety and Optimality
- AuthPrint: Fingerprinting Generative Models Against Malicious Model Providers
- Semi-automated Fact-checking in Portuguese: Corpora Enrichment using Retrieval with Claim extraction
- Forecasting Commodity Price Shocks Using Temporal and Semantic Fusion of Prices Signals and Agentic Generative AI Extracted Economic News
- Network-Specific Models for Multimodal Brain Response Prediction
- Computing with Canonical Microcircuits
- Understanding Human Limits in Pattern Recognition: A Computational Model of Sequential Reasoning in Rock, Paper, Scissors
- Retrieval augmented generation based dynamic prompting for few-shot biomedical named entity recognition using large language models
- CarbonScaling: Extending Neural Scaling Laws for Carbon Footprint in Large Language Models
- PiKV: KV Cache Management System for Mixture of Experts
- A Framework Combining 3D CNN and Transformer for Video-Based Behavior Recognition
- The Art of Breaking Words: Rethinking Multilingual Tokenizer Design
- MetAdv: A Unified and Interactive Adversarial Testing Platform for Autonomous Driving
- Symbolic Learning of Interpretable Reduced-Order Models for Jumping Quadruped Robots
- Factor Augmented Supervised Learning with Text Embeddings
- Surformer v1: Transformer-Based Surface Classification Using Tactile and Vision Features
- Teaching Introduction to Programming in the times of AI: A case study of a course re-design
- Efficient Safety Testing of Autonomous Vehicles via Adaptive Search over Crash-Derived Scenarios
- Leveraging LLMs for Privacy-Aware Predictions in Participatory Budgeting
- Discerning minds or generic tutors? Evaluating instructional guidance capabilities in Socratic LLMs
- Omni Geometry Representation Learning vs Large Language Models for Geospatial Entity Resolution
- Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning
- A Federated Learning Framework for Handling Subtype Confounding and Heterogeneity in Large-Scale Neuroimaging Diagnosis
- Generative Artificial Intelligence Extracts Structure-Function Relationships from Plants for New Materials
- Towards Integrated Alignment
- LLM Unlearning Without an Expert Curated Dataset
- Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
- Generative AI for Intent-Driven Network Management in 6G: A Case Study on Hierarchical Learning Approach
- Generalizing Scaling Laws for Dense and Sparse Large Language Models
- Early Detection of Pancreatic Cancer Using Multimodal Learning on Electronic Health Record
- CoDe-NeRF: Neural Rendering via Dynamic Coefficient Decomposition
- Using Imperfect Synthetic Data in Downstream Inference Tasks
- Segmented Confidence Sequences and Multi-Scale Adaptive Confidence Segments for Anomaly Detection in Nonstationary Time Series
- Fractal Language Modelling by Universal Sequence Maps (USM)
- In-Context Reinforcement Learning via Communicative World Models
- Do Biased Models Have Biased Thoughts?
- MMFformer: Multimodal Fusion Transformer Network for Depression Detection
- Play Favorites: A Statistical Method to Measure Self-Bias in LLM-as-a-Judge
- Large Language Models for Oral History Understanding with Text Classification and Sentiment Analysis
- Learning Causal Structure Distributions for Robust Planning
- Analysis of Schedule-Free Nonconvex Optimization
- Many-Turn Jailbreaking
- FoundBioNet: A Foundation-Based Model for IDH Genotyping of Glioma from Multi-Parametric MRI
- SafePLUG: Empowering Multimodal LLMs with Pixel-Level Insight and Temporal Grounding for Traffic Accident Understanding
- PANAMA: A Network-Aware MARL Framework for Multi-Agent Path Finding in Digital Twin Ecosystems
- Zero-Direction Probing: A Linear-Algebraic Framework for Deep Analysis of Large-Language-Model Drift
- BiXSE: Improving Dense Retrieval via Probabilistic Graded Relevance Distillation
- PROPS: Progressively Private Self-alignment of Large Language Models
- Mode-Aware Non-Linear Tucker Autoencoder for Tensor-based Unsupervised Learning
- Geometry-Aware Spiking Graph Neural Network
- LSDTs: LLM-Augmented Semantic Digital Twins for Adaptive Knowledge-Intensive Infrastructure Planning
- Hardness-Aware Dynamic Curriculum Learning for Robust Multimodal Emotion Recognition with Missing Modalities
- Solving Pasur Using GPU-Accelerated Counterfactual Regret Minimization
- Operationalizing Serendipity: Multi-Agent AI Workflows for Enhanced Materials Characterization with Theory-in-the-Loop
- IRL-VLA: Training an Vision-Language-Action Policy via Reward World Model
- CountQA: How Well Do MLLMs Count in the Wild?
- Formal Concept Analysis: a Structural Framework for Variability Extraction and Analysis
- Zero-Shot Cellular Trajectory Map Matching
- Probabilistic Circuits for Knowledge Graph Completion with Reduced Rule Sets
- GLIDR: Graph-Like Inductive Logic Programming with Differentiable Reasoning
- ParBalans: Parallel Multi-Armed Bandits-based Adaptive Large Neighborhood Search
- Topology Generation of UAV Covert Communication Networks: A Graph Diffusion Approach with Incentive Mechanism
- Pushing the Envelope of LLM Inference on AI-PC
- A Fuzzy Logic Prompting Framework for Large Language Models in Adaptive and Uncertain Tasks
- Natural Language-Driven Viewpoint Navigation for Volume Exploration via Semantic Block Representation
- Remote Sensing Image Intelligent Interpretation with the Language-Centered Perspective: Principles, Methods and Challenges
- Multi-level Advantage Credit Assignment for Cooperative Multi-Agent Reinforcement Learning
- MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams
- MeteorPred: A Meteorological Multimodal Large Model and Dataset for Severe Weather Event Prediction
- Pushdown Reward Machines for Reinforcement Learning
- GDBA Revisited: Unleashing the Power of Guided Local Search for Distributed Constraint Optimization
- Automated Formalization via Conceptual Retrieval-Augmented LLMs
- Intrinsic Explainability of Multimodal Learning for Crop Yield Prediction
Research Sources: 1029 | Generated: 8/25/2025