AI RESEARCH PAPERS & ACADEMIC SOURCES
- Efficient Image-to-Image Schrödinger Bridge for CT Field of View Extension
- Fluid Dynamics and Domain Reconstruction from Noisy Flow Images Using Physics-Informed Neural Networks and Quasi-Conformal Mapping
- Temporally-Similar Structure-Aware Spatiotemporal Fusion of Satellite Images
- Allen: Rethinking MAS Design through Step-Level Policy Autonomy
- Guiding WaveMamba with Frequency Maps for Image Debanding
- AnatoMaskGAN: GNN-Driven Slice Feature Fusion and Noise Augmentation for Medical Semantic Image Synthesis
- LKFMixer: Exploring Large Kernel Feature For Efficient Image Super-Resolution
- Subcortical Masks Generation in CT Images via Ensemble-Based Cross-Domain Label Transfer
- SPG: Style-Prompting Guidance for Style-Specific Content Creation
- Relative Position Matters: Trajectory Prediction and Planning with Polar Representation
- Scanpath Prediction in Panoramic Videos via Expected Code Length Minimization
- Lightweight Attribute Localizing Models for Pedestrian Attribute Recognition
- Compositional Zero-shot Learning via Progressive Language-based Observations
- Wild2Avatar: Rendering Humans Behind Occlusions
- Effective Message Hiding with Order-Preserving Mechanisms
- Reconstructing Satellites in 3D from Amateur Telescope Images
- Towards Physically Realizable Adversarial Attacks in Embodied Vision Navigation
- Efficient High-Resolution Visual Representation Learning with State Space Model for Human Pose Estimation
- MUNBa: Machine Unlearning via Nash Bargaining
- GBR: Generative Bundle Refinement for High-fidelity Gaussian Splatting with Enhanced Mesh Reconstruction
- Learning an Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking
- Towards Consumer-Grade Cybersickness Prediction: Multi-Model Alignment for Real-Time Vision-Only Inference
- LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recognition
- FairT2I: Mitigating Social Bias in Text-to-Image Generation via Large Language Model-Assisted Detection and Attribute Rebalancing
- Introducing Unbiased Depth into 2D Gaussian Splatting for High-accuracy Surface Reconstruction
- Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation
- AFR-CLIP: Enhancing Zero-Shot Industrial Anomaly Detection with Stateless-to-Stateful Anomaly Feature Rectification
- Image-to-Text for Medical Reports Using Adaptive Co-Attention and Triple-LSTM Module
- Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis
- Towards Generalizable Forgery Detection and Reasoning
- Casual3DHDR: High Dynamic Range 3D Gaussian Splatting from Casually Captured Videos
- Marmot: Object-Level Self-Correction via Multi-Agent Reasoning
- Physics-Guided Image Dehazing Diffusion
- LSVG: Language-Guided Scene Graphs with 2D-Assisted Multi-Modal Encoding for 3D Visual Grounding
- PRS-Med: Position Reasoning Segmentation with Vision-Language Model in Medical Imaging
- PhysLab: A Benchmark Dataset for Multi-Granularity Visual Parsing of Physics Experiments
- Learning Camera-Agnostic White-Balance Preferences
- HealthiVert-GAN: A Novel Framework of Pseudo-Healthy Vertebral Image Synthesis for Interpretable Compression Fracture Grading
- Pathology-Guided AI System for Accurate Segmentation and Diagnosis of Cervical Spondylosis
- HepatoGEN: Generating Hepatobiliary Phase MRI with Perceptual and Adversarial Models
- Privacy Enhancement for Gaze Data Using a Noise-Infused Autoencoder
- A Survey on Video Temporal Grounding with Multimodal Large Language Model
- VSF: Simple, Efficient, and Effective Negative Guidance in Few-Step Image Generation Models By Value Sign Flip
- Relative Pose Regression with Pose Auto-Encoders: Enhancing Accuracy and Data Efficiency for Retail Applications
- ViPE: Video Pose Engine for 3D Geometric Perception
- Vision-Only Gaussian Splatting for Collaborative Semantic Occupancy Prediction
- Personalized Face Super-Resolution with Identity Decoupling and Fitting
- Deep Learning for Automated Identification of Vietnamese Timber Species: A Tool for Ecological Monitoring and Conservation
- NIRMAL Pooling: An Adaptive Max Pooling Approach with Non-linear Activation for Enhanced Image Classification
- Topological Structure Description for Artcode Detection Using the Shape of Orientation Histogram
- Analysis of the Compaction Behavior of Textile Reinforcements in Low-Resolution In-Situ CT Scans via Machine-Learning and Descriptor-Based Methods
- IPG: Incremental Patch Generation for Generalized Adversarial Patch Training
- MedAtlas: Evaluating LLMs for Multi-Round, Multi-Task Medical Reasoning Across Diverse Imaging Modalities and Clinical Text
- From Promise to Practical Reality: Transforming Diffusion MRI Analysis with Fast Deep Learning Enhancement
- CSNR and JMIM Based Spectral Band Selection for Reducing Metamerism in Urban Driving
- EVCtrl: Efficient Control Adapter for Visual Generation
- Are Large Pre-trained Vision Language Models Effective Construction Safety Inspectors?
- MedSAMix: A Training-Free Model Merging Approach for Medical Image Segmentation
- Advancing 3D Scene Understanding with MV-ScanQA Multi-View Reasoning Evaluation and TripAlign Pre-training Dataset
- Data-Driven Abdominal Phenotypes of Type 2 Diabetes in Lean, Overweight, and Obese Cohorts
- HierOctFusion: Multi-scale Octree-based 3D Shape Generation via Part-Whole-Hierarchy Message Passing
- UWB-PostureGuard: A Privacy-Preserving RF Sensing System for Continuous Ergonomic Sitting Posture Monitoring
- Residual-based Efficient Bidirectional Diffusion Model for Image Dehazing and Haze Generation
- LEARN: A Story-Driven Layout-to-Image Generation Framework for STEM Instruction
- Semi-supervised Image Dehazing via Expectation-Maximization and Bidirectional Brownian Bridge Diffusion Models
- VFM-Guided Semi-Supervised Detection Transformer for Source-Free Object Detection in Remote Sensing Images
- Exploring the Tradeoff Between Diversity and Discrimination for Continuous Category Discovery
- Fine-Grained VLM Fine-tuning via Latent Hierarchical Adapter Learning
- Versatile Video Tokenization with Generative 2D Gaussian Splatting
- Generating Dialogues from Egocentric Instructional Videos for Task Assistance: Dataset, Method and Benchmark
- UAV-VL-R1: Generalizing Vision-Language Models via Supervised Fine-Tuning and Multi-Stage GRPO for UAV Visual Reasoning
- A Coarse-to-Fine Human Pose Estimation Method based on Two-stage Distillation and Progressive Graph Neural Network
- FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation
- Domain-aware Category-level Geometry Learning Segmentation for 3D Point Clouds
- Unifying Scale-Aware Depth Prediction and Perceptual Priors for Monocular Endoscope Pose Estimation and Tissue Reconstruction
- TimeMachine: Fine-Grained Facial Age Editing with Identity Preservation
- Hyperspectral vs. RGB for Pedestrian Segmentation in Urban Driving Scenes: A Comparative Study
- Denoise-then-Retrieve: Text-Conditioned Video Denoising for Video Moment Retrieval
- Logic Unseen: Revealing the Logical Blindspots of Vision-Language Models
- Delving into Dynamic Scene Cue-Consistency for Robust 3D Multi-Object Tracking
- Noise Matters: Optimizing Matching Noise for Diffusion Classifiers
- GANDiff FR: Hybrid GAN Diffusion Synthesis for Causal Bias Attribution in Face Recognition
- Index-Aligned Query Distillation for Transformer-based Incremental Object Detection
- Cost-Effective Active Labeling for Data-Efficient Cervical Cell Classification
- HOID-R1: Reinforcement Learning for Open-World Human-Object Interaction Detection Reasoning with Multimodal Large Language Model
- RMFAT: Recurrent Multi-scale Feature Atmospheric Turbulence Mitigator
- Training-free Dimensionality Reduction via Feature Truncation: Enhancing Efficiency in Privacy-preserving Multi-Biometric Systems
- ImagiDrive: A Unified Imagination-and-Planning Framework for Autonomous Driving
- Remove360: Benchmarking Residuals After Object Removal in 3D Gaussian Splatting
- MM-R1: Unleashing the Power of Unified Multimodal Large Language Models for Personalized Image Generation
- Data-Driven Deepfake Image Detection Method -- The 2024 Global Deepfake Image Detection Challenge
- CoFi: A Fast Coarse-to-Fine Few-Shot Pipeline for Glomerular Basement Membrane Segmentation
- TACR-YOLO: A Real-time Detection Framework for Abnormal Human Behaviors Enhanced with Coordinate and Task-Aware Representations
- OpenConstruction: A Systematic Synthesis of Open Visual Datasets for Data-Centric Artificial Intelligence in Construction Monitoring
- CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models
- Automated Building Heritage Assessment Using Street-Level Imagery
- Perception in Plan: Coupled Perception and Planning for End-to-End Autonomous Driving
- Hierarchical Graph Feature Enhancement with Adaptive Frequency Modulation for Visual Recognition
- AIM: Amending Inherent Interpretability via Self-Supervised Masking
- A Real-time Concrete Crack Detection and Segmentation Model Based on YOLOv11
- Multi-State Tracker: Enhancing Efficient Object Tracking via Multi-State Specialization and Interaction
- Reinforcing Video Reasoning Segmentation to Think Before It Segments
- Training-Free Anomaly Generation via Dual-Attention Enhancement in Diffusion Model
- TrajSV: A Trajectory-based Model for Sports Video Representations and Applications
- Causality Matters: How Temporal Information Emerges in Video Language Models
- DashCam Video: A complementary low-cost data stream for on-demand forest-infrastructure system monitoring
- CoreEditor: Consistent 3D Editing via Correspondence-constrained Diffusion
- LoRAtorio: An intrinsic approach to LoRA Skill Composition
- Thyme: Think Beyond Images
- The Role of Radiographic Knee Alignment in Knee Replacement Outcomes and Opportunities for Artificial Intelligence-Driven Assessment
- Failures to Surface Harmful Contents in Video Large Language Models
- GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning
- Visual-RAG: Benchmarking Text-to-Image Retrieval Augmented Generation for Visual Knowledge Intensive Queries
- Inaccuracy of an E-Dictionary and Its Influence on Chinese Language Users
- Investigating the Effect of Parallel Data in the Cross-Lingual Transfer for Vision-Language Encoders
- Relationship Detection on Tabular Data Using Statistical Analysis and Large Language Models
- Causal Language in Observational Studies: Sociocultural Backgrounds and Team Composition
- MMESGBench: Pioneering Multimodal Understanding and Complex Reasoning Benchmark for ESG Tasks
- A2HCoder: An LLM-Driven Coding Agent for Hierarchical Algorithm-to-HDL Translation
- PersonaTwin: A Multi-Tier Prompt Conditioning Framework for Generating and Evaluating Personalized Digital Twins
- Hell or High Water: Evaluating Agentic Recovery from External Failures
- BIPOLAR: Polarization-based granular framework for LLM bias evaluation
- Approaching the Source of Symbol Grounding with Confluent Reductions of Abstract Meaning Representation Directed Graphs
- Towards Reliable Multi-Agent Systems for Marketing Applications via Reflection, Memory, and Planning
- MobQA: A Benchmark Dataset for Semantic Understanding of Human Mobility Data through Question Answering
- Overcoming Low-Resource Barriers in Tulu: Neural Models and Corpus Creation for Offensive Language Identification
- Personalized Distractor Generation via MCTS-Guided Reasoning Reconstruction
- Novel Parasitic Dual-Scale Modeling for Efficient and Accurate Multilingual Speech Translation
- UNVEILING: What Makes Linguistics Olympiad Puzzles Tricky for LLMs?
- AI in Mental Health: Emotional and Sentiment Analysis of Large Language Models' Responses to Depression, Anxiety, and Stress Queries
- SafeConstellations: Steering LLM Safety to Reduce Over-Refusals Through Task-Specific Trajectory
- LLM Compression: How Far Can We Go in Balancing Size and Performance?
- SpecDetect: Simple, Fast, and Training-Free Detection of LLM-Generated Text via Spectral Analysis
- Feedback Indicators: The Alignment between Llama and a Teacher in Language Learning
- Survey-to-Behavior: Downstream Alignment of Human Values in LLMs via Survey Questions
- HumorPlanSearch: Structured Planning and HuCoT for Contextual AI Humor
- Online Anti-sexist Speech: Identifying Resistance to Gender Bias in Political Discourse
- CoDiEmb: A Collaborative yet Distinct Framework for Unified Representation Learning in Information Retrieval and Semantic Textual Similarity
- Speciesism in AI: Evaluating Discrimination Against Animals in Large Language Models
- Language models align with brain regions that represent concepts across modalities
- AgentMental: An Interactive Multi-Agent Framework for Explainable and Adaptive Mental Health Assessment
- Representing Speech Through Autoregressive Prediction of Cochlear Tokens
- Dataset Creation for Visual Entailment using Generative AI
- TinyTim: A Family of Language Models for Divergent Generation
- The Next Phase of Scientific Fact-Checking: Advanced Evidence Retrieval from Complex Structured Academic Papers
- Empowering Multimodal LLMs with External Tools: A Comprehensive Survey
- Can Multi-modal (reasoning) LLMs detect document manipulation?
- PaperRegister: Boosting Flexible-grained Paper Search via Hierarchical Register Indexing
- +VeriRel: Verification Feedback to Enhance Document Retrieval for Scientific Fact Checking
- Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
- Benchmarking Prosody Encoding in Discrete Speech Tokens
- Emphasis Sensitivity in Speech Representations
- RULEBREAKERS: Challenging LLMs at the Crossroads between Formal Logic and Human-like Reasoning
- Personalized LLM for Generating Customized Responses to the Same Query from Different Users
- A Dual-Perspective NLG Meta-Evaluation Framework with Automatic Benchmark and Better Interpretability
- FAB-PPI: Frequentist, Assisted by Bayes, Prediction-Powered Inference
- Generalizable speech deepfake detection via meta-learned LoRA
- Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
- Adaptive Bayesian Optimization for Robust Identification of Stochastic Dynamical Systems
- Incorporating Coupling Knowledge into Echo State Networks for Learning Spatiotemporally Chaotic Dynamics
- Prεεmpt: Sanitizing Sensitive Prompts for LLMs
- ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism
- Semantically Guided Adversarial Testing of Vision Models Using Language Models
- Unified Knowledge Distillation Framework: Fine-Grained Alignment and Geometric Relationship Preservation for Deep Face Recognition
- Model Interpretability and Rationale Extraction by Input Mask Optimization
- Rationalizing Transformer Predictions via End-To-End Differentiable Self-Training
- SelfAdapt: Unsupervised Domain Adaptation of Cell Segmentation Models
- Semi-Supervised Learning with Online Knowledge Distillation for Skin Lesion Classification
- An Efficient Medical Image Classification Method Based on a Lightweight Improved ConvNeXt-Tiny Architecture
- Investigating Sensors and Methods in Grasp State Classification in Agricultural Manipulation
- Nonparametric learning of stochastic differential equations from sparse and noisy data
- Enabling Differentially Private Federated Learning for Speech Recognition: Benchmarks, Adaptive Optimizers and Gradient Clipping
- Discovering Invariant Neighborhood Patterns for Heterophilic Graphs
- A Spectral Framework for Evaluating Geodesic Distances Between Graphs
- Incorporating Arbitrary Matrix Group Equivariance into KANs
- Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
- Perfect Counterfactuals in Imperfect Worlds: Modelling Noisy Implementation of Actions in Sequential Algorithmic Recourse
- Embedding Safety into RL: A New Take on Trust Region Methods
- Learning-based Sketches for Frequency Estimation in Data Streams without Ground Truth
- Vulnerability of Text-Matching in ML/AI Conference Reviewer Assignments to Collusions
- A Survey on Pre-Trained Diffusion Model Distillations
- An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models
- SAND: One-Shot Feature Selection with Additive Noise Distortion
- Neighbour-Driven Gaussian Process Variational Autoencoders for Scalable Structured Latent Modelling
- Central Path Proximal Policy Optimization
- Theory of Decentralized Robust Kernel-Based Learning
- LETS Forecast: Learning Embedology for Time Series Forecasting
- Chasing Moving Targets with Online Self-Play Reinforcement Learning for Safer Language Models
- Structured Generative Modeling with the Thermodynamic Kolmogorov-Arnold Model
- Synthetic Data for Robust Stroke Segmentation
- An Efficient Deep Learning Approach for Approximating Parameter-to-Solution Maps of PDEs
- Modeling Sampling Distributions of Test Statistics with Autograd
- Beyond algorithm hyperparameters: on preprocessing hyperparameters and associated pitfalls in machine learning applications
- Convergence of Statistical Estimators via Mutual Information Bounds
- AlgoTune: Can Language Models Speed Up General-Purpose Numerical Programs?
- A Cooperative Game-Based Multi-Criteria Weighted Ensemble Approach for Multi-Class Classification
- BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining
- Quantization vs Pruning: Insights from the Strong Lottery Ticket Hypothesis
- Conditional Independence Estimates for the Generalized Nonparanormal
- SHLIME: Foiling adversarial attacks fooling SHAP and LIME
- Abundance-Aware Set Transformer for Microbiome Sample Embedding
- A Feasibility Experiment on the Application of Predictive Coding to Instant Messaging Corpora
- Relative Advantage Debiasing for Watch-Time Prediction in Short-Video Recommendation
- Predictive Multimodal Modeling of Diagnoses and Treatments in EHR
- Hybrid-Hierarchical Fashion Graph Attention Network for Compatibility-Oriented and Personalized Outfit Recommendation
- CTRL Your Shift: Clustered Transfer Residual Learning for Many Small Datasets
- Towards the Next-generation Bayesian Network Classifiers
- Mitigating Modality Quantity and Quality Imbalance in Multimodal Online Federated Learning
- Meta-learning Structure-Preserving Dynamics
- Borrowing From the Future: Enhancing Early Risk Assessment through Contrastive Learning
- Air Quality PM2.5 Index Prediction Model Based on CNN-LSTM
- Enhancing Interactive Voting-Based Map Matching: Improving Efficiency and Robustness for Heterogeneous GPS Trajectories
- Group Fairness Meets the Black Box: Enabling Fair Algorithms on Closed LLMs via Post-Processing
- Boosting the Robustness-Accuracy Trade-off of SNNs by Robust Temporal Self-Ensemble
- Generalize across Homophily and Heterophily: Hybrid Spectral Graph Pre-Training and Prompt Tuning
- Conformal Prediction Meets Long-tail Classification
- A Global Dataset of Location Data Integrity-Assessed Reforestation Efforts
- Harmonized Gradient Descent for Class Imbalanced Data Stream Online Learning
- Fusing Rewards and Preferences in Reinforcement Learning
- A Remedy for Over-Squashing in Graph Learning via Forman-Ricci Curvature based Graph-to-Hypergraph Structural Lifting
- Generative Co-Design of Antibody Sequences and Structures via Black-Box Guidance in a Shared Latent Space
- Robust Convolution Neural ODEs via Contractivity-promoting regularization
- Multi-Sensory Cognitive Computing for Learning Population-level Brain Connectivity
- Calibrated and uncertain? Evaluating uncertainty estimates in binary classification models
- Predicting and Explaining Traffic Crash Severity Through Crash Feature Selection
- DiCriTest: Testing Scenario Generation for Decision-Making Agents Considering Diversity and Criticality
- Finite-Width Neural Tangent Kernels from Feynman Diagrams
- Physics-Informed Diffusion Models for Unsupervised Anomaly Detection in Multivariate Time Series
- DFed-SST: Building Semantic- and Structure-aware Topologies for Decentralized Federated Graph Learning
- Nested Operator Inference for Adaptive Data-Driven Learning of Reduced-order Models
- SeamlessFlow: A Trainer Agent Isolation RL Framework Achieving Bubble-Free Pipelines via Tag Scheduling
- Optimal CO2 storage management considering safety constraints in multi-stakeholder multi-site CCS projects: a game theoretic perspective
- Data-driven global ocean model resolving ocean-atmosphere coupling dynamics
- Uncovering Latent Connections in Indigenous Heritage: Semantic Pipelines for Cultural Preservation in Brazil
- Insect-Wing Structured Microfluidic System for Reservoir Computing
- CleanCTG: A Deep Learning Model for Multi-Artefact Detection and Reconstruction in Cardiotocography
- HQ-OV3D: A High Box Quality Open-World 3D Detection Framework based on Diffusion Model
- Non-asymptotic convergence bound of conditional diffusion models
- iWatchRoad: Scalable Detection and Geospatial Visualization of Potholes for Smart Cities
- Improving Text Style Transfer using Masked Diffusion Language Models with Inference-time Scaling
- Counterfactual Survival Q Learning for Longitudinal Randomized Trials via Buckley James Boosting
- Human-in-the-Loop Systems for Adaptive Learning Using Generative AI
- Functional Analysis of Variance for Association Studies
- The Role of Entanglement in Quantum Reservoir Computing with Coupled Kerr Nonlinear Oscillators
- HistoViT: Vision Transformer for Accurate and Scalable Histopathological Cancer Diagnosis
- CHARM3R: Towards Unseen Camera Height Robust Monocular 3D Detector
- A CLIP-based Uncertainty Modal Modeling (UMM) Framework for Pedestrian Re-Identification in Autonomous Driving
- Uniform convergence for Gaussian kernel ridge regression
- Probing the Representational Power of Sparse Autoencoders in Vision Models
- Approximating the universal thermal climate index using sparse regression with orthogonal polynomials
- Repetitive TMS-based Identification of Methamphetamine-Dependent Individuals Using EEG Spectra
- Weighted First Order Model Counting for Two-variable Logic with Axioms on Two Relations
- A Comprehensive Perspective on Explainable AI across the Machine Learning Workflow
- ADMIRE-BayesOpt: Accelerated Data MIxture RE-weighting for Language Models with Bayesian Optimization
- Aware First, Think Less: Dynamic Boundary Self-Awareness Drives Extreme Reasoning Efficiency in Large Language Models
- Visual Perception Engine: Fast and Flexible Multi-Head Inference for Robotic Vision Tasks
- CryptoScope: Utilizing Large Language Models for Automated Cryptographic Logic Vulnerability Detection
- Pretrained Conformers for Audio Fingerprinting and Retrieval
- Controlling Multimodal LLMs via Reward-guided Decoding
- Is ChatGPT-5 Ready for Mammogram VQA?
- Sophisticated Learning: A novel algorithm for active learning during model-based planning
- MetaAgents: Large Language Model Based Agents for Decision-Making on Teaming
- Tool-Planner: Task Planning with Clusters across Multiple Tools
- Sketch Decompositions for Classical Planning via Deep Reinforcement Learning
- Learning to Be A Doctor: Searching for Effective Medical Agent Architectures
- CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking
- Recent Advances in Generative AI for Healthcare Applications
- Large-Scale Multi-Robot Assembly Planning for Autonomous Manufacturing
- JMA: a General Algorithm to Craft Nearly Optimal Targeted Adversarial Example
- A Survey on Recent Advances in LLM-Based Multi-turn Dialogue Systems
- TokenRec: Learning to Tokenize ID for LLM-based Generative Recommendation
- Clean-Label Physical Backdoor Attacks with Data Distillation
- Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding
- SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression
- Data Diversity as Implicit Regularization: How Does Diversity Shape the Weight Space of Deep Neural Networks?
- Language-Based Bayesian Optimization Research Assistant (BORA)
- Uncertainty-Aware Adaptation of Large Language Models for Protein-Protein Interaction Analysis
- Human-AI Experience in Integrated Development Environments: A Systematic Literature Review
- Feather-SQL: A Lightweight NL2SQL Framework with Dual-Model Collaboration Paradigm for Small Language Models
- EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing
- L3AC: Towards a Lightweight and Lossless Audio Codec
- Once Upon an AI: Six Scaffolds for Child-AI Interaction Design, Inspired by Disney
- EmbodiedAgent: A Scalable Hierarchical Approach to Overcome Practical Challenge in Multi-Robot Control
- SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models
- Convolutional Autoencoders for Data Compression and Anomaly Detection in Small Satellite Technologies
- Blending 3D Geometry and Machine Learning for Multi-View Stereopsis
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
- A Closer Look at Multimodal Representation Collapse
- Omni-DPO: A Dual-Perspective Paradigm for Dynamic Preference Learning of LLMs
- ViFusionTST: Deep Fusion of Time-Series Image Representations from Load Signals for Early Bed-Exit Prediction
- Text-to-Level Diffusion Models With Various Text Encoders for Super Mario Bros
- What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models
- Actor-Critic for Continuous Action Chunks: A Reinforcement Learning Framework for Long-Horizon Robotic Manipulation with Sparse Reward
- AlphaAgents: Large Language Model based Multi-Agents for Equity Portfolio Constructions
- Role-Augmented Intent-Driven Generative Search Engine Optimization
- Better Supervised Fine-tuning for VQA: Integer-Only Loss
- A Semi-supervised Generative Model for Incomplete Multi-view Data Integration with Missing Labels
- Quantum-Boosted High-Fidelity Deep Learning
- E-CaTCH: Event-Centric Cross-Modal Attention with Temporal Consistency and Class-Imbalance Handling for Misinformation Detection
- Visuomotor Grasping with World Models for Surgical Robots
- StyleMM: Stylized 3D Morphable Face Model via Text-Driven Aligned Image Translation
- Multi-Group Equivariant Augmentation for Reinforcement Learning in Robot Manipulation
- How Causal Abstraction Underpins Computational Explanation
- ORFuzz: Fuzzing the "Other Side" of LLM Safety -- Testing Over-Refusal
- Cross-Granularity Hypergraph Retrieval-Augmented Generation for Multi-hop Question Answering
- Graph Neural Diffusion via Generalized Opinion Dynamics
- Generalized Decoupled Learning for Enhancing Open-Vocabulary Dense Perception
- Hallucination in LLM-Based Code Generation: An Automotive Case Study
- Vision-Language Models display a strong gender bias
- Enhancing Supervised Composed Image Retrieval via Reasoning-Augmented Representation Engineering
- Is General-Purpose AI Reasoning Sensitive to Data-Induced Cognitive Biases? Dynamic Benchmarking on Typical Software Engineering Dilemmas
- LETToT: Label-Free Evaluation of Large Language Models On Tourism Using Expert Tree-of-Thought
- ToxiFrench: Benchmarking and Enhancing Language Models via CoT Fine-Tuning for French Toxicity Detection
- Scene Graph-Guided Proactive Replanning for Failure-Resilient Embodied Agent
- CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems
- Dynamic Quality-Latency Aware Routing for LLM Inference in Wireless Edge-Device Networks
- SGSimEval: A Comprehensive Multifaceted and Similarity-Enhanced Benchmark for Automatic Survey Generation Systems
- RegimeNAS: Regime-Aware Differentiable Architecture Search With Theoretical Guarantees for Financial Trading
- NeMo: A Neuron-Level Modularizing-While-Training Approach for Decomposing DNN Models
- Leveraging the RETFound foundation model for optic disc segmentation in retinal images
- ETTRL: Balancing Exploration and Exploitation in LLM Test-Time Reinforcement Learning Via Entropy Mechanism
- PTSM: Physiology-aware and Task-invariant Spatio-temporal Modeling for Cross-Subject EEG Decoding
- Minimizing Surrogate Losses for Decision-Focused Learning using Differentiable Optimization
- Does the Skeleton-Recall Loss Really Work?
- G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration
- When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs
- Retrieval-augmented reasoning with lean language models
- Trustworthy AI Psychotherapy: Multi-Agent LLM Workflow for Counseling and Explainable Mental Disorder Diagnosis
- An Exploratory Study on Crack Detection in Concrete through Human-Robot Collaboration
- Open, Reproducible and Trustworthy Robot-Based Experiments with Virtual Labs and Digital-Twin-Based Execution Tracing
- On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting
- Informative Post-Hoc Explanations Only Exist for Simple Functions
- Inside Knowledge: Graph-based Path Generation with Explainable Data Augmentation and Curriculum Learning for Visual Indoor Navigation
- Reference Points in LLM Sentiment Analysis: The Role of Structured Context
- RMSL: Weakly-Supervised Insider Threat Detection with Robust Multi-sphere Learning
- Handwritten Text Recognition of Historical Manuscripts Using Transformer-Based Models
- Sim2Dust: Mastering Dynamic Waypoint Tracking on Granular Media
- Towards Faithful Class-level Self-explainability in Graph Neural Networks by Subgraph Dependencies
- Grounding Rule-Based Argumentation Using Datalog
- From Individual to Multi-Agent Algorithmic Recourse: Minimizing the Welfare Gap via Capacitated Bipartite Matching
- Learn to optimize for automatic proton PBS treatment planning for H&N cancers
- On Strong and Weak Admissibility in Non-Flat Assumption-Based Argumentation
- Beyond Solving Math Quiz: Evaluating the Ability of Large Reasoning Models to Ask for Information
- SAGE: Scale-Aware Gradual Evolution for Continual Knowledge Graph Embedding
- CRAFT-GUI: Curriculum-Reinforced Agent For GUI Tasks
- AIM-Bench: Evaluating Decision-making Biases of Agentic LLM as Inventory Manager
- Inclusion Arena: An Open Platform for Evaluating Large Foundation Models with Real-World Apps
- Landmark-Assisted Monte Carlo Planning
- Inspire or Predict? Exploring New Paradigms in Assisting Classical Planners with Large Language Models
- A weighted U statistic for association analysis considering genetic heterogeneity
- A Generalized Similarity U Test for Multivariate Analysis of Sequencing Data
- A Weighted U Statistic for Genetic Association Analyses of Sequencing Data
- Trees Assembling Mann Whitney Approach for Detecting Genome-wide Joint Association among Low Marginal Effect loci
- Generalized Similarity U: A Non-parametric Test of Association Based on Similarity
- FLUID: Flow-Latent Unified Integration via Token Distillation for Expert Specialization in Multimodal Learning
- SDSNN: A Single-Timestep Spiking Neural Network with Self-Dropping Neuron and Bayesian Optimization
- Multimodal Quantitative Measures for Multiparty Behaviour Evaluation
- Managing the unexpected: Operator behavioural data and its value in predicting correct alarm responses
- Human-AI collaboration or obedient and often clueless AI in instruct, serve, repeat dynamics?
- gpt-oss-120b & gpt-oss-20b Model Card
- Modeling and Detecting Company Risks from News: A Case Study in Bloomberg News
- Apriel-Nemotron-15B-Thinker
- Towards Efficient Prompt-based Continual Learning in Distributed Medical AI
- ORBIT: An Object Property Reasoning Benchmark for Visual Inference Tasks
- Retro-Expert: Collaborative Reasoning for Interpretable Retrosynthesis
- Rule2Text: A Framework for Generating and Evaluating Natural Language Explanations of Knowledge Graph Rules
- Not There Yet: Evaluating Vision Language Models in Simulating the Visual Perception of People with Low Vision
- MCP-Guard: A Defense Framework for Model Context Protocol Integrity in Large Language Model Applications
- Match & Choose: Model Selection Framework for Fine-tuning Text-to-Image Diffusion Models
- SproutBench: A Benchmark for Safe and Ethical Large Language Models for Youth
- Deep Learning-Based Automated Segmentation of Uterine Myomas
- CURE: Critical-Token-Guided Re-concatenation for Entropy-collapse Prevention
- Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics
- Zono-Conformal Prediction: Zonotope-Based Uncertainty Quantification for Regression and Classification Tasks
- Risk-Based Prognostics and Health Management
- Note on Selection Bias in Observational Estimates of Algorithmic Progress
- Learning with Confidence
- AI That Helps Us Help Each Other: A Proactive System for Scaffolding Mentor-Novice Collaboration in Entrepreneurship Coaching
- LD-LAudio-V1: Video-to-Long-Form-Audio Generation Extension with Dual Lightweight Adapters
- Compressive Meta-Learning
- Utilizing Vision-Language Models as Action Models for Intent Recognition and Assistance
- Diffusion is a code repair operator and generator
- Quantization through Piecewise-Affine Regularization: Optimization and Statistical Guarantees
- Tabularis Formatus: Predictive Formatting for Tables
- MoNaCo: More Natural and Complex Questions for Reasoning Across Dozens of Documents
- A Cross-Modal Rumor Detection Scheme via Contrastive Learning by Exploring Text and Image internal Correlations
Research Sources: 386 | Generated: 8/25/2025