AI RESEARCH PAPERS & ACADEMIC SOURCES
- Estimating oil recovery factor using machine learning: Applications of XGBoost classification
- Lightweight posterior construction for gravitational-wave catalogs with the Kolmogorov-Arnold network
- How many samples are needed to train a deep neural network?
- Deshadow-Anything: When Segment Anything Model Meets Zero-shot shadow removal
- Meta-Learned Modality-Weighted Knowledge Distillation for Robust Multi-Modal Learning with Missing Data
- FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization
- MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration
- Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes
- StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models
- Survey on Monocular Metric Depth Estimation
- Single-Domain Generalized Object Detection by Balancing Domain Diversity and Invariance
- PromptGAR: Flexible Promptive Group Activity Recognition
- Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness
- LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification
- ForgetMe: Evaluating Selective Forgetting in Generative Models
- WMKA-Net: A Weighted Multi-Kernel Attention Network for Retinal Vessel Segmentation
- PhysioSync: Temporal and Cross-Modal Contrastive Learning Inspired by Physiological Synchronization for EEG-Based Emotion Recognition
- RAFT: Robust Augmentation of FeaTures for Image Segmentation
- MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection
- Generative Data Augmentation for Object Point Cloud Segmentation
- Egocentric Human-Object Interaction Detection: A New Benchmark and Method
- Prompt-based Dynamic Token Pruning for Efficient Segmentation of Medical Images
- OpenEvents V1: Large-Scale Benchmark Dataset for Multimodal Event Grounding
- Confidence-driven Gradient Modulation for Multimodal Human Activity Recognition: A Dynamic Contrastive Dual-Path Learning Approach
- DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
- Online Micro-gesture Recognition Using Data Augmentation and Spatial-Temporal Attention
- MOSformer: Momentum encoder-based inter-slice fusion transformer for medical image segmentation
- TimeFlow: Temporal Conditioning for Longitudinal Brain MRI Registration and Aging Analysis
- Enhancing Reusability of Learned Skills for Robot Manipulation via Gaze Information and Motion Bottlenecks
- Image Coding for Machines via Feature-Preserving Rate-Distortion Optimization
- Objective Task-based Evaluation of Quantitative Medical Imaging Methods: Emerging Frameworks and Future Directions
- TRAN-D: 2D Gaussian Splatting-based Sparse-view Transparent Object Depth Reconstruction via Physics Simulation for Scene Update
- Handcrafted vs. Deep Radiomics vs. Fusion vs. Deep Learning: A Comprehensive Review of Machine Learning-Based Cancer Outcome Prediction in PET and SPECT Imaging
- Style4D-Bench: A Benchmark Suite for 4D Stylization
- Articulate3D: Zero-Shot Text-Driven 3D Object Posing
- VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
- Impact of Target and Tool Visualization on Depth Perception and Usability in Optical See-Through AR
- Controllable Single-shot Animation Blending with Temporal Conditioning
- SemLayoutDiff: Semantic Layout Generation with Diffusion Model for Indoor Scene Synthesis
- Enhancing Video-Based Robot Failure Detection Using Task Knowledge
- A Closer Look at Edema Area Segmentation in SD-OCT Images Using Adversarial Framework
- Hidden Tail: Adversarial Image Causing Stealthy Resource Consumption in Vision-Language Models
- Quantum-Circuit-Based Visual Fractal Image Generation in Qiskit and Analytics
- PanoHair: Detailed Hair Strand Synthesis on Volumetric Heads
- Enhanced UAV Path Planning Using the Tangent Intersection Guidance (TIG) Algorithm
- Understanding Benefits and Pitfalls of Current Methods for the Segmentation of Undersampled MRI Data
- Time Series Analysis of Spiking Neural Systems via Transfer Entropy and Directed Persistent Homology
- MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation
- SFormer: SNR-guided Transformer for Underwater Image Enhancement from the Frequency Domain
- Hierarchical Spatio-temporal Segmentation Network for Ejection Fraction Estimation in Echocardiography Videos
- Feature-Space Planes Searcher: A Universal Domain Adaptation Framework for Interpretability and Computational Efficiency
- A Novel Deep Hybrid Framework with Ensemble-Based Feature Optimization for Robust Real-Time Human Activity Recognition
- ColorGS: High-fidelity Surgical Scene Reconstruction with Colored Gaussian Splatting
- Class-wise Flooding Regularization for Imbalanced Image Classification
- Flatness-aware Curriculum Learning via Adversarial Difficulty
- Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vectorized Drawings
- Rethinking Human-Object Interaction Evaluation for both Vision-Language Models and HOI-Specific Methods
- Design, Implementation and Evaluation of a Real-Time Remote Photoplethysmography (rPPG) Acquisition System for Non-Invasive Vital Sign Monitoring
- Robust and Label-Efficient Deep Waste Detection
- Embedding Font Impression Word Tags Based on Co-occurrence
- Deep Pre-trained Time Series Features for Tree Species Classification in the Dutch Forest Inventory
- Automated Classification of Normal and Atypical Mitotic Figures Using ConvNeXt V2: MIDOG 2025 Track 2
- Boosting Micro-Expression Analysis via Prior-Guided Video-Level Regression
- Quantitative Outcome-Oriented Assessment of Microsurgical Anastomosis
- Harnessing Meta-Learning for Controllable Full-Frame Video Stabilization
- Toward Robust Medical Fairness: Debiased Dual-Modal Alignment via Text-Guided Attribute-Disentangled Prompt Learning for Vision-Language Models
- DQEN: Dual Query Enhancement Network for DETR-based HOI Detection
- Event-Enriched Image Analysis Grand Challenge at ACM Multimedia 2025
- Preliminary Study on Space Utilization and Emergent Behaviors of Group vs. Single Pedestrians in Real-World Trajectories
- Generative AI in Map-Making: A Technical Exploration and Its Implications for Cartographers
- Can we make NeRF-based visual localization privacy-preserving?
- Enhancing Document VQA Models via Retrieval-Augmented Generation
- Ask Me Again Differently: GRAS for Measuring Bias in Vision Language Models on Gender, Race, Age, and Skin Tone
- MicroDetect-Net (MDN): Leveraging Deep Learning to Detect Microplastics in Clam Blood, a Step Towards Human Blood Analysis
- ProPy: Building Interactive Prompt Pyramids upon CLIP for Partially Relevant Video Retrieval
- VibES: Induced Vibration for Persistent Event-Based Sensing
- Dual Enhancement on 3D Vision-Language Perception for Monocular 3D Visual Grounding
- Beyond flattening: a geometrically principled positional encoding for vision transformers with Weierstrass elliptic functions
- SoccerNet 2025 Challenges Results
- FastMesh: Efficient Artistic Mesh Generation via Component Decoupling
- All-in-One Slider for Attribute Manipulation in Diffusion Models
- OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation
- Automated Feature Tracking for Real-Time Kinematic Analysis and Shape Estimation of Carbon Nanotube Growth
- Autoregressive Universal Video Segmentation Model
- HateDebias: On the Diversity and Variability of Hate Speech Debiasing
- Recognizing Limits: Investigating Infeasibility in Large Language Models
- Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models
- Adapting Large Language Models to Log Analysis with Interpretable Domain Knowledge
- Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications
- Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence
- Improving Multilingual Language Models by Aligning Representations through Steering
- Truth or Twist? Optimal Model Selection for Reliable Label Flipping Evaluation in LLM-based Counterfactuals
- Can Pruning Improve Reasoning? Revisiting Long-CoT Compression with Capability in Mind for Better Reasoning
- ELSPR: Evaluator LLM Training Data Self-Purification on Non-Transitive Preferences via Tournament Graph Reconstruction
- Subjective Perspectives within Learned Representations Predict High-Impact Innovation
- Weakly-Supervised 3D Visual Grounding based on Visual Language Alignment
- VAGUE: Visual Contexts Clarify Ambiguous Expressions
- Evolutionary Automata and Deep Evolutionary Computation
- Aligning NLP Models with Target Population Perspectives using PAIR: Population-Aligned Instance Replication
- SERES: Semantic-aware neural reconstruction from sparse views
- FastAvatar: Instant 3D Gaussian Splatting for Faces from Single Unconstrained Poses
- Securing Face and Fingerprint Templates in Humanitarian Biometric Systems
- Why Relational Graphs Will Save the Next Generation of Vision Foundation Models?
- LPLC: A Dataset for License Plate Legibility Classification
- VQualA 2025 Challenge on Face Image Quality Assessment: Methods and Results
- DoGFlow: Self-Supervised LiDAR Scene Flow via Cross-Modal Doppler Guidance
- Adaptive Visual Navigation Assistant in 3D RPGs
- Wan-S2V: Audio-Driven Cinematic Video Generation
- Decouple, Reorganize, and Fuse: A Multimodal Framework for Cancer Survival Prediction
- OwlCap: Harmonizing Motion-Detail for Video Captioning via HMD-270K and Caption Set Equivalence Reward
- Semantic Attractors and the Emergence of Meaning: Towards a Teleological Model of AGI
- Not All Visitors are Bilingual: A Measurement Study of the Multilingual Web from an Accessibility Perspective
- Language-Specific Layer Matters: Efficient Multilingual Enhancement for Large Vision-Language Models
- Integral Transformer: Denoising Attention, Not Too Much Not Too Little
- Integrating gender inclusivity into large language models via instruction tuning
- COMET-poly: Machine Translation Metric Grounded in Other Candidates
- The Mind's Eye: A Multi-Faceted Reward Framework for Guiding Visual Metaphor Generation
- A New NMT Model for Translating Clinical Texts from English to Spanish
- Thinking Before You Speak: A Proactive Test-time Scaling Approach
- Emotion Omni: Enabling Empathetic Speech Response Generation through Large Language Models
- Knowing or Guessing? Robust Medical Visual Question Answering via Joint Consistency and Contrastive Learning
- Attention2Probability: Attention-Driven Terminology Probability Estimation for Robust Speech-to-Text System
- Filtering for Creativity: Adaptive Prompting for Multilingual Riddle Generation in LLMs
- EMMM, Explain Me My Model! Explainable Machine Generated Text Detection in Dialogues
- Chronological Passage Assembling in RAG framework for Temporal Question Answering
- ThinkDial: An Open Recipe for Controlling Reasoning Effort in Large Language Models
- Controllable Conversational Theme Detection Track at DSTC 12
- LaTeXTrans: Structured LaTeX Translation with Multi-Agent Coordination
- LLM-based Contrastive Self-Supervised AMR Learning with Masked Graph Autoencoders for Fake News Detection
- Arrows of Math Reasoning Data Synthesis for Large Language Models: Diversity, Complexity and Correctness
- Empowering Computing Education Researchers Through LLM-Assisted Content Analysis
- Affective Polarization across European Parliaments
- MovieCORE: COgnitive REasoning in Movies
- "Where does it hurt?" - Dataset and Study on Physician Intent Trajectories in Doctor Patient Dialogues
- It's All About In-Context Learning! Teaching Extremely Low-Resource Languages to LLMs
- Retrieval-Augmented Generation for Natural Language Art Provenance Searches in the Getty Provenance Index
- Beyond the Black Box: Integrating Lexical and Semantic Methods in Quantitative Discourse Analysis with BERTopic
- Do LVLMs Know What They Know? A Systematic Study of Knowledge Boundary Perception in LVLMs
- Demystifying Scientific Problem-Solving in LLMs by Probing Knowledge and Reasoning
- Evaluating the Evaluators: Are readability metrics good measures of readability?
- Designing across domains with declarative thinking: Insights from the 96-Eyes ptychographic imager project
- UniC-RAG: Universal Knowledge Corruption Attacks to Retrieval-Augmented Generation
- Beyond the Textual: Generating Coherent Visual Options for MCQs
- The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization
- A Survey on Data Selection for LLM Instruction Tuning
- KNN and K-means in Gini Prametric Spaces
- Keep your distance: learning dispersed embeddings on $\mathbb{S}_m$
- General Intelligence Requires Reward-based Pretraining
- Seal Your Backdoor with Variational Defense
- VectorLiteRAG: Latency-Aware and Fine-Grained Resource Partitioning for Efficient RAG
- ChemKANs for Combustion Chemistry Modeling and Acceleration
- Local Learning Rules for Out-of-Equilibrium Physical Generative Models
- Deep Generative Methods and Tire Architecture Design
- Multi-Component VAE with Gaussian Markov Random Field
- PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation
- Data Compression using Rank-1 Lattices for Parameter Estimation in Machine Learning
- Deep vectorised operators for pulsatile hemodynamics estimation in coronary arteries from a steady-state prior
- MCI-GRU: Stock Prediction Model Based on Multi-Head Cross-Attention and Improved GRU
- Human Vision Constrained Super-Resolution
- Incremental Multi-Scene Modeling via Continual Neural Graphics Primitives
- SmartBench: Is Your LLM Truly a Good Chinese Smartphone Assistant?
- Multi-timescale time encoding for CNN prediction of Fenna-Matthews-Olson energy-transfer dynamics
- Data Requirement Goal Modeling for Machine Learning Systems
- Steerable Scene Generation with Post Training and Inference-Time Search
- Uni-AIMS: AI-Powered Microscopy Image Analysis
- Distribution free M-estimation
- Dense Retrievers Can Fail on Simple Queries: Revealing The Granularity Dilemma of Embeddings
- Evaluating DNA function understanding in genomic language models using evolutionarily implausible sequences
- Investigating the Robustness of Extreme Precipitation Super-Resolution Across Climates
- Deterministic Coreset Construction via Adaptive Sensitivity Trimming
- Training Language Model Agents to Find Vulnerabilities with CTF-Dojo
- DenseRec: Revisiting Dense Content Embeddings for Sequential Transformer-based Recommendation
- From Prediction to Simulation: AlphaFold 3 as a Differentiable Framework for Structural Biology
- Context-Aware Zero-Shot Anomaly Detection in Surveillance Using Contrastive and Predictive Spatiotemporal Modeling
- Huracan: A skillful end-to-end data-driven system for ensemble data assimilation and weather prediction
- An Analytical Approach to Privacy and Performance Trade-Offs in Healthcare Data Sharing
- Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems
- Scalable Fairness Shaping with LLM-Guided Multi-Agent Reinforcement Learning for Peer-to-Peer Electricity Markets
- Stress-testing cross-cancer generalizability of 3D nnU-Net for PET-CT tumor segmentation: multi-cohort evaluation with novel oesophageal and lung cancer datasets
- ModAn-MulSupCon: Modality- and Anatomy-Aware Multi-Label Supervised Contrastive Pretraining for Medical Imaging
- Taming the One-Epoch Phenomenon in Online Recommendation System by Two-stage Contrastive ID Pre-training
- Data-Driven Discovery and Formulation Refines the Quasi-Steady Model of Flapping-Wing Aerodynamics
- Are All Marine Species Created Equal? Performance Disparities in Underwater Object Detection
- Rethinking Caching for LLM Serving Systems: Beyond Traditional Heuristics
- Beyond Quality: Unlocking Diversity in Ad Headline Generation with Large Language Models
- Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits
- PseudoMapTrainer: Learning Online Mapping without HD Maps
- Temperature-Aware Recurrent Neural Operator for Temperature-Dependent Anisotropic Plasticity in HCP Materials
- Learning Real-World Acrobatic Flight from Human Preferences
- Sparse minimum Redundancy Maximum Relevance for feature selection
- Forecasting Probability Distributions of Financial Returns with Deep Neural Networks
- The GINN framework: a stochastic QED correspondence for stability and chaos in deep neural networks
- Enhancing compact convolutional transformers with super attention
- USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
- Is attention truly all we need? An empirical study of asset pricing in pretrained RNN sparse and global attention models
- GReAT: leveraging geometric artery data to improve wall shear stress assessment
- Learning Binary Sampling Patterns for Single-Pixel Imaging using Bilevel Optimisation
- CARMA: Collocation-Aware Resource Manager with GPU Memory Estimator
- Universal Dynamics with Globally Controlled Analog Quantum Simulators
- Random forest-based out-of-distribution detection for robust lung cancer segmentation
- A Bag of Tricks for Efficient Implicit Neural Point Clouds
- Echoes of the past: A unified perspective on fading memory and echo states
- Leveraging Evolutionary Surrogate-Assisted Prescription in Multi-Objective Chlorination Control Systems
- Planning-Query-Guided Model Generation for Model-Based Deformable Object Manipulation
- Branch and Bound for Piecewise Linear Neural Network Verification
- Sharp Lower Bounds on Interpolation by Deep ReLU Neural Networks at Irregularly Spaced Data
- Contraction Properties of the Global Workspace Primitive
- Learning Optimal Classification Trees Robust to Distribution Shifts
- Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting
- TopoBench: A Framework for Benchmarking Topological Deep Learning
- Large Language Model Aided QoS Prediction for Service Recommendation
- Activation degree thresholds and expressiveness of polynomial neural networks
- Instruction-Based Molecular Graph Generation with Unified Text-Graph Diffusion Model
- PinnDE: Physics-Informed Neural Networks for Solving Differential Equations
- Gradient Boosting Decision Trees on Medical Diagnosis over Tabular Data
- fLSA: Learning Semantic Structures in Document Collections Using Foundation Models
- Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs
- Graph Neural Network Based Action Ranking for Planning
- Learning Spatio-Temporal Dynamics via Operator-Valued RKHS and Kernel Koopman Methods
- ZTFed-MAS2S: A Zero-Trust Federated Learning Framework with Verifiable Privacy and Trust-Aware Aggregation for Wind Power Data Imputation
- Linear cost mutual information estimation and independence test of similar performance as HSIC
- DualSparse-MoE: Coordinating Tensor/Neuron-Level Sparsity with Expert Partition and Reconstruction
- LLM-Driven Intrinsic Motivation for Sparse Reward Reinforcement Learning
- Enhancing Trust-Region Bayesian Optimization via Newton Methods
- Breaking Through Barren Plateaus: Reinforcement Learning Initializations for Deep Variational Quantum Circuits
- Quantifying The Limits of AI Reasoning: Systematic Neural Network Representations of Algorithms
- BTW: A Non-Parametric Variance Stabilization Framework for Multimodal Model Integration
- Enhancing Chemical Explainability Through Counterfactual Masking
- A Note on Graphon-Signal Analysis of Graph Neural Networks
- Improving Long-term Autoregressive Spatiotemporal Predictions: A Proof of Concept with Fluid Dynamics
- Sparse Autoencoders for Low-$N$ Protein Function Prediction and Design
- History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL
- Linear Trading Position with Sparse Spectrum
- Uncertainty Awareness on Unsupervised Domain Adaptation for Time Series Data
- STRATA-TS: Selective Knowledge Transfer for Urban Time Series Forecasting with Retrieval-Guided Reasoning
- Biologically Disentangled Multi-Omic Modeling Reveals Mechanistic Insights into Pan-Cancer Immunotherapy Resistance
- Utilizing Training Data to Improve LLM Reasoning for Tabular Understanding
- End to End Autoencoder MLP Framework for Sepsis Prediction
- Natural Image Classification via Quasi-Cyclic Graph Ensembles and Random-Bond Ising Models at the Nishimori Temperature
- Beyond Tokens: Enhancing RTL Quality Estimation via Structural Graph Learning
- Stability and Generalization for Bellman Residuals
- Constraint Matters: Multi-Modal Representation for Reducing Mixed-Integer Linear programming
- UltraMemV2: Memory Networks Scaling to 120B Parameters with Superior Long-Context Learning
- Governance-as-a-Service: A Multi-Agent Framework for AI System Compliance and Policy Enforcement
- Predicting Drug-Drug Interactions Using Heterogeneous Graph Neural Networks: HGNN-DDI
- Federated Learning with Heterogeneous and Private Label Sets
- SWiFT: Soft-Mask Weight Fine-tuning for Bias Mitigation
- DRMD: Deep Reinforcement Learning for Malware Detection under Concept Drift
- Recycling History: Efficient Recommendations from Contextual Dueling Bandits
- C-Flat++: Towards a More Efficient and Powerful Framework for Continual Learning
- MOCHA: Discovering Multi-Order Dynamic Causality in Temporal Point Processes
- Generalization Bound for a General Class of Neural Ordinary Differential Equations
- Energy-Based Flow Matching for Generating 3D Molecular Structure
- Estimating Conditional Covariance between labels for Multilabel Data
- On the Generalisation of Koopman Representations for Chaotic System Control
- FedProtoKD: Dual Knowledge Distillation with Adaptive Class-wise Prototype Margin for Heterogeneous Federated Learning
- Learning with springs and sticks
- Working My Way Back to You: Resource-Centric Next-Activity Prediction
- GRADSTOP: Early Stopping of Gradient Descent via Posterior Sampling
- When recalling in-context, Transformers are not SSMs
- Breaking the Black Box: Inherently Interpretable Physics-Informed Machine Learning for Imbalanced Seismic Data
- Automated discovery of finite volume schemes using Graph Neural Networks
- Composition and Alignment of Diffusion Models using Constrained Learning
- Active Query Selection for Crowd-Based Reinforcement Learning
- Saddle Hierarchy in Dense Associative Memory
- Get Global Guarantees: On the Probabilistic Nature of Perturbation Robustness
- Predicting the Order of Upcoming Tokens Improves Language Modeling
- Approximating High-Dimensional Earth Mover's Distance as Fast as Closest Pair
- Cultural Dimensions of AI Perception: Charting Expectations, Risks, Benefits, Tradeoffs, and Value in Germany and China
- TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use
- Safe Multiagent Coordination via Entropic Exploration
- Generative Artificial Intelligence-Supported Pentesting: A Comparison between Claude Opus, GPT-4, and Copilot
- Provably-Safe Neural Network Training Using Hybrid Zonotope Reachability Analysis
- StagFormer: Time Staggering Transformer Decoding for Running Layers In Parallel
- TableTalk: Scaffolding Spreadsheet Development with a Language Agent
- Large Language Models Badly Generalize across Option Length, Problem Types, and Irrelevant Noun Replacements
- Collaborative Evaluation of Deepfake Text with Deliberation-Enhancing Dialogue Systems
- UniGenX: a unified generative foundation model that couples sequence, structure and function to accelerate scientific design across proteins, molecules and materials
- Faster Parameter-Efficient Tuning with Token Redundancy Reduction
- Noise-based reward-modulated learning
- A Hybrid Fully Convolutional CNN-Transformer Model for Inherently Interpretable Disease Detection from Retinal Fundus Images
- Video CLIP Model for Multi-View Echocardiography Interpretation
- Prefill-level Jailbreak: A Black-Box Risk Analysis of Large Language Models
- An Ontology-Driven Graph RAG for Legal Norms: A Hierarchical, Temporal, and Deterministic Approach
- Unveiling the Landscape of LLM Deployment in the Wild: An Empirical Study
- Concept-Guided Interpretability via Neural Chunking
- Revisiting SSL for sound event detection: complementary fusion and adaptive post-processing
- RePPL: Recalibrating Perplexity by Uncertainty in Semantic Propagation and Language Generation for Explainable QA Hallucination Detection
- EVM-Fusion: An Explainable Vision Mamba Architecture with Neural Algorithmic Fusion
- UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
- Spectra-to-Structure and Structure-to-Spectra Inference Across the Periodic Table
- An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
- Solar Altitude Guided Scene Illumination
- Krul: Efficient State Restoration for Multi-turn Conversations with Dynamic Cross-layer KV Sharing
- Demographic-aware fine-grained classification of pediatric wrist fractures
- Apple Intelligence Foundation Language Models: Tech Report 2025
- Reasoning Steps as Curriculum: Using Depth of Thought as a Difficulty Signal for Tuning LLMs
- Data-driven models for production forecasting and decision supporting in petroleum reservoirs
- A Fast and Minimal System to Identify Depression Using Smartphones: Explainable Machine Learning-Based Approach
- Secure Reinforcement Learning via Shuffle Privacy Model
- From Intents to Conversations: Generating Intent-Driven Dialogues with Contrastive Learning for Multi-Turn Classification
- Hierarchical Object-Oriented POMDP Planning for Object Rearrangement
- Perception Gaps in Risk, Benefit, and Value Between Experts and Public Challenge Socially Accepted AI
- CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers
- Optimization of Latent-Space Compression using Game-Theoretic Techniques for Transformer-Based Vector Search
- HAEPO: History-Aggregated Exploratory Policy Optimization
- pyFAST: A Modular PyTorch Framework for Time Series Modeling with Multi-source and Sparse Data
- Interpretable Decision-Making for End-to-End Autonomous Driving
- Distance-informed Neural Processes
- SegReConcat: A Data Augmentation Method for Voice Anonymization Attack
- Enhancing Model Privacy in Federated Learning with Random Masking and Quantization
- HOTSPOT-YOLO: A Lightweight Deep Learning Attention-Driven Model for Detecting Thermal Anomalies in Drone-Based Solar Photovoltaic Inspections
- HierCVAE: Hierarchical Attention-Driven Conditional Variational Autoencoders for Multi-Scale Temporal Modeling
- Diverse And Private Synthetic Datasets Generation for RAG evaluation: A multi-agent framework
- The point is the mask: scaling coral reef segmentation with weak supervision
- PAX-TS: Model-agnostic multi-granular explanations for time series forecasting via localized perturbations
- Interpretable by AI Mother Tongue: Native Symbolic Reasoning in Neural Models
- Automatic Prompt Optimization with Prompt Distillation
- GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging
- RoofSeg: An edge-aware transformer-based network for end-to-end roof plane segmentation
- STDiff: A State Transition Diffusion Framework for Time Series Imputation in Industrial Systems
- Metric Matters: A Formal Evaluation of Similarity Measures in Active Learning for Cyber Threat Intelligence
- No Label Left Behind: A Unified Surface Defect Detection Model for all Supervision Regimes
- Tackling Federated Unlearning as a Parameter Estimation Problem
- Dynamic Triangulation-Based Graph Rewiring for Graph Neural Networks
- Attackers Strike Back? Not Anymore - An Ensemble of RL Defenders Awakens for APT Detection
- An LLM-powered Natural-to-Robotic Language Translation Framework with Correctness Guarantees
- HiPlan: Hierarchical Planning for LLM-Based Agents with Adaptive Global-Local Guidance
- APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration
- SecureV2X: An Efficient and Privacy-Preserving System for Vehicle-to-Everything (V2X) Applications
- ZeST: an LLM-based Zero-Shot Traversability Navigation for Unknown Environments
- Uncertainty-Resilient Active Intention Recognition for Robotic Assistants
- RDDM: Practicing RAW Domain Diffusion Model for Real-world Image Restoration
- Few-Shot Connectivity-Aware Text Line Segmentation in Historical Documents
- From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity
- Real-Time Model Checking for Closed-Loop Robot Reactive Planning
- Emotions as Ambiguity-aware Ordinal Representations
- Understanding Tool-Integrated Reasoning
- LSD-3D: Large-Scale 3D Driving Scene Generation with Geometry Grounding
- VibeVoice Technical Report
- Interpolating Speaker Identities in Embedding Space for Data Expansion
- Generative Interfaces for Language Models
- A Survey on Causal Discovery: Theory and Practice
- Integrating Large Language Model for Improved Causal Discovery
- Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding
- Pessimistic Iterative Planning with RNNs for Robust POMDPs
- Consensus in Motion: A Case of Dynamic Rationality of Sequential Learning in Probability Aggregation
- YuLan-OneSim: Towards the Next Generation of Social Simulator with Large Language Models
- The Influence of Human-inspired Agentic Sophistication in LLM-driven Strategic Reasoners
- mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation
- Feature-Guided Neighbor Selection for Non-Expert Evaluation of Model Predictions
- Multi-Agent LLMs as Ethics Advocates for AI-Based Systems
- Bayesian Deep Learning for Segmentation for Autonomous Safe Planetary Landing
- Beyond Discriminant Patterns: On the Robustness of Decision Rule Ensembles
- DiffBlender: Composable and Versatile Multimodal Text-to-Image Diffusion Models
- Rethinking Distribution Shifts: Empirical Analysis and Inductive Modeling for Tabular Data
- Memory augment is All You Need for image restoration
- Learning county from pixels: corn yield prediction with attention-weighted multiple instance learning
- Exploring the Robustness of Language Models for Tabular Question Answering via Attention Analysis
- Ego-Foresight: Self-supervised Learning of Agent-Aware Representations for Improved RL
- ChatGPT Doesn't Trust Chargers Fans: Guardrail Sensitivity in Context
- GeNet: A Multimodal LLM-Based Co-Pilot for Network Topology and Configuration
- Leveraging Multi-facet Paths for Heterogeneous Graph Representation Learning
- HonestCyberEval: An AI Cyber Risk Benchmark for Automated Software Exploitation
- Benchmarking XAI Explanations with Human-Aligned Evaluations
- Overcoming label shift with target-aware federated learning
- H-PRM: A Pluggable Hotword Pre-Retrieval Module for Various Speech Recognition Systems
- Federative ischemic stroke segmentation as alternative to overcome domain-shift multi-institution challenges
- Can VLMs Recall Factual Associations From Visual References?
- Murakkab: Resource-Efficient Agentic Workflow Orchestration in Cloud Platforms
- Learning Explainable Imaging-Genetics Associations Related to a Neurological Disorder
- scI2CL: Effectively Integrating Single-cell Multi-omics by Intra- and Inter-omics Contrastive Learning
- SALMAN: Stability Analysis of Language Models Through the Maps Between Graph-based Manifolds
- CoPE: A Lightweight Complex Positional Encoding
- What Matters in Data for DPO?
- ProtoEHR: Hierarchical Prototype Learning for EHR-based Healthcare Predictions
- Automated Landfill Detection Using Deep Learning: A Comparative Study of Lightweight and Custom Architectures with the AerialWaste Dataset
- Evaluating Federated Learning for At-Risk Student Prediction: A Comparative Analysis of Model Complexity and Data Balancing
- Does Calibration Affect Human Actions?
- LLMs Can't Handle Peer Pressure: Crumbling under Multi-Agent Social Interactions
- Structures Meet Semantics: Multimodal Fusion via Graph Contrastive Learning
- Facilitating Matches on Allocation Platforms
- EAI-Avatar: Emotion-Aware Interactive Talking Head Generation
- Backprompting: Leveraging Synthetic Production Data for Health Advice Guardrails
- Latent Self-Consistency for Reliable Majority-Set Selection in Short- and Long-Answer Reasoning
- Mining the Long Tail: A Comparative Study of Data-Centric Criticality Metrics for Robust Offline Reinforcement Learning in Autonomous Motion Planning
- Toward Generalized Autonomous Agents: A Neuro-Symbolic AI Framework for Integrating Social and Technical Support in Education
- Can Out-of-Distribution Evaluations Uncover Reliance on Shortcuts? A Case Study in Question Answering
- Low-Rank Tensor Decompositions for the Theory of Neural Networks
- CLARIFY: A Specialist-Generalist Framework for Accurate and Lightweight Dermatological Visual Question Answering
- A Systematic Approach to Predict the Impact of Cybersecurity Vulnerabilities Using LLMs
- SwiftF0: Fast and Accurate Monophonic Pitch Detection
- How Reliable are LLMs for Reasoning on the Re-ranking task?
- VERIRL: Boosting the LLM-based Verilog Code Generation via Reinforcement Learning
- Vectorized Attention with Learnable Encoding for Quantum Transformer
- Principled Detection of Hallucinations in Large Language Models via Multiple Testing
- DRTA: Dynamic Reward Scaling for Reinforcement Learning in Time Series Anomaly Detection
- Collaborative Intelligence: Topic Modelling of Large Language Model use in Live Cybersecurity Operations
- Data Augmentation Improves Machine Unlearning
- Análise de Desaprendizado de Máquina em Modelos de Classificação de Imagens Médicas
- A Deep Learning Application for Psoriasis Detection
- SAT-SKYLINES: 3D Building Generation from Satellite Imagery and Coarse Geometric Priors
- Beyond prior knowledge: The predictive role of knowledge-building in Tutor Learning
- The Quasi-Creature and the Uncanny Valley of Agency: A Synthesis of Theory and Evidence on User Interaction with Inconsistent Generative AI
- DrugReasoner: Interpretable Drug Approval Prediction with a Reasoning-augmented Language Model
- A Case Study on the Effectiveness of LLMs in Verification with Proof Assistants
- What do language models model? Transformers, automata, and the format of thought
- Scaling Laws for Task-Stratified Knowledge in Post-Training Quantized Large Language Models
- ROSE: Remove Objects with Side Effects in Videos
- LaQual: A Novel Framework for Automated Evaluation of LLM App Quality
- Clustering-based Feature Representation Learning for Oracle Bone Inscriptions Detection
- PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality
- Breaking the Trade-Off Between Faithfulness and Expressiveness for Large Language Models
- The Sound of Risk: A Multimodal Physics-Informed Acoustic Model for Forecasting Market Volatility and Enhancing Market Interpretability
- FFT-MoE: Efficient Federated Fine-Tuning for Foundation Models via Large-scale Sparse MoE under Heterogeneous Edge
- Membership Inference Attacks on LLM-based Recommender Systems
- Auditing Approximate Machine Unlearning for Differentially Private Models
- Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks
- Tailored Teaching with Balanced Difficulty: Elevating Reasoning in Multimodal Chain-of-Thought via Prompt Curriculum
- FALCON: Autonomous Cyber Threat Intelligence Mining with LLMs for IDS Rule Generation
- AgriChrono: A Multi-modal Dataset Capturing Crop Growth and Lighting Variability with a Field Robot
- Skill-Aligned Fairness in Multi-Agent Learning for Collaboration in Healthcare
- Cross-Learning Fine-Tuning Strategy for Dysarthric Speech Recognition Via CDSD database
- Improving Noise Robust Audio-Visual Speech Recognition via Router-Gated Cross-Modal Feature Fusion
- SkyTrust: Blockchain-Enhanced UAV Security for NTNs with Dynamic Trust and Energy-Aware Consensus
- FLAegis: A Two-Layer Defense Framework for Federated Learning Against Poisoning Attacks
- M3HG: Multimodal, Multi-scale, and Multi-type Node Heterogeneous Graph for Emotion Cause Triplet Extraction in Conversations
- Text to Query Plans for Question Answering on Large Tables
- Harnessing Rule-Based Reinforcement Learning for Enhanced Grammatical Error Correction
- Long-Term Variability in Physiological-Arousal Relationships for Robust Emotion Estimation
- Insights into User Interface Innovations from a Design Thinking Workshop at deRSE25
- EMind: A Foundation Model for Multi-task Electromagnetic Signals Understanding
- A Survey on Cloud-Edge-Terminal Collaborative Intelligence in AIoT Networks
- ConfTuner: Training Large Language Models to Express Their Confidence Verbally
- ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive
- ReflectivePrompt: Reflective evolution in autoprompting algorithms
- AI LLM Proof of Self-Consciousness and User-Specific Attractors
- Information Templates: A New Paradigm for Intelligent Active Feature Acquisition
- PKG-DPO: Optimizing Domain-Specific AI systems with Physics Knowledge Graphs and Direct Preference Optimization
- The AI in the Mirror: LLM Self-Recognition in an Iterated Public Goods Game
- Language Models For Generalised PDDL Planning: Synthesising Sound and Programmatic Policies
- Weisfeiler-Leman Features for Planning: A 1,000,000 Sample Size Hyperparameter Study
- Symmetry-Invariant Novelty Heuristics via Unsupervised Weisfeiler-Leman Features
- Generic Guard AI in Stealth Game with Composite Potential Fields
- A Database-Driven Framework for 3D Level Generation with LLMs
- SchemaCoder: Automatic Log Schema Extraction Coder with Residual Q-Tree Boosting
- eSkinHealth: A Multimodal Dataset for Neglected Tropical Skin Diseases
- RLMR: Reinforcement Learning with Mixed Rewards for Creative Writing
- Beyond Benchmark: LLMs Evaluation with an Anthropomorphic and Value-oriented Roadmap
- MUA-RL: Multi-turn User-interacting Agent Reinforcement Learning for agentic tool use
- AppAgent-Pro: A Proactive GUI Agent System for Multidomain Information Integration and User Assistance
- VistaWise: Building Cost-Effective Agent with Cross-Modal Knowledge Graph for Minecraft
- Bias Mitigation Agent: Optimizing Source Selection for Fair and Balanced Knowledge Retrieval
- CAC-CoT: Connector-Aware Compact Chain-of-Thought for Efficient Reasoning Data Synthesis Across Dual-System Cognitive Tasks
- Reflection-Enhanced Meta-Optimization Integrating TextGrad-style Prompt Optimization with Memory-Driven Self-Evolution
- Stabilizing Open-Set Test-Time Adaptation via Primary-Auxiliary Filtering and Knowledge-Integrated Prediction
- Answering the Unanswerable Is to Err Knowingly: Analyzing and Mitigating Abstention Failures in Large Reasoning Models
- Dynamic Collaboration of Multi-Language Models based on Minimal Complete Semantic Units
- AniME: Adaptive Multi-Agent Planning for Long Animation Generation
- CausalMACE: Causality Empowered Multi-Agents in Minecraft Cooperative Tasks
- STARec: An Efficient Agent Framework for Recommender Systems via Autonomous Deliberate Reasoning
- Judicial Requirements for Generative AI in Legal Reasoning
- Interactive Evaluation of Large Language Models for Multi-Requirement Software Engineering Tasks
- FormaRL: Enhancing Autoformalization with no Labeled Data
- Who Is Lagging Behind: Profiling Student Behaviors with Graph-Level Encoding in Curriculum-Based Online Learning Systems
- VISION: Robust and Interpretable Code Vulnerability Detection Leveraging Counterfactual Augmentation
- Novel Approaches to Artificial Intelligence Development Based on the Nearest Neighbor Method
- Enabling MoE on the Edge via Importance-Driven Expert Scheduling
- AI Models Exceed Individual Human Accuracy in Predicting Everyday Social Norms
- Building Self-Evolving Agents via Experience-Driven Lifelong Learning: A Framework and Benchmark
- Sense of Self and Time in Borderline Personality. A Comparative Robustness Study with Generative AI
- MAB Optimizer for Estimating Math Question Difficulty via Inverse CV without NLP
- Investigating Advanced Reasoning of Large Language Models via Black-Box Interaction
- A Concurrent Modular Agent: Framework for Autonomous LLM Agents
- Can Structured Templates Facilitate LLMs in Tackling Harder Tasks?: An Exploration of Scaling Laws by Difficulty
- Trustworthy Agents for Electronic Health Records through Confidence Estimation
- Reasoning LLMs in the Medical Domain: A Literature Survey
- Hybrid Deep Searcher: Integrating Parallel and Sequential Search Reasoning
- Algorithmic Collective Action with Multiple Collectives
- Playstyle and Artificial Intelligence: An Initial Blueprint Through the Lens of Video Games
- MATRIX: Multi-Agent simulaTion fRamework for safe Interactions and conteXtual clinical conversational evaluation
- The Ramon Llull's Thinking Machine for Automated Ideation
- The Subset Sum Matching Problem
- StepWiser: Stepwise Generative Judges for Wiser Reasoning
- Model Context Protocols in Adaptive Transport Systems: A Survey
- Technology-assisted Personalized Yoga for Better Health - Challenges and Outlook
- Multi-Modal Drift Forecasting of Leeway Objects via Navier-Stokes-Guided CNN and Sequence-to-Sequence Attention-Based Models
- Toward Responsible ASR for African American English Speakers: A Scoping Review of Bias and Equity in Speech Technology
- Consensus Is All You Need: Gossip-Based Reasoning Among Large Language Models
- Towards Training-Free Underwater 3D Object Detection from Sonar Point Clouds: A Comparison of Traditional and Deep Learning Approaches
- MobileDenseAttn: A Dual-Stream Architecture for Accurate and Interpretable Brain Tumor Detection
Research Sources: 492 | Generated: 8/27/2025