AI RESEARCH PAPERS & ACADEMIC SOURCES
- From Partial Exchangeability to Predictive Probability: A Bayesian Perspective on Classification
- VFOG: Variance-Reduced Fast Optimistic Gradient Methods for a Class of Nonmonotone Generalized Equations
- On the attainment of the Wasserstein-Cramer-Rao lower bound
- LEL: A Novel Lipschitz Continuity-constrained Ensemble Learning Model for EEG-based Emotion Recognition
- AffordanceSAM: Segment Anything Once More in Affordance Grounding
- AnimateAnywhere: Rouse the Background in Human Image Animation
- Mesh-Learner: Texturing Mesh with Spherical Harmonics
- PainFormer: a Vision Foundation Model for Automatic Pain Assessment
- Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields
- VIN-NBV: A View Introspection Network for Next-Best-View Selection
- FaceCrafter: Identity-Conditional Diffusion with Disentangled Control over Facial Pose, Expression, and Emotion
- ANT: Adaptive Neural Temporal-Aware Text-to-Motion Model
- WetCat: Enabling Automated Skill Assessment in Wet-Lab Cataract Surgery Videos
- DiffS-NOCS: 3D Point Cloud Reconstruction through Coloring Sketches to NOCS Maps Using Diffusion Models
- Baltimore Atlas: FreqWeaver Adapter for Semi-supervised Ultra-high Spatial Resolution Land Cover Classification
- BoxFusion: Reconstruction-Free Open-Vocabulary 3D Object Detection via Real-Time Multi-View Box Fusion
- Boosting Temporal Sentence Grounding via Causal Inference
- Transferring Styles for Reduced Texture Bias and Improved Robustness in Semantic Segmentation Networks
- Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition
- Using Visual Anomaly Detection for Task Execution Monitoring
- A Multimodal Handover Failure Detection Dataset and Baselines
- Joint Quality Assessment and Example-Guided Image Processing by Disentangling Picture Appearance from Content
- Denoising, segmentation and volumetric rendering of optical coherence tomography angiography (OCTA) image using deep learning techniques: a review
- FetchBot: Learning Generalizable Object Fetching in Cluttered Scenes via Zero-Shot Sim2Real
- Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis
- FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing
- CNeuroMod-THINGS, a densely-sampled fMRI dataset for visual neuroscience
- Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images
- PriorFormer: A Transformer for Real-time Monocular 3D Human Pose Estimation with Versatile Geometric Priors
- GSVisLoc: Generalizable Visual Localization for Gaussian Splatting Scene Representations
- MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs
- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
- ObjFiller-3D: Consistent Multi-view 3D Inpainting via Video Diffusion Models
- 3D latent diffusion models for parameterizing and history matching multiscenario facies systems
- Predicting brain tumour enhancement from non-contrast MR imaging with artificial intelligence
- BrainPath: Generating Subject-Specific Brain Aging Trajectories
- Multimodal Medical Endoscopic Image Analysis via Progressive Disentangle-aware Contrastive Learning
- Generating Synthetic Contrast-Enhanced Chest CT Images from Non-Contrast Scans Using Slice-Consistent Brownian Bridge Diffusion Network
- MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation
- HunyuanVideo-Foley: Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation
- A Survey of Deep Learning-based Point Cloud Denoising
- Deep Learning Architectures for Medical Image Denoising: A Comparative Study of CNN-DAE, CADTra, and DCMIEDNet
- Semantic Diffusion Posterior Sampling for Cardiac Ultrasound Dehazing
- DanceEditor: Towards Iterative Editable Music-driven Dance Generation with Open-Vocabulary Descriptions
- SEBVS: Synthetic Event-based Visual Servoing for Robot Navigation and Manipulation
- Towards Trustworthy Breast Tumor Segmentation in Ultrasound using Monte Carlo Dropout and Deep Ensembles for Epistemic Uncertainty Estimation
- Egocentric Instruction-oriented Affordance Prediction via Large Multimodal Model
- TuningIQA: Fine-Grained Blind Image Quality Assessment for Livestreaming Camera Tuning
- A holistic perception system of internal and external monitoring for ground autonomous vehicles: AutoTRUST paradigm
- Scene-Agnostic Traversability Labeling and Estimation via a Multimodal Self-supervised Framework
- Deep Face Restoration: A Survey
- Periodic Vibration Gaussian: Dynamic Urban Scene Reconstruction and Real-time Rendering
- PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal
- Imperceptible Protection against Style Imitation from Diffusion Models
- Top-Down Guidance for Learning Object-Centric Representations
- PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation
- 3D Feature Distillation with Object-Centric Priors
- CTRL-F: Pairing Convolution with Transformer for Image Classification via Multi-Level Feature Cross-Attention and Representation Learning Fusion
- One Framework to Rule Them All: Unifying Multimodal Tasks with LLM Neural-Tuning
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture
- ModalPrompt: Towards Efficient Multimodal Continual Instruction Tuning with Dual-Modality Guided Prompt
- CARLA2Real: a tool for reducing the sim2real appearance gap in CARLA simulator
- LumiSculpt: Enabling Consistent Portrait Lighting in Video Generation
- V2X-R: Cooperative LiDAR-4D Radar Fusion with Denoising Diffusion for 3D Object Detection
- Neural Shadow Art
- Addressing Text Embedding Leakage in Diffusion-based Image Editing
- CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models
- MSCN: Multi-view Structural Convolution Network for Domain-Invariant Point Cloud Recognition of Autonomous Vehicles
- Navi-plus: Managing Ambiguous GUI Navigation Tasks with Follow-up Questions
- T*: Re-thinking Temporal Search for Long-Form Video Understanding
- GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
- Social-MAE: A Transformer-Based Multimodal Autoencoder for Face and Voice
- Minimal Solvers for Full DoF Motion Estimation from Asynchronous Tracks
- IDU: Incremental Dynamic Update of Existing 3D Virtual Environments with New Imagery Data
- HERO: Hierarchical Extrapolation and Refresh for Efficient World Models
- TinyGiantVLM: A Lightweight Vision-Language Architecture for Spatial Reasoning under Resource Constraints
- HotSpotter - Patterned Species Instance Recognition
- A Weighted Vision Transformer-Based Multi-Task Learning Framework for Predicting ADAS-Cog Scores
- JCo-MVTON: Jointly Controllable Multi-Modal Diffusion Transformer for Mask-Free Virtual Try-on
- Improving Interpretability in Alzheimer's Prediction via Joint Learning of ADAS-Cog Scores
- Wound3DAssist: A Practical Framework for 3D Wound Assessment
- HyTver: A Novel Loss Function for Longitudinal Multiple Sclerosis Lesion Segmentation
- FloraSyntropy-Net: Scalable Deep Learning with Novel FloraSyntropy Archive for Large-Scale Plant Disease Diagnosis
- Rethinking the Detail-Preserved Completion of Complex Tubular Structures based on Point Cloud: a Dataset and a Benchmark
- M^3-GloDets: Multi-Region and Multi-Scale Analysis of Fine-Grained Diseased Glomerular Detection
- Language-Guided Temporal Token Pruning for Efficient VideoLLM Processing
- Benchmarking Class Activation Map Methods for Explainable Brain Hemorrhage Classification on Hemorica Dataset
- CATformer: Contrastive Adversarial Transformer for Image Super-Resolution
- NGD: Neural Gradient Based Deformation for Monocular Garment Reconstruction
- F2RVLM: Boosting Fine-grained Fragment Retrieval for Multi-Modal Long-form Dialogue with Vision Language Model
- Few-shot Human Action Anomaly Detection via a Unified Contrastive Learning Framework
- CMFDNet: Cross-Mamba and Feature Discovery Network for Polyp Segmentation
- DroneKey: Drone 3D Pose Estimation in Image Sequences using Gated Key-representation and Pose-adaptive Learning
- From Global to Local: Social Bias Transfer in CLIP
- Sketchpose: Learning to Segment Cells with Partial Annotations
- PoRe: Position-Reweighted Visual Token Pruning for Vision Language Models
- TemCoCo: Temporally Consistent Multi-modal Video Fusion with Visual-Semantic Collaboration
- HLG: Comprehensive 3D Room Construction via Hierarchical Layout Generation
- SCOUT: Semi-supervised Camouflaged Object Detection by Utilizing Text and Adaptive Data Selection
- Box-Level Class-Balanced Sampling for Active Object Detection
- Camera Pose Refinement via 3D Gaussian Splatting
- ISALux: Illumination and Segmentation Aware Transformer Employing Mixture of Experts for Low Light Image Enhancement
- UniAPO: Unified Multimodal Automated Prompt Optimization
- EndoUFM: Utilizing Foundation Models for Monocular depth estimation of endoscopic images
- Gaze into the Heart: A Multi-View Video Dataset for rPPG and Health Biomarkers Estimation
- Beam Geometry and Input Dimensionality: Impact on Sparse-Sampling Artifact Correction for Clinical CT with U-Nets
- SAIL-Recon: Large SfM by Augmenting Scene Regression with Localization
- Enhanced Drift-Aware Computer Vision Architecture for Autonomous Driving
- Propose and Rectify: A Forensics-Driven MLLM Framework for Image Manipulation Localization
- Fence off Anomaly Interference: Cross-Domain Distillation for Fully Unsupervised Anomaly Detection
- FCR: Investigating Generative AI models for Forensic Craniofacial Reconstruction
- Visual-CoG: Stage-Aware Reinforcement Learning with Chain of Guidance for Text-to-Image Generation
- ArgusCogito: Chain-of-Thought for Cross-Modal Synergy and Omnidirectional Reasoning in Camouflaged Object Segmentation
- Annotation-Free Open-Vocabulary Segmentation for Remote-Sensing Images
- EventTracer: Fast Path Tracing-based Event Stream Rendering
- Few-shot Unknown Class Discovery of Hyperspectral Images with Prototype Learning and Clustering
- Follow My Hold: Hand-Object Interaction Reconstruction through Geometric Guidance
- GM-Skip: Metric-Guided Transformer Block Skipping for Efficient Vision-Language Models
- Sealing The Backdoor: Unlearning Adversarial Text Triggers In Diffusion Models Using Knowledge Distillation
- Interpretable Evaluation of AI-Generated Content with Language-Grounded Sparse Encoders
- Addressing Annotation Scarcity in Hyperspectral Brain Image Segmentation with Unsupervised Domain Adaptation
- NAT: Learning to Attack Neurons for Enhanced Adversarial Transferability
- HieroAction: Hierarchically Guided VLM for Fine-Grained Action Analysis
- RPD-Diff: Region-Adaptive Physics-Guided Diffusion Model for Visibility Enhancement under Dense and Non-Uniform Haze
- Local Information Matters: A Rethink of Crowd Counting
- Robust Diagram Reasoning: A Framework for Enhancing LVLM Performance on Visually Perturbed Scientific Diagrams
- Balanced Sharpness-Aware Minimization for Imbalanced Regression
- Hierarchical Contextual Grounding LVLM: Enhancing Fine-Grained Visual-Language Understanding with Robust Grounding
- HiCache: Training-free Acceleration of Diffusion Models via Hermite Polynomial-based Feature Caching
- Contrastive Prompt Clustering for Weakly Supervised Semantic Segmentation
- Fiducial Marker Splatting for High-Fidelity Robotics Simulations
- Dual Orthogonal Guidance for Robust Diffusion-based Handwritten Text Generation
- Probabilistic Temporal Masked Attention for Cross-view Online Action Detection
- A Novel Local Focusing Mechanism for Deepfake Detection Generalization
- F4-ITS: Fine-grained Feature Fusion for Food Image-Text Search
- M3DMap: Object-aware Multimodal 3D Mapping for Dynamic Environments
- Styleclone: Face Stylization with Diffusion Based Data Augmentation
- PVNet: Point-Voxel Interaction LiDAR Scene Upsampling Via Diffusion Models
- DeltaFlow: An Efficient Multi-frame Scene Flow Estimation Method
- REGEN: Real-Time Photorealism Enhancement in Games via a Dual-Stage Generative Network Framework
- PD-Loss: Proxy-Decidability for Efficient Metric Learning
- GRASP: Geospatial pixel Reasoning viA Structured Policy learning
- Structural Damage Detection Using AI Super Resolution and Visual Language Model
- Development of an isotropic segmentation model for medial temporal lobe subregions on anisotropic MRI atlas using implicit neural representation
- Advancing Weakly-Supervised Change Detection in Satellite Images via Adversarial Class Prompting
- MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via Pseudo-labeling
- Multi-modal Knowledge Decomposition based Online Distillation for Biomarker Prediction in Breast Cancer Histopathology
- 4D Visual Pre-training for Robot Learning
- PersPose: 3D Human Pose Estimation with Perspective Encoding and Perspective Rotation
- Uncovering and Mitigating Destructive Multi-Embedding Attacks in Deepfake Proactive Forensics
- SEER-VAR: Semantic Egocentric Environment Reasoner for Vehicle Augmented Reality
- AdaGAT: Adaptive Guidance Adversarial Training for the Robustness of Deep Neural Networks
- Spatial-Temporal Human-Object Interaction Detection
- MTNet: Learning modality-aware representation with transformer for RGBT tracking
- FoundDiff: Foundational Diffusion Model for Generalizable Low-Dose CT Denoising
- PosBridge: Multi-View Positional Embedding Transplant for Identity-Aware Image Editing
- First Place Solution to the MLCAS 2025 GWFSS Challenge: The Devil is in the Detail and Minority
- Defending Deepfake via Texture Feature Perturbation
- SpecGen: Neural Spectral BRDF Generation via Spectral-Spatial Tri-plane Aggregation
- No Pixel Left Behind: A Detail-Preserving Architecture for Robust High-Resolution AI-Generated Image Detection
- DiCache: Let Diffusion Model Determine Its Own Cache
- Lightweight Joint Optimization of General-Purpose Vision-Language Models and Retrievers for Medical Diagnosis
- Enhancing Underwater Images via Deep Learning: A Comparative Study of VGG19 and ResNet50-Based Approaches
- MoCo: Motion-Consistent Human Video Generation via Structure-Appearance Decoupling
- E-BayesSAM: Efficient Bayesian Adaptation of SAM with Self-Optimizing KAN-Based Interpretation for Uncertainty-Aware Ultrasonic Segmentation
- Data Leakage in Visual Datasets
- Constrained Prompt Enhancement for Improving Zero-Shot Generalization of Vision-Language Models
- Robust Point Cloud Registration via Geometric Overlapping Guided Rotation Search
- TinySR: Pruning Diffusion for Real-World Image Super-Resolution
- An LLM-LVLM Driven Agent for Iterative and Fine-Grained Image Editing
- Disentangled Geometry and Appearance for Efficient Multi-View Surface Reconstruction and Rendering
- Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
- Investigating Domain Gaps for Indoor 3D Object Detection
- Multi-Level LVLM Guidance for Untrimmed Video Action Recognition
- T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
- GraphMMP: A Graph Neural Network Model with Mutual Information and Global Fusion for Multimodal Medical Prognosis
- Optimizing Multi-Modal Trackers via Sensitivity-aware Regularized Tuning
- Confidential Prompting: Privacy-preserving LLM Inference on Cloud
- Leveraging the Power of MLLMs for Gloss-Free Sign Language Translation
- AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and Benchmark
- Towards New Benchmark for AI Alignment & Sentiment Analysis in Socially Important Issues: A Comparative Study of Human and LLMs in the Context of AGI
- SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information
- Towards High-Precision Depth Sensing via Monocular-Aided iToF and RGB Integration
- CountLoop: Training-Free High-Instance Image Generation via Iterative Agent Guidance
- Do VLMs Have Bad Eyes? Diagnosing Compositional Failures via Mechanistic Interpretability
- MSNav: Zero-Shot Vision-and-Language Navigation with Dynamic Memory and LLM Spatial Reasoning
- QA-VLM: Providing human-interpretable quality assessment for wire-feed laser additive manufacturing parts with Vision Language Models
- Two-Stage Framework for Efficient UAV-Based Wildfire Video Analysis with Adaptive Compression and Fire Source Detection
- A Framework for Benchmarking Fairness-Utility Trade-offs in Text-to-Image Models via Pareto Frontiers
- WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
- Towards Open-Vocabulary Multimodal 3D Object Detection with Attributes
- AIM 2025 Low-light RAW Video Denoising Challenge: Dataset, Methods and Results
- Transformer-Based Neural Network for Transient Detection without Image Subtraction
- RF-PGS: Fully-structured Spatial Wireless Channel Representation with Planar Gaussian Splatting
- Beyond Emotion Recognition: A Multi-Turn Multimodal Emotion Understanding and Reasoning Benchmark
- Delta-SVD: Efficient Compression for Personalized Text-to-Image Models
- Do Multimodal LLMs See Sentiment?
- AWM-Fuse: Multi-Modality Image Fusion for Adverse Weather via Global and Local Text Perception
- A Lightweight Convolution and Vision Transformer integrated model with Multi-scale Self-attention Mechanism
- MDIQA: Unified Image Quality Assessment for Multi-dimensional Evaluation and Restoration
- Structural Energy-Guided Sampling for View-Consistent Text-to-3D
- MSPCaps: A Multi-Scale Patchify Capsule Network with Cross-Agreement Routing for Visual Recognition
- LGE-Guided Cross-Modality Contrastive Learning for Gadolinium-Free Cardiomyopathy Screening in Cine CMR
- Align 3D Representation and Text Embedding for 3D Content Personalization
- How Do LLM-Generated Texts Impact Term-Based Retrieval Models?
- CEIDM: A Controlled Entity and Interaction Diffusion Model for Enhanced Text-to-Image Generation
- HLLM-Creator: Hierarchical LLM-based Personalized Creative Generation
- Can AI Have a Personality? Prompt Engineering for AI Personality Simulation: A Chatbot Case Study in Gender-Affirming Voice Therapy Training
- Rethinking Cross-Subject Data Splitting for Brain-to-Text Decoding
- Backdoor Attacks on Dense Retrieval via Public and Unintentional Triggers
- ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering
- A Factuality and Diversity Reconciled Decoding Method for Knowledge-Grounded Dialogue Generation
- Localizing Factual Inconsistencies in Attributable Text Generation
- SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition
- Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge
- Trust Me, I'm Wrong: LLMs Hallucinate with Certainty Despite Knowing the Answer
- Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference
- Seeing Sarcasm Through Different Eyes: Analyzing Multimodal Sarcasm Perception in Large Vision-Language Models
- OpenHuEval: Evaluating Large Language Model on Hungarian Specifics
- Post-Training Language Models for Continual Relation Extraction
- Unified attacks to large language model watermarks: spoofing and scrubbing in unauthorized knowledge distillation
- A Factorized Probabilistic Model of the Semantics of Vague Temporal Adverbials Relative to Different Event Types
- sudoLLM: On Multi-role Alignment of Language Models
- DecisionFlow: Advancing Large Language Model as Principled Decision Maker
- Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication
- Self-Correcting Code Generation Using Small Language Models
- Measuring Sycophancy of Language Models in Multi-turn Dialogues
- Automatic Speech Recognition of African American English: Lexical and Contextual Effects
- Reasoning with RAGged events: RAG-Enhanced Event Knowledge Base Construction and reasoning with proof-assistants
- CoLMbo: Speaker Language Model for Descriptive Profiling
- Modeling Probabilistic Reduction using Information Theory and Naive Discriminative Learning
- Evaluating Scoring Bias in LLM-as-a-Judge
- GoalfyMax: A Protocol-Driven Multi-Agent System for Intelligent Experience Entities
- Adaptive Linguistic Prompting (ALP) Enhances Phishing Webpage Detection in Multimodal Large Language Models
- Agri-Query: A Case Study on RAG vs. Long-Context LLMs for Cross-Lingual Technical Question Answering
- SentiMM: A Multimodal Multi-Agent Framework for Sentiment Analysis in Social Media
- Toward a Better Localization of Princeton WordNet
- S2Sent: Nested Selectivity Aware Sentence Representation Learning
- DiscussLLM: Teaching Large Language Models When to Speak
- Improving End-to-End Training of Retrieval-Augmented Generation Models via Joint Stochastic Approximation
- Exploring the Interplay between Musical Preferences and Personality through the Lens of Language
- Better Language Model-Based Judging Reward Modeling through Scaling Comprehension Boundaries
- Demographic Biases and Gaps in the Perception of Sexism in Large Language Models
- From BERT to LLMs: Comparing and Understanding Chinese Classifier Prediction in Language Models
- MIRAGE: Scaling Test-Time Inference with Parallel Graph-Retrieval-Augmented Reasoning Chains
- Empirical Analysis of the Effect of Context in the Task of Automated Essay Scoring in Transformer-Based Models
- Leveraging Multi-Source Textural UGC for Neighbourhood Housing Quality Assessment: A GPT-Enhanced Framework
- RephraseTTS: Dynamic Length Text based Speech Insertion with Speaker Style Transfer
- Anemoi: A Semi-Centralized Multi-agent Systems Based on Agent-to-Agent Communication MCP server from Coral Protocol
- Dynamic Embedding of Hierarchical Visual Features for Efficient Vision-Language Fine-Tuning
- DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards
- DS@GT at CheckThat! 2025: A Simple Retrieval-First, LLM-Backed Framework for Claim Normalization
- Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD
- Evaluating the Impact of Verbal Multiword Expressions on Machine Translation
- Improving French Synthetic Speech Quality via SSML Prosody Control
- Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models?
- Humanizing Machines: Rethinking LLM Anthropomorphism Through a Multi-Level Framework of Design
- Less Is More? Examining Fairness in Pruned Large Language Models for Summarising Opinions
- EMO-Reasoning: Benchmarking Emotional Reasoning Capabilities in Spoken Dialogue Systems
- SurveyGen: Quality-Aware Scientific Survey Generation with Large Language Models
- CoCoA: Confidence- and Context-Aware Adaptive Decoding for Resolving Knowledge Conflicts in Large Language Models
- EMPOWER: Evolutionary Medical Prompt Optimization With Reinforcement Learning
- Layerwise Importance Analysis of Feed-Forward Networks in Transformer-based Language Models
- SMITE: Enhancing Fairness in LLMs through Optimal In-Context Example Selection via Dynamic Validation
- Speculating LLMs' Chinese Training Data Pollution from Their Tokens
- Zero-shot Context Biasing with Trie-based Decoding using Synthetic Multi-Pronunciation
- DRQA: Dynamic Reasoning Quota Allocation for Controlling Overthinking in Reasoning Large Language Models
- Beyond Demographics: Enhancing Cultural Value Survey Simulation with Multi-Stage Personality-Driven Cognitive Reasoning
- Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in SpeechLLMs
- Pandora: Leveraging Code-driven Knowledge Transfer for Unified Structured Knowledge Reasoning
- Evaluating the Representation of Vowels in Wav2Vec Feature Extractor: A Layer-Wise Analysis Using MFCCs
- Information availability in different languages and various technological constraints related to multilinguism on the Internet
- Feature-Refined Unsupervised Model for Loanword Detection
- German4All - A Dataset and Model for Readability-Controlled Paraphrasing in German
- A Retail-Corpus for Aspect-Based Sentiment Analysis with Large Language Models
- Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
- Speech-Based Depressive Mood Detection in the Presence of Multiple Sclerosis: A Cross-Corpus and Cross-Lingual Study
- Error Reflection Prompting: Can Large Language Models Successfully Understand Errors?
- GAICo: A Deployed and Extensible Framework for Evaluating Diverse and Multimodal Generative AI Outputs
- How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models
- Toward Socially Aware Vision-Language Models: Evaluating Cultural Competence Through Multimodal Story Generation
- Assess and Prompt: A Generative RL Framework for Improving Engagement in Online Mental Health Communities
- ReProCon: Scalable and Resource-Efficient Few-Shot Biomedical Named Entity Recognition
- LLMs Learn Constructions That Humans Do Not Know
- If We May De-Presuppose: Robustly Verifying Claims through Presupposition-Free Question Decomposition
- Learning from Diverse Reasoning Paths with Routing and Collaboration
- QFrCoLA: a Quebec-French Corpus of Linguistic Acceptability Judgments
- JUDGEBERT: Assessing Legal Meaning Preservation Between Sentences
- ObjexMT: Objective Extraction and Metacognitive Calibration for LLM-as-a-Judge under Multi-Turn Jailbreaks
- Unbiased Reasoning for Knowledge-Intensive Tasks in Large Language Models via Conditional Front-Door Adjustment
- Being Kind Isn't Always Being Safe: Diagnosing Affective Hallucination in LLMs
- Decoding Alignment: A Critical Survey of LLM Development Initiatives through Value-setting and Data-centric Lens
- DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation
- Planning for Success: Exploring LLM Long-term Planning Capabilities in Table Understanding
- Improving Table Understanding with LLMs and Entity-Oriented Search
- A Straightforward Pipeline for Targeted Entailment and Contradiction Detection
- The Power of Framing: How News Headlines Guide Search Behavior
- Geolocation-Aware Robust Spoken Language Identification
- SPORTSQL: An Interactive System for Real-Time Sports Reasoning and Visualization
- Quantifying Language Disparities in Multilingual Large Language Models
- The Impact of Annotator Personas on LLM Behavior Across the Perspectivism Spectrum
- Towards Alignment-Centric Paradigm: A Survey of Instruction Tuning in Large Language Models
- Active Domain Knowledge Acquisition with $100 Budget: Enhancing LLMs via Cost-Efficient, Expert-Involved Interaction in Sensitive Domains
- Routing Distilled Knowledge via Mixture of LoRA Experts for Large Language Model based Bundle Generation
- Are You Sure You're Positive? Consolidating Chain-of-Thought Agents with Uncertainty Quantification for Aspect-Category Sentiment Analysis
- From Language to Action: A Review of Large Language Models as Autonomous Agents and Tool Users
- Handling Students Dropouts in an LLM-driven Interactive Online Course Using Language Models
- UI-Level Evaluation of ALLaM 34B: Measuring an Arabic-Centric LLM via HUMAIN Chat
- Federated Adversarial Domain Adaptation
- Dynamic Reserve Price Design with Distributed Solving Algorithm
- A Global Optimization Algorithm for K-Center Clustering of One Billion Samples
- Conditional Stochastic Interpolation for Generative Learning
- Does provable absence of barren plateaus imply classical simulability?
- Simulation Based Bayesian Optimization
- SVGCraft: Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout
- Enhancing the Trainability of Variational Quantum Circuits with Regularization Strategies
- On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
- Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning
- Fingerprint Vector: Enabling Scalable and Efficient Model Fingerprint Transfer via Vector Addition
- Fitting Multilevel Factor Models
- Learning from Summarized Data: Gaussian Process Regression with Sample Quasi-Likelihood
- Harnessing Large Language Models for Disaster Management: A Survey
- Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation
- Poisson Hierarchical Indian Buffet Processes-With Indications for Microbiome Species Sampling Models
- TranSQL+: Serving Large Language Models with SQL on Low-Resource Hardware
- Learning an Optimal Assortment Policy under Observational Data
- Neural Posterior Estimation for Cataloging Astronomical Images with Spatially Varying Backgrounds and Point Spread Functions
- Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks
- CAARMA: Class Augmentation with Adversarial Mixup Regularization
- Optimistic Online Learning in Symmetric Cone Games
- Deep spatio-temporal point processes: Advances and new directions
- Machine Learning-Based Prediction of Quality Shifts on Video Streaming Over 5G
- HMAE: Self-Supervised Few-Shot Learning for Quantum Spin Systems
- Fairmetrics: An R package for group fairness evaluation
- Macro Graph of Experts for Billion-Scale Multi-Task Recommendation
- How do Probabilistic Graphical Models and Graph Neural Networks Look at Network Data?
- FlatCAD: Fast Curvature Regularization of Neural SDFs for CAD Models
- Beyond Blur: A Fluid Perspective on Generative Diffusion Models
- Global Convergence of Iteratively Reweighted Least Squares for Robust Subspace Recovery
- FuSeFL: Fully Secure and Scalable Cross-Silo Federated Learning
- On the Foundation of Distributionally Robust Reinforcement Learning
- Provable Emergence of Deep Neural Collapse and Low-Rank Bias in $L^2$-Regularized Nonlinear Networks
- Revisiting Differentially Private Hyper-parameter Tuning
- SINDy-RL: Interpretable and Efficient Model-Based Reinforcement Learning
- Tabular and Deep Reinforcement Learning for Gittins Index
- When predict can also explain: few-shot prediction to select better neural latents
- What Did I Do Wrong? Quantifying LLMs' Sensitivity and Consistency to Prompt Engineering
- Probabilistic Classification of Near-Surface Shallow-Water Sediments using A Portable Free-Fall Penetrometer
- Making Hard Problems Easier with Custom Data Distributions and Loss Regularization: A Case Study in Modular Arithmetic
- Local Off-Grid Weather Forecasting with Multi-Modal Earth Observation Data
- LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation
- Active Learning-Based Optimization of Hydroelectric Turbine Startup to Minimize Fatigue Damage
- HeteroTune: Efficient Federated Learning for Large Heterogeneous Models
- ReHub: Linear Complexity Graph Transformers with Adaptive Hub-Spoke Reassignment
- DeMem: Privacy-Enhanced Robust Adversarial Learning via De-Memorization
- An Inquiry into Datacenter TCO for LLM Inference with FP8
- WaveStitch: Flexible and Fast Conditional Time Series Generation with Diffusion Models
- Manifold learning in metric spaces
- Understanding Bias Reinforcement in LLM Agents Debate
- Kernel Ridge Regression for Efficient Learning of High-Capacity Hopfield Networks
- Fault Detection in New Wind Turbines with Limited Data by Generative Transfer Learning
- DeeP-Mod: Deep Dynamic Programming based Environment Modelling using Feature Extraction
- Generative Machine Learning in Adaptive Control of Dynamic Manufacturing Processes: A Review
- Where's the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
- USPR: Learning a Unified Solver for Profiled Routing
- Reconsidering Fairness Through Unawareness From the Perspective of Model Multiplicity
- How to craft a deep reinforcement learning policy for wind farm flow control
- CoxNTF: A New Approach for Joint Clustering and Prediction in Survival Analysis
- Constrained Diffusion Models for Synthesizing Representative Power Flow Datasets
- GUST: Quantifying Free-Form Geometric Uncertainty of Metamaterials Using Small Data
- Continual Learning for Generative AI: From LLMs to MLLMs and Beyond
- Mitigating Message Imbalance in Fraud Detection with Dual-View Graph Representation Learning
- The Target Polish: A New Approach to Outlier-Resistant Non-Negative Matrix Factorization
- From Small to Large: A Graph Convolutional Network Approach for Solving Assortment Optimization Problems
- HV Metric For Time-Domain Full Waveform Inversion
- Rao Differential Privacy
- Factor Informed Double Deep Learning For Average Treatment Effect Estimation
- Frequency Response Identification of Low-Order Systems: Finite-Sample Analysis
- Integrative Experiments Identify How Punishment Impacts Welfare in Public Goods Games
- On the sample complexity of semi-supervised multi-objective learning
- VROOM - Visual Reconstruction over Onboard Multiview
- Deep Learning with Self-Attention and Enhanced Preprocessing for Precise Diagnosis of Acute Lymphoblastic Leukemia from Bone Marrow Smears in Hemato-Oncology
- TokenLake: A Unified Segment-level Prefix Cache Pool for Fine-grained Elastic Long-Context LLM Serving
- Learning Short-Term and Long-Term Patterns of High-Order Dynamics in Real-World Networks
- CLIFF: Continual Learning for Incremental Flake Features in 2D Material Identification
- Quickly Tuning Foundation Models for Image Segmentation
- DropLoRA: Sparse Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
- Who Wins the Race? (R Vs Python) - An Exploratory Study on Energy Consumption of Machine Learning Algorithms
- Detecting Struggling Student Programmers using Proficiency Taxonomies
- Programmable k-local Ising Machines and all-optical Kolmogorov-Arnold Networks on Photonic Platforms
- MahaParaphrase: A Marathi Paraphrase Detection Corpus and BERT-based Models
- Efficient Zero-Shot Long Document Classification by Reducing Context Through Sentence Ranking
- High-Order Langevin Monte Carlo Algorithms
- Boltzina: Efficient and Accurate Virtual Screening via Docking-Guided Binding Prediction with Boltz-2
- Towards Optimal Convolutional Transfer Learning Architectures for Breast Lesion Classification and ACL Tear Detection
- CausalSent: Interpretable Sentiment Classification with RieszNet
- The Statistical Fairness-Accuracy Frontier
- Citizen Centered Climate Intelligence: Operationalizing Open Tree Data for Urban Cooling and Eco-Routing in Indian Cities
- Text Meets Topology: Rethinking Out-of-distribution Detection in Text-Rich Networks
- Segmentation and Classification of Pap Smear Images for Cervical Cancer Detection Using Deep Learning
- ISACL: Internal State Analyzer for Copyrighted Training Data Leakage
- Robust Anomaly Detection in Industrial Environments via Meta-Learning
- A Contrastive Learning-Guided Confident Meta-learning for Zero Shot Anomaly Detection
- Diffusion-Based Data Augmentation for Medical Image Segmentation
- Alternating Training-based Label Smoothing Enhances Prompt Generalization
- ILRe: Intermediate Layer Retrieval for Context Compression in Causal Language Models
- WOMAC: A Mechanism For Prediction Competitions
- Entanglement Detection with Quantum-inspired Kernels and SVMs
- DesCartes Builder: A Tool to Develop Machine-Learning Based Digital Twins
- Unseen Speaker and Language Adaptation for Lightweight Text-To-Speech with Adapters
- Development of a Neural Network Model for Currency Detection to aid visually impaired people in Nigeria
- How Quantization Shapes Bias in Large Language Models
- Incorporating Pre-trained Diffusion Models in Solving the Schrödinger Bridge Problem
- Detecting and Characterizing Planning in Language Models
- BirdRecorder's AI on Sky: Safeguarding birds of prey by detection and classification of tiny objects around wind turbines
- SpotEdit: Evaluating Visually-Guided Image Editing Methods
- Hybrid Quantum-Classical Learning for Multiclass Image Classification
- PCR-CA: Parallel Codebook Representations with Contrastive Alignment for Multiple-Category App Recommendation
- Scene-Aware Vectorized Memory Multi-Agent Framework with Cross-Modal Differentiated Quantization VLMs for Visually Impaired Assistance
- Introduction to Regularization and Learning Methods for Inverse Problems
- Emerging Semantic Segmentation from Positive and Negative Coarse Label Learning
- Practical GPU Choices for Earth Observation: ResNet-50 Training Throughput on Integrated, Laptop, and Cloud Accelerators
- Clinical characteristics, complications and outcomes of critically ill patients with Dengue in Brazil, 2012-2024: a nationwide, multicentre cohort study
- Flexibility-Conditioned Protein Structure Design with Flow Matching
- Flash Sparse Attention: An Alternative Efficient Implementation of Native Sparse Attention Kernel
- One-step learning algorithm selection for classification via convolutional neural networks
- On the Edge of Memorization in Diffusion Models
- Rethinking Federated Learning Over the Air: The Blessing of Scaling Up
- Adaptive Ensemble Learning with Gaussian Copula for Load Forecasting
- Copyright Protection for 3D Molecular Structures with Watermarking
- Randomly Removing 50% of Dimensions in Text Embeddings has Minimal Impact on Retrieval and Classification Tasks
- Multi-layer Abstraction for Nested Generation of Options (MANGO) in Hierarchical Reinforcement Learning
- SuperGen: An Efficient Ultra-high-resolution Video Generation System with Sketching and Tiling
- Evaluating the Quality of the Quantified Uncertainty for (Re)Calibration of Data-Driven Regression Models
- Puzzle: Scheduling Multiple Deep Learning Models on Mobile Device with Heterogeneous Processors
- Multi-domain Distribution Learning for De Novo Drug Design
- Spectrum Prediction in the Fractional Fourier Domain with Adaptive Filtering
- Learning to Detect Label Errors by Making Them: A Method for Segmentation and Object Detection Datasets
- Choice Outweighs Effort: Facilitating Complementary Knowledge Fusion in Federated Learning via Re-calibration and Merit-discrimination
- Generative Feature Imputing - A Technique for Error-resilient Semantic Communication
- Topology Aware Neural Interpolation of Scalar Fields
- A Novel Framework for Uncertainty Quantification via Proper Scores for Classification and Beyond
- Does simple trump complex? Comparing strategies for adversarial robustness in DNNs
- Enhancing Differentially Private Linear Regression via Public Second-Moment
- Riemannian Change Point Detection on Manifolds with Robust Centroid Estimation
- Training Transformers for Mesh-Based Simulations
- Weisfeiler-Lehman meets Events: An Expressivity Analysis for Continuous-Time Dynamic Graph Neural Networks
- FedGreed: A Byzantine-Robust Loss-Based Aggregation Method for Federated Learning
- Quantum-Classical Hybrid Framework for Zero-Day Time-Push GNSS Spoofing Detection
- Provable Mixed-Noise Learning with Flow-Matching
- Frozen in Time: Parameter-Efficient Time Series Transformers via Reservoir-Induced Feature Expansion and Fixed Random Dynamics
- Unveiling the Actual Performance of Neural-based Models for Equation Discovery on Graph Dynamical Systems
- HypER: Hyperbolic Echo State Networks for Capturing Stretch-and-Fold Dynamics in Chaotic Flows
- Aligning the Evaluation of Probabilistic Predictions with Downstream Value
- Increasing Interaction Fidelity: Training Routines for Biomechanical Models in HCI
- HemePLM-Diffuse: A Scalable Generative Framework for Protein-Ligand Dynamics in Large Biomolecular System
- WHAR Datasets: An Open Source Library for Wearable Human Activity Recognition
- Generative Latent Diffusion Model for Inverse Modeling and Uncertainty Analysis in Geological Carbon Sequestration
- Sparse and Dense Retrievers Learn Better Together: Joint Sparse-Dense Optimization for Text-Image Retrieval
- Analysis of Transferability Estimation Metrics for Surgical Phase Recognition
- Walk-on-Interfaces: A Monte Carlo Estimator for an Elliptic Interface Problem with Nonhomogeneous Flux Jump Conditions and a Neumann Boundary Condition
- TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling
- Bootstrapping Conditional Retrieval for User-to-Item Recommendations
- Predictability Enables Parallelization of Nonlinear State Space Models
- The compressible Neural Particle Method for Simulating Compressible Viscous Fluid Flows
- Preserving Domain Generalization in Fine-Tuning via Joint Parameter Selection
- GraphPPD: Posterior Predictive Modelling for Graph-Level Inference
- KL-Regularised Q-Learning: A Token-level Action-Value perspective on Online RLHF
- EduRABSA: An Education Review Dataset for Aspect-based Sentiment Analysis Tasks
- Limitations of refinement methods for weak to strong generalization
- GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection
- CP4SBI: Local Conformal Calibration of Credible Sets in Simulation-Based Inference
- A Decoupled LOB Representation Framework for Multilevel Manipulation Detection with Supervised Contrastive Learning
- Neural Stochastic Differential Equations on Compact State-Spaces
- SugarcaneShuffleNet: A Very Fast, Lightweight Convolutional Neural Network for Diagnosis of 15 Sugarcane Leaf Diseases
- Is the Frequency Principle always valid?
- MetaFed: Advancing Privacy, Performance, and Sustainability in Federated Metaverse Systems
- ShortListing Model: A Streamlined SimplexDiffusion for Discrete Variable Generation
- Trust Me, I Know This Function: Hijacking LLM Static Analysis using Bias
- ShaLa: Multimodal Shared Latent Space Modelling
- FedERL: Federated Efficient and Robust Learning for Common Corruptions
- Effective Clustering for Large Multi-Relational Graphs
- Mutual Information Surprise: Rethinking Unexpectedness in Autonomous Systems
- FRAME: Comprehensive Risk Assessment Framework for Adversarial Machine Learning Threats
- Modular MeanFlow: Towards Stable and Scalable One-Step Generative Modeling
- TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling
- Rectified Robust Policy Optimization for Model-Uncertain Constrained Reinforcement Learning without Strong Duality
- ReviBranch: Deep Reinforcement Learning for Branch-and-Bound with Revived Trajectories
- A Systematic Literature Review on Multi-label Data Stream Classification
- Adversarial Examples Are Not Bugs, They Are Superposition
- MoE-Inference-Bench: Performance Evaluation of Mixture of Expert Large Language and Vision Models
- A Human-In-The-Loop Approach for Improving Fairness in Predictive Business Process Monitoring
- Learning Interpretable Differentiable Logic Networks for Time-Series Classification
- GateTS: Versatile and Efficient Forecasting via Attention-Inspired routed Mixture-of-Experts
- Modeling Irregular Astronomical Time Series with Neural Stochastic Delay Differential Equations
- Gumbel-MPNN: Graph Rewiring with Gumbel-Softmax
- Bridging Graph and State-Space Modeling for Intensive Care Unit Length of Stay Prediction
- Exploring Efficient Learning of Small BERT Networks with LoRA and DoRA
- ChartMaster: Advancing Chart-to-Code Generation with Real-World Charts and Chart Similarity Reinforcement Learning
- A Proportional-Integral Controller-Incorporated SGD Algorithm for High Efficient Latent Factor Analysis
- Quantum Graph Attention Network: A Novel Quantum Multi-Head Attention Mechanism for Graph Learning
- Longitudinal Progression Prediction of Alzheimer's Disease with Tabular Foundation Model
- Heterogeneous co-occurrence embedding for visual information exploration
- Towards Synthesizing Normative Data for Cognitive Assessments Using Generative Multimodal Large Language Models
- TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training
- Characterizing the Behavior of Training Mamba-based State Space Models on GPUs
- Neural Contrast Expansion for Explainable Structure-Property Prediction and Random Microstructure Design
- UM3: Unsupervised Map to Map Matching
- Quantifying Out-of-Training Uncertainty of Neural-Network based Turbulence Closures
- Reinforcement-Guided Hyper-Heuristic Hyperparameter Optimization for Fair and Explainable Spiking Neural Network-Based Financial Fraud Detection
- Attention Layers Add Into Low-Dimensional Residual Subspaces
- Sig-DEG for Distillation: Making Diffusion Models Faster and Lighter
- Disentangling Polysemantic Neurons with a Null-Calibrated Polysemanticity Index and Causal Patch Interventions
- Unveiling the Latent Directions of Reflection in Large Language Models
- Online Learning for Approximately-Convex Functions with Long-term Adversarial Constraints
- Learned Structure in CARTRIDGES: Keys as Shareable Routers in Self-Studied Representations
- Learning ON Large Datasets Using Bit-String Trees
- Reconciling Communication Compression and Byzantine-Robustness in Distributed Learning
- MoE-Beyond: Learning-Based Expert Activation Prediction on Edge Devices
- Stochastic Gradient Descent with Strategic Querying
- Towards Safeguarding LLM Fine-tuning APIs against Cipher Attacks
- Sharpness-Aware Geometric Defense for Robust Out-Of-Distribution Detection
- Curvature Learning for Generalization of Hyperbolic Neural Networks
- DeepCFD: Efficient near-ground airfoil lift coefficient approximation with deep convolutional neural networks
- Explainable AI (XAI) for Arrhythmia detection from electrocardiograms
- Physics-informed neural network for fatigue life prediction of irradiated austenitic and ferritic/martensitic steels
- AdaptiveK Sparse Autoencoders: Dynamic Sparsity Allocation for Interpretable LLM Representations
- Quantum-Inspired DRL Approach with LSTM and OU Noise for Cut Order Planning Optimization
- CrystalDiT: A Diffusion Transformer for Crystal Generation
- Leveraging the Christoffel Function for Outlier Detection in Data Streams
- Recurrent Transformer U-Net Surrogate for Flow Modeling and Data Assimilation in Subsurface Formations with Faults
- A Novel Unified Extended Matrix for Graph Signal Processing: Theory and Application
- Enhancing Transformer-Based Foundation Models for Time Series Forecasting via Bagging, Boosting and Statistical Ensembles
- AdapSNE: Adaptive Fireworks-Optimized and Entropy-Guided Dataset Sampling for Edge DNN Training
- A Laplace diffusion-based transformer model for heart rate forecasting within daily activity context
- OASIS: Open-world Adaptive Self-supervised and Imbalanced-aware System
- WISCA: A Lightweight Model Transition Method to Improve LLM Training via Weight Scaling
- Multidimensional Distributional Neural Network Output Demonstrated in Super-Resolution of Surface Wind Speed
- Native Logical and Hierarchical Representations with Subspace Embeddings
- A novel auxiliary equation neural networks method for exactly explicit solutions of nonlinear partial differential equations
- Aligning Distributionally Robust Optimization with Practical Deep Learning Needs
- Deep Learning for Markov Chains: Lyapunov Functions, Poisson's Equation, and Stationary Distributions
- Hyperbolic Multimodal Representation Learning for Biological Taxonomies
- DR-CircuitGNN: Training Acceleration of Heterogeneous Circuit Graph Neural Network on GPUs
- Latent Graph Learning in Generative Models of Neural Signals
- Anchor-MoE: A Mean-Anchored Mixture of Experts For Probabilistic Regression
- Uncertainty Propagation Networks for Neural Ordinary Differential Equations
- EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
- Towards Controllable Speech Synthesis in the Era of Large Language Models: A Systematic Survey
- DRT: Deep Reasoning Translation via Long Chain-of-Thought
- From Models to Network Topologies: A Topology Inference Attack in Decentralized Federated Learning
- Disentangling Exploration of Large Language Models by Optimal Exploitation
- Visual Generation Without Guidance
- SCP-116K: A High-Quality Problem-Solution Dataset and a Generalized Pipeline for Automated Extraction in the Higher Education Science Domain
- Optimizing the Optimizer for Physics-Informed Neural Networks and Kolmogorov-Arnold Networks
- 360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation
- TombRaider: Entering the Vault of History to Jailbreak Large Language Models
- Evaluation of Large Language Models via Coupled Token Generation
- Field Matching: an Electrostatic Paradigm to Generate and Transfer Data
- Investigating the Robustness of Deductive Reasoning with Large Language Models
- EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models
- Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models
- Forgotten Polygons: Multimodal Large Language Models are Shape-Blind
- LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation
- BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities
- AutoMisty: A Multi-Agent LLM Framework for Automated Code Generation in the Misty Social Robot
- CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation
- MedLoRD: A Medical Low-Resource Diffusion Model for High-Resolution 3D CT Image Synthesis
- HoarePrompt: Structural Reasoning About Program Correctness in Natural Language
- ImF: Implicit Fingerprint for Large Language Models
- Exponentially Weighted Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection Model Training in Unmanned Aerial Vehicles Surveillance Scenarios
- Celler: A Genomic Language Model for Long-Tailed Single-Cell Annotation
- CLaP -- State Detection from Time Series
- X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents
- VeriCoder: Enhancing LLM-Based RTL Code Generation through Functional Correctness Validation
- SVD Based Least Squares for X-Ray Pneumonia Classification Using Deep Features
- Theory of Mind in Large Language Models: Assessment and Enhancement
- Multimodal Masked Autoencoder Pre-training for 3D MRI-Based Brain Tumor Analysis with Missing Modalities
- ICQuant: Index Coding enables Low-bit LLM Quantization
- WATCH: Adaptive Monitoring for AI Deployments via Weighted-Conformal Martingales
- DSADF: Thinking Fast and Slow for Decision Making
- A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?
- From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora
- Explainable Prediction of the Mechanical Properties of Composites with CNNs
- Learning with Spike Synchrony in Spiking Neural Networks
- Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding
- IRONIC: Coherence-Aware Reasoning Chains for Multi-Modal Sarcasm Detection
- An Outlook on the Opportunities and Challenges of Multi-Agent AI Systems
- Exploring the Vulnerability of the Content Moderation Guardrail in Large Language Models via Intent Manipulation
- Debate-to-Detect: Reformulating Misinformation Detection as a Real-World Debate with Large Language Models
- Large Language Models in the Task of Automatic Validation of Text Classifier Predictions
- Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate
- RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving
- Equivariant Spherical Transformer for Efficient Molecular Modeling
- Accountability Attribution: Tracing Model Behavior to Training Processes
- EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models
- Auto prompt sql: a resource-efficient architecture for text-to-sql translation in constrained environments
- AudioLens: A Closer Look at Auditory Attribute Perception of Large Audio-Language Models
- From Legal Texts to Defeasible Deontic Logic via LLMs: A Study in Automated Semantic Analysis
- Effective Red-Teaming of Policy-Adherent Agents
- A foundation model with multi-variate parallel attention to generate neuronal activity
- BiMark: Unbiased Multilayer Watermarking for Large Language Models
- Multi-Level Fusion Graph Neural Network for Molecule Property Prediction
- CRABS: A syntactic-semantic pincer strategy for bounding LLM interpretation of Python notebooks
- QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation
- Leveraging Large Language Models for Accurate Sign Language Translation in Low-Resource Scenarios
- BRAIN: Bias-Mitigation Continual Learning Approach to Vision-Brain Understanding
- Explain and Monitor Deep Learning Models for Computer Vision using Obz AI
- Why Synthetic Isn't Real Yet: A Diagnostic Framework for Contact Center Dialogue Generation
- Deep Learning and Matrix Completion-aided IoT Network Localization in the Outlier Scenarios
- KillChainGraph: ML Framework for Predicting and Mapping ATT&CK Techniques
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols
- Type-Compliant Adaptation Cascades: Adapting Programmatic LM Workflows to Data
- ANO: Faster is Better in Noisy Landscape
- SafeBimanual: Diffusion-based Trajectory Optimization for Safe Bimanual Manipulation
- Bridging Models to Defend: A Population-Based Strategy for Robust Adversarial Defense
- Evasive Active Hypothesis Testing with Deep Neuroevolution: The Single- and Multi-Agent Cases
- Defending against Jailbreak through Early Exit Generation of Large Language Models
- Using LLM for Real-Time Transcription and Summarization of Doctor-Patient Interactions into ePuskesmas in Indonesia: A Proof-of-Concept Study
- Can Large Language Models Act as Ensembler for Multi-GNNs?
- DataTales: A Benchmark for Real-World Intelligent Data Narration
- Scaling Capability in Token Space: An Analysis of Large Vision Language Model
- PRISM: Efficient Long-Range Reasoning With Short-Context LLMs
- SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially?
- Task Memory Engine (TME): Enhancing State Awareness for Multi-Step LLM Agent Tasks
- Metacognition and Uncertainty Communication in Humans and Large Language Models
- Chemical classification program synthesis using generative artificial intelligence
- Jigsaw-Puzzles: From Seeing to Understanding to Reasoning in Vision-Language Models
- From Reasoning to Learning: A Survey on Hypothesis Discovery and Rule Learning with Large Language Models
- Toward Knowledge-Guided AI for Inverse Design in Manufacturing: A Perspective on Domain, Physics, and Human-AI Synergy
- Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs
- WHEN TO ACT, WHEN TO WAIT: Modeling the Intent-Action Alignment Problem in Dialogue
- MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark
- Taming the Untamed: Graph-Based Knowledge Retrieval and Reasoning for MLLMs to Conquer the Unknown
- Architecting Clinical Collaboration: Multi-Agent Reasoning Systems for Multimodal Medical VQA
- When Developer Aid Becomes Security Debt: A Systematic Analysis of Insecure Behaviors in LLM Coding Agents
- Understanding visual attention beehind bee-inspired UAV navigation
- Transformer-based Models to Deal with Heterogeneous Environments in Human Activity Recognition
- Adversarial Illusions in Multi-Modal Embeddings
- Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
- Does GPT-4 surpass human performance in linguistic pragmatics?
- The Dual Impact of Virtual Reality: Examining the Addictive Potential and Therapeutic Applications of Immersive Media in the Metaverse
- Towards Identifiable Unsupervised Domain Translation: A Diversified Distribution Matching Approach
- Intelligent Condition Monitoring of Industrial Plants: An Overview of Methodologies and Uncertainty Management Strategies
- Optimizing the Design of an Artificial Pancreas to Improve Diabetes Management
- Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining
- History-Aware and Dynamic Client Contribution in Federated Learning
- Optimization-based Prompt Injection Attack to LLM-as-a-Judge
- Quadratic Binary Optimization with Graph Neural Networks
- I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey
- Large Language Models Meet NLP: A Survey
- TokenUnify: Scaling Up Autoregressive Pretraining for Neuron Segmentation
- LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation
- Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG
- Hypformer: Exploring Efficient Transformer Fully in Hyperbolic Space
- PII-Compass: Guiding LLM training data extraction prompts towards the target PII via grounding
- Visual Evaluative AI: A Hypothesis-Driven Tool with Concept-Based Explanations and Weight of Evidence
- Source Code Summarization in the Era of Large Language Models
- Graph Memory Learning: Imitating Lifelong Remembering and Forgetting of Brain Networks
- Interaction-Data-guided Conditional Instrumental Variables for Debiasing Recommender Systems
- A Tie-breaking based Local Search Algorithm for Stable Matching Problems
- Orthogonal Finetuning for Direct Preference Optimization
- A Multisource Fusion Framework for Cryptocurrency Price Movement Prediction
- $\mathsf{OPA}$: One-shot Private Aggregation with Single Client Interaction and its Applications to Federated Learning
- FlexTSF: A Flexible Forecasting Model for Time Series with Variable Regularities
- TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
- Watermarking Visual Concepts for Diffusion Models
- MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation
- Draft Model Knows When to Stop: Self-Verification Speculative Decoding for Long-Form Generation
- Missing Melodies: AI Music Generation and its "Nearly" Complete Omission of the Global South
- Agent-Testing Agent: A Meta-Agent for Automated Testing and Evaluation of Conversational AI Agents
- Retrieval Capabilities of Large Language Models Scale with Pretraining FLOPs
- Convergence and Generalization of Anti-Regularization for Parametric Models
- FedKLPR: Personalized Federated Learning for Person Re-Identification with Adaptive Pruning
- Bias Amplification in Stable Diffusion's Representation of Stigma Through Skin Tones and Their Homogeneity
- Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation
- A Synthetic Dataset for Manometry Recognition in Robotic Applications
- Multimodal Representation Learning Conditioned on Semantic Relations
- DinoTwins: Combining DINO and Barlow Twins for Robust, Label-Efficient Vision Transformers
- TANDEM: Temporal Attention-guided Neural Differential Equations for Missingness in Time Series Classification
- An experimental approach: The graph of graphs
- OmniMRI: A Unified Vision-Language Foundation Model for Generalist MRI Interpretation
- Activation Transport Operators
- LodeStar: Long-horizon Dexterity via Synthetic Data Augmentation from Human Demonstrations
- In-Context Algorithm Emulation in Fixed-Weight Transformers
- MetaGen: A DSL, Database, and Benchmark for VLM-Assisted Metamaterial Generation
- UQ: Assessing Language Models on Unsolved Questions
- RubikSQL: Lifelong Learning Agentic Knowledge Base as an Industrial NL2SQL System
- GWM: Towards Scalable Gaussian World Models for Robotic Manipulation
- Steering When Necessary: Flexible Steering Large Language Models with Backtracking
- Stop Spinning Wheels: Mitigating LLM Overthinking via Mining Patterns for Early Reasoning Exit
- ControlEchoSynth: Boosting Ejection Fraction Estimation Models via Controlled Video Diffusion
- Finding Outliers in a Haystack: Anomaly Detection for Large Pointcloud Scenes
- Few-Shot Pattern Detection via Template Matching and Regression
- Weights-Rotated Preference Optimization for Large Language Models
- Hierarchical Vision-Language Learning for Medical Out-of-Distribution Detection
- Consistent Opponent Modeling of Static Opponents in Imperfect-Information Games
- Attacking LLMs and AI Agents: Advertisement Embedding Attacks Against Large Language Models
- Robustness Feature Adapter for Efficient Adversarial Training
- Unlearning as Ablation: Toward a Falsifiable Benchmark for Generative Scientific Discovery
- Database Normalization via Dual-LLM Self-Refinement
- Instant Preference Alignment for Text-to-Image Diffusion Models
- Speculative Safety-Aware Decoding
- EEG-FM-Bench: A Comprehensive Benchmark for the Systematic Evaluation of EEG Foundation Models
- Talking to Robots: A Practical Examination of Speech Foundation Models for HRI Applications
- DiffusionGS: Generative Search with Query Conditioned Diffusion in Kuaishou
- Algebraic Approach to Ridge-Regularized Mean Squared Error Minimization in Minimal ReLU Neural Network
- Proximal Supervised Fine-Tuning
- Adaptive Output Steps: FlexiSteps Network for Dynamic Trajectory Prediction
- MeshSplat: Generalizable Sparse-View Surface Reconstruction via Gaussian Splatting
- Scalable Engine and the Performance of Different LLM Models in a SLURM based HPC architecture
- UniSino: Physics-Driven Foundational Model for Universal CT Sinogram Standardization
- Limitations of Normalization in Attention Mechanism
- Limits of message passing for node classification: How class-bottlenecks restrict signal-to-noise ratio
- Group Expectation Policy Optimization for Stable Heterogeneous Reinforcement Learning in LLMs
- VISA: Group-wise Visual Token Selection and Aggregation via Graph Summarization for Efficient MLLMs Inference
- AVAM: Universal Training-free Adaptive Visual Anchoring Embedded into Multimodal Large Language Model for Multi-image Question Answering
- Ada-TransGNN: An Air Quality Prediction Model Based On Adaptive Graph Convolutional Networks
- FasterVoiceGrad: Faster One-step Diffusion-Based Voice Conversion with Adversarial Diffusion Conversion Distillation
- Vocoder-Projected Feature Discriminator
- Edge-Enhanced Vision Transformer Framework for Accurate AI-Generated Image Detection
- Designing Practical Models for Isolated Word Visual Speech Recognition
- A Defect Classification Framework for AI-Based Software Systems (AI-ODC)
- Riemannian Optimization for LoRA on the Stiefel Manifold
- AMELIA: A Family of Multi-task End-to-end Language Models for Argumentation
- See What You Need: Query-Aware Visual Intelligence through Reasoning-Perception Loops
- A Feminist Account of Intersectional Algorithmic Fairness
- Debiasing Multilingual LLMs in Cross-lingual Latent Space
- Understanding Subword Compositionality of Large Language Models
- Automating Conflict-Aware ACL Configurations with Natural Language Intents
- Previously on... Automating Code Review
- Towards Continual Visual Anomaly Detection in the Medical Domain
- AQ-PCDSys: An Adaptive Quantized Planetary Crater Detection System for Autonomous Space Exploration
- HyST: LLM-Powered Hybrid Retrieval over Semi-Structured Tabular Data
- Dynamic Fusion Multimodal Network for SpeechWellness Detection
- Arnold: a generalist muscle transformer policy
- Named Entity Recognition of Historical Text via Large Language Model
- A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
- CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics
- Test-Time Scaling Strategies for Generative Retrieval in Multimodal Conversational Recommendations
- Learning from Few Samples: A Novel Approach for High-Quality Malcode Generation
- Assessing the Noise Robustness of Class Activation Maps: A Framework for Reliable Model Interpretability
- The Computational Complexity of Satisfiability in State Space Models
- Amortized Sampling with Transferable Normalizing Flows
- AdLoCo: adaptive batching significantly improves communications efficiency and convergence for Large Language Models
- A biological vision inspired framework for machine perception of abutting grating illusory contours
- Provable Generalization in Overparameterized Neural Nets
- ResLink: A Novel Deep Learning Architecture for Brain Tumor Classification with Area Attention and Residual Connections
- Deep Learning-Assisted Detection of Sarcopenia in Cross-Sectional Computed Tomography Imaging
- Explain Before You Answer: A Survey on Compositional Visual Reasoning
- Bine Trees: Enhancing Collective Operations by Optimizing Communication Locality
- Chinese Court Simulation with LLM-Based Agent System
- CultranAI at PalmX 2025: Data Augmentation for Cultural Knowledge Representation
- Omne-R1: Learning to Reason with Memory for Multi-hop Question Answering
- Mind the (Language) Gap: Towards Probing Numerical and Cross-Lingual Limits of LVLMs
- Modality-Specific Speech Enhancement and Noise-Adaptive Fusion for Acoustic and Body-Conduction Microphone Framework
- Capturing Legal Reasoning Paths from Facts to Law in Court Judgments using Knowledge Graphs
- Agentic AI for Software: thoughts from Software Engineering community
- The Arabic Generality Score: Another Dimension of Modeling Arabic Dialectness
- Condition Weaving Meets Expert Modulation: Towards Universal and Controllable Image Generation
- Graph-R1: Incentivizing the Zero-Shot Graph Learning Capability in LLMs via Explicit Reasoning
- Neural Proteomics Fields for Super-resolved Spatial Proteomics Prediction
- Physics-Inspired Spatial Temporal Graph Neural Networks for Predicting Industrial Chain Resilience
- A Survey of Threats Against Voice Authentication and Anti-Spoofing Systems
- NinA: Normalizing Flows in Action. Training VLA Models with Normalizing Flows
- Gaussian Primitive Optimized Deformable Retinal Image Registration
- DevLicOps: A Framework for Mitigating Licensing Risks in AI-Generated Code
- A Workflow for Map Creation in Autonomous Vehicle Simulations
- WildSpoof Challenge Evaluation Plan
- TriagerX: Dual Transformers for Bug Triaging Tasks with Content and Interaction Based Rankings
- Dream to Chat: Model-based Reinforcement Learning on Dialogues with User Belief Modeling
- Tri-Accel: Curvature-Aware Precision-Adaptive and Memory-Elastic Optimization for Efficient GPU Usage
- TextOnly: A Unified Function Portal for Text-Related Functions on Smartphones
- Degree of Staleness-Aware Data Updating in Federated Learning
- THEME: Enhancing Thematic Investing with Semantic Stock Representations and Temporal Dynamics
- HumanoidVerse: A Versatile Humanoid for Vision-Language Guided Multi-Object Rearrangement
- Drive As You Like: Strategy-Level Motion Planning Based on A Multi-Head Diffusion Model
- Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
- LLM-based Human-like Traffic Simulation for Self-driving Tests
- Explaining Black-box Language Models with Knowledge Probing Systems: A Post-hoc Explanation Perspective
- Combating Digitally Altered Images: Deepfake Detection
- ReFactX: Scalable Reasoning with Reliable Facts via Constrained Generation
- Score Matching on Large Geometric Graphs for Cosmology Generation
- GRADE: Generating multi-hop QA and fine-gRAined Difficulty matrix for RAG Evaluation
- An Efficient Dual-Line Decoder Network with Multi-Scale Convolutional Attention for Multi-organ Segmentation
- TabResFlow: A Normalizing Spline Flow Model for Probabilistic Univariate Tabular Regression
- SSG-Dit: A Spatial Signal Guided Framework for Controllable Video Generation
- Optimizing Neural Networks with Learnable Non-Linear Activation Functions via Lookup-Based FPGA Acceleration
- Linguistic Neuron Overlap Patterns to Facilitate Cross-lingual Transfer on Low-resource Languages
- Zero-shot Multimodal Document Retrieval via Cross-modal Question Generation
- Proximal Vision Transformer: Enhancing Feature Representation through Two-Stage Manifold Geometry
- Enhancing Knowledge Tracing through Leakage-Free and Recency-Aware Embeddings
- Convolutional Neural Networks for Accurate Measurement of Train Speed
- Two Birds with One Stone: Enhancing Uncertainty Quantification and Interpretability with Graph Functional Neural Process
- PlantVillageVQA: A Visual Question Answering Dataset for Benchmarking Vision-Language Models in Plant Science
- Token Homogenization under Positional Bias
- CE-RS-SBCIT A Novel Channel Enhanced Hybrid CNN Transformer with Residual, Spatial, and Boundary-Aware Learning for Brain Tumor MRI Analysis
- SACA: Selective Attention-Based Clustering Algorithm
- Natural Language Satisfiability: Exploring the Problem Distribution and Evaluating Transformer-based Language Models
- Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents
- Beyond Play and Pause: Turning GPT-4o Spatial Weakness into a Strength for In-Depth Interactive Video Learning
- Error analysis for the deep Kolmogorov method
- ONG: Orthogonal Natural Gradient Descent
- Scaling Graph Transformers: A Comparative Study of Sparse and Dense Attention
- LLM Assertiveness can be Mechanistically Decomposed into Emotional and Logical Components
- BudgetThinker: Empowering Budget-aware LLM Reasoning with Control Tokens
- Multi-Agent Visual-Language Reasoning for Comprehensive Highway Scene Understanding
- How to make Medical AI Systems safer? Simulating Vulnerabilities, and Threats in Multimodal Medical RAG System
- GPG-HT: Generalized Policy Gradient with History-Aware Decision Transformer for Probabilistic Path Planning
- Exposing Privacy Risks in Graph Retrieval-Augmented Generation
- SSFO: Self-Supervised Faithfulness Optimization for Retrieval-Augmented Generation
- Multi-Metric Preference Alignment for Generative Speech Restoration
- Module-Aware Parameter-Efficient Machine Unlearning on Transformers
- ClaimGen-CN: A Large-scale Chinese Dataset for Legal Claim Generation
- CoViPAL: Layer-wise Contextualized Visual Token Pruning for Large Vision-Language Models
- Neural Algorithmic Reasoners informed Large Language Model for Multi-Agent Path Finding
- PerPilot: Personalizing VLM-based Mobile Agents via Memory and Exploration
- Teaching LLMs to Think Mathematically: A Critical Study of Decision-Making via Optimization
- The AI Data Scientist
- SEAM: Semantically Equivalent Across Modalities Benchmark for Vision-Language Models
- ST-Raptor: LLM-Powered Semi-Structured Table Question Answering
- Unraveling the cognitive patterns of Large Language Models through module communities
- Disentangling the Factors of Convergence between Brains and Computer Vision Models
- Efficient Computation of Blackwell Optimal Policies using Rational Functions
- Hermes 4 Technical Report
- Adaptive Command: Real-Time Policy Adjustment via Language Models in StarCraft II
- Predicting User Grasp Intentions in Virtual Reality
- Robust Market Making: To Quote, or not To Quote
- ARL-Based Multi-Action Market Making with Hawkes Processes and Variable Volatility
- Bridging Foundation Models and Efficient Architectures: A Modular Brain Imaging Framework with Local Masking and Pretrained Representation Learning
- Humans Perceive Wrong Narratives from AI Reasoning Texts
- An Embodied AR Navigation Agent: Integrating BIM with Retrieval-Augmented Generation for Language Guidance
- GreenTEA: Gradient Descent with Topic-modeling and Evolutionary Auto-prompting
- Multimodal Appearance based Gaze-Controlled Virtual Keyboard with Synchronous Asynchronous Interaction for Low-Resource Settings
- "Accessibility people, you go work on that thing of yours over there": Addressing Disability Inclusion in AI Product Organizations
- Social Identity in Human-Agent Interaction: A Primer
- To Explain Or Not To Explain: An Empirical Investigation Of AI-Based Recommendations On Social Media Platforms
- Negative Shanshui: Real-time Interactive Ink Painting Synthesis
- STRelay: A Universal Spatio-Temporal Relaying Framework for Location Prediction with Future Spatiotemporal Contexts
- A Retrieval Augmented Spatio-Temporal Framework for Traffic Prediction
- The GPT-4o Shock Emotional Attachment to AI Models and Its Impact on Regulatory Acceptance: A Cross-Cultural Analysis of the Immediate Transition from GPT-4o to GPT-5
- Data and Context Matter: Towards Generalizing AI-based Software Vulnerability Detection
- The Impact of Artificial Intelligence on Human Thought
- Learn to Memorize: Optimizing LLM-based Agents with Adaptive Memory Framework
- Adaptive Variance-Penalized Continual Learning with Fisher Regularization
- Few-shot Class-incremental Fault Diagnosis by Preserving Class-Agnostic Knowledge with Dual-Granularity Representations
- Cognitive Decision Routing in Large Language Models: When to Think Fast, When to Think Slow
- From Classical Probabilistic Latent Variable Models to Modern Generative AI: A Unified Perspective
- Equinox: Holistic Fair Scheduling in Serving Large Language Models
- LatentFlow: Cross-Frequency Experimental Flow Reconstruction from Sparse Pressure via Latent Mapping
- HiCL: Hippocampal-Inspired Continual Learning
- Enabling Multi-Agent Systems as Learning Designers: Applying Learning Sciences to AI Instructional Design
- Optimizing Hyper parameters in CNN for Soil Classification using PSO and Whale Optimization Algorithm
- The Loupe: A Plug-and-Play Attention Module for Amplifying Discriminative Features in Vision Transformers
- Trust but Verify! A Survey on Verification Design for Test-time Scaling
- Situational Awareness as the Imperative Capability for Disaster Resilience in the Era of Complex Hazards and Artificial Intelligence
- COVID19 Prediction Based On CT Scans Of Lungs Using DenseNet Architecture
- Reflective Paper-to-Code Reproduction Enabled by Fine-Grained Verification
- The AI Model Risk Catalog: What Developers and Researchers Miss About Real-World AI Harms
- Invisible Filters: Cultural Bias in Hiring Evaluations Using Large Language Models
- MedRepBench: A Comprehensive Benchmark for Medical Report Interpretation
- Recall-Extend Dynamics: Enhancing Small Language Models through Controlled Exploration and Refined Offline Integration
- CALR: Corrective Adaptive Low-Rank Decomposition for Efficient Large Language Model Layer Compression
- STGAtt: A Spatial-Temporal Unified Graph Attention Network for Traffic Flow Forecasting
- Cybernaut: Towards Reliable Web Automation
- Making AI Inevitable: Historical Perspective and the Problems of Predicting Long-Term Technological Change
- Do Cognitively Interpretable Reasoning Traces Improve LLM Performance?
- DecoMind: A Generative AI System for Personalized Interior Design Layouts
- QueryBandits for Hallucination Mitigation: Exploiting Semantic Features for No-Regret Rewriting
- GPT-OSS-20B: A Comprehensive Deployment-Centric Analysis of OpenAI's Open-Weight Mixture of Experts Model
- Generative Artificial Intelligence and Agents in Research and Teaching
- Dynamic Sparse Attention on Mobile SoCs
- Assessing Consciousness-Related Behaviors in Large Language Models Using the Maze Test
- RoboBuddy in the Classroom: Exploring LLM-Powered Social Robots for Storytelling in Learning and Integration Activities
- Systematic Characterization of LLM Quantization: A Performance, Energy, and Quality Perspective
- CelloAI: Leveraging Large Language Models for HPC Software Development in High Energy Physics
- AI Product Value Assessment Model: An Interdisciplinary Integration Based on Information Theory, Economics, and Psychology
- WST: Weak-to-Strong Knowledge Transfer via Reinforcement Learning
- CellEcoNet: Decoding the Cellular Language of Pathology with Deep Learning for Invasive Lung Adenocarcinoma Recurrence Prediction
- Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling
- FAIRWELL: Fair Multimodal Self-Supervised Learning for Wellbeing Prediction
- Guarding Your Conversations: Privacy Gatekeepers for Secure Interactions with Cloud-Based AI Models
- EyeMulator: Improving Code Language Models by Mimicking Human Visual Attention
- Improving Performance, Robustness, and Fairness of Radiographic AI Models with Finely-Controllable Synthetic Data
- Interpreting the Effects of Quantization on LLMs
- Autonomous UAV Flight Navigation in Confined Spaces: A Reinforcement Learning Approach
- Exploring the Impact of Generative Artificial Intelligence on Software Development in the IT Sector: Preliminary Findings on Productivity, Efficiency and Job Security
- Understanding and Tackling Over-Dilution in Graph Neural Networks
- Out of Distribution Detection for Efficient Continual Learning in Quality Prediction for Arc Welding
- Revisiting Rule-Based Stuttering Detection: A Comprehensive Analysis of Interpretable Models for Clinical Applications
- Explainable AI for Predicting and Understanding Mathematics Achievement: A Cross-National Analysis of PISA 2018
- Evaluation and LLM-Guided Learning of ICD Coding Rationales
- PuzzleJAX: A Benchmark for Reasoning and Learning
- Route-and-Execute: Auditable Model-Card Matching and Specialty-Level Deployment
- Quantifying Sycophancy as Deviations from Bayesian Rationality in LLMs
- RADAR: A Reasoning-Guided Attribution Framework for Explainable Visual Data Analysis
- Complexity in finitary argumentation (extended version)
- WebSight: A Vision-First Architecture for Robust Web Agents
- Solving the Min-Max Multiple Traveling Salesmen Problem via Learning-Based Path Generation and Optimal Splitting
- PowerChain: Automating Distribution Grid Analysis with Agentic AI Workflows
- Rethinking How AI Embeds and Adapts to Human Values: Challenges and Opportunities
- MaRVL-QA: A Benchmark for Mathematical Reasoning over Visual Landscapes
- PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs
- From reactive to cognitive: brain-inspired spatial intelligence for embodied agents
- Large Language Model-Based Automatic Formulation for Stochastic Optimization Models
- Explainable Counterfactual Reasoning in Depression Medication Selection at Multi-Levels (Personalized and Population)
- Reinforcement Learning enhanced Online Adaptive Clinical Decision Support via Digital Twin powered Policy and Treatment Effect optimized Reward
- MC3G: Model Agnostic Causally Constrained Counterfactual Generation
- L-XAIDS: A LIME-based eXplainable AI framework for Intrusion Detection Systems
- Federated Reinforcement Learning for Runtime Optimization of AI Applications in Smart Eyewears
- ERF-BA-TFD+: A Multimodal Model for Audio-Visual Deepfake Detection
- MEENA (PersianMMMU): Multimodal-Multilingual Educational Exams for N-level Assessment
- Meta-R1: Empowering Large Reasoning Models with Metacognition
- Evolving Collective Cognition in Human-Agent Hybrid Societies: How Agents Form Stances and Boundaries
- Mimicking the Physicist's Eye: A VLM-centric Approach for Physics Formula Discovery
- Large Language Models as Universal Predictors? An Empirical Study on Small Tabular Datasets
- Solving Constrained Stochastic Shortest Path Problems with Scalarisation
- School of Reward Hacks: Hacking harmless tasks generalizes to misaligned behavior in LLMs
- Evaluating Retrieval-Augmented Generation Strategies for Large Language Models in Travel Mode Choice Prediction
- Consciousness as a Functor
- TradingGroup: A Multi-Agent Trading System with Self-Reflection and Data-Synthesis
- Evaluating Movement Initiation Timing in Ultimate Frisbee via Temporal Counterfactuals
- Spacer: Towards Engineered Scientific Inspiration
- A Taxonomy of Transcendence
- LLM-based Agentic Reasoning Frameworks: A Survey from Methods to Scenarios
- AgentRAN: An Agentic AI Architecture for Autonomous Control of Open 6G Networks
- Interpretable Early Failure Detection via Machine Learning and Trace Checking-based Monitoring
- FAIRGAMER: Evaluating Biases in the Application of Large Language Models to Video Games
- Language Models Coupled with Metacognition Can Outperform Reasoning Models
Research Sources: 929 | Generated: 8/27/2025