AI RESEARCH PAPERS & ACADEMIC SOURCES
- Multi-needle Localization for Pelvic Seed Implant Brachytherapy based on Tip-handle Detection and Matching
- Can multimodal representation learning by alignment preserve modality-specific information?
- Enhancing Semantic Segmentation with Continual Self-Supervised Pre-training
- ContextFlow: Training-Free Video Object Editing via Adaptive Context Enrichment
- Semantic and Visual Crop-Guided Diffusion Models for Heterogeneous Tissue Synthesis in Histopathology
- ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos
- Trainee Action Recognition through Interaction Analysis in CCATT Mixed-Reality Training
- Does Audio Matter for Modern Video-LLMs and Their Benchmarks?
- SmaRT: Style-Modulated Robust Test-Time Adaptation for Cross-Domain Brain Tumor Segmentation in MRI
- Accurate and Efficient Low-Rank Model Merging in Core Space
- From Restoration to Reconstruction: Rethinking 3D Gaussian Splatting for Underwater Scenes
- Degradation-Aware All-in-One Image Restoration via Latent Prior Encoding
- TS-P$^2$CL: Plug-and-Play Dual Contrastive Learning for Vision-Guided Medical Time Series Classification
- Selecting Optimal Camera Views for Gait Analysis: A Multi-Metric Assessment of 2D Projections
- FROQ: Observing Face Recognition Models for Efficient Quality Assessment
- Depth Edge Alignment Loss: DEALing with Depth in Weakly Supervised Semantic Segmentation
- Neurodynamics-Driven Coupled Neural P Systems for Multi-Focus Image Fusion
- Automatic Intermodal Loading Unit Identification using Computer Vision: A Scoping Review
- RCTDistill: Cross-Modal Knowledge Distillation Framework for Radar-Camera 3D Object Detection with Temporal Fusion
- Automated Labeling of Intracranial Arteries with Uncertainty Quantification Using Deep Learning
- Adaptive Fast-and-Slow Visual Program Reasoning for Long-Form VideoQA
- Dual-View Alignment Learning with Hierarchical-Prompt for Class-Imbalance Multi-Label Classification
- Multi-Agent Amodal Completion: Direct Synthesis with Fine-Grained Semantic Guidance
- Neural-MMGS: Multi-modal Neural Gaussian Splats for Large-Scale Scene Reconstruction
- Incorporating the Refractory Period into Spiking Neural Networks through Spike-Triggered Threshold Dynamics
- I2VWM: Robust Watermarking for Image to Video Generation
- OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models
- A$^2$M$^2$-Net: Adaptively Aligned Multi-Scale Moment for Few-Shot Action Recognition
- VideoArtGS: Building Digital Twins of Articulated Objects from Monocular Video
- Evict3R: Training-Free Token Eviction for Memory-Bounded Streaming Visual Geometry Transformers
- SISMA: Semantic Face Image Synthesis with Mamba
- Clothing agnostic Pre-inpainting Virtual Try-ON
- Development and validation of an AI foundation model for endoscopic diagnosis of esophagogastric junction adenocarcinoma: a cohort and deep learning study
- SD-VLM: Spatial Measuring and Understanding with Depth-Encoded Vision-Language Models
- Tailored Transformation Invariance for Industrial Anomaly Detection
- DINOv3-Diffusion Policy: Self-Supervised Large Visual Model for Visuomotor Diffusion Policy Learning
- Predicting Depth Maps from Single RGB Images and Addressing Missing Information in Depth Estimation
- SouLLMate: An Application Enhancing Diverse Mental Health Support with Adaptive LLMs, Prompt Engineering, and RAG Techniques
- Visual Instruction Pretraining for Domain-Specific Foundation Models
- MRN: Harnessing 2D Vision Foundation Models for Diagnosing Parkinson's Disease with Limited 3D MR Data
- PRNU-Bench: A Novel Benchmark and Model for PRNU-Based Camera Identification
- Interpreting Attention Heads for Image-to-Text Information Flow in Large Vision-Language Models
- Domain Adaptive Object Detection for Space Applications with Real-Time Constraints
- COLA: Context-aware Language-driven Test-time Adaptation
- From Benchmarks to Reality: Advancing Visual Anomaly Detection by the VAND 3.0 Challenge
- Tensor-Based Self-Calibration of Cameras via the TrifocalCalib Method
- Overview of PlantCLEF 2023: Image-based Plant Identification at Global Scale
- From Documents to Database: Failure Modes for Industrial Assets
- Through the Lens of Human-Human Collaboration: A Configurable Research Platform for Exploring Human-Agent Collaboration
- OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System
- MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction
- Scaling Efficient LLMs
- MindRef: Mimicking Human Memory for Hierarchical Reference Retrieval with Fine-Grained Location Awareness
- Temporal Scaling Law for Large Language Models
- LLaSA: A Sensor-Aware LLM for Natural Language Reasoning of Human Activity from IMU Data
- Evaluating Fairness in Large Vision-Language Models Across Diverse Demographic Attributes and Prompts
- On the Low-Rank Parametrization of Reward Models for Controlled Language Generation
- Rethinking Backdoor Detection Evaluation for Language Models
- GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding
- Autiverse: Eliciting Autistic Adolescents' Daily Narratives through AI-guided Multimodal Journaling
- LingoQ: Bridging the Gap between ESL Learning and Work through AI-Generated Work-Related Quizzes
- AutiHero: Leveraging Generative AI in Social Narratives to Engage Parents in Story-Driven Behavioral Guidance for Autistic Children
- WISE: Weak-Supervision-Guided Step-by-Step Explanations for Multimodal LLMs in Image Classification
- Deep Learning Inductive Biases for fMRI Time Series Classification during Resting-state and Movie-watching
- From Prediction to Understanding: Will AI Foundation Models Transform Brain Science?
- Explainability matters: The effect of liability rules on the healthcare sector
- Barwise Section Boundary Detection in Symbolic Music Using Convolutional Neural Networks
- Bayesian Ego-graph inference for Networked Multi-Agent Reinforcement Learning
- Safe Guaranteed Dynamics Exploration with Probabilistic Models
- Knowledge Distillation for Variational Quantum Convolutional Neural Networks on Heterogeneous Data
- Increase Alpha: Performance and Risk of an AI-Driven Trading Framework
- A Study on Stabilizer Rényi Entropy Estimation using Machine Learning
- Exploring AI Capabilities in Participatory Budgeting within Smart Cities: The Case of Sao Paulo
- Comparing RAG and GraphRAG for Page-Level Retrieval Question Answering on Math Textbook
- AdaptiveGuard: Towards Adaptive Runtime Safety for LLM-Powered Software
- Cross-Attention with Confidence Weighting for Multi-Channel Audio Alignment
- Leveraging Multiple Speech Enhancers for Non-Intrusive Intelligibility Prediction for Hearing-Impaired Listeners
- A Chain-of-thought Reasoning Breast Ultrasound Dataset Covering All Histopathology Categories
- DiffSyn: A Generative Diffusion Approach to Materials Synthesis Planning
- Prompt-with-Me: in-IDE Structured Prompt Management for LLM-Driven Software Engineering
- MaskVCT: Masked Voice Codec Transformer for Zero-Shot Voice Conversion With Increased Controllability via Multiple Guidances
- Agentic AI for Multi-Stage Physics Experiments at a Large-Scale User Facility Particle Accelerator
- Generalizability of Large Language Model-Based Agents: A Comprehensive Survey
- Significativity Indices for Agreement Values
- Towards Quantifying the Hessian Structure of Neural Networks
- Wasserstein Convergence of Score-based Generative Models under Semiconvexity and Discontinuous Gradients
- Minimax Adaptive Online Nonparametric Regression over Besov Spaces
- Multi-scale clustering and source separation of InSight mission seismic data
- Addressing the Inconsistency in Bayesian Deep Learning via Generalized Laplace Approximation
- Theoretical Insights into CycleGAN: Analyzing Approximation and Estimation Errors in Unpaired Data Generation
- Robust Reinforcement Learning with Dynamic Distortion Risk Measures
- Measure-to-measure interpolation using Transformers
- Are Deep Learning Methods Suitable for Downscaling Global Climate Projections? An Intercomparison for Temperature and Precipitation over Spain
- Flatten Graphs as Sequences: Transformers are Scalable Graph Generators
- Unsupervised Structural-Counterfactual Generation under Domain Shift
- Bayesian Algorithms for Adversarial Online Learning: from Finite to Infinite Action Spaces
- Cover Learning for Large-Scale Topology Representation
- Locally minimax optimal confidence sets for the best model
- On Quantification of Borrowing of Information in Hierarchical Bayesian Models
- Bayesian Semi-supervised Inference via a Debiased Modeling Approach
- Synth-MIA: A Testbed for Auditing Privacy Leakage in Tabular Data Synthesis
- Core-elements Subsampling for Alternating Least Squares
- Deep Learning as the Disciplined Construction of Tame Objects
- Validation-Free Sparse Learning: A Phase Transition Approach to Feature Selection
- Tight PAC-Bayesian Risk Certificates for Contrastive Learning
- Learning Massive-scale Partial Correlation Networks in Clinical Multi-omics Studies with HP-ACCORD
- An AI-powered Bayesian generative modeling approach for causal inference in observational studies
- Conformal Prediction with Upper and Lower Bound Models
- AICO: Feature Significance Tests for Supervised Learning
- Robust Mixture Models for Algorithmic Fairness Under Latent Heterogeneity
- Bilateral Distribution Compression: Reducing Both Data Size and Dimensionality
- Whitening Spherical Gaussian Mixtures in the Large-Dimensional Regime
- Robust, Online, and Adaptive Decentralized Gaussian Processes
- Fréchet Geodesic Boosting
- Kernel K-means clustering of distributional data
- Functional effects models: Accounting for preference heterogeneity in panel data with machine learning
- Learning Centre Partitions from Summaries
- Overfitting in Adaptive Robust Optimization
- Hessian-guided Perturbed Wasserstein Gradient Flows for Escaping Saddle Points
- Data-efficient Kernel Methods for Learning Hamiltonian Systems
- Chat-CBM: Towards Interactive Concept Bottleneck Models with Frozen Large Language Models
- SimToken: A Simple Baseline for Referring Audio-Visual Segmentation
- An Empirical Study on the Robustness of YOLO Models for Underwater Object Detection
- Low-Rank Adaptation of Evolutionary Deep Neural Networks for Efficient Learning of Time-Dependent PDEs
- Conditional Multidimensional Scaling with Incomplete Conditioning Data
- System-Level Uncertainty Quantification with Multiple Machine Learning Models: A Theoretical Framework
- DoubleGen: Debiased Generative Modeling of Counterfactuals
- Risk Comparisons in Linear Regression: Implicit Regularization Dominates Explicit Regularization
- Bias-variance Tradeoff in Tensor Estimation
- CARINOX: Inference-time Scaling with Category-Aware Reward-based Initial Noise Optimization and Exploration
- CSDformer: A Conversion Method for Fully Spike-Driven Transformer
- MAESTRO: Task-Relevant Optimization via Adaptive Feature Enhancement and Suppression for Multi-task 3D Perception
- Stable Video-Driven Portraits
- ChartHal: A Fine-grained Framework Evaluating Hallucination of Large Vision Language Models in Chart Understanding
- Multimodal Medical Image Classification via Synergistic Learning Pre-training
- Vision-Based Driver Drowsiness Monitoring: Comparative Analysis of YOLOv5-v11 Models
- SAMSON: 3rd Place Solution of LSVOS 2025 VOS Challenge
- 4D-MoDe: Towards Editable and Scalable Volumetric Streaming via Motion-Decoupled 4D Gaussian Compression
- 4DGCPro: Efficient Hierarchical 4D Gaussian Compression for Progressive Volumetric Video Streaming
- Unified Multimodal Coherent Field: Synchronous Semantic-Spatial-Vision Fusion for Brain Tumor Segmentation
- Pre-Trained CNN Architecture for Transformer-Based Image Caption Generation Model
- Revisiting Vision Language Foundations for No-Reference Image Quality Assessment
- Diff-GNSS: Diffusion-based Pseudorange Error Estimation
- Interpreting vision transformers via residual replacement model
- Real-Time Fish Detection in Indonesian Marine Ecosystems Using Lightweight YOLOv10-nano Architecture
- Single-Image Depth from Defocus with Coded Aperture and Diffusion Posterior Sampling
- Multi-scale Temporal Prediction via Incremental Generation and Multi-agent Collaboration
- EmbodiedSplat: Personalized Real-to-Sim-to-Real Navigation with Gaussian Splats from a Mobile Device
- Emergent 3D Correspondence from Neural Shape Representation
- Training-Free Label Space Alignment for Universal Domain Adaptation
- Explainable AI for Analyzing Person-Specific Patterns in Facial Recognition Tasks
- Echo-Path: Pathology-Conditioned Echo Video Generation
- Guided and Unguided Conditional Diffusion Mechanisms for Structured and Semantically-Aware 3D Point Cloud Generation
- Point-RTD: Replaced Token Denoising for Pretraining Transformer Models on Point Clouds
- MirrorSAM2: Segment Mirror in Videos with Depth Perception
- DT-NeRF: A Diffusion and Transformer-Based Optimization Approach for Neural Radiance Fields in 3D Reconstruction
- SPFSplatV2: Efficient Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views
- Optimized Learned Image Compression for Facial Expression Recognition
- Task-Oriented Communications for 3D Scene Representation: Balancing Timeliness and Fidelity
- Automated Facility Enumeration for Building Compliance Checking using Door Detection and Large Language Models
- DepTR-MOT: Unveiling the Potential of Depth-Informed Trajectory Refinement for Multi-Object Tracking
- UIPro: Unleashing Superior Interaction Capability For GUI Agents
- SmokeSeer: 3D Gaussian Splatting for Smoke Removal and Scene Reconstruction
- MoCLIP-Lite: Efficient Video Recognition by Fusing CLIP with Motion Vectors
- SFN-YOLO: Towards Free-Range Poultry Detection via Scale-aware Fusion Networks
- AlignedGen: Aligning Style Across Generated Images
- Uncertainty-Supervised Interpretable and Robust Evidential Segmentation
- The SAGES Critical View of Safety Challenge: A Global Benchmark for AI-Assisted Surgical Quality Assessment
- CoBEVMoE: Heterogeneity-aware Feature Fusion with Dynamic Mixture-of-Experts for Collaborative Perception
- Stencil: Subject-Driven Generation with Context Guidance
- SAEC: Scene-Aware Enhanced Edge-Cloud Collaborative Industrial Vision Inspection with Multimodal LLM
- SynergyNet: Fusing Generative Priors and State-Space Models for Facial Beauty Prediction
- Ambiguous Medical Image Segmentation Using Diffusion Schrödinger Bridge
- Efficient 3D Scene Reconstruction and Simulation from Sparse Endoscopic Views
- From Easy to Hard: The MIR Benchmark for Progressive Interleaved Multi-Image Reasoning
- Towards Generalized Synapse Detection Across Invertebrate Species
- AgriDoctor: A Multimodal Intelligent Assistant for Agriculture
- Learning Attribute-Aware Hash Codes for Fine-Grained Image Retrieval via Query Optimization
- Geodesic Prototype Matching via Diffusion Maps for Interpretable Fine-Grained Recognition
- CardiacCLIP: Video-based CLIP Adaptation for LVEF Prediction in a Few-shot Manner
- Informative Text-Image Alignment for Visual Affordance Learning with Foundation Models
- Enhanced Detection of Tiny Objects in Aerial Images
- A Dual-Modulation Framework for RGB-T Crowd Counting via Spatially Modulated Attention and Adaptive Fusion
- HyRF: Hybrid Radiance Fields for Memory-efficient and High-quality Novel View Synthesis
- VidCLearn: A Continual Learning Approach for Text-to-Video Generation
- MO R-CNN: Multispectral Oriented R-CNN for Object Detection in Remote Sensing Image
- Penalizing Boundary Activation for Object Completeness in Diffusion Models
- LLM-Assisted Semantic Guidance for Sparsely Annotated Remote Sensing Object Detection
- The 1st Solution for 7th LSVOS RVOS Track: SaSaSa2VA
- Optimal Transport for Handwritten Text Recognition in a Low-Resource Regime
- VCE: Safe Autoregressive Image Generation via Visual Contrast Exploitation
- A Cross-Hierarchical Multi-Feature Fusion Network Based on Multiscale Encoder-Decoder for Hyperspectral Change Detection
- DocIQ: A Benchmark Dataset and Feature Fusion Network for Document Image Quality Assessment
- When Color-Space Decoupling Meets Diffusion for Adverse-Weather Image Restoration
- M$^3$VIR: A Large-Scale Multi-Modality Multi-View Synthesized Benchmark Dataset for Image Restoration and Content Creation
- SAM-DCE: Addressing Token Uniformity and Semantic Over-Smoothing in Medical Segmentation
- Rethinking Evaluation of Infrared Small Target Detection
- Learning from Gene Names, Expression Values and Images: Contrastive Masked Text-Image Pretraining for Spatial Transcriptomics Representation Learning
- PRISM: Precision-Recall Informed Data-Free Knowledge Distillation via Generative Diffusion
- ME-Mamba: Multi-Expert Mamba with Efficient Knowledge Capture and Fusion for Multimodal Survival Analysis
- SLAM-Former: Putting SLAM into One Transformer
- Parameter-efficient fine-tuning (PEFT) of Vision Foundation Models for Atypical Mitotic Figure Classification
- Prototype-Based Pseudo-Label Denoising for Source-Free Domain Adaptation in Remote Sensing Semantic Segmentation
- Catching the Details: Self-Distilled RoI Predictors for Fine-Grained MLLM Perception
- Leveraging RGB Images for Pre-Training of Event-Based Hand Pose Estimation
- Min: Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning
- CAMBench-QR: A Structure-Aware Benchmark for Post-Hoc Explanations with QR Understanding
- HyPlaneHead: Rethinking Tri-plane-like Representations in Full-Head Image Synthesis
- DiffEye: Diffusion-Based Continuous Eye-Tracking Data Generation Conditioned on Natural Images
- MMPart: Harnessing Multi-Modal Large Language Models for Part-Aware 3D Generation
- Artificial Satellite Trails Detection Using U-Net Deep Neural Network and Line Segment Detector Algorithm
- Benchmarking and Mitigating MCQA Selection Bias of Large Vision-Language Models
- MedGS: Gaussian Splatting for Multi-Modal 3D Medical Imaging
- Looking in the mirror: A faithful counterfactual explanation method for interpreting deep image classification models
- L2M-Reg: Building-level Uncertainty-aware Registration of Outdoor LiDAR Point Clouds and Semantic 3D City Models
- ISCS: Parameter-Guided Channel Ordering and Grouping for Learned Image Compression
- ConfidentSplat: Confidence-Weighted Depth Fusion for Accurate 3D Gaussian Splatting SLAM
- FitPro: A Zero-Shot Framework for Interactive Text-based Pedestrian Retrieval in Open World
- Segment-to-Act: Label-Noise-Robust Action-Prompted Video Segmentation Towards Embodied Intelligence
- IPF-RDA: An Information-Preserving Framework for Robust Data Augmentation
- Active View Selection for Scene-level Multi-view Crowd Counting and Localization with Limited Labels
- Towards a Transparent and Interpretable AI Model for Medical Image Classifications
- Spectral Compressive Imaging via Chromaticity-Intensity Decomposition
- InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention
- Animalbooth: multimodal feature enhancement for animal subject personalization
- When Confidence Fails: Revisiting Pseudo-Label Selection in Semi-supervised Semantic Segmentation
- Text-Scene: A Scene-to-Language Parsing Framework for 3D Scene Understanding
- Pain in 3D: Generating Controllable Synthetic Faces for Automated Pain Assessment
- A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis
- Describe-to-Score: Text-Guided Efficient Image Complexity Assessment
- CGTGait: Collaborative Graph and Transformer for Gait Emotion Recognition
- Enhancing Scientific Visual Question Answering via Vision-Caption aware Supervised Fine-Tuning
- Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation
- DA-Font: Few-Shot Font Generation via Dual-Attention Hybrid Integration
- Towards Anytime Retrieval: A Benchmark for Anytime Person Re-Identification
- Unlocking Hidden Potential in Point Cloud Networks with Attention-Guided Grouping-Feature Coordination
- ADVEDM: Fine-grained Adversarial Attack against VLM-based Embodied Agents
- Are VLMs Ready for Lane Topology Awareness in Autonomous Driving?
- MedCutMix: A Data-Centric Approach to Improve Radiology Vision-Language Pre-training with Disease Awareness
- Eye Gaze Tells You Where to Compute: Gaze-Driven Efficient VLMs
- Octree Latent Diffusion for Semantic 3D Scene Generation and Completion
- RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation
- CommonForms: A Large, Diverse Dataset for Form Field Detection
- OS-DiffVSR: Towards One-step Latent Diffusion Model for High-detailed Real-world Video Super-Resolution
- SlowFast-SCI: Slow-Fast Deep Unfolding Learning for Spectral Compressive Imaging
- FG-Attn: Leveraging Fine-Grained Sparsity In Diffusion Transformers
- PM25Vision: A Large-Scale Benchmark Dataset for Visual Estimation of Air Quality
- Efficient Rectified Flow for Image Fusion
- ST-GS: Vision-Based 3D Semantic Occupancy Prediction with Spatial-Temporal Gaussian Splatting
- Person Identification from Egocentric Human-Object Interactions using 3D Hand Pose
- Captioning for Text-Video Retrieval via Dual-Group Direct Preference Optimization
- OpenGVL - Benchmarking Visual Temporal Progress for Data Curation
- Mano Report
- Introducing Resizable Region Packing Problem in Image Generation, with a Heuristic Solution
- Accurate Thyroid Cancer Classification using a Novel Binary Pattern Driven Local Discrete Cosine Transform Descriptor
- StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes
- 3D Gaussian Flats: Hybrid 2D/3D Photometric Scene Reconstruction
- TractoTransformer: Diffusion MRI Streamline Tractography using CNN and Transformer Networks
- Improved mmFormer for Liver Fibrosis Staging via Missing-Modality Compensation
- Explainable Gait Abnormality Detection Using Dual-Dataset CNN-LSTM Models
- Cross-Corpus and Cross-domain Handwriting Assessment of NeuroDegenerative Diseases via Time-Series-to-Image Conversion
- VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery
- AutoArabic: A Three-Stage Framework for Localizing Video-Text Retrieval Benchmarks
- Evaluating the Effectiveness and Scalability of LLM-Based Data Augmentation for Retrieval
- Purely Semantic Indexing for LLM-based Generative Recommendation and Retrieval
- Advancing Reference-free Evaluation of Video Captions with Factual Analysis
- Long document summarization using page specific target text alignment and distilling page importance
- The Role of Vocabularies in Learning Sparse Representations for Ranking
- Idiosyncratic Versus Normative Modeling of Atypical Speech Recognition: Dysarthric Case Studies
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
- Localizing Malicious Outputs from CodeLLM
- SVeritas: Benchmark for Robust Speaker Verification under Diverse Conditions
- Bringing Pedagogy into Focus: Evaluating Virtual Teaching Assistants' Question-Answering in Asynchronous Learning Environments
- ReDepress: A Cognitive Framework for Detecting Depression Relapse from Social Media
- Variation in Verification: Understanding Verification Dynamics in Large Language Models
- WenetSpeech-Chuan: A Large-Scale Sichuanese Corpus with Rich Annotation for Dialectal Speech Processing
- Cross-Attention is Half Explanation in Speech-to-Text Models
- RadEval: A framework for radiology text evaluation
- The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies
- TMD-TTS: A Unified Tibetan Multi-Dialect Text-to-Speech Synthesis for Ü-Tsang, Amdo and Kham Speech Dataset Generation
- ARK-V1: An LLM-Agent for Knowledge Graph Question Answering Requiring Commonsense Reasoning
- SEQR: Secure and Efficient QR-based LoRA Routing
- Longitudinal and Multimodal Recording System to Capture Real-World Patient-Clinician Conversations for AI and Encounter Research: Protocol
- Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
- Make Every Letter Count: Building Dialect Variation Dictionaries from Monolingual Corpora
- CorPipe at CRAC 2025: Evaluating Multilingual Encoders for Multilingual Coreference Resolution
- Unsupervised Learning and Representation of Mandarin Tonal Categories by a Generative CNN
- How Persuasive is Your Context?
- SiDiaC: Sinhala Diachronic Corpus
- Improving Zero-shot Sentence Decontextualisation with Content Selection and Planning
- Transformer-Encoder Trees for Efficient Multilingual Machine Translation and Speech Translation
- Training-free Truthfulness Detection via Value Vectors in LLMs
- D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models
- HICode: Hierarchical Inductive Coding with LLMs
- Dorabella Cipher as Musical Inspiration
- Breaking Token Into Concepts: Exploring Extreme Compression in Token Representation Via Compositional Shared Semantics
- Qwen3-Omni Technical Report
- A State-Update Prompting Strategy for Efficient and Robust Multi-turn Dialogue
- DIVERS-Bench: Evaluating Language Identification Across Domain Shifts and Code-Switching
- One Agent to Serve All: a Lite-Adaptive Stylized AI Assistant for Millions of Multi-Style Official Accounts
- Learning to vary: Teaching LMs to reproduce human linguistic variability in next-word prediction
- Findings of the Fourth Shared Task on Multilingual Coreference Resolution: Can LLMs Dethrone Traditional Approaches?
- Everyday Physics in Korean Contexts: A Culturally Grounded Physical Reasoning Benchmark
- Towards Adaptive Context Management for Intelligent Conversational Question Answering
- Fine-Grained Detection of AI-Generated Text Using Sentence-Level Segmentation
- Trust Me, I Can Convince You: The Contextualized Argument Appraisal Framework
- Specification-Aware Machine Translation and Evaluation for Purpose Alignment
- Asking a Language Model for Diverse Responses
- MSCoRe: A Benchmark for Multi-Stage Collaborative Reasoning in LLM Agents
- AuditoryBench++: Can Language Models Understand Auditory Knowledge without Hearing?
- Crosslingual Optimized Metric for Translation Assessment of Indian Languages
- PG-CE: A Progressive Generation Dataset with Constraint Enhancement for Controllable Text Generation
- Turk-LettuceDetect: A Hallucination Detection Models for Turkish RAG Applications
- When TableQA Meets Noise: A Dual Denoising Framework for Complex Questions and Large-scale Tables
- TASO: Task-Aligned Sparse Optimization for Parameter-Efficient Model Adaptation
- Evaluating LLM-Generated Versus Human-Authored Responses in Role-Play Dialogues
- Investigating Bias: A Multilingual Pipeline for Generating, Solving, and Evaluating Math Problems with LLMs
- Filling in the Clinical Gaps in Benchmark: Case for HealthBench for the Japanese medical system
- Semantic Reformulation Entropy for Robust Hallucination Detection in QA Tasks
- SLAyiNG: Towards Queer Language Processing
- Codifying Natural Language Tasks
- PRINCIPLES: Synthetic Strategy Memory for Proactive Dialogue Agents
- Diagnosing Model Editing via Knowledge Spectrum
- AttnComp: Attention-Guided Adaptive Context Compression for Retrieval-Augmented Generation
- MapCoder-Lite: Squeezing Multi-Agent Coding into a Single Small LLM
- Enhancing Cross-Lingual Transfer through Reversible Transliteration: A Huffman-Based Approach for Low-Resource Languages
- CorefInst: Leveraging LLMs for Multilingual Coreference Resolution
- Leveraging Audio-Visual Data to Reduce the Multilingual Gap in Self-Supervised Speech Models
- Can LLMs Reason Over Non-Text Modalities in a Training-Free Manner? A Case Study with In-Context Representation Learning
- Better Late Than Never: Evaluation of Latency Metrics for Simultaneous Speech-to-Text Translation
- Scale-free Characteristics of Multilingual Legal Texts and the Limitations of LLMs
- Robustness of Neurosymbolic Reasoners on First-Order Logic Problems
- FinDebate: Multi-Agent Collaborative Intelligence for Financial Analysis
- EpiCache: Episodic KV Cache Management for Long Conversational Question Answering
- DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context
- Vision Language Models Are Not (Yet) Spelling Correctors
- RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios
- QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models
- MedFact: A Large-scale Chinese Dataset for Evidence-based Medical Fact-checking of LLM Responses
- GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning
- SFT-TA: Supervised Fine-Tuned Agents in Multi-Agent LLMs for Automated Inductive Thematic Analysis
- FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions
- Attention Consistency for LLMs Explanation
- LifeAlign: Lifelong Alignment for Large Language Models with Memory-Augmented Focalized Preference Optimization
- Evolution of Concepts in Language Model Pre-Training
- Prompt-Based Simplification for Plain Language using Spanish Language Models
- Extending Automatic Machine Translation Evaluation to Book-Length Documents
- Probabilistic Token Alignment for Large Language Model Fusion
- Automated Knowledge Graph Construction using Large Language Models and Sentence Complexity Modelling
- Multi-View Attention Multiple-Instance Learning Enhanced by LLM Reasoning for Cognitive Distortion Detection
- Scaling, Simplification, and Adaptation: Lessons from Pretraining on Machine-Translated Text
- AIMMerging: Adaptive Iterative Model Merging Using Training Trajectories for Language Model Continual Learning
- Can GRPO Boost Complex Multimodal Table Understanding?
- CLaC at DISRPT 2025: Hierarchical Adapters for Cross-Framework Multi-lingual Discourse Relation Classification
- CUTE: A Multilingual Dataset for Enhancing Cross-Lingual Knowledge Transfer in Low-Resource Languages
- K-DeCore: Facilitating Knowledge Transfer in Continual Structured Knowledge Reasoning via Knowledge Decoupling
- AirQA: A Comprehensive QA Dataset for AI Research with Instance-Level Evaluation
- Preference Distillation via Value based Reinforcement Learning
- Advancing Speech Understanding in Speech-Aware Language Models with GRPO
- The Transfer Neurons Hypothesis: An Underlying Mechanism for Language Latent Space Transitions in Multilingual LLMs
- Modeling Bottom-up Information Quality during Language Processing
- TactfulToM: Do LLMs Have the Theory of Mind Ability to Understand White Lies?
- OPEN-THEATRE: An Open-Source Toolkit for LLM-based Interactive Drama
- Semi-Supervised Synthetic Data Generation with Fine-Grained Relevance Control for Short Video Search Relevance Modeling
- Time to Revisit Exact Match
- A Multi-Level Benchmark for Causal Language Understanding in Social Media Discourse
- Angular Dispersion Accelerates $k$-Nearest Neighbors Machine Translation
- The Sound of Syntax: Finetuning and Comprehensive Evaluation of Language Models for Speech Pathology
- MoRoVoc: A Large Dataset for Geographical Variation Identification of the Spoken Romanian Language
- Domain-Adaptive Pre-Training for Arabic Aspect-Based Sentiment Analysis: A Comparative Study of Domain Adaptation and Fine-Tuning Strategies
- KuBERT: Central Kurdish BERT Model and Its Application for Sentiment Analysis
- Cognitive Linguistic Identity Fusion Score (CLIFS): A Scalable Cognition-Informed Approach to Quantifying Identity Fusion from Text
- Semantic-Driven Topic Modeling for Analyzing Creativity in Virtual Brainstorming
- Multi-task Pretraining for Enhancing Interpretable L2 Pronunciation Assessment
- ChemOrch: Empowering LLMs with Chemical Intelligence via Synthetic Instructions
- MPCG: Multi-Round Persona-Conditioned Generation for Modeling the Evolution of Misinformation with LLMs
- From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature
- MCP: A Control-Theoretic Orchestration Framework for Synergistic Efficiency and Interpretability in Multimodal Large Language Models
- Computational-Assisted Systematic Review and Meta-Analysis (CASMA): Effect of a Subclass of GnRH-a on Endometriosis Recurrence
- Redefining Experts: Interpretable Decomposition of Language Models for Toxicity Mitigation
- Robust Native Language Identification through Agentic Decomposition
- Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle
- EG-MLA: Embedding-Gated Multi-head Latent Attention for Scalable and Efficient LLMs
- Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language Models
- HARE: an entity and relation centric evaluation framework for histopathology reports
- RephQA: Evaluating Readability of Large Language Models in Public Health Question Answering
- Whisper-UT: A Unified Translation Framework for Speech and Text
- 'Rich Dad, Poor Lad': How do Large Language Models Contextualize Socioeconomic Factors in College Admission?
- Evaluating CxG Generalisation in LLMs via Construction-Based NLI Fine Tuning
- Intrinsic Meets Extrinsic Fairness: Assessing the Downstream Impact of Bias Mitigation in Large Language Models
- Computational Analysis of Conversation Dynamics through Participant Responsivity
- Leveraging Multilingual Training for Authorship Representation: Enhancing Generalization across Languages and Domains
- Challenging the Evaluator: LLM Sycophancy Under User Rebuttal
- Mental Multi-class Classification on Social Media: Benchmarking Transformer Architectures against LSTM Models
- Evaluation of Ensemble Learning Techniques for handwritten OCR Improvement
- Predicting First Year Dropout from Pre Enrolment Motivation Statements Using Text Mining
- Machine Learning for Quantum Noise Reduction
- How Can Quantum Deep Learning Improve Large Language Models?
- Motional representation; the ability to predict odor characters using molecular vibrations
- GraphMend: Code Transformations for Fixing Graph Breaks in PyTorch 2
- Vibrational Fingerprints of Strained Polymers: A Spectroscopic Pathway to Mechanical State Prediction
- Language Modeling with Learned Meta-Tokens
- Test-Time Learning and Inference-Time Deliberation for Efficiency-First Offline Reinforcement Learning in Care Coordination and Population Health Management
- TF-DWGNet: A Directed Weighted Graph Neural Network with Tensor Fusion for Multi-Omics Cancer Subtype Classification
- Neural Atlas Graphs for Dynamic Scene Decomposition and Editing
- Similarity-Guided Diffusion for Long-Gap Music Inpainting
- Equilibrium flow: From Snapshots to Dynamics
- Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs
- Unveiling m-Sharpness Through the Structure of Stochastic Gradient Noise
- Control Disturbance Rejection in Neural ODEs
- Reinforced Generation of Combinatorial Structures: Applications to Complexity Theory
- Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLM
- Learning to Rank with Top-$K$ Fairness
- Learning functions, operators and dynamical systems with kernels
- Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding
- Deep Reinforcement Learning in Factor Investment
- On the Detection of Internal Defects in Structured Media
- Understanding Post-Training Structural Changes in Large Language Models
- Improving After-sales Service: Deep Reinforcement Learning for Dynamic Time Slot Assignment with Commitments and Customer Preferences
- Deep Hierarchical Learning with Nested Subspace Networks
- Confidence-gated training for efficient early-exit neural networks
- GaussianPSL: A novel framework based on Gaussian Splatting for exploring the Pareto frontier in multi-criteria optimization
- Optimizing Inference in Transformer-Based Models: A Multi-Method Benchmark
- SingLEM: Single-Channel Large EEG Model
- Medical priority fusion: achieving dual optimization of sensitivity and interpretability in nipt anomaly detection
- StefaLand: An Efficient Geoscience Foundation Model That Improves Dynamic Land-Surface Predictions
- Joint Optimization of Memory Frequency, Computing Frequency, Transmission Power and Task Offloading for Energy-efficient DNN Inference
- Intra-Cluster Mixup: An Effective Data Augmentation Technique for Complementary-Label Learning
- Budgeted Adversarial Attack against Graph-Based Anomaly Detection in Sensor Networks
- An AutoML Framework using AutoGluonTS for Forecasting Seasonal Extreme Temperatures
- Flatness is Necessary, Neural Collapse is Not: Rethinking Generalization via Grokking
- GEM-T: Generative Tabular Data via Fitting Moments
- Learning Neural Antiderivatives
- Revealing Multimodal Causality with Large Language Models
- Elucidating the Design Space of FP4 training
- Remote Sensing-Oriented World Model
- MTM: A Multi-Scale Token Mixing Transformer for Irregular Multivariate Time Series Classification
- MSGAT-GRU: A Multi-Scale Graph Attention and Recurrent Model for Spatiotemporal Road Accident Prediction
- Global Optimization via Softmin Energy Minimization
- Conv-like Scale-Fusion Time Series Transformer: A Multi-Scale Representation for Variable-Length Long Time Series
- BiLCNet: BiLSTM-Conformer Network for Encrypted Traffic Classification with 5G SA Physical Channel Records
- Achilles' Heel of Mamba: Essential difficulties of the Mamba architecture demonstrated by synthetic data
- An Unlearning Framework for Continual Learning
- SeqBattNet: A Discrete-State Physics-Informed Neural Network with Aging Adaptation for Battery Modeling
- Comparing Data Assimilation and Likelihood-Based Inference on Latent State Estimation in Agent-Based Models
- Mechanistic Interpretability with SAEs: Probing Religion, Violence, and Geography in Large Language Models
- Fast, Accurate and Interpretable Graph Classification with Topological Kernels
- Cluster Workload Allocation: A Predictive Approach Leveraging Machine Learning Efficiency
- A non-smooth regularization framework for learning over multitask graphs
- A Generative Conditional Distribution Equality Testing Framework and Its Minimax Analysis
- ConfClip: Confidence-Weighted and Clipped Reward for Reinforcement Learning in LLMs
- Training the next generation of physicians for artificial intelligence-assisted clinical neuroradiology: ASNR MICCAI Brain Tumor Segmentation (BraTS) 2025 Lighthouse Challenge education platform
- GraphWeave: Interpretable and Robust Graph Generation via Random Walk Trajectories
- Physics-Informed Operator Learning for Hemodynamic Modeling
- SPRINT: Stochastic Performative Prediction With Variance Reduction
- VQEzy: An Open-Source Dataset for Parameter Initialize in Variational Quantum Eigensolvers
- Generalizable End-to-End Tool-Use RL with Synthetic CodeGym
- Robust Anomaly Detection Under Normality Distribution Shift in Dynamic Graphs
- Efficient Sliced Wasserstein Distance Computation via Adaptive Bayesian Optimization
- Distributionally Robust Safety Verification of Neural Networks via Worst-Case CVaR
- MVCL-DAF++: Enhancing Multimodal Intent Recognition via Prototype-Aware Contrastive Alignment and Coarse-to-Fine Dynamic Attention Fusion
- Periodic Graph-Enhanced Multivariate Time Series Anomaly Detector
- Path-Weighted Integrated Gradients for Interpretable Dementia Classification
- A Comprehensive Performance Comparison of Traditional and Ensemble Machine Learning Models for Online Fraud Detection
- Regularizing Extrapolation in Causal Inference
- PMRT: A Training Recipe for Fast, 3D High-Resolution Aerodynamic Prediction
- Dendritic Resonate-and-Fire Neuron for Effective and Efficient Long Sequence Modeling
- SignalLLM: A General-Purpose LLM Agent Framework for Automated Signal Processing
- Conditional Policy Generator for Dynamic Constraint Satisfaction and Optimization
- Active Learning for Machine Learning Driven Molecular Dynamics
- Causal Representation Learning from Multimodal Clinical Records under Non-Random Modality Missingness
- Prospective Multi-Graph Cohesion for Multivariate Time Series Anomaly Detection
- TraceHiding: Scalable Machine Unlearning for Mobility Data
- Graph Signal Generative Diffusion Models
- Enhancing Performance and Calibration in Quantile Hyperparameter Optimization
- TSGym: Design Choices for Deep Multivariate Time-Series Forecasting
- On the Limits of Tabular Hardness Metrics for Deep RL: A Study with the Pharos Benchmark
- Ultra-short-term solar power forecasting by deep learning and data reconstruction
- GRPOformer: Advancing Hyperparameter Optimization via Group Relative Policy Optimization
- ScenGAN: Attention-Intensive Generative Model for Uncertainty-Aware Renewable Scenario Forecasting
- On the Simplification of Neural Network Architectures for Predictive Process Monitoring
- Flow-Induced Diagonal Gaussian Processes
- Unrolled Graph Neural Networks for Constrained Optimization
- Time Series Forecasting Using a Hybrid Deep Learning Method: A Bi-LSTM Embedding Denoising Auto Encoder Transformer
- Detecting Urban PM$_{2.5}$ Hotspots with Mobile Sensing and Gaussian Process Regression
- Dynamic Expert Specialization: Towards Catastrophic Forgetting-Free Multi-Domain MoE Adaptation
- DRES: Fake news detection by dynamic representation and ensemble selection
- The Complexity of Finding Local Optima in Contrastive Learning
- FedEL: Federated Elastic Learning for Heterogeneous Devices
- Auditability and the Landscape of Distance to Multicalibration
- Adaptive Graph Convolution and Semantic-Guided Attention for Multimodal Risk Detection in Social Networks
- Gradient Interference-Aware Graph Coloring for Multitask Learning
- PTQTP: Post-Training Quantization to Trit-Planes for Large Language Models
- Persistence Spheres: Bi-continuous Representations of Persistence Diagrams
- Adaptive Overclocking: Dynamic Control of Thinking Path Length via Real-Time Reasoning Signals
- Long-Tailed Out-of-Distribution Detection with Refined Separate Class Learning
- $\boldsymbol{\lambda}$-Orthogonality Regularization for Compatible Representation Learning
- HypeMARL: Multi-Agent Reinforcement Learning For High-Dimensional, Parametric, and Distributed Systems
- A Hybrid PCA-PR-Seq2Seq-Adam-LSTM Framework for Time-Series Power Outage Prediction
- Interpretable Clinical Classification with Kolmogorov-Arnold Networks
- Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees
- Geometric Mixture Classifier (GMC): A Discriminative Per-Class Mixture of Hyperplanes
- DISCO: Disentangled Communication Steering for Large Language Models
- KANO: Kolmogorov-Arnold Neural Operator
- SOLAR: Switchable Output Layer for Accuracy and Robustness in Once-for-All Training
- LVADNet3D: A Deep Autoencoder for Reconstructing 3D Intraventricular Flow from Sparse Hemodynamic Data
- Towards Interpretable and Efficient Attention: Compressing All by Contracting a Few
- A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective
- GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models
- Federated Learning with Ad-hoc Adapter Insertions: The Case of Soft-Embeddings for Training Classifier-as-Retriever
- LLM-Guided Co-Training for Text Classification
- mmExpert: Integrating Large Language Models for Comprehensive mmWave Data Synthesis and Understanding
- SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning
- ViTCAE: ViT-based Class-conditioned Autoencoder
- Learned Digital Codes for Over-the-Air Federated Learning
- Near-Optimal Sample Complexity Bounds for Constrained Average-Reward MDPs
- Self-Supervised Learning of Graph Representations for Network Intrusion Detection
- Causality-Induced Positional Encoding for Transformer-Based Representation Learning of Non-Sequential Features
- FairTune: A Bias-Aware Fine-Tuning Framework Towards Fair Heart Rate Prediction from PPG
- Comparison of Deterministic and Probabilistic Machine Learning Algorithms for Precise Dimensional Control and Uncertainty Quantification in Additive Manufacturing
- Architectural change in neural networks using fuzzy vertex pooling
- ROOT: Rethinking Offline Optimization as Distributional Translation via Probabilistic Bridge
- Auto-bidding under Return-on-Spend Constraints with Uncertainty Quantification
- Improving Deep Tabular Learning
- Guided Sequence-Structure Generative Modeling for Iterative Antibody Optimization
- EMPEROR: Efficient Moment-Preserving Representation of Distributions
- Federated Learning for Financial Forecasting
- Local Mechanisms of Compositional Generalization in Conditional Diffusion
- Towards Universal Debiasing for Language Models-based Tabular Data Generation
- Revisiting Broken Windows Theory
- Detection and Simulation of Urban Heat Islands Using a Fine-Tuned Geospatial Foundation Model
- Surgical-MambaLLM: Mamba2-enhanced Multimodal Large Language Model for VQLA in Robotic Surgery
- Audio-Conditioned Diffusion LLMs for ASR and Deliberation Processing
- When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs
- KungfuBot2: Learning Versatile Motion Skills for Humanoid Whole-Body Control
- AISTAT lab system for DCASE2025 Task6: Language-based audio retrieval
- On the de-duplication of the Lakh MIDI dataset
- Governed By Agents: A Survey On The Role Of Agentic AI In Future Computing Environments
- ProtoVQA: An Adaptable Prototypical Framework for Explainable Fine-Grained Visual Question Answering
- Design and Development of an Intelligent LLM-based LDAP Honeypot
- InteGround: On the Evaluation of Verification and Retrieval Planning in Integrative Grounding
- Train to Defend: First Defense Against Cryptanalytic Neural Network Parameter Extraction Attacks
- TranTac: Leveraging Transient Tactile Signals for Contact-Rich Robotic Manipulation
- Rethinking the Role of Text Complexity in Language Model Pretraining
- V-CECE: Visual Counterfactual Explanations via Conceptual Edits
- From Scores to Steps: Diagnosing and Improving LLM Performance in Evidence-Based Medical Calculations
- SQS: Enhancing Sparse Perception Models via Query-based Splatting in Autonomous Driving
- Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data
- Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels
- PruneCD: Contrasting Pruned Self Model to Improve Decoding Factuality
- FakeChain: Exposing Shallow Cues in Multi-Step Deepfake Detection
- Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations
- Entropic Causal Inference: Graph Identifiability
- Thermal Imaging-based Real-time Fall Detection using Motion Flow and Attention-enhanced Convolutional Recurrent Architecture
- The Oracle Has Spoken: A Multi-Aspect Evaluation of Dialogue in Pythia
- Can an Individual Manipulate the Collective Decisions of Multi-Agents?
- Synergies between Federated Foundation Models and Smart Power Grids
- Seeing Culture: A Benchmark for Visual Reasoning and Grounding
- Causal Fuzzing for Verifying Machine Unlearning
- Lattice Boltzmann Model for Learning Real-World Pixel Dynamicity
- AIPsychoBench: Understanding the Psychometric Differences between LLMs and Humans
- No Need for Real 3D: Fusing 2D Vision with Pseudo 3D Representations for Robotic Manipulation Learning
- Enhancing Financial RAG with Agentic AI and Multi-HyDE: A Novel Approach to Knowledge Retrieval and Hallucination Reduction
- CoUn: Empowering Machine Unlearning via Contrastive Learning
- Evaluating Behavioral Alignment in Conflict Dialogue: A Multi-Dimensional Comparison of LLM Agents and Humans
- GRID: Graph-based Reasoning for Intervention and Discovery in Built Environments
- Pico: A Modular Framework for Hypothesis-Driven Small Language Model Research
- LenslessMic: Audio Encryption and Authentication via Lensless Computational Imaging
- AHA -- Predicting What Matters Next: Online Highlight Detection Without Looking Ahead
- SENSE-7: Taxonomy and Dataset for Measuring User Perceptions of Empathy in Sustained Human-AI Conversations
- LightCode: Compiling LLM Inference for Photonic-Electronic Systems
- PersonaMatrix: A Recipe for Persona-Aware Evaluation of Legal Summarization
- KRAST: Knowledge-Augmented Robotic Action Recognition with Structured Text for Vision-Language Models
- A Generative AI System for Biomedical Data Discovery with Grammar-Based Visualizations
- Energy Equity, Infrastructure and Demographic Analysis with XAI Methods
- Robust LLM Training Infrastructure at ByteDance
- Patterns in the Transition From Founder-Leadership to Community Governance of Open Source
- How Large Language Models are Designed to Hallucinate
- Overhearing LLM Agents: A Survey, Taxonomy, and Roadmap
- Highly Imbalanced Regression with Tabular Data in SEP and Other Applications
- Agentic Reasoning for Robust Vision Systems via Increased Test-Time Compute
- Estimating Clinical Lab Test Result Trajectories from PPG using Physiological Foundation Model and Patient-Aware State Space Model -- a UNIPHY+ Approach
- From Canopy to Ground via ForestGen3D: Learning Cross-Domain Generation of 3D Forest Structure from Aerial-to-Terrestrial LiDAR
- QUINTA: Reflexive Sensibility For Responsible AI Research and Data-Driven Processes
- Secure Confidential Business Information When Sharing Machine Learning Models
- A study on Deep Convolutional Neural Networks, transfer learning, and Mnet model for Cervical Cancer Detection
- R-Net: A Reliable and Resource-Efficient CNN for Colorectal Cancer Detection with XAI Integration
- Imaging Modalities-Based Classification for Lung Cancer Detection
- HausaMovieReview: A Benchmark Dataset for Sentiment Analysis in Low-Resource African Language
- Socratic Mind: Impact of a Novel GenAI-Powered Assessment Tool on Student Learning and Higher-Order Thinking
- Gender and Political Bias in Large Language Models: A Demonstration Platform
- Digging Into the Internal: Causality-Based Analysis of LLM Function Calling
- SubDyve: Subgraph-Driven Dynamic Propagation for Virtual Screening Enhancement Controlling False Positive
- SecureFixAgent: A Hybrid LLM Agent for Automated Python Static Vulnerability Repair
- Comparative Analysis of STEM and non-STEM Teachers' Needs for Integrating AI into Educational Environments
- Stabilizing Information Flow Entropy: Regularization for Safe and Interpretable Autonomous Driving Perception
- "I think this is fair": Uncovering the Complexities of Stakeholder Decision-Making in AI Fairness Assessment
- On the Variational Costs of Changing Our Minds
- The STAR-XAI Protocol: An Interactive Framework for Inducing Second-Order Agency in AI Agents
- Improving Large Language Models Function Calling and Interpretability via Guided-Structured Templates
- Reasoning Core: A Scalable RL Environment for LLM Symbolic Reasoning
- Breast Cancer Classification Using Gradient Boosting Algorithms Focusing on Reducing the False Negative and SHAP for Explainability
- EPIC: Generative AI Platform for Accelerating HPC Operational Data Analytics
- DarwinWafer: A Wafer-Scale Neuromorphic Chip
- Discovering Software Parallelization Points Using Deep Neural Networks
- On LLM-Based Scientific Inductive Reasoning Beyond Equations
- REAMS: Reasoning Enhanced Algorithm for Maths Solving
- Is It Certainly a Deepfake? Reliability Analysis in Detection & Generation Ecosystem
- MontePrep: Monte-Carlo-Driven Automatic Data Preparation without Target Data Instances
- LIMI: Less is More for Agency
- Table2LaTeX-RL: High-Fidelity LaTeX Code Generation from Table Images via Reinforced Multimodal Language Models
- EngiBench: A Benchmark for Evaluating Large Language Models on Engineering Problem Solving
- Virtual Arc Consistency for Linear Constraints in Cost Function Networks
- DA-Mamba: Dialogue-aware selective state-space model for multimodal engagement estimation
- Efficient & Correct Predictive Equivalence for Decision Trees
- Mitigating Strategy-Selection Bias in Reasoning for More Effective Test-Time Scaling
- MEF: A Systematic Evaluation Framework for Text-to-Image Models
- Orcust: Stepwise-Feedback Reinforcement Learning for GUI Agent
- Can Agents Judge Systematic Reviews Like Humans? Evaluating SLRs with LLM-based Multi-Agent System
- Mind the Gap: Comparing Model- vs Agentic-Level Red Teaming with Action-Graph Observability on GPT-OSS-20B
- CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models
- LLaVul: A Multimodal LLM for Interpretable Vulnerability Reasoning about Source Code
- Medical AI Consensus: A Multi-Agent Framework for Radiology Report Generation and Evaluation
- Multi-Scenario Highway Lane-Change Intention Prediction: A Physics-Informed AI Framework for Three-Class Classification
- Correlation or Causation: Analyzing the Causal Structures of LLM and LRM Reasoning Process
- Program Synthesis via Test-Time Transduction
- Evaluating Multimodal Large Language Models with Daily Composite Tasks in Home Environments
- SPICED: A Synaptic Homeostasis-Inspired Framework for Unsupervised Continual EEG Decoding
- AI Pangaea: Unifying Intelligence Islands for Adapting Myriad Tasks
- A Multimodal Conversational Assistant for the Characterization of Agricultural Plots from Geospatial Open Data
- Audio-Guided Dynamic Modality Fusion with Stereo-Aware Attention for Audio-Visual Navigation
- Quantum Abduction: A New Paradigm for Reasoning under Uncertainty
- KAHAN: Knowledge-Augmented Hierarchical Analysis and Narration for Financial Data Narration
- From domain-landmark graph learning to problem-landmark graph generation
- RALLM-POI: Retrieval-Augmented LLM for Zero-shot Next POI Recommendation with Geographical Reranking
- Intention-aware Hierarchical Diffusion Model for Long-term Trajectory Anomaly Detection
- Governing Automated Strategic Intelligence
- MCTS-EP: Empowering Embodied Planning with Online Preference Optimization
- ARE: Scaling Up Agent Environments and Evaluations
- Shall We Play a Game? Language Models for Open-ended Wargames
- MoEs Are Stronger than You Think: Hyper-Parallel Inference Scaling with RoE
- Question Answering with LLMs and Learning from Answer Sets
- FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMs
- NUMINA: A Natural Understanding Benchmark for Multi-dimensional Intelligence and Numerical Reasoning Abilities
- Sycophancy Mitigation Through Reinforcement Learning with Uncertainty-Aware Adaptive Reasoning Trajectories
- Automated Procedural Analysis via Video-Language Models for AI-assisted Nursing Skills Assessment
- Prompt-Driven Agentic Video Editing System: Autonomous Comprehension of Long-Form, Story-Driven Media
- Roundtable Policy: Improving Scientific Reasoning and Narratives through Confidence-Weighted Consensus of LLMs
- The Principles of Human-like Conscious Machine
- Large Language Models as End-to-end Combinatorial Optimization Solvers
- seqBench: A Tunable Benchmark to Quantify Sequential Reasoning Limits of LLMs
- LLMs as Layout Designers: A Spatial Reasoning Perspective
- On the Non-Uniqueness of Representation of $(U,N)$-Implications
- Psychometric Personality Shaping Modulates Capabilities and Safety in Language Models
- A Unified AI Approach for Continuous Monitoring of Human Health and Diseases from Intensive Care Unit to Home with Physiological Foundation Models (UNIPHY+)
- Evaluation of Causal Reasoning for Large Language Models in Contextualized Clinical Scenarios of Laboratory Test Interpretation
- VORTEX: Aligning Task Utility and Human Preferences through LLM-Guided Reward Shaping
- Proactive Statistical Process Control Using AI: A Time Series Forecasting Approach for Semiconductor Manufacturing
- Domain-Specific Constitutional AI: Enhancing Safety in LLM-Powered Mental Health Chatbots
- GPO: Learning from Critical Steps to Improve LLM Reasoning
- Checking extracted rules in Neural Networks
- SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning
- Zero-Shot Human Mobility Forecasting via Large Language Model with Hierarchical Reasoning
- Identifying Critical Pathways in Coronary Heart Disease via Fuzzy Subgraph Connectivity
- A global view of diverse construction methods of fuzzy implication functions rooted on F-chains
Research Sources: 665 | Generated: 9/23/2025