isLinXu/paper-list


Updated on 2024.05.23

Usage instructions: here

Table of Contents
  1. Classification
  2. Object Detection
  3. Semantic Segmentation
  4. Object Tracking
  5. Action Recognition
  6. Pose Estimation
  7. Image Generation
  8. LLM
  9. Scene Understanding
  10. Depth Estimation
  11. Audio Processing
  12. Multimodal
  13. Anomaly Detection
  14. Transfer Learning
  15. Optical Flow
  16. Reinforcement Learning
  17. Graph Neural Networks

Classification

Publish Date Title Authors PDF Code
2024-05-21 Decentralized Federated Learning Over Imperfect Communication Channels Weicai Li et.al. 2405.12894 null
2024-05-21 Multimodal Adaptive Inference for Document Image Classification with Anytime Early Exiting Omar Hamed et.al. 2405.12705 null
2024-05-21 Exploration of Masked and Causal Language Modelling for Text Generation Nicolo Micheletti et.al. 2405.12630 null
2024-05-21 3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification Yan He et.al. 2405.12487 null
2024-05-20 Alzheimer's Magnetic Resonance Imaging Classification Using Deep and Meta-Learning Models Nida Nasir et.al. 2405.12126 null
2024-05-20 Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification Weilian Zhou et.al. 2405.12003 link
2024-05-20 A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers Tom Roth et.al. 2405.11904 null
2024-05-21 A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus Eduard Poesina et.al. 2405.11877 link
2024-05-20 SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model Siavash Shams et.al. 2405.11831 link
2024-05-20 Exploring Ordinality in Text Classification: A Comparative Study of Explicit and Implicit Techniques Siva Rajesh Kasa et.al. 2405.11775 null
2024-05-19 SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization Jialong Guo et.al. 2405.11582 link
2024-05-19 Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification Manan Shah et.al. 2405.11574 link
2024-05-19 An Invisible Backdoor Attack Based On Semantic Feature Yangming Chen et.al. 2405.11551 null
2024-05-19 Verification technology for finger vein biometric George Kumi Kyeremeh et.al. 2405.11540 null
2024-05-17 Reduced storage direct tensor ring decomposition for convolutional neural networks compression Mateusz Gabor et.al. 2405.10802 link
2024-05-17 Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset Jie Zhu et.al. 2405.10542 link
2024-05-17 Smart Expert System: Large Language Models as Text Classifiers Zhiqiang Wang et.al. 2405.10523 link
2024-05-16 Data-Efficient Low-Complexity Acoustic Scene Classification in the DCASE 2024 Challenge Florian Schmid et.al. 2405.10018 null
2024-05-16 ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset Johannes Rückert et.al. 2405.10004 link
2024-05-15 Improving Label Error Detection and Elimination with Uncertainty Quantification Johannes Jakubik et.al. 2405.09602 null
2024-05-15 Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck Hongru Li et.al. 2405.09514 null
2024-05-15 Feature-based Federated Transfer Learning: Communication Efficiency, Robustness and Privacy Feng Wang et.al. 2405.09014 link
2024-05-14 The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks Ziquan Liu et.al. 2405.08886 link
2024-05-14 Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling Gregory Holste et.al. 2405.08780 null
2024-05-14 FolkTalent: Enhancing Classification and Tagging of Indian Folk Paintings Nancy Hada et.al. 2405.08776 null
2024-05-14 The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks Carmela Calabrese et.al. 2405.08695 null
2024-05-14 Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis Qingpeng Kong et.al. 2405.08681 link
2024-05-14 Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning Alain Riou et.al. 2405.08679 null
2024-05-14 Dual-Branch Network for Portrait Image Quality Assessment Wei Sun et.al. 2405.08555 null
2024-05-13 Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp Rachel Hong et.al. 2405.08209 link
2024-05-14 MambaOut: Do We Really Need Mamba for Vision? Weihao Yu et.al. 2405.07992 link
2024-05-13 Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics Haoyang Zheng et.al. 2405.07839 link
2024-05-13 Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent Michael Kohler et.al. 2405.07619 null
2024-05-13 On-device Online Learning and Semantic Management of TinyML Systems Haoyu Ren et.al. 2405.07601 null
2024-05-13 GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation Andrey V. Galichin et.al. 2405.07562 null
2024-05-13 Fine-tuning the SwissBERT Encoder Model for Embedding Sentences and Documents Juri Grosjean et.al. 2405.07513 null
2024-05-13 MoVL: Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks Haijiang Tian et.al. 2405.07411 null
2024-05-12 Explainable Convolutional Neural Networks for Retinal Fundus Classification and Cutting-Edge Segmentation Models for Retinal Blood Vessels from Fundus Images Fatema Tuj Johora Faria et.al. 2405.07338 null
2024-05-12 Differentiable Model Scaling using Differentiable Topk Kai Liu et.al. 2405.07194 null
2024-05-11 A framework of text-dependent speaker verification for chinese numerical string corpus Litong Zheng et.al. 2405.07029 null
2024-05-10 Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification Yaoqin Ye et.al. 2405.06468 null
2024-05-10 Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data Rongyu Zhang et.al. 2405.06413 null
2024-05-10 SaudiBERT: A Large Language Model Pretrained on Saudi Dialect Corpora Faisal Qarah et.al. 2405.06239 null
2024-05-09 Deep Multi-Task Learning for Malware Image Classification Ahmed Bensaoud et.al. 2405.05906 null
2024-05-09 Enhancing Suicide Risk Detection on Social Media through Semi-Supervised Deep Label Smoothing Matthew Squires et.al. 2405.05795 null
2024-05-09 CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks Nick et.al. 2405.05755 null
2024-05-09 How Quality Affects Deep Neural Networks in Fine-Grained Image Classification Joseph Smith et.al. 2405.05742 null
2024-05-09 End-to-End Generative Semantic Communication Powered by Shared Semantic Knowledge Base Shuling Li et.al. 2405.05738 null
2024-05-09 Using Machine Translation to Augment Multilingual Classification Adam King et.al. 2405.05478 null
2024-05-08 AFEN: Respiratory Disease Classification using Ensemble Learning Rahul Nadkarni et.al. 2405.05467 null
2024-05-08 XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples Peiqin Lin et.al. 2405.05116 link
2024-05-08 Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution Shuo Shao et.al. 2405.04825 null
2024-05-07 Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer Classification Mukaffi Bin Moin et.al. 2405.04610 link
2024-05-07 Pragmatist Intelligence: Where the Principle of Usefulness Can Take ANNs Antonio Bikić et.al. 2405.04386 null
2024-05-07 Semi-Supervised Disease Classification based on Limited Medical Image Data Yan Zhang et.al. 2405.04295 null
2024-05-07 DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects Da Fu et.al. 2405.04093 null
2024-05-07 Feature Map Convergence Evaluation for Functional Module Ludan Zhang et.al. 2405.04041 null
2024-05-07 VMambaCC: A Visual State Space Model for Crowd Counting Hao-Yuan Ma et.al. 2405.03978 null
2024-05-06 On Adversarial Examples for Text Classification by Perturbing Latent Representations Korn Sooksatra et.al. 2405.03789 null
2024-05-06 CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification Sankalp Sinha et.al. 2405.03660 null
2024-05-06 Deep Space Separable Distillation for Lightweight Acoustic Scene Classification ShuQi Ye et.al. 2405.03567 null
2024-05-06 Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing Han Liu et.al. 2405.03565 null
2024-05-06 A Lightweight Neural Architecture Search Model for Medical Image Classification Lunchen Xie et.al. 2405.03462 null
2024-05-06 Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification Matteo Bianchi et.al. 2405.03301 null
2024-05-06 TED: Accelerate Model Training by Internal Generalization Jinying Xiao et.al. 2405.03228 null
2024-05-06 Advancing Multimodal Medical Capabilities of Gemini Lin Yang et.al. 2405.03162 null
2024-05-05 A scoping review of using Large Language Models (LLMs) to investigate Electronic Health Records (EHRs) Lingyao Li et.al. 2405.03066 null
2024-05-05 Parameter-Efficient Fine-Tuning with Discrete Fourier Transform Ziqi Gao et.al. 2405.03003 null
2024-05-04 MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning Vishal Nedungadi et.al. 2405.02771 null
2024-05-03 Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification Siqi Yin et.al. 2405.02155 null
2024-05-03 The Trade-off between Performance, Efficiency, and Fairness in Adapter Modules for Text Classification Minh Duc Bui et.al. 2405.02010 null
2024-05-03 Which Identities Are Mobilized: Towards an automated detection of social group appeals in political texts Felicia Riethmüller et.al. 2405.01904 null
2024-05-02 PVF (Parameter Vulnerability Factor): A Quantitative Metric Measuring AI Vulnerability and Resilience Against Parameter Corruptions Xun Jiao et.al. 2405.01741 null
2024-05-02 Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey Guoping Xu et.al. 2405.01725 link
2024-05-02 SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients Tushar Verma et.al. 2405.01699 null
2024-05-02 Explainable AI (XAI) in Image Segmentation in Medicine, Industry, and Beyond: A Survey Rokas Gipiškis et.al. 2405.01636 null
2024-05-02 Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models Nishad Singhi et.al. 2405.01531 null
2024-05-03 Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks Mikkel Jordahn et.al. 2405.01196 null
2024-05-02 Uncertainty-aware self-training with expectation maximization basis transformation Zijia Wang et.al. 2405.01175 null
2024-05-02 Transformers Fusion across Disjoint Samples for Hyperspectral Image Classification Muhammad Ahmad et.al. 2405.01095 null
2024-05-02 Efficient and Flexible Method for Reducing Moderate-size Deep Neural Networks with Condensation Tianyi Chen et.al. 2405.01041 null
2024-05-02 Benchmarking Representations for Speech, Music, and Acoustic Events Moreno La Quatra et.al. 2405.00934 link
2024-05-01 Digital-analog quantum convolutional neural networks for image classification Anton Simen et.al. 2405.00548 null
2024-05-03 BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine Mingchen Li et.al. 2405.00465 null
2024-05-01 Visual and audio scene classification for detecting discrepancies in video: a baseline method and experimental protocol Konstantinos Apostolidis et.al. 2405.00384 null
2024-05-01 Data Augmentation Policy Search for Long-Term Forecasting Liran Nochumsohn et.al. 2405.00319 null
2024-04-30 Let's Focus: Focused Backdoor Attack against Federated Transfer Learning Marco Arazzi et.al. 2404.19420 null
2024-04-30 Large Language Model Informed Patent Image Retrieval Hao-Cheng Lo et.al. 2404.19360 null
2024-04-30 Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair Jeonghoon Park et.al. 2404.19250 null
2024-04-29 Spectral-Spatial Mamba for Hyperspectral Image Classification Lingbo Huang et.al. 2404.18401 null
2024-04-28 TextGram: Towards a better domain-adaptive pretraining Sharayu Hiwarkhedkar et.al. 2404.18228 null
2024-04-28 L3Cube-MahaNews: News-based Short Text and Long Document Classification Datasets in Marathi Saloni Mittal et.al. 2404.18216 link
2024-04-28 S$^2$Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification Guanchun Wang et.al. 2404.18213 null
2024-04-27 Implicit Generative Prior for Bayesian Neural Networks Yijia Liu et.al. 2404.18008 link
2024-04-27 Towards Privacy-Preserving Audio Classification Systems Bhawana Chhaglani et.al. 2404.18002 null
2024-04-27 A Method of Moments Embedding Constraint and its Application to Semi-Supervised Learning Michael Majurski et.al. 2404.17978 null
2024-04-27 Spatial, Temporal, and Geometric Fusion for Remote Sensing Images Hessah Albanwan et.al. 2404.17851 null
2024-04-27 Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification Chao Yi et.al. 2404.17753 link
2024-04-26 SPLICE -- Streamlining Digital Pathology Image Processing Areej Alsaafin et.al. 2404.17704 null
2024-04-26 SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes Georgia Baltsou et.al. 2404.17255 null
2024-04-25 Incorporating Lexical and Syntactic Knowledge for Unsupervised Cross-Lingual Transfer Jianyu Zheng et.al. 2404.16627 link
2024-04-25 IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks Zitong Huang et.al. 2404.16331 null
2024-04-25 Lacunarity Pooling Layers for Plant Image Classification using Texture Analysis Akshatha Mohan et.al. 2404.16268 link
2024-04-24 MiMICRI: Towards Domain-centered Counterfactual Explanations of Cardiovascular Image Classification Models Grace Guo et.al. 2404.16174 null
2024-04-24 MoDE: CLIP Data Experts via Clustering Jiawei Ma et.al. 2404.16030 link
2024-04-26 A Survey on Visual Mamba Hanwei Zhang et.al. 2404.15956 null
2024-04-24 Vision Transformer-based Adversarial Domain Adaptation Yahan Li et.al. 2404.15817 link
2024-04-24 Rethinking Model Prototyping through the MedMNIST+ Dataset Collection Sebastian Doerrich et.al. 2404.15786 null
2024-04-24 Efficient Multi-Model Fusion with Adversarial Complementary Representation Learning Zuheng Kang et.al. 2404.15704 null
2024-04-24 Brain Storm Optimization Based Swarm Learning for Diabetic Retinopathy Image Classification Liang Qu et.al. 2404.15585 null
2024-04-23 An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models Yangchen Pan et.al. 2404.15518 null
2024-04-23 Deep multi-prototype capsule networks Saeid Abbassi et.al. 2404.15445 null
2024-04-23 A review of deep learning-based information fusion techniques for multimodal medical image classification Yihao Li et.al. 2404.15022 null
2024-04-23 Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case Muhammad Asif Auyb et.al. 2404.14977 null
2024-04-23 Traditional to Transformers: A Survey on Current Trends and Future Prospects for Hyperspectral Image Classification Muhammad Ahmad et.al. 2404.14955 link
2024-04-23 Pyramid Hierarchical Transformer for Hyperspectral Image Classification Muhammad Ahmad et.al. 2404.14945 link
2024-04-23 Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification Muhammad Ahmad et.al. 2404.14944 link
2024-04-23 CoProNN: Concept-based Prototypical Nearest Neighbors for Explaining Vision Models Teodor Chiaburu et.al. 2404.14830 link
2024-04-22 WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models Ronald Xie et.al. 2404.14567 null
2024-04-22 CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective Wencheng Zhu et.al. 2404.14109 null
2024-04-21 EncodeNet: A Framework for Boosting DNN Accuracy with Entropy-driven Generalized Converting Autoencoder Hasanul Mahmud et.al. 2404.13770 null
2024-04-21 PEACH: Pretrained-embedding Explanation Across Contextual and Hierarchical Structure Feiqi Cao et.al. 2404.13645 link
2024-04-21 I2CANSAY: Inter-Class Analogical Augmentation and Intra-Class Significance Analysis for Non-Exemplar Online Task-Free Continual Learning Songlin Dong et.al. 2404.13576 null
2024-04-21 IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models Tao Feng et.al. 2404.13504 null
2024-04-20 Nested-TNT: Hierarchical Vision Transformers with Multi-Scale Feature Processing Yuang Liu et.al. 2404.13434 null
2024-04-20 Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge Khuyagbaatar Batsuren et.al. 2404.13292 link
2024-04-20 3D-Convolution Guided Spectral-Spatial Transformer for Hyperspectral Image Classification Shyam Varahagiri et.al. 2404.13252 link
2024-04-19 On-board classification of underwater images using hybrid classical-quantum CNN based method Sreeraj Rajan Warrier et.al. 2404.13130 null
2024-04-19 Next Generation Loss Function for Image Classification Shakhnaz Akhmedova et.al. 2404.12948 null
2024-04-19 A Hybrid Generative and Discriminative PointNet on Unordered Point Sets Yang Ye et.al. 2404.12925 null
2024-04-19 Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment Danqing Ma et.al. 2404.12634 null
2024-04-18 When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes Asaf Yehudai et.al. 2404.12365 null
2024-04-18 Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training Jin Gao et.al. 2404.12210 link
2024-04-18 Concept Induction using LLMs: a user experiment for assessment Adrita Barua et.al. 2404.11875 null
2024-04-17 Pretraining Billion-scale Geospatial Foundational Models on Frontier Aristeidis Tsaris et.al. 2404.11706 null
2024-04-17 AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts Meng Jiang et.al. 2404.11449 null
2024-04-17 Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured Hanlin Mo et.al. 2404.11309 null
2024-04-17 A Progressive Framework of Vision-language Knowledge Distillation and Alignment for Multilingual Scene Wenbo Zhang et.al. 2404.11249 null
2024-04-17 A Novel ICD Coding Framework Based on Associated and Hierarchical Code Description Distillation Bin Zhang et.al. 2404.11132 null
2024-04-17 Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification Pierre Lepagnol et.al. 2404.11122 null
2024-04-18 Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification Mohammad Shiri et.al. 2404.11052 null
2024-04-17 InfoMatch: Entropy Neural Estimation for Semi-Supervised Image Classification Qi Han et.al. 2404.11003 link
2024-04-16 Incubating Text Classifiers Following User Instruction with Nothing but LLM Letian Peng et.al. 2404.10877 null
2024-04-16 Vocabulary-free Image Classification and Semantic Segmentation Alessandro Conti et.al. 2404.10864 link
2024-04-16 Assessing The Impact of CNN Auto Encoder-Based Image Denoising on Image Classification Tasks Mohsen Hami et.al. 2404.10664 null
2024-04-16 Tree Bandits for Generative Bayes Sean O'Hagan et.al. 2404.10436 null
2024-04-16 AudioProtoPNet: An interpretable deep learning model for bird sound classification René Heinrich et.al. 2404.10420 null
2024-04-16 Lighter, Better, Faster Multi-Source Domain Adaptation with Gaussian Mixture Models and Optimal Transport Eduardo Fernandes Montesuma et.al. 2404.10261 null
2024-04-15 Distributed Federated Learning-Based Deep Learning Model for Privacy MRI Brain Tumor Detection Lisang Zhou et.al. 2404.10026 null
2024-04-15 Interaction as Explanation: A User Interaction-based Method for Explaining Image Classification Models Hyeonggeun Yun et.al. 2404.09828 null
2024-04-15 Quantization of Large Language Models with an Overdetermined Basis Daniil Merkulov et.al. 2404.09737 null
2024-04-15 Pseudo-label Learning with Calibrated Confidence Using an Energy-based Model Masahito Toba et.al. 2404.09585 null
2024-04-14 Breast Cancer Image Classification Method Based on Deep Transfer Learning Weimin Wang et.al. 2404.09226 null
2024-04-14 Coreset Selection for Object Detection Hojun Lee et.al. 2404.09161 null
2024-04-13 Exploring Explainability in Video Action Recognition Avinab Saha et.al. 2404.09067 null
2024-04-13 Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification Denis Huseljic et.al. 2404.08981 link
2024-04-13 PM2: A New Prompting Multi-modal Model Paradigm for Few-shot Medical Image Classification Zhenwei Wang et.al. 2404.08915 null
2024-04-12 VertAttack: Taking advantage of Text Classifiers' horizontal vision Jonathan Rusert et.al. 2404.08538 null
2024-04-12 SpectralMamba: Efficient Mamba for Hyperspectral Image Classification Jing Yao et.al. 2404.08489 null
2024-04-12 OTTER: Improving Zero-Shot Classification via Optimal Transport Changho Shin et.al. 2404.08461 null
2024-04-12 A Survey of Neural Network Robustness Assessment in Image Recognition Jie Wang et.al. 2404.08285 null
2024-04-12 Convolutional neural network classification of cancer cytopathology images: taking breast cancer as an example MingXuan Xiao et.al. 2404.08279 null
2024-04-11 HGRN2: Gated Linear RNNs with State Expansion Zhen Qin et.al. 2404.07904 link
2024-04-11 Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification Ricardo Pereira et.al. 2404.07739 null
2024-04-11 Contrastive-Based Deep Embeddings for Label Noise-Resilient Histopathology Image Classification Lucas Dedieu et.al. 2404.07605 link
2024-04-11 Learning to Classify New Foods Incrementally Via Compressed Exemplars Justin Yang et.al. 2404.07507 null
2024-04-11 Interactive Prompt Debugging with Sequence Salience Ian Tenney et.al. 2404.07498 null
2024-04-11 Privacy preserving layer partitioning for Deep Neural Network models Kishore Rajasekar et.al. 2404.07437 null
2024-04-11 CopilotCAD: Empowering Radiologists with Report Completion Models and Quantitative Evidence from Medical Image Foundation Models Sheng Wang et.al. 2404.07424 null
2024-04-11 Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling Sourajit Saha et.al. 2404.07410 null
2024-04-10 Lost in Translation: Modern Neural Networks Still Struggle With Small Realistic Image Transformations Ofir Shifman et.al. 2404.07153 null
2024-04-10 Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization Michael Kohler et.al. 2404.07128 null
2024-04-10 Accelerating Cardiac MRI Reconstruction with CMRatt: An Attention-Driven Approach Anam Hashmi et.al. 2404.06941 null
2024-04-10 Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark Marina Ceccon et.al. 2404.06859 null
2024-04-10 Neural Optimizer Equation, Decay Function, and Learning Rate Schedule Joint Evolution Brandon Morgan et.al. 2404.06679 null
2024-04-09 Variational Stochastic Gradient Descent for Deep Neural Networks Haotian Chen et.al. 2404.06549 link
2024-04-09 On adversarial training and the 1 Nearest Neighbor classifier Amir Hagai et.al. 2404.06313 link
2024-04-09 Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models David Kurzendörfer et.al. 2404.06309 link
2024-04-09 Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training Ming-Kun Xie et.al. 2404.06287 null
2024-04-09 Quantum Circuit $C^*$-algebra Net Yuka Hashimoto et.al. 2404.06218 null
2024-04-09 VI-OOD: A Unified Representation Learning Framework for Textual Out-of-distribution Detection Li-Ming Zhan et.al. 2404.06217 link
2024-04-09 Symmetry-guided gradient descent for quantum neural networks Kaiming Bian et.al. 2404.06108 null
2024-04-10 Using Few-Shot Learning to Classify Primary Lung Cancer and Other Malignancy with Lung Metastasis in Cytological Imaging via Endobronchial Ultrasound Procedures Ching-Kai Lin et.al. 2404.06080 null
2024-04-08 Neural Cellular Automata for Lightweight, Robust and Explainable Classification of White Blood Cell Images Michael Deutges et.al. 2404.05584 null
2024-04-08 On the Convergence of Continual Learning with Adaptive Methods Seungyub Han et.al. 2404.05555 null
2024-04-08 Multi-Task Learning for Features Extraction in Financial Annual Reports Syrielle Montariol et.al. 2404.05281 link
2024-04-08 Allowing humans to interactively guide machines where to look does not always improve a human-AI team's classification accuracy Giang Nguyen et.al. 2404.05238 null
2024-04-08 iVPT: Improving Task-relevant Information Sharing in Visual Prompt Tuning by Cross-layer Dynamic Connection Nan Zhou et.al. 2404.05207 null
2024-04-08 Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods Roopkatha Dey et.al. 2404.05159 null
2024-04-07 PairAug: What Can Augmented Image-Text Pairs Do for Radiology? Yutong Xie et.al. 2404.04960 link
2024-04-07 GvT: A Graph-based Vision Transformer with Talking-Heads Utilizing Sparsity, Trained from Scratch on Small Datasets Dongjing Shan et.al. 2404.04924 null
2024-04-06 Focused Active Learning for Histopathological Image Classification Arne Schmidt et.al. 2404.04663 null
2024-04-06 Trustless Audits without Revealing Data or Models Suppakit Waiwitlikhit et.al. 2404.04500 null
2024-04-05 Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism Trilokesh Ranjan Sarkar et.al. 2404.04245 null
2024-04-05 Noisy Label Processing for Classification: A Survey Mengting Li et.al. 2404.04159 null
2024-04-05 Learning Correlation Structures for Vision Transformers Manjin Kim et.al. 2404.03924 null
2024-04-05 LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification Judy X Yang et.al. 2404.03883 null
2024-04-04 Dendrites endow artificial neural networks with accurate, robust and parameter-efficient learning Spyridon Chavlis et.al. 2404.03708 null
2024-04-05 A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data Iqra Bano et.al. 2404.03493 null
2024-04-04 Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks Lei Zhang et.al. 2404.03340 null
2024-04-04 Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning Andrei Semenov et.al. 2404.03323 link
2024-04-04 FACTUAL: A Novel Framework for Contrastive Learning Based Robust SAR Image Classification Xu Wang et.al. 2404.03225 null
2024-04-03 Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales Lucas E. Resck et.al. 2404.03098 link
2024-04-03 Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds Kamalika Chaudhuri et.al. 2404.02866 link
2024-04-03 FPT: Feature Prompt Tuning for Few-shot Readability Assessment Ziyang Wang et.al. 2404.02772 link
2024-04-03 Adversarial Attacks and Dimensionality in Text Classifiers Nandish Chattopadhyay et.al. 2404.02660 null
2024-04-04 Non-negative Subspace Feature Representation for Few-shot Learning in Medical Imaging Keqiang Fan et.al. 2404.02656 null
2024-04-03 Adaptive Cross-lingual Text Classification through In-Context One-Shot Demonstrations Emilio Villa-Cueva et.al. 2404.02452 link
2024-04-03 A Novel Approach to Breast Cancer Histopathological Image Classification Using Cross-Colour Space Feature Fusion and Quantum-Classical Stack Ensemble Method Sambit Mallick et.al. 2404.02447 null
2024-04-03 Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data Parth Patwa et.al. 2404.02422 null
2024-04-02 Smooth Deep Saliency Rudolf Herdt et.al. 2404.02282 null
2024-04-02 Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models Matthew Kowal et.al. 2404.02233 null
2024-04-02 ImageNot: A contrast with ImageNet preserves model rankings Olawale Salaudeen et.al. 2404.02112 null
2024-04-02 Explainability in JupyterLab and Beyond: Interactive XAI Systems for Integrated and Collaborative Workflows Grace Guo et.al. 2404.02081 null
2024-04-02 Ukrainian Texts Classification: Exploration of Cross-lingual Knowledge Transfer Approaches Daryna Dementieva et.al. 2404.02043 null
2024-04-02 CAM-Based Methods Can See through Walls Magamed Taimeskhanov et.al. 2404.01964 link
2024-04-02 Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss Jaeha Kim et.al. 2404.01692 null
2024-04-02 A Universal Knowledge Embedded Contrastive Learning Framework for Hyperspectral Image Classification Quanwei Liu et.al. 2404.01673 null
2024-04-01 Can Biases in ImageNet Models Explain Generalization? Paul Gavrikov et.al. 2404.01509 link
2024-04-01 Parallel Proportional Fusion of Spiking Quantum Neural Network for Optimizing Image Classification Zuyu Xu et.al. 2404.01359 null
2024-04-01 Bridging Remote Sensors with Multisensor Geospatial Foundation Models Boran Han et.al. 2404.01260 link
2024-04-01 Diagnosis of Skin Cancer Using VGG16 and VGG19 Based Transfer Learning Models Amir Faghihi et.al. 2404.01160 null
2024-03-29 Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations Jaisidh Singh et.al. 2403.20312 link
2024-03-29 MCNet: A crowd density estimation network based on integrating multiscale attention module Qiang Guo et.al. 2403.20173 null
2024-03-29 Segmentation, Classification and Interpretation of Breast Cancer Medical Images using Human-in-the-Loop Machine Learning David Vázquez-Lema et.al. 2403.20112 null
2024-03-29 Adverb Is the Key: Simple Text Data Augmentation with Adverb Deletion Juhwan Choi et.al. 2403.20015 null
2024-03-29 Diverse Feature Learning by Self-distillation and Reset Sejik Park et.al. 2403.19941 null
2024-03-29 Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification Jianfeng Cai et.al. 2403.19902 link
2024-03-28 X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization Anna Kukleva et.al. 2403.19811 link
2024-03-28 RSMamba: Remote Sensing Image Classification with State Space Model Keyan Chen et.al. 2403.19654 link
2024-03-28 Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model Zhicai Wang et.al. 2403.19600 link
2024-03-28 The Bad Batches: Enhancing Self-Supervised Learning in Image Classification Through Representative Batch Curation Ozgu Goksu et.al. 2403.19579 null
2024-03-28 Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach Wei Dong et.al. 2403.19067 link
2024-03-27 Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data Yuting Guo et.al. 2403.19031 null
2024-03-27 Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement Learning Soumyendu Sarkar et.al. 2403.18985 null
2024-03-27 The Impact of Uniform Inputs on Activation Sparsity and Energy-Latency Attacks in Computer Vision Andreas Müller et.al. 2403.18587 link
2024-03-27 Uncertainty-Aware SAR ATR: Defending Against Adversarial Attacks via Bayesian Neural Networks Tian Ye et.al. 2403.18318 null
2024-03-27 Multi-scale Unified Network for Image Classification Wenzhuo Liu et.al. 2403.18294 null
2024-03-26 The Need for Speed: Pruning Transformers with One Recipe Samir Khaki et.al. 2403.17921 link
2024-03-26 Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation Carlos Gomes et.al. 2403.17886 null
2024-03-26 PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition Chenhongyi Yang et.al. 2403.17695 link
2024-03-26 Language Models for Text Classification: Is In-Context Learning Enough? Aleksandra Edwards et.al. 2403.17661 null
2024-03-26 Boosting Few-Shot Learning with Disentangled Self-Supervised Learning and Meta-Learning for Medical Image Classification Eva Pachetti et.al. 2403.17530 null
2024-03-26 HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification He Zhu et.al. 2403.17307 link
2024-03-25 Histogram Layers for Neural Engineered Features Joshua Peeples et.al. 2403.17176 link
2024-03-25 Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships Rangel Daroya et.al. 2403.17173 link
2024-03-25 CipherFormer: Efficient Transformer Private Inference with Low Round Complexity Weize Wang et.al. 2403.16860 null
2024-03-25 Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer Dominik Müller et.al. 2403.16695 null
2024-03-25 DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks Dominik Müller et.al. 2403.16678 link
2024-03-25 LARA: Linguistic-Adaptive Retrieval-Augmented LLMs for Multi-Turn Intent Classification Liu Junhua et.al. 2403.16504 null
2024-03-24 On machine learning analysis of atomic force microscopy images for image classification, sample surface recognition Igor Sokolov et.al. 2403.16230 null
2024-03-24 Leveraging Deep Learning and Xception Architecture for High-Accuracy MRI Classification in Alzheimer Diagnosis Shaojie Li et.al. 2403.16212 null
2024-03-24 Multi-Task Learning with Multi-Task Optimization Lu Bai et.al. 2403.16162 null
2024-03-24 CBGT-Net: A Neuromimetic Architecture for Robust Classification of Streaming Data Shreya Sharma et.al. 2403.15974 link
2024-03-23 A Deep Learning Architectures for Kidney Disease Classification Muhammad Shoaib Farooq et.al. 2403.15895 null
2024-03-23 VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding Phong Nguyen-Thuan Do et.al. 2403.15882 null
2024-03-23 VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification Lanfeng Zhong et.al. 2403.15836 null
2024-03-22 Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion Sofia Casarin et.al. 2403.15194 null
2024-03-22 Image Classification with Rotation-Invariant Variational Quantum Circuits Paul San Sebastian et.al. 2403.15031 null
2024-03-22 Extracting Human Attention through Crowdsourced Patch Labeling Minsuk Chang et.al. 2403.15013 null
2024-03-22 Clean-image Backdoor Attacks Dazhong Rong et.al. 2403.15010 null
2024-03-22 ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding Novendra Setyawan et.al. 2403.15004 null
2024-03-22 MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection Sadiya Sayara Chowdhury Puspo et.al. 2403.14989 null
2024-03-21 Learning with SASQuaTCh: a Novel Variational Quantum Transformer Architecture with Kernel-Based Self-Attention Ethan N. Evans et.al. 2403.14753 null
2024-03-21 Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images Tom Burgert et.al. 2403.14547 null
2024-03-21 Multi-Level Explanations for Generative Language Models Lucas Monteiro Paes et.al. 2403.14459 null
2024-03-21 Tensor network compressibility of convolutional models Sukhbinder Singh et.al. 2403.14379 null
2024-03-21 LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding Masato Fujitake et.al. 2403.14252 null
2024-03-21 Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations Xun Lin et.al. 2403.14250 null
2024-03-21 Improving Image Classification Accuracy through Complementary Intra-Class and Inter-Class Mixup Ye Xu et.al. 2403.14137 link
2024-03-20 Bridge the Modality and Capacity Gaps in Vision-Language Model Selection Chao Yi et.al. 2403.13797 null
2024-03-20 Leveraging feature communication in federated learning for remote sensing image classification Anh-Kiet Duong et.al. 2403.13575 null
2024-03-20 MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining Di Wang et.al. 2403.13430 link
2024-03-20 Building Optimal Neural Architectures using Interpretable Knowledge Keith G. Mills et.al. 2403.13293 link
2024-03-19 LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images Jing Zhang et.al. 2403.13171 null
2024-03-19 Improved EATFormer: A Vision Transformer for Medical Image Classification Yulong Shisu et.al. 2403.13167 null
2024-03-19 SIFT-DBT: Self-supervised Initialization and Fine-Tuning for Imbalanced Digital Breast Tomosynthesis Image Classification Yuexi Du et.al. 2403.13148 link
2024-03-19 Using evolutionary computation to optimize task performance of unclocked, recurrent Boolean circuits in FPGAs Raphael Norman-Tenazas et.al. 2403.13105 null
2024-03-19 Investigating Text Shortening Strategy in BERT: Truncation vs Summarization Mirza Alim Mutasodirin et.al. 2403.12799 link
2024-03-18 Posterior Uncertainty Quantification in Neural Networks using Data Augmentation Luhuan Wu et.al. 2403.12729 null
2024-03-19 SEVEN: Pruning Transformer Model by Reserving Sentinels Jinying Xiao et.al. 2403.12688 link
2024-03-19 Simple Hack for Transformers against Heavy Long-Text Classification on a Time- and Memory-Limited GPU Service Mirza Alim Mutasodirin et.al. 2403.12563 null
2024-03-19 Prompt-Guided Adaptive Model Transformation for Whole Slide Image Classification Yi Lin et.al. 2403.12537 null
2024-03-19 CrossTune: Black-Box Few-Shot Classification with Label Enhancement Danqing Luo et.al. 2403.12468 null
2024-03-18 Generalizing deep learning models for medical image classification Matta Sarah et.al. 2403.12167 null
2024-03-19 Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks K. P. Santoso et.al. 2403.12009 null
2024-03-18 High-energy physics image classification: A Survey of Jet Applications Hamza Kheddar et.al. 2403.11934 null
2024-03-18 Better (pseudo-)labels for semi-supervised instance segmentation François Porcher et.al. 2403.11675 null
2024-03-18 Continual Forgetting for Pre-trained Vision Models Hongbo Zhao et.al. 2403.11530 link
2024-03-18 Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting Mingkui Tan et.al. 2403.11491 null
2024-03-17 Potential of Domain Adaptation in Machine Learning in Ecology and Hydrology to Improve Model Extrapolability Haiyang Shi et.al. 2403.11331 null
2024-03-17 A Modified Word Saliency-Based Adversarial Attack on Text Classification Models Hetvi Waghela et.al. 2403.11297 null
2024-03-17 Forging the Forger: An Attempt to Improve Authorship Verification via Data Augmentation Silvia Corbara et.al. 2403.11265 null
2024-03-17 Multiple Teachers-Meticulous Student: A Domain Adaptive Meta-Knowledge Distillation Model for Medical Image Classification Shahabedin Nabavi et.al. 2403.11226 null
2024-03-16 Forward Learning of Graph Neural Networks Namyong Park et.al. 2403.11004 null
2024-03-16 Understanding Robustness of Visual State Space Models for Image Classification Chengbin Du et.al. 2403.10935 null
2024-03-16 Automatic location detection based on deep learning Anjali Karangiya et.al. 2403.10912 null
2024-03-14 Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models Akhil Kedia et.al. 2403.09635 link
2024-03-14 XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization Yequan Bie et.al. 2403.09410 null
2024-03-14 ConDiSR: Contrastive Disentanglement and Style Regularization for Single Domain Generalization Aleksandr Matsun et.al. 2403.09400 null
2024-03-14 A Hierarchical Fused Quantum Fuzzy Neural Network for Image Classification Sheng-Yao Wu et.al. 2403.09318 null
2024-03-14 CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification Yiming Ma et.al. 2403.09281 null
2024-03-14 Are Vision Language Models Texture or Shape Biased and Can We Steer Them? Paul Gavrikov et.al. 2403.09193 null
2024-03-14 Randomized Principal Component Analysis for Hyperspectral Image Classification Mustafa Ustuner et.al. 2403.09117 null
2024-03-14 CardioCaps: Attention-based Capsule Network for Class-Imbalanced Echocardiogram Classification Hyunkyung Han et.al. 2403.09108 link
2024-03-14 The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? Qinyu Zhao et.al. 2403.09037 link
2024-03-13 PathM3: A Multimodal Multi-Task Multiple Instance Learning Framework for Whole Slide Image Classification and Captioning Qifeng Zhou et.al. 2403.08967 null
2024-03-13 DAM: Dynamic Adapter Merging for Continual Video QA Learning Feng Cheng et.al. 2403.08755 link
2024-03-13 Leveraging Compressed Frame Sizes For Ultra-Fast Video Classification Yuxing Han et.al. 2403.08580 null
2024-03-13 HOLMES: HOLonym-MEronym based Semantic inspection for Convolutional Image Classifiers Francesco Dibitonto et.al. 2403.08536 link
2024-03-13 Pig aggression classification using CNN, Transformers and Recurrent Networks Junior Silva Souza et.al. 2403.08528 null
2024-03-13 Reduced Jeffries-Matusita distance: A Novel Loss Function to Improve Generalization Performance of Deep Classification Models Mohammad Lashkari et.al. 2403.08408 null
2024-03-13 Iterative Online Image Synthesis via Diffusion Model for Imbalanced Classification Shuhan Li et.al. 2403.08407 null
2024-03-13 Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks Khondoker Murad Hossain et.al. 2403.08208 null
2024-03-13 Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks Fuzhi Wu et.al. 2403.08157 link
2024-03-12 Harnessing Artificial Intelligence to Combat Online Hate: Exploring the Challenges and Opportunities of Large Language Models in Hate Speech Detection Tharindu Kumarage et.al. 2403.08035 null
2024-03-13 Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion Dongyang Li et.al. 2403.07721 link
2024-03-12 FPT: Fine-grained Prompt Tuning for Parameter and Memory Efficient Fine Tuning in High-resolution Medical Image Classification Yijin Huang et.al. 2403.07576 null
2024-03-12 Backdoor Attack with Mode Mixture Latent Modification Hongwei Zhang et.al. 2403.07463 null
2024-03-12 In-context learning enables multimodal large language models to classify cancer pathology images Dyke Ferber et.al. 2403.07407 null
2024-03-12 Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning Mark D. McDonnell et.al. 2403.07356 null
2024-03-12 How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance Hongkang Li et.al. 2403.07310 null
2024-03-12 A Bayesian Approach to OOD Robustness in Image Classification Prakhar Kaushik et.al. 2403.07277 null
2024-03-11 LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations Mohammad Alkhalefi et.al. 2403.06813 null
2024-03-11 Dynamic Perturbation-Adaptive Adversarial Training on Medical Image Classification Shuai Li et.al. 2403.06798 null
2024-03-11 Leveraging Internal Representations of Model for Magnetic Image Classification Adarsh N L et.al. 2403.06797 null
2024-03-11 Shortcut Learning in Medical Image Segmentation Manxi Lin et.al. 2403.06748 null
2024-03-11 Active Generation for Image Classification Tao Huang et.al. 2403.06517 null
2024-03-11 Evolving Knowledge Distillation with Large Language Models and Active Learning Chengyuan Liu et.al. 2403.06414 null
2024-03-11 'One size doesn't fit all': Learning how many Examples to use for In-Context Learning for Improved Text Classification Manish Chandra et.al. 2403.06402 null
2024-03-10 Probing Image Compression For Class-Incremental Learning Justin Yang et.al. 2403.06288 null
2024-03-10 Bayesian Random Semantic Data Augmentation for Medical Image Classification Yaoyao Zhu et.al. 2403.06138 link
2024-03-10 Universal Debiased Editing for Fair Medical Image Classification Ruinan Jin et.al. 2403.06104 null
2024-03-08 Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets Lorenzo Brigato et.al. 2403.05532 null
2024-03-08 Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation Yu Han et.al. 2403.05388 null
2024-03-08 The Impact of Quantization on the Robustness of Transformer-based Text Classifiers Seyed Parsa Neshaei et.al. 2403.05365 null
2024-03-08 Multiple Instance Learning with random sampling for Whole Slide Image Classification H. Keshvarikhojasteh et.al. 2403.05351 null
2024-03-08 Learning Expressive And Generalizable Motion Features For Face Forgery Detection Jingyi Zhang et.al. 2403.05172 null
2024-03-08 Defending Against Unforeseen Failure Modes with Latent Adversarial Training Stephen Casper et.al. 2403.05030 link
2024-03-07 Fooling Neural Networks for Motion Forecasting via Adversarial Attacks Edgar Medina et.al. 2403.04954 null
2024-03-07 T-TAME: Trainable Attention Mechanism for Explaining Convolutional Networks and Vision Transformers Mariano V. Ntrougkas et.al. 2403.04523 null
2024-03-07 Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging Dovile Juodelyte et.al. 2403.04484 link
2024-03-07 Advancing Biomedical Text Mining with Community Challenges Hui Zong et.al. 2403.04261 null
2024-03-07 Scalable On-Chip Optical Linear Processing Unit Using a Single Thin-Film Lithium Niobate Ring Modulator Zhaoang Deng et.al. 2403.04216 null
2024-03-07 Scalable and Robust Transformer Decoders for Interpretable Image Classification with Foundation Models Evelyn Mannix et.al. 2403.04125 null
2024-03-07 Privacy-preserving Fine-tuning of Large Language Models through Flatness Tiejin Chen et.al. 2403.04124 null
2024-03-06 MedMamba: Vision Mamba for Medical Image Classification Yubiao Yue et.al. 2403.03849 link
2024-03-06 On the Effectiveness of Distillation in Mitigating Backdoors in Pre-trained Encoder Tingxu Han et.al. 2403.03846 link
2024-03-06 RADIA -- Radio Advertisement Detection with Intelligent Analytics Jorge Álvarez et.al. 2403.03538 null
2024-03-06 Inverse-Free Fast Natural Gradient Descent Method for Deep Learning Xinwei Ou et.al. 2403.03473 null
2024-03-06 Sparse Spiking Neural Network: Exploiting Heterogeneity in Timescales for Pruning Recurrent SNN Biswadeep Chakraborty et.al. 2403.03409 null
2024-03-05 RulePrompt: Weakly Supervised Text Classification with Prompting PLMs and Self-Iterative Logical Rules Miaomiao Li et.al. 2403.02932 link
2024-03-05 Demonstrating Mutual Reinforcement Effect through Information Flow Chengguang Gan et.al. 2403.02902 null
2024-03-05 Quantum Mixed-State Self-Attention Network Fu Chen et.al. 2403.02871 null
2024-03-05 SOFIM: Stochastic Optimization Using Regularized Fisher Information Matrix Gayathri C et.al. 2403.02833 null
2024-03-05 SGD with Partial Hessian for Deep Neural Networks Optimization Ying Sun et.al. 2403.02681 link
2024-03-05 G-EvoNAS: Evolutionary Neural Architecture Search Based on Network Growth Juan Zou et.al. 2403.02667 null
2024-03-05 Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad Sayantan Choudhury et.al. 2403.02648 link
2024-03-05 Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use Imad Eddine Toubal et.al. 2403.02626 null
2024-03-04 When do Convolutional Neural Networks Stop Learning? Sahan Ahmad et.al. 2403.02473 link
2024-03-04 NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function Abdullah Nazhat Abdullah et.al. 2403.02411 link
2024-03-02 Can a Confident Prior Replace a Cold Posterior? Martin Marek et.al. 2403.01272 link
2024-03-02 Leveraging Self-Supervised Learning for Scene Recognition in Child Sexual Abuse Imagery Pedro H. V. Valois et.al. 2403.01183 null
2024-03-02 Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation Lian Xu et.al. 2403.01156 null
2024-03-02 ELA: Efficient Local Attention for Deep Convolutional Neural Networks Wei Xu et.al. 2403.01123 null
2024-03-01 Margin Discrepancy-based Adversarial Training for Multi-Domain Text Classification Yuan Wu et.al. 2403.00888 null
2024-03-01 Text classification of column headers with a controlled vocabulary: leveraging LLMs for metadata enrichment Margherita Martorana et.al. 2403.00884 null
2024-03-01 SURE: SUrvey REcipes for building reliable and robust deep networks Yuting Li et.al. 2403.00543 link
2024-03-01 Invariant Test-Time Adaptation for Vision-Language Model Generalization Huan Ma et.al. 2403.00376 null
2024-02-29 TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision Yunyi Zhang et.al. 2403.00165 null
2024-02-29 Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance Huakun Shen et.al. 2402.19401 null
2024-02-29 Stitching Gaps: Fusing Situated Perceptual Knowledge with Vision Transformers for High-Level Image Classification Delfina Sol Martinez Pandiani et.al. 2402.19339 null
2024-02-29 Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction Hao Li et.al. 2402.19326 null
2024-02-29 Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation Fahimeh Hosseini Noohdani et.al. 2402.18919 null
2024-02-29 Utilizing Local Hierarchy with Adversarial Training for Hierarchical Text Classification Zihan Wang et.al. 2402.18825 link
2024-02-28 Comparing Importance Sampling Based Methods for Mitigating the Effect of Class Imbalance Indu Panigrahi et.al. 2402.18742 link
2024-02-28 Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains Hafiz Tiomoko Ali et.al. 2402.18614 null
2024-02-28 Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling Mahdi Karami et.al. 2402.18508 null
2024-02-28 Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization Deng Li et.al. 2402.18447 null
2024-02-29 A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation Francesco Barbato et.al. 2402.18402 null
2024-02-28 A Multimodal Handover Failure Detection Dataset and Baselines Santosh Thoduka et.al. 2402.18319 null
2024-02-28 Classes Are Not Equal: An Empirical Study on Image Recognition Fairness Jiequan Cui et.al. 2402.18133 null
2024-02-27 Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers Yiwei Lu et.al. 2402.17710 null
2024-02-27 SDF2Net: Shallow to Deep Feature Fusion Network for PolSAR Image Classification Mohammed Q. Alkhatib et.al. 2402.17672 link
2024-02-27 Predict the Next Word: Evgenia Ilia et.al. 2402.17527 null
2024-02-27 Scaling Supervised Local Learning with Augmented Auxiliary Networks Chenxiang Ma et.al. 2402.17318 link
2024-02-26 Offline Writer Identification Using Convolutional Neural Network Activation Features Vincent Christlein et.al. 2402.17029 null

Object Detection

Publish Date Title Authors PDF Code
2024-05-21 BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once Theodore Zhao et.al. 2405.12971 null
2024-05-21 AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection Zizhao Chen et.al. 2405.12944 link
2024-05-21 Predicting the Influence of Adverse Weather on Pedestrian Detection with Automotive Radar and Lidar Sensors Daniel Weihmayr et.al. 2405.12736 null
2024-05-21 Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text Yafu Li et.al. 2405.12689 null
2024-05-21 Automating Attendance Management in Human Resources: A Design Science Approach Using Computer Vision and Facial Recognition Bao-Thien Nguyen-Tat et.al. 2405.12633 null
2024-05-21 FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors Shuai Liu et.al. 2405.12601 link
2024-05-21 Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering Hiba Maryam et.al. 2405.12533 null
2024-05-21 Active Object Detection with Knowledge Aggregation and Distillation from Large Models Dejie Yang et.al. 2405.12509 null
2024-05-21 Mutual Information Analysis in Multimodal Learning Systems Hadi Hadizadeh et.al. 2405.12456 null
2024-05-20 Multi-View Attentive Contextualization for Multi-View 3D Object Detection Xianpeng Liu et.al. 2405.12200 null
2024-05-20 Bangladeshi Native Vehicle Detection in Wild Bipin Saha et.al. 2405.12150 link
2024-05-20 Salience-guided Ground Factor for Robust Localization of Delivery Robots in Complex Urban Environments Jooyong Park et.al. 2405.11855 null
2024-05-20 DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment Jianhong Han et.al. 2405.11765 link
2024-05-20 Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation Runou Yang et.al. 2405.11754 link
2024-05-19 FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention Ziang Guo et.al. 2405.11682 link
2024-05-19 SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization Jialong Guo et.al. 2405.11582 link
2024-05-19 The First Swahili Language Scene Text Detection and Recognition Dataset Fadila Wendigoundi Douamba et.al. 2405.11437 link
2024-05-18 InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images Wuzhou Li et.al. 2405.11293 null
2024-05-18 Visible and Clear: Finding Tiny Objects in Difference Map Bing Cao et.al. 2405.11276 null
2024-05-17 A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model Mingxiang Fu et.al. 2405.10890 null
2024-05-17 DeepPavlov at SemEval-2024 Task 8: Leveraging Transfer Learning for Detecting Boundaries of Machine-Generated Texts Anastasia Voznyuk et.al. 2405.10629 link
2024-05-17 DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection Zhe Huang et.al. 2405.10577 null
2024-05-16 Drone-type-Set: Drone types detection benchmark for drone detection and tracking Kholoud AlDosari et.al. 2405.10398 null
2024-05-16 Grounded 3D-LLM with Referent Tokens Yilun Chen et.al. 2405.10370 null
2024-05-16 Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection Tianhe Ren et.al. 2405.10300 link
2024-05-16 Towards Task-Compatible Compressible Representations Anderson de Andrade et.al. 2405.10244 link
2024-05-16 SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network Zhaoxu Li et.al. 2405.10148 null
2024-05-16 SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection Mingxuan Liu et.al. 2405.10053 null
2024-05-16 FPDIoU Loss: A Loss Function for Efficient Bounding Box Regression of Rotated Object Detection Siliang Ma et.al. 2405.09942 null
2024-05-16 Infrared Adversarial Car Stickers Xiaopei Zhu et.al. 2405.09924 null
2024-05-16 PillarNeXt: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features Xusheng Li et.al. 2405.09828 null
2024-05-16 Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection Feiran Li et.al. 2405.09782 link
2024-05-15 Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation Guo Yachan et.al. 2405.09682 null
2024-05-15 Dynamic Loss Decay based Robust Oriented Object Detection on Remote Sensing Images with Noisy Labels Guozhang Liu et.al. 2405.09024 null
2024-05-14 CLIP with Quality Captions: A Strong Pretraining for Vision Tasks Pavan Kumar Anasosalu Vasu et.al. 2405.08911 null
2024-05-14 Open-Vocabulary Object Detection via Neighboring Region Attention Alignment Sunyuan Qiang et.al. 2405.08593 null
2024-05-14 Semantic Contextualization of Face Forgery: A New Definition, Dataset, and Detection Method Mian Zou et.al. 2405.08487 null
2024-05-14 RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images Zong-Wei Hong et.al. 2405.08483 link
2024-05-14 Multimodal Collaboration Networks for Geospatial Vehicle Detection in Dense, Occluded, and Large-Scale Events Xin Wu et.al. 2405.08251 link
2024-05-13 RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors Liam Dugan et.al. 2405.07940 null
2024-05-13 oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving Abdul Hannan Khan et.al. 2405.07698 null
2024-05-13 MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders Xueying Jiang et.al. 2405.07696 null
2024-05-13 Quality-aware Selective Fusion Network for V-D-T Salient Object Detection Liuxin Bao et.al. 2405.07655 link
2024-05-13 Fast Training Data Acquisition for Object Detection and Segmentation using Black Screen Luminance Keying Thomas Pöllabauer et.al. 2405.07653 null
2024-05-13 Integrity Monitoring of 3D Object Detection in Automated Driving Systems using Raw Activation Patterns and Spatial Filtering Hakan Yekta Yatbaz et.al. 2405.07600 null
2024-05-13 Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection Dehong Kong et.al. 2405.07595 null
2024-05-13 Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis Tianci Bi et.al. 2405.07481 null
2024-05-13 Enhancing 3D Object Detection by Using Neural Network with Self-adaptive Thresholding Houze Liu et.al. 2405.07479 null
2024-05-12 MAML MOT: Multiple Object Tracking based on Meta-Learning Jiayi Chen et.al. 2405.07272 null
2024-05-10 How to Augment for Atmospheric Turbulence Effects on Thermal Adapted Object Detection Models? Engin Uzun et.al. 2405.06383 null
2024-05-10 Precise Apple Detection and Localization in Orchards using YOLOv5 for Robotic Harvesting Systems Jiang Ziyue et.al. 2405.06260 null
2024-05-09 CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks Nick et.al. 2405.05755 null
2024-05-09 Depth Awakens: A Depth-perceptual Attention Fusion Network for RGB-D Camouflaged Object Detection Xinran Liu et.al. 2405.05614 null
2024-05-09 The object detection model uses combined extraction with KNN and RF classification Florentina Tatrin Kurniati et.al. 2405.05551 null
2024-05-08 Reviewing Intelligent Cinematography: AI research for camera-based video production Adrian Azzarelli et.al. 2405.05039 null
2024-05-07 A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching Xianlei Long et.al. 2405.04589 null
2024-05-07 DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving Chen Min et.al. 2405.04390 null
2024-05-07 A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields Raiyan Rahman et.al. 2405.04305 null
2024-05-07 ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers Jinke Li et.al. 2405.04299 null
2024-05-07 Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore Junchao Wu et.al. 2405.04286 null
2024-05-07 Deep Event-based Object Detection in Autonomous Driving: A Survey Bingquan Zhou et.al. 2405.03995 null
2024-05-06 BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection Saket S. Chaturvedi et.al. 2405.03884 null
2024-05-06 RepVGG-GELAN: Enhanced GELAN with VGG-STYLE ConvNets for Brain Tumour Detection Thennarasi Balakrishnan et.al. 2405.03541 link
2024-05-06 Low-light Object Detection Pengpeng Li et.al. 2405.03519 null
2024-05-06 Salient Object Detection From Arbitrary Modalities Nianchang Huang et.al. 2405.03352 null
2024-05-06 Modality Prompts for Arbitrary Modality Salient Object Detection Nianchang Huang et.al. 2405.03351 null
2024-05-06 Vietnamese AI Generated Text Detection Quang-Dan Tran et.al. 2405.03206 null
2024-05-06 PTQ4SAM: Post-Training Quantization for Segment Anything Chengtao Lv et.al. 2405.03144 link
2024-05-05 Performance Evaluation of Real-Time Object Detection for Electric Scooters Dong Chen et.al. 2405.03039 link
2024-05-05 SalFAU-Net: Saliency Fusion Attention U-Net for Salient Object Detection Kassaw Abraham Mulat et.al. 2405.02906 null
2024-05-07 Adaptive Guidance Learning for Camouflaged Object Detection Zhennan Chen et.al. 2405.02824 null
2024-05-05 PVTransformer: Point-to-Voxel Transformer for Scalable 3D Object Detection Zhaoqi Leng et.al. 2405.02811 null
2024-05-02 Segmentation-Free Outcome Prediction in Head and Neck Cancer: Deep Learning-based Feature Extraction from Multi-Angle Maximum Intensity Projections (MA-MIPs) of PET Images Amirhosein Toosi et.al. 2405.01756 null
2024-05-02 PointCompress3D -- A Point Cloud Compression Framework for Roadside LiDARs in Intelligent Transportation Systems Walter Zimmer et.al. 2405.01750 null
2024-05-02 Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey Guoping Xu et.al. 2405.01725 link
2024-05-02 SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients Tushar Verma et.al. 2405.01699 null
2024-05-02 Imagine the Unseen: Occluded Pedestrian Detection via Adversarial Feature Completion Shanshan Zhang et.al. 2405.01311 null
2024-05-02 Overcoming LLM Challenges using RAG-Driven Precision in Coffee Leaf Disease Remediation Dr. Selva Kumar S et.al. 2405.01310 null
2024-05-02 Towards Consistent Object Detection via LiDAR-Camera Synergy Kai Luo et.al. 2405.01258 link
2024-05-02 Federated Learning with Heterogeneous Data Handling for Robust Vehicular Object Detection Ahmad Khalil et.al. 2405.01108 null
2024-05-01 Grains of Saliency: Optimizing Saliency-based Training of Biometric Attack Detection Models Colton R. Crum et.al. 2405.00650 null
2024-05-01 Object detection under the linear subspace model with application to cryo-EM images Amitay Eldar et.al. 2405.00364 null
2024-04-30 Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Yunhao Ge et.al. 2404.19752 null
2024-04-30 Quantifying Nematodes through Images: Datasets, Models, and Baselines of Deep Learning Zhipeng Yuan et.al. 2404.19748 null
2024-04-30 Masked Multi-Query Slot Attention for Unsupervised Object Discovery Rishav Pramanik et.al. 2404.19654 link
2024-04-30 Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World Wen Yin et.al. 2404.19417 null
2024-04-30 UniFS: Universal Few-shot Instance Perception with Point Representations Sheng Jin et.al. 2404.19401 null
2024-04-30 Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection Zhanwei Zhang et.al. 2404.19384 null
2024-04-30 Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank Sungjune Park et.al. 2404.19299 null
2024-04-29 MiPa: Mixed Patch Infrared-Visible Modality Agnostic Object Detection Heitor R. Medeiros et.al. 2404.18849 null
2024-04-29 Leveraging PointNet and PointNet++ for Lyft Point Cloud Classification Challenge Rajat K. Doshi et.al. 2404.18665 null
2024-04-29 CoSense3D: an Agent-based Efficient Learning Framework for Collective Perception Yunshuang Yuan et.al. 2404.18617 null
2024-04-29 Assessing Quality Metrics for Neural Reality Gap Input Mitigation in Autonomous Driving Testing Stefano Carlo Lambertenghi et.al. 2404.18577 null
2024-04-29 Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images Wenbin Guan et.al. 2404.18426 null
2024-04-29 Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles Mingi Jeong et.al. 2404.18411 null
2024-04-28 FAD-SAR: A Novel Fishing Activity Detection System via Synthetic Aperture Radar Images Based on Deep Learning Method Yanbing Bai et.al. 2404.18245 null
2024-04-28 RadSimReal: Bridging the Gap Between Synthetic and Real Data in Radar Object Detection With Simulation Oded Bialer et.al. 2404.18150 null
2024-04-27 Reliable Student: Addressing Noise in Semi-Supervised 3D Object Detection Farzad Nozarian et.al. 2404.17910 link
2024-04-27 A Hybrid Approach for Document Layout Analysis in Document images Tahira Shehzadi et.al. 2404.17888 null
2024-04-26 Inhomogeneous illuminated image enhancement under extremely low visibility condition Libang Chen et.al. 2404.17503 null
2024-04-26 Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection Moussa Kassem Sbeyti et.al. 2404.17427 null
2024-04-26 Enhancing mmWave Radar Point Cloud via Visual-inertial Supervision Cong Fan et.al. 2404.17229 null
2024-04-26 MorphText: Deep Morphology Regularized Arbitrary-shape Scene Text Detection Chengpei Xu et.al. 2404.17151 null
2024-04-25 Generating Minimalist Adversarial Perturbations to Test Object-Detection Models: An Adaptive Multi-Metric Evolutionary Search Approach Cristopher McIntyre-Garcia et.al. 2404.17020 link
2024-04-25 Constellation Dataset: Benchmarking High-Altitude Object Detection for an Urban Intersection Mehmet Kerem Turkcan et.al. 2404.16944 link
2024-04-25 Self-Balanced R-CNN for Instance Segmentation Leonardo Rossi et.al. 2404.16633 link
2024-04-25 Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System Daniel Dworak et.al. 2404.16548 null
2024-04-25 Commonsense Prototype for Outdoor Unsupervised 3D Object Detection Hai Wu et.al. 2404.16493 link
2024-04-25 IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks Zitong Huang et.al. 2404.16331 null
2024-04-25 CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions Haoyuan Li et.al. 2404.16302 link
2024-04-24 AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models Zhiqiang Tang et.al. 2404.16233 null
2024-04-24 Observational parameters of Blue Large-Amplitude Pulsators P. Pietrukowicz et.al. 2404.16089 null
2024-04-24 A Survey on Visual Mamba Hanwei Zhang et.al. 2404.15956 null
2024-04-24 Steal Now and Attack Later: Evaluating Robustness of Object Detection against Black-box Adversarial Attacks Erh-Chung Chen et.al. 2404.15881 null
2024-04-24 Revisiting Out-of-Distribution Detection in LiDAR-based 3D Object Detection Michael Kösel et.al. 2404.15879 link
2024-04-23 CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection Hongyi Cai et.al. 2404.15451 null
2024-04-23 ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning Weifeng Chen et.al. 2404.15449 null
2024-04-23 Source-free Domain Adaptation for Video Object Detection Under Adverse Image Conditions Xingguang Zhang et.al. 2404.15252 null
2024-04-23 Efficient Transformer Encoders for Mask2Former-style models Manyi Yao et.al. 2404.15244 null
2024-04-23 Gallbladder Cancer Detection in Ultrasound Images based on YOLO and Faster R-CNN Sara Dadjouy et.al. 2404.15129 null
2024-04-23 External Prompt Features Enhanced Parameter-efficient Fine-tuning for Salient Object Detection Wen Liang et.al. 2404.15008 null
2024-04-23 ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions Shounak Sural et.al. 2404.14780 null
2024-04-23 Unified Unsupervised Salient Object Detection via Knowledge Transfer Yao Yuan et.al. 2404.14759 link
2024-04-22 SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection Yuxia Wang et.al. 2404.14183 null
2024-04-22 Text in the Dark: Extremely Low-Light Text Image Enhancement Che-Tsung Lin et.al. 2404.14135 null
2024-04-22 CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective Wencheng Zhu et.al. 2404.14109 null
2024-04-22 Benchmarking Multi-Modal LLMs for Testing Visual Deep Learning Systems Through the Lens of Image Mutation Liwen Wang et.al. 2404.13945 null
2024-04-22 NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation Chi Huang et.al. 2404.13921 null
2024-04-22 TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos Atom Scott et.al. 2404.13868 null
2024-04-22 Toward Robust LiDAR based 3D Object Detection via Density-Aware Adaptive Thresholding Eunho Lee et.al. 2404.13852 null
2024-04-21 A Nasal Cytology Dataset for Object Detection and Deep Learning Mauro Camporeale et.al. 2404.13745 null
2024-04-23 Clio: Real-time Task-Driven Open-Set 3D Scene Graphs Dominic Maggio et.al. 2404.13696 null
2024-04-20 FisheyeDetNet: Object Detection on Fisheye Surround View Camera Systems for Automated Driving Ganesh Sistu et.al. 2404.13443 null
2024-04-19 A comparison between single-stage and two-stage 3D tracking algorithms for greenhouse robotics David Rapado-Rincon et.al. 2404.12963 null
2024-04-19 Language-Driven Active Learning for Diverse Open-Set 3D Object Detection Ross Greer et.al. 2404.12856 null
2024-04-19 ECOR: Explainable CLIP for Object Recognition Ali Rasekh et.al. 2404.12839 null
2024-04-19 A Point-Based Approach to Efficient LiDAR Multi-Task Perception Christopher Lang et.al. 2404.12798 null
2024-04-19 ELEV-VISION-SAM: Integrated Vision Language and Foundation Model for Automated Estimation of Building Lowest Floor Elevation Yu-Hsuan Ho et.al. 2404.12606 null
2024-04-18 The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models Cheng Shi et.al. 2404.11957 link
2024-04-18 Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition Xunsong Li et.al. 2404.11903 null
2024-04-17 TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation Thomas Monninger et.al. 2404.11803 null
2024-04-17 Multimodal 3D Object Detection on Unseen Domains Deepti Hegde et.al. 2404.11764 null
2024-04-17 Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection Deepti Hegde et.al. 2404.11737 null
2024-04-17 Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems Luca Bompani et.al. 2404.11488 link
2024-04-17 EcoMLS: A Self-Adaptation Approach for Architecting Green ML-Enabled Systems Meghana Tedla et.al. 2404.11411 null
2024-04-17 Detector Collapse: Backdooring Object Detection to Catastrophic Overload or Blindness Hangtao Zhang et.al. 2404.11357 null
2024-04-17 Simple In-place Data Augmentation for Surveillance Object Detection Munkh-Erdene Otgonbold et.al. 2404.11226 null
2024-04-17 Feature Corrective Transfer Learning: End-to-End Solutions to Object Detection in Non-Ideal Visual Conditions Chuheng Wei et.al. 2404.11214 null
2024-04-17 GhostNetV3: Exploring the Training Strategies for Compact Models Zhenhua Liu et.al. 2404.11202 null
2024-04-17 How to deal with glare for improved perception of Autonomous Vehicles Muhammad Z. Alam et.al. 2404.10992 null
2024-04-17 Leveraging 3D LiDAR Sensors to Enable Enhanced Urban Safety and Public Health: Pedestrian Monitoring and Abnormal Activity Detection Nawfal Guefrachi et.al. 2404.10978 null
2024-04-16 OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery Matthew Inkawhich et.al. 2404.10865 null
2024-04-16 Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark Jiangning Zhang et.al. 2404.10760 null
2024-04-16 Watch Your Step: Optimal Retrieval for Continual Learning at Scale Truman Hickok et.al. 2404.10758 null
2024-04-16 Efficient optimal dispersed Haar-like filters for face detection Zeinab Sedaghatjoo et.al. 2404.10476 null
2024-04-16 Camera clustering for scalable stream-based active distillation Dani Manjah et.al. 2404.10411 null
2024-04-15 Low-Light Image Enhancement Framework for Improved Object Detection in Fisheye Lens Datasets Dai Quoc Tran et.al. 2404.10078 link
2024-04-15 Explainable Light-Weight Deep Learning Pipeline for Improved Drought Stress Aswini Kumar Patra et.al. 2404.10073 null
2024-04-15 VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection Bonan Ding et.al. 2404.09431 null
2024-04-14 TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model Wiktor Mucha et.al. 2404.09254 null
2024-04-14 DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection Lewei Yao et.al. 2404.09216 null
2024-04-14 Coreset Selection for Object Detection Hojun Lee et.al. 2404.09161 null
2024-04-14 Fusion-Mamba for Cross-modality Object Detection Wenhao Dong et.al. 2404.09146 null
2024-04-13 The Snake's Beating Heart? A Millisecond Pulsar Binary in the Galactic Center Radio Filament G359.1-0.2 Marcus E. Lower et.al. 2404.09098 null
2024-04-13 BG-YOLO: A Bidirectional-Guided Method for Underwater Object Detection Jian Zhang et.al. 2404.08979 null
2024-04-13 Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage Yang Hu et.al. 2404.08936 null
2024-04-12 Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation Yanhao Zheng et.al. 2404.08603 link
2024-04-12 FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation Riza Velioglu et.al. 2404.08582 null
2024-04-12 Analyzing Decades-Long Environmental Changes in Namibia Using Archival Aerial Photography and Deep Learning Girmaw Abebe Tadesse et.al. 2404.08544 null
2024-04-12 MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion Zhe Li et.al. 2404.08406 null
2024-04-12 Overcoming Scene Context Constraints for Object Detection in wild using Defilters Vamshi Krishna Kancharla et.al. 2404.08293 null
2024-04-11 ConsistencyDet: Robust Object Detector with Denoising Paradigm of Consistency Model Lifan Jiang et.al. 2404.07773 null
2024-04-11 Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification Ricardo Pereira et.al. 2404.07739 null
2024-04-11 Run-time Monitoring of 3D Object Detection in Automated Driving Systems Using Early Layer Neural Activation Patterns Hakan Yekta Yatbaz et.al. 2404.07685 null
2024-04-11 Finding Dino: A plug-and-play framework for unsupervised detection of out-of-distribution objects using prototypes Poulami Sinhamahapatra et.al. 2404.07664 null
2024-04-11 Separated Attention: An Improved Cycle GAN Based Under Water Image Enhancement Method Tashmoy Ghosh et.al. 2404.07649 null
2024-04-11 GLID: Pre-training a Generalist Encoder-Decoder Vision Model Jihao Liu et.al. 2404.07603 null
2024-04-11 SFSORT: Scene Features-based Simple Online Real-Time Tracker M. M. Morsali et.al. 2404.07553 link
2024-04-11 The Sydney Radio Star Catalogue: properties of radio stars at megahertz to gigahertz frequencies Laura N. Driessen et.al. 2404.07418 null
2024-04-11 Simplifying Two-Stage Detectors for On-Device Inference in Remote Sensing Jaemin Kang et.al. 2404.07405 null
2024-04-11 A fine-tuning workflow for automatic first-break picking with deep learning Amir Mardan et.al. 2404.07400 link
2024-04-10 Identification of Fine-grained Systematic Errors via Controlled Scene Generation Valentyn Boreiko et.al. 2404.07045 null
2024-04-10 Accurate Tennis Court Line Detection on Amateur Recorded Matches Sameer Agrawal et.al. 2404.06977 null
2024-04-10 SARA: Smart AI Reading Assistant for Reading Comprehension Enkeleda Thaqi et.al. 2404.06906 null
2024-04-10 Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data Aakash Kumar et.al. 2404.06715 null
2024-04-10 Scaling Multi-Camera 3D Object Detection through Weak-to-Strong Eliciting Hao Lu et.al. 2404.06700 link
2024-04-09 Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping Anas Gouda et.al. 2404.06277 null
2024-04-09 Label-Efficient 3D Object Detection For Road-Side Units Minh-Quan Dao et.al. 2404.06256 null
2024-04-09 Automatic Defect Detection in Sewer Network Using Deep Learning Based Object Detector Bach Ha et.al. 2404.06219 null
2024-04-09 YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images Chenguang Liu et.al. 2404.06180 null
2024-04-09 Enhanced Radar Perception via Multi-Task Learning: Towards Refined Data for Sensor Fusion Applications Huawei Sun et.al. 2404.06165 null
2024-04-09 Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation Zong-Wei Hong et.al. 2404.06029 null
2024-04-08 Retrieval-Augmented Open-Vocabulary Object Detection Jooyeon Kim et.al. 2404.05687 link
2024-04-08 3D-COCO: extension of MS-COCO dataset for image detection and 3D reconstruction modules Maxence Bideaux et.al. 2404.05641 null
2024-04-08 PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text? Kseniia Petukhova et.al. 2404.05483 null
2024-04-08 Detecting Every Object from Events Haitian Zhang et.al. 2404.05285 link
2024-04-08 MOSE: Boosting Vision-based Roadside 3D Object Detection with Scene Cues Xiahan Chen et.al. 2404.05280 null
2024-04-08 Rendering-Enhanced Automatic Image-to-Point Cloud Registration for Roadside Scenes Yu Sheng et.al. 2404.05164 null
2024-04-08 Better Monocular 3D Detectors with LiDAR from the Past Yurong You et.al. 2404.05139 link
2024-04-07 AirShot: Efficient Few-Shot Detection for Autonomous Exploration Zihan Wang et.al. 2404.05069 link
2024-04-07 PlateSegFL: A Privacy-Preserving License Plate Detection Using Federated Segmentation Learning Md. Shahriar Rahman Anuvab et.al. 2404.05049 null
2024-04-07 PathFinder: Attention-Driven Dynamic Non-Line-of-Sight Tracking with a Mobile Robot Shenbagaraj Kannapiran et.al. 2404.05024 null
2024-04-05 SCAResNet: A ResNet Variant Optimized for Tiny Object Detection in Transmission and Distribution Towers Weile Li et.al. 2404.04179 link
2024-04-05 Designing Robots to Help Women Martin Cooney et.al. 2404.04123 null
2024-04-04 Is CLIP the main roadblock for fine-grained open-world perception? Lorenzo Bianchi et.al. 2404.03539 link
2024-04-04 DQ-DETR: DETR with Dynamic Query for Tiny Object Detection Yi-Xin Huang et.al. 2404.03507 null
2024-04-05 A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data Iqra Bano et.al. 2404.03493 null
2024-04-04 MonoCD: Monocular 3D Object Detection with Complementary Depths Longfei Yan et.al. 2404.03181 link
2024-04-03 DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection Felix Fent et.al. 2404.03015 null
2024-04-03 ALOHa: A New Measure for Hallucination in Captioning Models Suzanne Petryk et.al. 2404.02904 null
2024-04-03 FlightScope: A Deep Comprehensive Assessment of Aircraft Detection Algorithms in Satellite Imagery Safouane El Ghazouali et.al. 2404.02877 link
2024-04-03 HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras Zhongyu Xia et.al. 2404.02517 link
2024-04-04 TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression Ho-Joong Kim et.al. 2404.02405 null
2024-04-04 EGTR: Extracting Graph from Transformer for Scene Graph Generation Jinbae Im et.al. 2404.02072 link
2024-04-03 Cooperative Students: Navigating Unsupervised Domain Adaptation in Nighttime Object Detection Jicheng Yuan et.al. 2404.01988 link
2024-04-02 Towards Enhanced Analysis of Lung Cancer Lesions in EBUS-TBNA -- A Semi-Supervised Video Object Detection Method Jyun-An Lin et.al. 2404.01929 null
2024-04-02 Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack Ying Zhou et.al. 2404.01907 link
2024-04-02 Scene Adaptive Sparse Transformer for Event-based Object Detection Yansong Peng et.al. 2404.01882 link
2024-04-02 Semi-Supervised Domain Adaptation for Wildfire Detection JooYoung Jang et.al. 2404.01842 null
2024-04-02 Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection Tahira Shehzadi et.al. 2404.01819 null
2024-04-02 Analyzing the Single Event Upset Vulnerability of Binarized Neural Networks on SRAM FPGAs Ioanna Souvatzoglou et.al. 2404.01757 null
2024-04-02 Disentangled Pre-training for Human-Object Interaction Detection Zhuolong Li et.al. 2404.01725 null
2024-04-02 Task Integration Distillation for Object Detectors Hai Su et.al. 2404.01699 null
2024-03-29 PLoc: A New Evaluation Criterion Based on Physical Location for Autonomous Driving Datasets Ruining Yang et.al. 2403.19893 null
2024-03-29 MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection Ali Behrouz et.al. 2403.19888 null
2024-03-28 DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs Donghyun Kim et.al. 2403.19588 link
2024-03-28 OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation Zhenyu Wang et.al. 2403.19580 null
2024-03-28 AIpom at SemEval-2024 Task 8: Detecting AI-produced Outputs in M4 Alexander Shirnin et.al. 2403.19354 null
2024-03-28 Sparse Generation: Making Pseudo Labels Sparse for weakly supervision with points Tian Ma et.al. 2403.19306 null
2024-03-28 CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection Mikhail Kennerley et.al. 2403.19278 link
2024-03-28 Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art Exploration Louie Søs Meyer et.al. 2403.19174 null
2024-03-28 CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation Lingjun Zhao et.al. 2403.19104 null
2024-03-28 A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement Junjie Wen et.al. 2403.19079 null
2024-03-27 Illicit object detection in X-ray images using Vision Transformers Jorgen Cani et.al. 2403.19043 null
2024-03-27 Benchmarking Object Detectors with COCO: A New Path Forward Shweta Singh et.al. 2403.18819 link
2024-03-27 PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Ehsan Latif et.al. 2403.18721 null
2024-03-27 CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection Jiayi Zhu et.al. 2403.18554 null
2024-03-27 BAM: Box Abstraction Monitors for Real-time OoD Detection in Object Detection Changshun Wu et.al. 2403.18373 null
2024-03-27 Ship in Sight: Diffusion Models for Ship-Image Super Resolution Luigi Sigillo et.al. 2403.18370 link
2024-03-27 DODA: Diffusion for Object-detection Domain Adaptation in Agriculture Shuai Xiang et.al. 2403.18334 null
2024-03-27 Tracking-Assisted Object Detection with Event Cameras Ting-Kang Yen et.al. 2403.18330 null
2024-03-27 SGDM: Static-Guided Dynamic Module Make Stronger Visual Models Wenjie Xing et.al. 2403.18282 null
2024-03-27 Road Obstacle Detection based on Unknown Objectness Scores Chihiro Noguchi et.al. 2403.18207 null
2024-03-26 State of the art applications of deep learning within tracking and detecting marine debris: A survey Zoe Moorton et.al. 2403.18067 null
2024-03-26 The Solution for the CVPR 2023 1st foundation model challenge-Track2 Haonan Xu et.al. 2403.17702 null
2024-03-26 PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition Chenhongyi Yang et.al. 2403.17695 link
2024-03-26 UADA3D: Unsupervised Adversarial Domain Adaptation for 3D Object Detection with Sparse LiDAR and Large Domain Gaps Maciej K Wozniak et.al. 2403.17633 null
2024-03-26 SSF3D: Strict Semi-Supervised 3D Object Detection with Switching Filter Songbur Wong et.al. 2403.17390 null
2024-03-26 Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection Jiacheng Zhang et.al. 2403.17387 null
2024-03-26 AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving Mingfu Liang et.al. 2403.17373 null
2024-03-26 Staircase Localization for Autonomous Exploration in Urban Environments Jinrae Kim et.al. 2403.17330 null
2024-03-25 Co-Occurring of Object Detection and Identification towards unlabeled object discovery Binay Kumar Singh et.al. 2403.17223 null
2024-03-25 Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions Ye Li et.al. 2403.17009 link
2024-03-25 Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance Jingyuan Zhu et.al. 2403.16954 null
2024-03-25 TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques Ashok Urlana et.al. 2403.16592 null
2024-03-25 RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection Zhiwei Lin et.al. 2403.16440 link
2024-03-25 ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation Hannah Schieber et.al. 2403.16400 null
2024-03-25 Impact of Video Compression Artifacts on Fisheye Camera Visual Perception Tasks Madhumitha Sakthi et.al. 2403.16338 null
2024-03-24 Cross-domain Multi-modal Few-shot Object Detection via Rich Text Zeyu Shangguan et.al. 2403.16188 null
2024-03-24 Semantic Is Enough: Only Semantic Information For NeRF Reconstruction Ruibo Wang et.al. 2403.16043 null
2024-03-23 Adversarial Defense Teacher for Cross-Domain Object Detection under Poor Visibility Conditions Kaiwen Wang et.al. 2403.15786 null
2024-03-23 EAGLE: A Domain Generalization Framework for AI-generated Text Detection Amrita Bhattacharjee et.al. 2403.15690 null
2024-03-25 Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection Hongzhi Gao et.al. 2403.15317 null
2024-03-22 CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking Nicolas Baumann et.al. 2403.15313 null
2024-03-22 IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection Junbo Yin et.al. 2403.15241 null
2024-03-22 MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection Taeheon Kim et.al. 2403.15209 null
2024-03-22 SFOD: Spiking Fusion Object Detector Yimeng Fan et.al. 2403.15192 link
2024-03-22 CRPlace: Camera-Radar Fusion with BEV Representation for Place Recognition Shaowei Fu et.al. 2403.15183 null
2024-03-22 An In-Depth Analysis of Data Reduction Methods for Sustainable Deep Learning Víctor Toscano-Durán et.al. 2403.15150 null
2024-03-22 Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection Jiaming Li et.al. 2403.15127 link
2024-03-22 VRSO: Visual-Centric Reconstruction for Static Object Annotation Chenyao Yu et.al. 2403.15026 null
2024-03-22 Vehicle Detection Performance in Nordic Region Hamam Mokayed et.al. 2403.15017 null
2024-03-21 T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy Qing Jiang et.al. 2403.14610 link
2024-03-21 UAV-Assisted Maritime Search and Rescue: A Holistic Approach Martin Messmer et.al. 2403.14281 null
2024-03-21 Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection Tim Salzmann et.al. 2403.14270 null
2024-03-21 3D Object Detection from Point Cloud via Voting Step Diffusion Haoran Hou et.al. 2403.14133 null
2024-03-20 EcoSense: Energy-Efficient Intelligent Sensing for In-Shore Ship Detection through Edge-Cloud Collaboration Wenjun Huang et.al. 2403.14027 null
2024-03-20 RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition Ziyu Liu et.al. 2403.13805 link
2024-03-20 Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments Yang Yang et.al. 2403.13803 link
2024-03-20 Fostc3net: A Lightweight YOLOv5 Based On the Network Structure Optimization Danqing Ma et.al. 2403.13703 null
2024-03-20 Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments Djamahl Etchegaray et.al. 2403.13556 null
2024-03-20 MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining Di Wang et.al. 2403.13430 link
2024-03-20 Few-shot Oriented Object Detection with Memorable Contrastive Learning in Remote Sensing Images Jiawei Zhou et.al. 2403.13375 null
2024-03-20 Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection Zhixin Lai et.al. 2403.13335 null
2024-03-20 DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception Yibo Wang et.al. 2403.13304 null
2024-03-20 Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models Huachuan Qiu et.al. 2403.13250 null
2024-03-19 SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model Armen Avetisyan et.al. 2403.13064 null
2024-03-19 Wildfire danger prediction optimization with transfer learning Spiros Maggioros et.al. 2403.12871 link
2024-03-19 As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks? Anjun Hu et.al. 2403.12693 null
2024-03-19 EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks Ziming Wang et.al. 2403.12574 null
2024-03-19 DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM Yixuan Wu et.al. 2403.12488 null
2024-03-19 TransformMix: Learning Transformation and Mixing Strategies from Data Tsz-Him Cheung et.al. 2403.12429 null
2024-03-19 VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation Hao Wang et.al. 2403.12415 null
2024-03-19 Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition Jielin Qiu et.al. 2403.12339 null
2024-03-18 EffiPerception: an Efficient Framework for Various Perception Tasks Xinhao Xiang et.al. 2403.12317 null
2024-03-18 Prototipo de un Contador Bidireccional Automático de Personas basado en sensores de visión 3D [Prototype of an Automatic Bidirectional People Counter Based on 3D Vision Sensors] Benjamín Ojeda-Magaña et.al. 2403.12310 null
2024-03-18 Align and Distill: Unifying and Improving Domain Adaptive Object Detection Justin Kay et.al. 2403.12029 link
2024-03-18 TrajectoryNAS: A Neural Architecture Search for Trajectory Prediction Ali Asghar Sharifi et.al. 2403.11695 null
2024-03-18 Just Add $100 More: Augmenting NeRF-based Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem Mincheol Chang et.al. 2403.11573 null
2024-03-18 R2SNet: Scalable Domain Adaptation for Object Detection in Cloud-Based Robots Ecosystems via Proposal Refinement Michele Antonazzi et.al. 2403.11567 null
2024-03-18 Continual Forgetting for Pre-trained Vision Models Hongbo Zhao et.al. 2403.11530 link
2024-03-17 V2X-DGW: Domain Generalization for Multi-agent Perception under Adverse Weather Conditions Baolu Li et.al. 2403.11371 null
2024-03-17 Advanced Knowledge Extraction of Physical Design Drawings, Translation and conversion to CAD formats using Deep Learning Jesher Joshua M et.al. 2403.11291 null
2024-03-17 ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models Siyuan Huang et.al. 2403.11289 null
2024-03-17 CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations Yuwei Zhang et.al. 2403.11220 link
2024-03-17 GRA: Detecting Oriented Objects through Group-wise Rotating and Attention Jiangshan Wang et.al. 2403.11127 null
2024-03-17 Self-supervised co-salient object detection via feature correspondence at multiple scales Souradeep Chakraborty et.al. 2403.11107 link
2024-03-14 Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization Zhao Wang et.al. 2403.09433 null
2024-03-14 D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection Dinh Phat Do et.al. 2403.09359 link
2024-03-14 Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring Yufei Zhan et.al. 2403.09333 link
2024-03-14 EfficientMFD: Towards More Efficient Multimodal Synchronous Fusion Detection Jiaqing Zhang et.al. 2403.09323 link
2024-03-14 Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection Martin Aubard et.al. 2403.09313 link
2024-03-14 MOTPose: Multi-object 6D Pose Estimation for Dynamic Video Sequences using Attention-based Temporal Fusion Arul Selvam Periyasamy et.al. 2403.09309 null
2024-03-14 CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification Yiming Ma et.al. 2403.09281 null
2024-03-14 D-YOLO a robust framework for object detection in adverse weather conditions Zihan Chu et.al. 2403.09233 null
2024-03-14 Improving Distant 3D Object Detection Using 2D Box Supervision Zetong Yang et.al. 2403.09230 null
2024-03-14 PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest Jiajun Deng et.al. 2403.09212 null
2024-03-13 VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis Enric Corona et.al. 2403.08764 null
2024-03-13 MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning Jialv Zou et.al. 2403.08760 link
2024-03-13 Data Augmentation in Human-Centric Vision Wentao Jiang et.al. 2403.08650 null
2024-03-13 PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections Matteo Taiana et.al. 2403.08586 null
2024-03-13 A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product Ao Xiang et.al. 2403.08511 null
2024-03-13 Improved YOLOv5 Based on Attention Mechanism and FasterNet for Foreign Object Detection on Railway and Airway tracks Zongqing Qi et.al. 2403.08499 null
2024-03-13 IAMCV Multi-Scenario Vehicle Interaction Dataset Novel Certad et.al. 2403.08455 null
2024-03-13 Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks Khondoker Murad Hossain et.al. 2403.08208 null
2024-03-12 TaskCLIP: Extend Large Vision-Language Model for Task Oriented Object Detection Hanning Chen et.al. 2403.08108 null
2024-03-12 Aedes aegypti Egg Counting with Neural Networks for Object Detection Micheli Nayara de Oliveira Vicente et.al. 2403.08016 null
2024-03-12 Mondrian: On-Device High-Performance Video Analytics with Compressive Packed Inference Changmin Jeon et.al. 2403.07598 null
2024-03-12 PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution Honghao Chen et.al. 2403.07589 null
2024-03-12 A Survey of Vision Transformers in Autonomous Driving: Current Trends and Future Directions Quoc-Vinh Lai-Dang et.al. 2403.07542 null
2024-03-12 JSTR: Joint Spatio-Temporal Reasoning for Event-based Moving Object Detection Hanyu Zhou et.al. 2403.07436 null
2024-03-12 Eliminating Cross-modal Conflicts in BEV Space for LiDAR-Camera 3D Object Detection Jiahui Fu et.al. 2403.07372 null
2024-03-12 GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method Zubair Qazi et.al. 2403.07321 link
2024-03-12 MENTOR: Multilingual tExt detectioN TOward leaRning by analogy Hsin-Ju Lin et.al. 2403.07286 null
2024-03-12 SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection Hongcheng Zhang et.al. 2403.07284 null
2024-03-12 Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction Alexander Timans et.al. 2403.07263 null
2024-03-11 Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies Nieves Crasto et.al. 2403.07113 link
2024-03-11 Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head Tiancheng Zhao et.al. 2403.06892 null
2024-03-11 LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations Mohammad Alkhalefi et.al. 2403.06813 null
2024-03-11 Genetic Learning for Designing Sim-to-Real Data Augmentations Bram Vanherle et.al. 2403.06786 null
2024-03-11 Evaluating the Energy Efficiency of Few-Shot Learning for Object Detection in Industrial Settings Georgios Tsoumplekas et.al. 2403.06631 null
2024-03-11 Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers Alexander H. Berger et.al. 2403.06601 null
2024-03-11 SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection Yuxuan Li et.al. 2403.06534 link
2024-03-11 3D Semantic Segmentation-Driven Representations for 3D Object Detection Hayeon O et.al. 2403.06501 null
2024-03-11 Fine-Grained Pillar Feature Encoding Via Spatio-Temporal Virtual Grid for 3D Object Detection Konyul Park et.al. 2403.06433 null
2024-03-10 Transformer based Multitask Learning for Image Captioning and Object Detection Debolena Basak et.al. 2403.06292 null
2024-03-10 Poly Kernel Inception Network for Remote Sensing Detection Xinhao Cai et.al. 2403.06258 link
2024-03-08 EVD4UAV: An Altitude-Sensitive Benchmark to Evade Vehicle Detection in UAV Huiming Sun et.al. 2403.05422 null
2024-03-08 SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection Yahao Lu et.al. 2403.05416 link
2024-03-08 Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery Xavier Bou et.al. 2403.05381 null
2024-03-08 Frequency-Adaptive Dilated Convolution for Semantic Segmentation Linwei Chen et.al. 2403.05369 link
2024-03-08 VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model Junsu Kim et.al. 2403.05346 null
2024-03-08 Improving the Successful Robotic Grasp Detection Using Convolutional Neural Networks Hamed Hosseini et.al. 2403.05211 null
2024-03-08 LanePtrNet: Revisiting Lane Detection as Point Voting and Grouping on Curves Jiayan Cao et.al. 2403.05155 null
2024-03-08 RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features Geonho Bang et.al. 2403.05061 null
2024-03-08 ActFormer: Scalable Collaborative Perception via Active Queries Suozhi Huang et.al. 2403.04968 null
2024-03-07 FriendNet: Detection-Friendly Dehazing Network Yihua Fan et.al. 2403.04443 null
2024-03-07 Effectiveness Assessment of Recent Large Vision-Language Models Yao Jiang et.al. 2403.04306 null
2024-03-07 ACC-ViT : Atrous Convolution's Comeback in Vision Transformers Nabil Ibtehaz et.al. 2403.04200 null
2024-03-07 CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images Guanlin Shen et.al. 2403.04198 null
2024-03-07 Scalable and Robust Transformer Decoders for Interpretable Image Classification with Foundation Models Evelyn Mannix et.al. 2403.04125 null
2024-03-07 CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection Gyusam Chang et.al. 2403.03721 null
2024-03-06 Adversarial Infrared Geometry: Using Geometry to Perform Adversarial Attack against Infrared Pedestrian Detectors Kalibinuer Tiliwalidi et.al. 2403.03674 null
2024-03-06 Towards Detecting AI-Generated Text within Human-AI Collaborative Hybrid Texts Zijie Zeng et.al. 2403.03506 null
2024-03-06 Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator Wonhyeok Choi et.al. 2403.03468 null
2024-03-06 FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided Diffusion Hao Wang et.al. 2403.03463 null
2024-03-06 Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection Jiajia Li et.al. 2403.03390 link
2024-03-05 Detecting Concrete Visual Tokens for Multimodal Machine Translation Braeden Bowen et.al. 2403.03075 null
2024-03-05 Loss Design for Single-carrier Joint Communication and Neural Network-based Sensing Charlotte Muth et.al. 2403.02929 null
2024-03-05 Are Dense Labels Always Necessary for 3D Object Detection from Point Cloud? Chenqiang Gao et.al. 2403.02818 null
2024-03-05 Bootstrapping Rare Object Detection in High-Resolution Satellite Imagery Akram Zaytar et.al. 2403.02736 null
2024-03-05 FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View Jiawei Hou et.al. 2403.02710 null
2024-03-05 False Positive Sampling-based Data Augmentation for Enhanced 3D Object Detection Accuracy Jiyong Oh et.al. 2403.02639 null
2024-03-05 BSDP: Brain-inspired Streaming Dual-level Perturbations for Online Open World Object Detection Yu Chen et.al. 2403.02637 null
2024-03-04 NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function Abdullah Nazhat Abdullah et.al. 2403.02411 link
2024-03-04 COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks Zijian Huang et.al. 2403.02329 null
2024-03-04 Scalable Vision-Based 3D Object Detection and Monocular Depth Estimation for Autonomous Driving Yuxuan Liu et.al. 2403.02037 link
2024-03-02 TUMTraf V2X Cooperative Perception Dataset Walter Zimmer et.al. 2403.01316 null
2024-03-02 Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection Taeheon Kim et.al. 2403.01300 null
2024-03-02 Run-time Introspection of 2D Object Detection in Automated Driving Systems Using Learning Representations Hakan Yekta Yatbaz et.al. 2403.01172 null
2024-03-02 ELA: Efficient Local Attention for Deep Convolutional Neural Networks Wei Xu et.al. 2403.01123 null
2024-03-02 Face Swap via Diffusion Model Feifei Wang et.al. 2403.01108 null
2024-03-02 Beyond Night Visibility: Adaptive Multi-Scale Fusion of Infrared and Visible Images Shufan Pei et.al. 2403.01083 null
2024-03-01 Learning Causal Features for Incremental Object Detection Zhenwei He et.al. 2403.00591 null
2024-03-01 Abductive Ego-View Accident Video Understanding for Safe Driving Perception Jianwu Fang et.al. 2403.00436 null
2024-03-04 DAMS-DETR: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion Junjie Guo et.al. 2403.00326 null
2024-03-01 ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting Chen Duan et.al. 2403.00303 null
2024-02-29 SeMoLi: What Moves Together Belongs Together Jenny Seidenschwarz et.al. 2402.19463 null
2024-02-29 Genie: Smart ROS-based Caching for Connected Autonomous Robots Zexin Li et.al. 2402.19410 null
2024-02-29 ProtoP-OD: Explainable Object Detection with Prototypical Parts Pavlos Rath-Manakidis et.al. 2402.19142 null
2024-02-29 Theoretically Achieving Continuous Representation of Oriented Bounding Boxes Zikai Xiao et.al. 2402.18975 link
2024-02-29 Boosting Semi-Supervised Object Detection in Remote Sensing Images With Active Teaching Boxuan Zhang et.al. 2402.18958 null
2024-02-29 Edge Computing Enabled Real-Time Video Analysis via Adaptive Spatial-Temporal Semantic Filtering Xiang Chen et.al. 2402.18927 null
2024-02-29 A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection Chao Hao et.al. 2402.18922 null
2024-02-29 Privacy-Preserving Autoencoder for Collaborative Object Detection Bardia Azizian et.al. 2402.18864 null
2024-02-29 Debiased Novel Category Discovering and Localization Juexiao Feng et.al. 2402.18821 null
2024-02-28 Spatial Coherence Loss for Salient and Camouflaged Object Detection and Beyond Ziyun Yang et.al. 2402.18698 null
2024-02-28 UniMODE: Unified Monocular 3D Object Detection Zhuoling Li et.al. 2402.18573 null
2024-02-28 Detection of Micromobility Vehicles in Urban Traffic Videos Khalil Sabri et.al. 2402.18503 link
2024-02-28 Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection Xun Huang et.al. 2402.18493 null
2024-02-28 Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization Deng Li et.al. 2402.18447 null
2024-02-28 Unveiling novel insights into Kirchhoff migration for effective object detection using experimental Fresnel dataset Won-Kwang Park et.al. 2402.18322 null
2024-02-28 Zero-Shot Aerial Object Detection with Visual Description Regularization Zhengqing Zang et.al. 2402.18233 null
2024-02-28 VulMCI : Code Splicing-based Pixel-row Oversampling for More Continuous Vulnerability Image Generation Tao Peng et.al. 2402.18189 null
2024-02-27 SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection Junsu Kim et.al. 2402.17323 null
2024-02-27 A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge -- Multi-Task Robustness Track Zehui Chen et.al. 2402.17319 null
2024-02-27 Probing Multimodal Large Language Models for Global and Local Semantic Representation Mingxu Tao et.al. 2402.17304 null


Semantic Segmentation

Publish Date Title Authors PDF Code
2024-05-21 Transparency Distortion Robustness for SOTA Image Segmentation Tasks Volker Knauthe et.al. 2405.12864 null
2024-05-20 A comprehensive overview of deep learning techniques for 3D point cloud classification and semantic segmentation Sushmita Sarker et.al. 2405.11903 null
2024-05-20 Salience-guided Ground Factor for Robust Localization of Delivery Robots in Complex Urban Environments Jooyong Park et.al. 2405.11855 null
2024-05-20 Improving the Explain-Any-Concept by Introducing Nonlinearity to the Trainable Surrogate Model Mounes Zaval et.al. 2405.11837 null
2024-05-20 Universal Organizer of SAM for Unsupervised Semantic Segmentation Tingting Li et.al. 2405.11742 null
2024-05-19 Interpreting a Semantic Segmentation Model for Coastline Detection Conor O'Sullivan et.al. 2405.11500 null
2024-05-19 Unifying 3D Vision-Language Understanding via Promptable Queries Ziyu Zhu et.al. 2405.11442 null
2024-05-18 PS6D: Point Cloud Based Symmetry-Aware 6D Object Pose Estimation in Robot Bin-Picking Yifan Yang et.al. 2405.11257 null
2024-05-17 CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation Mushui Liu et.al. 2405.10530 link
2024-05-16 4D Panoptic Scene Graph Generation Jingkang Yang et.al. 2405.10305 link
2024-05-16 Towards Task-Compatible Compressible Representations Anderson de Andrade et.al. 2405.10244 link
2024-05-16 DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data Chengxiang Fan et.al. 2405.10185 link
2024-05-16 An Integrated Framework for Multi-Granular Explanation of Video Summarization Konstantinos Tsigos et.al. 2405.10082 null
2024-05-16 A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance Andrea Matteazzi et.al. 2405.10046 null
2024-05-16 Towards Realistic Incremental Scenario in Class Incremental Semantic Segmentation Jihwan Kwak et.al. 2405.09858 null
2024-05-15 Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation Guo Yachan et.al. 2405.09682 null
2024-05-14 CLIP with Quality Captions: A Strong Pretraining for Vision Tasks Pavan Kumar Anasosalu Vasu et.al. 2405.08911 null
2024-05-14 Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study Qinfeng Zhu et.al. 2405.08493 null
2024-05-14 TEDNet: Twin Encoder Decoder Neural Network for 2D Camera and LiDAR Road Detection Martín Bayón-Gutiérrez et.al. 2405.08429 link
2024-05-13 IMAFD: An Interpretable Multi-stage Approach to Flood Detection from time series Multispectral Data Ziyang Zhang et.al. 2405.07916 null
2024-05-13 PLUTO: Pathology-Universal Transformer Dinkar Juyal et.al. 2405.07905 null
2024-05-12 PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification Mohammad Shafiul Alam et.al. 2405.07332 link
2024-05-12 Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception Haoming Chen et.al. 2405.07201 null
2024-05-11 Global Motion Understanding in Large-Scale Video Object Segmentation Volodymyr Fedynyak et.al. 2405.07031 null
2024-05-10 GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs Mustafa Munir et.al. 2405.06849 link
2024-05-10 Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach Elham Ravanbakhsh et.al. 2405.06586 null
2024-05-10 Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation Xiaowen Ma et.al. 2405.06525 link
2024-05-10 Multi-Target Unsupervised Domain Adaptation for Semantic Segmentation without External Data Yonghao Xu et.al. 2405.06502 null
2024-05-10 Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data Rongyu Zhang et.al. 2405.06413 null
2024-05-10 Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation Zhenliang Ni et.al. 2405.06228 link
2024-05-10 Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change Detection Koji Takeda et.al. 2405.06185 null
2024-05-10 Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase Imaging Zhuchen Shao et.al. 2405.06175 null
2024-05-09 Mask-TS Net: Mask Temperature Scaling Uncertainty Calibration for Polyp Segmentation Yudian Zhang et.al. 2405.05830 null
2024-05-09 CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks Nick et.al. 2405.05755 null
2024-05-08 OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies Lingdong Kong et.al. 2405.05259 link
2024-05-08 Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving Lingdong Kong et.al. 2405.05258 link
2024-05-08 Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual Information Qi Lai et.al. 2405.04913 null
2024-05-08 DeepDamageNet: A two-step deep-learning model for multi-disaster building damage segmentation and classification using satellite imagery Irene Alisjahbana et.al. 2405.04800 null
2024-05-07 A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat Images László Kopácsi et.al. 2405.04650 null
2024-05-07 FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes Charles Gaydon et.al. 2405.04634 link
2024-05-07 AugmenTory: A Fast and Flexible Polygon Augmentation Library Tanaz Ghahremani et.al. 2405.04442 null
2024-05-07 A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields Raiyan Rahman et.al. 2405.04305 null
2024-05-07 ELiTe: Efficient Image-to-LiDAR Knowledge Transfer for Semantic Segmentation Zhibo Zhang et.al. 2405.04121 null
2024-05-07 Structured Click Control in Transformer-based Interactive Segmentation Long Xu et.al. 2405.04009 link
2024-05-06 PTQ4SAM: Post-Training Quantization for Segment Anything Chengtao Lv et.al. 2405.03144 link
2024-05-04 MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning Vishal Nedungadi et.al. 2405.02771 null
2024-05-04 Few-Shot Fruit Segmentation via Transfer Learning Jordan A. James et.al. 2405.02556 null
2024-05-03 Panoptic-SLAM: Visual SLAM in Dynamic Environments using Panoptic Segmentation Gabriel Fischer Abati et.al. 2405.02177 null
2024-05-03 Towards general deep-learning-based tree instance segmentation models Jonathan Henrich et.al. 2405.02061 null
2024-05-03 DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model Peijin Jia et.al. 2405.02008 null
2024-05-02 Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey Guoping Xu et.al. 2405.01725 link
2024-05-02 Explainable AI (XAI) in Image Segmentation in Medicine, Industry, and Beyond: A Survey Rokas Gipiškis et.al. 2405.01636 null
2024-05-02 CromSS: Cross-modal pre-training with noisy labels for remote sensing image segmentation Chenying Liu et.al. 2405.01217 null
2024-05-02 Uncertainty-aware self-training with expectation maximization basis transformation Zijia Wang et.al. 2405.01175 null
2024-05-01 GraCo: Granularity-Controllable Interactive Segmentation Yian Zhao et.al. 2405.00587 null
2024-05-01 Exploring Self-Supervised Vision Transformers for Deepfake Detection: A Comparative Analysis Huy H. Nguyen et.al. 2405.00355 null
2024-04-30 Masked Multi-Query Slot Attention for Unsupervised Object Discovery Rishav Pramanik et.al. 2404.19654 link
2024-04-30 UniFS: Universal Few-shot Instance Perception with Point Representations Sheng Jin et.al. 2404.19401 null
2024-04-30 DELINE8K: A Synthetic Data Pipeline for the Semantic Segmentation of Historical Documents Taylor Archibald et.al. 2404.19259 null
2024-04-29 Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing Leonardo Rossi et.al. 2404.18924 null
2024-04-29 IPixMatch: Boost Semi-supervised Semantic Segmentation with Inter-Pixel Relation Kebin Wu et.al. 2404.18891 null
2024-04-29 From Density to Geometry: YOLOv8 Instance Segmentation for Reverse Engineering of Optimized Structures Thomas Rochefort-Beaudoin et.al. 2404.18763 null
2024-04-29 Towards Long-term Robotics in the Wild Stephen Hausler et.al. 2404.18477 null
2024-04-29 Clicks2Line: Using Lines for Interactive Image Segmentation Chaewon Lee et.al. 2404.18461 null
2024-04-29 MFP: Making Full Use of Probability Maps for Interactive Image Segmentation Chaewon Lee et.al. 2404.18448 null
2024-04-28 Panoptic Segmentation and Labelling of Lumbar Spine Vertebrae using Modified Attention Unet Rikathi Pal et.al. 2404.18291 null
2024-04-28 Garbage Segmentation and Attribute Analysis by Robotic Dogs Nuo Xu et.al. 2404.18112 null
2024-04-27 Multi-Stream Cellular Test-Time Adaptation of Real-Time Models Evolving in Dynamic Environments Benoît Gérin et.al. 2404.17930 link
2024-04-27 GLIMS: Attention-Guided Lightweight Multi-Scale Hybrid Network for Volumetric Semantic Segmentation Ziya Ata Yazıcı et.al. 2404.17854 link
2024-04-26 Optimizing Universal Lesion Segmentation: State Space Model-Guided Hierarchical Networks with Feature Importance Adjustment Kazi Shahriar Sanjid et.al. 2404.17235 null
2024-04-25 Calculation of Femur Caput Collum Diaphyseal angle for X-Rays images using Semantic Segmentation Deepak Bhatia et.al. 2404.17083 null
2024-04-25 Boosting Unsupervised Semantic Segmentation with Principal Mask Proposals Oliver Hahn et.al. 2404.16818 link
2024-04-25 Self-Balanced R-CNN for Instance Segmentation Leonardo Rossi et.al. 2404.16633 link
2024-04-26 Multi-Scale Representations by Varying Window Attention for Semantic Segmentation Haotian Yan et.al. 2404.16573 link
2024-04-25 360SFUDA++: Towards Source-free UDA for Panoramic Segmentation by Learning Reliable Category Prototypes Xu Zheng et.al. 2404.16501 null
2024-04-25 Semantic Segmentation Refiner for Ultrasound Applications with Zero-Shot Foundation Models Hedda Cohen Indelman et.al. 2404.16325 null
2024-04-25 Style Adaptation for Domain-adaptive Semantic Segmentation Ting Li et.al. 2404.16301 null
2024-04-25 A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation Yifan Zhao et.al. 2404.16266 link
2024-04-24 Does SAM dream of EIG? Characterizing Interactive Segmenter Performance using Expected Information Gain Kuan-I Chung et.al. 2404.16155 null
2024-04-24 3D Freehand Ultrasound using Visual Inertial and Deep Inertial Odometry for Measuring Patellar Tracking Russell Buchanan et.al. 2404.15847 null
2024-04-24 Vision Transformer-based Adversarial Domain Adaptation Yahan Li et.al. 2404.15817 link
2024-04-23 PRISM: A Promptable and Robust Interactive Segmentation Model with Visual Prompts Hao Li et.al. 2404.15028 link
2024-04-23 Unknown Object Grasping for Assistive Robotics Elle Miller et.al. 2404.15001 null
2024-04-22 Surgical-DeSAM: Decoupling SAM for Instrument Segmentation in Robotic Surgery Yuyang Sheng et.al. 2404.14040 link
2024-04-22 OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks Sophia Sirko-Galouchenko et.al. 2404.14027 null
2024-04-22 PM-VIS: High-Performance Box-Supervised Video Instance Segmentation Zhangjing Yang et.al. 2404.13863 null
2024-04-21 Semantic-Rearrangement-Based Multi-Level Alignment for Domain Generalized Segmentation Guanlong Jiao et.al. 2404.13701 null
2024-04-21 PV-S3: Advancing Automatic Photovoltaic Defect Detection using Semi-Supervised Semantic Segmentation of Electroluminescence Images Abhishek Jha et.al. 2404.13693 null
2024-04-21 A Complete System for Automated 3D Semantic-Geometric Mapping of Corrosion in Industrial Environments Rui Pimentel de Figueiredo et.al. 2404.13691 null
2024-04-21 LMFNet: An Efficient Multimodal Fusion Approach for Semantic Segmentation in High-Resolution Remote Sensing Tong Wang et.al. 2404.13659 null
2024-04-21 Towards Unified Representation of Multi-Modal Pre-training for 3D Understanding via Differentiable Rendering Ben Fei et.al. 2404.13619 null
2024-04-20 FisheyeDetNet: Object Detection on Fisheye Surround View Camera Systems for Automated Driving Ganesh Sistu et.al. 2404.13443 null
2024-04-20 AMMUNet: Multi-Scale Attention Map Merging for Remote Sensing Image Segmentation Yang Yang et.al. 2404.13408 null
2024-04-19 Nuclei Instance Segmentation of Cryosectioned H&E Stained Histological Images using Triple U-Net Architecture Zarif Ahmed et.al. 2404.12986 null
2024-04-19 FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving Xingtai Gui et.al. 2404.12867 null
2024-04-19 Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation Yilong Chen et.al. 2404.12861 null
2024-04-19 COIN: Counterfactual inpainting for weakly supervised semantic segmentation for medical images Dmytro Shvetsov et.al. 2404.12832 link
2024-04-19 A Point-Based Approach to Efficient LiDAR Multi-Task Perception Christopher Lang et.al. 2404.12798 null
2024-04-19 Generalized Few-Shot Meets Remote Sensing: Discovering Novel Classes in Land Cover Mapping via Hybrid Semantic Segmentation Framework Zhuohong Li et.al. 2404.12721 link
2024-04-19 Improving Prediction Accuracy of Semantic Segmentation Methods Using Convolutional Autoencoder Based Pre-processing Layers Hisashi Shimodaira et.al. 2404.12718 null
2024-04-19 Show and Grasp: Few-shot Semantic Segmentation for Robot Grasping through Zero-shot Foundation Models Leonardo Barcellona et.al. 2404.12717 null
2024-04-18 Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds Oliver Lemke et.al. 2404.12440 null
2024-04-18 A Perspective on Deep Vision Performance with Standard Image and Video Codecs Christoph Reich et.al. 2404.12330 null
2024-04-18 Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery Yona Falinie A. Gaus et.al. 2404.12285 null
2024-04-18 Deep Gaussian mixture model for unsupervised image segmentation Matthias Schwab et.al. 2404.12252 null
2024-04-18 Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training Jin Gao et.al. 2404.12210 link
2024-04-18 How to Benchmark Vision Foundation Models for Semantic Segmentation? Tommie Kerssies et.al. 2404.12172 null
2024-04-17 Mushroom Segmentation and 3D Pose Estimation from Point Clouds using Fully Convolutional Geometric Features and Implicit Pose Encoding George Retsinas et.al. 2404.12144 link
2024-04-18 Tendency-driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation Chongjie Si et.al. 2404.11981 null
2024-04-18 The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models Cheng Shi et.al. 2404.11957 link
2024-04-18 Group-On: Boosting One-Shot Segmentation with Supportive Query Hanjing Zhou et.al. 2404.11871 null
2024-04-17 Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach Mir Rayat Imtiaz Hossain et.al. 2404.11732 null
2024-04-17 A Semantic Segmentation-guided Approach for Ground-to-Aerial Image Matching Francesco Pro et.al. 2404.11302 link
2024-04-17 Learning from Unlabelled Data with Transformers: Domain Adaptation for Semantic Segmentation of High Resolution Aerial Images Nikolaos Dionelis et.al. 2404.11299 link
2024-04-17 Criteria for Uncertainty-based Corner Cases Detection in Instance Segmentation Florian Heidecker et.al. 2404.11266 null
2024-04-16 A Concise Tiling Strategy for Preserving Spatial Context in Earth Observation Imagery Ellianna Abrahams et.al. 2404.10927 link
2024-04-16 Vocabulary-free Image Classification and Semantic Segmentation Alessandro Conti et.al. 2404.10864 link
2024-04-16 Gasformer: A Transformer-based Architecture for Segmenting Methane Emissions from Livestock in Optical Gas Imaging Toqi Tahamid Sarker et.al. 2404.10841 link
2024-04-16 Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark Jiangning Zhang et.al. 2404.10760 null
2024-04-16 ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation Iaroslav Melekhov et.al. 2404.10699 null
2024-04-16 Contextrast: Contextual Contrastive Learning for Semantic Segmentation Changki Sung et.al. 2404.10633 null
2024-04-16 Label merge-and-split: A graph-colouring approach for memory-efficient brain parcellation Aaron Kujawa et.al. 2404.10572 null
2024-04-16 LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Perception System Shijing Hu et.al. 2404.10498 null
2024-04-16 Adversarial Identity Injection for Semantic Face Image Synthesis Giuseppe Tarollo et.al. 2404.10408 null
2024-04-16 Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation Jiapeng Su et.al. 2404.10322 null
2024-04-16 Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain Steve Andreas Immanuel et.al. 2404.10307 link
2024-04-15 NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mouse-to-Human Domain Transfer Sai Kumar Reddy Manne et.al. 2404.10130 link
2024-04-15 Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL Fangwei Zhong et.al. 2404.09857 null
2024-04-15 In-Context Translation: Towards Unifying Image Recognition, Processing, and Generation Han Xue et.al. 2404.09633 null
2024-04-15 The revenge of BiSeNet: Efficient Multi-Task Image Segmentation Gabriele Rosi et.al. 2404.09570 null
2024-04-15 kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies Zhongrui Gui et.al. 2404.09447 null
2024-04-15 Human-in-the-Loop Segmentation of Multi-species Coral Imagery Scarlett Raine et.al. 2404.09406 null
2024-04-14 Bridging Data Islands: Geographic Heterogeneity-Aware Federated Learning for Collaborative Remote Sensing Semantic Segmentation Jieyi Tan et.al. 2404.09292 null
2024-04-12 Structured Model Pruning for Efficient Inference in Computational Pathology Mohammed Adnan et.al. 2404.08831 null
2024-04-12 COCONut: Modernizing COCO Segmentation Xueqing Deng et.al. 2404.08639 null
2024-04-12 Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations Boyuan Peng et.al. 2404.08549 null
2024-04-12 Analyzing Decades-Long Environmental Changes in Namibia Using Archival Aerial Photography and Deep Learning Girmaw Abebe Tadesse et.al. 2404.08544 null
2024-04-12 LaSagnA: Language-based Segmentation Assistant for Complex Queries Cong Wei et.al. 2404.08506 link
2024-04-12 Adapting the Segment Anything Model During Usage in Novel Situations Robin Schön et.al. 2404.08421 null
2024-04-12 Let It Flow: Simultaneous Optimization of 3D Flow and Object Clustering Patrik Vacek et.al. 2404.08363 null
2024-04-12 AdaContour: Adaptive Contour Descriptor with Hierarchical Representation Tianyu Ding et.al. 2404.08292 null
2024-04-12 Tackling Ambiguity from Perspective of Uncertainty Inference and Affinity Diversification for Weakly Supervised Semantic Segmentation Zhiwei Yang et.al. 2404.08195 link
2024-04-12 Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation Sina Hajimiri et.al. 2404.08181 link
2024-04-11 Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification Ricardo Pereira et.al. 2404.07739 null
2024-04-11 OpenTrench3D: A Photogrammetric 3D Point Cloud Dataset for Semantic Segmentation of Underground Utilities Lasse H. Hansen et.al. 2404.07711 link
2024-04-11 ViM-UNet: Vision Mamba for Biomedical Segmentation Anwai Archit et.al. 2404.07705 link
2024-04-11 Implicit and Explicit Language Guidance for Diffusion-based Visual Perception Hefeng Wang et.al. 2404.07600 null
2024-04-11 Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling Sourajit Saha et.al. 2404.07410 null
2024-04-10 AI-Guided Defect Detection Techniques to Model Single Crystal Diamond Growth Rohan Reddy Mekala et.al. 2404.07306 null
2024-04-10 RESSCAL3D: Resolution Scalable 3D Semantic Segmentation of Point Clouds Remco Royen et.al. 2404.06863 null
2024-04-10 O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation Muer Tie et.al. 2404.06836 null
2024-04-10 Convolution-based Probability Gradient Loss for Semantic Segmentation Guohang Shan et.al. 2404.06704 null
2024-04-09 Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation Luca Barsellotti et.al. 2404.06542 null
2024-04-09 QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding Yash Mehan et.al. 2404.06442 null
2024-04-09 DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning Senthil Yogamani et.al. 2404.06352 null
2024-04-09 Automated National Urban Map Extraction Hasan Nasrallah et.al. 2404.06202 null
2024-04-09 Hierarchical Insights: Exploiting Structural Similarities for Reliable 3D Semantic Segmentation Mariella Dreissig et.al. 2404.06124 null
2024-04-09 Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation Zong-Wei Hong et.al. 2404.06029 null
2024-04-08 Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery Ionut M. Motoi et.al. 2404.05693 null
2024-04-08 AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation Jiannan Ge et.al. 2404.05667 null
2024-04-08 Impact of LiDAR visualisations on semantic segmentation of archaeological objects Raveerat Jaturapitpornchai et.al. 2404.05512 null
2024-04-08 Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance Dazhong Shen et.al. 2404.05384 link
2024-04-08 GPS-free Autonomous Navigation in Cluttered Tree Rows with Deep Semantic Segmentation Alessandro Navone et.al. 2404.05338 null
2024-04-08 Human Detection from 4D Radar Data in Low-Visibility Field Conditions Mikael Skog et.al. 2404.05307 null
2024-04-08 iVPT: Improving Task-relevant Information Sharing in Visual Prompt Tuning by Cross-layer Dynamic Connection Nan Zhou et.al. 2404.05207 null
2024-04-08 UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather Haimei Zhao et.al. 2404.05145 null
2024-04-07 D2SL: Decouple Defogging and Semantic Learning for Foggy Domain-Adaptive Segmentation Xuan Sun et.al. 2404.04807 null
2024-04-06 HawkDrive: A Transformer-driven Visual Perception System for Autonomous Driving in Night Scene Ziang Guo et.al. 2404.04653 link
2024-04-05 Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation Zifu Wan et.al. 2404.04256 null
2024-04-05 Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation Ji-Jia Wu et.al. 2404.04231 null
2024-04-05 MarsSeg: Mars Surface Semantic Segmentation with Multi-level Extractor and Connector Junbo Li et.al. 2404.04155 null
2024-04-04 Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation Elham Amin Mansour et.al. 2404.03799 null
2024-04-04 Flattening the Parent Bias: Hierarchical Semantic Segmentation in the Poincaré Ball Simon Weber et.al. 2404.03778 null
2024-04-04 OW-VISCap: Open-World Video Instance Segmentation and Captioning Anwesa Choudhuri et.al. 2404.03657 null
2024-04-04 Background Noise Reduction of Attention Map for Weakly Supervised Semantic Segmentation Izumi Fujimori et.al. 2404.03394 null
2024-04-04 iSeg: Interactive 3D Segmentation via Interactive Attention Itai Lang et.al. 2404.03219 null
2024-04-04 CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks Beibei Wang et.al. 2404.03191 null
2024-04-03 GPU-Accelerated RSF Level Set Evolution for Large-Scale Microvascular Segmentation Meher Niger et.al. 2404.02813 null
2024-04-03 RS-Mamba for Large Remote Sensing Image Dense Prediction Sijie Zhao et.al. 2404.02668 link
2024-04-03 A Satellite Band Selection Framework for Amazon Forest Deforestation Detection Task Eduardo Neto et.al. 2404.02659 null
2024-04-03 SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation Junyan Ye et.al. 2404.02638 link
2024-04-03 Active learning for efficient annotation in precision agriculture: a use-case on crop-weed semantic segmentation Bart M. van Marrewijk et.al. 2404.02580 null
2024-04-03 HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras Zhongyu Xia et.al. 2404.02517 link
2024-04-03 Optimizing traffic signs and lights visibility for the teleoperation of autonomous vehicles through ROI compression I. Dror et.al. 2404.02481 null
2024-04-03 RS3Mamba: Visual State Space Model for Remote Sensing Images Semantic Segmentation Xianping Ma et.al. 2404.02457 link
2024-04-02 Constrained Robotic Navigation on Preferred Terrains Using LLMs and Speech Instruction: Exploiting the Power of Adverbs Faraz Lotfi et.al. 2404.02294 null
2024-04-02 Segment Any 3D Object with Language Seungjun Lee et.al. 2404.02157 null
2024-04-02 Multi-Level Label Correction by Distilling Proximate Patterns for Semi-supervised Semantic Segmentation Hui Xiao et.al. 2404.02065 null
2024-04-01 What is Point Supervision Worth in Video Instance Segmentation? Shuaiyi Huang et.al. 2404.01990 null
2024-04-02 Synthetic Data for Robust Stroke Segmentation Liam Chalcroft et.al. 2404.01946 link
2024-04-02 Improving Bird's Eye View Semantic Segmentation by Task Decomposition Tianhao Zhao et.al. 2404.01925 null
2024-04-02 Rethinking Annotator Simulation: Realistic Evaluation of Whole-Body PET Lesion Interactive Segmentation Methods Zdravko Marinov et.al. 2404.01816 null
2024-04-02 Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model Qinfeng Zhu et.al. 2404.01705 null
2024-04-02 Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss Jaeha Kim et.al. 2404.01692 null
2024-04-02 JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments Duy-Tho Le et.al. 2404.01686 null
2024-04-01 SUGAR: Pre-training 3D Visual Representations for Robotics Shizhe Chen et.al. 2404.01491 null
2024-03-29 ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning Beomyoung Kim et.al. 2403.20126 link
2024-03-29 Modeling Weather Uncertainty for Multi-weather Co-Presence Estimation Qi Bi et.al. 2403.20092 null
2024-03-29 Using Images as Covariates: Measuring Curb Appeal with Deep Learning Ardyn Nordstrom et.al. 2403.19915 null
2024-03-29 MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection Ali Behrouz et.al. 2403.19888 null
2024-03-28 Segmentation Re-thinking Uncertainty Estimation Metrics for Semantic Segmentation Qitian Ma et.al. 2403.19826 null
2024-04-01 Efficient 3D Instance Mapping and Localization with Neural Fields George Tang et.al. 2403.19797 null
2024-03-28 ENet-21: An Optimized light CNN Structure for Lane Detection Seyed Rasoul Hosseini et.al. 2403.19782 null
2024-03-29 Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers Pingcheng Dong et.al. 2403.19591 link
2024-03-28 DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs Donghyun Kim et.al. 2403.19588 link
2024-03-28 Learning Multiple Representations with Inconsistency-Guided Detail Regularization for Mask-Guided Matting Weihao Jiang et.al. 2403.19213 null
2024-03-27 Lift3D: Zero-Shot Lifting of Any 2D Vision Model to 3D Mukund Varma T et.al. 2403.18922 null
2024-03-27 Annolid: Annotate, Segment, and Track Anything You Need Chen Yang et.al. 2403.18690 null
2024-03-27 I2CKD: Intra- and Inter-Class Knowledge Distillation for Semantic Segmentation Ayoub Karine et.al. 2403.18490 null
2024-03-28 ViTAR: Vision Transformer with Any Resolution Qihang Fan et.al. 2403.18361 null
2024-03-27 Generating Diverse Agricultural Data for Vision-Based Farming Applications Mikolaj Cieslak et.al. 2403.18351 null
2024-03-27 Road Obstacle Detection based on Unknown Objectness Scores Chihiro Noguchi et.al. 2403.18207 null
2024-03-26 Spectral Convolutional Transformer: Harmonizing Real vs. Complex Multi-View Spectral Operators for Vision Transformer Badri N. Patro et.al. 2403.18063 link
2024-03-26 The Need for Speed: Pruning Transformers with One Recipe Samir Khaki et.al. 2403.17921 link
2024-03-26 Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation Carlos Gomes et.al. 2403.17886 null
2024-03-26 PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition Chenhongyi Yang et.al. 2403.17695 link
2024-03-26 Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis Legion Kazi Shahriar Sanjid et.al. 2403.17432 null
2024-03-25 Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions Ye Li et.al. 2403.17009 link
2024-03-25 DreamLIP: Language-Image Pre-training with Long Captions Kecheng Zheng et.al. 2403.17007 null
2024-03-25 TwinLiteNetPlus: A Stronger Model for Real-time Drivable Area and Lane Segmentation Quang-Huy Che et.al. 2403.16958 null
2024-03-25 HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation Linglin Jing et.al. 2403.16788 null
2024-03-25 Clustering Propagation for Universal Medical Image Segmentation Yuhang Ding et.al. 2403.16646 null
2024-03-25 SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation Aysim Toker et.al. 2403.16605 null
2024-03-25 Self-Supervised Learning for Medical Image Data with Anatomy-Oriented Imaging Planes Tianwei Zhang et.al. 2403.16499 null
2024-03-25 GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation Weiming Zhang et.al. 2403.16370 null
2024-03-24 AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans Cedric Perauer et.al. 2403.16318 null
2024-03-24 Dual-modal Prior Semantic Guided Infrared and Visible Image Fusion for Intelligent Transportation System Jing Li et.al. 2403.16227 null
2024-03-24 Segment Anything Model for Road Network Graph Extraction Congrui Hetang et.al. 2403.16051 link
2024-03-24 SM2C: Boost the Semi-supervised Segmentation for Medical Image by using Meta Pseudo Labels and Mixed Images Yifei Wang et.al. 2403.16009 null
2024-03-22 Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting Jun Guo et.al. 2403.15624 null
2024-03-22 A2DMN: Anatomy-Aware Dilated Multiscale Network for Breast Ultrasound Semantic Segmentation Kyle Lucke et.al. 2403.15560 null
2024-03-22 InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Yi Wang et.al. 2403.15377 null
2024-03-22 Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations Pranav Kulkarni et.al. 2403.15218 null
2024-03-22 Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion Sofia Casarin et.al. 2403.15194 null
2024-03-22 IFSENet: Harnessing Sparse Iterations for Interactive Few-shot Segmentation Excellence Shreyas Chandgothia et.al. 2403.15089 null
2024-03-22 Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model using 3D Whole-body CT Scans Heng Guo et.al. 2403.15063 null
2024-03-22 BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation Jiahao Lu et.al. 2403.15019 null
2024-03-22 Improve Cross-domain Mixed Sampling with Guidance Training for Adaptive Segmentation Wenlve Zhou et.al. 2403.14995 null
2024-03-21 WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather Blake Gella et.al. 2403.14874 null
2024-03-21 PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model Zheng Zhang et.al. 2403.14598 link
2024-03-21 Learning to Project for Cross-Task Knowledge Distillation Dylan Auty et.al. 2403.14494 null
2024-03-21 OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation Bohao Peng et.al. 2403.14418 link
2024-03-21 Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models Pablo Marcos-Manchón et.al. 2403.14291 link
2024-03-21 OTSeg: Multi-prompt Sinkhorn Attention for Zero-Shot Semantic Segmentation Kwanyoung Kim et.al. 2403.14183 null
2024-03-21 Evidential Semantic Mapping in Off-road Environments with Uncertainty-aware Bayesian Kernel Inference Junyoung Kim et.al. 2403.14138 null
2024-03-21 Soft Masked Transformer for Point Cloud Processing with Skip Attention-Based Upsampling Yong He et.al. 2403.14124 null
2024-03-21 Semantics from Space: Satellite-Guided Thermal Semantic Segmentation Annotation for Aerial Field Robots Connor Lee et.al. 2403.14056 null
2024-03-20 When Cars meet Drones: Hyperbolic Federated Learning for Source-Free Domain Adaptation in Adverse Weather Giulia Rizzoli et.al. 2403.13762 null
2024-03-20 Next day fire prediction via semantic segmentation Konstantinos Alexis et.al. 2403.13545 null
2024-03-20 MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining Di Wang et.al. 2403.13430 link
2024-03-20 AMCO: Adaptive Multimodal Coupling of Vision and Proprioception for Quadruped Robot Navigation in Outdoor Environments Mohamed Elnoor et.al. 2403.13235 null
2024-03-20 Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation Linshan Wu et.al. 2403.13225 null
2024-03-19 Reflectivity Is All You Need!: Advancing LiDAR Semantic Segmentation Kasi Viswanath et.al. 2403.13188 null
2024-03-19 As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks? Anjun Hu et.al. 2403.12693 null
2024-03-19 PCT: Perspective Cue Training Framework for Multi-Camera BEV Segmentation Haruya Ishikawa et.al. 2403.12530 null
2024-03-19 Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation Xu Zheng et.al. 2403.12505 null
2024-03-19 CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation Wenqi Zhu et.al. 2403.12455 link
2024-03-19 Multi-Object RANSAC: Efficient Plane Clustering Method in a Clutter Seunghyeon Lim et.al. 2403.12449 null
2024-03-18 EffiPerception: an Efficient Framework for Various Perception Tasks Xinhao Xiang et.al. 2403.12317 null
2024-03-18 Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery Yuqi Zhang et.al. 2403.11812 null
2024-03-18 Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation Wangbo Zhao et.al. 2403.11808 null
2024-03-18 LSKNet: A Foundation Lightweight Backbone for Remote Sensing Yuxuan Li et.al. 2403.11735 null
2024-03-18 TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models Lisa Weijler et.al. 2403.11691 null
2024-03-18 Better (pseudo-)labels for semi-supervised instance segmentation François Porcher et.al. 2403.11675 null
2024-03-18 Synthesizing multi-log grasp poses Arvid Fälldin et.al. 2403.11623 null
2024-03-18 OurDB: Ouroboric Domain Bridging for Multi-Target Domain Adaptive Semantic Segmentation Seungbeom Woo et.al. 2403.11582 null
2024-03-18 MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation Chih-Chung Hsu et.al. 2403.11576 null
2024-03-18 Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes Chih-Chung Hsu et.al. 2403.11572 null
2024-03-18 Circle Representation for Medical Instance Object Segmentation Juming Xiong et.al. 2403.11507 link
2024-03-18 MCD: Diverse Large-Scale Multi-Campus Dataset for Robot Perception Thien-Minh Nguyen et.al. 2403.11496 null
2024-03-18 Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting Mingkui Tan et.al. 2403.11491 null
2024-03-18 ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation Minh Tran et.al. 2403.11376 null
2024-03-14 PosSAM: Panoptic Open-vocabulary Segment Anything Vibashan VS et.al. 2403.09620 null
2024-03-14 WeakSurg: Weakly supervised surgical instrument segmentation using temporal equivariance and semantic continuity Qiyuan Wang et.al. 2403.09551 null
2024-03-14 Annotation Free Semantic Segmentation with Vision Foundation Models Soroush Seifi et.al. 2403.09307 null
2024-03-14 StainFuser: Controlling Diffusion for Faster Neural Style Transfer in Multi-Gigapixel Histology Images Robert Jewsbury et.al. 2403.09302 link
2024-03-14 Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation Hyung-Il Kim et.al. 2403.09199 null
2024-03-14 When Semantic Segmentation Meets Frequency Aliasing Linwei Chen et.al. 2403.09065 link
2024-03-13 CART: Caltech Aerial RGB-Thermal Dataset in the Wild Connor Lee et.al. 2403.08997 link
2024-03-13 SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net Helin Cao et.al. 2403.08885 null
2024-03-13 Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches Yun Xin Teoh et.al. 2403.08761 null
2024-03-13 Real-time 3D semantic occupancy prediction for autonomous vehicles using memory-efficient sparse convolution Samuel Sze et.al. 2403.08748 null
2024-03-13 Semantic Segmentation of Solar Radio Spikes at Low Frequencies Pearse C. Murphy et.al. 2403.08546 null
2024-03-13 Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation Zicheng Zhang et.al. 2403.08426 null
2024-03-13 LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving Sicen Guo et.al. 2403.08215 null
2024-03-13 Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks Fuzhi Wu et.al. 2403.08157 link
2024-03-12 Mitigating the Impact of Attribute Editing on Face Recognition Sudipta Banerjee et.al. 2403.08092 null
2024-03-12 Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation Feilong Tang et.al. 2403.07630 link
2024-03-12 PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution Honghao Chen et.al. 2403.07589 null
2024-03-12 Open-World Semantic Segmentation Including Class Similarity Matteo Sodano et.al. 2403.07532 null
2024-03-11 Average Calibration Error: A Differentiable Loss for Improved Reliability in Image Segmentation Theodore Barfoot et.al. 2403.06759 link
2024-03-11 Forest Inspection Dataset for Aerial Semantic Segmentation and Depth Estimation Bianca-Cerasela-Zelia Blaga et.al. 2403.06621 link
2024-03-11 OMH: Structured Sparsity via Optimally Matched Hierarchy for Unsupervised Semantic Segmentation Baran Ozaydin et.al. 2403.06546 null
2024-03-11 3D Semantic Segmentation-Driven Representations for 3D Object Detection Hayeon O et.al. 2403.06501 link
2024-03-11 Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy Jiuming Liu et.al. 2403.06467 link
2024-03-11 Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation Xiaoyang Wang et.al. 2403.06462 null
2024-03-11 Refining Segmentation On-the-Fly: An Interactive Framework for Point Cloud Semantic Segmentation Peng Zhang et.al. 2403.06401 null
2024-03-10 Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning Woo-Jin Ahn et.al. 2403.06122 link
2024-03-09 Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation Hairong Shi et.al. 2403.05912 null
2024-03-09 Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration Jingyun Xue et.al. 2403.05906 null
2024-03-08 Attention-guided Feature Distillation for Semantic Segmentation Amir M. Mansourian et.al. 2403.05451 link
2024-03-08 Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation Yu Han et.al. 2403.05388 null
2024-03-08 Frequency-Adaptive Dilated Convolution for Semantic Segmentation Linwei Chen et.al. 2403.05369 link
2024-03-08 Embedded Deployment of Semantic Segmentation in Medicine through Low-Resolution Inputs Erik Ostrowski et.al. 2403.05340 null
2024-03-08 LVIC: Multi-modality segmentation by Lifting Visual Info as Cue Zichao Dong et.al. 2403.05159 null
2024-03-07 SAM-PD: How Far Can SAM Take Us in Tracking and Segmenting Anything in Videos by Prompt Denoising Tao Zhou et.al. 2403.04194 link
2024-03-06 ECAP: Extensive Cut-and-Paste Augmentation for Unsupervised Domain Adaptive Semantic Segmentation Erik Brorsson et.al. 2403.03854 link
2024-03-06 Multi-Grained Cross-modal Alignment for Learning Open-vocabulary Semantic Segmentation from Text Supervision Yajie Liu et.al. 2403.03707 null
2024-03-06 Causal Prototype-inspired Contrast Adaptation for Unsupervised Domain Adaptive Semantic Segmentation of High-resolution Remote Sensing Imagery Jingru Zhu et.al. 2403.03704 null
2024-03-06 GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding Zi-Ting Chou et.al. 2403.03608 null
2024-03-06 Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator Wonhyeok Choi et.al. 2403.03468 null
2024-03-05 CenterDisks: Real-time instance segmentation with disk covering Katia Jodogne-Del Litto et.al. 2403.03296 link
2024-03-05 Improved LiDAR Odometry and Mapping using Deep Semantic Segmentation and Novel Outliers Detection Mohamed Afifi et.al. 2403.03111 null
2024-03-05 ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving Han Lu et.al. 2403.02877 null
2024-03-05 DDF: A Novel Dual-Domain Image Fusion Strategy for Remote Sensing Image Semantic Segmentation with Unsupervised Domain Adaptation Lingyan Ran et.al. 2403.02784 null
2024-03-05 Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels Zhuohong Li et.al. 2403.02746 null
2024-03-05 FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View Jiawei Hou et.al. 2403.02710 null
2024-03-05 Deep Common Feature Mining for Efficient Video Semantic Segmentation Yaoyan Zheng et.al. 2403.02689 null
2024-03-04 Self-Supervised Facial Representation Learning with Facial Region Awareness Zheng Gao et.al. 2403.02138 null
2024-03-04 Semi-Supervised Semantic Segmentation Based on Pseudo-Labels: A Survey Lingyan Ran et.al. 2403.01909 null
2024-03-04 Map-aided annotation for pole base detection Benjamin Missaoui et.al. 2403.01868 null
2024-03-04 AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation Haonan Wang et.al. 2403.01818 link
2024-03-02 Benchmarking Segmentation Models with Mask-Preserved Attribute Editing Zijin Yin et.al. 2403.01231 link
2024-03-02 Boosting Box-supervised Instance Segmentation with Pseudo Depth Xinyi Yu et.al. 2403.01214 null
2024-03-02 Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation Lian Xu et.al. 2403.01156 null
2024-03-01 Rethinking Few-shot 3D Point Cloud Semantic Segmentation Zhaochong An et.al. 2403.00592 link
2024-03-01 Small, Versatile and Mighty: A Range-View Perception Framework Qiang Meng et.al. 2403.00325 null
2024-03-01 YOLO-MED: Multi-Task Interaction Network for Biomedical Images Suizhi Huang et.al. 2403.00245 null
2024-02-29 FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything Safouane El Ghazouali et.al. 2403.00175 link
2024-02-29 Leveraging AI Predicted and Expert Revised Annotations in Interactive Segmentation: Continual Tuning or Full Training? Tiezheng Zhang et.al. 2402.19423 null
2024-03-01 PEM: Prototype-based Efficient MaskFormer for Image Segmentation Niccolò Cavagnero et.al. 2402.19422 link
2024-02-29 RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation Jie Zhang et.al. 2402.19004 null
2024-02-28 Spatial Coherence Loss for Salient and Camouflaged Object Detection and Beyond Ziyun Yang et.al. 2402.18698 null
2024-02-29 Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation Zhiwei Yang et.al. 2402.18467 link
2024-02-29 A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation Francesco Barbato et.al. 2402.18402 null
2024-02-28 Enhancing Roadway Safety: LiDAR-based Tree Clearance Analysis Miriam Louise Carnot et.al. 2402.18309 null
2024-02-28 Feature Denoising For Low-Light Instance Segmentation Using Weighted Non-Local Blocks Joanne Lin et.al. 2402.18307 null
2024-02-28 Self-Supervised Learning in Electron Microscopy: Towards a Foundation Model for Advanced Image Analysis Bashir Kazimi et.al. 2402.18286 null
2024-02-28 PRCL: Probabilistic Representation Contrastive Learning for Semi-Supervised Semantic Segmentation Haoyu Xie et.al. 2402.18117 null
2024-02-28 Spannotation: Enhancing Semantic Segmentation for Autonomous Navigation with Efficient Image Annotation Samuel O. Folorunsho et.al. 2402.18084 link
2024-02-27 Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation Xinyu Yang et.al. 2402.17891 link
2024-02-27 Mitigating Distributional Shift in Semantic Segmentation via Uncertainty Estimation from Unlabelled Data David S. W. Williams et.al. 2402.17653 null
2024-02-27 Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling David S. W. Williams et.al. 2402.17622 null

(back to top)

Object Tracking

Publish Date Title Authors PDF Code
2024-05-20 DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM Xuchen Li et.al. 2405.12139 null
2024-05-19 Track Anything Rapter(TAR) Tharun V. Puthanveettil et.al. 2405.11655 link
2024-05-19 RobMOT: Robust 3D Multi-Object Tracking by Observational Noise and State Estimation Drift Mitigation on LiDAR PointCloud Mohamed Nagy et.al. 2405.11536 null
2024-05-18 City-Scale Multi-Camera Vehicle Tracking System with Improved Self-Supervised Camera Link Model Yuqiang Lin et.al. 2405.11345 null
2024-05-17 Air Signing and Privacy-Preserving Signature Verification for Digital Documents P. Sarveswarasarma et.al. 2405.10868 null
2024-05-16 A Novel Bounding Box Regression Method for Single Object Tracking Omar Abdelaziz et.al. 2405.10444 null
2024-05-16 Beyond Traditional Single Object Tracking: A Survey Omar Abdelaziz et.al. 2405.10439 null
2024-05-16 Spatial Cognition: a Wave Hypothesis Robert Worden et.al. 2405.10112 null
2024-05-14 Learning Correspondence for Deformable Objects Priya Sundaresan et.al. 2405.08996 null
2024-05-14 ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association Shuxiao Ding et.al. 2405.08909 link
2024-05-12 MAML MOT: Multiple Object Tracking based on Meta-Learning Jiayi Chen et.al. 2405.07272 null
2024-05-16 Common Corruptions for Enhancing and Evaluating Robustness in Air-to-Air Visual Object Detection Anastasios Arsenos et.al. 2405.06765 null
2024-05-16 Ensuring UAV Safety: A Vision-only and Real-time Framework for Collision Avoidance Through Object Detection, Tracking, and Distance Estimation Vasileios Karampinis et.al. 2405.06749 null
2024-05-10 Multi-Object Tracking in the Dark Xinzhe Wang et.al. 2405.06600 link
2024-05-09 Outlier-robust Kalman Filtering through Generalised Bayes Gerardo Duran-Martin et.al. 2405.05646 link
2024-05-08 MOTLEE: Collaborative Multi-Object Tracking Using Temporal Consistency for Neighboring Robot Frame Alignment Mason B. Peterson et.al. 2405.05210 link
2024-05-08 TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking Pengcheng Shao et.al. 2405.05004 link
2024-05-07 DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving Chen Min et.al. 2405.04390 null
2024-05-07 Bayesian Simultaneous Localization and Multi-Lane Tracking Using Onboard Sensors and a SD Map Yuxuan Xia et.al. 2405.04290 null
2024-05-06 Collecting Consistently High Quality Object Tracks with Minimal Human Involvement by Using Self-Supervised Learning to Detect Tracker Errors Samreen Anjum et.al. 2405.03643 null
2024-05-03 Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning Dhruva Tirumala et.al. 2405.02425 null
2024-05-03 DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos Wen-Hsuan Chu et.al. 2405.02280 link
2024-05-02 Tracking and classifying objects with DAS data along railway Simon L. B. Fredriksen et.al. 2405.01140 null
2024-04-29 Innovative Integration of Visual Foundation Model with a Robotic Arm on a Mobile Platform Shimian Zhang et.al. 2404.18720 null
2024-04-27 3D Extended Object Tracking by Fusing Roadside Sparse Radar Point Clouds and Pixel Keypoints Jiayin Deng et.al. 2404.17903 link
2024-04-22 360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos Yinzhe Xu et.al. 2404.13953 null
2024-04-22 TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos Atom Scott et.al. 2404.13868 null
2024-04-19 A comparison between single-stage and two-stage 3D tracking algorithms for greenhouse robotics David Rapado-Rincon et.al. 2404.12963 null
2024-04-18 Inverse Neural Rendering for Explainable Multi-Object Tracking Julian Ost et.al. 2404.12359 null
2024-04-24 On Target Detection in the Presence of Clutter in Joint Communication and Sensing Cellular Networks Julia Vinogradova et.al. 2404.12133 null
2024-04-18 MLS-Track: Multilevel Semantic Interaction in RMOT Zeliang Ma et.al. 2404.12031 null
2024-04-18 KnotResolver: Tracking self-intersecting filaments in microscopy using directed graphs Dhruv Khatri et.al. 2404.12029 link
2024-04-17 How to deal with glare for improved perception of Autonomous Vehicles Muhammad Z. Alam et.al. 2404.10992 null
2024-04-12 Into the Fog: Evaluating Multiple Object Tracking Robustness Nadezda Kirillova et.al. 2404.10534 link
2024-04-15 3D Face Tracking from 2D Video through Iterative Dense UV to Image Flow Felix Taubner et.al. 2404.09819 null
2024-04-12 IDD-X: A Multi-View Dataset for Ego-relative Important Object Localization and Explanation in Dense and Unstructured Traffic Chirag Parikh et.al. 2404.08561 null
2024-04-11 Gaga: Group Any Gaussians via 3D-aware Memory Bank Weijie Lyu et.al. 2404.07977 null
2024-04-11 SFSORT: Scene Features-based Simple Online Real-Time Tracker M. M. Morsali et.al. 2404.07553 link
2024-04-11 PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds Weisheng Xu et.al. 2404.07495 link
2024-04-11 Trashbusters: Deep Learning Approach for Litter Detection and Tracking Kashish Jain et.al. 2404.07467 null
2024-04-09 LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks Jianlang Chen et.al. 2404.06247 link
2024-04-08 DepthMOT: Depth Cues Lead to a Strong Multi-Object Tracker Jiapeng Wu et.al. 2404.05518 link
2024-04-08 Self-Supervised Multi-Object Tracking with Path Consistency Zijia Lu et.al. 2404.05136 link
2024-04-07 Spatial Cognition from Egocentric Video: Out of Sight, Not Out of Mind Chiara Plizzari et.al. 2404.05072 null
2024-04-03 Ego-Motion Aware Target Prediction Module for Robust Multi-Object Tracking Navid Mahdian et.al. 2404.03110 link
2024-04-03 Representation Alignment Contrastive Regularization for Multi-Object Tracking Shujie Chen et.al. 2404.02562 link
2024-03-29 Bayesian Nonparametrics: An Alternative to Deep Learning Bahman Moraffah et.al. 2404.00085 null
2024-03-29 MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark Sanghyun Woo et.al. 2403.20225 null
2024-03-29 SceneTracker: Long-term Scene Flow Estimation Network Bo Wang et.al. 2403.19924 null
2024-03-27 Enhancing Multiple Object Tracking Accuracy via Quantum Annealing Yasuyuki Ihara et.al. 2403.18908 null
2024-03-27 TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes Liangyu Xu et.al. 2403.18238 null
2024-03-27 Middle Fusion and Multi-Stage, Multi-Form Prompts for Robust RGB-T Tracking Qiming Wang et.al. 2403.18193 null
2024-03-26 OmniVid: A Generative Framework for Universal Video Understanding Junke Wang et.al. 2403.17935 link
2024-03-26 Exploring Dynamic Transformer for Efficient Object Tracking Jiawen Zhu et.al. 2403.17651 null
2024-03-25 Multiple Object Tracking as ID Prediction Ruopeng Gao et.al. 2403.16848 link
2024-03-25 From Two Stream to One Stream: Efficient RGB-T Tracking via Mutual Prompt Learning and Knowledge Distillation Yang Luo et.al. 2403.16834 null
2024-03-29 Elysium: Exploring Object-level Perception in Videos via MLLM Han Wang et.al. 2403.16558 link
2024-03-25 Spike-NeRF: Neural Radiance Field Based On Spike Camera Yijia Guo et.al. 2403.16410 null
2024-03-28 SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking Xiaojun Hou et.al. 2403.16002 link
2024-03-23 Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking Shaoyu Sun et.al. 2403.15831 null
2024-03-23 PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search Chensheng Peng et.al. 2403.15712 link
2024-03-22 CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking Nicolas Baumann et.al. 2403.15313 null
2024-03-22 Reasoning-Enhanced Object-Centric Learning for Videos Jian Li et.al. 2403.15245 null
2024-03-20 Fast-Poly: A Fast Polyhedral Framework For 3D Multi-Object Tracking Xiaoyu Li et.al. 2403.13443 link
2024-03-19 Lifting Multi-View Detection and Tracking to the Bird's Eye View Torben Teepe et.al. 2403.12573 link
2024-03-18 Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model Jan Krejčí et.al. 2403.11978 null
2024-03-17 NetTrack: Tracking Highly Dynamic Objects with a Net Guangze Zheng et.al. 2403.11186 null
2024-03-16 View-Centric Multi-Object Tracking with Homographic Matching in Moving UAV Deyi Ji et.al. 2403.10830 null
2024-03-16 Exploring Learning-based Motion Models in Multi-Object Tracking Hsiang-Wei Huang et.al. 2403.10826 null
2024-03-15 NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices Zhiyong Zhang et.al. 2403.10425 link
2024-03-14 OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning Lingyi Hong et.al. 2403.09634 null
2024-03-13 Object Permanence Filter for Robust Tracking with Interactive Robots Shaoting Peng et.al. 2403.08231 null
2024-03-12 Learning Data Association for Multi-Object Tracking using Only Coordinates Mehdi Miah et.al. 2403.08018 null
2024-03-12 A Study on Centralised and Decentralised Swarm Robotics Architecture for Part Delivery System Angelos Dimakos et.al. 2403.07635 null
2024-03-12 LiDAR Point Cloud-based Multiple Vehicle Tracking with Probabilistic Measurement-Region Association Guanhua Ding et.al. 2403.06423 null
2024-03-09 SSF-Net: Spatial-Spectral Fusion Network with Spectral Angle Awareness for Hyperspectral Object Tracking Hanzheng Wang et.al. 2403.05852 null
2024-03-09 Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline Xiao Wang et.al. 2403.05839 link
2024-03-11 Beyond MOT: Semantic Multi-Object Tracking Yunhao Li et.al. 2403.05021 null
2024-03-07 Delving into the Trajectory Long-tail Distribution for Muti-object Tracking Sijia Chen et.al. 2403.04700 link
2024-03-07 Towards learning-based planning: The nuPlan benchmark for real-world autonomous driving Napat Karnchanachari et.al. 2403.04133 null
2024-03-06 Multi-Object Tracking with Camera-LiDAR Fusion for Autonomous Driving Riccardo Pieroni et.al. 2403.04112 null
2024-03-06 VastTrack: Vast Category Visual Object Tracking Liang Peng et.al. 2403.03493 link
2024-03-05 DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking Cheng Huang et.al. 2403.02767 null
2024-03-04 DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction Weiyi Lv et.al. 2403.02075 null
2024-03-04 Integrating Efficient Optimal Transport and Functional Maps For Unsupervised Shape Correspondence Learning Tung Le et.al. 2403.01781 null
2024-03-01 Joint Spatial-Temporal Calibration for Camera and Global Pose Sensor Junlin Song et.al. 2403.00976 null
2024-02-28 Estimation of railway vehicle response for track geometry evaluation using branch Fourier neural operator Qingjing Wang et.al. 2402.18366 null
2024-02-28 EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving Jiacheng Lin et.al. 2402.18302 link
2024-02-28 Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks Zhewei Wu et.al. 2402.17976 null
2024-02-27 SWTrack: Multiple Hypothesis Sliding Window 3D Multi-Object Tracking Sandro Papais et.al. 2402.17892 null
2024-02-27 In Defense and Revival of Bayesian Filtering for Thermal Infrared Object Tracking Peng Gao et.al. 2402.17098 null
2024-02-26 Searching a Lightweight Network Architecture for Thermal Infrared Pedestrian Tracking Peng Gao et.al. 2402.16570 null
2024-02-26 SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking Yu Lin et.al. 2402.16249 null
2024-02-26 Real-Time Vehicle Detection and Urban Traffic Behavior Analysis Based on UAV Traffic Videos on Mobile Devices Yuan Zhu et.al. 2402.16246 null
2024-02-24 Multi-Object Tracking by Hierarchical Visual Representations Jinkun Cao et.al. 2402.15895 null
2024-02-24 Detection Is Tracking: Point Cloud Multi-Sweep Deep Learning Models Revisited Lingji Chen et.al. 2402.15756 null

(back to top)

Action Recognition

Publish Date Title Authors PDF Code
2024-05-20 Building Temporal Kernels with Orthogonal Polynomials Yan Ru Pei et.al. 2405.12179 link
2024-05-18 GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition Mallika Garg et.al. 2405.11180 link
2024-05-17 Air Signing and Privacy-Preserving Signature Verification for Digital Documents P. Sarveswarasarma et.al. 2405.10868 null
2024-05-17 MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains Zhaohuan Zhan et.al. 2405.10620 null
2024-05-06 MEET: Mixture of Experts Extra Tree-Based sEMG Hand Gesture Identification Naveen Gehlot et.al. 2405.09562 null
2024-05-14 Wearable Sensor-Based Few-Shot Continual Learning on Hand Gestures for Motor-Impaired Individuals via Latent Embedding Exploitation Riyad Bin Rafiq et.al. 2405.08969 link
2024-05-14 The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks Carmela Calabrese et.al. 2405.08695 null
2024-05-15 POWQMIX: Weighted Value Factorization with Potentially Optimal Joint Actions Recognition for Cooperative Multi-Agent Reinforcement Learning Chang Huang et.al. 2405.08036 null
2024-05-13 Coarse or Fine? Recognising Action End States without Labels Davide Moltisanti et.al. 2405.07723 link
2024-05-11 PRENet: A Plane-Fit Redundancy Encoding Point Cloud Sequence Network for Real-Time 3D Action Recognition Shenglin He et.al. 2405.06929 null
2024-05-10 CasCalib: Cascaded Calibration for Motion Capture from Sparse Unsynchronized Cameras James Tang et.al. 2405.06845 link
2024-05-09 A Survey on Backbones for Deep Video Action Recognition Zixuan Tang et.al. 2405.05584 null
2024-05-06 OmniActions: Predicting Digital Actions in Response to Real-World Multimodal Sensory Inputs with LLMs Jiahao Nick Li et.al. 2405.03901 null
2024-05-05 JOSENet: A Joint Stream Embedding Network for Violence Detection in Surveillance Videos Pietro Nardelli et.al. 2405.02961 null
2024-05-03 On the Utility of External Agent Intention Predictor for Human-AI Coordination Chenxu Wang et.al. 2405.02229 null
2024-05-11 MVP-Shot: Multi-Velocity Progressive-Alignment Framework for Few-Shot Action Recognition Hongyu Qu et.al. 2405.02077 null
2024-05-03 Enhancing Micro Gesture Recognition for Emotion Understanding via Context-aware Visual-Text Contrastive Learning Deng Li et.al. 2405.01885 link
2024-05-02 Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy Hoang-Quan Nguyen et.al. 2405.01337 null
2024-05-07 Towards Inclusive Face Recognition Through Synthetic Ethnicity Alteration Praveen Kumar Chandaliya et.al. 2405.01273 null
2024-04-30 One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features Trung Thanh Nguyen et.al. 2404.19542 link
2024-04-30 Cross-Block Fine-Grained Semantic Cascade for Skeleton-Based Sports Action Recognition Zhendong Liu et.al. 2404.19383 null
2024-04-28 Enhancing Action Recognition from Low-Quality Skeleton Data via Part-Level Knowledge Distillation Cuiwei Liu et.al. 2404.18206 null
2024-04-26 SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes Georgia Baltsou et.al. 2404.17255 null
2024-04-25 Learning Discriminative Spatio-temporal Representations for Semi-supervised Action Recognition Yu Wang et.al. 2404.16416 null
2024-04-25 An Improved Graph Pooling Network for Skeleton-Based Action Recognition Cong Wu et.al. 2404.16359 null
2024-04-24 Unimodal and Multimodal Sensor Fusion for Wearable Activity Recognition Hymalai Bello et.al. 2404.16005 null
2024-04-24 3D Face Morphing Attack Generation using Non-Rigid Registration Jag Mohan Singh et.al. 2404.15765 null
2024-04-25 HDBN: A Novel Hybrid Dual-branch Network for Robust Skeleton-based Action Recognition Jinfu Liu et.al. 2404.15719 link
2024-04-23 Combating Missing Modalities in Egocentric Videos at Test Time Merey Ramazanova et.al. 2404.15161 null
2024-04-23 G3R: Generating Rich and Fine-grained mmWave Radar Data from 2D Videos for Generalized Gesture Recognition Kaikai Deng et.al. 2404.14934 null
2024-04-23 Driver Activity Classification Using Generalizable Representations from Vision-Language Models Ross Greer et.al. 2404.14906 null
2024-04-23 DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition Haozhe Cheng et.al. 2404.14890 null
2024-04-22 1st Place Solution to the 1st SkatingVerse Challenge Tao Sun et.al. 2404.14032 null
2024-04-22 CoFInAl: Enhancing Action Quality Assessment with Coarse-to-Fine Instruction Alignment Kanglei Zhou et.al. 2404.13999 link
2024-04-21 Attack on Scene Flow using Point Clouds Haniyeh Ehsani Oskouie et.al. 2404.13621 null
2024-04-20 STAT: Towards Generalizable Temporal Action Localization Yangcen Liu et.al. 2404.13311 null
2024-04-19 Ring-a-Pose: A Ring for Continuous Hand Pose Tracking Tianhong Catherine Yu et.al. 2404.12980 null
2024-04-19 VoxAtnNet: A 3D Point Clouds Convolutional Neural Network for Generalizable Face Presentation Attack Detection Raghavendra Ramachandra et.al. 2404.12680 null
2024-04-18 DeepLocalization: Using change point detection for Temporal Action Localization Mohammed Shaiqur Rahman et.al. 2404.12258 null
2024-04-18 Aligning Actions and Walking to LLM-Generated Textual Descriptions Radu Chivereanu et.al. 2404.12192 link
2024-04-18 Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition Xunsong Li et.al. 2404.11903 null
2024-04-18 sEMG-based Fine-grained Gesture Recognition via Improved LightGBM Model Xiupeng Qiao et.al. 2404.11861 null
2024-04-17 VG4D: Vision-Language Model Goes 4D Video Recognition Zhichao Deng et.al. 2404.11605 link
2024-04-17 A Data-Driven Representation for Sign Language Production Harry Walsh et.al. 2404.11499 link
2024-04-17 Lower Limb Movements Recognition Based on Feature Recursive Elimination and Backpropagation Neural Network Yongkai Ma et.al. 2404.11383 null
2024-04-17 Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in Surface Electromyographic Signal Analysis Weiyu Guo et.al. 2404.11213 null
2024-04-17 Kathakali Hand Gesture Recognition With Minimal Data Kavitha Raju et.al. 2404.11205 null
2024-04-16 HumMUSS: Human Motion Understanding using State Space Models Arnab Kumar Mondal et.al. 2404.10880 null
2024-04-17 Learning to Score Sign Language with Two-stage Method Hongli Wen et.al. 2404.10383 null
2024-04-16 MK-SGN: A Spiking Graph Convolutional Network with Multimodal Fusion and Knowledge Distillation for Skeleton-based Action Recognition Naichuan Zheng et.al. 2404.10210 null
2024-04-15 Design and Analysis of Efficient Attention in Transformers for Social Group Activity Recognition Masato Tamura et.al. 2404.09964 null
2024-04-15 A Diffusion-based Data Generator for Training Object Recognition Models in Ultra-Range Distance Eran Bamani et.al. 2404.09846 null
2024-04-15 Leveraging Temporal Contextualization for Video Action Recognition Minji Kim et.al. 2404.09490 null
2024-04-14 In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition Wiktor Mucha et.al. 2404.09308 null
2024-04-13 Exploring Explainability in Video Action Recognition Avinab Saha et.al. 2404.09067 null
2024-04-12 MSSTNet: A Multi-Scale Spatio-Temporal CNN-Transformer Network for Dynamic Facial Expression Recognition Linhuang Wang et.al. 2404.08433 null
2024-04-11 Graph Integrated Language Transformers for Next Action Prediction in Complex Phone Calls Amin Hosseiny Marani et.al. 2404.08155 null
2024-04-11 Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos Soumyabrata Chaudhuri et.al. 2404.07645 null
2024-04-15 Fine-Grained Side Information Guided Dual-Prompts for Zero-Shot Skeleton Action Recognition Yang Chen et.al. 2404.07487 null
2024-04-10 O-TALC: Steps Towards Combating Oversegmentation within Online Action Segmentation Matthew Kent Myers et.al. 2404.06894 null
2024-04-10 An Animation-based Augmentation Approach for Action Recognition from Discontinuous Video Xingyu Song et.al. 2404.06741 null
2024-04-07 X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model Jan Held et.al. 2404.06332 null
2024-04-10 Algorithms for Caching and MTS with reduced number of predictions Karim Abdel Sadek et.al. 2404.06280 null
2024-04-09 ActNetFormer: Transformer-ResNet Hybrid Method for Semi-Supervised Action Recognition in Videos Sharana Dharshikgan Suresh Dass et.al. 2404.06243 link
2024-04-08 Localizing Moments of Actions in Untrimmed Videos of Infants with Autism Spectrum Disorder Halil Ismail Helvaci et.al. 2404.05849 null
2024-04-09 TIM: A Time Interval Machine for Audio-Visual Action Recognition Jacob Chalk et.al. 2404.05559 link
2024-04-11 Test-Time Zero-Shot Temporal Action Localization Benedetta Liberatori et.al. 2404.05426 link
2024-04-09 SDFR: Synthetic Data for Face Recognition Competition Hatef Otroshi Shahreza et.al. 2404.04580 null
2024-04-05 PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos Yufei Zhang et.al. 2404.04430 null
2024-04-05 Koala: Key frame-conditioned long video-LLM Reuben Tan et.al. 2404.04346 null
2024-04-04 UniAV: Unified Audio-Visual Perception for Multi-Task Video Localization Tiantian Geng et.al. 2404.03179 null
2024-04-03 Optimizing the Deployment of Tiny Transformers on Low-Power MCUs Victor J. B. Jung et.al. 2404.02945 link
2024-04-03 Multi-Scale Spatial-Temporal Self-Attention Graph Convolutional Networks for Skeleton-based Action Recognition Ikuo Nakamura et.al. 2404.02624 null
2024-04-02 PREGO: online mistake detection in PRocedural EGOcentric videos Alessandro Flaborea et.al. 2404.01933 link
2024-04-02 Disentangled Pre-training for Human-Object Interaction Detection Zhuolong Li et.al. 2404.01725 link
2024-04-02 Language Model Guided Interpretable Video Action Reasoning Ning Wang et.al. 2404.01591 null
2024-04-02 Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery Christian Limberg et.al. 2404.01571 null
2024-04-01 LoSA: Long-Short-range Adapter for Scaling End-to-End Temporal Action Localization Akshita Gupta et.al. 2404.01282 null
2024-03-31 LLMs are Good Action Recognizers Haoxuan Qu et.al. 2404.00532 null
2024-03-29 Latent Embedding Clustering for Occlusion Robust Head Pose Estimation José Celestino et.al. 2403.20251 null
2024-03-29 A Unified Framework for Human-centric Point Cloud Video Understanding Yiteng Xu et.al. 2403.20031 null
2024-03-28 Zero-shot Prompt-based Video Encoder for Surgical Gesture Recognition Mingxing Rao et.al. 2403.19786 link
2024-03-28 Hypergraph-based Multi-View Action Recognition using Event Cameras Yue Gao et.al. 2403.19316 null
2024-03-27 PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization Edward Fish et.al. 2403.18915 null
2024-03-27 iFace: Hand-Over-Face Gesture Recognition Leveraging Impedance Sensing Mengxi Liu et.al. 2403.18433 null
2024-03-27 An Evolutionary Network Architecture Search Framework with Adaptive Multimodal Fusion for Hand Gesture Recognition Yizhang Xia et.al. 2403.18208 null
2024-03-26 OmniVid: A Generative Framework for Universal Video Understanding Junke Wang et.al. 2403.17935 link
2024-03-25 Understanding Long Videos in One Multimodal Language Model Pass Kanchana Ranasinghe et.al. 2403.16998 link
2024-03-25 Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects Zicong Fan et.al. 2403.16428 null
2024-03-24 Emotion Recognition from the perspective of Activity Recognition Savinay Nagendra et.al. 2403.16263 null
2024-03-22 InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Yi Wang et.al. 2403.15377 link
2024-03-22 Gesture-Controlled Aerial Robot Formation for Human-Swarm Interaction in Safety Monitoring Applications Vít Krátký et.al. 2403.15333 null
2024-03-22 GCN-DevLSTM: Path Development for Skeleton-Based Action Recognition Lei Jiang et.al. 2403.15212 link
2024-03-21 Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets Ahmet Alp Kindiroglu et.al. 2403.14534 link
2024-03-20 Hierarchical NeuroSymbolic Approach for Action Quality Assessment Lauren Okamoto et.al. 2403.13798 null
2024-03-19 Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition Filip Ilic et.al. 2403.12710 null
2024-03-19 ExACT: Language-guided Conceptual Reasoning and Uncertainty Estimation for Event-based Action Recognition and More Jiazhou Zhou et.al. 2403.12534 null
2024-03-19 VideoBadminton: A Video Dataset for Badminton Action Recognition Qi Li et.al. 2403.12385 null
2024-03-19 Multi-View Video-Based Learning: Leveraging Weak Labels for Frame-Level Perception Vijay John et.al. 2403.11616 null
2024-03-19 VIHE: Virtual In-Hand Eye Transformer for 3D Robotic Manipulation Weiyao Wang et.al. 2403.11461 null
2024-03-17 A Lie Group Approach to Riemannian Batch Normalization Ziheng Chen et.al. 2403.11261 link
2024-03-17 Boosting Semi-Supervised Temporal Action Localization by Learning from Non-Target Classes Kun Xia et.al. 2403.11189 null
2024-03-16 CoPlay: Audio-agnostic Cognitive Scaling for Acoustic Sensing Yin Li et.al. 2403.10796 null
2024-03-15 CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner Tingbing Yan et.al. 2403.10082 null
2024-03-15 Skeleton-Based Human Action Recognition with Noisy Labels Yi Xu et.al. 2403.09975 null
2024-03-14 On the Utility of 3D Hand Poses for Action Recognition Md Salman Shamil et.al. 2403.09805 null
2024-03-14 3D-VLA: A 3D Vision-Language-Action Generative World Model Haoyu Zhen et.al. 2403.09631 null
2024-03-14 SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition Jeonghyeok Do et.al. 2403.09508 link
2024-03-14 EventRPG: Event Data Augmentation with Relevance Propagation Guidance Mingyuan Sun et.al. 2403.09274 link
2024-03-14 Leveraging Foundation Model Automatic Data Augmentation Strategies and Skeletal Points for Hands Action Recognition in Industrial Assembly Lines Liang Wu et.al. 2403.09056 null
2024-03-13 Low-Cost and Real-Time Industrial Human Action Recognitions Based on Large-Scale Foundation Models Wensheng Liang et.al. 2403.08420 null
2024-03-13 NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation Ran Xu et.al. 2403.08355 null
2024-03-13 ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation Guanxing Lu et.al. 2403.08321 null
2024-03-12 NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning Bingqian Lin et.al. 2403.07376 link
2024-03-12 BID: Boundary-Interior Decoding for Unsupervised Temporal Action Localization Pre-Training Qihang Fang et.al. 2403.07354 null
2024-03-11 Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling Wele Gedara Chaminda Bandara et.al. 2403.06978 link
2024-03-11 Deep Learning Approaches for Human Action Recognition in Video Data Yufei Xie et.al. 2403.06810 null
2024-03-11 Real-Time Multimodal Cognitive Assistant for Emergency Medical Services Keshara Weerasinghe et.al. 2403.06734 null
2024-03-11 Multimodal Transformers for Real-Time Surgical Activity Prediction Keshara Weerasinghe et.al. 2403.06705 link
2024-03-11 epsilon-Mesh Attack: A Surface-based Adversarial Point Cloud Attack for Facial Expression Recognition Batuhan Cengiz et.al. 2403.06661 null
2024-03-11 Density-Guided Label Smoothing for Temporal Localization of Driving Actions Tunc Alkanat et.al. 2403.06616 null
2024-03-11 Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition Erkut Akdag et.al. 2403.06577 null
2024-03-10 Coherent Temporal Synthesis for Incremental Action Segmentation Guodong Ding et.al. 2403.06102 null
2024-03-09 Dissecting Deep RL with High Update Ratios: Combatting Value Overestimation and Divergence Marcel Hussing et.al. 2403.05996 null
2024-03-08 Benchmarking Micro-action Recognition: Dataset, Methods, and Applications Dan Guo et.al. 2403.05234 link
2024-03-06 Video Relationship Detection Using Mixture of Experts Ala Shaabana et.al. 2403.03994 null
2024-03-05 Behavior Generation with Latent Actions Seungjae Lee et.al. 2403.03181 link
2024-03-05 Learning to Use Tools via Cooperative and Interactive Agents Zhengliang Shi et.al. 2403.03031 null
2024-03-04 Gesture recognition with Brownian reservoir computing using geometrically confined skyrmion dynamics Grischa Beneke et.al. 2403.01877 null
2024-03-04 A Simple Baseline for Efficient Hand Mesh Reconstruction Zhishan Zhou et.al. 2403.01813 null
2024-03-03 A Unified Model Selection Technique for Spectral Clustering Based Motion Segmentation Yuxiang Huang et.al. 2403.01606 null
2024-03-03 Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition Kun-Yu Lin et.al. 2403.01560 link
2024-03-02 Dynamic 3D Point Cloud Sequences as 2D Videos Yiming Zeng et.al. 2403.01129 null
2024-02-29 On the Design of Human-Robot Collaboration Gestures Anas Shrinah et.al. 2402.19058 null
2024-02-23 Multimodal Transformer With a Low-Computational-Cost Guarantee Sungjin Park et.al. 2402.15096 null
2024-02-17 Implementation of a Model of the Cortex Basal Ganglia Loop Naoya Arakawa et.al. 2402.13275 null
2024-02-20 Radar-Based Recognition of Static Hand Gestures in American Sign Language Christian Schuessler et.al. 2402.12800 null
2024-02-20 Learning Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition Yuke Li et.al. 2402.12706 null
2024-02-19 Comprehensive Cognitive LLM Agent for Smartphone GUI Automation Xinbei Ma et.al. 2402.11941 null
2024-02-15 Hand Shape and Gesture Recognition using Multiscale Template Matching, Background Subtraction and Binary Image Analysis Ketan Suhaas Saichandran et.al. 2402.09663 null
2024-02-14 TikTokActions: A TikTok-Derived Video Dataset for Human Action Recognition Yang Qian et.al. 2402.08875 null
2024-02-13 BdSLW60: A Word-Level Bangla Sign Language Dataset Husne Ara Rubaiyeat et.al. 2402.08635 link
2024-02-13 Vision-Based Hand Gesture Customization from a Single Demonstration Soroush Shahi et.al. 2402.08420 null
2024-02-12 PBADet: A One-Stage Anchor-Free Approach for Part-Body Association Zhongpai Gao et.al. 2402.07814 null

(back to top)

Pose Estimation

Publish Date Title Authors PDF Code
2024-05-21 Leveraging Neural Radiance Fields for Pose Estimation of an Unknown Space Object during Proximity Operations Antoine Legrand et.al. 2405.12728 null
2024-05-21 PoseGravity: Pose Estimation from Points and Lines with Axis Prior Akshay Chandrasekhar et.al. 2405.12646 null
2024-05-19 Focus on Low-Resolution Information: Multi-Granular Information-Lossless Model for Low-Resolution Human Pose Estimation Zejun Gu et.al. 2405.12247 null
2024-05-20 AutoSoccerPose: Automated 3D posture Analysis of Soccer Shot Movements Calvin Yeung et.al. 2405.12070 link
2024-05-19 Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries Christiaan G. A. Viviers et.al. 2405.11677 link
2024-05-19 Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation Zejun Gu et.al. 2405.11448 null
2024-05-18 PS6D: Point Cloud Based Symmetry-Aware 6D Object Pose Estimation in Robot Bin-Picking Yifan Yang et.al. 2405.11257 null
2024-05-18 MotionGS: Compact Gaussian Splatting SLAM by Motion Filter Xinli Guo et.al. 2405.11129 link
2024-05-17 Resolving Symmetry Ambiguity in Correspondence-based Methods for Instance-level Object Pose Estimation Yongliang Lin et.al. 2405.10557 null
2024-05-16 Diversity-Aware Sign Language Production through a Pose Encoding Variational Autoencoder Mohamed Ilyes Lakhal et.al. 2405.10423 null
2024-05-17 Toon3D: Seeing Cartoons from a New Perspective Ethan Weber et.al. 2405.10320 null
2024-05-15 Task-adaptive Q-Face Haomiao Sun et.al. 2405.09059 null
2024-05-14 RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images Zong-Wei Hong et.al. 2405.08483 link
2024-05-14 TP3M: Transformer-based Pseudo 3D Image Matching with Reference Liming Han et.al. 2405.08434 null
2024-05-13 Deep Learning-Based Object Pose Estimation: A Comprehensive Survey Jian Liu et.al. 2405.07801 link
2024-05-13 JointLoc: A Real-time Visual Localization Framework for Planetary UAVs Based on Joint Relative and Absolute Pose Estimation Xubo Luo et.al. 2405.07429 link
2024-05-11 TD-NeRF: Novel Truncated Depth Prior for Joint Camera Pose and Neural Radiance Field Optimization Zhen Tan et.al. 2405.07027 null
2024-05-11 AHPPEBot: Autonomous Robot for Tomato Harvesting based on Phenotyping and Pose Estimation Xingxu Li et.al. 2405.06959 null
2024-05-10 CasCalib: Cascaded Calibration for Motion Capture from Sparse Unsynchronized Cameras James Tang et.al. 2405.06845 link
2024-05-10 MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization Pengcheng Zhu et.al. 2405.06241 null
2024-05-10 Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera Haixin Shi et.al. 2405.05858 null
2024-05-09 Semi-Autonomous Laparoscopic Robot Docking with Learned Hand-Eye Information Fusion Huanyu Tian et.al. 2405.05817 null
2024-05-09 NeuRSS: Enhancing AUV Localization and Bathymetric Mapping with Neural Rendering for Sidescan SLAM Yiping Xie et.al. 2405.05807 null
2024-05-09 Benchmarking Neural Radiance Fields for Autonomous Robots: An Overview Yuhang Ming et.al. 2405.05526 null
2024-05-08 Adversary-Guided Motion Retargeting for Skeleton Anonymization Thomas Carr et.al. 2405.05428 null
2024-05-08 FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models Jinglin Xu et.al. 2405.05216 link
2024-05-08 ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature Fusion Bing Zhu et.al. 2405.05164 null
2024-05-08 GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation Ivan Bilić et.al. 2405.04890 null
2024-05-07 Learning Distributional Demonstration Spaces for Task-Specific Cross-Pose Estimation Jenny Wang et.al. 2405.04609 null
2024-05-07 Speak the Same Language: Global LiDAR Registration on BIM Using Pose Hough Transform Zhijian Qiao et.al. 2405.03969 null
2024-05-07 Joint Estimation of Identity Verification and Relative Pose for Partial Fingerprints Xiongjun Guan et.al. 2405.03959 null
2024-05-06 Pose Priors from Language Models Sanjay Subramanian et.al. 2405.03689 null
2024-05-06 Optimizing Hand Region Detection in MediaPipe Holistic Full-Body Pose Estimation to Improve Accuracy and Avoid Downstream Errors Amit Moryossef et.al. 2405.03545 link
2024-05-05 Multi-hop graph transformer network for 3D human pose estimation Zaedul Islam et.al. 2405.03055 null
2024-05-05 Blending Distributed NeRFs with Tri-stage Robust Pose Optimization Baijun Ye et.al. 2405.02880 null
2024-05-03 WeightedPose: Generalizable Cross-Pose Estimation via Weighted SVD Xuxin Cheng et.al. 2405.02241 null
2024-05-03 Probablistic Restoration with Adaptive Noise Sampling for 3D Human Pose Estimation Xianzhou Zeng et.al. 2405.02114 link
2024-05-03 An Onboard Framework for Staircases Modeling Based on Point Clouds Chun Qing et.al. 2405.01918 null
2024-05-06 ShadowNav: Autonomous Global Localization for Lunar Navigation in Darkness Deegan Atha et.al. 2405.01673 null
2024-05-02 IntervenGen: Interventional Data Generation for Robust and Data-Efficient Robot Imitation Learning Ryan Hoque et.al. 2405.01472 null
2024-05-02 Behavior Imitation for Manipulator Control and Grasping with Deep Reinforcement Learning Liu Qiyuan et.al. 2405.01284 null
2024-05-02 Sports Analysis and VR Viewing System Based on Player Tracking and Pose Estimation with Multimodal and Multiview Sensors Wenxuan Guo et.al. 2405.01112 null
2024-05-02 CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications Jan Blumenkamp et.al. 2405.01107 null
2024-05-04 HandSSCA: 3D Hand Mesh Reconstruction with State Space Channel Attention from RGB images Zixun Jiao et.al. 2405.01066 null
2024-05-01 Radar-Based Localization For Autonomous Ground Vehicles In Suburban Neighborhoods Andrew J. Kramer et.al. 2405.00600 null
2024-04-30 Ultra Inertial Poser: Scalable Motion Capture and Tracking from Sparse Inertial Sensors and Ultra-Wideband Ranging Rayan Armani et.al. 2404.19541 link
2024-04-30 UniFS: Universal Few-shot Instance Perception with Point Representations Sheng Jin et.al. 2404.19401 null
2024-04-30 Quater-GCN: Enhancing 3D Human Pose Estimation with Orientation and Semi-supervised Training Xingyu Song et.al. 2404.19279 null
2024-04-30 XFeat: Accelerated Features for Lightweight Image Matching Guilherme Potje et.al. 2404.19174 null
2024-04-29 Self-Avatar Animation in Virtual Reality: Impact of Motion Signals Artifacts on the Full-Body Pose Reconstruction Antoine Maiorca et.al. 2404.18628 null
2024-04-29 Mesh-based Photorealistic and Real-time 3D Mapping for Robust Visual Perception of Autonomous Underwater Vehicle Jungwoo Lee et.al. 2404.18395 null
2024-04-29 Reconstructing Satellites in 3D from Amateur Telescope Images Zhiming Chang et.al. 2404.18394 null
2024-04-27 Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs Yiming Bao et.al. 2404.17837 null
2024-04-26 Localization Through Particle Filter Powered Neural Network Estimated Monocular Camera Poses Yi Shen et.al. 2404.17685 null
2024-04-26 SLAM for Indoor Mapping of Wide Area Construction Environments Vincent Ress et.al. 2404.17215 null
2024-04-25 WheelPose: Data Synthesis Techniques to Improve Pose Estimation Performance on Wheelchair Users William Huang et.al. 2404.17063 link
2024-04-25 Transformer-Based Local Feature Matching for Multimodal Image Registration Remi Delaunay et.al. 2404.16802 null
2024-04-25 DeepKalPose: An Enhanced Deep-Learning Kalman Filter for Temporally Consistent Monocular Vehicle Pose Estimation Leandro Di Bella et.al. 2404.16558 null
2024-04-25 Efficient Solution of Point-Line Absolute Pose Petr Hruby et.al. 2404.16552 link
2024-04-25 COBRA -- COnfidence score Based on shape Regression Analysis for method-independent quality assessment of object pose estimation from single images Panagiotis Sapoutzoglou et.al. 2404.16471 link
2024-04-25 MegaParticles: Range-based 6-DoF Monte Carlo Localization with GPU-Accelerated Stein Particle Filter Kenji Koide et.al. 2404.16370 null
2024-04-24 3D Human Pose Estimation with Occlusions: Introducing BlendMimic3D Dataset and GCN Refinement Filipa Lino et.al. 2404.16136 null
2024-04-23 SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation Xiangyu Xu et.al. 2404.15276 link
2024-04-25 Domain adaptive pose estimation via multi-level alignment Yugan Chen et.al. 2404.14885 link
2024-04-23 Semi-supervised 2D Human Pose Estimation via Adaptive Keypoint Masking Kexin Meng et.al. 2404.14835 null
2024-04-23 UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues Vandad Davoodnia et.al. 2404.14634 null
2024-04-22 DHRNet: A Dual-Path Hierarchical Relation Network for Multi-Person Pose Estimation Yonghao Dang et.al. 2404.14025 null
2024-04-23 CT-NeRF: Incremental Optimizing Neural Radiance Field and Poses with Complex Trajectory Yunlong Ran et.al. 2404.13896 null
2024-04-21 Resampling-free Particle Filters in High-dimensions Akhilan Boopathy et.al. 2404.13698 null
2024-04-20 EC-SLAM: Real-time Dense Neural RGB-D SLAM System with Effectively Constrained Global Bundle Adjustment Guanghao Li et.al. 2404.13346 link
2024-04-18 Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds Oliver Lemke et.al. 2404.12440 null
2024-04-18 Gait Recognition from Highly Compressed Videos Andrei Niculae et.al. 2404.12183 null
2024-04-17 Mushroom Segmentation and 3D Pose Estimation from Point Clouds using Fully Convolutional Geometric Features and Implicit Pose Encoding George Retsinas et.al. 2404.12144 link
2024-04-17 Kathakali Hand Gesture Recognition With Minimal Data Kavitha Raju et.al. 2404.11205 null
2024-04-17 GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement Linfang Zheng et.al. 2404.11139 null
2024-04-17 CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation Lianyu Hu et.al. 2404.11111 link
2024-04-16 HumMUSS: Human Motion Understanding using State Space Models Arnab Kumar Mondal et.al. 2404.10880 null
2024-04-16 Invariant Kalman Filtering with Noise-Free Pseudo-Measurements Sven Goffin et.al. 2404.10687 null
2024-04-16 The Unreasonable Effectiveness of Pre-Trained Features for Camera Pose Refinement Gabriele Trivigno et.al. 2404.10438 null
2024-04-16 GaitPoint+: A Gait Recognition Network Incorporating Point Cloud Analysis and Recycling Huantao Ren et.al. 2404.10213 null
2024-04-16 LWIRPOSE: A novel LWIR Thermal Image Dataset and Benchmark Avinash Upadhyay et.al. 2404.10212 link
2024-04-15 LetsGo: Large-Scale Garage Modeling and Rendering via LiDAR-Assisted Gaussian Primitives Jiadi Cui et.al. 2404.09748 null
2024-04-14 In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition Wiktor Mucha et.al. 2404.09308 null
2024-04-13 DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detector Johan Edstedt et.al. 2404.08928 link
2024-04-16 3D Human Scan With A Moving Event Camera Kai Kohyama et.al. 2404.08504 null
2024-04-11 Separated Attention: An Improved Cycle GAN Based Under Water Image Enhancement Method Tashmoy Ghosh et.al. 2404.07649 null
2024-04-11 GLID: Pre-training a Generalist Encoder-Decoder Vision Model Jihao Liu et.al. 2404.07603 null
2024-04-10 Measuring proximity to standard planes during fetal brain ultrasound scanning Chiara Di Vece et.al. 2404.07124 null
2024-04-10 MoCap-to-Visual Domain Adaptation for Efficient Human Mesh Estimation from 2D Keypoints Bedirhan Uguz et.al. 2404.07094 null
2024-04-10 Gaussian-LIC: Photo-realistic LiDAR-Inertial-Camera SLAM with 3D Gaussian Splatting Xiaolei Lang et.al. 2404.06926 null
2024-04-09 Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences Axel Barroso-Laguna et.al. 2404.06337 link
2024-04-09 Incremental Joint Learning of Depth, Pose and Implicit Scene Representation on Monocular Camera in Large-scale Scenes Tianchen Deng et.al. 2404.06050 null
2024-04-09 Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation Zong-Wei Hong et.al. 2404.06029 null
2024-04-08 Learning 3D-Aware GANs from Unposed Images with Template Feature Field Xinya Chen et.al. 2404.05705 null
2024-04-08 Learning a Category-level Object Pose Estimator without Pose Annotations Fengrui Tian et.al. 2404.05626 null
2024-04-08 DepthMOT: Depth Cues Lead to a Strong Multi-Object Tracker Jiapeng Wu et.al. 2404.05518 link
2024-04-08 Two Hands Are Better Than One: Resolving Hand to Hand Intersections via Occupancy Networks Maksym Ivashechkin et.al. 2404.05414 null
2024-04-08 STITCH: Augmented Dexterity for Suture Throws Including Thread Coordination and Handoffs Kush Hari et.al. 2404.05151 null
2024-04-05 ToolEENet: Tool Affordance 6D Pose Estimation Yunlong Wang et.al. 2404.04193 null
2024-04-04 SDPose: Tokenized Pose Estimation via Circulation-Guide Self-Distillation Sichen Chen et.al. 2404.03518 link
2024-04-04 Multi Positive Contrastive Learning with Pose-Consistent Generated Images Sho Inayoshi et.al. 2404.03256 null
2024-04-04 HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud Wencan Cheng et.al. 2404.03159 link
2024-04-03 Fusing Multi-sensor Input with State Information on TinyML Brains for Autonomous Nano-drones Luca Crupi et.al. 2404.02567 null
2024-04-03 Semi-Supervised Unconstrained Head Pose Estimation in the Wild Huayi Zhou et.al. 2404.02544 link
2024-04-02 3D Congealing: 3D-Aware Image Alignment in the Wild Yunzhi Zhang et.al. 2404.02125 null
2024-04-02 SelfPose3d: Self-Supervised Multi-Person Multi-View 3d Pose Estimation Vinkle Srivastav et.al. 2404.02041 null
2024-04-01 Marrying NeRF with Feature Matching for One-step Pose Estimation Ronghan Chen et.al. 2404.00891 null
2024-03-31 Graph-Based vs. Error State Kalman Filter-Based Fusion Of 5G And Inertial Data For MAV Indoor Pose Estimation Meisam Kabiri et.al. 2404.00691 null
2024-03-31 OmniLocalRF: Omnidirectional Local Radiance Fields from Dynamic Videos Dongyoung Choi et.al. 2404.00676 null
2024-04-02 KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation Jihua Peng et.al. 2404.00658 link
2024-03-29 FetalDiffusion: Pose-Controllable 3D Fetal MRI Synthesis with Conditional Diffusion Model Molin Zhang et.al. 2404.00132 null
2024-03-29 Latent Embedding Clustering for Occlusion Robust Head Pose Estimation José Celestino et.al. 2403.20251 null
2024-03-29 A Unified Framework for Human-centric Point Cloud Video Understanding Yiteng Xu et.al. 2403.20031 null
2024-04-01 Video-Based Human Pose Regression via Decoupled Space-Time Aggregation Jijie He et.al. 2403.19926 link
2024-03-28 Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation Xiao Lin et.al. 2403.19527 link
2024-03-27 Object Pose Estimation via the Aggregation of Diffusion Features Tianfu Wang et.al. 2403.18791 link
2024-03-27 RoboKeyGen: Robot Pose and Joint Angles Estimation via Diffusion-based 3D Keypoint Generation Yang Tian et.al. 2403.18259 null
2024-03-26 Mathematical Foundation and Corrections for Full Range Head Pose Estimation Huei-Chung Hu et.al. 2403.18104 null
2024-03-26 EgoPoseFormer: A Simple Baseline for Egocentric 3D Human Pose Estimation Chenhongyi Yang et.al. 2403.18080 null
2024-03-26 A Survey on 3D Egocentric Human Pose Estimation Md Mushfiqur Azam et.al. 2403.17893 null
2024-03-26 GTA-HDR: A Large-Scale Synthetic Dataset for HDR Image Reconstruction Hrishav Bakul Barua et.al. 2403.17837 link
2024-03-26 DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions Sammy Christen et.al. 2403.17827 null
2024-03-26 System Calibration of a Field Phenotyping Robot with Multiple High-Precision Profile Laser Scanners Felix Esser et.al. 2403.17788 null
2024-03-25 Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos Remy Sabathier et.al. 2403.17103 null
2024-03-25 Characterisation of the Intel RealSense D415 Stereo Depth Camera for Motion-Corrected CT Perfusion Imaging Mahdieh Dashtbani Moghari et.al. 2403.16490 null
2024-03-25 Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects Zicong Fan et.al. 2403.16428 null
2024-03-25 A Geometric Perspective on Fusing Gaussian Distributions on Lie Groups Yixiao Ge et.al. 2403.16411 null
2024-03-25 ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation Hannah Schieber et.al. 2403.16400 null
2024-03-24 KITchen: A Real-World Benchmark and Dataset for 6D Object Pose Estimation in Kitchen Environments Abdelrahman Younes et.al. 2403.16238 null
2024-03-24 Diffusion Model is a Good Pose Estimator from 3D RF-Vision Junqiao Fan et.al. 2403.16198 null
2024-03-23 UPNeRF: A Unified Framework for Monocular 3D Object Reconstruction and Pose Estimation Yuliang Guo et.al. 2403.15705 null
2024-03-22 InterFusion: Text-Driven Generation of 3D Human-Object Interaction Sisi Dai et.al. 2403.15612 null
2024-03-22 Augmented Reality Warnings in Roadway Work Zones: Evaluating the Effect of Modality on Worker Reaction Times Sepehr Sabeti et.al. 2403.15571 null
2024-03-22 Gesture-Controlled Aerial Robot Formation for Human-Swarm Interaction in Safety Monitoring Applications Vít Krátký et.al. 2403.15333 null
2024-03-22 WSCLoc: Weakly-Supervised Sparse-View Camera Relocalization Jialu Wang et.al. 2403.15272 null
2024-03-22 DITTO: Demonstration Imitation by Trajectory Transformation Nick Heppert et.al. 2403.15203 null
2024-03-22 Cartoon Hallucinations Detection: Pose-aware In Context Visual Learning Bumsoo Kim et.al. 2403.15048 null
2024-03-22 Trajectory Regularization Enhances Self-Supervised Geometric Representation Jiayun Wang et.al. 2403.14973 null
2024-03-21 VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding Ahmad Mahmood et.al. 2403.14743 null
2024-03-21 Visibility-Aware Keypoint Localization for 6DoF Object Pose Estimation Ruyi Lian et.al. 2403.14559 null
2024-03-21 Exploring 3D Human Pose Estimation and Forecasting from the Robot's Perspective: The HARPER Dataset Andrea Avogaro et.al. 2403.14447 null
2024-03-21 Evaluation and Deployment of LiDAR-based Place Recognition in Dense Forests Haedam Oh et.al. 2403.14326 null
2024-03-21 Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation Francesco Di Felice et.al. 2403.14279 null
2024-03-20 DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses Chen Zhao et.al. 2403.13683 link
2024-03-20 Meta-Point Learning and Refining for Category-Agnostic Pose Estimation Junjie Chen et.al. 2403.13647 link
2024-03-20 Advancing 6D Pose Estimation in Augmented Reality -- Overcoming Projection Ambiguity with Uncontrolled Imagery Mayura Manawadu et.al. 2403.13434 null
2024-03-20 DOR3D-Net: Dense Ordinal Regression Network for 3D Hand Pose Estimation Yamin Mao et.al. 2403.13405 null
2024-03-20 ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics Qiaojun Yu et.al. 2403.13365 null
2024-03-20 MULAN-WC: Multi-Robot Localization Uncertainty-aware Active NeRF with Wireless Coordination Weiying Wang et.al. 2403.13348 null
2024-03-19 FaceXFormer: A Unified Transformer for Facial Analysis Kartik Narayan et.al. 2403.12960 null
2024-03-19 WHAC: World-grounded Humans and Cameras Wanqi Yin et.al. 2403.12959 null
2024-03-19 Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation Jingtao Sun et.al. 2403.12728 link
2024-03-19 IFFNeRF: Initialisation Free and Fast 6DoF pose estimation from a single image and a NeRF model Matteo Bortolon et.al. 2403.12682 null
2024-03-19 In-Hand Following of Deformable Linear Objects Using Dexterous Fingers with Tactile Sensing Mingrui Yu et.al. 2403.12676 null
2024-03-19 Self-learning Canonical Space for Multi-view 3D Human Pose Estimation Xiaoben Li et.al. 2403.12440 null
2024-03-19 Human Mesh Recovery from Arbitrary Multi-view Images Xiaoben Li et.al. 2403.12434 null
2024-03-19 XPose: eXplainable Human Pose Estimation Luyu Qiu et.al. 2403.12370 null
2024-03-18 HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data Mengqi Zhang et.al. 2403.12011 null
2024-03-18 Normalized Validity Scores for DNNs in Regression based Eye Feature Extraction Wolfgang Fuhl et.al. 2403.11665 null
2024-03-18 An Accurate and Real-time Relative Pose Estimation from Triple Point-line Images by Decoupling Rotation and Translation Zewen Xu et.al. 2403.11639 null
2024-03-18 LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models Yang Yang et.al. 2403.11627 link
2024-03-18 GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects Sungphill Moon et.al. 2403.11510 null
2024-03-17 A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation Qucheng Peng et.al. 2403.11310 null
2024-03-17 Compact 3D Gaussian Splatting For Dense Visual SLAM Tianchen Deng et.al. 2403.11247 null
2024-03-16 Robotic Task Success Evaluation Under Multi-modal Non-Parametric Object Pose Uncertainty Lakshadeep Naik et.al. 2403.10874 null
2024-03-16 DPPE: Dense Pose Estimation in a Plenoxels Environment using Gradient Approximation Christopher Kolios et.al. 2403.10773 null
2024-03-15 GS-Pose: Cascaded Framework for Generalizable Segmentation-based 6D Object Pose Estimation Dingding Cai et.al. 2403.10683 null
2024-03-15 CLOSURE: Fast Quantification of Pose Uncertainty Sets Yihuai Gao et.al. 2403.09990 null
2024-03-14 Scalable Autonomous Drone Flight in the Forest with Visual-Inertial SLAM and Dense Submaps Built without LiDAR Sebastián Barbas Laina et.al. 2403.09596 null
2024-03-14 Improving Real-Time Omnidirectional 3D Multi-Person Human Pose Estimation with People Matching and Unsupervised 2D-3D Lifting Pawel Knap et.al. 2403.09437 null
2024-03-14 LM2D: Lyrics- and Music-Driven Dance Synthesis Wenjie Yin et.al. 2403.09407 null
2024-03-14 SD-Net: Symmetric-Aware Keypoint Prediction and Domain Adaptation for 6D Pose Estimation In Bin-picking Scenarios Ding-Tao Huang et.al. 2403.09317 link
2024-03-14 MOTPose: Multi-object 6D Pose Estimation for Dynamic Video Sequences using Attention-based Temporal Fusion Arul Selvam Periyasamy et.al. 2403.09309 null
2024-03-13 Data Augmentation in Human-Centric Vision Wentao Jiang et.al. 2403.08650 null
2024-03-13 PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections Matteo Taiana et.al. 2403.08586 null
2024-03-13 NeRF-Supervised Feature Point Detection and Description Ali Youssef et.al. 2403.08156 null
2024-03-12 Q-SLAM: Quadric Representations for Monocular SLAM Chensheng Peng et.al. 2403.08125 null
2024-03-12 MRC-Net: 6-DoF Pose Estimation with MultiScale Residual Correlation Yuelong Li et.al. 2403.08019 null
2024-03-12 Uncertainty Quantification with Deep Ensembles for 6D Object Pose Estimation Kira Wursthorn et.al. 2403.07741 null
2024-03-12 Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving JunDa Cheng et.al. 2403.07535 null
2024-03-12 Category-Agnostic Pose Estimation for Point Clouds Bowen Liu et.al. 2403.07437 null
2024-03-12 Monocular Microscope to CT Registration using Pose Estimation of the Incus for Augmented Reality Cochlear Implant Surgery Yike Zhang et.al. 2403.07219 null
2024-03-11 Real-Time Simulated Avatar from Head-Mounted Sensors Zhengyi Luo et.al. 2403.06862 null
2024-03-11 Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition Erkut Akdag et.al. 2403.06577 null
2024-03-10 Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation Paweł A. Pierzchlewicz et.al. 2403.06164 link
2024-03-10 Diffusion Models Trained with Large Data Are Transferable Visual Models Guangkai Xu et.al. 2403.06090 null
2024-03-08 Prepared for the Worst: A Learning-Based Adversarial Attack for Resilience Analysis of the ICP Algorithm Ziyu Zhang et.al. 2403.05666 null
2024-03-11 Exploiting polar symmetry in designing equivariant observers for vision-based motion estimation Tarek Bouazza et.al. 2403.05450 null
2024-03-07 Real-Time Planning Under Uncertainty for AUVs Using Virtual Maps Ivana Collado-Gonzalez et.al. 2403.04936 null
2024-03-07 That's My Point: Compact Object-centric LiDAR Pose Estimation for Large-scale Outdoor Localisation Georgi Pramatarov et.al. 2403.04755 null
2024-03-07 Disentangled Diffusion-Based 3D Human Pose Estimation with Hierarchical Spatial and Temporal Denoiser Qingyuan Cai et.al. 2403.04444 null
2024-03-09 Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation Ruicong Liu et.al. 2403.04381 null
2024-03-05 FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation Chris Rockwell et.al. 2403.03221 null
2024-03-05 NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors Yannan He et.al. 2403.03122 null
2024-03-05 Improved LiDAR Odometry and Mapping using Deep Semantic Segmentation and Novel Outliers Detection Mohamed Afifi et.al. 2403.03111 null
2024-03-05 Splat-Nav: Safe Real-Time Robot Navigation in Gaussian Splatting Maps Timothy Chen et.al. 2403.02751 null
2024-03-04 PowerSkel: A Device-Free Framework Using CSI Signal for Human Skeleton Estimation in Power Station Cunyi Yin et.al. 2403.01913 link
2024-03-04 A Simple Baseline for Efficient Hand Mesh Reconstruction Zhishan Zhou et.al. 2403.01813 null
2024-03-03 MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images Junwen Huang et.al. 2403.01517 null
2024-03-02 Single-image camera calibration with model-free distortion correction Katia Genovese et.al. 2403.01263 null
2024-03-02 Grid-based Fast and Structural Visual Odometry Zhang Zhihe et.al. 2403.01110 null
2024-03-01 Optimal Robot Formations: Balancing Range-Based Observability and User-Defined Configurations Syed Shabbir Ahmed et.al. 2403.00988 null
2024-03-04 TEXterity -- Tactile Extrinsic deXterity: Simultaneous Tactile Estimation and Control for Extrinsic Dexterity Sangwoon Kim et.al. 2403.00049 null
2024-03-01 Graph Convolutional Neural Networks for Automated Echocardiography View Recognition: A Holistic Approach Sarina Thomas et.al. 2402.19062 null
2024-02-29 Deep Learning for 3D Human Pose Estimation and Mesh Recovery: A Survey Yang Liu et.al. 2402.18844 link
2024-02-28 Attention-Propagation Network for Egocentric Heatmap to 3D Pose Lifting Taeho Kang et.al. 2402.18330 link
2024-02-28 Location-guided Head Pose Estimation for Fisheye Image Bing Li et.al. 2402.18320 null
2024-02-28 NToP: NeRF-Powered Large-scale Dataset Generation for 2D and 3D Human Pose Estimation in Top-View Fisheye Images Jingrui Yu et.al. 2402.18196 null
2024-02-28 Six-Point Method for Multi-Camera Systems with Reduced Solution Space Banglei Guan et.al. 2402.18066 null
2024-02-27 Real-Time Estimation of Relative Pose for UAVs Using a Dual-Channel Feature Association Zhaoying Wang et.al. 2402.17504 null
2024-02-26 HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields Haozhe Qi et.al. 2402.17062 link
2024-02-26 DRSI-Net: Dual-Residual Spatial Interaction Network for Multi-Person Pose Estimation Shang Wu et.al. 2402.16640 null
2024-02-26 GEA: Reconstructing Expressive 3D Gaussian Avatar from Monocular Video Xinqi Liu et.al. 2402.16607 null
2024-02-26 DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer Yizhe Wu et.al. 2402.16308 null
2024-02-25 XAI-based gait analysis of patients walking with Knee-Ankle-Foot orthosis using video cameras Arnav Mishra et.al. 2402.16175 null

(back to top)

Image Generation

Publish Date Title Authors PDF Code
2024-05-21 Personalized Residuals for Concept-Driven Text-to-Image Generation Cusuh Ham et.al. 2405.12978 null
2024-05-21 An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation Zhiyu Tan et.al. 2405.12914 null
2024-05-21 Spatial-aware Attention Generative Adversarial Network for Semi-supervised Anomaly Detection in Medical Image Zerui Zhang et.al. 2405.12872 null
2024-05-21 A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability Li-Yang Tseng et.al. 2405.12847 null
2024-05-21 Leveraging Neural Radiance Fields for Pose Estimation of an Unknown Space Object during Proximity Operations Antoine Legrand et.al. 2405.12728 null
2024-05-21 CustomText: Customized Textual Image Generation using Diffusion Models Shubham Paliwal et.al. 2405.12531 null
2024-05-20 Diffusion for World Modeling: Visual Details Matter in Atari Eloi Alonso et.al. 2405.12399 link
2024-05-20 Paired Conditional Generative Adversarial Network for Highly Accelerated Liver 4D MRI Di Xu et.al. 2405.12357 null
2024-05-20 EGAN: Evolutional GAN for Ransomware Evasion Daniel Commey et.al. 2405.12266 null
2024-05-20 Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices Nathaniel Cohen et.al. 2405.12211 null
2024-05-20 Diffusion Models for Generating Ballistic Spacecraft Trajectories Tyler Presser et.al. 2405.11738 null
2024-05-19 URDFormer: A Pipeline for Constructing Articulated Simulation Environments from Real-World Images Zoey Chen et.al. 2405.11656 null
2024-05-19 Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation Sangyeop Yeo et.al. 2405.11614 null
2024-05-19 A GAN-Based Data Poisoning Attack Against Federated Learning Systems and Its Countermeasure Wei Sun et.al. 2405.11440 null
2024-05-18 UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers Duo Peng et.al. 2405.11336 null
2024-05-18 On the Trajectory Regularity of ODE-based Diffusion Sampling Defang Chen et.al. 2405.11326 null
2024-05-18 Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning Udi Aharon et.al. 2405.11258 null
2024-05-18 TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation Chengcheng Feng et.al. 2405.11236 null
2024-05-17 Improving face generation quality and prompt following with synthetic captions Michail Tarasiou et.al. 2405.10864 null
2024-05-17 Multi-scale Semantic Prior Features Guided Deep Neural Network for Urban Street-view Image Jianshun Zeng et.al. 2405.10504 null
2024-05-17 Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers Rya Sanovar et.al. 2405.10480 null
2024-05-16 Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model Zheng Gu et.al. 2405.10316 null
2024-05-16 UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models Sahel Sharifymoghaddam et.al. 2405.10311 null
2024-05-16 VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing Binghui Chen et.al. 2405.09985 null
2024-05-16 KPNDepth: Depth Estimation of Lane Images under Complex Rainy Environment Zhengxu Shi et.al. 2405.09964 null
2024-05-16 Chameleon: Mixed-Modal Early-Fusion Foundation Models Chameleon Team et.al. 2405.09818 null
2024-05-16 MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis Joseph Cho et.al. 2405.09806 null
2024-05-16 An Autoencoder and Generative Adversarial Networks Approach for Multi-Omics Data Imbalanced Class Handling and Classification Ibrahim Al-Hurani et.al. 2405.09756 null
2024-05-15 Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer Weifei Jin et.al. 2405.09470 null
2024-05-16 Global-Local Image Perceptual Score (GLIPS): Evaluating Photorealistic Quality of AI-Generated Images Memoona Aziz et.al. 2405.09426 null
2024-05-15 DeCoDEx: Confounder Detector Guidance for Improved Diffusion-based Counterfactual Explanations Nima Fathi et.al. 2405.09288 link
2024-05-15 SOEDiff: Efficient Distillation for Small Object Editing Qihe Pan et.al. 2405.09114 null
2024-05-15 Deep Learning in Earthquake Engineering: A Comprehensive Review Yazhou Xie et.al. 2405.09021 null
2024-05-14 Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Zhimin Li et.al. 2405.08748 link
2024-05-15 Similarity Metrics for MR Image-To-Image Translation Melanie Dohmen et.al. 2405.08431 null
2024-05-14 Compositional Text-to-Image Generation with Dense Blob Representations Weili Nie et.al. 2405.08246 null
2024-05-13 RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations Chengde Lin et.al. 2405.08114 link
2024-05-13 CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models Nick Stracke et.al. 2405.07913 null
2024-05-13 SAR Image Synthesis with Diffusion Models Denisa Qosja et.al. 2405.07776 null
2024-05-12 Semantic Loss Functions for Neuro-Symbolic Structured Prediction Kareem Ahmed et.al. 2405.07387 null
2024-05-12 Understanding and Evaluating Human Preferences for AI Generated Images with Instruction Tuning Jiarui Wang et.al. 2405.07346 link
2024-05-12 PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification Mohammad Shafiul Alam et.al. 2405.07332 link
2024-05-12 Stable Signature is Unstable: Removing Image Watermark from Diffusion Models Yuepeng Hu et.al. 2405.07145 null
2024-05-12 MAxPrototyper: A Multi-Agent Generation System for Interactive User Interface Prototyping Mingyue Yuan et.al. 2405.07131 null
2024-05-11 Unsupervised Density Neural Representation for CT Metal Artifact Reduction Qing Wu et.al. 2405.07047 null
2024-05-11 Semantic Guided Large Scale Factor Remote Sensing Image Super-resolution with Generative Diffusion Prior Ce Wang et.al. 2405.07044 link
2024-05-11 Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation Shengyuan Liu et.al. 2405.06948 null
2024-05-10 Controllable Image Generation With Composed Parallel Token Prediction Jamie Stirling et.al. 2405.06535 null
2024-05-10 SketchDream: Sketch-based Text-to-3D Generation and Editing Feng-Lin Liu et.al. 2405.06461 null
2024-05-09 Photonic quantum generative adversarial networks for classical data Tigran Sedrakyan et.al. 2405.06023 null
2024-05-09 Frame Interpolation with Consecutive Brownian Bridge Diffusion Zonglin Lyu et.al. 2405.05953 null
2024-05-09 Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion Models Zhe Ma et.al. 2405.05846 null
2024-05-10 MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation Yuxiang Wei et.al. 2405.05806 link
2024-05-09 Exploring Text-Guided Single Image Editing for Remote Sensing Images Fangzhou Han et.al. 2405.05769 null
2024-05-09 End-to-End Generative Semantic Communication Powered by Shared Semantic Knowledge Base Shuling Li et.al. 2405.05738 null
2024-05-09 VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis Zhihan Ju et.al. 2405.05667 null
2024-05-09 A Survey on Personalized Content Synthesis with Diffusion Models Xulu Zhang et.al. 2405.05538 null
2024-05-09 Characteristic Learning for Provable One Step Generation Zhao Ding et.al. 2405.05512 link
2024-05-08 Cross-Modality Translation with Generative Adversarial Networks to Unveil Alzheimer's Disease Biomarkers Reihaneh Hassanzadeh et.al. 2405.05462 null
2024-05-08 DrawL: Understanding the Effects of Non-Mainstream Dialects in Prompted Image Generation Joshua N. Williams et.al. 2405.05382 null
2024-05-08 Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo Nayantara Mudur et.al. 2405.05255 link
2024-05-08 StyleMamba: State Space Model for Efficient Text-driven Image Style Transfer Zijia Wang et.al. 2405.05027 null
2024-05-08 Discrepancy-based Diffusion Models for Lesion Detection in Brain MRI Keqiang Fan et.al. 2405.04974 null
2024-05-08 Improving Long Text Understanding with Knowledge Distilled from Summarization Model Yan Liu et.al. 2405.04955 null
2024-05-08 HAGAN: Hybrid Augmented Generative Adversarial Network for Medical Image Synthesis Zhihan Ju et.al. 2405.04902 null
2024-05-08 FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation Xuehai He et.al. 2405.04834 null
2024-05-07 TexControl: Sketch-Based Two-Stage Fashion Image Generation Using Diffusion Model Yongming Zhang et.al. 2405.04675 null
2024-05-07 ResNCT: A Deep Learning Model for the Synthesis of Nephrographic Phase Images in CT Urography Syed Jamal Safdar Gardezi et.al. 2405.04629 null
2024-05-07 SingIt! Singer Voice Transformation Amit Eliav et.al. 2405.04627 null
2024-05-07 Towards Geographic Inclusion in the Evaluation of Text-to-Image Models Melissa Hall et.al. 2405.04457 null
2024-05-07 Data augmentation experiments with style-based quantum generative adversarial networks on trapped-ion and superconducting-qubit technologies Julien Baglio et.al. 2405.04401 null
2024-05-07 Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation Jihyun Kim et.al. 2405.04356 null
2024-05-07 Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer Zhuoyi Yang et.al. 2405.04312 link
2024-05-07 Improving Offline Reinforcement Learning with Inaccurate Simulators Yiwen Hou et.al. 2405.04307 null
2024-05-07 Bayesian Simultaneous Localization and Multi-Lane Tracking Using Onboard Sensors and a SD Map Yuxuan Xia et.al. 2405.04290 null
2024-05-07 Bidirectional Adversarial Autoencoders for the design of Plasmonic Metasurfaces Yuansan Liu et.al. 2405.04056 link
2024-05-07 Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model Joo Young Choi et.al. 2405.03958 null
2024-05-06 Generated Contents Enrichment Mahdi Naseri et.al. 2405.03650 null
2024-05-06 CCDM: Continuous Conditional Diffusion Models for Image Generation Xin Ding et.al. 2405.03546 link
2024-05-06 GLIP: Electromagnetic Field Exposure Map Completion by Deep Generative Networks Mohammed Mallik et.al. 2405.03384 null
2024-05-05 AnoGAN for Tabular Data: A Novel Approach to Anomaly Detection Aditya Singh et.al. 2405.03075 null
2024-05-05 Boundary-aware Decoupled Flow Networks for Realistic Extreme Rescaling Jinmin Li et.al. 2405.02941 null
2024-05-05 Data-Efficient Molecular Generation with Hierarchical Textual Inversion Seojin Kim et.al. 2405.02845 null
2024-05-05 SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion Ziyun Qian et.al. 2405.02844 null
2024-05-05 ImageInWords: Unlocking Hyper-Detailed Image Descriptions Roopal Garg et.al. 2405.02793 link
2024-05-04 U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers Yuchuan Tian et.al. 2405.02730 null
2024-05-03 Functional Imaging Constrained Diffusion for Brain PET Synthesis from Structural MRI Minhui Yu et.al. 2405.02504 null
2024-05-03 Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification Siqi Yin et.al. 2405.02155 null
2024-05-03 Reconstructing the mid-infrared spectra of galaxies using ultraviolet to submillimeter photometry and Deep Generative Networks Agapi Rissaki et.al. 2405.02153 null
2024-05-03 Three-Dimensional Amyloid-Beta PET Synthesis from Structural MRI with Conditional Generative Adversarial Networks Fernando Vega et.al. 2405.02109 null
2024-05-03 AI-generated art perceptions with GenFrame -- an image-generating picture frame Peter Kun et.al. 2405.01901 null
2024-05-03 Defect Image Sample Generation With Diffusion Prior for Steel Surface Defect Recognition Yichun Tai et.al. 2405.01872 null
2024-05-03 Report on the AAPM Grand Challenge on deep generative modeling for learning medical image statistics Rucha Deshpande et.al. 2405.01822 null
2024-05-02 Long Tail Image Generation Through Feature Space Augmentation and Iterated Learning Rafael Elberg et.al. 2405.01705 link
2024-05-02 Investigation on optimal microstructure of dual-phase steel with high strength and ductility by machine learning Misato Suzuki et.al. 2405.01689 null
2024-05-02 Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance Kelvin C. K. Chan et.al. 2405.01356 null
2024-05-02 Towards Inclusive Face Recognition Through Synthetic Ethnicity Alteration Praveen Kumar Chandaliya et.al. 2405.01273 null
2024-05-02 DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines Ye Tian et.al. 2405.01248 null
2024-05-02 On Mechanistic Knowledge Localization in Text-to-Image Generative Models Samyadeep Basu et.al. 2405.01008 null
2024-05-01 SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models Burak Can Biner et.al. 2405.00878 null
2024-05-01 Guided Conditional Diffusion Classifier (ConDiff) for Enhanced Prediction of Infection in Diabetic Foot Ulcers Palawat Busaranuvong et.al. 2405.00858 null
2024-05-01 RGB↔X: Image decomposition and synthesis using material- and lighting-aware diffusion models Zheng Zeng et.al. 2405.00666 null
2024-05-01 UWAFA-GAN: Ultra-Wide-Angle Fluorescein Angiography Transformation via Multi-scale Generation and Registration Enhancement Ruiquan Ge et.al. 2405.00542 link
2024-05-01 Compressive Sensing Imaging Using Caustic Lens Mask Generated by Periodic Perturbation in a Ripple Tank Doğan Tunca Arık et.al. 2405.00407 null
2024-05-01 Beamforming Inferring by Conditional WGAN-GP for Holographic Antenna Arrays Fenghao Zhu et.al. 2405.00391 null
2024-05-01 Streamlining Image Editing with Layered Diffusion Brushes Peyman Gholami et.al. 2405.00313 null
2024-04-30 IgCONDA-PET: Implicitly-Guided Counterfactual Diffusion for Detecting Anomalies in PET Images Shadab Ahamed et.al. 2405.00239 link
2024-04-30 DOCCI: Descriptions of Connected and Contrasting Images Yasumasa Onoe et.al. 2404.19753 null
2024-04-30 Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Yunhao Ge et.al. 2404.19752 null
2024-04-30 SwipeGANSpace: Swipe-to-Compare Image Generation via Efficient Latent Space Exploration Yuto Nakashima et.al. 2404.19693 null
2024-04-30 Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model Denys Godwin et.al. 2404.19609 null
2024-04-30 TwinDiffusion: Enhancing Coherence and Efficiency in Panoramic Image Generation with Diffusion Models Teng Zhou et.al. 2404.19475 null
2024-04-30 InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Chanran Kim et.al. 2404.19427 null
2024-05-01 Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation Zhenglin Li et.al. 2404.19265 null
2024-05-01 FOTS: A Fast Optical Tactile Simulator for Sim2Real Learning of Tactile-motor Robot Manipulation Skills Yongqiang Zhao et.al. 2404.19217 null
2024-04-30 NeRF-Insert: 3D Local Editing with Multimodal Control Signals Benet Oriol Sabat et.al. 2404.19204 null
2024-04-29 DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing Minghao Chen et.al. 2404.18929 null
2024-04-29 TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation Junhao Cheng et.al. 2404.18919 null
2024-04-29 Hide and Seek: How Does Watermarking Impact Face Recognition? Yuguang Yao et.al. 2404.18890 null
2024-04-29 Learning Mixtures of Gaussians Using Diffusion Models Khashayar Gatmiry et.al. 2404.18869 null
2024-04-29 Socially Adaptive Path Planning Based on Generative Adversarial Network Yao Wang et.al. 2404.18687 null
2024-04-29 FlexiFilm: Long Video Generation with Flexible Conditions Yichen Ouyang et.al. 2404.18620 link
2024-04-29 Anywhere: A Multi-Agent Framework for Reliable and Diverse Foreground-Conditioned Image Inpainting Tianyidan Xie et.al. 2404.18598 null
2024-04-29 SIDBench: A Python Framework for Reliably Assessing Synthetic Image Detection Methods Manos Schinas et.al. 2404.18552 link
2024-04-29 Towards Image Synthesis with Photon Counting Stellar Intensity Interferometry Alessia Spolon et.al. 2404.18507 null
2024-04-29 Autonomous Quality and Hallucination Assessment for Virtual Tissue Staining and Digital Pathology Luzhe Huang et.al. 2404.18458 null
2024-04-26 Federated Transfer Component Analysis Towards Effective VNF Profiling Xunzheng Zhang et.al. 2404.17553 null
2024-04-26 Spatial-frequency Dual-Domain Feature Fusion Network for Low-Light Remote Sensing Image Enhancement Zishu Yao et.al. 2404.17400 null
2024-04-26 Trinity Detector: text-assisted and attention mechanisms based spectral fusion for diffusion generation image detection Jiawei Song et.al. 2404.17254 null
2024-04-26 ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion Ziyue Zhang et.al. 2404.17230 link
2024-04-26 DPGAN: A Dual-Path Generative Adversarial Network for Missing Data Imputation in Graphs Xindi Zheng et.al. 2404.17164 null
2024-04-26 An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder Yicheng Gu et.al. 2404.17161 null
2024-04-26 Synthesizing Iris Images using Generative Adversarial Networks: Survey and Comparative Analysis Shivangi Yadav et.al. 2404.17105 null
2024-04-25 Channel Modeling for FR3 Upper Mid-band via Generative Adversarial Networks Yaqi Hu et.al. 2404.17069 null
2024-04-25 DE-CGAN: Boosting rTMS Treatment Prediction with Diversity Enhancing Conditional Generative Adversarial Networks Matthew Squires et.al. 2404.16913 null
2024-04-25 REBEL: Reinforcement Learning via Regressing Relative Rewards Zhaolin Gao et.al. 2404.16767 null
2024-04-25 Denoising: from classical methods to deep CNNs Jean-Eric Campagne et.al. 2404.16617 link
2024-04-25 MuseumMaker: Continual Style Customization without Catastrophic Forgetting Chenxi Liu et.al. 2404.16612 null
2024-04-25 Conditional Distribution Modelling for Few-Shot Image Synthesis with Diffusion Models Parul Gupta et.al. 2404.16556 null
2024-04-25 OpenDlign: Enhancing Open-World 3D Learning with Depth-Aligned Images Ye Mao et.al. 2404.16538 null
2024-04-25 Cross-sensor super-resolution of irregularly sampled Sentinel-2 time series Aimi Okabayashi et.al. 2404.16409 link
2024-04-24 Guardians of the Quantum GAN Archisman Ghosh et.al. 2404.16156 null
2024-04-24 Quantitative Characterization of Retinal Features in Translated OCTA Rashadul Hasan Badhon et.al. 2404.16133 null
2024-04-24 Spinning solar jets explained through the interplay between plasma sheets and vortex columns Sahel Dey et.al. 2404.16096 null
2024-04-24 PuLID: Pure and Lightning ID Customization via Contrastive Alignment Zinan Guo et.al. 2404.16022 null
2024-04-24 Security Analysis of WiFi-based Sensing Systems: Threats from Perturbation Attacks Hangcheng Cao et.al. 2404.15587 null
2024-04-23 Multi-scale Intervention Planning based on Generative Design Ioannis Kavouras et.al. 2404.15492 null
2024-04-23 ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning Weifeng Chen et.al. 2404.15449 null
2024-04-23 GLoD: Composing Global Contexts and Local Details in Image Generation Moyuru Yamada et.al. 2404.15447 null
2024-04-23 From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation Zehuan Huang et.al. 2404.15267 null
2024-04-23 Adaptive Mixed-Scale Feature Fusion Network for Blind AI-Generated Image Quality Assessment Tianwei Zhou et.al. 2404.15163 null
2024-04-23 Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation Xun Wu et.al. 2404.15100 null
2024-04-23 CoARF: Controllable 3D Artistic Style Transfer for Radiance Fields Deheng Zhang et.al. 2404.14967 null
2024-04-23 Music Style Transfer With Diffusion Model Hong Huang et.al. 2404.14771 null
2024-04-23 SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models Bo Lin et.al. 2404.14755 null
2024-04-23 Skip the Benchmark: Generating System-Level High-Level Synthesis Data using Generative Machine Learning Yuchao Liao et.al. 2404.14754 null
2024-04-23 FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction Hang Hua et.al. 2404.14715 null
2024-04-22 The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking Yuying Li et.al. 2404.14581 null
2024-04-22 GeoDiffuser: Geometry-Based Image Editing with Diffusion Models Rahul Sajnani et.al. 2404.14403 null
2024-04-22 SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation Yuying Ge et.al. 2404.14396 link
2024-04-22 MultiBooth: Towards Generating All Your Concepts in an Image from Text Chenyang Zhu et.al. 2404.14239 link
2024-04-22 RHanDS: Refining Malformed Hands for Generated Images with Decoupled Structure and Style Guidance Chengrui Wang et.al. 2404.13984 null
2024-04-23 Accelerating Image Generation with Sub-path Linear Approximation Model Chen Xu et.al. 2404.13903 null
2024-04-22 Towards Better Text-to-Image Generation Alignment via Attention Modulation Yihang Wu et.al. 2404.13899 null
2024-04-22 Regional Style and Color Transfer Zhicheng Ding et.al. 2404.13880 null
2024-04-22 Distributional Black-Box Model Inversion Attack with Multi-Agent Reinforcement Learning Huan Bao et.al. 2404.13860 null
2024-04-22 A Comparative Study on Enhancing Prediction in Social Network Advertisement through Data Augmentation Qikai Yang et.al. 2404.13812 null
2024-04-21 Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation Jensen Hwa et.al. 2404.13798 null
2024-04-19 RadRotator: 3D Rotation of Radiographs with Diffusion Models Pouria Rouzrokh et.al. 2404.13000 null
2024-04-19 Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images Santosh et.al. 2404.12908 link
2024-04-19 Explainable Deepfake Video Detection using Convolutional Neural Network and CapsuleNet Gazi Hasin Ishrak et.al. 2404.12841 null
2024-04-19 Generative Modelling with High-Order Langevin Dynamics Ziqiang Shi et.al. 2404.12814 null
2024-04-19 PATE-TripleGAN: Privacy-Preserving Image Synthesis with Gaussian Differential Privacy Zepeng Jiang et.al. 2404.12730 null
2024-04-19 MLSD-GAN -- Generating Strong High Quality Face Morphing Attacks using Latent Semantic Disentanglement Aravinda Reddy PN et.al. 2404.12679 null
2024-04-19 How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples Dren Fazlija et.al. 2404.12653 null
2024-04-19 F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation Man M. Ho et.al. 2404.12650 null
2024-04-18 Alleviating Catastrophic Forgetting in Facial Expression Recognition with Emotion-Centered Models Israel A. Laurensi et.al. 2404.12260 null
2024-04-18 First 2D electron density measurements using Coherence Imaging Spectroscopy in the MAST-U Super-X divertor N. Lonigro et.al. 2404.12021 null
2024-04-18 ©Plug-in Authorization for Human Content Copyright Protection in Text-to-Image Model Chao Zhou et.al. 2404.11962 null
2024-04-18 Sketch-guided Image Inpainting with Partial Discrete Diffusion Process Nakul Sharma et.al. 2404.11949 link
2024-04-18 LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights Thibault Castells et.al. 2404.11936 null
2024-04-18 EdgeFusion: On-Device Text-to-Image Generation Thibault Castells et.al. 2404.11925 null
2024-04-18 Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans Lixing Tan et.al. 2404.11889 null
2024-04-18 Generating synthetic electroretinogram waveforms using Artificial Intelligence to improve classification of retinal conditions in under-represented populations Mikhail Kulyabin et.al. 2404.11842 null
2024-04-18 TextCenGen: Attention-Guided Text-Centric Background Adaptation for Text-to-Image Generation Tianyi Liang et.al. 2404.11824 null
2024-04-18 Tailoring Generative Adversarial Networks for Smooth Airfoil Design Joyjit Chattoraj et.al. 2404.11816 null
2024-04-17 On the Scalability of GNNs for Molecular Graphs Maciej Sypetkowski et.al. 2404.11568 null
2024-04-17 MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation Kuan-Chieh et.al. 2404.11565 null
2024-04-17 SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening Yu Zhong et.al. 2404.11537 null
2024-04-17 Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt Zhanjie Zhang et.al. 2404.11474 link
2024-04-17 What-if Analysis Framework for Digital Twins in 6G Wireless Network Management Elif Ak et.al. 2404.11394 null
2024-04-17 Image Generative Semantic Communication with Multi-Modal Similarity Estimation for Resource-Limited Networks Eri Hosonuma et.al. 2404.11280 null
2024-04-17 Optical Image-to-Image Translation Using Denoising Diffusion Models: Heterogeneous Change Detection as a Use Case João Gabriel Vinholi et.al. 2404.11243 null
2024-04-17 KI-GAN: Knowledge-Informed Generative Adversarial Networks for Enhanced Multi-Vehicle Trajectory Forecasting at Signalized Intersections Chuheng Wei et.al. 2404.11181 link
2024-04-17 TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing Sherry X. Chen et.al. 2404.11120 link
2024-04-17 Object Remover Performance Evaluation Methods using Class-wise Object Removal Images Changsuk Oh et.al. 2404.11104 null
2024-04-16 RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting Ashkan Mirzaei et.al. 2404.10765 null
2024-04-16 LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? Yuchi Wang et.al. 2404.10763 link
2024-04-16 AV-GAN: Attention-Based Varifocal Generative Adversarial Network for Uneven Medical Image Translation Zexin Li et.al. 2404.10714 null
2024-04-16 Gaussian Splatting Decoder for 3D-aware Generative Adversarial Networks Florian Barthel et.al. 2404.10625 null
2024-04-16 Adversarial Identity Injection for Semantic Face Image Synthesis Giuseppe Tarollo et.al. 2404.10408 null
2024-04-16 Generating Counterfactual Trajectories with Latent Diffusion Models for Concept Discovery Payal Varshney et.al. 2404.10356 null
2024-04-16 CanvasPic: An Interactive Tool for Freely Generating Facial Images Based on Spatial Layout Jiafu Wei et.al. 2404.10352 null
2024-04-16 OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model Runyi Li et.al. 2404.10312 null
2024-04-16 Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain Steve Andreas Immanuel et.al. 2404.10307 link
2024-04-16 OneActor: Consistent Character Generation via Cluster-Conditioned Guidance Jiahao Wang et.al. 2404.10267 null
2024-04-15 Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models Ziwei Luo et.al. 2404.09732 link
2024-04-15 VFLGAN: Vertical Federated Learning-based Generative Adversarial Network for Vertically Partitioned Data Publication Xun Yuan et.al. 2404.09722 null
2024-04-15 In-Context Translation: Towards Unifying Image Recognition, Processing, and Generation Han Xue et.al. 2404.09633 null
2024-04-15 Text-Driven Diverse Facial Texture Generation via Progressive Latent-Space Refinement Chi Wang et.al. 2404.09540 null
2024-04-15 Magic Clothing: Controllable Garment-Driven Image Synthesis Weifeng Chen et.al. 2404.09512 link
2024-04-15 Improved Object-Based Style Transfer with Single Deep Network Harshmohan Kulkarni et.al. 2404.09461 null
2024-04-15 Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models Peifei Zhu et.al. 2404.09401 null
2024-04-14 Counteracting Concept Drift by Learning with Future Malware Predictions Branislav Bosansky et.al. 2404.09352 null
2024-04-14 DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling Xuening Yuan et.al. 2404.09227 null
2024-04-13 InverseVis: Revealing the Hidden with Curved Sphere Tracing Kai Lawonn et.al. 2404.09092 null
2024-04-12 An improved tabular data generator with VAE-GMM integration Patricia A. Apellániz et.al. 2404.08434 null
2024-04-12 Counterfactual Explanations for Face Forgery Detection via Adversarial Removal of Artifacts Yang Li et.al. 2404.08341 link
2024-04-11 Latent Guard: a Safety Framework for Text-to-image Generation Runtao Liu et.al. 2404.08031 link
2024-04-11 Rethinking Artistic Copyright Infringements in the Era of Text-to-Image Generative Models Mazda Moayeri et.al. 2404.08030 null
2024-04-11 OpenBias: Open-set Bias Detection in Text-to-Image Generative Models Moreno D'Incà et.al. 2404.07990 null
2024-04-11 Taming Stable Diffusion for Text to 360° Panorama Image Generation Cheng Zhang et.al. 2404.07949 link
2024-04-11 Generating Synthetic Satellite Imagery With Deep-Learning Text-to-Image Models -- Technical Challenges and Implications for Monitoring and Verification Tuong Vy Nguyen et.al. 2404.07754 null
2024-04-11 Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models Tuomas Kynkäänniemi et.al. 2404.07724 null
2024-04-11 Model-based Cleaning of the QUILT-1M Pathology Dataset for Text-Conditional Image Synthesis Marc Aubreville et.al. 2404.07676 null
2024-04-11 Implicit and Explicit Language Guidance for Diffusion-based Visual Perception Hefeng Wang et.al. 2404.07600 null
2024-04-11 GAN-based iterative motion estimation in HASTE MRI Mathias S. Feinler et.al. 2404.07576 null
2024-04-11 ObjBlur: A Curriculum Learning Approach With Progressive Object-Level Blurring for Improved Layout-to-Image Generation Stanislav Frolov et.al. 2404.07564 null
2024-04-11 CAT: Contrastive Adapter Training for Personalized Image Generation Jae Wan Park et.al. 2404.07554 link
2024-04-11 Enhancing Network Intrusion Detection Performance using Generative Adversarial Networks Xinxing Zhao et.al. 2404.07464 null
2024-04-10 RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion Jaidev Shriram et.al. 2404.07199 null
2024-04-10 A Gauss-Newton Approach for Min-Max Optimization in Generative Adversarial Networks Neel Mishra et.al. 2404.07172 link
2024-04-10 Implicit Multi-Spectral Transformer: A Lightweight and Effective Visible to Infrared Image Translation Model Yijia Chen et.al. 2404.07072 link
2024-04-10 Fine color guidance in diffusion models and its application to image compression at extremely low bitrates Tom Bordin et.al. 2404.06865 null
2024-04-10 UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion Junsheng Zhou et.al. 2404.06851 null
2024-04-10 Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer Yanqi Ge et.al. 2404.06835 null
2024-04-10 MedRG: Medical Report Grounding with Multi-modal Large Language Model Ke Zou et.al. 2404.06798 null
2024-04-10 CryinGAN: Design and evaluation of point-cloud-based generative adversarial networks using disordered materials $-$ application to Li$_3$ScCl$_6$-LiCoO$_2$ battery interfaces Adrian Xiao Bin Yong et.al. 2404.06734 null
2024-04-10 Deep Generative Data Assimilation in Multimodal Setting Yongquan Qu et.al. 2404.06665 link
2024-04-09 GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis Srikumar Sastry et.al. 2404.06637 link
2024-04-09 High Noise Scheduling is a Must Mahmut S. Gokmen et.al. 2404.06353 null
2024-04-09 Fortifying Fully Convolutional Generative Adversarial Networks for Image Super-Resolution Using Divergence Measures Arkaprabha Basu et.al. 2404.06294 null
2024-04-09 Hyperparameter-Free Medical Image Synthesis for Sharing Data and Improving Site-Specific Segmentation Alexander Chebykin et.al. 2404.06240 link
2024-04-09 DiffHarmony: Latent Diffusion Model Meets Image Harmonization Pengfei Zhou et.al. 2404.06139 null
2024-04-09 Greedy-DiM: Greedy Algorithms for Unreasonably Effective Face Morphs Zander W. Blasingame et.al. 2404.06025 null
2024-04-09 Boosting Digital Safeguards: Blending Cryptography and Steganography Anamitra Maiti et.al. 2404.05985 null
2024-04-09 Tackling Structural Hallucination in Image Translation with Local Diffusion Seunghoi Kim et.al. 2404.05980 null
2024-04-09 StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion Ming Tao et.al. 2404.05979 link
2024-04-09 Quantum Generative Adversarial Networks in a Silicon Photonic Chip with Maximum Expressibility Haoran Ma et.al. 2404.05921 null
2024-04-08 SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing Jing Gu et.al. 2404.05717 null
2024-04-08 Learning 3D-Aware GANs from Unposed Images with Template Feature Field Xinya Chen et.al. 2404.05705 null
2024-04-08 SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation Heyuan Li et.al. 2404.05680 null
2024-04-08 MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation Kunpeng Song et.al. 2404.05674 null
2024-04-08 Automatic Controllable Colorization via Imagination Xiaoyan Cong et.al. 2404.05661 null
2024-04-08 UniFL: Improve Stable Diffusion via Unified Feedback Learning Jiacheng Zhang et.al. 2404.05595 null
2024-04-08 Mind-to-Image: Projecting Visual Mental Imagination of the Brain from fMRI Hugo Caselles-Dupré et.al. 2404.05468 null
2024-04-08 CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery Sai Bhargav Rongali et.al. 2404.05366 null
2024-04-08 Mask-ControlNet: Higher-Quality Image Generation with An Additional Mask Prompt Zhiqi Huang et.al. 2404.05331 null
2024-04-08 MC$^2$: Multi-concept Guidance for Customized Multi-concept Generation Jiaxiu Jiang et.al. 2404.05268 null
2024-04-04 No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance Vishaal Udandarao et.al. 2404.04125 link
2024-04-05 3D Facial Expressions through Analysis-by-Neural-Synthesis George Retsinas et.al. 2404.04104 null
2024-04-05 Dynamic Prompt Optimizing for Text-to-Image Generation Wenyi Mo et.al. 2404.04095 link
2024-04-05 Physics-Inspired Synthesized Underwater Image Dataset Reina Kaneko et.al. 2404.03998 null
2024-04-05 Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models Gihyun Kwon et.al. 2404.03913 null
2024-04-04 RaFE: Generative Radiance Fields Restoration Zhongkai Wu et.al. 2404.03654 null
2024-04-04 CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching Dongzhi Jiang et.al. 2404.03653 link
2024-04-04 Reference-Based 3D-Aware Image Editing with Triplane Bahri Batuhan Bilecen et.al. 2404.03632 null
2024-04-04 Robust Concept Erasure Using Task Vectors Minh Pham et.al. 2404.03631 null
2024-04-04 Terrain Point Cloud Inpainting via Signal Decomposition Yizhou Xie et.al. 2404.03572 null
2024-04-04 Integrating Generative AI into Financial Market Prediction for Improved Decision Making Chang Che et.al. 2404.03523 null
2024-04-04 Knowledge Distillation-Based Model Extraction Attack using Private Counterfactual Explanations Fatima Ezzeddine et.al. 2404.03348 null
2024-04-04 Multi Positive Contrastive Learning with Pose-Consistent Generated Images Sho Inayoshi et.al. 2404.03256 null
2024-04-04 Would Deep Generative Models Amplify Bias in Future Models? Tianwei Chen et.al. 2404.03242 null
2024-04-04 Diverse and Tailored Image Generation for Zero-shot Multi-label Classification Kaixin Zhang et.al. 2404.03144 null
2024-04-03 Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Keyu Tian et.al. 2404.02905 link
2024-04-03 MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment Duygu Ceylan et.al. 2404.02899 null
2024-04-03 On the Scalability of Diffusion-based Text-to-Image Generation Hao Li et.al. 2404.02883 null
2024-04-03 MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation Petru-Daniel Tudosiu et.al. 2404.02790 null
2024-04-03 InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation Haofan Wang et.al. 2404.02733 link
2024-04-03 Model-agnostic Origin Attribution of Generated Images with Few-shot Examples Fengyuan Liu et.al. 2404.02697 null
2024-04-03 Deep Privacy Funnel Model: From a Discriminative to a Generative Approach with an Application to Face Recognition Behrooz Razeghi et.al. 2404.02696 null
2024-04-03 Severity Controlled Text-to-Image Generative Model Bias Manipulation Jordan Vice et.al. 2404.02530 null
2024-04-03 Designing a Photonic Physically Unclonable Function Having Resilience to Machine Learning Attacks Elena R. Henderson et.al. 2404.02440 null
2024-04-02 Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models Zeyu Yang et.al. 2404.02148 link
2024-04-02 3D Congealing: 3D-Aware Image Alignment in the Wild Yunzhi Zhang et.al. 2404.02125 null
2024-04-02 Red-Teaming Segment Anything Model Krzysztof Jankowski et.al. 2404.02067 link
2024-04-02 MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages Daryna Dementieva et.al. 2404.02037 null
2024-04-02 Enhancing Portfolio Optimization with Transformer-GAN Integration: A Novel Approach in the Black-Litterman Framework Enmin Zhu et.al. 2404.02029 null
2024-04-02 Bi-LORA: A Vision-Language Approach for Synthetic Image Detection Mamadou Keita et.al. 2404.01959 null
2024-04-02 Real, fake and synthetic faces -- does the coin have three sides? Shahzeb Naeem et.al. 2404.01878 null
2024-04-02 Disentangled Pre-training for Human-Object Interaction Detection Zhuolong Li et.al. 2404.01725 null
2024-04-01 PlayFutures: Imagining Civic Futures with AI and Puppets Supratim Pait et.al. 2404.01527 null
2024-04-01 Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data Matthias Gerstgrasser et.al. 2404.01413 null
2024-03-29 Benchmarking Counterfactual Image Generation Thomas Melistas et.al. 2403.20287 link
2024-03-29 FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models Barbara Toniella Corradini et.al. 2403.20105 null
2024-03-29 SCINeRF: Neural Radiance Fields from a Snapshot Compressive Image Yunhao Li et.al. 2403.20018 link
2024-03-29 FairRAG: Fair Human Generation via Fair Retrieval Augmentation Robik Shrestha et.al. 2403.19964 null
2024-04-01 Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting Haipeng Liu et.al. 2403.19898 link
2024-03-28 Vision-Language Synthetic Data Enhances Echocardiography Downstream Tasks Pooria Ashrafian et.al. 2403.19880 link
2024-03-28 Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization Yuhang Li et.al. 2403.19866 null
2024-03-28 CLoRA: A Contrastive Approach to Compose Multiple LoRA Models Tuna Han Salih Meral et.al. 2403.19776 null
2024-03-28 Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond Katherine Xu et.al. 2403.19653 link
2024-03-28 GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models Yusuf Dalva et.al. 2403.19645 null
2024-03-28 Lane-Change in Dense Traffic with Model Predictive Control and Neural Networks Sangjae Bae et.al. 2403.19633 link
2024-03-28 Collaborative Interactive Evolution of Art in the Latent Space of Deep Generative Models Ole Hall et.al. 2403.19620 null
2024-03-28 Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model Zhicai Wang et.al. 2403.19600 link
2024-03-28 Frame by Familiar Frame: Understanding Replication in Video Diffusion Models Aimon Rahman et.al. 2403.19593 null
2024-03-28 Locate, Assign, Refine: Taming Customized Image Inpainting with Text-Subject Guidance Yulin Pan et.al. 2403.19534 null
2024-03-28 Imperceptible Protection against Style Imitation from Diffusion Models Namhyuk Ahn et.al. 2403.19254 null
2024-03-28 QNCD: Quantization Noise Correction for Diffusion Models Huanpeng Chu et.al. 2403.19140 link
2024-03-28 Synthetic Medical Imaging Generation with Generative Adversarial Networks For Plain Radiographs John R. McNulty et.al. 2403.19107 null
2024-03-27 Conditional Wasserstein Distances with Applications in Bayesian OT Flow Matching Jannis Chemseddine et.al. 2403.18705 null
2024-03-27 Attention Calibration for Disentangled Text-to-Image Personalization Yanbing Zhang et.al. 2403.18551 link
2024-03-27 DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis Zhongxi Chen et.al. 2403.18471 link
2024-03-27 DiffStyler: Diffusion-based Localized Image Style Transfer Shaoxu Li et.al. 2403.18461 null
2024-03-27 U-Sketch: An Efficient Approach for Sketch to Image Diffusion Models Ilias Mitsouras et.al. 2403.18425 null
2024-03-27 ECNet: Effective Controllable Text-to-Image Diffusion Models Sicheng Li et.al. 2403.18417 null
2024-03-27 Colour and Brush Stroke Pattern Recognition in Abstract Art using Modified Deep Convolutional Generative Adversarial Networks Srinitish Srinivasan et.al. 2403.18397 link
2024-03-27 Ship in Sight: Diffusion Models for Ship-Image Super Resolution Luigi Sigillo et.al. 2403.18370 link
2024-03-27 DSF-GAN: DownStream Feedback Generative Adversarial Network Oriel Perets et.al. 2403.18267 link
2024-03-27 Don't Look into the Dark: Latent Codes for Pluralistic Image Inpainting Haiwei Chen et.al. 2403.18186 null
2024-03-26 Boosting Diffusion Models with Moving Average Sampling in Frequency Domain Yurui Qian et.al. 2403.17870 null
2024-03-26 CT Synthesis with Conditional Diffusion Models for Abdominal Lymph Node Segmentation Yongrui Yu et.al. 2403.17770 null
2024-03-26 FaultGuard: A Generative Approach to Resilient Fault Prediction in Smart Electrical Grids Emad Efatinasab et.al. 2403.17494 null
2024-03-26 LaRE$^2$: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection Yunpeng Luo et.al. 2403.17465 null
2024-03-26 An inexact proximal MM method for a class of nonconvex composite image reconstruction models Bujin Li et.al. 2403.17450 null
2024-03-25 DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment Stella Bounareli et.al. 2403.17217 null
2024-03-25 FlashFace: Human Image Personalization with High-fidelity Identity Preservation Shilong Zhang et.al. 2403.17008 null
2024-03-25 SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer Rui Zhu et.al. 2403.17004 null
2024-03-25 Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation Omer Dahary et.al. 2403.16990 null
2024-03-25 Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance Jingyuan Zhu et.al. 2403.16954 null
2024-03-25 Iso-Diffusion: Improving Diffusion Probabilistic Models Using the Isotropy of the Additive Gaussian Noise Dilum Fernando et.al. 2403.16790 null
2024-03-25 Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases Sophie Starck et.al. 2403.16776 null
2024-03-25 Multi-Scale Texture Loss for CT denoising with GANs Francesco Di Feola et.al. 2403.16640 link
2024-03-25 SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions Yuda Song et.al. 2403.16627 null
2024-03-25 Enhancing Cross-Dataset EEG Emotion Recognition: A Novel Approach with Emotional EEG Style Transfer Network Yijin Zhou et.al. 2403.16540 null
2024-03-25 An Intermediate Fusion ViT Enables Efficient Text-Image Alignment in Diffusion Models Zizhao Hu et.al. 2403.16530 null
2024-03-25 Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator Takuhiro Kaneko et.al. 2403.16464 null
2024-03-25 Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation Sanyam Lakhanpal et.al. 2403.16422 null
2024-03-25 Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation Yingshan Chang et.al. 2403.16394 null
2024-03-25 Illuminating Systematic Trends in Nuclear Data with Generative Machine Learning Models Jordan M. R. Fox et.al. 2403.16389 null
2024-03-25 FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models Lin Zhao et.al. 2403.16379 null
2024-03-24 Fill in the ____ (a Diffusion-based Image Inpainting Pipeline) Eyoel Gebre et.al. 2403.16016 null
2024-03-22 DragAPart: Learning a Part-Level Motion Prior for Articulated Objects Ruining Li et.al. 2403.15382 null
2024-03-22 Long-CLIP: Unlocking the Long-Text Capability of CLIP Beichen Zhang et.al. 2403.15378 null
2024-03-22 A Wasserstein perspective of Vanilla GANs Lea Kunkel et.al. 2403.15312 null
2024-03-22 Controlled Training Data Generation with Diffusion Models Teresa Yeo et.al. 2403.15309 null
2024-03-22 Robust Utility Optimization via a GAN Approach Florian Krach et.al. 2403.15243 null
2024-03-22 A Multimodal Approach for Cross-Domain Image Retrieval Lucas Iijima et.al. 2403.15152 null
2024-03-22 MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration Zhichao Wei et.al. 2403.15059 null
2024-03-22 Cartoon Hallucinations Detection: Pose-aware In Context Visual Learning Bumsoo Kim et.al. 2403.15048 null
2024-03-22 Generative Active Learning for Image Synthesis Personalization Xulu Zhang et.al. 2403.14987 null
2024-03-22 CLIP-VQDiffusion: Language Free Training of Text To Image generation using CLIP and vector quantized diffusion model Seungdae Han et.al. 2403.14944 null
2024-03-21 Implicit Style-Content Separation using B-LoRA Yarden Frenkel et.al. 2403.14572 null
2024-03-21 DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing Yueru Jia et.al. 2403.14487 null
2024-03-21 AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks Max Ku et.al. 2403.14468 null
2024-03-21 Analysing Diffusion Segmentation for Medical Images Mathias Öttl et.al. 2403.14440 null
2024-03-21 Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation Mathias Öttl et.al. 2403.14429 null
2024-03-21 HySim: An Efficient Hybrid Similarity Measure for Patch Matching in Image Inpainting Saad Noufel et.al. 2403.14292 null
2024-03-21 Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models Pablo Marcos-Manchón et.al. 2403.14291 link
2024-03-21 Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations Xun Lin et.al. 2403.14250 null
2024-03-21 StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN Jongwoo Choi et.al. 2403.14186 null
2024-03-21 QSMDiff: Unsupervised 3D Diffusion Models for Quantitative Susceptibility Mapping Zhuang Xiong et.al. 2403.14070 null
2024-03-20 Learning from Models and Data for Visual Grounding Ruozhen He et.al. 2403.13804 null
2024-03-20 Step-Calibrated Diffusion for Biomedical Optical Image Restoration Yiwei Lyu et.al. 2403.13680 null
2024-03-20 ReGround: Improving Textual and Spatial Grounding at No Cost Yuseung Lee et.al. 2403.13589 null
2024-03-20 Diversity-aware Channel Pruning for StyleGAN Compression Jiwoo Chung et.al. 2403.13548 link
2024-03-20 IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models Siying Cui et.al. 2403.13535 null
2024-03-20 Deepfake Detection without Deepfakes: Generalization via Synthetic Frequency Patterns Injection Davide Alessandro Coccomini et.al. 2403.13479 null
2024-03-20 S2DM: Sector-Shaped Diffusion Models for Video Generation Haoran Lang et.al. 2403.13408 null
2024-03-20 IIDM: Image-to-Image Diffusion Model for Semantic Image Synthesis Feng Liu et.al. 2403.13378 null
2024-03-20 AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation Jingkun An et.al. 2403.13352 null
2024-03-20 TiBiX: Leveraging Temporal Information for Bidirectional X-ray and Report Generation Santosh Sanjeev et.al. 2403.13343 null
2024-03-19 FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis Linjiang Huang et.al. 2403.12963 link
2024-03-19 Segment Anything for comprehensive analysis of grapevine cluster architecture and berry properties Efrain Torres-Lomas et.al. 2403.12935 null
2024-03-19 You Only Sample Once: Taming One-Step Text-To-Image Synthesis by Self-Cooperative Diffusion GANs Yihong Luo et.al. 2403.12931 link
2024-03-19 Ultra-High-Resolution Image Synthesis with Pyramid Diffusion Model Jiajie Yang et.al. 2403.12915 link
2024-03-19 Generative Enhancement for 3D Medical Images Lingting Zhu et.al. 2403.12852 link
2024-03-19 How Spammers and Scammers Leverage AI-Generated Images on Facebook for Audience Growth Renee DiResta et.al. 2403.12838 null
2024-03-19 Total Disentanglement of Font Images into Style and Character Class Features Daichi Haraguchi et.al. 2403.12784 null
2024-03-19 Towards Controllable Face Generation with Semantic Latent Diffusion Models Alex Ergasti et.al. 2403.12743 link
2024-03-19 Tuning-Free Image Customization with Image and Text Guidance Pengzhi Li et.al. 2403.12658 null
2024-03-19 NSGAN: A Non-Dominant Sorting Optimisation-Based Generative Adversarial Design Framework for Alloy Discovery Zhipeng Li et.al. 2403.12495 null
2024-03-18 Urban Scene Diffusion through Semantic Occupancy Map Junge Zhang et.al. 2403.11697 null
2024-03-18 Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection Julia Wolleb et.al. 2403.11667 null
2024-03-18 LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model Yuxin Cao et.al. 2403.11656 null
2024-03-18 QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation Zhizhen Zhou et.al. 2403.11626 null
2024-03-18 CRS-Diff: Controllable Generative Remote Sensing Foundation Model Datao Tang et.al. 2403.11614 null
2024-03-18 VmambaIR: Visual State Space Model for Image Restoration Yuan Shi et.al. 2403.11423 link
2024-03-17 StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining Tushar Kataria et.al. 2403.11340 null
2024-03-17 Fast Personalized Text-to-Image Syntheses With Attention Injection Yuxuan Zhang et.al. 2403.11284 null
2024-03-17 Forging the Forger: An Attempt to Improve Authorship Verification via Data Augmentation Silvia Corbara et.al. 2403.11265 null
2024-03-17 Understanding Diffusion Models by Feynman's Path Integral Yuji Hirono et.al. 2403.11262 null
2024-03-14 SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior Huan-ang Gao et.al. 2403.09638 null
2024-03-14 Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering Zeyu Liu et.al. 2403.09622 null
2024-03-14 PrompTHis: Visualizing the Process and Influence of Prompt Editing during Text-to-Image Creation Yuhan Guo et.al. 2403.09615 null
2024-03-14 Counterfactual contrastive learning: robust representations via causal image synthesis Melanie Roschewitz et.al. 2403.09605 link
2024-03-14 Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing Wonjun Kang et.al. 2403.09468 link
2024-03-14 Mitigating attribute amplification in counterfactual image generation Tian Xia et.al. 2403.09422 null
2024-03-14 Machine Learning Processes as Sources of Ambiguity: Insights from AI Art Christian Sivertsen et.al. 2403.09374 null
2024-03-14 Mitigating Data Consistency Induced Discrepancy in Cascaded Diffusion Models for Sparse-view CT Reconstruction Hanyu Chen et.al. 2403.09355 null
2024-03-14 StainFuser: Controlling Diffusion for Faster Neural Style Transfer in Multi-Gigapixel Histology Images Robert Jewsbury et.al. 2403.09302 link
2024-03-14 Noise Dimension of GAN: An Image Compression Perspective Ziran Zhu et.al. 2403.09196 null
2024-03-13 Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models trained on Corrupted Data Asad Aali et.al. 2403.08728 link
2024-03-13 HAIFIT: Human-Centered AI for Fashion Image Translation Jianan Jiang et.al. 2403.08651 link
2024-03-13 Gaussian Splatting in Style Abhishek Saroha et.al. 2403.08498 null
2024-03-13 An Analysis of Human Alignment of Latent Diffusion Models Lorenz Linhardt et.al. 2403.08469 null
2024-03-13 Generating Synthetic Computed Tomography for Radiotherapy: SynthRAD2023 Challenge Report Evi M. C. Huijben et.al. 2403.08447 null
2024-03-13 Iterative Online Image Synthesis via Diffusion Model for Imbalanced Classification Shuhan Li et.al. 2403.08407 null
2024-03-13 StyleDyRF: Zero-shot 4D Style Transfer for Dynamic Neural Radiance Fields Hongbin Xu et.al. 2403.08310 null
2024-03-13 Attack Deterministic Conditional Image Generative Models for Diverse and Controllable Generation Tianyi Chu et.al. 2403.08294 null
2024-03-13 VIGFace: Virtual Identity Generation Model for Face Image Synthesis Minsoo Kim et.al. 2403.08277 null
2024-03-13 CoroNetGAN: Controlled Pruning of GANs via Hypernetworks Aman Kumar et.al. 2403.08261 null
2024-03-12 Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation Shihao Zhao et.al. 2403.07860 link
2024-03-12 Quantifying and Mitigating Privacy Risks for Tabular Generative Models Chaoyi Zhu et.al. 2403.07842 null
2024-03-12 StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting Kunhao Liu et.al. 2403.07807 null
2024-03-12 BraSyn 2023 challenge: Missing MRI synthesis and the effect of different learning objectives Ivo M. Baltruschat et.al. 2403.07800 null
2024-03-12 Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model Yuxuan Zhang et.al. 2403.07764 null
2024-03-12 Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings Sahand Sharifzadeh et.al. 2403.07750 null
2024-03-12 Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion Dongyang Li et.al. 2403.07721 link
2024-03-12 SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces Yuta Oshima et.al. 2403.07711 link
2024-03-12 Towards Model Extraction Attacks in GAN-Based Image Translation via Domain Shift Mitigation Di Mi et.al. 2403.07673 null
2024-03-12 Gender-ambiguous voice generation through feminine speaking style transfer in male voices Maria Koutsogiannaki et.al. 2403.07661 null
2024-03-11 BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion Xuan Ju et.al. 2403.06976 null
2024-03-11 Surface-aware Mesh Texture Synthesis with Pre-trained 2D CNNs Áron Samuel Kovács et.al. 2403.06855 null
2024-03-11 Medical Image Synthesis via Fine-Grained Image-Text Alignment and Anatomy-Pathology Prompting Wenting Chen et.al. 2403.06835 null
2024-03-11 Data-Independent Operator: A Training-Free Artifact Representation Extractor for Generalizable Deepfake Detection Chuangchuang Tan et.al. 2403.06803 link
2024-03-11 FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation Pengchong Qiao et.al. 2403.06775 link
2024-03-11 Distribution-Aware Data Expansion with Diffusion Models Haowei Zhu et.al. 2403.06741 link
2024-03-11 Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback Adarsh N L et.al. 2403.06735 null
2024-03-11 Galaxy Morphologies Revealed with Subaru HSC and Super-Resolution Techniques II: Environmental Dependence of Galaxy Mergers at z~2-5 Takatoshi Shibuya et.al. 2403.06729 null
2024-03-11 FFAD: A Novel Metric for Assessing Generated Time Series Data Utilizing Fourier Transform and Auto-encoder Yang Chen et.al. 2403.06576 null
2024-03-11 Active Generation for Image Classification Tao Huang et.al. 2403.06517 null
2024-03-08 Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola Yijiang Li et.al. 2403.05523 null
2024-03-08 A Data Augmentation Pipeline to Generate Synthetic Labeled Datasets of 3D Echocardiography Images using a GAN Cristiana Tiago et.al. 2403.05384 null
2024-03-08 Federated Learning Method for Preserving Privacy in Face Recognition System Enoch Solomon et.al. 2403.05344 null
2024-03-08 Fine-tuning a Multiple Instance Learning Feature Extractor with Masked Context Modelling and Knowledge Distillation Juan I. Pisula et.al. 2403.05325 null
2024-03-08 GAN-based Massive MIMO Channel Model Trained on Measured Data Florian Euchner et.al. 2403.05321 null
2024-03-08 An Efficient Quasi-Random Sampling for Copulas Sumin Wang et.al. 2403.05281 null
2024-03-08 Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation Junyan Wang et.al. 2403.05239 null
2024-03-08 Synthetic Privileged Information Enhances Medical Image Representation Learning Lucas Farndale et.al. 2403.05220 null
2024-03-08 Denoising Autoregressive Representation Learning Yazhe Li et.al. 2403.05196 null
2024-03-08 Robust Semantic Communications for Speech-to-Text Translation Zhenzi Weng et.al. 2403.05187 null
2024-03-07 Photonic probabilistic machine learning using quantum vacuum noise Seou Choi et.al. 2403.04731 null
2024-03-07 PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation Junsong Chen et.al. 2403.04692 null
2024-03-07 A Domain Translation Framework with an Adversarial Denoising Diffusion Model to Generate Synthetic Datasets of Echocardiography Images Cristiana Tiago et.al. 2403.04612 null
2024-03-07 Discriminative Probing and Tuning for Text-to-Image Generation Leigang Qu et.al. 2403.04321 null
2024-03-06 PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement Zhijie Wang et.al. 2403.04014 link
2024-03-06 Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer Naifu Xue et.al. 2403.03736 null
2024-03-06 Seamless Virtual Reality with Integrated Synchronizer and Synthesizer for Autonomous Driving He Li et.al. 2403.03541 null
2024-03-06 NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging Takahiro Shirakawa et.al. 2403.03485 null
2024-03-06 FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided Diffusion Hao Wang et.al. 2403.03463 null
2024-03-07 DLP-GAN: learning to draw modern Chinese landscape photos with generative adversarial network Xiangquan Gui et.al. 2403.03456 null
2024-03-06 Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing Bingyan Liu et.al. 2403.03431 null
2024-03-05 Scaling Rectified Flow Transformers for High-Resolution Image Synthesis Patrick Esser et.al. 2403.03206 null
2024-03-05 Behavior Generation with Latent Actions Seungjae Lee et.al. 2403.03181 link
2024-03-05 Doubly Abductive Counterfactual Inference for Text-based Image Editing Xue Song et.al. 2403.02981 null
2024-03-05 Bias in Generative AI Mi Zhou et.al. 2403.02726 null
2024-03-05 Time Weaver: A Conditional Time Series Generation Model Sai Shankar Narasimhan et.al. 2403.02682 null
2024-03-04 Transformer for Times Series: an Application to the S&P500 Pierre Brugiere et.al. 2403.02523 null
2024-03-04 NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function Abdullah Nazhat Abdullah et.al. 2403.02411 link
2024-03-04 ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models Jiaxiang Cheng et.al. 2403.02084 null
2024-03-05 Matrix Completion with Convex Optimization and Column Subset Selection Antonina Krajewska et.al. 2403.01919 link
2024-03-04 PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis Zhengyao Lv et.al. 2403.01852 link
2024-03-02 Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models Neta Shaul et.al. 2403.01329 null
2024-03-02 TCIG: Two-Stage Controlled Image Generation with Quality Enhancement through Diffusion Salaheldin Mohamed et.al. 2403.01212 null
2024-03-02 A Hybrid Model for Traffic Incident Detection based on Generative Adversarial Networks and Transformer Model Xinying Lu et.al. 2403.01147 null
2024-03-02 Distilling Text Style Transfer With Self-Explanation From LLMs Chiyu Zhang et.al. 2403.01106 null
2024-03-01 BasedAI: A decentralized P2P network for Zero Knowledge Large Language Models (ZK-LLMs) Sean Wellington et.al. 2403.01008 null
2024-03-01 Improving Android Malware Detection Through Data Augmentation Using Wasserstein Generative Adversarial Networks Kawana Stalin et.al. 2403.00890 null
2024-03-01 Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks Yuhao Liu et.al. 2403.00644 null
2024-03-01 Improving Explicit Spatial Relationships in Text-to-Image Generation through an Automatically Derived Dataset Ander Salaberria et.al. 2403.00587 link
2024-03-01 Rethinking cluster-conditioned diffusion models Nikolas Adaloglou et.al. 2403.00570 null
2024-03-01 VisionLLaMA: A Unified LLaMA Interface for Vision Tasks Xiangxiang Chu et.al. 2403.00522 link
2024-02-29 SeD: Semantic-Aware Discriminator for Image Super-Resolution Bingchen Li et.al. 2402.19387 null
2024-02-29 A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation Hanxi Li et.al. 2402.19330 null
2024-02-29 Memory-Augmented Generative Adversarial Transformers Stephan Raaijmakers et.al. 2402.19218 null
2024-02-29 Generative models struggle with kirigami metamaterials Gerrit Felsch et.al. 2402.19196 null
2024-02-29 Disentangling representations of retinal images with generative models Sarah Müller et.al. 2402.19186 null
2024-02-29 Trajectory Consistency Distillation Jianbin Zheng et.al. 2402.19159 link
2024-02-29 Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection Christos Koutlis et.al. 2402.19091 null
2024-02-29 WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis Paul Friedrich et.al. 2402.19043 link
2024-02-29 Lotka-Volterra Model with Mutations and Generative Adversarial Networks S. V. Kozyrev et.al. 2402.19035 null
2024-02-29 Generating, Reconstructing, and Representing Discrete and Continuous Data: Generalized Diffusion with Learnable Encoding-Decoding Guangyi Liu et.al. 2402.19009 null
2024-02-28 MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation Jiahao Huang et.al. 2402.18451 null
2024-02-28 FineDiffusion: Scaling up Diffusion Models for Fine-grained Image Generation with 10,000 Classes Ziying Pan et.al. 2402.18331 null
2024-02-28 Balancing Act: Distribution-Guided Debiasing in Diffusion Models Rishubh Parihar et.al. 2402.18206 null
2024-02-28 Misalignment-Robust Frequency Distribution Loss for Image Transformation Zhangkai Ni et.al. 2402.18192 null
2024-02-28 VulMCI: Code Splicing-based Pixel-row Oversampling for More Continuous Vulnerability Image Generation Tao Peng et.al. 2402.18189 null
2024-02-28 Block and Detail: Scaffolding Sketch-to-Image Generation Vishnu Sarukkai et.al. 2402.18116 null
2024-02-28 Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis Yanzuo Lu et.al. 2402.18078 link
2024-02-28 SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model Bin Cao et.al. 2402.18068 null
2024-02-28 Breaking the Black-Box: Confidence-Guided Model Inversion Attack for Distribution Shift Xinhao Liu et.al. 2402.18027 null
2024-02-27 CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing Chufeng Xiao et.al. 2402.17624 null

(back to top)

LLM

Publish Date Title Authors PDF Code
2024-05-21 Reducing Transformer Key-Value Cache Size with Cross-Layer Attention William Brandon et.al. 2405.12981 null
2024-05-21 Energy Rank Alignment: Using Preference Optimization to Search Chemical Space at Scale Shriram Chennakesavalu et.al. 2405.12961 null
2024-05-21 Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models Zhangyue Yin et.al. 2405.12939 null
2024-05-21 Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs Bilgehan Sel et.al. 2405.12933 null
2024-05-21 Code-mixed Sentiment and Hate-speech Prediction Anjali Yadav et.al. 2405.12929 null
2024-05-21 Streamlining Software Reviews: Efficient Predictive Modeling with Minimal Examples Tim Menzies et.al. 2405.12920 null
2024-05-21 G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation Xingyuan Pan et.al. 2405.12915 null
2024-05-21 An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation Zhiyu Tan et.al. 2405.12914 null
2024-05-21 Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment Holli Sargeant et.al. 2405.12910 link
2024-05-21 Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents San Kim et.al. 2405.12900 null
2024-05-20 Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning Guanglin Zhou et.al. 2405.12217 link
2024-05-20 MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark Hongwei Liu et.al. 2405.12209 link
2024-05-20 Developers' Perceptions on the Impact of ChatGPT in Software Development: A Survey Thiago S. Vaillant et.al. 2405.12195 null
2024-05-20 CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models Haoxiang Shi et.al. 2405.12174 null
2024-05-20 Fennec: Fine-grained Language Model Evaluation and Correction Extended through Branching and Bridging Xiaobo Liang et.al. 2405.12163 link
2024-05-20 Eliciting Problem Specifications via Large Language Models Robert E. Wray et.al. 2405.12147 null
2024-05-20 DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM Xuchen Li et.al. 2405.12139 null
2024-05-20 MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Ting Jiang et.al. 2405.12130 link
2024-05-20 Reindex-Then-Adapt: Improving Large Language Models for Conversational Recommendation Zhankui He et.al. 2405.12119 null
2024-05-20 Imp: Highly Capable Large Multimodal Models for Mobile Devices Zhenwei Shao et.al. 2405.12107 link
2024-05-17 A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers Kaiyu Huang et.al. 2405.10936 link
2024-05-17 The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks Lucius Bushnaq et.al. 2405.10928 null
2024-05-17 COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain Dimitrios P. Panagoulias et.al. 2405.10893 null
2024-05-17 Application of Artificial Intelligence in Schizophrenia Rehabilitation Management: Systematic Literature Review Hongyi Yang et.al. 2405.10883 null
2024-05-17 The Future of Large Language Model Pre-training is Federated Lorenzo Sani et.al. 2405.10853 null
2024-05-17 Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities Hao Zhou et.al. 2405.10825 null
2024-05-17 Modeling Supply Chain Interaction and Disruption: Insights from Real-world Data and Complex Adaptive System Jiawei Feng et.al. 2405.10818 null
2024-05-17 ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios Markus Bayer et.al. 2405.10808 null
2024-05-17 Empowering Small-Scale Knowledge Graphs: A Strategy of Leveraging General-Purpose Knowledge Graphs for Enriched Embeddings Albert Sawczyn et.al. 2405.10745 null
2024-05-17 Efficient Multimodal Large Language Models: A Survey Yizhang Jin et.al. 2405.10739 link
2024-05-16 UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models Sahel Sharifymoghaddam et.al. 2405.10311 null
2024-05-16 4D Panoptic Scene Graph Generation Jingkang Yang et.al. 2405.10305 link
2024-05-16 HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models Rhea Sanjay Sukthanker et.al. 2405.10299 link
2024-05-16 Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction Jianhao Chen et.al. 2405.10288 null
2024-05-16 FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models Adrian Bulat et.al. 2405.10286 null
2024-05-16 Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers Tuo Zhang et.al. 2405.10276 null
2024-05-16 Keep It Private: Unsupervised Privatization of Online Text Calvin Bao et.al. 2405.10260 link
2024-05-16 When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models Xianzheng Ma et.al. 2405.10255 null
2024-05-16 A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks Xuanfan Ni et.al. 2405.10251 null
2024-05-16 IntelliExplain: Enhancing Interactive Code Generation through Natural Language Explanations for Non-Professional Programmers Hao Yan et.al. 2405.10250 null
2024-05-15 Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming Bushi Xiao et.al. 2405.09508 null
2024-05-15 ParaNames 1.0: Creating an Entity Name Corpus for 400+ Languages using Wikidata Jonne Sälevä et.al. 2405.09496 null
2024-05-15 Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts Donya Rooein et.al. 2405.09482 null
2024-05-15 Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models Majid Zarharan et.al. 2405.09454 link
2024-05-15 Facilitating Opinion Diversity through Hybrid NLP Approaches Michiel van der Meer et.al. 2405.09439 null
2024-05-15 MicroPython Testbed for Federated Learning Algorithms Miroslav Popovic et.al. 2405.09423 null
2024-05-15 Matching domain experts by training from scratch on domain knowledge Xiaoliang Luo et.al. 2405.09395 null
2024-05-15 PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models Devansh Jain et.al. 2405.09373 null
2024-05-15 Large Language Model Bias Mitigation from the Perspective of Knowledge Editing Ruizhe Chen et.al. 2405.09341 null
2024-05-15 Prompting-based Synthetic Data Generation for Few-Shot Question Answering Maximilian Schmidt et.al. 2405.09335 null
2024-05-14 Towards Enhanced RAC Accessibility: Leveraging Datasets and LLMs Edison Jair Bejarano Sepulveda et.al. 2405.08792 null
2024-05-14 Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring Tiantian Zhang et.al. 2405.08786 null
2024-05-14 Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs Akhila Yerukola et.al. 2405.08760 link
2024-05-14 Distributed Threat Intelligence at the Edge Devices: A Large Language Model-Driven Approach Syed Mhamudul Hasan et.al. 2405.08755 null
2024-05-14 Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Zhimin Li et.al. 2405.08748 link
2024-05-14 ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation Dimitris Gkoumas et.al. 2405.08619 null
2024-05-14 A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine Hanguang Xiao et.al. 2405.08603 null
2024-05-14 EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark Xiaohui Zhang et.al. 2405.08596 null
2024-05-14 Falcon 7b for Software Mention Detection in Scholarly Documents AmeerAli Khan et.al. 2405.08514 null
2024-05-14 Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure Odysseas S. Chlapanis et.al. 2405.08502 null
2024-05-13 Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Chengyue Wu et.al. 2405.07990 null
2024-05-13 A Generalist Learner for Multifaceted Medical Image Interpretation Hong-Yu Zhou et.al. 2405.07988 null
2024-05-13 PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation Suad Alshammari et.al. 2405.07963 null
2024-05-13 AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments Samuel Schmidgall et.al. 2405.07960 null
2024-05-13 EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning Yinzhu Quan et.al. 2405.07938 null
2024-05-13 PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition Ziyang Zhang et.al. 2405.07932 link
2024-05-13 Can Better Text Semantics in Prompt Tuning Improve VLM Generalization? Hari Chandana Kuchibhotla et.al. 2405.07921 null
2024-05-13 A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking Ferdinand Schlatt et.al. 2405.07920 null
2024-05-13 Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers Alena Tsanda et.al. 2405.07886 null
2024-05-13 Reproducing the Metric-Based Evaluation of a Set of Controllable Text Generation Techniques Michela Lorandi et.al. 2405.07875 null
2024-05-10 Linearizing Large Language Models Jean Mercat et.al. 2405.06640 link
2024-05-10 Value Augmented Sampling for Language Model Alignment and Personalization Seungwook Han et.al. 2405.06639 link
2024-05-10 Federated Document Visual Question Answering: A Pilot Study Khanh Nguyen et.al. 2405.06636 null
2024-05-10 Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models Chakshu Moar et.al. 2405.06626 null
2024-05-10 What Can Natural Language Processing Do for Peer Review? Ilia Kuznetsov et.al. 2405.06563 null
2024-05-10 Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval Mengjia Niu et.al. 2405.06545 null
2024-05-10 Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts Wenyu Huang et.al. 2405.06524 null
2024-05-10 UniDM: A Unified Framework for Data Manipulation with Large Language Models Yichen Qian et.al. 2405.06510 null
2024-05-10 Aspect-based Sentiment Evaluation of Chess Moves (ASSESS): an NLP-based Method for Evaluating Chess Strategies from Textbooks Haifa Alrdahi et.al. 2405.06499 null
2024-05-10 Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling Lyumanshan Ye et.al. 2405.06495 null
2024-05-09 Natural Language Processing RELIES on Linguistics Juri Opitz et.al. 2405.05966 null
2024-05-09 OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning Dan Qiao et.al. 2405.05957 link
2024-05-09 Probing Multimodal LLMs as World Models for Driving Shiva Sreeram et.al. 2405.05956 link
2024-05-09 Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning Junzhi Chen et.al. 2405.05955 null
2024-05-09 CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts Jiachen Li et.al. 2405.05949 link
2024-05-09 Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness Siyuan Li et.al. 2405.05930 null
2024-05-09 Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? Zorik Gekhman et.al. 2405.05904 null
2024-05-09 Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes Ziang Guo et.al. 2405.05885 null
2024-05-09 FlockGPT: Guiding UAV Flocking with Linguistic Orchestration Artem Lykov et.al. 2405.05872 null
2024-05-09 Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning Artem Lykov et.al. 2405.05824 link
2024-05-08 You Only Cache Once: Decoder-Decoder Architectures for Language Models Yutao Sun et.al. 2405.05254 null
2024-05-08 Open Source Language Models Can Provide Feedback: Evaluating LLMs' Ability to Help Students Using GPT-4-As-A-Judge Charles Koutcheme et.al. 2405.05253 link
2024-05-09 LLMs with Personalities in Multi-issue Negotiation Games Sean Noh et.al. 2405.05248 null
2024-05-08 SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants Masoud Moghani et.al. 2405.05226 null
2024-05-08 Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers Jiuxiang Gu et.al. 2405.05219 null
2024-05-08 MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning Inderjeet Nair et.al. 2405.05189 null
2024-05-08 Air Gap: Protecting Privacy-Conscious Conversational Agents Eugene Bagdasaryan et.al. 2405.05175 null
2024-05-08 XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples Peiqin Lin et.al. 2405.05116 null
2024-05-08 QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs Weijia Zhang et.al. 2405.05109 null
2024-05-08 Concerns on Bias in Large Language Models when Creating Synthetic Personae Helena A. Haxvig et.al. 2405.05080 null
2024-05-07 ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning Jing Lin et.al. 2405.04533 null
2024-05-07 QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving Yujun Lin et.al. 2405.04532 link
2024-05-07 NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts Shudan Zhang et.al. 2405.04520 null
2024-05-07 xLSTM: Extended Long Short-Term Memory Maximilian Beck et.al. 2405.04517 null
2024-05-07 A Transformer with Stack Attention Jiaoda Li et.al. 2405.04515 link
2024-05-08 Unveiling Disparities in Web Task Handling Between Human and Web Agent Kihoon Son et.al. 2405.04497 null
2024-05-07 Toward In-Context Teaching: Adapting Examples to Students' Misconceptions Alexis Ross et.al. 2405.04495 null
2024-05-07 The Silicone Ceiling: Auditing GPT's Race and Gender Biases in Hiring Lena Armstrong et.al. 2405.04412 null
2024-05-07 Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks Georgios Pantazopoulos et.al. 2405.04403 link
2024-05-07 Large Language Models Cannot Explain Themselves Advait Sarkar et.al. 2405.04382 null
2024-05-06 Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs Muhammad Uzair Khattak et.al. 2405.03690 null
2024-05-06 Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames Keith Burghardt et.al. 2405.03688 null
2024-05-06 Language-Image Models with 3D Understanding Jang Hyun Cho et.al. 2405.03685 null
2024-05-06 AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design Kamal Choudhary et.al. 2405.03680 null
2024-05-06 A New Robust Partial $p$-Wasserstein-Based Metric for Comparing Distributions Sharath Raghvendra et.al. 2405.03664 null
2024-05-06 When LLMs Meet Cybersecurity: A Systematic Literature Review Jie Zhang et.al. 2405.03644 null
2024-05-06 A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama Vlad-Andrei Cursaru et.al. 2405.03616 null
2024-05-06 Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment Abhinav Agarwalla et.al. 2405.03594 null
2024-05-06 AlphaMath Almost Zero: process Supervision without process Guoxin Chen et.al. 2405.03553 null
2024-05-06 MAmmoTH2: Scaling Instructions from the Web Xiang Yue et.al. 2405.03548 null
2024-05-03 Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows Jasmine Y. Shih et.al. 2405.02260 null
2024-05-03 What matters when building vision-language models? Hugo Laurençon et.al. 2405.02246 null
2024-05-03 REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs Deepa Tilwani et.al. 2405.02228 null
2024-05-03 Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks Lujing Zhang et.al. 2405.02225 null
2024-05-03 FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems Yashar Deldjoo et.al. 2405.02219 null
2024-05-03 Automatic Programming: Large Language Models and Beyond Michael R. Lyu et.al. 2405.02213 null
2024-05-03 Assessing and Verifying Task Utility in LLM-Powered Applications Negar Arabzadeh et.al. 2405.02178 null
2024-05-03 The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates Giuseppe Russo Latona et.al. 2405.02150 null
2024-05-03 MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain Chao Jiang et.al. 2405.02144 null
2024-05-03 Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection Guillem Ramírez et.al. 2405.02134 null
2024-05-02 Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks Murtaza Dalal et.al. 2405.01534 null
2024-05-02 OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning Shihao Wang et.al. 2405.01533 null
2024-05-02 FLAME: Factuality-Aware Alignment for Large Language Models Sheng-Chieh Lin et.al. 2405.01525 null
2024-05-02 Transformer-Aided Semantic Communications Matin Mortaheb et.al. 2405.01521 null
2024-05-02 Analyzing the Role of Semantic Representations in the Era of Large Language Models Zhijing Jin et.al. 2405.01502 link
2024-05-02 Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models Raymond Fok et.al. 2405.01501 null
2024-05-02 Controllable Text Generation in the Instruction-Tuning Era Dhananjay Ashok et.al. 2405.01490 null
2024-05-02 NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Gerald Shen et.al. 2405.01481 link
2024-05-02 V-FLUTE: Visual Figurative Language Understanding with Textual Explanations Arkadiy Saakyan et.al. 2405.01474 null
2024-05-02 Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning Théo Moutakanni et.al. 2405.01469 null
2024-05-01 Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3 Junsang Yoon et.al. 2405.00664 null
2024-05-01 HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models Ningke Li et.al. 2405.00648 null
2024-05-01 When Quantization Affects Confidence of Large Language Models? Irina Proskurina et.al. 2405.00632 null
2024-05-01 "I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust Sunnie S. Y. Kim et.al. 2405.00623 null
2024-05-01 Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling Yida Mu et.al. 2405.00611 null
2024-05-01 Investigating Automatic Scoring and Feedback using Large Language Models Gloria Ashiya Katuka et.al. 2405.00602 null
2024-05-01 Are Models Biased on Text without Gender-related Language? Catarina G Belém et.al. 2405.00588 link
2024-05-01 The Real, the Better: Aligning Large Language Models with Online Human Behaviors Guanying Jiang et.al. 2405.00578 null
2024-05-01 EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model Deng Li et.al. 2405.00574 null
2024-05-01 Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval Young Kyun Jang et.al. 2405.00571 null
2024-04-30 DOCCI: Descriptions of Connected and Contrasting Images Yasumasa Onoe et.al. 2404.19753 null
2024-04-30 Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Yunhao Ge et.al. 2404.19752 null
2024-04-30 PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification Leon Garza et.al. 2404.19744 null
2024-04-30 Better & Faster Large Language Models via Multi-token Prediction Fabian Gloeckle et.al. 2404.19737 null
2024-04-30 A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications Steph Buongiorno et.al. 2404.19729 null
2024-04-30 PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games Steph Buongiorno et.al. 2404.19721 null
2024-04-30 Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns Constantinos Patsakis et.al. 2404.19715 null
2024-04-30 Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models Scott Sumpter et.al. 2404.19713 null
2024-04-30 When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively Tiziano Labruna et.al. 2404.19705 null
2024-04-30 Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners Chun Feng et.al. 2404.19696 null
2024-04-29 Hallucination of Multimodal Large Language Models: A Survey Zechen Bai et.al. 2404.18930 link
2024-04-29 DPO Meets PPO: Reinforced Token Optimization for RLHF Han Zhong et.al. 2404.18922 null
2024-04-29 TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation Junhao Cheng et.al. 2404.18919 null
2024-04-29 Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting Fangcheng Liu et.al. 2404.18911 null
2024-04-29 Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking Hong Jin Kang et.al. 2404.18881 link
2024-04-29 More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness Aaron J. Li et.al. 2404.18870 link
2024-04-29 Truth-value judgment in language models: belief directions are context sensitive Stefan F. Schouten et.al. 2404.18865 null
2024-04-29 Performance-Aligned LLMs for Generating Fast Code Daniel Nichols et.al. 2404.18864 null
2024-04-29 VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning Aidan Z. H. Yang et.al. 2404.18852 null
2024-04-29 It's Difficult to be Neutral -- Human and LLM-based Sentiment Annotation of Patient Comments Petter Mæhlum et.al. 2404.18832 null
2024-04-26 Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo Stephen Zhao et.al. 2404.17546 null
2024-04-26 Large Language Model Agent as a Mechanical Designer Yayati Jadhav et.al. 2404.17525 null
2024-04-26 On the Use of Large Language Models to Generate Capability Ontologies Luis Miguel Vieira da Silva et.al. 2404.17524 null
2024-04-26 Enhancing Legal Compliance and Regulation Analysis with Large Language Models Shabnam Hassani et.al. 2404.17522 null
2024-04-26 A Comprehensive Evaluation on Event Reasoning of Large Language Models Zhengwei Tao et.al. 2404.17513 link
2024-04-26 Learning text-to-video retrieval from image captioning Lucas Ventura et.al. 2404.17498 null
2024-04-26 CEval: A Benchmark for Evaluating Counterfactual Text Generation Van Bach Nguyen et.al. 2404.17475 null
2024-04-26 Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System Robin Schmucker et.al. 2404.17460 null
2024-04-26 "ChatGPT Is Here to Help, Not to Replace Anybody" -- An Evaluation of Students' Opinions On Integrating ChatGPT In CS Courses Bruno Pereira Cipriano et.al. 2404.17443 null
2024-04-26 InspectorRAGet: An Introspection Platform for RAG Evaluation Kshitij Fadnis et.al. 2404.17347 null
2024-04-25 Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials Ye Fang et.al. 2404.16829 null
2024-04-25 How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Zhe Chen et.al. 2404.16821 link
2024-04-25 IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages Harman Singh et.al. 2404.16816 null
2024-04-25 Make Your LLM Fully Utilize the Context Shengnan An et.al. 2404.16811 link
2024-04-25 Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning Tianhui Zhang et.al. 2404.16807 null
2024-04-25 Weak-to-Strong Extrapolation Expedites Alignment Chujie Zheng et.al. 2404.16792 link
2024-04-25 SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension Bohao Li et.al. 2404.16790 link
2024-04-25 Continual Learning of Large Language Models: A Comprehensive Survey Haizhou Shi et.al. 2404.16789 link
2024-04-25 Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model Runzhe Zhan et.al. 2404.16766 null
2024-04-25 RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis Xiaoman Zhang et.al. 2404.16754 null
2024-04-24 Hybrid LLM/Rule-based Approaches to Business Insights Generation from Structured Data Aliaksei Vertsel et.al. 2404.15604 null
2024-04-24 ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction Henry Peng Zou et.al. 2404.15592 link
2024-04-24 Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Hossein Salami et.al. 2404.15578 null
2024-04-23 PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models Shashi Kant Gupta et.al. 2404.15549 null
2024-04-23 Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models Mihir Parmar et.al. 2404.15522 link
2024-04-23 Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval Young Kyun Jang et.al. 2404.15516 null
2024-04-23 ToM-LM: Delegating Theory Of Mind Reasoning to External Symbolic Executors in Large Language Models Weizhi Tang et.al. 2404.15515 null
2024-04-23 GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots Simranjit Singh et.al. 2404.15500 null
2024-04-23 IryoNLP at MEDIQA-CORR 2024: Tackling the Medical Error Detection & Correction Task On the Shoulders of Medical Agents Jean-Philippe Corbeil et.al. 2404.15488 link
2024-04-23 Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance Het Patel et.al. 2404.15485 null
2024-04-23 Aligning LLM Agents by Learning Latent Preference from User Edits Ge Gao et.al. 2404.15269 null
2024-04-23 XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts Yifeng Ding et.al. 2404.15247 link
2024-04-23 Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models Aidan Z. H. Yang et.al. 2404.15236 null
2024-04-23 Re-Thinking Inverse Graphics With Large Language Models Peter Kulits et.al. 2404.15228 null
2024-04-23 Setting up the Data Printer with Improved English to Ukrainian Machine Translation Yurii Paniv et.al. 2404.15196 null
2024-04-23 Regressive Side Effects of Training Language Models to Mimic Student Misconceptions Shashank Sonkar et.al. 2404.15156 null
2024-04-23 Bias patterns in the application of LLMs for clinical decision support: A comprehensive study Raphael Poulain et.al. 2404.15149 null
2024-04-23 Rethinking LLM Memorization through the Lens of Adversarial Compression Avi Schwarzschild et.al. 2404.15146 null
2024-04-23 MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning Sunan He et.al. 2404.15127 null
2024-04-23 Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation Xun Wu et.al. 2404.15100 null
2024-04-22 AutoAD III: The Prequel -- Back to the Pixels Tengda Han et.al. 2404.14412 null
2024-04-22 SpaceByte: Towards Deleting Tokenization from Large Language Modeling Kevin Slagle et.al. 2404.14408 link
2024-04-22 RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios? Adrian de Wynter et.al. 2404.14397 null
2024-04-22 A Survey on Self-Evolution of Large Language Models Zhengwei Tao et.al. 2404.14387 null
2024-04-22 Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph Xiaochen Kev Gao et.al. 2404.14372 link
2024-04-22 Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data Fahim Tajwar et.al. 2404.14367 link
2024-04-22 Better Synthetic Data by Retrieving and Transforming Existing Datasets Saumya Gandhi et.al. 2404.14361 link
2024-04-22 Rethinking Legal Compliance Automation: Opportunities with Large Language Models Shabnam Hassani et.al. 2404.14356 null
2024-04-22 Automated Long Answer Grading with RiceChem Dataset Shashank Sonkar et.al. 2404.14316 null
2024-04-22 Explaining Arguments' Strength: Unveiling the Role of Attacks and Supports (Technical Report) Xiang Yin et.al. 2404.14304 null
2024-04-19 MoVA: Adapting Mixture of Vision Experts to Multimodal Context Zhuofan Zong et.al. 2404.13046 link
2024-04-19 Unified Scene Representation and Reconstruction for 3D Large Language Models Tao Chu et.al. 2404.13044 null
2024-04-19 Data Alignment for Zero-Shot Concept Generation in Dermatology AI Soham Gadgil et.al. 2404.13043 null
2024-04-19 LaPA: Latent Prompt Assist Model For Medical Visual Question Answering Tiancheng Gu et.al. 2404.13039 link
2024-04-19 Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs Biyang Guo et.al. 2404.13033 link
2024-04-19 When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering Stephen Choi et.al. 2404.13028 null
2024-04-19 Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Chuofan Ma et.al. 2404.13013 null
2024-04-19 Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs Clemencia Siro et.al. 2404.12994 link
2024-04-19 RedactBuster: Entity Type Recognition from Redacted Documents Mirco Beltrame et.al. 2404.12991 null
2024-04-19 FineRec: Exploring Fine-grained Sequential Recommendation Xiaokun Zhang et.al. 2404.12975 null
2024-04-18 BLINK: Multimodal Large Language Models Can See but Not Perceive Xingyu Fu et.al. 2404.12390 null
2024-04-18 MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale Xiaotang Gai et.al. 2404.12372 null
2024-04-18 When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes Asaf Yehudai et.al. 2404.12365 null
2024-04-18 Towards a Foundation Model for Partial Differential Equation: Multi-Operator Learning and Extrapolation Jingmin Sun et.al. 2404.12355 link
2024-04-18 V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning Hang Hua et.al. 2404.12353 null
2024-04-18 Large Language Models in Targeted Sentiment Analysis Nicolay Rusnachenko et.al. 2404.12342 link
2024-04-18 Normative Requirements Operationalization with Large Language Models Nick Feng et.al. 2404.12335 null
2024-04-18 Large Language Models for Synthetic Participatory Planning of Shared Automated Electric Mobility Systems Jiangbo Yu et.al. 2404.12317 null
2024-04-18 Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair Yusuke Sakai et.al. 2404.12299 null
2024-04-18 Augmenting emotion features in irony detection with Large language modeling Yucheng Lin et.al. 2404.12291 null
2024-04-17 A Deep Dive into Large Language Models for Automated Bug Localization and Repair Soneya Binta Hossain et.al. 2404.11595 null
2024-04-17 Related Work and Citation Text Generation: A Survey Xiangci Li et.al. 2404.11588 null
2024-04-17 LLMTune: Accelerate Database Knob Tuning with Large Language Models Xinmei Huang et.al. 2404.11581 null
2024-04-17 MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation Kuan-Chieh et.al. 2404.11565 null
2024-04-17 Quantifying Multilingual Performance of Large Language Models Across Languages Zihao Li et.al. 2404.11553 null
2024-04-17 Evaluating Span Extraction in Generative Paradigm: A Reflection on Aspect-Based Sentiment Analysis Soyoung Yang et.al. 2404.11539 null
2024-04-17 Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization Costas Mavromatis et.al. 2404.11531 null
2024-04-17 Embedding Privacy in Computational Social Science and Artificial Intelligence Research Keenan Jones et.al. 2404.11515 null
2024-04-17 Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models Yushuo Chen et.al. 2404.11502 link
2024-04-17 Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models Yue Zhou et.al. 2404.11500 link
2024-04-16 Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback Qiwei Di et.al. 2404.10776 null
2024-04-16 LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? Yuchi Wang et.al. 2404.10763 link
2024-04-16 Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification Yu-Yang Li et.al. 2404.10757 null
2024-04-16 Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study Shusheng Xu et.al. 2404.10719 null
2024-04-16 An empirical study on code review activity prediction in practice Doriane Olewicki et.al. 2404.10703 null
2024-04-16 Automating REST API Postman Test Cases Using LLM S Deepika Sri et.al. 2404.10678 null
2024-04-16 ViTextVQA: A Large-Scale Visual Question Answering Dataset for Evaluating Vietnamese Text Comprehension in Images Quan Van Nguyen et.al. 2404.10652 link
2024-04-16 Self-playing Adversarial Language Game Enhances LLM Reasoning Pengyu Cheng et.al. 2404.10642 link
2024-04-16 HLAT: High-quality Large Language Model Pre-trained on AWS Trainium Haozheng Fan et.al. 2404.10630 null
2024-04-16 Private Attribute Inference from Images with Vision-Language Models Batuhan Tömekçe et.al. 2404.10618 null
2024-04-15 Personalized Collaborative Fine-Tuning for On-Device Large Language Models Nicolas Wagner et.al. 2404.09753 null
2024-04-15 Quantization of Large Language Models with an Overdetermined Basis Daniil Merkulov et.al. 2404.09737 null
2024-04-15 Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model Hyunsoo Cho et.al. 2404.09717 null
2024-04-15 Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction David Sobrín-Hidalgo et.al. 2404.09705 null
2024-04-15 Generative AI for Game Theory-based Mobile Networking Long He et.al. 2404.09699 null
2024-04-15 Are Large Language Models Reliable Argument Quality Annotators? Nailia Mirzakhmedova et.al. 2404.09696 null
2024-04-15 LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models Guangyan Li et.al. 2404.09695 null
2024-04-15 Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation Juhwan Choi et.al. 2404.09682 null
2024-04-15 Do LLMs Understand Visual Anomalies? Uncovering LLM Capabilities in Zero-shot Anomaly Detection Jiaqi Zhu et.al. 2404.09654 null
2024-04-15 Bridging Vision and Language Spaces with Assignment Prediction Jungin Park et.al. 2404.09632 link
2024-04-12 Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts Övgü Özdemir et.al. 2404.08589 link
2024-04-12 Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation Hanlin Tian et.al. 2404.08570 null
2024-04-12 RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs Shreyas Chaudhari et.al. 2404.08555 null
2024-04-12 Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward Xuan Xie et.al. 2404.08517 null
2024-04-12 Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction Haoran Qiu et.al. 2404.08509 link
2024-04-12 LaSagnA: Language-based Segmentation Assistant for Complex Queries Cong Wei et.al. 2404.08506 link
2024-04-12 Strategic Interactions between Large Language Models-based Agents in Beauty Contests Siting Lu et.al. 2404.08492 null
2024-04-12 Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian Stefano De Paoli et.al. 2404.08488 null
2024-04-12 Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task Hassan Ali et.al. 2404.08424 null
2024-04-12 AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees William Fleshman et.al. 2404.08417 null
2024-04-11 OpenBias: Open-set Bias Detection in Text-to-Image Generative Models Moreno D'Incà et.al. 2404.07990 null
2024-04-11 View Selection for 3D Captioning via Diffusion Ranking Tiange Luo et.al. 2404.07984 null
2024-04-11 Manipulating Large Language Models to Increase Product Visibility Aounon Kumar et.al. 2404.07981 link
2024-04-11 LLoCO: Learning Long Contexts Offline Sijun Tan et.al. 2404.07979 link
2024-04-11 Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models Haotian Zhang et.al. 2404.07973 null
2024-04-11 Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation Jinkyung Park et.al. 2404.07926 null
2024-04-11 LaVy: Vietnamese Multimodal Large Language Model Chi Tran et.al. 2404.07922 null
2024-04-11 AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs Zeyi Liao et.al. 2404.07921 link
2024-04-11 DesignQA: A Multimodal Benchmark for Evaluating Large Language Models' Understanding of Engineering Documentation Anna C. Doris et.al. 2404.07917 link
2024-04-11 High-Dimension Human Value Representation in Large Language Models Samuel Cahyawijaya et.al. 2404.07900 null
2024-04-10 UMBRAE: Unified Multimodal Decoding of Brain Signals Weihao Xia et.al. 2404.07202 null
2024-04-10 Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Tsendsuren Munkhdalai et.al. 2404.07143 null
2024-04-11 Semantically-correlated memories in a dense associative model Thomas F Burns et.al. 2404.07123 null
2024-04-10 Continuous Language Model Interpolation for Dynamic and Controllable Text Generation Sara Kangaslahti et.al. 2404.07117 null
2024-04-11 From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications Yongqiang Ma et.al. 2404.07108 null
2024-04-10 Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs Bowen Jin et.al. 2404.07103 null
2024-04-10 Dynamic Generation of Personalities with Large Language Models Jianzhi Liu et.al. 2404.07084 null
2024-04-10 VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning Alexandros Xenos et.al. 2404.07078 link
2024-04-10 Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? Mingyu Jin et.al. 2404.07066 link
2024-04-10 Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study Alessandro Stolfo et.al. 2404.07060 null
2024-04-09 Pitfalls of Conversational LLMs on News Debiasing Ipek Baris Schlicht et.al. 2404.06488 null
2024-04-09 Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks Chonghua Wang et.al. 2404.06480 link
2024-04-09 Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models Zihan Fang et.al. 2404.06448 null
2024-04-09 Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems Kunal Garg et.al. 2404.06413 null
2024-04-09 AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents Luca Gioacchini et.al. 2404.06411 link
2024-04-09 Take a Look at it! Rethinking How to Evaluate Language Model Jailbreak Hongyu Cai et.al. 2404.06407 link
2024-04-09 Apprentices to Research Assistants: Advancing Research with Large Language Models M. Namvarpour et.al. 2404.06404 null
2024-04-09 MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies Shengding Hu et.al. 2404.06395 link
2024-04-09 MuPT: A Generative Symbolic Music Pretrained Transformer Xingwei Qu et.al. 2404.06393 null
2024-04-09 Latent Distance Guided Alignment Training for Large Language Models Haotian Luo et.al. 2404.06390 null
2024-04-08 MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding Bo He et.al. 2404.05726 null
2024-04-08 Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Keen You et.al. 2404.05719 null
2024-04-08 Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Ahmad Idrissi-Yaghir et.al. 2404.05694 null
2024-04-08 Evaluating Mathematical Reasoning Beyond Accuracy Shijie Xia et.al. 2404.05692 link
2024-04-08 Retrieval-Augmented Open-Vocabulary Object Detection Jooyeon Kim et.al. 2404.05687 link
2024-04-08 MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation Kunpeng Song et.al. 2404.05674 null
2024-04-08 CoReS: Orchestrating the Dance of Reasoning and Segmentation Xiaoyi Bao et.al. 2404.05673 null
2024-04-08 Fighting crime with Transformers: Empirical analysis of address parsing methods in payment data Haitham Hammami et.al. 2404.05632 link
2024-04-08 LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking Faren Yan et.al. 2404.05624 null
2024-04-08 MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering Iñigo Alonso et.al. 2404.05590 null
2024-04-05 Physical Property Understanding from Language-Embedded Feature Fields Albert J. Zhai et.al. 2404.04242 null
2024-04-05 Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents Harsh Kohli et.al. 2404.04237 null
2024-04-05 Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation Tianqi Zhong et.al. 2404.04232 link
2024-04-05 Social Skill Training with Large Language Models Diyi Yang et.al. 2404.04204 null
2024-04-05 Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model Xinrun Du et.al. 2404.04167 null
2024-04-05 Large language models as oracles for instantiating ontologies with domain-specific knowledge Giovanni Ciatto et.al. 2404.04108 link
2024-04-05 Improving Factual Accuracy of Neural Table-to-Text Output by Addressing Input Problems in ToTTo Barkavi Sundararajan et.al. 2404.04103 link
2024-04-05 Robust Preference Optimization with Provable Noise Tolerance for LLMs Xize Liang et.al. 2404.04102 null
2024-04-05 Assessing the quality of information extraction Filip Seitl et.al. 2404.04068 null
2024-04-05 CLUE: A Clinical Language Understanding Evaluation for LLMs Amin Dada et.al. 2404.04067 null
2024-04-04 CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching Dongzhi Jiang et.al. 2404.03653 link
2024-04-04 AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Hanyu Lai et.al. 2404.03648 link
2024-04-04 Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra Darioush Kevian et.al. 2404.03647 null
2024-04-04 Training LLMs over Neurally Compressed Text Brian Lester et.al. 2404.03626 null
2024-04-04 Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph Marco Bronzini et.al. 2404.03623 null
2024-04-04 Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models Wenshan Wu et.al. 2404.03622 null
2024-04-04 DeViDe: Faceted medical knowledge for improved medical vision-language pre-training Haozhe Luo et.al. 2404.03618 null
2024-04-04 Sailor: Open Language Models for South-East Asia Longxu Dou et.al. 2404.03608 link
2024-04-04 Evaluating LLMs at Detecting Errors in LLM Responses Ryo Kamoi et.al. 2404.03602 link
2024-04-04 Intent Detection and Entity Extraction from BioMedical Literature Ankan Mullick et.al. 2404.03598 link
2024-04-03 ALOHa: A New Measure for Hallucination in Captioning Models Suzanne Petryk et.al. 2404.02904 null
2024-04-03 MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment Duygu Ceylan et.al. 2404.02899 null
2024-04-03 ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline Yifan Xu et.al. 2404.02893 null
2024-04-03 Integrating Explanations in Learning LTL Specifications from Demonstrations Ashutosh Gupta et.al. 2404.02872 null
2024-04-03 Toward Inference-optimal Mixture-of-Expert Large Language Models Longfei Yun et.al. 2404.02852 null
2024-04-03 I-Design: Personalized LLM Interior Designer Ata Çelen et.al. 2404.02838 null
2024-04-03 Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models Wanyun Cui et.al. 2404.02837 null
2024-04-03 Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison Maxime Bouthors et.al. 2404.02835 null
2024-04-03 Empowering Biomedical Discovery with AI Agents Shanghua Gao et.al. 2404.02831 null
2024-04-03 BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models Qijun Luo et.al. 2404.02827 link
2024-04-02 Topic-based Watermarks for LLM-Generated Text Alexander Nemecek et.al. 2404.02138 null
2024-04-02 Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models Wanyong Feng et.al. 2404.02124 null
2024-04-02 GINopic: Topic Modeling with Graph Isomorphism Network Suman Adhya et.al. 2404.02115 link
2024-04-02 CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems Sara Rosenthal et.al. 2404.02103 link
2024-04-02 Advancing LLM Reasoning Generalists with Preference Trees Lifan Yuan et.al. 2404.02078 link
2024-04-02 Digital Forgetting in Large Language Models: A Survey of Unlearning Methods Alberto Blanco-Justicia et.al. 2404.02062 null
2024-04-02 Long-context LLMs Struggle with Long In-context Learning Tianle Li et.al. 2404.02060 link
2024-04-02 Deconstructing In-Context Learning: Understanding Prompts via Corruption Namrata Shivagunde et.al. 2404.02054 link
2024-04-02 BERTopic-Driven Stock Market Predictions: Unraveling Sentiment Insights Enmin Zhu et.al. 2404.02053 null
2024-04-02 A Survey on Large Language Model-Based Game Agents Sihao Hu et.al. 2404.02039 link
2024-03-29 Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models Atsuyuki Miyai et.al. 2403.20331 link
2024-03-29 Gecko: Versatile Text Embeddings Distilled from Large Language Models Jinhyuk Lee et.al. 2403.20327 null
2024-03-29 Convolutional Prompting meets Language Models for Continual Learning Anurag Roy et.al. 2403.20317 null
2024-03-29 Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference Jovan Stojkovic et.al. 2403.20306 null
2024-03-29 Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain Burcu Sayin et.al. 2403.20288 null
2024-03-29 LUQ: Long-text Uncertainty Quantification for LLMs Caiqi Zhang et.al. 2403.20279 null
2024-04-01 Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want Weifeng Lin et.al. 2403.20271 link
2024-03-29 Latxa: An Open Language Model and Evaluation Suite for Basque Julen Etxaniz et.al. 2403.20266 link
2024-03-29 ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models Thibaut Thonet et.al. 2403.20262 null
2024-03-29 Using LLMs to Model the Beliefs and Preferences of Targeted Populations Keiichi Namikoshi et.al. 2403.20252 null
2024-03-28 InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction Sirui Xu et.al. 2403.19652 null
2024-03-28 MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions Kai Zhang et.al. 2403.19651 null
2024-03-28 Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning Chenyang Liu et.al. 2403.19646 link
2024-03-28 Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models Yucheng Shi et.al. 2403.19631 null
2024-03-28 Semantic Map-based Generation of Navigation Instructions Chengzu Li et.al. 2403.19603 link
2024-03-28 LocCa: Visual Pretraining with Location-aware Captioners Bo Wan et.al. 2403.19596 null
2024-03-28 Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation Zhongliang Zhou et.al. 2403.19584 null
2024-03-28 WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models Piotr Molenda et.al. 2403.19548 null
2024-03-28 LLMs as Academic Reading Companions: Extending HCI Through Synthetic Personae Celia Chen et.al. 2403.19506 null
2024-03-28 Evolving Assembly Code in an Adversarial Environment Irina Maliukov et.al. 2403.19489 null
2024-03-27 Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Yanwei Li et.al. 2403.18814 link
2024-03-27 ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation Suraj Patni et.al. 2403.18807 link
2024-03-27 Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation Mateusz Klimaszewski et.al. 2403.18804 null
2024-03-27 Long-form factuality in large language models Jerry Wei et.al. 2403.18802 link
2024-03-27 3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation Ehsan Latif et.al. 2403.18778 null
2024-03-27 CheckEval: Robust Evaluation Framework using Large Language Model via Checklist Yukyung Lee et.al. 2403.18771 null
2024-03-27 MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model Yike Wu et.al. 2403.18760 null
2024-03-27 Understanding the Learning Dynamics of Alignment with Human Feedback Shawn Im et.al. 2403.18742 null
2024-03-27 PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Ehsan Latif et.al. 2403.18721 null
2024-03-27 NL-ITI: Optimizing Probing and Intervention for Improvement of ITI Method Jakub Hoscilowicz et.al. 2403.18680 link
2024-03-26 MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution Wei Tao et.al. 2403.17927 null
2024-03-26 LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning Rui Pan et.al. 2403.17919 null
2024-03-26 Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach Andrea Ferrario et.al. 2403.17873 null
2024-03-26 Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications Philip Lippmann et.al. 2403.17860 null
2024-03-26 ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages Bhawna Piryani et.al. 2403.17859 link
2024-03-26 Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs David R. Mortensen et.al. 2403.17856 null
2024-03-26 ArabicaQA: A Comprehensive Dataset for Arabic Question Answering Abdelrahman Abdallah et.al. 2403.17848 link
2024-03-26 Assessment of Multimodal Large Language Models in Alignment with Human Values Zhelun Shi et.al. 2403.17830 null
2024-03-26 Accelerating Radio Spectrum Regulation Workflows with Large Language Models (LLMs) Amir Ghasemi et.al. 2403.17819 null
2024-03-26 Are Compressed Language Models Less Subgroup Robust? Leonidas Gee et.al. 2403.17811 link
2024-03-25 Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making Shuai Ma et.al. 2403.16812 null
2024-03-25 An LLM-Based Digital Twin for Optimizing Human-in-the Loop Systems Hanqing Yang et.al. 2403.16809 null
2024-03-25 Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback Zhangqian Bi et.al. 2403.16792 null
2024-03-25 All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification Deepak Narayan Gadde et.al. 2403.16750 null
2024-03-25 Synapse: Learning Preferential Concepts from Visual Demonstrations Sadanand Modak et.al. 2403.16689 null
2024-03-25 Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography Jiayue Zhang et.al. 2403.16687 null
2024-03-25 ToXCL: A Unified Framework for Toxic Speech Detection and Explanation Nhat M. Hoang et.al. 2403.16685 link
2024-03-25 RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict Yirong Zeng et.al. 2403.16662 link
2024-03-25 Grammatical vs Spelling Error Correction: An Investigation into the Responsiveness of Transformer-based Language Models using BART and MarianMT Rohit Raju et.al. 2403.16655 null
2024-03-25 CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment Feiteng Fang et.al. 2403.16649 null
2024-03-25 Virtual Co-Pilot: Multimodal Large Language Model-enabled Quick-access Procedures for Single Pilot Operations Fan Li et.al. 2403.16645 null
2024-03-25 Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units Biswesh Mohapatra et.al. 2403.16609 null
2024-03-25 TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques Ashok Urlana et.al. 2403.16592 null
2024-03-25 Can Large Language Models (or Humans) Distill Text? Nicolas Audinet de Pieuchon et.al. 2403.16584 null
2024-03-22 LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models Yuzhang Shang et.al. 2403.15388 null
2024-03-22 Long-CLIP: Unlocking the Long-Text Capability of CLIP Beichen Zhang et.al. 2403.15378 null
2024-03-22 Can large language models explore in-context? Akshay Krishnamurthy et.al. 2403.15371 null
2024-03-22 CoLLEGe: Concept Embedding Generation for Large Language Models Ryan Teehan et.al. 2403.15362 null
2024-03-22 Multi-Review Fusion-in-Context Aviv Slobodkin et.al. 2403.15351 null
2024-03-22 CO-Fun: A German Dataset on Company Outsourcing in Fund Prospectuses for Named Entity Recognition and Relation Extraction Neda Foroutan et.al. 2403.15322 null
2024-03-22 Sphere Neural-Networks for Rational Reasoning Tiansi Dong et.al. 2403.15297 null
2024-03-22 Measuring Gender and Racial Biases in Large Language Models Jiafu An et.al. 2403.15281 null
2024-03-22 Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review Jinge Wang et.al. 2403.15274 null
2024-03-22 Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs Xiaobin Zhang et.al. 2403.15273 null
2024-03-21 MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Renrui Zhang et.al. 2403.14624 null
2024-03-21 Language Repository for Long Video Understanding Kumara Kahatapitiya et.al. 2403.14622 link
2024-03-21 Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey Zeyu Han et.al. 2403.14608 null
2024-03-21 MyVLM: Personalizing VLMs for User-Specific Queries Yuval Alaluf et.al. 2403.14599 null
2024-03-21 Large Language Models for Multi-Choice Question Classification of Medical Subjects Víctor Ponce-López et.al. 2403.14582 null
2024-03-21 RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain William James Bolton et.al. 2403.14578 link
2024-03-21 A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science Clayton Cohn et.al. 2403.14565 null
2024-03-21 EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling Shimao Zhang et.al. 2403.14541 null
2024-03-21 Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference Han Zhao et.al. 2403.14520 null
2024-03-21 The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs) Joschka Haltaufderheide et.al. 2403.14473 null
2024-03-20 RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition Ziyu Liu et.al. 2403.13805 null
2024-03-20 Learning from Models and Data for Visual Grounding Ruozhen He et.al. 2403.13804 null
2024-03-20 Reverse Training to Nurse the Reversal Curse Olga Golovneva et.al. 2403.13799 null
2024-03-20 Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts Guangzeng Han et.al. 2403.13786 null
2024-03-20 Leveraging High-Resolution Features for Improved Deep Hashing-based Image Retrieval Aymene Berriche et.al. 2403.13747 null
2024-03-20 EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation Atnafu Lambebo Tonja et.al. 2403.13737 null
2024-03-20 Large Language Models meet Network Slicing Management and Orchestration Abdulhalim Dandoush et.al. 2403.13721 null
2024-03-20 RoleInteract: Evaluating the Social Interaction of Role-Playing Agents Hongzhan Chen et.al. 2403.13679 null
2024-03-20 Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese Meet Doshi et.al. 2403.13638 null
2024-03-20 VL-Mamba: Exploring State Space Models for Multimodal Learning Yanyuan Qiao et.al. 2403.13600 null
2024-03-19 Dated Data: Tracing Knowledge Cutoffs in Large Language Models Jeffrey Cheng et.al. 2403.12958 null
2024-03-19 Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models Joana Ribeiro de Faria et.al. 2403.12936 null
2024-03-19 Rapid AIdeation: Generating Ideas With the Self and in Collaboration With Large Language Models Gionnieve Lim et.al. 2403.12928 null
2024-03-19 Supporting Energy Policy Research with Large Language Models Grant Buster et.al. 2403.12924 null
2024-03-19 Semantic Layering in Room Segmentation via LLMs Taehyeon Kim et.al. 2403.12920 null
2024-03-19 Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference Baolin Li et.al. 2403.12900 null
2024-03-19 mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding Anwen Hu et.al. 2403.12895 link
2024-03-19 MEDBind: Unifying Language and Multimodal Medical Data Embeddings Yuan Gao et.al. 2403.12894 null
2024-03-19 HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning Fucai Ke et.al. 2403.12884 null
2024-03-19 Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models Zehui Chen et.al. 2403.12881 link
2024-03-18 HDLdebugger: Streamlining HDL debugging with Large Language Models Xufeng Yao et.al. 2403.11671 null
2024-03-18 Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model Haoyun Xu et.al. 2403.11621 null
2024-03-18 Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines Ekaterina Trofimova et.al. 2403.11585 null
2024-03-18 Reinforcement Learning with Token-level Feedback for Controllable Text Generation Wendi Li et.al. 2403.11558 null
2024-03-18 LLM^3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning Shu Wang et.al. 2403.11552 link
2024-03-18 TARN-VIST: Topic Aware Reinforcement Network for Visual Storytelling Weiran Chen et.al. 2403.11550 null
2024-03-18 DEE: Dual-stage Explainable Evaluation Method for Text Generation Shenyu Zhang et.al. 2403.11509 null
2024-03-18 Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis Vishnu Sashank Dorbala et.al. 2403.11487 null
2024-03-18 VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding Yue Fan et.al. 2403.11481 null
2024-03-18 HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models Huy Nghiem et.al. 2403.11456 link
2024-03-14 Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference Piotr Nawrot et.al. 2403.09636 null
2024-03-14 3D-VLA: A 3D Vision-Language-Action Generative World Model Haoyu Zhen et.al. 2403.09631 null
2024-03-14 MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Brandon McKinzie et.al. 2403.09611 null
2024-03-14 Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey Xiaoyu Liu et.al. 2403.09606 null
2024-03-14 Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis Gregory Coppola et.al. 2403.09599 null
2024-03-14 ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models Runyu Ma et.al. 2403.09583 null
2024-03-14 Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation Yunhao Gou et.al. 2403.09572 null
2024-03-14 Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models Laura Fernández-Becerra et.al. 2403.09567 null
2024-03-14 Welcome Your New AI Teammate: On Safety Analysis by Leashing Large Language Models Ali Nouri et.al. 2403.09565 null
2024-03-14 Less is More: Data Value Estimation for Visual Instruction Tuning Zikang Liu et.al. 2403.09559 null
2024-03-13 Simple and Scalable Strategies to Continually Pre-train Large Language Models Adam Ibrahim et.al. 2403.08763 null
2024-03-13 Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework Jingling Li et.al. 2403.08743 null
2024-03-13 The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models Carlo Nicolini et.al. 2403.08739 null
2024-03-13 Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization Renjie Pi et.al. 2403.08730 null
2024-03-14 SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents Ruiyi Wang et.al. 2403.08715 link
2024-03-13 Review of Generative AI Methods in Cybersecurity Yagmur Yigit et.al. 2403.08701 null
2024-03-13 TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning Shangding Gu et.al. 2403.08694 null
2024-03-13 Token Alignment via Character Matching for Subword Completion Ben Athiwaratkun et.al. 2403.08688 null
2024-03-13 Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records Erlend Frayling et.al. 2403.08664 null
2024-03-13 Human Alignment of Large Language Models through Online Preference Optimisation Daniele Calandriello et.al. 2403.08635 null
2024-03-12 Beyond Text: Frozen Large Language Models in Visual Signal Comprehension Lei Zhu et.al. 2403.07874 link
2024-03-12 Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Fangyun Wei et.al. 2403.07872 null
2024-03-12 Exploring Safety Generalization Challenges of Large Language Models via Code Qibing Ren et.al. 2403.07865 null
2024-03-12 DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies William Xie et.al. 2403.07832 null
2024-03-12 The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing Jianchen Wang et.al. 2403.07825 null
2024-03-12 Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM Sainbayar Sukhbaatar et.al. 2403.07816 null
2024-03-12 Fine-tuning Large Language Models with Sequential Instructions Hanxu Hu et.al. 2403.07794 link
2024-03-12 Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations Carlos Jose Xavier Cruz et.al. 2403.07769 link
2024-03-12 Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings Sahand Sharifzadeh et.al. 2403.07750 null
2024-03-12 FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models Yan Liu et.al. 2403.07747 null
2024-03-11 Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena Leonie Weissweiler et.al. 2403.06965 null
2024-03-11 Materials science in the era of large language models: a perspective Ge Lei et.al. 2403.06949 null
2024-03-11 Naming, Describing, and Quantifying Visual Objects in Humans and LLMs Alberto Testoni et.al. 2403.06935 null
2024-03-11 ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis Yanming Liu et.al. 2403.06932 link
2024-03-11 MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning Yichuan Li et.al. 2403.06914 null
2024-03-11 Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents Nishchal Prasad et.al. 2403.06872 null
2024-03-11 Development of a Reliable and Accessible Caregiving Language Model (CaLM) Bambang Parmanto et.al. 2403.06857 null
2024-03-11 DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation Guosheng Zhao et.al. 2403.06845 null
2024-03-11 RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback Yanming Liu et.al. 2403.06840 link
2024-03-11 ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts Lyuye Zhang et.al. 2403.06838 null
2024-03-08 Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Machel Reid et.al. 2403.05530 null
2024-03-08 GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM Hao Kang et.al. 2403.05527 link
2024-03-08 Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola Yijiang Li et.al. 2403.05523 null
2024-03-08 Will GPT-4 Run DOOM? Adrian de Wynter et.al. 2403.05468 null
2024-03-08 Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs Arijit Nag et.al. 2403.05434 null
2024-03-08 Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings Wei Zhou et.al. 2403.05338 null
2024-03-08 ChatASU: Evoking LLM's Reflexion to Truly Understand Aspect Sentiment in Dialogues Yiding Liu et.al. 2403.05326 null
2024-03-08 RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation Zihao Wang et.al. 2403.05313 null
2024-03-08 Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents Jinyang Li et.al. 2403.05307 null
2024-03-08 ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications Sotaro Takeshita et.al. 2403.05303 link
2024-03-07 Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed Yifan Wang et.al. 2403.04765 null
2024-03-07 iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries Adam Coscia et.al. 2403.04760 link
2024-03-07 KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts Adam Coscia et.al. 2403.04758 link
2024-03-07 LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error Boshi Wang et.al. 2403.04746 link
2024-03-07 SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM Jielin Qiu et.al. 2403.04735 null
2024-03-07 ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes Hashmat Shadab Malik et.al. 2403.04701 null
2024-03-07 Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification Ekaterina Fadeeva et.al. 2403.04696 null
2024-03-07 PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation Junsong Chen et.al. 2403.04692 null
2024-03-07 Telecom Language Models: Must They Be Large? Nicola Piovesan et.al. 2403.04666 null
2024-03-07 QAQ: Quality Adaptive Quantization for LLM KV Cache Shichen Dong et.al. 2403.04643 link
2024-03-06 Bridging Language and Items for Retrieval and Recommendation Yupeng Hou et.al. 2403.03952 link
2024-03-06 Did Translation Models Get More Robust Without Anyone Even Noticing? Ben Peters et.al. 2403.03923 null
2024-03-06 Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing Asmita et.al. 2403.03897 null
2024-03-06 SaulLM-7B: A pioneering Large Language Model for Law Pierre Colombo et.al. 2403.03883 null
2024-03-06 Learning to Decode Collaboratively with Multiple Language Models Shannon Zejiang Shen et.al. 2403.03870 link
2024-03-06 On the Origins of Linear Representations in Large Language Models Yibo Jiang et.al. 2403.03867 null
2024-03-06 KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions Fangyuan Xu et.al. 2403.03866 null
2024-03-06 Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning Deepanway Ghosal et.al. 2403.03864 link
2024-03-06 X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification Hanzi Xu et.al. 2403.03863 link
2024-03-06 Emojinize: Enriching Any Text with Emoji Translations Lars Henning Klein et.al. 2403.03857 null
2024-03-05 The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning Nathaniel Li et.al. 2403.03218 null
2024-03-05 CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments Savitha Sam Abraham et.al. 2403.03203 null
2024-03-05 Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement Rafaela Martelo et.al. 2403.03188 link
2024-03-05 MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting Fangchen Liu et.al. 2403.03174 null
2024-03-05 SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection Peng Qi et.al. 2403.03170 null
2024-03-05 PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset Arda Uzunoğlu et.al. 2403.03167 link
2024-03-05 Quantum Many-Body Physics Calculations with Large Language Models Haining Pan et.al. 2403.03154 null
2024-03-05 Language Guided Exploration for RL Agents in Text Environments Hitesh Golchha et.al. 2403.03141 null
2024-03-05 Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution Flor Miriam Plaza-del-Arco et.al. 2403.03121 null
2024-03-05 "In Dialogues We Learn": Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning Chuanqi Cheng et.al. 2403.03102 null
2024-03-02 LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems Tasnim Ahmed et.al. 2403.01342 null
2024-03-02 Chaining thoughts and LLMs to learn DNA structural biophysics Tyler D. Ross et.al. 2403.01332 null
2024-03-02 VNLP: Turkish NLP Package Meliksah Turker et.al. 2403.01309 null
2024-03-02 VBART: The Turkish LLM Meliksah Turker et.al. 2403.01308 null
2024-03-02 ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation Moran Yanuka et.al. 2403.01306 null
2024-03-02 Improving the Validity of Automatically Generated Feedback via Reinforcement Learning Alexander Scarlatos et.al. 2403.01304 link
2024-03-02 NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention Tianyi Zhang et.al. 2403.01273 null
2024-03-02 Employing LLMs for Incident Response Planning and Review Sam Hays et.al. 2403.01271 null
2024-03-02 A comprehensive cross-language framework for harmful content detection with the aid of sentiment analysis Mohammad Dehghani et.al. 2403.01270 null
2024-03-02 Dissecting Language Models: Machine Unlearning via Selective Pruning Nicholas Pochinkov et.al. 2403.01267 null
2024-02-29 The All-Seeing Project V2: Towards General Relation Comprehension of the Open World Weiyun Wang et.al. 2402.19474 link
2024-02-29 Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling Gabriel Grand et.al. 2402.19471 null
2024-02-29 Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models Chen Qian et.al. 2402.19465 link
2024-02-29 Curiosity-driven Red-teaming for Large Language Models Zhang-Wei Hong et.al. 2402.19464 link
2024-02-29 ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL Yifei Zhou et.al. 2402.19446 link
2024-02-29 Compositional API Recommendation for Library-Oriented Code Generation Zexiong Ma et.al. 2402.19431 null
2024-02-29 Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines Lijia Ma et.al. 2402.19421 null
2024-02-29 On the Scaling Laws of Geographical Representation in Language Models Nathan Godey et.al. 2402.19406 null
2024-02-29 Entity-Aware Multimodal Alignment Framework for News Image Captioning Junzhe Zhang et.al. 2402.19404 null
2024-02-29 Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Match Human Crowd Accuracy Philipp Schoenegger et.al. 2402.19379 null
2024-02-28 Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards Haoxiang Wang et.al. 2402.18571 link
2024-02-28 A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic Gregory Coppola et.al. 2402.18566 null
2024-02-28 Implicit Bias of Next-Token Prediction Christos Thrampoulidis et.al. 2402.18551 null
2024-02-28 Few-Shot Fairness: Unveiling LLM's Potential for Fairness-Aware Classification Garima Chhikara et.al. 2402.18502 null
2024-02-28 Take It, Leave It, or Fix It: Measuring Productivity and Trust in Human-AI Collaboration Crystal Qian et.al. 2402.18498 null
2024-02-28 Language Models Represent Beliefs of Self and Others Wentao Zhu et.al. 2402.18496 null
2024-02-28 Meta-Task Prompting Elicits Embedding from Large Language Models Yibin Lei et.al. 2402.18458 null
2024-02-28 Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication Weize Chen et.al. 2402.18439 link
2024-02-28 Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport Bin Li et.al. 2402.18411 link
2024-02-28 A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models Xiujie Song et.al. 2402.18409 null

(back to top)

Scene Understanding

Publish Date Title Authors PDF Code
2024-05-21 Anticipating Object State Changes Victoria Manousaki et.al. 2405.12789 null
2024-05-21 Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency Hyeongjin Kim et.al. 2405.12648 null
2024-05-20 MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering Jingqun Tang et.al. 2405.11985 null
2024-05-19 The First Swahili Language Scene Text Detection and Recognition Dataset Fadila Wendigoundi Douamba et.al. 2405.11437 link
2024-05-16 Grounded 3D-LLM with Referent Tokens Yilun Chen et.al. 2405.10370 null
2024-05-16 4D Panoptic Scene Graph Generation Jingkang Yang et.al. 2405.10305 link
2024-05-16 When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models Xianzheng Ma et.al. 2405.10255 null
2024-05-16 A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance Andrea Matteazzi et.al. 2405.10046 null
2024-05-15 BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation Yunhao Ge et.al. 2405.09546 null
2024-05-15 HAAP: Vision-context Hierarchical Attention Autoregressive with Adaptive Permutation for Scene Text Recognition Honghui Chen et.al. 2405.09125 null
2024-05-15 3D Shape Augmentation with Content-Aware Shape Resizing Mingxiang Chen et.al. 2405.09050 null
2024-05-09 Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control Gunshi Gupta et.al. 2405.05852 link
2024-05-11 Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition Zuan Gao et.al. 2405.05841 null
2024-05-09 Benchmarking Neural Radiance Fields for Autonomous Robots: An Overview Yuhang Ming et.al. 2405.05526 null
2024-05-09 DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction Siyu Li et.al. 2405.05518 null
2024-05-08 OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies Lingdong Kong et.al. 2405.05259 link
2024-05-08 Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving Lingdong Kong et.al. 2405.05258 link
2024-05-07 DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving Chen Min et.al. 2405.04390 null
2024-05-07 Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing Boqiang Zhang et.al. 2405.04377 null
2024-05-06 An Empty Room is All We Want: Automatic Defurnishing of Indoor Panoramas Mira Slavcheva et.al. 2405.03682 null
2024-05-04 Few-Shot Fruit Segmentation via Transfer Learning Jordan A. James et.al. 2405.02556 link
2024-04-29 Q-GroundCAM: Quantifying Grounding in Vision Language Models via GradCAM Navid Rajabi et.al. 2404.19128 null
2024-04-29 Compositional Factorization of Visual Scenes with Convolutional Sparse Coding and Resonator Networks Christopher J. Kymn et.al. 2404.19126 null
2024-04-24 Seeing Beyond Classes: Zero-Shot Grounded Situation Recognition via Language Explainer Jiaming Lei et.al. 2404.15785 null
2024-04-22 CloudFort: Enhancing Robustness of 3D Point Cloud Classification Against Backdoor Attacks via Spatial Partitioning and Ensemble Prediction Wenhao Lan et.al. 2404.14042 null
2024-04-22 On Support Relations Inference and Scene Hierarchy Graph Construction from Point Cloud in Clustered Environments Gang Ma et.al. 2404.13842 null
2024-04-29 Clio: Real-time Task-Driven Open-Set 3D Scene Graphs Dominic Maggio et.al. 2404.13696 link
2024-04-19 BACS: Background Aware Continual Semantic Segmentation Mostafa ElAraby et.al. 2404.13148 link
2024-04-19 Unified Scene Representation and Reconstruction for 3D Large Language Models Tao Chu et.al. 2404.13044 null
2024-04-18 SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation Mykola Lavreniuk et.al. 2404.12501 null
2024-04-19 AccidentBlip2: Accident Detection With Multi-View MotionBlip2 Yihua Shao et.al. 2404.12149 link
2024-04-17 Multimodal 3D Object Detection on Unseen Domains Deepti Hegde et.al. 2404.11764 null
2024-04-16 ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation Iaroslav Melekhov et.al. 2404.10699 link
2024-04-16 PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction Sinisa Stekovic et.al. 2404.10620 null
2024-04-16 PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network Yuning Wang et.al. 2404.10263 null
2024-04-15 No More Ambiguity in 360° Room Layout via Bi-Layout Estimation Yu-Ju Tsai et.al. 2404.09993 null
2024-04-15 A Review and Efficient Implementation of Scene Graph Generation Metrics Julian Lorenz et.al. 2404.09616 null
2024-04-14 Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation in Operating Rooms Diandian Guo et.al. 2404.09231 null
2024-04-11 Gaga: Group Any Gaussians via 3D-aware Memory Bank Weijie Lyu et.al. 2404.07977 null
2024-04-11 AUG: A New Dataset and An Efficient Model for Aerial Image Urban Scene Graph Generation Yansheng Li et.al. 2404.07788 null
2024-04-11 Depth Estimation using Weighted-loss and Transfer Learning Muhammad Adeel Hafeez et.al. 2404.07686 null
2024-04-11 Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange Yanhao Wu et.al. 2404.07504 null
2024-04-10 Incorporating Explanations into Human-Machine Interfaces for Trust and Situation Awareness in Autonomous Vehicles Shahin Atakishiyev et.al. 2404.07383 null
2024-04-10 ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling Ege Özsoy et.al. 2404.07031 null
2024-04-10 O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation Muer Tie et.al. 2404.06836 null
2024-04-09 QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding Yash Mehan et.al. 2404.06442 null
2024-04-09 DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning Senthil Yogamani et.al. 2404.06352 null
2024-04-09 JSTR: Judgment Improves Scene Text Recognition Masato Fujitake et.al. 2404.05967 null
2024-04-06 Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation Danpei Zhao et.al. 2404.04608 null
2024-04-06 SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos Tao Wu et.al. 2404.04565 null
2024-04-05 Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation Zifu Wan et.al. 2404.04256 link
2024-04-06 HAPNet: Toward Superior RGB-Thermal Scene Parsing via Hybrid, Asymmetric, and Progressive Heterogeneous Feature Fusion Jiahang Li et.al. 2404.03527 link
2024-04-04 You Only Scan Once: A Dynamic Scene Reconstruction Pipeline for 6-DoF Robotic Grasping of Novel Objects Lei Zhou et.al. 2404.03462 null
2024-04-03 Weakly-Supervised 3D Scene Graph Generation via Visual-Linguistic Assisted Pseudo-labeling Xu Wang et.al. 2404.02527 null
2024-04-05 EGTR: Extracting Graph from Transformer for Scene Graph Generation Jinbae Im et.al. 2404.02072 link
2024-04-01 NeRF-MAE: Masked AutoEncoders for Self Supervised 3D representation Learning for Neural Radiance Fields Muhammad Zubair Irshad et.al. 2404.01300 null
2024-04-08 360+x: A Panoptic Multi-modal Scene Understanding Dataset Hao Chen et.al. 2404.00989 null
2024-04-01 Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping Hyeongjun Kwon et.al. 2404.00974 link
2024-04-01 GOV-NeSF: Generalizable Open-Vocabulary Neural Semantic Fields Yunsong Wang et.al. 2404.00931 link
2024-04-01 MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements Lisong C. Sun et.al. 2404.00923 null
2024-04-01 From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models Rongjie Li et.al. 2404.00906 null
2024-03-31 Adapting to Length Shift: FlexiLength Network for Trajectory Prediction Yi Xu et.al. 2404.00742 null
2024-03-31 Neural Radiance Field-based Visual Rendering: A Comprehensive Review Mingyuan Yao et.al. 2404.00714 null
2024-03-29 VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection Zihua Liu et.al. 2404.00149 null
2024-03-29 HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes Ke Wu et.al. 2403.20159 null
2024-04-01 Efficient 3D Instance Mapping and Localization with Neural Fields George Tang et.al. 2403.19797 null
2024-03-27 Object Pose Estimation via the Aggregation of Diffusion Features Tianfu Wang et.al. 2403.18791 link
2024-03-25 Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding Lingdong Kong et.al. 2403.17010 link
2024-03-25 Towards Trustworthy Automated Driving through Qualitative Scene Understanding and Explanations Nassim Belmecheri et.al. 2403.16908 null
2024-03-25 DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding Xiaoxuan Yu et.al. 2403.16431 link
2024-03-24 AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans Cedric Perauer et.al. 2403.16318 null
2024-03-24 Improving Scene Graph Generation with Relation Words' Debiasing in Vision-Language Models Yuxuan Wang et.al. 2403.16184 null
2024-03-24 Multi-Task Learning with Multi-Task Optimization Lu Bai et.al. 2403.16162 null
2024-03-24 Semantic Is Enough: Only Semantic Information For NeRF Reconstruction Ruibo Wang et.al. 2403.16043 null
2024-03-22 Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting Jun Guo et.al. 2403.15624 null
2024-03-22 DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data Hanrong Ye et.al. 2403.15389 null
2024-03-21 DSGG: Dense Relation Transformer for an End-to-end Scene Graph Generation Zeeshan Hayder et.al. 2403.14886 null
2024-03-21 Evaluating Panoramic 3D Estimation in Indoor Lighting Analysis Zining Cheng et.al. 2403.14836 null
2024-03-21 SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field Lizhe Liu et.al. 2403.14366 null
2024-03-21 Exosense: A Vision-Centric Scene Understanding System For Safe Exoskeleton Navigation Jianeng Wang et.al. 2403.14320 null
2024-03-21 Volumetric Environment Representation for Vision-Language Navigation Rui Liu et.al. 2403.14158 null
2024-03-21 3D Object Detection from Point Cloud via Voting Step Diffusion Haoran Hou et.al. 2403.14133 null
2024-03-20 **Efficient scene text ima