Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-07-16 | Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling | Jaehyeok Kim et.al. | 2407.11962v1 | null |
2024-07-16 | A Transformer-based Approach for Augmenting Software Engineering Chatbots Datasets | Ahmad Abdellatif et.al. | 2407.11955v1 | null |
2024-07-16 | Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation | Olga Zatsarynna et.al. | 2407.11954v1 | null |
2024-07-16 | Temporally Consistent Stereo Matching | Jiaxi Zeng et.al. | 2407.11950v1 | link |
2024-07-17 | Hierarchical Separable Video Transformer for Snapshot Compressive Imaging | Ping Wang et.al. | 2407.11946v2 | link |
2024-07-16 | Tackling Oversmoothing in GNN via Graph Sparsification: A Truss-based Approach | Tanvir Hossain et.al. | 2407.11928v1 | null |
2024-07-16 | The Strength of Bisymmetric Modes in SDSS-IV/MaNGA Barred Galaxy Kinematics | Brian DiGiorgio Zanger et.al. | 2407.11908v1 | null |
2024-07-16 | GraphFM: A Scalable Framework for Multi-Graph Pretraining | Divyansha Lachi et.al. | 2407.11907v1 | null |
2024-07-16 | SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge | Hao Ding et.al. | 2407.11906v1 | null |
2024-07-16 | Automated production of batched unclonable micro-patterns anti-counterfeiting labels with strong robustness and rapid recognition speed | Yuzheng He et.al. | 2407.11886v1 | null |
2024-07-15 | No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations | Walter Simoncini et.al. | 2407.10964v1 | link |
2024-07-15 | InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models | Nirat Saini et.al. | 2407.10958v1 | null |
2024-07-15 | MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models | Chengguang Gan et.al. | 2407.10953v1 | null |
2024-07-15 | IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation | Yuanhao Zhai et.al. | 2407.10937v1 | link |
2024-07-15 | Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together | Dilara Soylu et.al. | 2407.10930v1 | null |
2024-07-15 | In-Loop Filtering via Trained Look-Up Tables | Zhuoyuan Li et.al. | 2407.10926v1 | null |
2024-07-15 | A Dual-Attention Aware Deep Convolutional Neural Network for Early Alzheimer's Detection | Pandiyaraju V et.al. | 2407.10921v1 | null |
2024-07-16 | DataDream: Few-shot Guided Dataset Generation | Jae Myung Kim et.al. | 2407.10910v2 | link |
2024-07-15 | Interpreting Hand gestures using Object Detection and Digits Classification | Sangeetha K et.al. | 2407.10902v1 | null |
2024-07-15 | Leveraging Multimodal CycleGAN for the Generation of Anatomically Accurate Synthetic CT Scans from MRIs | Leonardo Crespi et.al. | 2407.10888v1 | null |
2024-07-12 | Non-Hermitian Origin of Wannier Localizability and Detachable Topological Boundary States | Daichi Nakamura et.al. | 2407.09458v1 | null |
2024-07-12 | Let Me DeCode You: Decoder Conditioning with Tabular Data | Tomasz Szczepański et.al. | 2407.09437v1 | link |
2024-07-12 | Rethinking temporal self-similarity for repetitive action counting | Yanan Luo et.al. | 2407.09431v1 | null |
2024-07-12 | TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models | Hang Zou et.al. | 2407.09424v1 | null |
2024-07-12 | A grid of self-consistent MSG (MARCS-StaticWeather-GGchem) cool stellar, sub-stellar, and exoplanetary model atmospheres | Uffe G. Jørgensen et.al. | 2407.09397v1 | null |
2024-07-12 | Open-Canopy: A Country-Scale Benchmark for Canopy Height Estimation at Very High Resolution | Fajwel Fogel et.al. | 2407.09392v1 | link |
2024-07-12 | Radiance Fields from Photons | Sacha Jungerman et.al. | 2407.09386v1 | null |
2024-07-12 | Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation | Zhilin Zhu et.al. | 2407.09367v1 | link |
2024-07-12 | Novel clustered federated learning based on local loss | Endong Gu et.al. | 2407.09360v1 | link |
2024-07-12 | Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems | Ziyuan Luo et.al. | 2407.09352v1 | null |
2024-07-11 | Video Diffusion Alignment via Reward Gradients | Mihir Prabhudesai et.al. | 2407.08737v1 | link |
2024-07-11 | Real-Time Anomaly Detection and Reactive Planning with Large Language Models | Rohan Sinha et.al. | 2407.08735v1 | null |
2024-07-11 | WhisperNetV2: SlowFast Siamese Network For Lip-Based Biometrics | Abdollah Zakeri et.al. | 2407.08717v1 | null |
2024-07-11 | Sensor-Aware Classifiers for Energy-Efficient Time Series Applications on IoT Devices | Dina Hussein et.al. | 2407.08715v1 | null |
2024-07-11 | Towards Efficient Deployment of Hybrid SNNs on Neuromorphic and Edge AI Hardware | James Seekings et.al. | 2407.08704v1 | null |
2024-07-11 | Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models | Zhening Xing et.al. | 2407.08701v1 | null |
2024-07-11 | ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions | Jiu Feng et.al. | 2407.08691v1 | link |
2024-07-11 | Generalizable Implicit Motion Modeling for Video Frame Interpolation | Zujin Guo et.al. | 2407.08680v1 | null |
2024-07-11 | Still-Moving: Customized Video Generation without Customized Video Data | Hila Chefer et.al. | 2407.08674v1 | null |
2024-07-11 | NODE-Adapter: Neural Ordinary Differential Equations for Better Vision-Language Reasoning | Yi Zhang et.al. | 2407.08672v1 | null |
2024-07-10 | LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models | Feng Li et.al. | 2407.07895v1 | link |
2024-07-10 | Vegetable Peeling: A Case Study in Constrained Dexterous Manipulation | Tao Chen et.al. | 2407.07884v1 | null |
2024-07-10 | Controlling Space and Time with Diffusion Models | Daniel Watson et.al. | 2407.07860v1 | null |
2024-07-11 | Functional Assessment of Cerebral Capillaries using Single Capillary Reporters in Ultrasound Localization Microscopy | Stephen A Lee et.al. | 2407.07857v2 | null |
2024-07-10 | Study on Aspect Ratio Variability toward Robustness of Vision Transformer-based Vehicle Re-identification | Mei Qiu et.al. | 2407.07842v1 | null |
2024-07-10 | Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective | Shengjia Chen et.al. | 2407.07841v1 | link |
2024-07-10 | Probe and Prejudice: Classification of compact objects and model comparison using EOS knowledge | Hauke Koehn et.al. | 2407.07837v1 | null |
2024-07-10 | RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement | Honglie Chen et.al. | 2407.07825v1 | null |
2024-07-10 | New Gravitational Wave Discoveries Enabled by Machine Learning | Alexandra E. Koloniari et.al. | 2407.07820v1 | null |
2024-07-10 | The Misclassification Likelihood Matrix: Some Classes Are More Likely To Be Misclassified Than Others | Daniel Sikar et.al. | 2407.07818v1 | null |
2024-07-09 | V-VIPE: Variational View Invariant Pose Embedding | Mara Levy et.al. | 2407.07092v1 | null |
2024-07-09 | Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic | Ruochen Jin et.al. | 2407.07089v1 | link |
2024-07-09 | MoSt-DSA: Modeling Motion and Structural Interactions for Direct Multi-Frame Interpolation in DSA Images | Ziyang Xu et.al. | 2407.07078v1 | link |
2024-07-09 | MADE-for-ASD: A Multi-Atlas Deep Ensemble Network for Diagnosing Autism Spectrum Disorder | Md Rakibul Hasan et.al. | 2407.07076v1 | null |
2024-07-10 | CAPformer: Compression-Aware Pre-trained Transformer for Low-Light Image Enhancement | Wei Wang et.al. | 2407.07056v2 | null |
2024-07-09 | Latent Space Imaging | Matheus Souza et.al. | 2407.07052v1 | null |
2024-07-09 | Simple and Interpretable Probabilistic Classifiers for Knowledge Graphs | Christian Riefolo et.al. | 2407.07045v1 | null |
2024-07-09 | Free Fermionic Constructions of Heterotic Strings | Ioannis Florakis et.al. | 2407.07034v1 | null |
2024-07-09 | Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition | Daiqing Wu et.al. | 2407.07026v1 | null |
2024-07-09 | Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization | Jeongseok Hyun et.al. | 2407.07024v1 | link |
2024-07-08 | Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision | Orr Zohar et.al. | 2407.06189v1 | link |
2024-07-08 | Classification of Cellular Automata based on the Hamming distance | Gaspar Alfaro et.al. | 2407.06175v1 | null |
2024-07-08 | The Tug-of-War Between Deepfake Generation and Detection | Hannah Lee et.al. | 2407.06174v1 | null |
2024-07-08 | PanDORA: Casual HDR Radiance Acquisition for Indoor Scenes | Mohammad Reza Karimi Dastjerdi et.al. | 2407.06150v1 | null |
2024-07-08 | Physics-informed machine learning approaches to reactor antineutrino detection | Sophia Farrell et.al. | 2407.06139v1 | null |
2024-07-08 | Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities | Avinash Anand et.al. | 2407.06125v1 | null |
2024-07-08 | Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation | Xinyu Bai et.al. | 2407.06095v1 | null |
2024-07-08 | ERR@HRI 2024 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Interactions | Micol Spitale et.al. | 2407.06094v1 | null |
2024-07-08 | Artificial Intuition: Efficient Classification of Scientific Abstracts | Harsh Sakhrani et.al. | 2407.06093v1 | null |
2024-07-08 | Assessing Cardiomegaly in Dogs Using a Simple CNN Model | Nikhil Deekonda et.al. | 2407.06092v1 | null |
2024-07-05 | VCoME: Verbal Video Composition with Multimodal Editing Effects | Weibo Gong et.al. | 2407.04697v1 | null |
2024-07-05 | Enhancing Vehicle Re-identification and Matching for Weaving Analysis | Mei Qiu et.al. | 2407.04688v1 | null |
2024-07-05 | Embracing Massive Medical Data | Yu-Cheng Chou et.al. | 2407.04687v1 | link |
2024-07-05 | Is plantar thermography a valid digital biomarker for characterising diabetic foot ulceration risk? | Akshay Jagadeesh et.al. | 2407.04676v1 | null |
2024-07-05 | AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation | Yuhan Zhu et.al. | 2407.04603v1 | null |
2024-07-05 | Multimodal Classification via Modal-Aware Interactive Enhancement | Qing-Yuan Jiang et.al. | 2407.04587v1 | null |
2024-07-05 | A Degree Bound for Planar Functions | Christof Beierle et.al. | 2407.04570v1 | null |
2024-07-05 | Pencils of plane cubics with one base point | Riccardo Moschetti et.al. | 2407.04569v1 | null |
2024-07-05 | Anticipating Solar Flares | Hugh S. Hudson et.al. | 2407.04567v1 | null |
2024-07-05 | Real Time Emotion Analysis Using Deep Learning for Education, Entertainment, and Beyond | Abhilash Khuntia et.al. | 2407.04560v1 | null |
2024-07-03 | InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output | Pan Zhang et.al. | 2407.03320v1 | link |
2024-07-03 | Value-Penalized Auxiliary Control from Examples for Learning without Rewards or Demonstrations | Trevor Ablett et.al. | 2407.03311v1 | link |
2024-07-03 | Accelerated Proton Resonance Frequency-based Magnetic Resonance Thermometry by Optimized Deep Learning Method | Sijie Xu et.al. | 2407.03308v1 | link |
2024-07-03 | HoloHisto: End-to-end Gigapixel WSI Segmentation with 4K Resolution Sequential Tokenization | Yucheng Tang et.al. | 2407.03307v1 | null |
2024-07-03 | VCHAR:Variance-Driven Complex Human Activity Recognition framework with Generative Representation | Yuan Sun et.al. | 2407.03291v1 | null |
2024-07-03 | Using Photoplethysmography to Detect Real-time Blood Pressure Changes with a Calibration-free Deep Learning Model | Jingyuan Hong et.al. | 2407.03274v1 | null |
2024-07-03 | Modern Neighborhood Components Analysis: A Deep Tabular Baseline Two Decades Later | Han-Jia Ye et.al. | 2407.03257v1 | link |
2024-07-03 | STF: Sentence Transformer Fine-Tuning For Topic Categorization With Limited Data | Kheir Eddine Daouadi et.al. | 2407.03253v1 | null |
2024-07-03 | ACTRESS: Active Retraining for Semi-supervised Visual Grounding | Weitai Kang et.al. | 2407.03251v1 | null |
2024-07-04 | TieBot: Learning to Knot a Tie from Visual Demonstration through a Real-to-Sim-to-Real Approach | Weikun Peng et.al. | 2407.03245v2 | null |
2024-07-02 | Characterizing the Interpretability of Attention Maps in Digital Pathology | Tomé Albuquerque et.al. | 2407.02484v1 | null |
2024-07-02 | Ensemble of pre-trained language models and data augmentation for hate speech detection from Arabic tweets | Kheir Eddine Daouadi et.al. | 2407.02448v1 | null |
2024-07-02 | PLeaS -- Merging Models with Permutations and Least Squares | Anshul Nasery et.al. | 2407.02447v1 | null |
2024-07-02 | Evaluating the Robustness of Adverse Drug Event Classification Models Using Templates | Dorothea MacPhail et.al. | 2407.02432v1 | null |
2024-07-02 | AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans | Gabriele Lozupone et.al. | 2407.02418v1 | link |
2024-07-03 | Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs | Jinmin Li et.al. | 2407.02411v2 | null |
2024-07-02 | Tiny-PULP-Dronets: Squeezing Neural Networks for Faster and Lighter Inference on Multi-Tasking Autonomous Nano-Drones | Lorenzo Lamberti et.al. | 2407.02405v1 | null |
2024-07-03 | A neural networks method to search for long transient gravitational waves | Francesca Attadio et.al. | 2407.02391v2 | null |
2024-07-02 | Real HSI-MSI-PAN image dataset for the hyperspectral/multi-spectral/panchromatic image fusion and super-resolution fields | Shuangliang Li et.al. | 2407.02387v1 | link |
2024-07-02 | OpenSlot: Mixed Open-set Recognition with Object-centric Learning | Xu Yin et.al. | 2407.02386v1 | null |
2024-06-28 | Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs | Sukmin Yun et.al. | 2406.20098v1 | link |
2024-06-28 | LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression | Jieneng Chen et.al. | 2406.20092v1 | link |
2024-06-28 | Minimax And Adaptive Transfer Learning for Nonparametric Classification under Distributed Differential Privacy Constraints | Arnab Auddy et.al. | 2406.20088v1 | null |
2024-06-28 | Extreme horizon equation | Wojciech Kamiński et.al. | 2406.20068v1 | null |
2024-06-28 | Modeling and LQR Control of Insect Sized Flapping Wing Robot | Daksh Dhingra et.al. | 2406.20061v1 | null |
2024-06-28 | Pairwise Difference Learning for Classification | Mohamed Karim Belaid et.al. | 2406.20031v1 | link |
2024-06-28 | On the Trade-off between Flatness and Optimization in Distributed Learning | Ying Cao et.al. | 2406.20006v1 | null |
2024-06-28 | Malaria Cell Detection Using Deep Neural Networks | Saurabh Sawant et.al. | 2406.20005v1 | null |
2024-06-28 | Impact of Initialization on Intra-subject Pediatric Brain MR Image Registration: A Comparative Analysis between SyN ANTs and Deep Learning-Based Approaches | Andjela Dimitrijevic et.al. | 2406.19943v1 | link |
2024-07-01 | GRACE: Graph-Regularized Attentive Convolutional Entanglement with Laplacian Smoothing for Robust DeepFake Video Detection | Chih-Chung Hsu et.al. | 2406.19941v2 | link |
2024-06-27 | ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos | Jr-Jen Chen et.al. | 2406.19392v1 | link |
2024-06-27 | Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads | Ali Khaleghi Rahimian et.al. | 2406.19391v1 | link |
2024-06-27 | OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding | Tao Zhang et.al. | 2406.19389v1 | null |
2024-06-27 | Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model | Haobo Yuan et.al. | 2406.19369v1 | null |
2024-06-27 | IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language | Lucky Susanto et.al. | 2406.19349v1 | null |
2024-06-27 | Learning Visual Conditioning Tokens to Correct Domain Shift for Fully Test-time Adaptation | Yushun Tang et.al. | 2406.19341v1 | null |
2024-06-28 | LiverUSRecon: Automatic 3D Reconstruction and Volumetry of the Liver with a Few Partial Ultrasound Scans | Kaushalya Sivayogaraj et.al. | 2406.19336v2 | null |
2024-06-27 | PNeRV: A Polynomial Neural Representation for Videos | Sonam Gupta et.al. | 2406.19299v1 | null |
2024-06-27 | Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers | Jinsong Chen et.al. | 2406.19258v1 | null |
2024-06-27 | Enhancing Video-Language Representations with Structural Spatio-Temporal Alignment | Hao Fei et.al. | 2406.19255v1 | null |
2024-06-26 | Towards Compositionality in Concept Learning | Adam Stein et.al. | 2406.18534v1 | link |
2024-06-26 | MatchTime: Towards Automatic Soccer Game Commentary Generation | Jiayuan Rao et.al. | 2406.18530v1 | null |
2024-06-26 | MultiDiff: Consistent Novel View Synthesis from a Single Image | Norman Müller et.al. | 2406.18524v1 | null |
2024-06-26 | ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation | Shenghai Yuan et.al. | 2406.18522v1 | null |
2024-06-27 | Distinguishing mechanisms of social contagion from local network view | Elsa Andres et.al. | 2406.18519v2 | null |
2024-06-26 | Assessment of Clonal Hematopoiesis of Indeterminate Potential from Cardiac Magnetic Resonance Imaging using Deep Learning in a Cardio-oncology Population | Sangeon Ryu et.al. | 2406.18508v1 | null |
2024-06-26 | Robust Surgical Phase Recognition From Annotation Efficient Supervision | Or Rubin et.al. | 2406.18481v1 | null |
2024-06-26 | Universal Anomaly Detection at the LHC: Transforming Optimal Classifiers and the DDD Method | Sascha Caron et.al. | 2406.18469v1 | null |
2024-06-26 | An Autotuning-based Optimization Framework for Mixed-kernel SVM Classifications in Smart Pixel Datasets and Heterojunction Transistors | Xingfu Wu et.al. | 2406.18445v1 | null |
2024-06-26 | Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling | Abril Corona-Figueroa et.al. | 2406.18422v1 | null |
2024-06-25 | Text-Animator: Controllable Visual Text Video Generation | Lin Liu et.al. | 2406.17777v1 | null |
2024-06-25 | MotionBooth: Motion-Aware Customized Text-to-Video Generation | Jianzong Wu et.al. | 2406.17758v1 | null |
2024-06-25 | Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation | Tushar Prasanna Swaminathan et.al. | 2406.17749v1 | null |
2024-06-25 | Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning | Arijit Sehanobish et.al. | 2406.17740v1 | null |
2024-06-25 | Mask-Guided Attention U-Net for Enhanced Neonatal Brain Extraction and Image Preprocessing | Bahram Jafrasteh et.al. | 2406.17709v1 | link |
2024-06-25 | SurgeMOD: Translating image-space tissue motions into vision-based surgical forces | Mikel De Iturrate Reyzabal et.al. | 2406.17707v1 | link |
2024-06-25 | Dualities for universal (co)acting Hopf monoids | Ana Agore et.al. | 2406.17684v1 | null |
2024-06-25 | Local-to-Global Cross-Modal Attention-Aware Fusion for HSI-X Semantic Segmentation | Xuming Zhang et.al. | 2406.17679v1 | null |
2024-06-25 | Lifting of locally initial objects and universal (co)acting Hopf algebras | Ana Agore et.al. | 2406.17677v1 | null |
2024-06-25 | Brain Tumor Classification using Vision Transformer with Selective Cross-Attention Mechanism and Feature Calibration | Mohammad Ali Labbaf Khaniki et.al. | 2406.17670v1 | null |
2024-06-24 | StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal | Chongjie Ye et.al. | 2406.16864v1 | null |
2024-06-24 | FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models | Haonan Qiu et.al. | 2406.16863v1 | link |
2024-06-24 | Dreamitate: Real-World Visuomotor Policy Learning via Video Generation | Junbang Liang et.al. | 2406.16862v1 | null |
2024-06-24 | Long Context Transfer from Language to Vision | Peiyuan Zhang et.al. | 2406.16852v1 | link |
2024-06-24 | Unsupervised Domain Adaptation for Pediatric Brain Tumor Segmentation | Jingru Fu et.al. | 2406.16848v1 | null |
2024-06-24 | Exploring Factual Entailment with NLI: A News Media Study | Guy Mor-Lan et.al. | 2406.16842v1 | null |
2024-06-24 | A Certifiable Algorithm for Simultaneous Shape Estimation and Object Tracking | Lorenzo Shaikewitz et.al. | 2406.16837v1 | null |
2024-06-24 | USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations | Mounika Marreddy et.al. | 2406.16833v1 | null |
2024-06-24 | The classification of simple complex Lie superalgebras of polynomial vector fields and their deformations | Dimitry Leites et.al. | 2406.16760v1 | null |
2024-06-24 | The MRI Scanner as a Diagnostic: Image-less Active Sampling | Yuning Du et.al. | 2406.16754v1 | null |
2024-06-21 | Full-Scale Indexing and Semantic Annotation of CT Imaging: Boosting FAIRness | Hannes Ulrich et.al. | 2406.15340v1 | null |
2024-06-21 | Image Conductor: Precision Control for Interactive Video Synthesis | Yaowei Li et.al. | 2406.15339v1 | null |
2024-06-21 | An End-to-End, Segmentation-Free, Arabic Handwritten Recognition Model on KHATT | Sondos Aabed et.al. | 2406.15329v1 | null |
2024-06-21 | Fine-grained Attention in Hierarchical Transformers for Tabular Time-series | Raphael Azorin et.al. | 2406.15327v1 | link |
2024-06-21 | NLP-KG: A System for Exploratory Search of Scientific Literature in Natural Language Processing | Tim Schopf et.al. | 2406.15294v1 | link |
2024-06-21 | Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics | Weijia Zhang et.al. | 2406.15264v1 | null |
2024-06-24 | VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation | Xuan He et.al. | 2406.15252v2 | null |
2024-06-21 | Retrieval Augmented Zero-Shot Text Classification | Tassallah Abdullahi et.al. | 2406.15241v1 | null |
2024-06-21 | Model Equivalences | Michael Benedikt et.al. | 2406.15235v1 | null |
2024-06-21 | Rate-Splitting Multiple Access for Overloaded Multi-group Multicast: A First Experimental Study | Xinze Lyu et.al. | 2406.15217v1 | null |
2024-06-20 | A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models | Xincheng Shuai et.al. | 2406.14555v1 | link |
2024-06-21 | Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation | Eyal Michaeli et.al. | 2406.14551v2 | link |
2024-06-20 | IRASim: Learning Interactive Real-Robot Action Simulators | Fangqi Zhu et.al. | 2406.14540v1 | null |
2024-06-20 | Epicardium Prompt-guided Real-time Cardiac Ultrasound Frame-to-volume Registration | Long Lei et.al. | 2406.14534v1 | link |
2024-06-20 | Local symmetries in partially ordered sets | Christoph Minz et.al. | 2406.14533v1 | null |
2024-06-20 | Fantastic Copyrighted Beasts and How (Not) to Generate Them | Luxi He et.al. | 2406.14526v1 | null |
2024-06-20 | MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding | Xinyu Fang et.al. | 2406.14515v1 | link |
2024-06-20 | V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data | Rotem Shalev-Arkushin et.al. | 2406.14510v1 | null |
2024-06-20 | LLaSA: Large Multimodal Agent for Human Activity Analysis Through Wearable Sensors | Sheikh Asif Imran et.al. | 2406.14498v1 | link |
2024-06-20 | African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification | Gregor Geigle et.al. | 2406.14496v1 | null |
2024-06-18 | DrVideo: Document Retrieval Based Long Video Understanding | Ziyu Ma et.al. | 2406.12846v1 | null |
2024-06-18 | LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging | Jinuk Kim et.al. | 2406.12837v1 | link |
2024-06-18 | GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation | Ci-Siang Lin et.al. | 2406.12834v1 | null |
2024-06-18 | VIA: A Spatiotemporal Video Adaptation Framework for Global and Local Video Editing | Jing Gu et.al. | 2406.12831v1 | null |
2024-06-18 | Neural Approximate Mirror Maps for Constrained Diffusion Models | Berthy T. Feng et.al. | 2406.12816v1 | null |
2024-06-18 | Privacy Preserving Federated Learning in Medical Imaging with Uncertainty Estimation | Nikolas Koutsoubis et.al. | 2406.12815v1 | link |
2024-06-18 | Probabilistic Temporal Prediction of Continuous Disease Trajectories and Treatment Effects Using Neural SDEs | Joshua Durso-Finley et.al. | 2406.12807v1 | null |
2024-06-18 | Composited-Nested-Learning with Data Augmentation for Nested Named Entity Recognition | Xingming Liao et.al. | 2406.12779v1 | null |
2024-06-18 | Medvedev degrees of subshifts on groups | Sebastián Barbieri et.al. | 2406.12777v1 | null |
2024-06-18 | Latent Intuitive Physics: Learning to Transfer Hidden Physics from A 3D Video | Xiangming Zhu et.al. | 2406.12769v1 | null |
2024-06-17 | Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Lei Zhu et.al. | 2406.11837v1 | link |
2024-06-17 | Spectral Introspection Identifies Group Training Dynamics in Deep Neural Networks for Neuroimaging | Bradley T. Baker et.al. | 2406.11825v1 | null |
2024-06-17 | Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation | Alexander Raistrick et.al. | 2406.11824v1 | null |
2024-06-17 | VideoLLM-online: Online Video Large Language Model for Streaming Video | Joya Chen et.al. | 2406.11816v1 | null |
2024-06-17 | Faces of Experimental Pain: Transferability of Deep Learned Heat Pain Features to Electrical Pain | Pooja Prajod et.al. | 2406.11808v1 | null |
2024-06-17 | Mix-Domain Contrastive Learning for Unpaired H&E-to-IHC Stain Translation | Song Wang et.al. | 2406.11799v1 | null |
2024-06-17 | CELL your Model: Contrastive Explanation Methods for Large Language Models | Ronny Luss et.al. | 2406.11785v1 | null |
2024-06-17 | Task Me Anything | Jieyu Zhang et.al. | 2406.11775v1 | link |
2024-06-17 | Domain Generalization for In-Orbit 6D Pose Estimation | Antoine Legrand et.al. | 2406.11743v1 | null |
2024-06-17 | Lightweight Model Pre-training via Language Guided Knowledge Distillation | Mingsheng Li et.al. | 2406.11689v1 | link |
2024-06-14 | VideoGUI: A Benchmark for GUI Automation from Instructional Videos | Kevin Qinghong Lin et.al. | 2406.10227v1 | null |
2024-06-14 | Short Film Dataset (SFD): A Benchmark for Story-Level Video Understanding | Ridouane Ghermi et.al. | 2406.10221v1 | null |
2024-06-14 | SSTFB: Leveraging self-supervised pretext learning and temporal self-attention with feature branching for real-time video polyp segmentation | Ziang Xu et.al. | 2406.10200v1 | null |
2024-06-14 | CarLLaVA: Vision language models for camera-only closed-loop driving | Katrin Renz et.al. | 2406.10165v1 | null |
2024-06-14 | Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition | Guinan Li et.al. | 2406.10152v1 | null |
2024-06-14 | Training-free Camera Control for Video Generation | Chen Hou et.al. | 2406.10126v1 | null |
2024-06-14 | Modified Risk Formulation for Improving the Prediction of Knee Osteoarthritis Progression | Haresh Rengaraj Rajamohan et.al. | 2406.10119v1 | null |
2024-06-14 | ECGMamba: Towards Efficient ECG Classification with BiSSM | Yupeng Qiang et.al. | 2406.10098v1 | null |
2024-06-14 | Biomarker based Cancer Classification using an Ensemble with Pre-trained Models | Chongmin Lee et.al. | 2406.10087v1 | null |
2024-06-14 | On the Evaluation of Speech Foundation Models for Spoken Language Understanding | Siddhant Arora et.al. | 2406.10083v1 | null |
2024-06-13 | VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding | Muhammad Maaz et.al. | 2406.09418v1 | link |
2024-06-13 | An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels | Duy-Kien Nguyen et.al. | 2406.09415v1 | null |
2024-06-13 | CodedEvents: Optimal Point-Spread-Function Engineering for 3D-Tracking with Event Cameras | Sachin Shah et.al. | 2406.09409v1 | null |
2024-06-13 | Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion | Linzhan Mou et.al. | 2406.09402v1 | null |
2024-06-13 | OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation | Junke Wang et.al. | 2406.09399v1 | link |
2024-06-13 | Too Many Frames, not all Useful:Efficient Strategies for Long-Form Video QA | Jongwoo Park et.al. | 2406.09396v1 | null |
2024-06-13 | LLAVIDAL: Benchmarking Large Language Vision Models for Daily Activities of Living | Rajatsubhra Chakraborty et.al. | 2406.09390v1 | null |
2024-06-13 | Sagiri: Low Dynamic Range Image Enhancement with Generative Diffusion Prior | Baiang Li et.al. | 2406.09389v1 | null |
2024-06-13 | Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition | Youngtaek Oh et.al. | 2406.09388v1 | link |
2024-06-13 | SimGen: Simulator-conditioned Driving Scene Generation | Yunsong Zhou et.al. | 2406.09386v1 | null |
2024-06-12 | On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models | Hashmat Shadab Malik et.al. | 2406.08486v1 | link |
2024-06-12 | RMem: Restricted Memory Banks Improve Video Object Segmentation | Junbao Zhou et.al. | 2406.08476v1 | null |
2024-06-12 | AToM-Bot: Embodied Fulfillment of Unspoken Human Needs with Affective Theory of Mind | Wei Ding et.al. | 2406.08455v1 | null |
2024-06-12 | Transformation-Dependent Adversarial Attacks | Yaoteng Tan et.al. | 2406.08443v1 | null |
2024-06-12 | A Sticker is Worth a Thousand Words: Characterizing the Use of Stickers in WhatsApp Political Groups in Brazil | Philipe Melo et.al. | 2406.08429v1 | null |
2024-06-12 | Improving Noise Robustness through Abstractions and its Impact on Machine Learning | Alfredo Ibias et.al. | 2406.08428v1 | null |
2024-06-12 | OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text | Qingyun Li et.al. | 2406.08418v1 | link |
2024-06-13 | MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos | Xuehai He et.al. | 2406.08407v2 | link |
2024-06-12 | Eyes Wide Unshut: Unsupervised Mistake Detection in Egocentric Video by Detecting Unpredictable Gaze | Michele Mazzamuto et.al. | 2406.08379v1 | null |
2024-06-12 | 2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction | Tianqi Chen et.al. | 2406.08374v1 | null |
2024-06-11 | Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring | Huicong Zhang et.al. | 2406.07551v1 | link |
2024-06-11 | Image and Video Tokenization with Binary Spherical Quantization | Yue Zhao et.al. | 2406.07548v1 | link |
2024-06-11 | Zero-shot Image Editing with Reference Imitation | Xi Chen et.al. | 2406.07547v1 | null |
2024-06-11 | Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance | Kuan Heng Lin et.al. | 2406.07540v1 | null |
2024-06-11 | BAKU: An Efficient Transformer for Multi-Task Policy Learning | Siddhant Haldar et.al. | 2406.07539v1 | null |
2024-06-11 | Transforming a rare event search into a not-so-rare event search in real-time with deep learning-based object detection | J. Schueler et.al. | 2406.07538v1 | null |
2024-06-11 | Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection | Wenxiao Wang et.al. | 2406.07536v1 | null |
2024-06-11 | Dynamics of the non-radial energy-critical inhomogeneous NLS | Carlos M. Guzmán et.al. | 2406.07535v1 | null |
2024-06-11 | Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement | Yunzhen Feng et.al. | 2406.07515v1 | null |
2024-06-11 | Understanding Visual Concepts Across Models | Brandon Trabucco et.al. | 2406.07506v1 | link |
2024-06-10 | NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing | Ting-Hsuan Chen et.al. | 2406.06523v1 | null |
2024-06-10 | Data Augmentation for Multivariate Time Series Classification: An Experimental Study | Romain Ilbert et.al. | 2406.06518v1 | null |
2024-06-10 | Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Louis Blankemeier et.al. | 2406.06512v1 | null |
2024-06-10 | Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer | Sigal Raab et.al. | 2406.06508v1 | link |
2024-06-10 | Equivariant Neural Tangent Kernels | Philipp Misof et.al. | 2406.06504v1 | null |
2024-06-10 | Viscous shock fluctuations in KPZ | Alexander Dunlap et.al. | 2406.06502v1 | null |
2024-06-10 | NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative | Asmar Nadeem et.al. | 2406.06499v1 | null |
2024-06-10 | Demonstrating HumanTHOR: A Simulation Platform and Benchmark for Human-Robot Collaboration in a Shared Workspace | Chenxu Wang et.al. | 2406.06498v1 | null |
2024-06-10 | Graph-Based Bidirectional Transformer Decision Threshold Adjustment Algorithm for Class-Imbalanced Molecular Data | Nicole Hayes et.al. | 2406.06479v1 | null |
2024-06-10 | DiffAudit: Auditing Privacy Practices of Online Services for Children and Adolescents | Olivia Figueira et.al. | 2406.06473v1 | null |
2024-06-07 | DVOS: Self-Supervised Dense-Pattern Video Object Segmentation | Keyhan Najafian et.al. | 2406.05131v1 | null |
2024-06-07 | Compositional Curvature Bounds for Deep Neural Networks | Taha Entesari et.al. | 2406.05119v1 | null |
2024-06-07 | Large Generative Graph Models | Yu Wang et.al. | 2406.05109v1 | null |
2024-06-07 | A Novel Time Series-to-Image Encoding Approach for Weather Phenomena Classification | Christian Giannetti et.al. | 2406.05096v1 | null |
2024-06-10 | Discovery of An Apparent Red, High-Velocity Type Ia Supernova at z = 2.9 with JWST | J. D. R. Pierel et.al. | 2406.05089v2 | null |
2024-06-07 | CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion | Xingrui Wang et.al. | 2406.05082v1 | null |
2024-06-10 | Discovery of a Relativistic Stripped Envelope Type Ic-BL Supernova at z = 2.83 with JWST | M. R. Siebert et.al. | 2406.05076v2 | null |
2024-06-07 | Diving Deep into the Motion Representation of Video-Text Models | Chinmaya Devaraj et.al. | 2406.05075v1 | null |
2024-06-07 | Hibou: A Family of Foundational Vision Transformers for Pathology | Dmitry Nechaev et.al. | 2406.05074v1 | null |
2024-06-07 | Classification Metrics for Image Explanations: Towards Building Reliable XAI-Evaluations | Benjamin Fresz et.al. | 2406.05068v1 | link |
2024-06-06 | Verbalized Machine Learning: Revisiting Machine Learning with Language Models | Tim Z. Xiao et.al. | 2406.04344v1 | null |
2024-06-07 | Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion | Fangfu Liu et.al. | 2406.04338v2 | null |
2024-06-06 | Parameter-Inverted Image Pyramid Networks | Xizhou Zhu et.al. | 2406.04330v1 | link |
2024-06-06 | ShareGPT4Video: Improving Video Understanding and Generation with Better Captions | Lin Chen et.al. | 2406.04325v1 | null |
2024-06-06 | SF-V: Single Forward Video Generation Model | Zhixing Zhang et.al. | 2406.04324v1 | null |
2024-06-06 | ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories | Qianlan Yang et.al. | 2406.04323v1 | null |
2024-06-06 | VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling | Zeyue Tian et.al. | 2406.04321v1 | link |
2024-06-06 | Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models | Ali Behrouz et.al. | 2406.04320v1 | null |
2024-06-06 | Adaptive Sampling of k-Space in Magnetic Resonance for Rapid Pathology Prediction | Chen-Yu Yen et.al. | 2406.04318v1 | null |
2024-06-06 | Regularized KL-Divergence for Well-Defined Function-Space Variational Inference in Bayesian neural networks | Tristan Cinquin et.al. | 2406.04317v1 | null |
2024-06-05 | Grokking Modular Polynomials | Darshil Doshi et.al. | 2406.03495v1 | null |
2024-06-05 | The Logarithmic Memristor-Based Bayesian Machine | Clément Turck et.al. | 2406.03492v1 | null |
2024-06-05 | Convolutional Neural Networks and Vision Transformers for Fashion MNIST Classification: A Literature Review | Sonia Bbouzidi et.al. | 2406.03478v1 | null |
2024-06-05 | Node-wise Filtering in Graph Neural Networks: A Mixture of Experts Approach | Haoyu Han et.al. | 2406.03464v1 | null |
2024-06-05 | Polarization Wavefront Lidar: Learning Large Scene Reconstruction from Polarized Wavefronts | Dominik Scheuble et.al. | 2406.03461v1 | null |
2024-06-05 | FILS: Self-Supervised Video Feature Prediction In Semantic Language Space | Mona Ahmadian et.al. | 2406.03447v1 | null |
2024-06-05 | Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input | Joachim Ott et.al. | 2406.03439v1 | null |
2024-06-05 | Stabilizing massless fields with fluxes in Landau-Ginzburg models | Katrin Becker et.al. | 2406.03435v1 | null |
2024-06-05 | Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis | Moein Heidari et.al. | 2406.03430v1 | link |
2024-06-05 | Post-hoc Part-prototype Networks | Andong Tan et.al. | 2406.03421v1 | null |
2024-06-05 | Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting | Inkyu Shin et.al. | 2406.02541v2 | null |
2024-06-04 | ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation | Tianchen Zhao et.al. | 2406.02540v1 | null |
2024-06-04 | Enhancing predictive imaging biomarker discovery through treatment effect analysis | Shuhan Xiao et.al. | 2406.02534v1 | null |
2024-06-04 | ReLUs Are Sufficient for Learning Implicit Neural Representations | Joseph Shenouda et.al. | 2406.02529v1 | link |
2024-06-04 | RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots | Soroush Nasiriany et.al. | 2406.02523v1 | null |
2024-06-04 | DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering | Zhongpai Gao et.al. | 2406.02518v1 | null |
2024-06-04 | V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation | Cong Wang et.al. | 2406.02511v1 | null |
2024-06-04 | CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation | Dejia Xu et.al. | 2406.02509v1 | null |
2024-06-04 | Endomorphisms of Artin groups of type | Luis Paris et.al. | 2406.02484v1 | null |
2024-06-04 | Inpainting Pathology in Lumbar Spine MRI with Latent Diffusion | Colin Hansen et.al. | 2406.02477v1 | null |
2024-05-31 | Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis | Chaoyou Fu et.al. | 2405.21075v1 | null |
2024-05-31 | Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights | Xin Wen et.al. | 2405.21070v1 | link |
2024-05-31 | You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet | Zhen Qin et.al. | 2405.21022v1 | null |
2024-05-31 | Beyond Conventional Parametric Modeling: Data-Driven Framework for Estimation and Prediction of Time Activity Curves in Dynamic PET Imaging | Niloufar Zakariaei et.al. | 2405.21021v1 | null |
2024-05-31 | The classification of dp-minimal integral domains | Christian d'Elbée et.al. | 2405.21014v1 | null |
2024-05-31 | Early Stopping Criteria for Training Generative Adversarial Networks in Biomedical Imaging | Muhammad Muneeb Saad et.al. | 2405.20987v1 | null |
2024-05-31 | PUAL: A Classifier on Trifurcate Positive-Unlabeled Data | Xiaoke Wang et.al. | 2405.20970v1 | null |
2024-05-31 | Aligning Multiclass Neural Network Classifier Criterion with Task Performance via | Nathan Tsoi et.al. | 2405.20954v1 | null |
2024-05-31 | Standard model of electromagnetism and chirality in crystals | R. Winkler et.al. | 2405.20940v1 | null |
2024-05-31 | MALT: Multi-scale Action Learning Transformer for Online Action Detection | Zhipeng Yang et.al. | 2405.20892v1 | null |
2024-05-30 | MotionLLM: Understanding Human Behaviors from Human Motions and Videos | Ling-Hao Chen et.al. | 2405.20340v1 | null |
2024-05-30 | OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving | Lening Wang et.al. | 2405.20337v1 | link |
2024-05-30 | VividDream: Generating 3D Scene with Ambient Dynamics | Yao-Chih Lee et.al. | 2405.20334v1 | null |
2024-05-30 | SurgiTrack: Fine-Grained Multi-Class Multi-Tool Tracking in Surgical Videos | Chinedu Innocent Nwoye et.al. | 2405.20333v1 | null |
2024-05-31 | 4DHands: Reconstructing Interactive Hands in 4D with Transformers | Dixuan Lin et.al. | 2405.20330v2 | null |
2024-05-30 | MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion | Shuyuan Tu et.al. | 2405.20325v1 | null |
2024-05-30 | Vision-based Manipulation from Single Human Video with Open-World Object Graphs | Yifeng Zhu et.al. | 2405.20321v1 | null |
2024-05-30 | Improving the Training of Rectified Flows | Sangyun Lee et.al. | 2405.20320v1 | link |
2024-05-30 | CausalQuest: Collecting Natural Causal Questions for AI Agents | Roberto Ceraolo et.al. | 2405.20318v1 | link |
2024-05-30 | Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models | Himangi Mittal et.al. | 2405.20305v1 | null |
2024-05-29 | X-VILA: Cross-Modality Alignment for Large Language Model | Hanrong Ye et.al. | 2405.19335v1 | null |
2024-05-29 | LLMs Meet Multimodal Generation and Editing: A Survey | Yingqing He et.al. | 2405.19334v1 | link |
2024-05-29 | Multi-Modal Generative Embedding Model | Feipeng Ma et.al. | 2405.19333v1 | null |
2024-05-29 | NPGA: Neural Parametric Gaussian Avatars | Simon Giebenhain et.al. | 2405.19331v1 | null |
2024-05-29 | Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation | Atrisha Sarkar et.al. | 2405.19328v1 | null |
2024-05-29 | DGD: Dynamic 3D Gaussians Distillation | Isaac Labe et.al. | 2405.19321v1 | null |
2024-05-29 | Real-Time Environment Condition Classification for Autonomous Vehicles | Marco Introvigne et.al. | 2405.19305v1 | null |
2024-05-29 | Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare | Hanwei Zhu et.al. | 2405.19298v1 | null |
2024-05-29 | Archetype-Based Redshift Estimation for the Dark Energy Spectroscopic Instrument Survey | Abhijeet Anand et.al. | 2405.19288v1 | null |
2024-05-29 | A study on the adequacy of common IQA measures for medical images | Anna Breger et.al. | 2405.19224v1 | null |
2024-05-28 | Classifying Overlapping Gaussian Mixtures in High Dimensions: From Optimal Classifiers to Neural Nets | Khen Cohen et.al. | 2405.18427v1 | null |
2024-05-28 | GFlow: Recovering 4D World from Monocular Video | Shizun Wang et.al. | 2405.18426v1 | null |
2024-05-28 | Hierarchical World Models as Visual Whole-Body Humanoid Controllers | Nicklas Hansen et.al. | 2405.18418v1 | null |
2024-05-28 | 3D StreetUnveiler with Semantic-Aware 2DGS | Jingwei Xu et.al. | 2405.18416v1 | null |
2024-05-28 | Why are Visually-Grounded Language Models Bad at Image Classification? | Yuhui Zhang et.al. | 2405.18415v1 | link |
2024-05-28 | Towards a Sampling Theory for Implicit Neural Representations | Mahrokh Najaf et.al. | 2405.18410v1 | null |
2024-05-28 | Phased Consistency Model | Fu-Yun Wang et.al. | 2405.18407v1 | null |
2024-05-28 | RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives | Jaehong Yoon et.al. | 2405.18406v1 | null |
2024-05-28 | MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning | Somnath Kumar et.al. | 2405.18358v1 | null |
2024-05-28 | Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography | Jie Liu et.al. | 2405.18356v1 | link |
2024-05-27 | Matryoshka Multimodal Models | Mu Cai et.al. | 2405.17430v1 | null |
2024-05-27 | NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models | Chankyu Lee et.al. | 2405.17428v1 | null |
2024-05-27 | MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds | Jiahui Lei et.al. | 2405.17421v1 | null |
2024-05-27 | Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control | Zhengfei Kuang et.al. | 2405.17414v1 | null |
2024-05-27 | Enhancing Music Genre Classification through Multi-Algorithm Analysis and User-Friendly Visualization | Navin Kamuni et.al. | 2405.17413v1 | null |
2024-05-27 | The Peripatetic Hater: Predicting Movement Among Hate Subreddits | Daniel Hickey et.al. | 2405.17410v1 | null |
2024-05-27 | Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer | Ruizhi Shao et.al. | 2405.17405v1 | null |
2024-05-27 | Spectral Greedy Coresets for Graph Neural Networks | Mucong Ding et.al. | 2405.17404v1 | null |
2024-05-27 | Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability | Shenyuan Gao et.al. | 2405.17398v1 | link |
2024-05-27 | Non-Unitary Quantum Machine Learning | Jamie Heredge et.al. | 2405.17388v1 | null |
2024-05-24 | Canonical Variates in Wasserstein Metric Space | Jia Li et.al. | 2405.15768v1 | null |
2024-05-24 | Scaling Laws for Discriminative Classification in Large Language Models | Dean Wyatte et.al. | 2405.15765v1 | null |
2024-05-24 | InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation | Yuchi Wang et.al. | 2405.15758v1 | link |
2024-05-24 | Looking Backward: Streaming Video-to-Video Translation with Feature Banks | Feng Liang et.al. | 2405.15757v1 | link |
2024-05-24 | Characterizing Discourse Group Roles in Inquiry-based University Science Labs | Tong Wan et.al. | 2405.15746v1 | null |
2024-05-24 | Hierarchical Uncertainty Exploration via Feedforward Posterior Trees | Elias Nehme et.al. | 2405.15719v1 | null |
2024-05-24 | EmpathicStories++: A Multimodal Dataset for Empathy towards Personal Experiences | Jocelyn Shen et.al. | 2405.15708v1 | null |
2024-05-24 | Sums: Sniffing Unknown Multiband Signals under Low Sampling Rates | Jinbo Peng et.al. | 2405.15705v1 | null |
2024-05-24 | realSEUDO for real-time calcium imaging analysis | Iuliia Dmitrieva et.al. | 2405.15701v1 | null |
2024-05-24 | UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes | Ted Lentsch et.al. | 2405.15688v1 | null |
2024-05-23 | PuzzleAvatar: Assembling 3D Avatars from Personal Albums | Yuliang Xiu et.al. | 2405.14869v1 | null |
2024-05-23 | Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis | Basile Van Hoorick et.al. | 2405.14868v1 | null |
2024-05-23 | Video Diffusion Models are Training-free Motion Interpreter and Controller | Zeqi Xiao et.al. | 2405.14864v1 | null |
2024-05-23 | Synergistic Global-space Camera and Human Reconstruction from Videos | Yizhou Zhao et.al. | 2405.14855v1 | null |
2024-05-23 | Domain Wall Magnetic Tunnel Junction Reliable Integrate and Fire Neuron | Can Cui et.al. | 2405.14851v1 | null |
2024-05-23 | Learning to Detect and Segment Mobile Objects from Unlabeled Videos | Yihong Sun et.al. | 2405.14841v1 | null |
2024-05-23 | Designing A Sustainable Marine Debris Clean-up Framework without Human Labels | Raymond Wang et.al. | 2405.14815v1 | null |
2024-05-23 | As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making | Shomik Jain et.al. | 2405.14812v1 | null |
2024-05-23 | Lorentz-Equivariant Geometric Algebra Transformers for High-Energy Physics | Jonas Spinner et.al. | 2405.14806v1 | null |
2024-05-24 | Fast-DDPM: Fast Denoising Diffusion Probabilistic Models for Medical Image-to-Image Generation | Hongxu Jiang et.al. | 2405.14802v2 | link |
2024-05-21 | Comprehensive Multimodal Deep Learning Survival Prediction Enabled by a Transformer Architecture: A Multicenter Study in Glioblastoma | Ahmed Gomaa et.al. | 2405.12963v1 | null |
2024-05-21 | Online Learning of Halfspaces with Massart Noise | Ilias Diakonikolas et.al. | 2405.12958v1 | null |
2024-05-21 | Quantifying Uncertainty in Classification Performance: ROC Confidence Bands Using Conformal Prediction | Zheshi Zheng et.al. | 2405.12953v1 | null |
2024-05-21 | Tutorly: Turning Programming Videos Into Apprenticeship Learning Environments with LLMs | Wengxi Li et.al. | 2405.12946v1 | null |
2024-05-21 | Pytorch-Wildlife: A Collaborative Deep Learning Framework for Conservation | Andres Hernandez et.al. | 2405.12930v1 | link |
2024-05-21 | Streamlining Software Reviews: Efficient Predictive Modeling with Minimal Examples | Tim Menzies et.al. | 2405.12920v1 | null |
2024-05-21 | The | Bachir Bekka et.al. | 2405.12919v1 | null |
2024-05-21 | Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment | Holli Sargeant et.al. | 2405.12910v1 | link |
2024-05-21 | Decentralized Federated Learning Over Imperfect Communication Channels | Weicai Li et.al. | 2405.12894v1 | null |
2024-05-21 | Investigating Persuasion Techniques in Arabic: An Empirical Study Leveraging Large Language Models | Abdurahmman Alzahrani et.al. | 2405.12884v1 | null |
2024-05-20 | Images that Sound: Composing Images and Sounds on a Single Canvas | Ziyang Chen et.al. | 2405.12221v1 | null |
2024-05-20 | Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices | Nathaniel Cohen et.al. | 2405.12211v1 | null |
2024-05-20 | The sign of scalar curvature on Kähler blowups | Garrett M. Brown et.al. | 2405.12189v1 | null |
2024-05-20 | Building Temporal Kernels with Orthogonal Polynomials | Yan Ru Pei et.al. | 2405.12179v1 | link |
2024-05-20 | Wireless vs. Traditional Ultrasound Assessed Knee Cartilage Outcomes Utilizing Automated Gain and Normalization Techniques | Arjun Parmar et.al. | 2405.12172v1 | null |
2024-05-20 | DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM | Xuchen Li et.al. | 2405.12139v1 | null |
2024-05-20 | Alzheimer's Magnetic Resonance Imaging Classification Using Deep and Meta-Learning Models | Nida Nasir et.al. | 2405.12126v1 | null |
2024-05-20 | An Active Learning Framework with a Class Balancing Strategy for Time Series Classification | Shemonto Das et.al. | 2405.12122v1 | null |
2024-05-20 | AGNfitter-rx: Modelling the radio-to-X-ray SEDs of AGNs | L. N. Martínez-Ramírez et.al. | 2405.12111v1 | null |
2024-05-20 | Real topological phonons in 3D carbon allotropes | Xiaotian Wang et.al. | 2405.12072v1 | null |
2024-05-17 | Submodular Information Selection for Hypothesis Testing with Misclassification Penalties | Jayanth Bhargav et.al. | 2405.10930v1 | null |
2024-05-17 | A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model | Mingxiang Fu et.al. | 2405.10890v1 | null |
2024-05-17 | Multicenter Privacy-Preserving Model Training for Deep Learning Brain Metastases Autosegmentation | Yixing Huang et.al. | 2405.10870v1 | null |
2024-05-17 | "Hall" transport of liquid crystal solitons in Couette flow | Rodrigo C. V. Coelho et.al. | 2405.10850v1 | null |
2024-05-17 | Automatic segmentation of Organs at Risk in Head and Neck cancer patients from CT and MRI scans | Sébastien Quetin et.al. | 2405.10833v1 | null |
2024-05-17 | Open-Vocabulary Spatio-Temporal Action Detection | Tao Wu et.al. | 2405.10832v1 | null |
2024-05-17 | Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities | Hao Zhou et.al. | 2405.10825v1 | null |
2024-05-17 | ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios | Markus Bayer et.al. | 2405.10808v1 | null |
2024-05-17 | A Large-scale Multi Domain Leukemia Dataset for the White Blood Cells Detection with Morphological Attributes for Explainability | Abdul Rehman et.al. | 2405.10803v1 | null |
2024-05-17 | Reduced storage direct tensor ring decomposition for convolutional neural networks compression | Mateusz Gabor et.al. | 2405.10802v1 | link |
2024-05-16 | TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction | Yunfan Jiang et.al. | 2405.10315v1 | null |
2024-05-16 | 4D Panoptic Scene Graph Generation | Jingkang Yang et.al. | 2405.10305v1 | link |
2024-05-16 | On Sample Selection for Continual Learning: a Video Streaming Case Study | Alexander Dietmüller et.al. | 2405.10290v1 | null |
2024-05-16 | Quantum Vision Transformers for Quark-Gluon Classification | Marçal Comajoan Cara et.al. | 2405.10284v1 | null |
2024-05-16 | Faces that Speak: Jointly Synthesising Talking Face and Speech from Text | Youngjoon Jang et.al. | 2405.10272v1 | null |
2024-05-16 | A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision | Charles Raude et.al. | 2405.10266v1 | null |
2024-05-16 | PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology | George Shaikovski et.al. | 2405.10254v1 | null |
2024-05-16 | A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts | Xinru Zhang et.al. | 2405.10246v1 | null |
2024-05-16 | Ternary mappings of some evolution algebras | Candido Martin Gonzalez et.al. | 2405.10241v1 | null |
2024-05-16 | ENADPool: The Edge-Node Attention-based Differentiable Pooling for Graph Neural Networks | Zhehan Zhao et.al. | 2405.10218v1 | null |
2024-05-15 | Classifying geospatial objects from multiview aerial imagery using semantic meshes | David Russell et.al. | 2405.09544v1 | null |
2024-05-15 | Spectral complexity of deep neural networks | Simmaco Di Lillo et.al. | 2405.09541v1 | null |
2024-05-16 | MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer | Chengyu Wu et.al. | 2405.09539v2 | link |
2024-05-15 | Restoring balance: principled under/oversampling of data for optimal classification | Emanuele Loffredo et.al. | 2405.09535v1 | null |
2024-05-15 | Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck | Hongru Li et.al. | 2405.09514v1 | null |
2024-05-15 | Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts | Donya Rooein et.al. | 2405.09482v1 | null |
2024-05-15 | Perception- and Fidelity-aware Reduced-Reference Super-Resolution Image Quality Assessment | Xinying Lin et.al. | 2405.09472v1 | null |
2024-05-15 | Non-contact Lung Disease Classification via OFDM-based Passive 6G ISAC Sensing | Hasan Mujtaba Buttar et.al. | 2405.09458v1 | null |
2024-05-15 | Cohomogeneity one RCD-spaces | Diego Corro et.al. | 2405.09448v1 | null |
2024-05-15 | M$^4$oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | Yufeng Jiang et.al. | 2405.09446v1 | null |
2024-05-14 | CinePile: A Long Video Question Answering Dataset and Benchmark | Ruchit Rawal et.al. | 2405.08813v1 | null |
2024-05-14 | The Developing Human Connectome Project: A Fast Deep Learning-based Pipeline for Neonatal Cortical Surface Reconstruction | Qiang Ma et.al. | 2405.08783v1 | null |
2024-05-14 | Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling | Gregory Holste et.al. | 2405.08780v1 | null |
2024-05-14 | FolkTalent: Enhancing Classification and Tagging of Indian Folk Paintings | Nancy Hada et.al. | 2405.08776v1 | null |
2024-05-14 | From Text to Context: An Entailment Approach for News Stakeholder Classification | Alapan Kuila et.al. | 2405.08751v1 | null |
2024-05-14 | Enhancing Blind Video Quality Assessment with Rich Quality-aware Features | Wei Sun et.al. | 2405.08745v1 | null |
2024-05-14 | The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks | Carmela Calabrese et.al. | 2405.08695v1 | null |
2024-05-14 | Latent group structure in linear panel data models with endogenous regressors | Junho Choi et.al. | 2405.08687v1 | null |
2024-05-14 | Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis | Qingpeng Kong et.al. | 2405.08681v1 | link |
2024-05-14 | Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning | Alain Riou et.al. | 2405.08679v1 | null |
2024-05-14 | MambaOut: Do We Really Need Mamba for Vision? | Weihao Yu et.al. | 2405.07992v2 | link |
2024-05-13 | SPIN: Simultaneous Perception, Interaction and Navigation | Shagun Uppal et.al. | 2405.07991v1 | null |
2024-05-13 | KG-Planner: Knowledge-Informed Graph Neural Planning for Collaborative Manipulators | Wansong Liu et.al. | 2405.07962v1 | null |
2024-05-13 | An Algorithmic Classification of Generalized Pseudo-Anosov Homeomorphisms via Geometric Markov Partitions | Inti Cruz Diaz et.al. | 2405.07954v1 | null |
2024-05-13 | Scene Action Maps: Behavioural Maps for Navigation without Metric Information | Joel Loo et.al. | 2405.07948v1 | null |
2024-05-14 | PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition | Ziyang Zhang et.al. | 2405.07932v2 | link |
2024-05-13 | Improving Multimodal Learning with Multi-Loss Gradient Modulation | Konstantinos Kontras et.al. | 2405.07930v1 | null |
2024-05-13 | PLUTO: Pathology-Universal Transformer | Dinkar Juyal et.al. | 2405.07905v1 | null |
2024-05-13 | Enhancing Clinically Significant Prostate Cancer Prediction in T2-weighted Images through Transfer Learning from Breast Cancer | Chi-en Amy Tai et.al. | 2405.07869v1 | null |
2024-05-13 | Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging | Chi-en Amy Tai et.al. | 2405.07861v1 | null |
2024-05-10 | Multi-Object Tracking in the Dark | Xinzhe Wang et.al. | 2405.06600v1 | link |
2024-05-10 | Ice phase classification made easy with score-based denoising | Hong Sun et.al. | 2405.06599v1 | null |
2024-05-10 | Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach | Elham Ravanbakhsh et.al. | 2405.06586v1 | null |
2024-05-10 | Deep video representation learning: a survey | Elham Ravanbakhsh et.al. | 2405.06574v1 | null |
2024-05-10 | The Role of Topological Photon Spheres in Constraining the Parameters of Black Holes | Jafar Sadeghi et.al. | 2405.06568v1 | null |
2024-05-10 | OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation | Jinwei Lin et.al. | 2405.06547v1 | link |
2024-05-10 | Separating States in Astronomical Sources Using Hidden Markov Models: With a Case Study of Flaring and Quiescence on EV Lac | Robert Zimmerman et.al. | 2405.06540v1 | null |
2024-05-10 | Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation | Xiaowen Ma et.al. | 2405.06525v1 | link |
2024-05-10 | Aspect-based Sentiment Evaluation of Chess Moves (ASSESS): an NLP-based Method for Evaluating Chess Strategies from Textbooks | Haifa Alrdahi et.al. | 2405.06499v1 | null |
2024-05-10 | Improving Deep Learning Model Calibration for Cardiac Applications using Deterministic Uncertainty Networks and Uncertainty-aware Training | Tareen Dawood et.al. | 2405.06487v1 | null |
2024-05-09 | A Universal Growth Rate for Learning with Smooth Surrogate Losses | Anqi Mao et.al. | 2405.05968v1 | null |
2024-05-09 | Self-Supervised Learning of Time Series Representation via Diffusion Process and Imputation-Interpolation-Forecasting Mask | Zineb Senane et.al. | 2405.05959v1 | link |
2024-05-09 | Frame Interpolation with Consecutive Brownian Bridge Diffusion | Zonglin Lyu et.al. | 2405.05953v1 | null |
2024-05-09 | Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers | Peng Gao et.al. | 2405.05945v1 | link |
2024-05-09 | MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI | Yan Zhuang et.al. | 2405.05944v1 | null |
2024-05-09 | Non-symplectic automorphisms of prime order of O'Grady's tenfolds and cubic fourfolds | Simone Billi et.al. | 2405.05932v1 | null |
2024-05-09 | Deep Multi-Task Learning for Malware Image Classification | Ahmed Bensaoud et.al. | 2405.05906v1 | null |
2024-05-09 | An RNN-policy gradient approach for quantum architecture search | Gang Wang et.al. | 2405.05892v1 | null |
2024-05-09 | Composable Part-Based Manipulation | Weiyu Liu et.al. | 2405.05876v1 | null |
2024-05-09 | ExACT: An End-to-End Autonomous Excavator System Using Action Chunking With Transformers | Liangliang Chen et.al. | 2405.05861v1 | null |
2024-05-08 | Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo | Nayantara Mudur et.al. | 2405.05255v1 | link |
2024-05-08 | Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models | Hongjie Wang et.al. | 2405.05252v1 | null |
2024-05-08 | DanceCam: atmospheric turbulence mitigation in wide-field astronomical images with short-exposure video streams | Spencer Bialek et.al. | 2405.05250v1 | null |
2024-05-08 | Deep learning-based variational autoencoder for classification of quantum and classical states of light | Mahesh Bhupati et.al. | 2405.05243v1 | null |
2024-05-08 | On | Barry Chin et.al. | 2405.05230v1 | null |
2024-05-08 | Are Economically Advanced Countries More Efficient in Basic and Applied Research? | Vladimír Holý et.al. | 2405.05227v1 | null |
2024-05-08 | Clustering Retail Products Based on Customer Behaviour | Vladimír Holý et.al. | 2405.05218v1 | null |
2024-05-08 | FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models | Jinglin Xu et.al. | 2405.05216v1 | link |
2024-05-08 | Graded Relevance Scoring of Written Essays with Dense Retrieval | Salam Albatarni et.al. | 2405.05200v1 | null |
2024-05-08 | Is Transductive Learning Equivalent to PAC Learning? | Shaddin Dughmi et.al. | 2405.05190v1 | null |
2024-05-07 | Switchable Decision: Dynamic Neural Generation Networks | Shujian Zhang et.al. | 2405.04513v1 | null |
2024-05-07 | Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing | Yi Zuo et.al. | 2405.04496v1 | null |
2024-05-07 | Exploration of Novel Neuromorphic Methodologies for Materials Applications | Derek Gobin et.al. | 2405.04478v1 | null |
2024-05-07 | Generalized classical Yang-Baxter equation and regular decompositions | Raschid Abedin et.al. | 2405.04440v1 | null |
2024-05-07 | On the classification of product-quotient surfaces with | Federico Fallucca et.al. | 2405.04425v1 | null |
2024-05-07 | Vision Mamba: A Comprehensive Survey and Taxonomy | Xiao Liu et.al. | 2405.04404v1 | link |
2024-05-07 | Efficient Online Set-valued Classification with Bandit Feedback | Zhou Wang et.al. | 2405.04393v1 | null |
2024-05-07 | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | Chen Min et.al. | 2405.04390v1 | null |
2024-05-07 | Parallelized Multi-Agent Bayesian Optimization in Lava | Shay Snyder et.al. | 2405.04387v1 | null |
2024-05-07 | Pragmatist Intelligence: Where the Principle of Usefulness Can Take ANNs | Antonio Bikić et.al. | 2405.04386v1 | null |
2024-05-06 | Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs | Muhammad Uzair Khattak et.al. | 2405.03690v1 | null |
2024-05-06 | All-in-One Deep Learning Framework for MR Image Reconstruction | Geunu Jeong et.al. | 2405.03684v1 | null |
2024-05-06 | ScrewMimic: Bimanual Imitation from Human Videos with Screw Space Projection | Arpit Bahety et.al. | 2405.03666v1 | null |
2024-05-06 | CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification | Sankalp Sinha et.al. | 2405.03660v1 | null |
2024-05-06 | Collecting Consistently High Quality Object Tracks with Minimal Human Involvement by Using Self-Supervised Learning to Detect Tracker Errors | Samreen Anjum et.al. | 2405.03643v1 | null |
2024-05-06 | Classification of Breast Cancer Histopathology Images using a Modified Supervised Contrastive Learning Method | Matina Mahdizadeh Sani et.al. | 2405.03642v1 | link |
2024-05-06 | Nonequilibrium relaxation and odd-even effect in finite-temperature electron gases | Eric Nilsson et.al. | 2405.03635v1 | null |
2024-05-06 | Nonnegative Matrix Factorization in Dimensionality Reduction: A Survey | Farid Saberi-Movahed et.al. | 2405.03615v1 | null |
2024-05-06 | Dual Relation Mining Network for Zero-Shot Learning | Jinwei Han et.al. | 2405.03613v1 | null |
2024-05-06 | Communities for the Lagrangian Dynamics of the Turbulent Velocity Gradient Tensor: A Network Participation Approach | Christopher J. Keylock et.al. | 2405.03589v1 | null |
2024-05-03 | DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos | Wen-Hsuan Chu et.al. | 2405.02280v1 | null |
2024-05-03 | Transversely Projective Structures on Smooth Foliations on Surfaces | Gabriel Fazoli et.al. | 2405.02273v1 | null |
2024-05-03 | On its way to the neutron star-white dwarf binary graveyard, IGR J16194-2810, a first ascent M giant X-ray binary | K. H. Hinkle et.al. | 2405.02270v1 | null |
2024-05-03 | Validating Gaia DR3 Pulsating Variable Classifications with TESS: Building Reliable | Ai-Ying Zhou et.al. | 2405.02264v1 | null |
2024-05-03 | Subgraph2vec: A random walk-based algorithm for embedding knowledge graphs | Elika Bozorgi et.al. | 2405.02240v1 | null |
2024-05-03 | Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks | Lujing Zhang et.al. | 2405.02225v1 | null |
2024-05-03 | Designed Dithering Sign Activation for Binary Neural Networks | Brayan Monroy et.al. | 2405.02220v1 | null |
2024-05-03 | Multispectral Fine-Grained Classification of Blackgrass in Wheat and Barley Crops | Madeleine Darbyshire et.al. | 2405.02218v1 | null |
2024-05-03 | Non-Destructive Peat Analysis using Hyperspectral Imaging and Machine Learning | Yijun Yan et.al. | 2405.02191v1 | null |
2024-05-03 | Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset | Hsuvas Borkakoty et.al. | 2405.02175v1 | null |
2024-05-02 | Confronting sparse Gaia DR3 photometry with TESS for a sample of about 60,000 hot massive non-radial pulsators | Daniel Hey et.al. | 2405.01539v1 | null |
2024-05-02 | Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks | Murtaza Dalal et.al. | 2405.01534v1 | null |
2024-05-02 | Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models | Nishad Singhi et.al. | 2405.01531v1 | null |
2024-05-02 | Track2Act: Predicting Point Tracks from Internet Videos enables Diverse Zero-shot Robot Manipulation | Homanga Bharadhwaj et.al. | 2405.01527v1 | null |
2024-05-03 | A separability-based approach to quantifying generalization: which layer is best? | Luciano Dyballa et.al. | 2405.01524v2 | null |
2024-05-02 | Grand Design vs. Multi-Armed Spiral Galaxies: Dependence on Galaxy Structure | Beverly J. Smith et.al. | 2405.01516v1 | null |
2024-05-03 | Accelerating Convergence in Bayesian Few-Shot Classification | Tianjun Ke et.al. | 2405.01507v2 | link |
2024-05-02 | PAM-UNet: Shifting Attention on Region of Interest in Medical Images | Abhijit Das et.al. | 2405.01503v1 | null |
2024-05-02 | Exploring Privacy Issues in Mission Critical Communication: Navigating 5G and Beyond Networks | Prajnamaya Dass et.al. | 2405.01492v1 | null |
2024-05-02 | Designing Algorithmic Recommendations to Achieve Human-AI Complementarity | Bryce McLaughlin et.al. | 2405.01484v1 | null |
2024-05-01 | Quantum algorithms for matrix geometric means | Nana Liu et.al. | 2405.00673v1 | null |
2024-05-01 | Adapting Pretrained Networks for Image Quality Assessment on High Dynamic Range Displays | Andrei Chubarau et.al. | 2405.00670v1 | null |
2024-05-01 | Screening of BindingDB database ligands against EGFR, HER2, Estrogen, Progesterone and NF-kB receptors based on machine learning and molecular docking | Parham Rezaee et.al. | 2405.00647v1 | null |
2024-05-01 | Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling | Yida Mu et.al. | 2405.00611v1 | null |
2024-05-01 | Investigating Automatic Scoring and Feedback using Large Language Models | Gloria Ashiya Katuka et.al. | 2405.00602v1 | null |
2024-05-01 | Discovering robust biomarkers of neurological disorders from functional MRI using graph neural networks: A Review | Yi Hao Chan et.al. | 2405.00577v1 | null |
2024-05-01 | EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model | Deng Li et.al. | 2405.00574v1 | null |
2024-05-01 | Remote Sensing Data Assimilation with a Chained Hydrologic-hydraulic Model for Flood Forecasting | Thanh Huy Nguyen et.al. | 2405.00567v1 | null |
2024-05-01 | Digital-analog quantum convolutional neural networks for image classification | Anton Simen et.al. | 2405.00548v1 | null |
2024-05-01 | UWAFA-GAN: Ultra-Wide-Angle Fluorescein Angiography Transformation via Multi-scale Generation and Registration Enhancement | Ruiquan Ge et.al. | 2405.00542v1 | link |
2024-04-30 | A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications | Steph Buongiorno et.al. | 2404.19729v1 | null |
2024-04-30 | Classification of simple 0-dimensional isolated complete intersection singularities | Thuy Huong Pham et.al. | 2404.19728v1 | null |
2024-04-30 | PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios | Jingbo Wang et.al. | 2404.19722v1 | null |
2024-04-30 | PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games | Steph Buongiorno et.al. | 2404.19721v1 | null |
2024-04-30 | ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents | Hoang-Thang Ta et.al. | 2404.19714v1 | null |
2024-04-30 | A rank decomposition for the topological classification of neural representations | Kosio Beshkov et.al. | 2404.19710v1 | null |
2024-04-30 | Neural Controlled Differential Equations with Quantum Hidden Evolutions | Lingyi Yang et.al. | 2404.19673v1 | link |
2024-04-30 | Beyond MOS: Subjective Image Quality Score Preprocessing Method Based on Perceptual Similarity | Lei Wang et.al. | 2404.19666v1 | null |
2024-04-30 | Towards Generalist Robot Learning from Internet Video: A Survey | Robert McCarthy et.al. | 2404.19664v1 | null |
2024-04-30 | Regularization of Riemannian optimization: Application to process tomography and quantum machine learning | Felix Soest et.al. | 2404.19659v1 | null |
2024-04-29 | Hallucination of Multimodal Large Language Models: A Survey | Zechen Bai et.al. | 2404.18930v1 | link |
2024-04-29 | Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing | Leonardo Rossi et.al. | 2404.18924v1 | null |
2024-04-29 | Anomaly and invertible field theory with higher-form symmetry: Extended group cohomology | Shi Chen et.al. | 2404.18921v1 | null |
2024-04-29 | A Survey on Diffusion Models for Time Series and Spatio-Temporal Data | Yiyuan Yang et.al. | 2404.18886v1 | link |
2024-04-29 | A Multilevel Strategy to Improve People Tracking in a Real-World Scenario | Cristiano B. de Oliveira et.al. | 2404.18876v1 | null |
2024-04-29 | A Survey on Vision Mamba: Models, Applications and Challenges | Rui Xu et.al. | 2404.18861v1 | link |
2024-04-29 | ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization | Hong Nguyen et.al. | 2404.18831v1 | link |
2024-04-29 | Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior | Zhiyuan Li et.al. | 2404.18820v1 | null |
2024-04-29 | Certification of Speaker Recognition Models to Additive Perturbations | Dmitrii Korzh et.al. | 2404.18791v1 | null |
2024-04-29 | Understanding Radicals via Orbital Parities | Reza G. Shirazi et.al. | 2404.18787v1 | null |
2024-04-26 | Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos | Zhengze Xu et.al. | 2404.17571v1 | null |
2024-04-26 | Multifold topological semimetals | Iñigo Robredo et.al. | 2404.17539v1 | null |
2024-04-26 | Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models | Yuhang Huang et.al. | 2404.17534v1 | null |
2024-04-26 | Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations | Puhao Li et.al. | 2404.17521v1 | link |
2024-04-26 | Learning text-to-video retrieval from image captioning | Lucas Ventura et.al. | 2404.17498v1 | null |
2024-04-26 | Tabular Data Contrastive Learning via Class-Conditioned and Feature-Correlation Based Augmentation | Wei Cui et.al. | 2404.17489v1 | link |
2024-04-26 | Low Cost Machine Vision for Insect Classification | Danja Brandt et.al. | 2404.17488v1 | null |
2024-04-26 | Conformal Prediction with Learned Features | Shayan Kiyani et.al. | 2404.17487v1 | null |
2024-04-26 | Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model | Zhenghong Li et.al. | 2404.17484v1 | null |
2024-04-26 | One-Shot Image Restoration | Deborah Pereg et.al. | 2404.17426v1 | null |
2024-04-25 | Made to Order: Discovering monotonic temporal changes via self-supervised video ordering | Charig Yang et.al. | 2404.16828v1 | null |
2024-04-25 | ResVR: Joint Rescaling and Viewport Rendering of Omnidirectional Images | Weiqi Li et.al. | 2404.16825v1 | null |
2024-04-25 | V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection | Xuanyu Zhang et.al. | 2404.16824v1 | null |
2024-04-25 | Learning Visuotactile Skills with Two Multifingered Hands | Toru Lin et.al. | 2404.16823v1 | link |
2024-04-25 | Meta-Transfer Derm-Diagnosis: Exploring Few-Shot Learning and Transfer Learning for Skin Disease Classification in Long-Tail Distribution | Zeynep Özdemir et.al. | 2404.16814v1 | null |
2024-04-25 | Transformer-Based Local Feature Matching for Multimodal Image Registration | Remi Delaunay et.al. | 2404.16802v1 | null |
2024-04-25 | DrS: Learning Reusable Dense Rewards for Multi-Stage Tasks | Tongzhou Mu et.al. | 2404.16779v1 | null |
2024-04-25 | Modeling Selective Feature Attention for Representation-based Siamese Text Matching | Jianxiang Zang et.al. | 2404.16776v1 | link |
2024-04-25 | Classifying One-Dimensional Quantum States Prepared by a Single Round of Measurements | Rahul Sahay et.al. | 2404.16753v1 | null |
2024-04-25 | Characterizing Solar Center-to-Limb Radial-Velocity Variability with SDO | Michael L. Palumbo III et.al. | 2404.16747v1 | null |
2024-04-24 | Optimizing OOD Detection in Molecular Graphs: A Novel Approach with Diffusion Models | Xu Shen et.al. | 2404.15625v1 | null |
2024-04-24 | Layer Ensemble Averaging for Improving Memristor-Based Artificial Neural Network Performance | Osama Yousuf et.al. | 2404.15621v1 | null |
2024-04-24 | A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution | Zhixiong Yang et.al. | 2404.15620v1 | link |
2024-04-24 | MDDD: Manifold-based Domain Adaptation with Dynamic Distribution for Non-Deep Transfer Learning in Cross-subject and Cross-session EEG-based Emotion Recognition | Ting Luo et.al. | 2404.15615v1 | null |
2024-04-24 | Federated Learning with Only Positive Labels by Exploring Label Correlations | Xuming An et.al. | 2404.15598v1 | null |
2024-04-24 | A Survey of Deep Long-Tail Classification Advancements | Charika de Alvis et.al. | 2404.15593v1 | null |
2024-04-24 | Domain Adaptation for Learned Image Compression with Supervised Adapters | Alberto Presta et.al. | 2404.15591v1 | null |
2024-04-24 | Brain Storm Optimization Based Swarm Learning for Diabetic Retinopathy Image Classification | Liang Qu et.al. | 2404.15585v1 | null |
2024-04-24 | Research on OPF control of three-phase four-wire low-voltage distribution network considering uncertainty | Rui Wang et.al. | 2404.15584v1 | null |
2024-04-24 | MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis | Jiaxin Zhuang et.al. | 2404.15580v1 | null |
2024-04-23 | ID-Animator: Zero-Shot Identity-Preserving Human Video Generation | Xuanhua He et.al. | 2404.15275v1 | link |
2024-04-23 | Metric-guided Image Reconstruction Bounds via Conformal Prediction | Matt Y Cheung et.al. | 2404.15274v1 | link |
2024-04-23 | Quantum optical classifier with superexponential speedup | Simone Roncallo et.al. | 2404.15266v1 | null |
2024-04-23 | TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting | Jiahe Li et.al. | 2404.15264v1 | null |
2024-04-23 | Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization | Lahav Lipson et.al. | 2404.15263v1 | link |
2024-04-23 | FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent | Cameron Smith et.al. | 2404.15259v1 | null |
2024-04-23 | Source-free Domain Adaptation for Video Object Detection Under Adverse Image Conditions | Xingguang Zhang et.al. | 2404.15252v1 | null |
2024-04-23 | Unifying the Temperature Dependent Dynamics of Glasses | Joseph B. Schlenoff et.al. | 2404.15250v1 | null |
2024-04-23 | Mining Invariance from Nonlinear Multi-Environment Data: Binary Classification | Austin Goddard et.al. | 2404.15245v1 | null |
2024-04-23 | Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models | Aidan Z. H. Yang et.al. | 2404.15236v1 | null |
2024-04-22 | AutoAD III: The Prequel -- Back to the Pixels | Tengda Han et.al. | 2404.14412v1 | null |
2024-04-22 | Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses | Inhee Lee et.al. | 2404.14410v1 | null |
2024-04-22 | Hyp-OC: Hyperbolic One Class Classification for Face Anti-Spoofing | Kartik Narayan et.al. | 2404.14406v1 | null |
2024-04-22 | A mean curvature flow arising in adversarial training | Leon Bungert et.al. | 2404.14402v1 | null |
2024-04-22 | TAVGBench: Benchmarking Text to Audible-Video Generation | Yuxin Mao et.al. | 2404.14381v1 | link |
2024-04-22 | Rethinking Legal Compliance Automation: Opportunities with Large Language Models | Shabnam Hassani et.al. | 2404.14356v1 | null |
2024-04-22 | On-the-Fly Point Annotation for Fast Medical Video Labeling | Meyer Adrien et.al. | 2404.14344v1 | null |
2024-04-22 | X-Ray: A Sequential 3D Representation for Generation | Tao Hu et.al. | 2404.14329v1 | null |
2024-04-22 | A Novel Approach to Chest X-ray Lung Segmentation Using U-net and Modified Convolutional Block Attention Module | Mohammad Ali Labbaf Khaniki et.al. | 2404.14322v1 | null |
2024-04-22 | "I Upload...All Types of Different Things to Say, the World of Blindness Is More Than What They Think It Is": A Study of Blind TikTokers' Identity Work from a Flourishing Perspective | Yao Lyu et.al. | 2404.14305v1 | null |
2024-04-19 | Data Alignment for Zero-Shot Concept Generation in Dermatology AI | Soham Gadgil et.al. | 2404.13043v1 | null |
2024-04-19 | PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation | Tianyuan Zhang et.al. | 2404.13026v1 | null |
2024-04-19 | BANF: Band-limited Neural Fields for Levels of Detail Reconstruction | Ahan Shabanov et.al. | 2404.13024v1 | null |
2024-04-19 | Stronger Random Baselines for In-Context Learning | Gregory Yauney et.al. | 2404.13020v1 | link |
2024-04-19 | A New Multi-Picture Architecture for Learned Video Deinterlacing and Demosaicing with Parallel Deformable Convolution and Self-Attention Blocks | Ronglei Ji et.al. | 2404.13018v1 | null |
2024-04-19 | Towards Robust Ferrous Scrap Material Classification with Deep Learning and Conformal Prediction | Paulo Henrique dos Santos et.al. | 2404.13002v1 | null |
2024-04-19 | RadRotator: 3D Rotation of Radiographs with Diffusion Models | Pouria Rouzrokh et.al. | 2404.13000v1 | null |
2024-04-19 | Nuclei Instance Segmentation of Cryosectioned H&E Stained Histological Images using Triple U-Net Architecture | Zarif Ahmed et.al. | 2404.12986v1 | null |
2024-04-19 | Cross-modal Diffusion Modelling for Super-resolved Spatial Transcriptomics | Xiaofei Wang et.al. | 2404.12973v1 | null |
2024-04-19 | Improving Pediatric Pneumonia Diagnosis with Adult Chest X-ray Images Utilizing Contrastive Learning and Embedding Similarity | Mohammad Zunaed et.al. | 2404.12958v1 | null |
2024-04-18 | On the Content Bias in Fréchet Video Distance | Songwei Ge et.al. | 2404.12391v1 | null |
2024-04-18 | Moving Object Segmentation: All You Need Is SAM (and Flow) | Junyu Xie et.al. | 2404.12389v1 | null |
2024-04-18 | VideoGigaGAN: Towards Detail-rich Video Super-Resolution | Yiran Xu et.al. | 2404.12388v1 | null |
2024-04-18 | Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models | Aitor Ormazabal et.al. | 2404.12387v1 | null |
2024-04-18 | G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis | Yufei Ye et.al. | 2404.12383v1 | null |
2024-04-18 | Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos | Isabella Liu et.al. | 2404.12379v1 | null |
2024-04-18 | RoboDreamer: Learning Compositional World Models for Robot Imagination | Siyuan Zhou et.al. | 2404.12377v1 | null |
2024-04-18 | When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes | Asaf Yehudai et.al. | 2404.12365v1 | null |
2024-04-18 | Inverse Neural Rendering for Explainable Multi-Object Tracking | Julian Ost et.al. | 2404.12359v1 | null |
2024-04-18 | Improving the interpretability of GNN predictions through conformal-based graph sparsification | Pablo Sanchez-Martin et.al. | 2404.12356v1 | link |
2024-04-18 | Dynamic Typography: Bringing Text to Life via Video Diffusion Prior | Zichen Liu et.al. | 2404.11614v2 | null |
2024-04-17 | VG4D: Vision-Language Model Goes 4D Video Recognition | Zhichao Deng et.al. | 2404.11605v1 | link |
2024-04-17 | Variational Bayesian Last Layers | James Harrison et.al. | 2404.11599v1 | link |
2024-04-17 | State-space Decomposition Model for Video Prediction Considering Long-term Motion Trend | Fei Cui et.al. | 2404.11576v1 | null |
2024-04-17 | Simple Image Signal Processing using Global Context Guidance | Omar Elezabi et.al. | 2404.11569v1 | link |
2024-04-17 | Spatio-Temporal Motion Retargeting for Quadruped Robots | Taerim Yoon et.al. | 2404.11557v1 | null |
2024-04-17 | Predicting Long-horizon Futures by Conditioning on Geometry and Time | Tarasha Khurana et.al. | 2404.11554v1 | null |
2024-04-17 | Carbon- and Oxygen-rich stars in MaStar: identification and classification | Lewis Hill et.al. | 2404.11541v1 | null |
2024-04-17 | GenFighter: A Generative and Evolutive Textual Attack Removal | Md Athikul Islam et.al. | 2404.11538v1 | null |
2024-04-17 | SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening | Yu Zhong et.al. | 2404.11537v1 | null |
2024-04-16 | COMBO: Compositional World Models for Embodied Multi-Agent Cooperation | Hongxin Zhang et.al. | 2404.10775v1 | null |
2024-04-16 | RapidVol: Rapid Reconstruction of 3D Ultrasound Volumes from Sensorless 2D Scans | Mark C. Eid et.al. | 2404.10766v1 | null |
2024-04-16 | Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification | Yu-Yang Li et.al. | 2404.10757v1 | null |
2024-04-16 | Integer-valued o-minimal functions | Neer Bhardwaj et.al. | 2404.10737v1 | null |
2024-04-16 | Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning | Hao-Lun Hsu et.al. | 2404.10728v1 | null |
2024-04-16 | AV-GAN: Attention-Based Varifocal Generative Adversarial Network for Uneven Medical Image Translation | Zexin Li et.al. | 2404.10714v1 | null |
2024-04-17 | Dual Modalities of Text: Visual and Textual Generative Pre-training | Yekun Chai et.al. | 2404.10710v2 | null |
2024-04-16 | Question Difficulty Ranking for Multiple-Choice Reading Comprehension | Vatsal Raina et.al. | 2404.10704v1 | null |
2024-04-16 | Retrieval Augmented Verification : Unveiling Disinformation with Structured Representations for Zero-Shot Real-Time Evidence-guided Fact-Checking of Multi-modal Social media posts | Arka Ujjal Dey et.al. | 2404.10702v1 | null |
2024-04-16 | Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs | Georgy Perevozchikov et.al. | 2404.10700v1 | null |
2024-04-15 | Squish Jamming | Samuel Poincloux et.al. | 2404.09773v1 | null |
2024-04-15 | Hilti SLAM Challenge 2023: Benchmarking Single + Multi-session SLAM across Sensor Constellations in Construction | Ashish Devadas Nair et.al. | 2404.09765v1 | null |
2024-04-15 | Deep Learning-Based Segmentation of Tumors in PET/CT Volumes: Benchmark of Different Architectures and Training Strategies | Monika Górka et.al. | 2404.09761v1 | null |
2024-04-15 | Quantization of Large Language Models with an Overdetermined Basis | Daniil Merkulov et.al. | 2404.09737v1 | null |
2024-04-15 | FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features | Andre Rochow et.al. | 2404.09736v1 | null |
2024-04-15 | Classification of finite type fusion quivers | Ben Elias et.al. | 2404.09714v1 | null |
2024-04-15 | LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models | Guangyan Li et.al. | 2404.09695v1 | null |
2024-04-15 | Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration | Chenwei Lin et.al. | 2404.09690v1 | null |
2024-04-15 | Post-Training Network Compression for 3D Medical Image Segmentation: Reducing Computational Efforts via Tucker Decomposition | Tobias Weber et.al. | 2404.09683v1 | link |
2024-04-15 | Cluster analysis of the Roma-BZCAT blazars | D. O. Kudryavtsev et.al. | 2404.09667v1 | null |
2024-04-15 | Deformable MRI Sequence Registration for AI-based Prostate Cancer Diagnosis | Alessa Hering et.al. | 2404.09666v1 | null |
2024-04-15 | Closing the Gap in the Trade-off between Fair Representations and Accuracy | Biswajit Rout et.al. | 2404.09664v1 | null |
2024-04-15 | If there's a Trigger Warning, then where's the Trigger? Investigating Trigger Warnings at the Passage Level | Matti Wiegmann et.al. | 2404.09615v1 | link |
2024-04-12 | FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models | Yanting Wang et.al. | 2404.08631v1 | null |
2024-04-12 | Classification of Boolean Algebras through von Neumann regular $\mathcal{C}^{\infty}-$Rings | Jean Cerqueira Berni et.al. | 2404.08629v1 | null |
2024-04-12 | Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation | Yanhao Zheng et.al. | 2404.08603v1 | link |
2024-04-12 | Pathological Primitive Segmentation Based on Visual Foundation Model with Zero-Shot Mask Generation | Abu Bakor Hayat Arnob et.al. | 2404.08584v1 | link |
2024-04-12 | Lossy Image Compression with Foundation Diffusion Models | Lucas Relic et.al. | 2404.08580v1 | null |
2024-04-12 | IDD-X: A Multi-View Dataset for Ego-relative Important Object Localization and Explanation in Dense and Unstructured Traffic | Chirag Parikh et.al. | 2404.08561v1 | null |
2024-04-12 | Scalability in Building Component Data Annotation: Enhancing Facade Material Classification with Synthetic Data | Josie Harrison et.al. | 2404.08557v1 | null |
2024-04-12 | Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations | Boyuan Peng et.al. | 2404.08549v1 | null |
2024-04-12 | VertAttack: Taking advantage of Text Classifiers' horizontal vision | Jonathan Rusert et.al. | 2404.08538v1 | null |
2024-04-12 | Text Prompt with Normality Guidance for Weakly Supervised Video Anomaly Detection | Zhiwei Yang et.al. | 2404.08531v1 | null |
2024-04-11 | Connecting NeRFs, Images, and Text | Francesco Ballerini et.al. | 2404.07993v1 | null |
2024-04-11 | GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh | Jing Wen et.al. | 2404.07991v1 | null |
2024-04-11 | WaveMo: Learning Wavefront Modulations to See Through Scattering | Mingyang Xie et.al. | 2404.07985v1 | null |
2024-04-11 | Gaga: Group Any Gaussians via 3D-aware Memory Bank | Weijie Lyu et.al. | 2404.07977v1 | null |
2024-04-11 | FusionMamba: Efficient Image Fusion with State Space Model | Siran Peng et.al. | 2404.07932v1 | null |
2024-04-11 | HGRN2: Gated Linear RNNs with State Expansion | Zhen Qin et.al. | 2404.07904v1 | link |
2024-04-11 | Q-ITAGS: Quality-Optimized Spatio-Temporal Heterogeneous Task Allocation with a Time Budget | Glen Neville et.al. | 2404.07902v1 | null |
2024-04-11 | Auditing health-related recommendations in social media: A Case Study of Abortion on YouTube | Mohammed Lahsaini et.al. | 2404.07896v1 | null |
2024-04-11 | Typical blocks of the category | Chih-Whi Chen et.al. | 2404.07894v1 | null |
2024-04-11 | Context-aware Video Anomaly Detection in Long-Term Datasets | Zhengye Yang et.al. | 2404.07887v1 | null |
2024-04-10 | RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion | Jaidev Shriram et.al. | 2404.07199v1 | null |
2024-04-10 | GCV-Turbo: End-to-end Acceleration of GNN-based Computer Vision Tasks on FPGA | Bingyi Zhang et.al. | 2404.07188v1 | null |
2024-04-10 | Adinkras and Pure Spinors | Richard Eager et.al. | 2404.07167v1 | null |
2024-04-10 | Lost in Translation: Modern Neural Networks Still Struggle With Small Realistic Image Transformations | Ofir Shifman et.al. | 2404.07153v1 | null |
2024-04-10 | Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization | Michael Kohler et.al. | 2404.07128v1 | null |
2024-04-10 | Measuring proximity to standard planes during fetal brain ultrasound scanning | Chiara Di Vece et.al. | 2404.07124v1 | null |
2024-04-10 | "My toxic trait is thinking I'll remember this": gaps in the learner experience of video tutorials for feature-rich software | Ian Drosos et.al. | 2404.07114v1 | null |
2024-04-10 | The generic dual of p-adic groups and applications | Chris Jantzen et.al. | 2404.07111v1 | null |
2024-04-10 | Learning Priors for Non Rigid SfM from Casual Videos | Yoni Kasten et.al. | 2404.07097v1 | null |
2024-04-10 | VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning | Alexandros Xenos et.al. | 2404.07078v1 | link |
2024-04-09 | MoReVQA: Exploring Modular Reasoning Models for Video Question Answering | Juhong Min et.al. | 2404.06511v1 | null |
2024-04-10 | Reconstructing Hand-Held Objects in 3D | Jane Wu et.al. | 2404.06507v2 | null |
2024-04-09 | A Machine Learning Framework for the Prediction of Grain Boundary Segregation in Chemically Complex Environments | Doruk Aksoy et.al. | 2404.06499v1 | null |
2024-04-10 | Flying with Photons: Rendering Novel Views of Propagating Light | Anagh Malik et.al. | 2404.06493v2 | null |
2024-04-09 | Uncovering Tidal Treasures: Automated Classification of Faint Tidal Features in DECaLS Data | Alexander J. Gordon et.al. | 2404.06487v1 | null |
2024-04-09 | RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos | Bochao Zou et.al. | 2404.06483v1 | null |
2024-04-09 | Laue Indexing with Optimal Transport | Tomasz Kacprzak et.al. | 2404.06478v1 | link |
2024-04-09 | A comparative analysis of deep learning models for lung segmentation on X-ray images | Weronika Hryniewska-Guzik et.al. | 2404.06455v1 | link |
2024-04-09 | QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding | Yash Mehan et.al. | 2404.06442v1 | null |
2024-04-09 | ClassiPyGRB: Machine Learning-Based Classification and Visualization of Gamma Ray Bursts using t-SNE | Keneth Garcia-Cifuentes et.al. | 2404.06439v1 | null |
2024-04-08 | MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Bo He et.al. | 2404.05726v1 | null |
2024-04-08 | Predicting Overtakes in Trucks Using CAN Data | Talha Hanif Butt et.al. | 2404.05723v1 | null |
2024-04-08 | Case Study: Neural Network Malware Detection Verification for Feature and Image Datasets | Preston K. Robinette et.al. | 2404.05703v1 | null |
2024-04-08 | Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding | Ahmad Idrissi-Yaghir et.al. | 2404.05694v1 | null |
2024-04-08 | Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery | Ionut M. Motoi et.al. | 2404.05693v1 | null |
2024-04-08 | AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation | Jiannan Ge et.al. | 2404.05667v1 | null |
2024-04-08 | Oblique photons, plasmons, and current-plasmons in relativistic plasmas and their topological implications | Hong Qin et.al. | 2404.05636v1 | null |
2024-04-08 | AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets | Pietro Lesci et.al. | 2404.05623v1 | null |
2024-04-08 | Experimental observation of a time rondeau crystal: Temporal Disorder in Spatiotemporal Order | Leo Joon Il Moon et.al. | 2404.05620v1 | null |
2024-04-08 | Self-Explainable Affordance Learning with Embodied Caption | Zhipeng Zhang et.al. | 2404.05603v1 | null |
2024-04-05 | On classification of global dynamics for energy-critical equivariant harmonic map heat flows and radial nonlinear heat equation | Kihyun Kim et.al. | 2404.04247v1 | null |
2024-04-05 | Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism | Trilokesh Ranjan Sarkar et.al. | 2404.04245v1 | null |
2024-04-05 | player2vec: A Language Modeling Approach to Understand Player Behavior in Games | Tianze Wang et.al. | 2404.04234v1 | null |
2024-04-05 | Deep-learning Segmentation of Small Volumes in CT images for Radiotherapy Treatment Planning | Jianxin Zhou et.al. | 2404.04202v1 | null |
2024-04-05 | SCAResNet: A ResNet Variant Optimized for Tiny Object Detection in Transmission and Distribution Towers | Weile Li et.al. | 2404.04179v1 | link |
2024-04-05 | Noisy Label Processing for Classification: A Survey | Mengting Li et.al. | 2404.04159v1 | null |
2024-04-05 | Improving Detection in Aerial Images by Capturing Inter-Object Relationships | Botao Ren et.al. | 2404.04140v1 | null |
2024-04-05 | Label Propagation for Zero-shot Classification with Vision-Language Models | Vladan Stojnić et.al. | 2404.04072v1 | link |
2024-04-05 | VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots | Akhil Padmanabha et.al. | 2404.04066v1 | null |
2024-04-05 | Phase Binarization in Mutually Synchronized Bias Field-free Spin Hall Nano-oscillators for Reservoir Computing | Sourabh Manna et.al. | 2404.04023v1 | null |
2024-04-04 | OW-VISCap: Open-World Video Instance Segmentation and Captioning | Anwesa Choudhuri et.al. | 2404.03657v1 | null |
2024-04-04 | Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | Shuting He et.al. | 2404.03645v1 | link |
2024-04-04 | On the Efficiency of Convolutional Neural Networks | Andrew Lavin et.al. | 2404.03617v1 | null |
2024-04-04 | Creator Hearts: Investigating the Impact Positive Signals from YouTube Creators in Shaping Comment Section Behavior | Frederick Choi et.al. | 2404.03612v1 | null |
2024-04-04 | InsectMamba: Insect Pest Classification with State Space Model | Qianning Wang et.al. | 2404.03611v1 | null |
2024-04-04 | DiffDet4SAR: Diffusion-based Aircraft Target Detection Network for SAR Images | Zhou Jie et.al. | 2404.03595v1 | link |
2024-04-04 | Alzheimer's disease detection in PSG signals | Lorena Gallego-Viñarás et.al. | 2404.03549v1 | null |
2024-04-04 | Towards Transcranial 3D Ultrasound Localization Microscopy of the Nonhuman Primate Brain | Paul Xing et.al. | 2404.03547v1 | null |
2024-04-04 | Segmentation-Guided Knee Radiograph Generation using Conditional Diffusion Models | Siyuan Mei et.al. | 2404.03541v1 | null |
2024-04-05 | A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data | Iqra Bano et.al. | 2404.03493v2 | null |
2024-04-03 | LidarDM: Generative LiDAR Simulation in a Generated World | Vlas Zyrianov et.al. | 2404.02903v1 | null |
2024-04-03 | Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds | Kamalika Chaudhuri et.al. | 2404.02866v1 | link |
2024-04-03 | Semisimple Algebras of Vector Fields on | Sajid Ali et.al. | 2404.02847v1 | null |
2024-04-03 | GPU-Accelerated RSF Level Set Evolution for Large-Scale Microvascular Segmentation | Meher Niger et.al. | 2404.02813v1 | null |
2024-04-03 | Generative-Contrastive Heterogeneous Graph Neural Network | Yu Wang et.al. | 2404.02810v1 | null |
2024-04-03 | FPT: Feature Prompt Tuning for Few-shot Readability Assessment | Ziyang Wang et.al. | 2404.02772v1 | link |
2024-04-03 | DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement | Hao Wu et.al. | 2404.02755v1 | null |
2024-04-03 | Terraced Compression Method with Automated Threshold Selection for Multidimensional Image Clustering of Heterogeneous Bodies | Jiatong Li et.al. | 2404.02744v1 | null |
2024-04-03 | Event Camera Demosaicing via Swin Transformer and Pixel-focus Loss | Yunfan Lu et.al. | 2404.02731v1 | link |
2024-04-03 | Unblind Text Inputs: Predicting Hint-text of Text Input in Mobile Apps via LLM | Zhe Liu et.al. | 2404.02706v1 | null |
2024-04-02 | Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models | Zeyu Yang et.al. | 2404.02148v1 | link |
2024-04-02 | Multiparametric quantification and visualization of liver fat using ultrasound | Jihye Baek et.al. | 2404.02143v1 | null |
2024-04-03 | ResNet with Integrated Convolutional Block Attention Module for Ship Classification Using Transfer Learning on Optical Satellite Imagery | Ryan Donghan Kwon et.al. | 2404.02135v2 | null |
2024-04-02 | ViTamin: Designing Scalable Vision Models in the Vision-Language Era | Jieneng Chen et.al. | 2404.02132v1 | link |
2024-04-02 | ImageNot: A contrast with ImageNet preserves model rankings | Olawale Salaudeen et.al. | 2404.02112v1 | null |
2024-04-02 | CameraCtrl: Enabling Camera Control for Text-to-Video Generation | Hao He et.al. | 2404.02101v1 | link |
2024-04-02 | Explainability in JupyterLab and Beyond: Interactive XAI Systems for Integrated and Collaborative Workflows | Grace Guo et.al. | 2404.02081v1 | null |
2024-04-02 | Multi-Level Label Correction by Distilling Proximate Patterns for Semi-supervised Semantic Segmentation | Hui Xiao et.al. | 2404.02065v1 | null |
2024-04-02 | Long-context LLMs Struggle with Long In-context Learning | Tianle Li et.al. | 2404.02060v1 | link |
2024-04-02 | Deconstructing In-Context Learning: Understanding Prompts via Corruption | Namrata Shivagunde et.al. | 2404.02054v1 | link |
2024-03-29 | Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations | Jaisidh Singh et.al. | 2403.20312v1 | link |
2024-03-29 | Emotion-Anchored Contrastive Learning Framework for Emotion Recognition in Conversation | Fangxu Yu et.al. | 2403.20289v1 | link |
2024-03-29 | Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges | Shreyasi Pathak et.al. | 2403.20260v1 | null |
2024-03-29 | Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions | Runhao Zeng et.al. | 2403.20254v1 | null |
2024-03-29 | Latent Embedding Clustering for Occlusion Robust Head Pose Estimation | José Celestino et.al. | 2403.20251v1 | null |
2024-03-29 | Long-Tailed Anomaly Detection with Learnable Class Names | Chih-Hui Ho et.al. | 2403.20236v1 | null |
2024-04-02 | Artificial Neural Networks-based Real-time Classification of ENG Signals for Implanted Nerve Interfaces | Antonio Coviello et.al. | 2403.20234v2 | null |
2024-03-29 | MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark | Sanghyun Woo et.al. | 2403.20225v1 | null |
2024-03-29 | Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science | Yazheng Yang et.al. | 2403.20208v1 | null |
2024-03-29 | The Future of Combating Rumors? Retrieval, Discrimination, and Generation | Junhao Xu et.al. | 2403.20204v1 | null |
2024-03-28 | RSMamba: Remote Sensing Image Classification with State Space Model | Keyan Chen et.al. | 2403.19654v1 | link |
2024-03-28 | Square patterns in dynamical orbits | Vefa Goksel et.al. | 2403.19642v1 | null |
2024-03-28 | Siamese Vision Transformers are Scalable Audio-visual Learners | Yan-Bo Lin et.al. | 2403.19638v1 | null |
2024-03-28 | Four-dimensional gradient Ricci solitons with (half) nonnegative isotropic curvature | Huai-Dong Cao et.al. | 2403.19627v1 | null |
2024-03-28 | Top-$k$ Classification and Cardinality-Aware Prediction | Anqi Mao et.al. | 2403.19625v1 | null |
2024-03-28 | RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents | Zeren Chen et.al. | 2403.19622v1 | null |
2024-03-28 | SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects | Avinash Ummadisingu et.al. | 2403.19607v1 | null |
2024-03-28 | Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Zhicai Wang et.al. | 2403.19600v1 | link |
2024-03-28 | Frame by Familiar Frame: Understanding Replication in Video Diffusion Models | Aimon Rahman et.al. | 2403.19593v1 | null |
2024-03-28 | Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation | Zhongliang Zhou et.al. | 2403.19584v1 | null |
2024-03-27 | MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering | Guoxing Sun et.al. | 2403.18820v1 | null |
2024-03-27 | Breaking the Limitations with Sparse Inputs by Variational Frameworks (BLIss) in Terahertz Super-Resolution 3D Reconstruction | Yiyao Zhang et.al. | 2403.18776v1 | null |
2024-03-27 | CaT: Constraints as Terminations for Legged Locomotion Reinforcement Learning | Elliot Chane-Sane et.al. | 2403.18765v1 | null |
2024-03-27 | A vascular synthetic model for improved aneurysm segmentation and detection via Deep Neural Networks | Rafic Nader et.al. | 2403.18734v1 | null |
2024-03-27 | Contrastive Learning with Orthonormal Anchors (CLOA) | Huanran Li et.al. | 2403.18699v1 | null |
2024-03-27 | Annolid: Annotate, Segment, and Track Anything You Need | Chen Yang et.al. | 2403.18690v1 | null |
2024-03-27 | InceptionTime vs. Wavelet -- A comparison for time series classification | Daniel Klenkert et.al. | 2403.18687v1 | null |
2024-03-27 | TransFusion: Contrastive Learning with Transformers | Huanran Li et.al. | 2403.18681v1 | null |
2024-03-28 | FluxGAT: Integrating Flux Sampling with Graph Neural Networks for Unbiased Gene Essentiality Classification | Kieren Sharma et.al. | 2403.18666v2 | null |
2024-03-27 | Indecomposable set-theoretical solutions to the Yang-Baxter equation of size | Carsten Dietzel et.al. | 2403.18653v1 | null |
2024-03-26 | Efficient Video Object Segmentation via Modulated Cross-Attention Memory | Abdelrahman Shaker et.al. | 2403.17937v1 | link |
2024-03-26 | ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis | Muhammad Hamza Mughal et.al. | 2403.17936v1 | null |
2024-03-26 | OmniVid: A Generative Framework for Universal Video Understanding | Junke Wang et.al. | 2403.17935v1 | link |
2024-03-26 | Track Everything Everywhere Fast and Robustly | Yunzhou Song et.al. | 2403.17931v1 | null |
2024-03-26 | FastCAR: Fast Classification And Regression Multi-Task Learning via Task Consolidation for Modelling a Continuous Property Variable of Object Classes | Anoop Kini et.al. | 2403.17926v1 | null |
2024-03-26 | The Need for Speed: Pruning Transformers with One Recipe | Samir Khaki et.al. | 2403.17921v1 | link |
2024-03-26 | TC4D: Trajectory-Conditioned Text-to-4D Generation | Sherwin Bahmani et.al. | 2403.17920v1 | null |
2024-03-26 | AgentStudio: A Toolkit for Building General Virtual Agents | Longtao Zheng et.al. | 2403.17918v1 | null |
2024-03-26 | Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos | Akshay Paruchuri et.al. | 2403.17915v1 | null |
2024-03-26 | Hierarchical Multi-label Classification for Fine-level Event Extraction from Aviation Accident Reports | Xinyu Zhao et.al. | 2403.17914v1 | null |
2024-03-25 | DBPF: A Framework for Efficient and Robust Dynamic Bin-Picking | Yichuan Li et.al. | 2403.16786v1 | null |
2024-03-25 | C-arm inverse geometry CT for 3D cardiac chamber mapping | Jordan M. Slagowski et.al. | 2403.16779v1 | null |
2024-03-25 | Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases | Sophie Starck et.al. | 2403.16776v1 | null |
2024-03-25 | As Good As A Coin Toss: Human detection of AI-generated images, videos, audio, and audiovisual stimuli | Di Cooke et.al. | 2403.16760v1 | null |
2024-03-25 | Creating a Digital Twin of Spinal Surgery: A Proof of Concept | Jonas Hein et.al. | 2403.16736v1 | null |
2024-03-25 | A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models | Nils Ingelhag et.al. | 2403.16730v1 | null |
2024-03-25 | One-Shot Domain Incremental Learning | Yasushi Esaki et.al. | 2403.16707v1 | null |
2024-03-25 | Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer | Dominik Müller et.al. | 2403.16695v1 | null |
2024-03-25 | DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks | Dominik Müller et.al. | 2403.16678v1 | link |
2024-03-25 | FOOL: Addressing the Downlink Bottleneck in Satellite Computing with Neural Feature Compression | Alireza Furutanpey et.al. | 2403.16677v1 | null |
2024-03-25 | A Novel Loss Function-based Support Vector Machine for Binary Classification | Yan Li et.al. | 2403.16654v1 | null |
2024-03-25 | Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution | Qingping Zheng et.al. | 2403.16643v1 | null |
2024-03-25 | Multi-Scale Texture Loss for CT denoising with GANs | Francesco Di Feola et.al. | 2403.16640v1 | link |
2024-03-25 | AI-Generated Video Detection via Spatio-Temporal Anomaly Learning | Jianfa Bai et.al. | 2403.16638v1 | null |
2024-03-25 | Distributed collaborative anomalous sound detection by embedding sharing | Kota Dohi et.al. | 2403.16610v1 | null |
2024-03-25 | EDUE: Expert Disagreement-Guided One-Pass Uncertainty Estimation for Medical Image Segmentation | Kudaibergen Abutalip et.al. | 2403.16594v1 | null |
2024-03-22 | LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models | Yuzhang Shang et.al. | 2403.15388v1 | null |
2024-03-22 | Time-efficient, high-resolution 3T whole-brain relaxometry using Cartesian 3D MR-STAT with CSF suppression | Hongyan Liu et.al. | 2403.15379v1 | null |
2024-03-22 | Long-CLIP: Unlocking the Long-Text Capability of CLIP | Beichen Zhang et.al. | 2403.15378v1 | null |
2024-03-22 | InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding | Yi Wang et.al. | 2403.15377v1 | null |
2024-03-22 | Cascading Blackout Severity Prediction with Statistically-Augmented Graph Neural Networks | Joe Gorka et.al. | 2403.15363v1 | null |
2024-03-22 | SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series | Badri N. Patro et.al. | 2403.15360v1 | null |
2024-03-22 | Ultrasound Imaging based on the Variance of a Diffusion Restoration Model | Yuxin Zhang et.al. | 2403.15316v1 | null |
2024-03-22 | Global Control for Local SO(3)-Equivariant Scale-Invariant Vessel Segmentation | Patryk Rygiel et.al. | 2403.15314v1 | null |
2024-03-22 | Quantum-inspired classification via efficient simulation of Helstrom measurement | Wooseop Hwang et.al. | 2403.15308v1 | null |
2024-03-22 | Reconnaissance ultracool spectra in the Euclid Deep Fields | Jerry Jun-Yan Zhang et.al. | 2403.15288v1 | null |
2024-03-21 | Language Repository for Long Video Understanding | Kumara Kahatapitiya et.al. | 2403.14622v1 | link |
2024-03-22 | Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion | Xiang Fan et.al. | 2403.14617v2 | null |
2024-03-21 | Explorative Inbetweening of Time and Space | Haiwen Feng et.al. | 2403.14611v1 | null |
2024-03-21 | ReNoise: Real Image Inversion Through Iterative Noising | Daniel Garibi et.al. | 2403.14602v1 | null |
2024-03-21 | PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Zheng Zhang et.al. | 2403.14598v1 | link |
2024-03-21 | Large Language Models for Multi-Choice Question Classification of Medical Subjects | Víctor Ponce-López et.al. | 2403.14582v1 | null |
2024-03-21 | DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video | Narek Tumanyan et.al. | 2403.14548v1 | null |
2024-03-21 | Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images | Tom Burgert et.al. | 2403.14547v1 | null |
2024-03-21 | Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets | Ahmet Alp Kindiroglu et.al. | 2403.14534v1 | link |
2024-03-21 | Invisible Needle Detection in Ultrasound: Leveraging Mechanism-Induced Vibration | Chenyang Li et.al. | 2403.14523v1 | null |
2024-03-21 | Denoising Diffusion Models for 3D Healthy Brain Tissue Inpainting | Alicia Durrer et.al. | 2403.14499v1 | link |
2024-03-20 | TimeRewind: Rewinding Time with Image-and-Events Video Diffusion | Jingxi Chen et.al. | 2403.13800v1 | null |
2024-03-20 | Hierarchical NeuroSymbolic Approach for Action Quality Assessment | Lauren Okamoto et.al. | 2403.13798v1 | null |
2024-03-20 | Bridge the Modality and Capacity Gaps in Vision-Language Model Selection | Chao Yi et.al. | 2403.13797v1 | null |
2024-03-20 | The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency and Usability in AI | Matt White et.al. | 2403.13784v1 | null |
2024-03-20 | Gradings on associative triple systems of the second kind | Alberto Daza-Garcia et.al. | 2403.13775v1 | null |
2024-03-20 | Towards Principled Representation Learning from Videos for Reinforcement Learning | Dipendra Misra et.al. | 2403.13765v1 | null |
2024-03-20 | Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model | Diwei Wang et.al. | 2403.13756v1 | null |
2024-03-20 | Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation | Fu-Yun Wang et.al. | 2403.13745v1 | null |
2024-03-20 | Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes | Yifan Chen et.al. | 2403.13724v1 | null |
2024-03-20 | Improving the Adaptive Moment Estimation (ADAM) stochastic optimizer through an Implicit-Explicit (IMEX) time-stepping approach | Abhinab Bhattacharjee et.al. | 2403.13704v1 | null |
2024-03-19 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Zhuoshi Pan et.al. | 2403.12968v1 | null |
2024-03-19 | FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation | Shuai Yang et.al. | 2403.12962v1 | link |
2024-03-19 | WHAC: World-grounded Humans and Cameras | Wanqi Yin et.al. | 2403.12959v1 | null |
2024-03-19 | FutureDepth: Learning to Predict the Future Improves Video Depth Estimation | Rajeev Yasarla et.al. | 2403.12953v1 | null |
2024-03-19 | Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models | Elaine Sui et.al. | 2403.12952v1 | link |
2024-03-19 | Legendrian loops and cluster modular groups | James Hughes et.al. | 2403.12951v1 | null |
2024-03-19 | Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers | Vidhi Jain et.al. | 2403.12943v1 | null |
2024-03-19 | Contextual AD Narration with Interleaved Multimodal Sequence | Hanlin Wang et.al. | 2403.12922v1 | null |
2024-03-19 | Semantic Layering in Room Segmentation via LLMs | Taehyeon Kim et.al. | 2403.12920v1 | null |
2024-03-19 | Yell At Your Robot: Improving On-the-Fly from Language Corrections | Lucy Xiaoyang Shi et.al. | 2403.12910v1 | null |
2024-03-18 | Time Series Compression using Quaternion Valued Neural Networks and Quaternion Backpropagation | Johannes Pöppelbaum et.al. | 2403.11722v1 | null |
2024-03-18 | Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing | Juan Zhang et.al. | 2403.11700v1 | null |
2024-03-18 | A Spatial-Temporal Progressive Fusion Network for Breast Lesion Segmentation in Ultrasound Videos | Zhengzheng Tu et.al. | 2403.11699v1 | null |
2024-03-18 | Object Segmentation-Assisted Inter Prediction for Versatile Video Coding | Zhuoyuan Li et.al. | 2403.11694v1 | null |
2024-03-19 | MoreStyle: Relax Low-frequency Constraint of Fourier-based Image Reconstruction in Generalizable Medical Image Segmentation | Haoyu Zhao et.al. | 2403.11689v2 | null |
2024-03-18 | Better (pseudo-)labels for semi-supervised instance segmentation | François Porcher et.al. | 2403.11675v1 | null |
2024-03-19 | WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising | Haoyu Zhao et.al. | 2403.11672v2 | null |
2024-03-18 | Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection | Julia Wolleb et.al. | 2403.11667v1 | null |
2024-03-18 | Combining Local and Global Perception for Autonomous Navigation on Nano-UAVs | Lorenzo Lamberti et.al. | 2403.11661v1 | null |
2024-03-18 | LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model | Yuxin Cao et.al. | 2403.11656v1 | null |
2024-03-15 | Strong and Controllable Blind Image Decomposition | Zeyu Zhang et.al. | 2403.10520v1 | link |
2024-03-15 | Frozen Feature Augmentation for Few-Shot Image Classification | Andreas Bär et.al. | 2403.10519v1 | null |
2024-03-15 | VideoAgent: Long-form Video Understanding with Large Language Model as Agent | Xiaohan Wang et.al. | 2403.10517v1 | null |
2024-03-15 | Surveyor: Facilitating Discovery Within Video Games for Blind and Low Vision Players | Vishnu Nair et.al. | 2403.10512v1 | null |
2024-03-15 | Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study | Chenguang Wang et.al. | 2403.10499v1 | link |
2024-03-15 | Joint Multimodal Transformer for Dimensional Emotional Recognition in the Wild | Paul Waligora et.al. | 2403.10488v1 | null |
2024-03-15 | Tensor Star Decomposition | Wuyang Zhou et.al. | 2403.10481v1 | null |
2024-03-15 | Using an LLM to Turn Sign Spottings into Spoken Language Sentences | Ozge Mercanoglu Sincan et.al. | 2403.10434v1 | null |
2024-03-15 | Neural Networks Hear You Loud And Clear: Hearing Loss Compensation Using Deep Neural Networks | Peter Leer et.al. | 2403.10420v1 | null |
2024-03-15 | A comparative study on machine learning approaches for rock mass classification using drilling data | Tom F. Hansen et.al. | 2403.10404v1 | null |
2024-03-14 | Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models | Akhil Kedia et.al. | 2403.09635v1 | link |
2024-03-14 | Generalized Predictive Model for Autonomous Driving | Jiazhi Yang et.al. | 2403.09630v1 | link |
2024-03-14 | From the Conformal Anomaly to the Virasoro Algebra | Sid Maibach et.al. | 2403.09628v1 | null |
2024-03-14 | Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding | Guo Chen et.al. | 2403.09626v1 | link |
2024-03-14 | Score-Guided Diffusion for 3D Human Recovery | Anastasis Stathopoulos et.al. | 2403.09623v1 | link |
2024-03-14 | PosSAM: Panoptic Open-vocabulary Segment Anything | Vibashan VS et.al. | 2403.09620v1 | null |
2024-03-14 | Explore In-Context Segmentation via Latent Diffusion Models | Chaoyang Wang et.al. | 2403.09616v1 | null |
2024-03-14 | Compute-first optical detection for noise-resilient visual perception | Jungmin Kim et.al. | 2403.09612v1 | null |
2024-03-14 | Mixture of Mixups for Multi-label Classification of Rare Anuran Sounds | Ilyass Moummad et.al. | 2403.09598v1 | link |
2024-03-14 | DungeonMaker: Embedding Tangible Creation and Destruction in Hybrid Board Games through Personal Fabrication Technology | Evgeny Stemasov et.al. | 2403.09592v1 | null |
2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764v1 | null |
2024-03-13 | Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches | Yun Xin Teoh et.al. | 2403.08761v1 | null |
2024-03-13 | MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning | Jialv Zou et.al. | 2403.08760v1 | link |
2024-03-13 | Spatiotemporal Diffusion Model with Paired Sampling for Accelerated Cardiac Cine MRI | Shihan Qiu et.al. | 2403.08758v1 | null |
2024-03-13 | DAM: Dynamic Adapter Merging for Continual Video QA Learning | Feng Cheng et.al. | 2403.08755v1 | link |
2024-03-13 | Clinically Feasible Diffusion Reconstruction for Highly-Accelerated Cardiac Cine MRI | Shihan Qiu et.al. | 2403.08749v1 | null |
2024-03-13 | Torsion pairs, t-structures, and co-t-structures for completions of discrete cluster categories | Sofia Franchini et.al. | 2403.08735v1 | null |
2024-03-13 | Euclid: Testing photometric selection of emission-line galaxy targets | M. S. Cagliari et.al. | 2403.08726v1 | null |
2024-03-13 | Diffusion-based Iterative Counterfactual Explanations for Fetal Ultrasound Image Quality Assessment | Paraskevas Pegios et.al. | 2403.08700v1 | null |
2024-03-13 | Implicit Regularization of Gradient Flow on One-Layer Softmax Attention | Heejune Sheen et.al. | 2403.08699v1 | null |
2024-03-12 | OPEN TEACH: A Versatile Teleoperation System for Robotic Manipulation | Aadhithya Iyer et.al. | 2403.07870v1 | null |
2024-03-12 | TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation | Shivin Dass et.al. | 2403.07869v1 | null |
2024-03-12 | Iterative Graph Neural Network Enhancement via Frequent Subgraph Mining of Explanations | Harish G. Naik et.al. | 2403.07849v1 | null |
2024-03-12 | When Eye-Tracking Meets Machine Learning: A Systematic Review on Applications in Medical Image Analysis | Sahar Moradizeyveh et.al. | 2403.07834v1 | null |
2024-03-12 | DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies | William Xie et.al. | 2403.07832v1 | null |
2024-03-12 | A geometric model for the module category of a string algebra | Karin Baur et.al. | 2403.07810v1 | null |
2024-03-12 | BraSyn 2023 challenge: Missing MRI synthesis and the effect of different learning objectives | Ivo M. Baltruschat et.al. | 2403.07800v1 | null |
2024-03-12 | A robust SVM-based approach with feature selection and outliers detection for classification problems | Marta Baldomero-Naranjo et.al. | 2403.07753v1 | null |
2024-03-12 | Vision-based Vehicle Re-identification in Bridge Scenario using Flock Similarity | Chunfeng Zhang et.al. | 2403.07752v1 | null |
2024-03-12 | Harnessing two-photon dissipation for enhanced quantum measurement and control | Antoine Marquet et.al. | 2403.07744v1 | null |
2024-03-11 | Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling | Wele Gedara Chaminda Bandara et.al. | 2403.06978v1 | link |
2024-03-12 | VideoMamba: State Space Model for Efficient Video Understanding | Kunchang Li et.al. | 2403.06977v2 | link |
2024-03-11 | Memory-based Adapters for Online 3D Scene Perception | Xiuwei Xu et.al. | 2403.06974v1 | null |
2024-03-11 | Explainable Transformer Prototypes for Medical Diagnoses | Ugur Demir et.al. | 2403.06961v1 | link |
2024-03-11 | Quadruped-Frog: Rapid Online Optimization of Continuous Quadruped Jumping | Guillaume Bellegarda et.al. | 2403.06954v1 | null |
2024-03-11 | Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer | Siddhant Satyanaik et.al. | 2403.06953v1 | null |
2024-03-11 | Advancing Generalizable Remote Physiological Measurement through the Integration of Explicit and Implicit Prior Knowledge | Yuting Zhang et.al. | 2403.06947v1 | link |
2024-03-11 | Conditional Score-Based Diffusion Model for Cortical Thickness Trajectory Prediction | Qing Xiao et.al. | 2403.06940v1 | null |
2024-03-11 | FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks | Muhammad Saif Ullah Khan et.al. | 2403.06904v1 | null |
2024-03-11 | Benign overfitting in leaky ReLU networks with moderate input dimension | Kedar Karhadkar et.al. | 2403.06903v1 | null |
2024-03-08 | Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos | Tarun Kalluri et.al. | 2403.05535v1 | null |
2024-03-08 | Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets | Lorenzo Brigato et.al. | 2403.05532v1 | null |
2024-03-08 | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | Machel Reid et.al. | 2403.05530v1 | null |
2024-03-08 | Take Your Best Shot: Sampling-Based Next-Best-View Planning for Autonomous Photography & Inspection | Shijie Gao et.al. | 2403.05477v1 | null |
2024-03-08 | Will GPT-4 Run DOOM? | Adrian de Wynter et.al. | 2403.05468v1 | null |
2024-03-08 | Evaluating AI and Human Authorship Quality in Academic Writing through Physics Essays | Will Yeadon et.al. | 2403.05458v1 | null |
2024-03-08 | VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models | Yabo Zhang et.al. | 2403.05438v1 | link |
2024-03-08 | OmniCount: Multi-label Object Counting with Semantic-Geometric Priors | Anindya Mondal et.al. | 2403.05435v1 | null |
2024-03-08 | Infinite Translation Surfaces in the Wild | Vincent Delecroix et.al. | 2403.05424v1 | null |
2024-03-08 | Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery | Mubashir Noman et.al. | 2403.05419v1 | link |
2024-03-07 | DeepSee: Multidimensional Visualizations of Seabed Ecosystems | Adam Coscia et.al. | 2403.04761v1 | link |
2024-03-07 | iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries | Adam Coscia et.al. | 2403.04760v1 | link |
2024-03-07 | KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts | Adam Coscia et.al. | 2403.04758v1 | link |
2024-03-07 | Preliminary Guidelines For Combining Data Integration and Visual Data Analysis | Adam Coscia et.al. | 2403.04757v1 | link |
2024-03-07 | Photonic probabilistic machine learning using quantum vacuum noise | Seou Choi et.al. | 2403.04731v1 | null |
2024-03-07 | Analysis of Systems' Performance in Natural Language Processing Competitions | Sergio Nava-Muñoz et.al. | 2403.04693v1 | null |
2024-03-07 | CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios | Qilang Ye et.al. | 2403.04640v1 | link |
2024-03-07 | Scalable, Simulation-Guided Compliant Tactile Finger Design | Yuxiang Ma et.al. | 2403.04638v1 | null |
2024-03-08 | Pix2Gif: Motion-Guided Diffusion for GIF Generation | Hitesh Kandala et.al. | 2403.04634v2 | null |
2024-03-07 | MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder | Lei Li et.al. | 2403.04626v1 | null |
2024-03-06 | 3D Diffusion Policy | Yanjie Ze et.al. | 2403.03954v1 | link |
2024-03-06 | Stop Regressing: Training Value Functions via Classification for Scalable Deep RL | Jesse Farebrother et.al. | 2403.03950v1 | null |
2024-03-06 | Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation | Marcel Torne et.al. | 2403.03949v1 | null |
2024-03-06 | DART: Implicit Doppler Tomography for Radar Novel View Synthesis | Tianshu Huang et.al. | 2403.03896v1 | null |
2024-03-06 | Joint multi-task learning improves weakly-supervised biomarker prediction in computational pathology | Omar S. M. El Nahhas et.al. | 2403.03891v1 | link |
2024-03-06 | Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation | Xiao Ma et.al. | 2403.03890v1 | null |
2024-03-06 | Decoupled Vertical Federated Learning for Practical Training on Vertically Partitioned Data | Avi Amalanshu et.al. | 2403.03871v1 | null |
2024-03-06 | X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification | Hanzi Xu et.al. | 2403.03863v1 | link |
2024-03-06 | ProxNF: Neural Field Proximal Training for High-Resolution 4D Dynamic Image Reconstruction | Luke Lozenski et.al. | 2403.03860v1 | null |
2024-03-06 | MedMamba: Vision Mamba for Medical Image Classification | Yubiao Yue et.al. | 2403.03849v1 | link |
2024-03-05 | Extension Theory and Fermionic Strongly Fusion 2-Categories | Thibault D. Décoppet et.al. | 2403.03211v1 | null |
2024-03-05 | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | Patrick Esser et.al. | 2403.03206v1 | null |
2024-03-05 | Behavior Generation with Latent Actions | Seungjae Lee et.al. | 2403.03181v1 | link |
2024-03-05 | Deep-Learned Compression for Radio-Frequency Signal Classification | Armani Rodriguez et.al. | 2403.03150v1 | null |
2024-03-05 | Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization | Yuxin Guo et.al. | 2403.03145v1 | link |
2024-03-05 | Motion-Corrected Moving Average: Including Post-Hoc Temporal Information for Improved Video Segmentation | Robert Mendel et.al. | 2403.03120v1 | null |
2024-03-05 | Equilibria in Two-Stage Facility Location with Atomic Clients | Simon Krogmann et.al. | 2403.03114v1 | null |
2024-03-05 | Galaxies in the Zone of Avoidance: Misclassifications using machine learning tools | P. Marchant Cortés et.al. | 2403.03098v1 | null |
2024-03-05 | Collective self-caging of active filaments in virtual confinement | Maximilian Kurjahn et.al. | 2403.03093v1 | null |
2024-03-05 | A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives | Simone Alberto Peirone et.al. | 2403.03037v1 | null |
2024-03-03 | Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model | Rui Yang et.al. | 2403.01362v1 | null |
2024-03-02 | Improve Cost Efficiency of Active Learning over Noisy Dataset | Zan-Kai Chong et.al. | 2403.01346v1 | null |
2024-03-02 | An eternal hypersurface flow arising in centro-affine geometry | Xinjie Jiang et.al. | 2403.01340v1 | null |
2024-03-02 | Image-Based Dietary Assessment: A Healthy Eating Plate Estimation System | Assylzhan Izbassar et.al. | 2403.01310v1 | null |
2024-03-02 | VNLP: Turkish NLP Package | Meliksah Turker et.al. | 2403.01309v1 | null |
2024-03-02 | Towards a classification of | Alyson Deines et.al. | 2403.01287v1 | null |
2024-03-02 | | Irfan Habib et.al. | 2403.01285v1 | null |
2024-03-02 | Fast Low-parameter Video Activity Localization in Collaborative Learning Environments | Venkatesh Jatla et.al. | 2403.01281v1 | null |
2024-03-02 | Rigidity results for group von Neumann algebras with diffuse center | Ionuţ Chifan et.al. | 2403.01280v1 | null |
2024-03-02 | Can a Confident Prior Replace a Cold Posterior? | Martin Marek et.al. | 2403.01272v1 | link |
2024-02-29 | Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Tsai-Shien Chen et.al. | 2402.19479v1 | null |
2024-02-29 | Towards Generalizable Tumor Synthesis | Qi Chen et.al. | 2402.19470v1 | null |
2024-02-29 | Humanoid Locomotion as Next Token Prediction | Ilija Radosavovic et.al. | 2402.19469v1 | null |
2024-03-01 | TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning | Kate Sanders et.al. | 2402.19467v2 | null |
2024-02-29 | Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models | Frederik Kunstner et.al. | 2402.19449v1 | null |
2024-02-29 | Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems | Quentin Raymondaud et.al. | 2402.19443v1 | null |
2024-02-29 | Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation | Jonathan Yang et.al. | 2402.19432v1 | null |
2024-02-29 | PaECTER: Patent-level Representation Learning using Citation-informed Transformers | Mainak Ghosh et.al. | 2402.19411v1 | null |
2024-02-29 | Navigating Hallucinations for Reasoning of Unintentional Activities | Shresth Grover et.al. | 2402.19405v1 | null |
2024-02-29 | A Newborn AGN in a Starforming Galaxy | P. Arévalo et.al. | 2402.19403v1 | null |
2024-02-28 | Time-efficient filtering of polarimetric data by checking physical realizability of experimental Mueller matrices | Tatiana Novikova et.al. | 2402.18555v1 | null |
2024-02-28 | Selection of appropriate multispectral camera exposure settings and radiometric calibration methods for applications in phenotyping and precision agriculture | Vaishali Swaminathan et.al. | 2402.18553v1 | null |
2024-02-28 | Implicit Bias of Next-Token Prediction | Christos Thrampoulidis et.al. | 2402.18551v1 | null |
2024-02-28 | Defect Detection in Tire X-Ray Images: Conventional Methods Meet Deep Structures | Andrei Cozma et.al. | 2402.18527v1 | null |
2024-02-28 | Do galaxy mergers prefer under-dense environments? | U. Sureshkumar et.al. | 2402.18520v1 | null |
2024-02-28 | Log Neural Controlled Differential Equations: The Lie Brackets Make a Difference | Benjamin Walker et.al. | 2402.18512v1 | null |
2024-02-28 | Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling | Mahdi Karami et.al. | 2402.18508v1 | null |
2024-02-28 | Detection of Micromobility Vehicles in Urban Traffic Videos | Khalil Sabri et.al. | 2402.18503v1 | link |
2024-02-28 | Few-Shot Fairness: Unveiling LLM's Potential for Fairness-Aware Classification | Garima Chhikara et.al. | 2402.18502v1 | null |
2024-02-28 | ROG$_{PL}$: Robust Open-Set Graph Learning via Region-Based Prototype Learning | Qin Zhang et.al. | 2402.18495v1 | null |
2024-02-27 | Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning | Xiaoyu Zhang et.al. | 2402.17768v1 | null |
2024-02-27 | Towards Optimal Learning of Language Models | Yuxian Gu et.al. | 2402.17759v1 | null |
2024-02-27 | An Eye Gaze Heatmap Analysis of Uncertainty Head-Up Display Designs for Conditional Automated Driving | Michael A. Gerber et.al. | 2402.17751v1 | null |
2024-02-27 | Scaling on-chip photonic neural processors using arbitrarily programmable wave propagation | Tatsuhiro Onodera et.al. | 2402.17750v1 | link |
2024-02-27 | Linking Order to Strength in Metals | Nicolas Argibay et.al. | 2402.17728v1 | null |
2024-02-27 | MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation | Hanan Gani et.al. | 2402.17725v1 | link |
2024-02-27 | Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners | Yazhou Xing et.al. | 2402.17723v1 | null |
2024-02-27 | Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers | Yiwei Lu et.al. | 2402.17710v1 | null |
2024-02-27 | NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents | Tamara Czinczoll et.al. | 2402.17682v1 | null |
2024-02-27 | MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning | Huiyu Xiong et.al. | 2402.17680v1 | null |
2024-02-26 | Open Your Ears to Take a Look: A State-of-the-Art Report on the Integration of Sonification and Visualization | Kajetan Enge et.al. | 2402.16558v1 | null |
2024-02-26 | LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification | Yiping Song et.al. | 2402.16515v1 | null |
2024-02-26 | Photonic Neural Network Fabricated on Thin Film Lithium Niobate for High-Fidelity and Power-Efficient Matrix Computation | Yong Zheng et.al. | 2402.16513v1 | null |
2024-02-26 | Intelligent Known and Novel Aircraft Recognition -- A Shift from Classification to Similarity Learning for Combat Identification | Ahmad Saeed et.al. | 2402.16486v1 | null |
2024-02-26 | Edge Detectors Can Make Deep Convolutional Neural Networks More Robust | Jin Ding et.al. | 2402.16479v1 | null |
2024-02-26 | Autonomous Integration of TSN-unaware Applications with QoS Requirements in TSN Networks | Moritz Fluechter et.al. | 2402.16454v1 | null |
2024-02-26 | Retrouver l'inventeur-auteur : la levée d'homonymies d'autorat entre les brevets et les publications scientifiques | David Reymond et.al. | 2402.16440v1 | null |
2024-02-26 | Improving behavior based authentication against adversarial attack using XAI | Dong Qin et.al. | 2402.16430v1 | null |
2024-02-26 | Adaptive Online Learning of Separable Path Graph Transforms for Intra-prediction | Wen-Yang Lu et.al. | 2402.16371v1 | null |
2024-02-26 | DEYO: DETR with YOLO for End-to-End Object Detection | Haodong Ouyang et.al. | 2402.16370v1 | null |
2024-02-26 | SPINEPS -- Automatic Whole Spine Segmentation of T2-weighted MR images using a Two-Phase Approach to Multi-class Semantic and Instance Segmentation | Hendrik Möller et.al. | 2402.16368v1 | link |
2024-02-26 | An Integrated Data Processing Framework for Pretraining Foundation Models | Yiding Sun et.al. | 2402.16358v1 | link |
2024-02-26 | What Text Design Characterizes Book Genres? | Daichi Haraguchi et.al. | 2402.16356v1 | null |
2024-02-23 | A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends | Abolfazl Younesi et.al. | 2402.15490v1 | null |
2024-02-23 | Retinotopic Mapping Enhances the Robustness of Convolutional Neural Networks | Jean-Nicolas Jérémie et.al. | 2402.15480v1 | null |
2024-02-23 | FAIR: Filtering of Automatically Induced Rules | Divya Jyoti Bajpai et.al. | 2402.15472v1 | null |
2024-02-23 | GROS: A General Robust Aggregation Strategy | Alejandro Cholaquidis et.al. | 2402.15442v1 | null |
2024-02-23 | Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales | Shuren Qi et.al. | 2402.15430v1 | link |
2024-02-23 | ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation | Yi Zhang et.al. | 2402.15429v1 | link |
2024-02-23 | Understanding Entrainment in Human Groups: Optimising Human-Robot Collaboration from Lessons Learned during Human-Human Collaboration | Eike Schneiders et.al. | 2402.15427v1 | null |
2024-02-23 | PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning | Simon Holk et.al. | 2402.15420v1 | null |
2024-02-23 | G-RepsNet: A Fast and General Construction of Equivariant Networks for Arbitrary Matrix Groups | Sourya Basu et.al. | 2402.15413v1 | null |
2024-02-23 | A Universal Method for Solar Filament Detection from H-alpha Observations using Semi-supervised Deep Learning | Andrea Diercke et.al. | 2402.15407v1 | null |
2024-02-22 | Link Prediction under Heterophily: A Physics-Inspired Graph Neural Network Approach | Andrea Giuseppe Di Francesco et.al. | 2402.14802v1 | null |
2024-02-22 | Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | Willi Menapace et.al. | 2402.14797v1 | null |
2024-02-22 | Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models | Yixuan Ren et.al. | 2402.14780v1 | null |
2024-02-22 | Zero-Shot Pediatric Tuberculosis Detection in Chest X-Rays using Self-Supervised Learning | Daniel Capellán-Martín et.al. | 2402.14741v1 | null |
2024-02-22 | Solitons of the mean curvature flow in | Rafael López et.al. | 2402.14727v1 | null |
2024-02-22 | A Transformer Model for Boundary Detection in Continuous Sign Language | Razieh Rastgoo et.al. | 2402.14720v1 | null |
2024-02-22 | InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks | Somnath Banerjee et.al. | 2402.14702v1 | null |
2024-02-22 | Big data analytics to classify earthwork-related locations: A Chengdu study | Lei Yu et.al. | 2402.14698v1 | null |
2024-02-22 | Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off | Futa Waseda et.al. | 2402.14648v1 | null |
2024-02-22 | Distributed Radiance Fields for Edge Video Compression and Metaverse Integration in Autonomous Driving | Eugen Šlapak et.al. | 2402.14642v1 | null |
2024-02-21 | A Simple and Yet Fairly Effective Defense for Graph Neural Networks | Sofiane Ennadir et.al. | 2402.13987v1 | link |
2024-02-21 | On modular representations of inner forms of | Johannes Droschl et.al. | 2402.13969v1 | null |
2024-02-21 | New directions in algebraic statistics: Three challenges from 2023 | Yulia Alexandr et.al. | 2402.13961v1 | null |
2024-02-21 | On the topological classification of complex plane curve singularities | Alberto Fernández-Hernández et.al. | 2402.13941v1 | null |
2024-02-21 | Verifying message-passing neural networks via topology-based bounds tightening | Christopher Hojny et.al. | 2402.13937v1 | null |
2024-02-21 | Tumor segmentation on whole slide images: training or prompting? | Huaqian Wu et.al. | 2402.13932v1 | null |
2024-02-21 | BenchCloudVision: A Benchmark Analysis of Deep Learning Approaches for Cloud Detection and Segmentation in Remote Sensing Imagery | Loddo Fabio et.al. | 2402.13918v1 | link |
2024-02-21 | An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach | Mohammad Amaz Uddin et.al. | 2402.13871v1 | null |
2024-02-21 | RFI-DRUnet: Restoring dynamic spectra corrupted by radio frequency interference -- Application to pulsar observations | Xiao Zhang et.al. | 2402.13867v1 | null |
2024-02-21 | What we can learn from TikTok through its Research API | Francesco Corso et.al. | 2402.13855v1 | null |
2024-02-20 | Video ReCap: Recursive Captioning of Hour-Long Videos | Md Mohaiminul Islam et.al. | 2402.13250v1 | null |
2024-02-20 | SMORE: Similarity-based Hyperdimensional Domain Adaptation for Multi-Sensor Time Series Classification | Junyao Wang et.al. | 2402.13233v1 | null |
2024-02-20 | A Touch, Vision, and Language Dataset for Multimodal Alignment | Letian Fu et.al. | 2402.13232v1 | null |
2024-02-20 | NeRF Solves Undersampled MRI Reconstruction | Tae Jun Jang et.al. | 2402.13226v1 | null |
2024-02-20 | VideoPrism: A Foundational Visual Encoder for Video Understanding | Long Zhao et.al. | 2402.13217v1 | null |
2024-02-20 | How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena | Marco Gaido et.al. | 2402.13208v1 | null |
2024-02-20 | A novel image correction method for cloud-affected observations with Imaging Atmospheric Cherenkov Telescopes | Natalia Żywucka et.al. | 2402.13190v1 | null |
2024-02-20 | UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing | Jianhong Bai et.al. | 2402.13185v1 | null |
2024-02-20 | DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models | Norman Di Palo et.al. | 2402.13181v1 | null |
2024-02-20 | 3D Kinematics Estimation from Video with a Biomechanical Model and Synthetic Training Data | Zhi-Yi Lin et.al. | 2402.13172v1 | null |
2024-02-19 | Short-Period Variables in TESS Full-Frame Image Light Curves Identified via Convolutional Neural Networks | Greg Olmschenk et.al. | 2402.12369v1 | null |
2024-02-19 | The first all-sky survey of star-forming galaxies with eROSITA: Scaling relations and a population of X-ray luminous starbursts | E. Kyritsis et.al. | 2402.12367v1 | null |
2024-02-19 | An Adversarial Approach to Evaluating the Robustness of Event Identification Models | Obai Bahwal et.al. | 2402.12338v1 | null |
2024-02-19 | Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models | Christian Schlarmann et.al. | 2402.12336v1 | link |
2024-02-19 | Generating Survival Interpretable Trajectories and Data | Andrei V. Konstantinov et.al. | 2402.12331v1 | null |
2024-02-19 | Asymptotic Gaussian Fluctuations of Eigenvectors in Spectral Clustering | Hugo Lebeau et.al. | 2402.12302v1 | null |
2024-02-19 | Time-periodic behaviour in one- and two-dimensional interacting particle systems | Jonas Köppl et.al. | 2402.12300v1 | null |
2024-02-19 | Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports | Felix J. Dorfner et.al. | 2402.12298v1 | null |
2024-02-19 | Revisiting registration-based synthesis: A focus on unsupervised MR image synthesis | Savannah P. Hays et.al. | 2402.12288v1 | null |
2024-02-19 | Significance of Chirp MFCC as a Feature in Speech and Audio Applications | S. Johanan Joysingh et.al. | 2402.12239v1 | null |
2024-02-16 | PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter | Junfei Xiao et.al. | 2402.10896v1 | null |
2024-02-16 | Fusion of Diffusion Weighted MRI and Clinical Data for Predicting Functional Outcome after Acute Ischemic Stroke with Deep Contrastive Learning | Chia-Ling Tsai et.al. | 2402.10894v1 | null |
2024-02-16 | Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation | Ziyang Wang et.al. | 2402.10887v1 | link |
2024-02-16 | Control Color: Multimodal Diffusion-based Interactive Image Colorization | Zhexin Liang et.al. | 2402.10855v1 | null |
2024-02-16 | HistoSegCap: Capsules for Weakly-Supervised Semantic Segmentation of Histological Tissue Type in Whole Slide Images | Mobina Mansoori et.al. | 2402.10851v1 | null |
2024-02-16 | FedD2S: Personalized Data-Free Federated Knowledge Distillation | Kawa Atapour et.al. | 2402.10846v1 | null |
2024-02-16 | Pedipulate: Enabling Manipulation Skills using a Quadruped Robot's Leg | Philip Arm et.al. | 2402.10837v1 | null |
2024-02-16 | GAN-driven Electromagnetic Imaging of 2-D Dielectric Scatterers | Ehtasham Naseer et.al. | 2402.10831v1 | null |
2024-02-16 | Structure results for torus fixed loci | Jarod Alper et.al. | 2402.10823v1 | null |
2024-02-16 | Training Class-Imbalanced Diffusion Model Via Overlap Optimization | Divin Yan et.al. | 2402.10821v1 | link |
2024-02-15 | Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling | Raunaq Bhirangi et.al. | 2402.10211v1 | null |
2024-02-15 | FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients | Xinchi Qiu et.al. | 2402.10191v1 | null |
2024-02-15 | Euclid preparation. Measuring detailed galaxy morphologies for Euclid with Machine Learning | Euclid Collaboration et.al. | 2402.10187v1 | link |
2024-02-15 | DeepSRGM -- Sequence Classification and Ranking in Indian Classical Music with Deep Learning | Sathwik Tejaswi Madhusudhan et.al. | 2402.10168v1 | null |
2024-02-15 | Holographic covering and the fortuity of black holes | Chi-Ming Chang et.al. | 2402.10129v1 | null |
2024-02-15 | Classification Diffusion Models | Shahar Yadin et.al. | 2402.10095v1 | null |
2024-02-15 | MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations | Benedikt Alkin et.al. | 2402.10093v1 | link |
2024-02-15 | GraphCBAL: Class-Balanced Active Learning for Graph Neural Networks via Reinforcement Learning | Chengcheng Yu et.al. | 2402.10074v1 | null |
2024-02-15 | Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence | Weixiang Zhao et.al. | 2402.10073v1 | null |
2024-02-15 | NYCTALE: Neuro-Evidence Transformer for Adaptive and Personalized Lung Nodule Invasiveness Prediction | Sadaf Khademi et.al. | 2402.10066v1 | null |
2024-02-14 | LL-GABR: Energy Efficient Live Video Streaming Using Reinforcement Learning | Adithya Raman et.al. | 2402.09392v1 | null |
2024-02-14 | GraSSRep: Graph-Based Self-Supervised Learning for Repeat Detection in Metagenomic Assembly | Ali Azizpour et.al. | 2402.09381v1 | link |
2024-02-14 | Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge | Jiancheng Yang et.al. | 2402.09372v1 | null |
2024-02-14 | Magic-Me: Identity-Specific Video Customized Diffusion | Ze Ma et.al. | 2402.09368v1 | null |
2024-02-14 | Small instanton-induced flavor invariants and the axion potential | Ravneet Bedi et.al. | 2402.09361v1 | null |
2024-02-14 | Pruning Sparse Tensor Neural Networks Enables Deep Learning for 3D Ultrasound Localization Microscopy | Brice Rauby et.al. | 2402.09359v1 | null |
2024-02-14 | DoRA: Weight-Decomposed Low-Rank Adaptation | Shih-Yang Liu et.al. | 2402.09353v1 | null |
2024-02-14 | Irreducible representations of the crystallisation of the | Manabendra Giri et.al. | 2402.09347v1 | null |
2024-02-14 | Registration of Longitudinal Spine CTs for Monitoring Lesion Growth | Malika Sanhinova et.al. | 2402.09341v1 | null |
2024-02-14 | Stability and Multigroup Fairness in Ranking with Uncertain Predictions | Siddartha Devic et.al. | 2402.09326v1 | null |
2024-02-13 | IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation | Luke Melas-Kyriazi et.al. | 2402.08682v1 | null |
2024-02-13 | A Convergence Analysis of Approximate Message Passing with Non-Separable Functions and Applications to Multi-Class Classification | Burak Çakmak et.al. | 2402.08676v1 | null |
2024-02-13 | Learning Emergent Gaits with Decentralized Phase Oscillators: on the role of Observations, Rewards, and Feedback | Jenny Zhang et.al. | 2402.08662v1 | null |
2024-02-13 | BdSLW60: A Word-Level Bangla Sign Language Dataset | Husne Ara Rubaiyeat et.al. | 2402.08635v1 | link |
2024-02-13 | Convolutional Neural Networks Towards Facial Skin Lesions Detection | Reza Sarshar et.al. | 2402.08592v1 | null |
2024-02-13 | Totally geodesic submanifolds and polar actions on Stiefel manifolds | Claudio Gorodski et.al. | 2402.08585v1 | null |
2024-02-13 | Motion-Adaptive Inference for Flexible Learned B-Frame Compression | M. Akin Yilmaz et.al. | 2402.08550v1 | null |
2024-02-13 | Approximately Piecewise E(3) Equivariant Point Networks | Matan Atzmon et.al. | 2402.08529v1 | null |
2024-02-13 | Reduced-order modeling of the dynamics of an inverted flag from experimental data | Zhenwei Xu et.al. | 2402.08504v1 | null |
2024-02-13 | Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models | Shaeke Salman et.al. | 2402.08473v1 | null |
2024-02-13 | Wavefront Randomization Improves Deconvolution | Amit Kohli et.al. | 2402.07900v2 | null |
2024-02-12 | Detection of Spider Mites on Labrador Beans through Machine Learning Approaches Using Custom Datasets | Violet Liu et.al. | 2402.07895v1 | null |
2024-02-12 | Perfect stable regularity lemma and slice-wise stable hypergraphs | Artem Chernikov et.al. | 2402.07870v1 | null |
2024-02-12 | On Computationally Efficient Multi-Class Calibration | Parikshit Gopalan et.al. | 2402.07821v1 | null |
2024-02-12 | A Benchmark Grocery Dataset of Realworld Point Clouds From Single View | Shivanand Venkanna Sheshappanavar et.al. | 2402.07819v1 | null |
2024-02-12 | Fixation for | Laure Marêché et.al. | 2402.07807v1 | null |
2024-02-12 | Estimation of non-uniform blur using a patch-based regression convolutional neural network (CNN) | Luis G. Varela et.al. | 2402.07796v1 | null |
2024-02-12 | "Layer-by-layer" Unsupervised Clustering of Statistically Relevant Fluctuations in Noisy Time-series Data of Complex Dynamical Systems | Matteo Becchi et.al. | 2402.07786v1 | null |
2024-02-12 | Solving parameter-dependent semi-algebraic systems | Louis Gaillard et.al. | 2402.07782v1 | null |
2024-02-12 | Observations of the new meteor shower from comet 46P/Wirtanen | D. Vida et.al. | 2402.07769v1 | null |
2024-02-09 | A two-stage algorithm in evolutionary product unit neural networks for classification | Antonio J. Tallón-Ballesteros et.al. | 2402.06622v1 | null |
2024-02-09 | Image-based Deep Learning for the time-dependent prediction of fresh concrete properties | Max Meyer et.al. | 2402.06611v1 | null |
2024-02-09 | SAE: Single Architecture Ensemble Neural Networks | Martin Ferianc et.al. | 2402.06580v1 | null |
2024-02-09 | Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning | Amir Ziai et.al. | 2402.06560v1 | link |
2024-02-09 | Self Supervised Learning for Improved Calibrationless Radial MRI with NLINV-Net | Moritz Blumenthal et.al. | 2402.06550v1 | null |
2024-02-09 | Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA | Marek Šuppa et.al. | 2402.06549v1 | null |
2024-02-09 | Feature Density Estimation for Out-of-Distribution Detection via Normalizing Flows | Evan D. Cook et.al. | 2402.06537v1 | null |
2024-02-09 | Refining Myocardial Infarction Detection: A Novel Multi-Modal Composite Kernel Strategy in One-Class Classification | Muhammad Uzair Zahid et.al. | 2402.06530v1 | null |
2024-02-09 | Flexible infinite-width graph convolutional networks and the importance of representation learning | Ben Anson et.al. | 2402.06525v1 | null |
2024-02-09 | Dynamic swarms regulate the morphology and distribution of soft membrane domains | Aakanksha Gubbala et.al. | 2402.06518v1 | null |
2024-02-08 | Classifying Nodes in Graphs without GNNs | Daniel Winter et.al. | 2402.05934v1 | link |
2024-02-08 | An Interactive Agent Foundation Model | Zane Durante et.al. | 2402.05929v1 | null |
2024-02-08 | Point-VOS: Pointing Up Video Object Segmentation | Idil Esen Zulfikar et.al. | 2402.05917v1 | null |
2024-02-08 | A Survey on Detection, Classification, and Tracking of Aerial Threats using Radar and Communications Systems | Wahab Khawaja et.al. | 2402.05909v1 | null |
2024-02-09 | Large Language Model Meets Graph Neural Network in Knowledge Distillation | Shengxiang Hu et.al. | 2402.05894v2 | null |
2024-02-08 | Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data | Shufan Li et.al. | 2402.05892v1 | null |
2024-02-08 | CREMA: Multimodal Compositional Video Reasoning via Efficient Modular Adaptation and Fusion | Shoubin Yu et.al. | 2402.05889v1 | null |
2024-02-08 | Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers | Onur G. Guleryuz et.al. | 2402.05887v1 | link |
2024-02-08 | GET-Tok: A GenAI-Enriched Multimodal TikTok Dataset Documenting the 2022 Attempted Coup in Peru | Gabriela Pinto et.al. | 2402.05882v1 | link |
2024-02-08 | You've Got to Feel It To Believe It: Multi-Modal Bayesian Inference for Semantic and Property Prediction | Parker Ewen et.al. | 2402.05872v1 | null |
2024-02-07 | Edu-ConvoKit: An Open-Source Library for Education Conversation Data | Rose E. Wang et.al. | 2402.05111v1 | link |
2024-02-07 | Moduli Parameters of Complex Singularities with Non-Degenerate Newton Boundary | Janko Boehm et.al. | 2402.05093v1 | null |
2024-02-07 | Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation | Ziyang Wang et.al. | 2402.05079v1 | link |
2024-02-07 | Arbitrary Scale Super-Resolution Assisted Lunar Crater Detection in Satellite Images | Atal Tewari et.al. | 2402.05068v1 | null |
2024-02-07 | Efficient Multi-Resolution Fusion for Remote Sensing Data with Label Uncertainty | Hersh Vakharia et.al. | 2402.05045v1 | link |
2024-02-07 | PAC Learnability under Explanation-Preserving Graph Perturbations | Xu Zheng et.al. | 2402.05039v1 | null |
2024-02-07 | Strong convexity-guided hyper-parameter optimization for flatter losses | Rahul Yedida et.al. | 2402.05025v1 | null |
2024-02-07 | Example-based Explanations for Random Forests using Machine Unlearning | Tanmay Surve et.al. | 2402.05007v1 | null |
2024-02-07 | Randomized Confidence Bounds for Stochastic Partial Monitoring | Maxime Heuillet et.al. | 2402.05002v1 | null |
2024-02-07 | Beyond explaining: XAI-based Adaptive Learning with SHAP Clustering for Energy Consumption Prediction | Tobias Clement et.al. | 2402.04982v1 | null |
2024-02-06 | EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters | Quan Sun et.al. | 2402.04252v1 | link |
2024-02-06 | The spectrum of excisive functors | Gregory Arone et.al. | 2402.04244v1 | null |
2024-02-06 | A classification of nonzero skew immaculate functions | Sarah Mason et.al. | 2402.04219v1 | null |
2024-02-06 | Resource-Aware Hierarchical Federated Learning in Wireless Video Caching Networks | Md Ferdous Pervej et.al. | 2402.04216v1 | null |
2024-02-06 | "Task Success" is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors | Lin Guan et.al. | 2402.04210v1 | null |
2024-02-06 | 3D Volumetric Super-Resolution in Radiology Using 3D RRDB-GAN | Juhyung Ha et.al. | 2402.04171v1 | null |
2024-02-06 | Human Emotions Analysis and Recognition Using EEG Signals in Response to 360$^\circ$ Videos | Haseeb ur Rahman Abbasi et.al. | 2402.04142v1 | null |
2024-02-06 | Hierarchical Delay Attribution Classification using Unstructured Text in Train Management Systems | Anton Borg et.al. | 2402.04108v1 | null |
2024-02-06 | Analysis of Deep Image Prior and Exploiting Self-Guidance for Image Reconstruction | Shijun Liang et.al. | 2402.04097v1 | null |
2024-02-06 | A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation | Zhengbo Wang et.al. | 2402.04087v1 | link |
2024-02-05 | Multiclass Classification Procedure for Detecting Attacks on MQTT-IoT Protocol | Hector Alaiz-Moreton et.al. | 2402.03270v1 | null |
2024-02-05 | Security Advice for Parents and Children About Content Filtering and Circumvention as Found on YouTube and TikTok | Ran Elgedawy et.al. | 2402.03255v1 | null |
2024-02-05 | JOBSKAPE: A Framework for Generating Synthetic Job Postings to Enhance Skill Matching | Antoine Magron et.al. | 2402.03242v1 | link |
2024-02-05 | FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition | Xiaohu Huang et.al. | 2402.03241v1 | null |
2024-02-05 | IGUANe: a 3D generalizable CycleGAN for multicenter harmonization of brain MR images | Vincent Roca et.al. | 2402.03227v1 | null |
2024-02-05 | English Prompts are Better for NLI-based Zero-Shot Emotion Classification than Target-Language Prompts | Patrick Barreiß et.al. | 2402.03223v1 | null |
2024-02-05 | "Define Your Terms" : Enhancing Efficient Offensive Speech Classification with Definition | Huy Nghiem et.al. | 2402.03221v1 | link |
2024-02-05 | Isotropy, Clusters, and Classifiers | Timothee Mickus et.al. | 2402.03191v1 | null |
2024-02-06 | Cool-chic video: Learned video coding with 800 parameters | Thomas Leguay et.al. | 2402.03179v2 | null |
2024-02-05 | Accurate and Well-Calibrated ICD Code Assignment Through Attention Over Diverse Label Embeddings | Gonçalo Gomes et.al. | 2402.03172v1 | link |
2024-02-02 | From gas to stars: MUSEings on the internal evolution of IC 1613 | S. Taibi et.al. | 2402.01631v1 | null |
2024-02-02 | Truncation technique for variational quantum eigensolver for Molecular Hamiltonians | Qidong Xu et.al. | 2402.01630v1 | null |
2024-02-02 | L2G2G: a Scalable Local-to-Global Network Embedding with Graph Autoencoders | Ruikang Ouyang et.al. | 2402.01614v1 | link |
2024-02-02 | Immersive Video Compression using Implicit Neural Representations | Ho Man Kwan et.al. | 2402.01596v1 | link |
2024-02-02 | NeuroCine: Decoding Vivid Video Sequences from Human Brain Activties | Jingyuan Sun et.al. | 2402.01590v1 | null |
2024-02-02 | Boximator: Generating Rich and Controllable Motions for Video Synthesis | Jiawei Wang et.al. | 2402.01566v1 | null |
2024-02-02 | Deep Continuous Networks | Nergis Tomen et.al. | 2402.01557v1 | link |
2024-02-02 | SLYKLatent, a Learning Framework for Facial Features Estimation | Samuel Adebayo et.al. | 2402.01555v1 | null |
2024-02-02 | Advancing Brain Tumor Inpainting with Generative Models | Ruizhi Zhu et.al. | 2402.01509v1 | null |
2024-02-02 | Di-NeRF: Distributed NeRF for Collaborative Learning with Unknown Relative Poses | Mahboubeh Asadi et.al. | 2402.01485v1 | null |
2024-02-01 | We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline | Simar Kareer et.al. | 2402.00868v1 | link |
2024-02-01 | Deep Room Impulse Response Completion | Jackie Lin et.al. | 2402.00859v1 | null |
2024-02-01 | Early Time Classification with Accumulated Accuracy Gap Control | Liran Ringel et.al. | 2402.00857v1 | link |
2024-02-01 | BootsTAP: Bootstrapped Training for Tracking-Any-Point | Carl Doersch et.al. | 2402.00847v1 | link |
2024-02-01 | Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering | Pinxin Liu et.al. | 2402.00827v1 | null |
2024-02-01 | Examining the Influence of Digital Phantom Models in Virtual Imaging Trials for Tomographic Breast Imaging | Amar Kavuri et.al. | 2402.00812v1 | null |
2024-02-01 | ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models | Zhixue Zhao et.al. | 2402.00794v1 | link |
2024-02-01 | Distinguishing the Indistinguishable: Human Expertise in Algorithmic Prediction | Rohan Alur et.al. | 2402.00793v1 | link |
2024-02-02 | CroissantLLM: A Truly Bilingual French-English Language Model | Manuel Faysse et.al. | 2402.00786v2 | link |
2024-02-01 | Hybrid Quantum Vision Transformers for Event Classification in High Energy Physics | Eyup B. Unlu et.al. | 2402.00776v1 | null |
2024-01-31 | Classification-Oriented Semantic Wireless Communications | Emrecan Kutay et.al. | 2401.18069v1 | null |
2024-01-31 | Rank Supervised Contrastive Learning for Time Series Classification | Qianying Ren et.al. | 2401.18057v1 | null |
2024-01-31 | Variable selection for Naïve Bayes classification | Rafael Blanquero et.al. | 2401.18039v1 | null |
2024-01-31 | Optimizing contrastive learning for cortical folding pattern detection | Aymeric Gaudin et.al. | 2401.18035v1 | null |
2024-01-31 | A Neural Enhancement Post-Processor with a Dynamic AV1 Encoder Configuration Strategy for CLIC 2024 | Darren Ramsook et.al. | 2401.18021v1 | null |
2024-01-31 | EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation | Jonathan W. Kim et.al. | 2401.18006v1 | null |
2024-01-31 | Unsupervised Learning of Topological Non-Abelian Braiding in Non-Hermitian Bands | Yang Long et.al. | 2401.17968v1 | null |
2024-01-31 | Error-Tolerant E-Discovery Protocols | Jinshuo Dong et.al. | 2401.17952v1 | null |
2024-01-31 | HyperZ$\cdot$Z$\cdot$W Operator Connects Slow-Fast Networks for Full Context Interaction | Harvie Zhang et.al. | 2401.17948v1 | null |
2024-01-31 | Probabilistic Photonic Computing with Chaotic Light | Frank Brückerhoff-Plückelmann et.al. | 2401.17915v1 | null |
2024-01-30 | The SRG/eROSITA all-sky survey: Hard X-ray selected Active Galactic Nuclei | Sophia G. H. Waddell et.al. | 2401.17306v1 | null |
2024-01-30 | Compact white-dwarf binaries in the combined SRG/eROSITA/SDSS eFEDS survey | A. Schwope et.al. | 2401.17304v1 | null |
2024-01-30 | Searching for X-ray counterparts of unassociated Fermi-LAT sources and rotation-powered pulsars with SRG/eROSITA | Martin G. F. Mayer et.al. | 2401.17295v1 | null |
2024-01-30 | X-ray AGNs with SRG/eROSITA: Multi-wavelength observations reveal merger triggering and post-coalescence circumnuclear blowout | Robert W. Bickley et.al. | 2401.17277v1 | null |
2024-01-30 | ReacLLaMA: Merging chemical and textual information in chemical reactivity AI models | Aline Hartgers et.al. | 2401.17267v1 | null |
2024-01-30 | SLIC: A Learned Image Codec Using Structure and Color | Srivatsa Prativadibhayankaram et.al. | 2401.17246v1 | link |
2024-01-31 | Faster coloring and embedding in dense hypergraphs via stability | Jianfeng Hou et.al. | 2401.17219v2 | null |
2024-01-31 | GazeGPT: Augmenting Human Capabilities using Gaze-contingent Contextual AI for Smart Eyewear | Robert Konrad et.al. | 2401.17217v2 | null |
2024-01-30 | Single Word Change is All You Need: Designing Attacks and Defenses for Text Classifiers | Lei Xu et.al. | 2401.17196v1 | null |
2024-01-30 | GraphViz2Vec: A Structure-aware Feature Generation Model to Improve Classification in GNNs | Shraban Kumar Chatterjee et.al. | 2401.17178v1 | null |
2024-01-29 | Computer Vision for Primate Behavior Analysis in the Wild | Richard Vogg et.al. | 2401.16424v1 | null |
2024-01-29 | Synchformer: Efficient Synchronization from Sparse Cues | Vladimir Iashin et.al. | 2401.16423v1 | null |
2024-01-29 | Strategic Usage in a Multi-Learner Setting | Eliot Shekhtman et.al. | 2401.16422v1 | null |
2024-01-29 | ReTaSA: A Nonparametric Functional Estimation Approach for Addressing Continuous Target Shift | Hwanwoo Kim et.al. | 2401.16410v1 | null |
2024-01-29 | Is K-fold cross validation the best model selection method for Machine Learning? | Juan M Gorriz et.al. | 2401.16407v1 | null |
2024-01-29 | Zero-shot Imitation Policy via Search in Demonstration Dataset | Federico Malato et.al. | 2401.16398v1 | null |
2024-01-29 | Ovarian Cancer Diagnostics using Wavelet Packet Scaling Descriptors | Raymond J. Hinton Jr. et.al. | 2401.16396v1 | null |
2024-01-29 | Evaluation of pseudo-healthy image reconstruction for anomaly detection with deep generative models: Application to brain FDG PET | Ravi Hassanaly et.al. | 2401.16363v1 | link |
2024-01-29 | Curriculum-Based Reinforcement Learning for Quadrupedal Jumping: A Reference-free Design | Vassil Atanassov et.al. | 2401.16337v1 | null |
2024-01-29 | Making the unmodulated Pyramid wavefront sensor smart. Closed-loop demonstration of neural network wavefront reconstruction with MagAO-X | Rico Landman et.al. | 2401.16325v1 | null |
2024-01-26 | From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities | Chaochao Lu et.al. | 2401.15071v1 | null |
2024-01-26 | Deep learning-based approach for tomato classification in complex scenes | Mikael A. Mousse et.al. | 2401.15055v1 | null |
2024-01-26 | Non-Unitary | Pedro M. F. Pereira et.al. | 2401.15049v1 | null |
2024-01-26 | Machine learning-based analysis of glioma tissue sections: a review | Jan-Philipp Redlich et.al. | 2401.15022v1 | null |
2024-01-26 | Enhancement of a Text-Independent Speaker Verification System by using Feature Combination and Parallel-Structure Classifiers | Kerlos Atia Abdalmalak et.al. | 2401.15018v1 | null |
2024-01-26 | Graph-based Active Learning for Entity Cluster Repair | Victor Christen et.al. | 2401.14992v1 | null |
2024-01-26 | Stokes graphs of the Rabi problem with real parameters | René Langøen et.al. | 2401.14991v1 | null |
2024-01-26 | Minimum-dissipation principle for synchronised stochastic oscillators far from equilibrium | Jan Meibohm et.al. | 2401.14982v1 | null |
2024-01-26 | Microwave lymphedema assessment using deep learning with contour assisted backprojection | Yuyi Chang et.al. | 2401.14970v1 | null |
2024-01-26 | Hold Tight: Identifying Behavioral Patterns During Prolonged Work in VR through Video Analysis | Verena Biener et.al. | 2401.14920v1 | null |
2024-01-25 | Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities | Yiyuan Zhang et.al. | 2401.14405v1 | link |
2024-01-25 | Adaptive Mobile Manipulation for Articulated Objects In the Open World | Haoyu Xiong et.al. | 2401.14403v1 | null |
2024-01-25 | Range-Agnostic Multi-View Depth Estimation With Keyframe Selection | Andrea Conti et.al. | 2401.14401v1 | link |
2024-01-25 | Rethinking Patch Dependence for Masked Autoencoders | Letian Fu et.al. | 2401.14391v1 | null |
2024-01-25 | Smooth Ranking SVM via Cutting-Plane Method | Erhan Can Ozcan et.al. | 2401.14388v1 | link |
2024-01-25 | Inconsistency Masks: Removing the Uncertainty from Input-Pseudo-Label Pairs | Michael R. H. Vorndran et.al. | 2401.14387v1 | link |
2024-01-25 | A Comparative Analysis of Noise Reduction Methods in Sentiment Analysis on Noisy Bengali Texts | Kazi Toufique Elahi et.al. | 2401.14360v1 | link |
2024-01-25 | Computing Derivations on Nilpotent Quadratic Lie Algebras | Pilar Benito et.al. | 2401.14348v1 | null |
2024-01-25 | Class-attribute Priors: Adapting Optimization to Heterogeneity and Fairness Objective | Xuechen Zhang et.al. | 2401.14343v1 | null |
2024-01-25 | Progressive Multi-task Anti-Noise Learning and Distilling Frameworks for Fine-grained Vehicle Recognition | Dichao Liu et.al. | 2401.14336v1 | link |
2024-01-24 | Tyche: Stochastic In-Context Learning for Medical Image Segmentation | Marianne Rakic et.al. | 2401.13650v1 | null |
2024-01-24 | Quantifying the Impact of Frame Preemption on Combined TSN Shapers | Rubi Debnath et.al. | 2401.13631v1 | null |
2024-01-24 | Can overfitted deep neural networks in adversarial training generalize? -- An approximation viewpoint | Zhongjie Shi et.al. | 2401.13624v1 | null |
2024-01-24 | FLLIC: Functionally Lossless Image Compression | Xi Zhang et.al. | 2401.13616v1 | null |
2024-01-24 | Enhancing Image Retrieval : A Comprehensive Study on Photo Search using the CLIP Mode | Naresh Kumar Lahajal et.al. | 2401.13613v1 | null |
2024-01-24 | Prompt Weight Experiments for LLM Instruction Fine-Tuning | Mathew Huerta-Enochian et.al. | 2401.13586v1 | null |
2024-01-24 | WPDA: Frequency-based Backdoor Attack with Wavelet Packet Decomposition | Zhengyao Song et.al. | 2401.13578v1 | null |
2024-01-24 | CNN architecture extraction on edge GPU | Peter Horvath et.al. | 2401.13575v1 | null |
2024-01-24 | Benchmarking the Fairness of Image Upsampling Methods | Mike Laszkiewicz et.al. | 2401.13555v1 | null |
2024-01-24 | PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition | Otto Brookes et.al. | 2401.13554v1 | null |
2024-01-23 | SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI | Hanxue Gu et.al. | 2401.12974v1 | null |
2024-01-23 | On the Efficacy of Text-Based Input Modalities for Action Anticipation | Apoorva Beedu et.al. | 2401.12972v1 | null |
2024-01-23 | The role of environment and AGN feedback in quenching local galaxies: Comparing cosmological hydrodynamical simulations to the SDSS | Paul H. Goubert et.al. | 2401.12953v1 | null |
2024-01-23 | Lumiere: A Space-Time Diffusion Model for Video Generation | Omer Bar-Tal et.al. | 2401.12945v1 | null |
2024-01-23 | Long-range three-dimensional tracking of nanoparticles using interferometric scattering (iSCAT) microscopy | Kiarash Kasaian et.al. | 2401.12939v1 | null |
2024-01-23 | Neural deformation fields for template-based reconstruction of cortical surfaces from MRI | Fabian Bongratz et.al. | 2401.12938v1 | null |
2024-01-23 | Segmentation of tibiofemoral joint tissues from knee MRI using MtRA-Unet and incorporating shape information: Data from the Osteoarthritis Initiative | Akshay Daydar et.al. | 2401.12932v1 | null |
2024-01-23 | pyAKI - An Open Source Solution to Automated KDIGO classification | Christian Porschen et.al. | 2401.12930v1 | null |
2024-01-23 | Performance Analysis of Support Vector Machine (SVM) on Challenging Datasets for Forest Fire Detection | Ankan Kar et.al. | 2401.12924v1 | null |
2024-01-23 | Advancing Glitch Classification in Gravity Spy: Multi-view Fusion with Attention-based Machine Learning for Advanced LIGO's Fourth Observing Run | Yunan Wu et.al. | 2401.12913v1 | null |
2024-01-22 | Connecting the Dots: Leveraging Spatio-Temporal Graph Neural Networks for Accurate Bangla Sign Language Recognition | Haz Sameen Shahgir et.al. | 2401.12210v1 | null |
2024-01-22 | Unsupervised Machine Learning for the Classification of Astrophysical X-ray Sources | Víctor Samuel Pérez-Díaz et.al. | 2401.12203v1 | link |
2024-01-22 | OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics | Peiqi Liu et.al. | 2401.12202v1 | null |
2024-01-22 | In-Context Learning for Extreme Multi-Label Classification | Karel D'Oosterlinck et.al. | 2401.12178v1 | null |
2024-01-22 | Broiler-Net: A Deep Convolutional Framework for Broiler Behavior Analysis in Poultry Houses | Tahereh Zarrat Ehsan et.al. | 2401.12176v1 | link |
2024-01-22 | VRMN-bD: A Multi-modal Natural Behavior Dataset of Immersive Human Fear Responses in VR Stand-up Interactive Games | He Zhang et.al. | 2401.12133v1 | link |
2024-01-22 | Evaluation of QCNN-LSTM for Disability Forecasting in Multiple Sclerosis Using Sequential Multisequence MRI | John D. Mayfield et.al. | 2401.12132v1 | null |
2024-01-22 | Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy | Will LeVine et.al. | 2401.12129v1 | link |
2024-01-22 | Measures of the Capital Network of the U.S. Economy | Ben Klemens et.al. | 2401.12118v1 | null |
2024-01-22 | A quantitative version of the Steinhaus theorem | Alex Iosevich et.al. | 2401.12112v1 | null |
2024-01-19 | Classifying affine structures with focus-focus singularities | Xiudi Tang et.al. | 2401.10881v1 | null |
2024-01-19 | Motion Consistency Loss for Monocular Visual Odometry with Attention-Based Deep Learning | André O. Françani et.al. | 2401.10857v1 | null |
2024-01-19 | Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models | Mia Mohammad Imran et.al. | 2401.10845v1 | null |
2024-01-19 | Understanding Video Transformers via Universal Concept Discovery | Matthew Kowal et.al. | 2401.10831v1 | null |
2024-01-19 | Long-Term Monitoring of the Oe Star VES 735: Ope! Not So Quiet After All | Brandon Marshall et.al. | 2401.10829v1 | null |
2024-01-19 | ActAnywhere: Subject-Aware Video Background Generation | Boxiao Pan et.al. | 2401.10822v1 | null |
2024-01-19 | RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision | Fernando Pérez-García et.al. | 2401.10815v1 | null |
2024-01-19 | Learning to Visually Connect Actions and their Effects | Eric Peh et.al. | 2401.10805v1 | null |
2024-01-19 | Endovascular Detection of Catheter-Thrombus Contact by Vacuum Excitation | Jared Lawson et.al. | 2401.10804v1 | null |
2024-01-19 | TDC-less Direct Time-of-Flight Imaging Using Spiking Neural Networks | Jack MacLean et.al. | 2401.10793v1 | null |
2024-01-18 | Simultaneous Tactile Estimation and Control for Extrinsic Dexterity | Antonia Bronars et.al. | 2401.10230v1 | null |
2024-01-18 | OMG-Seg: Is One Model Good Enough For All Segmentation? | Xiangtai Li et.al. | 2401.10229v1 | link |
2024-01-18 | RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Shilin Xu et.al. | 2401.10228v1 | link |
2024-01-18 | Towards Language-Driven Video Inpainting via Multimodal Large Language Models | Jianzong Wu et.al. | 2401.10226v1 | null |
2024-01-18 | Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions | Namitha Padmanabhan et.al. | 2401.10217v1 | null |
2024-01-18 | Transfer Learning in Human Activity Recognition: A Survey | Sourish Gunesh Dhekane et.al. | 2401.10185v1 | null |
2024-01-18 | SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild | Andreas Engelhardt et.al. | 2401.10171v1 | null |
2024-01-19 | Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation | Changgu Chen et.al. | 2401.10150v2 | null |
2024-01-18 | Few-shot learning for COVID-19 Chest X-Ray Classification with Imbalanced Data: An Inter vs. Intra Domain Study | Alejandro Galán-Cuenca et.al. | 2401.10129v1 | null |
2024-01-18 | Sub2Full: split spectrum to boost OCT despeckling without clean data | Lingyun Wang et.al. | 2401.10128v1 | link |
2024-01-17 | Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Lianghui Zhu et.al. | 2401.09417v1 | link |
2024-01-17 | Vlogger: Make Your Dream A Vlog | Shaobin Zhuang et.al. | 2401.09414v1 | link |
2024-01-17 | Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text | Mazal Bethany et.al. | 2401.09407v1 | null |
2024-01-17 | Élivágar: Efficient Quantum Circuit Search for Classification | Sashwat Anagolum et.al. | 2401.09393v1 | null |
2024-01-17 | Tri$^{2}$-plane: Volumetric Avatar Reconstruction with Feature Pyramid | Luchuan Song et.al. | 2401.09386v1 | link |
2024-01-17 | New relations of pod partition and its connection with other partition functions | Hemjyoti Nath et.al. | 2401.09374v1 | null |
2024-01-17 | To deform or not: treatment-aware longitudinal registration for breast DCE-MRI during neoadjuvant chemotherapy via unsupervised keypoints detection | Luyi Han et.al. | 2401.09336v1 | link |
2024-01-17 | Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora | Diana Davila Gordillo et.al. | 2401.09333v1 | null |
2024-01-17 | Spectral Distribution Complexity of the Surface Fibrillatory Waves Predicts Post-Catheter Ablation Relapse in Persistent Atrial Fibrillation | Pilar Escribano et.al. | 2401.09297v1 | null |
2024-01-17 | T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis | Yoonjin Chung et.al. | 2401.09294v1 | null |
2024-01-16 | From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers | Jiu Feng et.al. | 2401.08415v1 | null |
2024-01-16 | Faster ISNet for Background Bias Mitigation on Deep Neural Networks | Pedro R. A. S. Bassi et.al. | 2401.08409v1 | null |
2024-01-16 | Training and Comparison of nnU-Net and DeepMedic Methods for Autosegmentation of Pediatric Brain Tumors | Arastoo Vossough et.al. | 2401.08404v1 | null |
2024-01-16 | High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering | Xin Ming et.al. | 2401.08398v1 | null |
2024-01-16 | DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models | Zongxin Yang et.al. | 2401.08392v1 | link |
2024-01-16 | We don't need no labels: Estimating post-deployment model performance under covariate shift without ground truth | Jakub Białek et.al. | 2401.08348v1 | null |
2024-01-16 | Learn What You Need in Personalized Federated Learning | Kexin Lv et.al. | 2401.08327v1 | link |
2024-01-16 | Application of LLM Agents in Recruitment: A Novel Framework for Resume Screening | Chengguang Gan et.al. | 2401.08315v1 | null |
2024-01-16 | Central extensions of restricted Lie superalgebras and classification of | Sofiane Bouarroudj et.al. | 2401.08313v1 | null |
2024-01-16 | Evaluating online elasticity estimation of soft objects using standard robot grippers | Shubhan P. Patni et.al. | 2401.08298v1 | null |
2024-01-16 | Multitask Learning in Minimally Invasive Surgical Vision: A Review | Oluwatosin Alabi et.al. | 2401.08256v1 | null |
2024-01-16 | Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization | Chongzhi Zhang et.al. | 2401.08232v1 | null |
2024-01-16 | Towards Causal Relationship in Indefinite Data: Baseline Model and New Datasets | Hang Chen et.al. | 2401.08221v1 | link |
2024-01-16 | Ship Detection in SAR Images with Human-in-the-Loop | Hecheng Jia et.al. | 2401.08213v1 | null |
2024-01-16 | ModelNet-O: A Large-Scale Synthetic Dataset for Occlusion-Aware Point Cloud Classification | Zhongbin Fang et.al. | 2401.08210v1 | link |
2024-01-12 | Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements | Anton Voronov et.al. | 2401.06766v1 | null |
2024-01-12 | Classification of singularities of cluster algebras of finite type II: coefficients | Angélica Benito et.al. | 2401.06758v1 | null |
2024-01-12 | Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction | Muhammad Naveed Riaz et.al. | 2401.06757v1 | null |
2024-01-12 | Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection | Muhammad Tayyab Zamir et.al. | 2401.06752v1 | null |
2024-01-12 | Efficient Parallel Algorithms for Inpainting-Based Representations of 4K Images -- Part II: Spatial and Tonal Data Optimization | Niklas Kämper et.al. | 2401.06747v1 | null |
2024-01-12 | Efficient Parallel Algorithms for Inpainting-Based Representations of 4K Images -- Part I: Homogeneous Diffusion Inpainting | Niklas Kämper et.al. | 2401.06744v1 | null |
2024-01-12 | Complexity Classification of Product State Problems for Local Hamiltonians | John Kallaugher et.al. | 2401.06725v1 | null |
2024-01-12 | Obstacle-Aware Positioning of a Mobile Robotic Platform for 6G Networks | Alexandre Costa et.al. | 2401.06717v1 | null |
2024-01-12 | Reliability Analysis of Psychological Concept Extraction and Classification in User-penned Text | Muskan Garg et.al. | 2401.06709v1 | null |
2024-01-12 | On the existence of charged electrostatic black holes in arbitrary topology | Martin Reiris et.al. | 2401.06702v1 | null |
2024-01-11 | Distilling Vision-Language Models on Millions of Videos | Yue Zhao et.al. | 2401.06129v1 | null |
2024-01-11 | Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors | Jack Saunders et.al. | 2401.06126v1 | null |
2024-01-11 | Gaussian Shadow Casting for Neural Characters | Luis Bolanos et.al. | 2401.06116v1 | null |
2024-01-11 | A Closer Look at AUROC and AUPRC under Class Imbalance | Matthew B. A. McDermott et.al. | 2401.06091v1 | link |
2024-01-12 | LEGO:Language Enhanced Multi-modal Grounding Model | Zhaowei Li et.al. | 2401.06071v2 | link |
2024-01-11 | On the Power of Graph Neural Networks and Feature Augmentation Strategies to Classify Social Networks | Walid Guettala et.al. | 2401.06048v1 | null |
2024-01-11 | RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks | Partha Ghosh et.al. | 2401.06035v1 | null |
2024-01-11 | Attention to detail: inter-resolution knowledge distillation | Rocío del Amor et.al. | 2401.06010v1 | link |
2024-01-11 | Sea ice detection using concurrent multispectral and synthetic aperture radar imagery | Martin S J Rogers et.al. | 2401.06009v1 | null |
2024-01-11 | Boosting Mixed-Initiative Co-Creativity in Game Design: A Tutorial | Solange Margarido et.al. | 2401.05999v1 | null |
2024-01-10 | Towards Online Sign Language Recognition and Translation | Ronglai Zuo et.al. | 2401.05336v1 | link |
2024-01-10 | ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video | Kevin Cai et.al. | 2401.05314v1 | link |
2024-01-10 | Strategic Client Selection to Address Non-IIDness in HAPS-enabled FL Networks | Amin Farajzadeh et.al. | 2401.05308v1 | null |
2024-01-10 | Frame-like Fourier expansions for finite Borel measures on | Chad Berner et.al. | 2401.05243v1 | null |
2024-01-10 | Learning effective good variables from physical data | Giulio Barletta et.al. | 2401.05226v1 | link |
2024-01-10 | TOVAC: Tele-operated Vehicle Admission Control and Routing | Jorge Martín-Pérez et.al. | 2401.05225v1 | null |
2024-01-10 | Do Vision and Language Encoders Represent the World Similarly? | Mayug Maniparambil et.al. | 2401.05224v1 | null |
2024-01-10 | Exploring Vulnerabilities of No-Reference Image Quality Assessment Models: A Query-Based Black-Box Method | Chenxi Yang et.al. | 2401.05217v1 | null |
2024-01-10 | Pre-trained Large Language Models for Financial Sentiment Analysis | Wei Luo et.al. | 2401.05215v1 | link |
2024-01-10 | A Novel Prompt-tuning Method: Incorporating Scenario-specific Concepts into a Verbalizer | Yong Ma et.al. | 2401.05204v1 | null |
2024-01-09 | A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars | Ronglai Zuo et.al. | 2401.04730v1 | link |
2024-01-09 | U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation | Jun Ma et.al. | 2401.04722v1 | null |
2024-01-09 | Helicoidal surfaces of prescribed mean curvature in | Aires Eduardo Menani Barbieri et.al. | 2401.04721v1 | null |
2024-01-09 | Low-resource finetuning of foundation models beats state-of-the-art in histopathology | Benedikt Roth et.al. | 2401.04720v1 | null |
2024-01-09 | Jump Cut Smoothing for Talking Heads | Xiaojuan Wang et.al. | 2401.04718v1 | null |
2024-01-09 | NIPn CHIPS | Blaise Boissonneau et.al. | 2401.04697v1 | null |
2024-01-09 | CoordGate: Efficiently Computing Spatially-Varying Convolutions in Convolutional Neural Networks | Sunny Howard et.al. | 2401.04680v1 | null |
2024-01-09 | Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset | Galib Muhammad Shahriar Himel et.al. | 2401.04666v1 | null |
2024-01-09 | DepressionEmo: A novel dataset for multilabel classification of depression emotions | Abu Bakar Siddiqur Rahman et.al. | 2401.04655v1 | link |
2024-01-09 | Hold 'em and Fold 'em: Towards Human-scale, Feedback-Controlled Soft Origami Robots | Immanuel Ampomah Mensah et.al. | 2401.04650v1 | null |
2024-01-08 | Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning | Chen Zhao et.al. | 2401.04105v1 | null |
2024-01-08 | RudolfV: A Foundation Model by Pathologists for Pathologists | Jonas Dippel et.al. | 2401.04079v1 | null |
2024-01-08 | Variance Reduction in Ratio Metrics for Efficient Online Experiments | Shubham Baweja et.al. | 2401.04062v1 | null |
2024-01-08 | Bjøntegaard Delta (BD): A Tutorial Overview of the Metric, Evolution, Challenges, and Recommendations | Nabajeet Barman et.al. | 2401.04039v1 | null |
2024-01-08 | Blocks whose defect groups are Suzuki | Charles W. Eaton et.al. | 2401.04028v1 | null |
2024-01-08 | IDoFew: Intermediate Training Using Dual-Clustering in Language Models for Few Labels Text Classification | Abdullah Alsuhaibani et.al. | 2401.04025v1 | null |
2024-01-08 | Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification | Wentao Zhu et.al. | 2401.04023v1 | null |
2024-01-08 | Resident space object detection method based on the connection between Fourier spectrum of the video data difference frame and the linear velocity projection | V. S. Baranova et.al. | 2401.04021v1 | null |
2024-01-09 | Recognizing Blazars Using Radio Morphology from the VLA Sky Survey | Zhang-Liang Xie et.al. | 2401.04009v2 | null |
2024-01-08 | Calabi-Yau Varieties via Cyclic Covers, and Complex Hyperbolic Structures for their Moduli Spaces | Chenglong Yu et.al. | 2401.04006v1 | null |
2024-01-05 | Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively | Haobo Yuan et.al. | 2401.02955v1 | link |
2024-01-05 | The Dark Energy Survey Supernova Program: Cosmological Analysis and Systematic Uncertainties | M. Vincenzi et.al. | 2401.02945v1 | null |
2024-01-05 | Digital-analog quantum learning on Rydberg atom arrays | Jonathan Z. Lu et.al. | 2401.02940v1 | null |
2024-01-05 | Mixing Magnetic and Electric Ehlers-Harrison transformations: The Electromagnetic Swirling Spacetime and Novel Type I Backgrounds | José Barrientos et.al. | 2401.02924v1 | null |
2024-01-05 | Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks | Kevin Everson et.al. | 2401.02921v1 | null |
2024-01-05 | Analytically-Driven Resource Management for Cloud-Native Microservices | Yanqi Zhang et.al. | 2401.02920v1 | null |
2024-01-05 | Introducing Bode: A Fine-Tuned Large Language Model for Portuguese Prompt-Based Task | Gabriel Lino Garcia et.al. | 2401.02909v1 | null |
2024-01-05 | Robust Bichromatic Classification using Two Lines | Erwin Glazenburg et.al. | 2401.02897v1 | null |
2024-01-05 | Particle-Wise Higher-Order SPH Field Approximation for DVR | Jonathan Fischer et.al. | 2401.02896v1 | null |
2024-01-05 | Nonlinear functional regression by functional deep neural network with kernel embedding | Zhongjie Shi et.al. | 2401.02890v1 | null |
2024-01-04 | asimulation: Domain formation and impact on observables in resolved cosmological simulations of the (a)symmetron | Øyvind Christiansen et.al. | 2401.02410v1 | link |
2024-01-04 | Gravitational waves from dark domain walls | Øyvind Christiansen et.al. | 2401.02409v1 | link |
2024-01-05 | Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks | Hartwig H. Hochmair et.al. | 2401.02404v2 | null |
2024-01-04 | 3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation | Zihao Xiao et.al. | 2401.02402v1 | null |
2024-01-04 | Analyzing Misinformation Claims During the 2022 Brazilian General Election on WhatsApp, Twitter, and Kwai | Scott A. Hale et.al. | 2401.02395v1 | null |
2024-01-04 | Image denoising and model-independent parameterization for improving IVIM MRI | Caleb Sample et.al. | 2401.02394v1 | null |
2024-01-04 | Survey of 3D Human Body Pose and Shape Estimation Methods for Contemporary Dance Applications | Darshan Venkatrayappa et.al. | 2401.02383v1 | null |
2024-01-04 | A novel method to enhance pneumonia detection via a model-level ensembling of CNN and vision transformer | Sandeep Angara et.al. | 2401.02358v1 | null |
2024-01-04 | ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation | Xinyang Pu et.al. | 2401.02326v1 | link |
2024-01-04 | Reflection physics in X-ray-emitting Symbiotic Stars | Jesús A. Toalá et.al. | 2401.02318v1 | null |
2024-01-03 | Profinite equivariant spectra and their tensor-triangular geometry | Scott Balchin et.al. | 2401.01878v1 | null |
2024-01-03 | A spatial mixture model for spaceborne lidar observations over mixed forest and non-forest land types | Paul B. May et.al. | 2401.01848v1 | null |
2024-01-03 | Teaching with a companion: the case of gravity | Iuliia Zhurakovskaia et.al. | 2401.01832v1 | null |
2024-01-03 | Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling | Himmet Toprak Kesgin et.al. | 2401.01830v1 | null |
2024-01-03 | Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions | David Junhao Zhang et.al. | 2401.01827v1 | link |
2024-01-03 | Detours for Navigating Instructional Videos | Kumar Ashutosh et.al. | 2401.01823v1 | null |
2024-01-03 | SENS3: Multisensory Database of Finger-Surface Interactions and Corresponding Sensations | Jagan K. Balasubramanian et.al. | 2401.01818v1 | null |
2024-01-03 | Signal Processing in the Retina: Interpretable Graph Classifier to Predict Ganglion Cell Responses | Yasaman Parhizkar et.al. | 2401.01813v1 | null |
2024-01-03 | Efficient Computation of Confidence Sets Using Classification on Equidistributed Grids | Lujie Zhou et.al. | 2401.01804v1 | null |
2024-01-03 | An experimental sorting method for improving metagenomic data encoding | Diogo Pratas et.al. | 2401.01786v1 | null |
2024-01-02 | Street Gaussians for Modeling Dynamic Urban Scenes | Yunzhi Yan et.al. | 2401.01339v1 | null |
2024-01-02 | Classifying Words with 3-sort Automata | Tomasz Jastrząb et.al. | 2401.01314v1 | null |
2024-01-03 | A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models | S. M Towhidul Islam Tonmoy et.al. | 2401.01313v2 | null |
2024-01-02 | Integrating Edges into U-Net Models with Explainable Activation Maps for Brain Tumor Segmentation using MR Images | Subin Sahayam et.al. | 2401.01303v1 | null |
2024-01-02 | | Nicola Novello et.al. | 2401.01268v1 | link |
2024-01-02 | VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM | Fuchen Long et.al. | 2401.01256v1 | null |
2024-01-02 | An operational approach to classifying measurement incompatibility | Arun Kumar Das et.al. | 2401.01236v1 | null |
2024-01-03 | Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond | Dimitrios Kollias et.al. | 2401.01219v2 | null |
2024-01-02 | FGENet: Fine-Grained Extraction Network for Congested Crowd Counting | Hao-Yuan Ma et.al. | 2401.01208v1 | null |
2024-01-02 | Whole-examination AI estimation of fetal biometrics from 20-week ultrasound scans | Lorenzo Venturini et.al. | 2401.01201v1 | null |
2023-12-29 | Computational Tools for Trees in Gauge Theory and Gravity | Jacob L. Bourjaily et.al. | 2312.17745v1 | null |
2023-12-29 | Multiscale Vision Transformers meet Bipartite Matching for efficient single-stage Action Localization | Ioanna Ntinou et.al. | 2312.17686v1 | null |
2023-12-29 | Malware Detection in IOT Systems Using Machine Learning Techniques | Ali Mehrban et.al. | 2312.17683v1 | null |
2023-12-29 | FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis | Feng Liang et.al. | 2312.17681v1 | null |
2023-12-29 | Grasping, Part Identification, and Pose Refinement in One Shot with a Tactile Gripper | Joyce Xin-Yan Lim et.al. | 2312.17650v1 | null |
2023-12-29 | MoD2T:Model-Data-Driven Motion-Static Object Tracking Method | Yang Feng et.al. | 2312.17641v1 | null |
2023-12-29 | A New Explanation of the Mechanism of Hadley Circulation | Wei Huang et.al. | 2312.17637v1 | null |
2023-12-29 | Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training | Dongfang Li et.al. | 2312.17591v1 | null |
2023-12-29 | A Tool for the Procedural Generation of Shaders using Interactive Evolutionary Algorithms | Elio Sasso et.al. | 2312.17587v1 | link |
2023-12-29 | Distribution-based Low-rank Embedding | Bardia Yousefi et.al. | 2312.17579v1 | null |
2023-12-28 | A Simple LLM Framework for Long-Range Video Question-Answering | Ce Zhang et.al. | 2312.17235v1 | null |
2023-12-28 | 4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency | Yuyang Yin et.al. | 2312.17225v1 | null |
2023-12-28 | EFHQ: Multi-purpose ExtremePose-Face-HQ dataset | Trung Tuan Dao et.al. | 2312.17205v1 | null |
2023-12-28 | One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts | Ziheng Zhao et.al. | 2312.17183v1 | null |
2023-12-28 | Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action | Jiasen Lu et.al. | 2312.17172v1 | null |
2023-12-28 | Classification of multiplication modules over multiplication rings with finitely many minimal primes | Volodymyr Bavula et.al. | 2312.17170v1 | null |
2023-12-28 | Securing NextG Systems against Poisoning Attacks on Federated Learning: A Game-Theoretic Solution | Yalin E. Sagduyu et.al. | 2312.17164v1 | null |
2023-12-28 | Replica Tree-based Federated Learning using Limited Data | Ramona Ghilea et.al. | 2312.17159v1 | null |
2023-12-29 | ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe | Yifan Bai et.al. | 2312.17133v2 | null |
2023-12-28 | Grounding-Prompter: Prompting LLM with Multimodal Information for Temporal Sentence Grounding in Long Videos | Houlun Chen et.al. | 2312.17117v1 | null |
2023-12-26 | Microwave signal processing using an analog quantum reservoir computer | Alen Senanian et.al. | 2312.16166v1 | null |
2023-12-26 | Large-scale Long-tailed Disease Diagnosis on Radiology Images | Qiaoyu Zheng et.al. | 2312.16151v1 | null |
2023-12-27 | The Media Bias Taxonomy: A Systematic Literature Review on the Forms and Automated Detection of Media Bias | Timo Spinde et.al. | 2312.16148v2 | link |
2023-12-26 | The non-Abelian Aharonov-Bohm effect | P. A. Horvathy et.al. | 2312.16133v1 | null |
2023-12-26 | LangSplat: 3D Language Gaussian Splatting | Minghan Qin et.al. | 2312.16084v1 | null |
2023-12-26 | AdaNAS: Adaptively Post-processing with Self-supervised Neural Architecture Search for Ensemble Rainfall Forecasts | Yingpeng Wen et.al. | 2312.16046v1 | null |
2023-12-26 | An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced linear classification | Hyenkyun Woo et.al. | 2312.16043v1 | null |
2023-12-26 | Multi-scale Progressive Feature Embedding for Accurate NIR-to-RGB Spectral Domain Translation | Xingxing Yang et.al. | 2312.16040v1 | null |
2023-12-26 | Plug-and-Play Regularization on Magnitude with Deep Priors for 3D Near-Field MIMO Imaging | Okyanus Oral et.al. | 2312.16024v1 | null |
2023-12-26 | Classification of positive solutions of Hardy-Sobolev equation without the finite volume constraints | Lu Chen et.al. | 2312.16017v1 | null |
2023-12-25 | Training Convolutional Neural Networks with the Forward-Forward algorithm | Riccardo Scodellaro et.al. | 2312.14924v2 | null |
2023-12-22 | DRStageNet: Deep Learning for Diabetic Retinopathy Staging from Fundus Images | Yevgeniy Men et.al. | 2312.14891v1 | null |
2023-12-22 | On rate-optimal classification from non-private and from private data | Balázs Csanád Csáji et.al. | 2312.14889v1 | null |
2023-12-22 | Classification of cubic tricirculant nut graphs | Ivan Damnjanović et.al. | 2312.14884v1 | null |
2023-12-22 | Neural-network-based regularization methods for inverse problems in imaging | Andreas Habring et.al. | 2312.14849v1 | null |
2023-12-22 | Classification of 3-GNDB Graphs | Amir Hosseini et.al. | 2312.14835v1 | null |
2023-12-22 | Dreaming of Electrical Waves: Generative Modeling of Cardiac Excitation Waves using Diffusion Models | Tanish Baranwal et.al. | 2312.14830v1 | null |
2023-12-22 | Classification of generalised higher-order Einstein-Maxwell Lagrangians | Aimeric Colléaux et.al. | 2312.14814v1 | null |
2023-12-22 | On support vector machines under a multiple-cost scenario | Sandra Benítez-Peña et.al. | 2312.14795v1 | null |
2023-12-22 | The Rate-Distortion-Perception-Classification Tradeoff: Joint Source Coding and Modulation via Inverse-Domain GANs | Junli Fang et.al. | 2312.14792v1 | null |
2023-12-21 | 3D Pose Estimation of Two Interacting Hands from a Monocular Event Camera | Christen Millerdurai et.al. | 2312.14157v1 | null |
2023-12-21 | Virtual Pets: Animatable Animal Generation in 3D Scenes | Yen-Chi Cheng et.al. | 2312.14154v1 | null |
2023-12-21 | TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification | Qinying Liu et.al. | 2312.14149v1 | link |
2023-12-21 | HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs | Artem Sevastopolsky et.al. | 2312.14140v1 | null |
2023-12-21 | Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-based Approach | Qinying Liu et.al. | 2312.14138v1 | link |
2023-12-21 | Diffusion Reward: Learning Rewards via Conditional Video Diffusion | Tao Huang et.al. | 2312.14134v1 | null |
2023-12-21 | WellFactor: Patient Profiling using Integrative Embedding of Healthcare Data | Dongjin Choi et.al. | 2312.14129v1 | null |
2023-12-21 | VideoPoet: A Large Language Model for Zero-Shot Video Generation | Dan Kondratyuk et.al. | 2312.14125v1 | null |
2023-12-21 | LingoQA: Video Question Answering for Autonomous Driving | Ana-Maria Marcu et.al. | 2312.14115v1 | link |
2023-12-21 | LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding | Senqiao Yang et.al. | 2312.14074v1 | null |
2023-12-20 | Deep Learning on 3D Neural Fields | Pierluigi Zama Ramirez et.al. | 2312.13277v1 | null |
2023-12-20 | The 1/4-BPS building blocks of brane interactions | Ben Eckardt et.al. | 2312.13269v1 | null |
2023-12-20 | ClassLIE: Structure- and Illumination-Adaptive Classification for Low-Light Image Enhancement | Zixiang Wei et.al. | 2312.13265v1 | null |
2023-12-20 | Putting the p back in Prym | Jeff Achter et.al. | 2312.13263v1 | null |
2023-12-20 | The role of data embedding in equivariant quantum convolutional neural networks | Sreetama Das et.al. | 2312.13250v1 | null |
2023-12-20 | Enhancing Neural Training via a Correlated Dynamics Model | Jonathan Brokman et.al. | 2312.13247v1 | null |
2023-12-20 | SISMIK for brain MRI: Deep-learning-based motion estimation and model-based motion correction in k-space | Oscar Dabrowski et.al. | 2312.13220v1 | null |
2023-12-20 | Boost recall in QSO selection from highly imbalanced photometric datasets | Giorgio Calderone et.al. | 2312.13194v1 | null |
2023-12-20 | Ergodic measures for periodic type | Yuriy Tumarkin et.al. | 2312.13165v1 | null |
2023-12-20 | Underwater Acoustic Signal Recognition Based on Salient Features | Minghao Chen et.al. | 2312.13143v1 | null |
2023-12-19 | Tracking Any Object Amodally | Cheng-Yen Hsieh et.al. | 2312.12433v1 | null |
2023-12-19 | The Endoscapes Dataset for Surgical Scene Segmentation, Object Detection, and Critical View of Safety Assessment: Official Splits and Benchmark | Aditya Murali et.al. | 2312.12429v1 | null |
2023-12-19 | Chasing Fairness in Graphs: A GNN Architecture Perspective | Zhimeng Jiang et.al. | 2312.12369v1 | link |
2023-12-19 | Easy quantum groups | Teo Banica et.al. | 2312.12368v1 | null |
2023-12-19 | SMC-NCA: Semantic-guided Multi-level Contrast for Semi-supervised Action Segmentation | Feixiang Zhou et.al. | 2312.12347v1 | null |
2023-12-19 | On the Effectiveness of Retrieval, Alignment, and Replay in Manipulation | Norman Di Palo et.al. | 2312.12345v1 | null |
2023-12-19 | Full-reference Video Quality Assessment for User Generated Content Transcoding | Zihao Qi et.al. | 2312.12317v1 | null |
2023-12-19 | First qualitative observations on deep learning vision model YOLO and DETR for automated driving in Austria | Stefan Schoder et.al. | 2312.12314v1 | null |
2023-12-19 | Holography of New Conformal Higher Spin Gravities in 3d | I. Lovrekovic et.al. | 2312.12301v1 | null |
2023-12-19 | Prompt-based Domain Discrimination for Multi-source Time Series Domain Adaptation | Junxiang Wang et.al. | 2312.12276v1 | null |
2023-12-18 | Development and Evaluation of Ensemble Learning-based Environmental Methane Detection and Intensity Prediction Models | Reek Majumder et.al. | 2312.10879v1 | null |
2023-12-18 | Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation | Hui Fu et.al. | 2312.10877v1 | null |
2023-12-17 | Global relaxation-based LP-Newton method for multiple hyperparameter selection in support vector classification with feature selection | Qingna Li et.al. | 2312.10848v1 | null |
2023-12-17 | Online Boosting Adaptive Learning under Concept Drift for Multistream Classification | En Yu et.al. | 2312.10841v1 | null |
2023-12-17 | Learning to Act without Actions | Dominik Schmidt et.al. | 2312.10812v1 | null |
2023-12-17 | Land use/land cover classification of fused Sentinel-1 and Sentinel-2 imageries using ensembles of Random Forests | Shivam Pande et.al. | 2312.10798v1 | null |
2023-12-17 | Learning to Learn in Interactive Constraint Acquisition | Dimos Tsouros et.al. | 2312.10795v1 | null |
2023-12-17 | Identification of Knowledge Neurons in Protein Language Models | Divya Nori et.al. | 2312.10770v1 | null |
2023-12-17 | Multi-Label Classification of COVID-Tweets Using Large Language Models | Aniket Deroy et.al. | 2312.10748v1 | link |
2023-12-17 | Unmasking Deepfake Faces from Videos Using An Explainable Cost-Sensitive Deep Learning Approach | Faysal Mahmud et.al. | 2312.10740v1 | link |
2023-12-15 | Understanding Probe Behaviors through Variational Bounds of Mutual Information | Kwanghee Choi et.al. | 2312.10019v1 | link |
2023-12-15 | Wearable Coaxially-shielded Metamaterial for Magnetic Resonance Imaging | Xia Zhu et.al. | 2312.10018v1 | null |
2023-12-15 | On the Invertibility of Euler Integral Transforms with Hyperplanes and Quadric Hypersurfaces | Mattie Ji et.al. | 2312.10002v1 | null |
2023-12-15 | Towards Architecture-Insensitive Untrained Network Priors for Accelerated MRI Reconstruction | Yilin Liu et.al. | 2312.09988v1 | null |
2023-12-15 | DHFormer: A Vision Transformer-Based Attention Module for Image Dehazing | Abdul Wasi et.al. | 2312.09955v1 | null |
2023-12-15 | Multi-level graph learning for audio event classification and human-perceived annoyance rating prediction | Yuanbo Hou et.al. | 2312.09952v1 | null |
2023-12-15 | LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer | Yuxin Cao et.al. | 2312.09935v1 | link |
2023-12-15 | RDR: the Recap, Deliberate, and Respond Method for Enhanced Language Understanding | Yuxin Zi et.al. | 2312.09932v1 | null |
2023-12-15 | Reliable Probabilistic Classification with Neural Networks | Harris Papadopoulos et.al. | 2312.09912v1 | null |
2023-12-15 | TMP: Temporal Motion Propagation for Online Video Super-Resolution | Zhengqiang Zhang et.al. | 2312.09909v1 | null |
2023-12-14 | 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting | Zhiyin Qian et.al. | 2312.09228v1 | null |
2023-12-14 | Efficient Online Learning of Contact Force Models for Connector Insertion | Kevin Tracy et.al. | 2312.09190v1 | null |
2023-12-14 | General Object Foundation Model for Images and Videos at Scale | Junfeng Wu et.al. | 2312.09158v1 | null |
2023-12-14 | Evaluating Augmented Reality Communication: How Can We Teach Procedural Skill in AR? | Manuel Rebol et.al. | 2312.09152v1 | null |
2023-12-14 | Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting | Anthony Chen et.al. | 2312.09148v1 | null |
2023-12-14 | Class-Wise Buffer Management for Incremental Object Detection: An Effective Buffer Training Strategy | Junsu Kim et.al. | 2312.09139v1 | null |
2023-12-14 | Less is more -- the Dispatcher/ Executor principle for multi-task Reinforcement Learning | Martin Riedmiller et.al. | 2312.09120v1 | null |
2023-12-14 | VideoLCM: Video Latent Consistency Model | Xiang Wang et.al. | 2312.09109v1 | null |
2023-12-14 | FastInject: Injecting Unpaired Text Data into CTC-based ASR training | Keqi Deng et.al. | 2312.09100v1 | null |
2023-12-14 | Agent Attention: On the Integration of Softmax and Linear Attention | Dongchen Han et.al. | 2312.08874v1 | link |
2023-12-13 | VLAP: Efficient Video-Language Alignment via Frame Prompting and Distilling for Video Question Answering | Xijun Wang et.al. | 2312.08367v1 | null |
2023-12-13 | Challenges and Opportunities in Implementing Negative Differential Resistance Mode Reconfigurable Field Effect Transistors | Lephe S et.al. | 2312.08351v1 | null |
2023-12-13 | Enhancing CT Image synthesis from multi-modal MRI data based on a multi-task neural network framework | Zhuoyao Xin et.al. | 2312.08343v1 | null |
2023-12-13 | Preparing VVC for Streaming: A Fast Multi-Rate Encoding Approach | Yiqun Liu et.al. | 2312.08330v1 | null |
2023-12-13 | Affine monoids of corank one | Yulia Zaitseva et.al. | 2312.08316v1 | null |
2023-12-13 | VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space | Guénolé Fiche et.al. | 2312.08291v1 | null |
2023-12-13 | PhenDiff: Revealing Invisible Phenotypes with Conditional Diffusion Models | Anis Bourou et.al. | 2312.08290v1 | link |
2023-12-13 | On the verification of Embeddings using Hybrid Markov Logic | Anup Shakya et.al. | 2312.08287v1 | null |
2023-12-14 | High-throughput Biomedical Relation Extraction for Semi-Structured Web Articles Empowered by Large Language Models | Songchi Zhou et.al. | 2312.08274v2 | null |
2023-12-13 | Efficient Multi-Object Pose Estimation using Multi-Resolution Deformable Attention and Query Aggregation | Arul Selvam Periyasamy et.al. | 2312.08268v1 | null |
2023-12-12 | diff History for Long-Context Language Agents | Ulyana Piterbarg et.al. | 2312.07540v1 | null |
2023-12-12 | FreeInit: Bridging Initialization Gap in Video Diffusion Models | Tianxing Wu et.al. | 2312.07537v1 | link |
2023-12-12 | WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion | Soyong Shin et.al. | 2312.07531v1 | null |
2023-12-12 | RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation | Peng Lu et.al. | 2312.07526v1 | link |
2023-12-12 | PEEKABOO: Interactive Video Generation via Masked-Diffusion | Yash Jain et.al. | 2312.07509v1 | null |
2023-12-12 | NAC-TCN: Temporal Convolutional Networks with Causal Dilated Neighborhood Attention for Emotion Understanding | Alexander Mehta et.al. | 2312.07507v1 | link |
2023-12-12 | COLMAP-Free 3D Gaussian Splatting | Yang Fu et.al. | 2312.07504v1 | null |
2023-12-12 | NearbyPatchCL: Leveraging Nearby Patches for Self-Supervised Patch-Level Multi-Class Classification in Whole-Slide Images | Gia-Bao Le et.al. | 2312.07489v1 | null |
2023-12-12 | MinD-3D: Reconstruct High-quality 3D objects in Human Brain | Jianxiong Gao et.al. | 2312.07485v1 | null |
2023-12-12 | Classification of retail products: From probabilistic ranking to neural networks | Manar Mohamed Hafez et.al. | 2312.07482v1 | null |
2023-12-11 | Photorealistic Video Generation with Diffusion Models | Agrim Gupta et.al. | 2312.06662v1 | null |
2023-12-11 | LightSim: Neural Lighting Simulation for Urban Scenes | Ava Pun et.al. | 2312.06654v1 | null |
2023-12-11 | Beyond Classification: Definition and Density-based Estimation of Calibration in Object Detection | Teodora Popordanoska et.al. | 2312.06645v1 | null |
2023-12-11 | Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution | Shangchen Zhou et.al. | 2312.06640v1 | null |
2023-12-12 | TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation | Rongkun Zheng et.al. | 2312.06630v2 | link |
2023-12-11 | Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism | Georgios Milis et.al. | 2312.06613v1 | link |
2023-12-11 | Early Action Recognition with Action Prototypes | Guglielmo Camporese et.al. | 2312.06598v1 | null |
2023-12-11 | Flexible visual prompts for in-context learning in computer vision | Thomas Foster et.al. | 2312.06592v1 | link |
2023-12-11 | QuickQuakeBuildings: Post-earthquake SAR-Optical Dataset for Quick Damaged-building Detection | Yao Sun et.al. | 2312.06587v1 | null |
2023-12-12 | ESO/HARPS Radial Velocities Catalog | Mauro Barbieri et.al. | 2312.06586v2 | null |
2023-12-08 | The Long Secondary Period (LSP) Variables: Overview and Some Analysis | John R. Percy et.al. | 2312.05255v1 | null |
2023-12-08 | Few-Shot Class-Incremental Learning via Training-Free Prototype Calibration | Qi-Wei Wang et.al. | 2312.05229v1 | null |
2023-12-08 | Shape Matters: Detecting Vertebral Fractures Using Differentiable Point-Based Shape Decoding | Hellena Hempe et.al. | 2312.05220v1 | link |
2023-12-08 | Enhancing Facial Classification and Recognition using 3D Facial Models and Deep Learning | Houting Li et.al. | 2312.05219v1 | null |
2023-12-08 | IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing | Shaofei Wang et.al. | 2312.05210v1 | null |
2023-12-08 | Embedding theory in ML toward real-time tracking of structural dynamics through hyperspectral datasets | Jonathan D Hollenbach et.al. | 2312.05201v1 | null |
2023-12-08 | Video-Based Rendering Techniques: A Survey | Rafael Kuffner dos Anjos et.al. | 2312.05179v1 | null |
2023-12-08 | Enhancing Single-Frame Supervision for Better Temporal Action Localization | Changjian Chen et.al. | 2312.05178v1 | null |
2023-12-08 | MRI Scan Synthesis Methods based on Clustering and Pix2Pix | Giulia Baldini et.al. | 2312.05176v1 | null |
2023-12-08 | TriHuman : A Real-time and Controllable Tri-plane Representation for Detailed Human Geometry and Appearance Synthesis | Heming Zhu et.al. | 2312.05161v1 | null |
2023-12-07 | GenDeF: Learning Generative Deformation Field for Video Generation | Wen Wang et.al. | 2312.04561v1 | null |
2023-12-07 | MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar | Yufan Chen et.al. | 2312.04558v1 | null |
2023-12-07 | GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation | Shoufa Chen et.al. | 2312.04557v1 | null |
2023-12-07 | SPIDeRS: Structured Polarization for Invisible Depth and Reflectance Sensing | Tomoki Ichikawa et.al. | 2312.04553v1 | null |
2023-12-07 | PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play | Lili Chen et.al. | 2312.04549v1 | null |
2023-12-07 | Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? | Aritra Dutta et.al. | 2312.04548v1 | null |
2023-12-07 | Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models | Ivan Kapelyukh et.al. | 2312.04533v1 | null |
2023-12-07 | Camera Height Doesn't Change: Unsupervised Monocular Scale-Aware Road-Scene Depth Estimation | Genki Kinoshita et.al. | 2312.04530v1 | null |
2023-12-07 | RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models | Ozgur Kara et.al. | 2312.04524v1 | link |
2023-12-07 | Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation | Zhiwu Qing et.al. | 2312.04483v1 | null |
2023-12-06 | OneLLM: One Framework to Align All Modalities with Language | Jiaming Han et.al. | 2312.03700v1 | link |
2023-12-07 | Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers | Umberto Cappellazzo et.al. | 2312.03694v2 | null |
2023-12-06 | Direct Exoplanet Detection Using Deep Convolutional Image Reconstruction (ConStruct): A New Algorithm for Post-Processing High-Contrast Images | Trevor N. Wolf et.al. | 2312.03671v1 | null |
2023-12-06 | Annihilating branching Brownian motion | Daniel Ahlberg et.al. | 2312.03669v1 | null |
2023-12-06 | Towards small and accurate convolutional neural networks for acoustic biodiversity monitoring | Serge Zaugg et.al. | 2312.03666v1 | null |
2023-12-06 | Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving | Ming Nie et.al. | 2312.03661v1 | link |
2023-12-06 | Editable Stain Transformation Of Histological Images Using Unpaired GANs | Tibor Sloboda et.al. | 2312.03647v1 | link |
2023-12-06 | MotionCtrl: A Unified and Flexible Motion Controller for Video Generation | Zhouxia Wang et.al. | 2312.03641v1 | null |
2023-12-06 | Training Neural Networks on RAW and HDR Images for Restoration Tasks | Lei Luo et.al. | 2312.03640v1 | link |
2023-12-07 | Evaluation of Active Feature Acquisition Methods for Static Feature Settings | Henrik von Kleist et.al. | 2312.03619v2 | null |
2023-12-05 | Dexterous Functional Grasping | Ananye Agarwal et.al. | 2312.02975v1 | null |
2023-12-05 | Describing Differences in Image Sets with Natural Language | Lisa Dunlap et.al. | 2312.02974v1 | link |
2023-12-05 | GauHuman: Articulated Gaussian Splatting from Monocular Human Videos | Shoukang Hu et.al. | 2312.02973v1 | link |
2023-12-05 | Detecting algorithmic bias in medical AI-models | Jeffrey Smith et.al. | 2312.02959v1 | null |
2023-12-05 | Classification for everyone : Building geography agnostic models for fairer recognition | Akshat Jindal et.al. | 2312.02957v1 | null |
2023-12-05 | Choroidalyzer: An open-source, end-to-end pipeline for choroidal analysis in optical coherence tomography | Justin Engelmann et.al. | 2312.02956v1 | null |
2023-12-05 | An alternating peak-optimization method for optimal trajectory generation of quadrotor drones | Wytze A. B. de Vries et.al. | 2312.02944v1 | null |
2023-12-05 | Fast CT anatomic localization algorithm | Amit Oved et.al. | 2312.02941v1 | null |
2023-12-05 | Drag-A-Video: Non-rigid Video Editing with Point-based Interaction | Yao Teng et.al. | 2312.02936v1 | null |
2023-12-06 | WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation | Jiachen Lu et.al. | 2312.02934v2 | link |
2023-12-04 | iMatching: Imperative Correspondence Learning | Zitong Zhan et.al. | 2312.02141v1 | null |
2023-12-04 | Fast View Synthesis of Casual Videos | Yao-Chih Lee et.al. | 2312.02135v1 | null |
2023-12-04 | GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians | Liangxiao Hu et.al. | 2312.02134v1 | null |
2023-12-04 | Hot PATE: Private Aggregation of Distributions for Diverse Task | Edith Cohen et.al. | 2312.02132v1 | null |
2023-12-04 | Can we truly transfer an actor's genuine happiness to avatars? An investigation into virtual, real, posed and spontaneous faces | Vitor Miguel Xavier Peres et.al. | 2312.02128v1 | null |
2023-12-04 | Cosmic star-formation history and black hole accretion history inferred from the JWST mid-infrared source counts | Seong Jin Kim et.al. | 2312.02090v1 | null |
2023-12-05 | VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence | Yuchao Gu et.al. | 2312.02087v2 | null |
2023-12-04 | Integrating AI into CCTV Systems: A Comprehensive Evaluation of Smart Video Surveillance in Community Space | Shanle Yao et.al. | 2312.02078v1 | null |
2023-12-04 | GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians | Shenhan Qian et.al. | 2312.02069v1 | null |
2023-12-04 | TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Shuhuai Ren et.al. | 2312.02051v1 | null |
2023-12-01 | Dense Optical Tracking: Connecting the Dots | Guillaume Le Moing et.al. | 2312.00786v1 | null |
2023-12-01 | Sequential Modeling Enables Scalable Learning for Large Vision Models | Yutong Bai et.al. | 2312.00785v1 | null |
2023-12-01 | MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video | Hengyi Wang et.al. | 2312.00778v1 | null |
2023-12-01 | VideoBooth: Diffusion-based Video Generation with Image Prompts | Yuming Jiang et.al. | 2312.00777v1 | null |
2023-12-01 | Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans | Homanga Bharadhwaj et.al. | 2312.00775v1 | null |
2023-12-01 | Explaining Knock-on Effects of Bias Mitigation | Svetoslav Nizhnichenkov et.al. | 2312.00765v1 | null |
2023-12-04 | Deep Unlearning: Fast and Efficient Training-free Approach to Controlled Forgetting | Sangamesh Kodge et.al. | 2312.00761v2 | null |
2023-12-01 | Mitigating Over-smoothing in Transformers via Regularized Nonlocal Functionals | Tam Nguyen et.al. | 2312.00751v1 | null |
2023-12-01 | Tight-minimal dichotomies in Banach spaces | Alejandra C. Cáceres-Rigo et.al. | 2312.00721v1 | null |
2023-12-01 | GIFT: Generative Interpretable Fine-Tuning Transformers | Chinmay Savadikar et.al. | 2312.00700v1 | link |
2023-11-30 | Just Add | Dominick Reilly et.al. | 2311.18840v1 | null |
2023-11-30 | TrafficMOT: A Challenging Dataset for Multi-Object Tracking in Complex Traffic Scenarios | Lihao Liu et.al. | 2311.18839v1 | null |
2023-11-30 | VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models | Zhen Xing et.al. | 2311.18837v1 | null |
2023-11-30 | ART$\boldsymbol{\cdot}$V: Auto-Regressive Text-to-Video Generation with Diffusion Models | Wenming Weng et.al. | 2311.18834v1 | null |
2023-11-30 | MotionEditor: Editing Video Motion via Content-Aware Diffusion | Shuyuan Tu et.al. | 2311.18830v1 | link |
2023-11-30 | MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation | Yanhui Wang et.al. | 2311.18829v1 | null |
2023-11-30 | Motion-Conditioned Image Animation for Video Editing | Wilson Yan et.al. | 2311.18827v1 | null |
2023-11-30 | CAST: Cross-Attention in Space and Time for Video Action Recognition | Dongho Lee et.al. | 2311.18825v1 | link |
2023-11-30 | Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking | Kaifeng Lyu et.al. | 2311.18817v1 | link |
2023-11-30 | BIOCLIP: A Vision Foundation Model for the Tree of Life | Samuel Stevens et.al. | 2311.18803v1 | null |
2023-11-30 | Do text-free diffusion models learn discriminative visual representations? | Soumik Mukhopadhyay et.al. | 2311.17921v2 | null |
2023-11-29 | Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving | Yuqi Wang et.al. | 2311.17918v1 | link |
2023-11-29 | HUGS: Human Gaussian Splats | Muhammed Kocabas et.al. | 2311.17910v1 | null |
2023-11-29 | SODA: Bottleneck Diffusion Models for Representation Learning | Drew A. Hudson et.al. | 2311.17901v1 | null |
2023-11-30 | Knowledge Pursuit Prompting for Zero-Shot Multimodal Synthesis | Jinqi Luo et.al. | 2311.17898v2 | null |
2023-11-29 | On the geometry of tensor products over finite fields | Stefano Lia et.al. | 2311.17896v1 | null |
2023-11-29 | Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation | Shuangrui Ding et.al. | 2311.17893v1 | null |
2023-11-29 | TSDF-Sampling: Efficient Sampling for Neural Surface Field using Truncated Signed Distance Field | Chaerin Min et.al. | 2311.17878v1 | null |
2023-11-29 | Enhancing Post-Hoc Explanation Benchmark Reliability for Image Classification | Tristan Gomez et.al. | 2311.17876v1 | null |
2023-11-29 | On the Adversarial Robustness of Graph Contrastive Learning Methods | Filippo Guerranti et.al. | 2311.17853v1 | null |
2023-11-28 | Panoptic Video Scene Graph Generation | Jingkang Yang et.al. | 2311.17058v1 | link |
2023-11-28 | Self-Supervised Motion Magnification by Backpropagating Through Optical Flow | Zhaoying Pan et.al. | 2311.17056v1 | null |
2023-11-28 | MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training | Pavan Kumar Anasosalu Vasu et.al. | 2311.17049v1 | null |
2023-11-28 | Jets of foliations and | Francis Bischoff et.al. | 2311.17045v1 | null |
2023-11-28 | LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models | Yanwei Li et.al. | 2311.17043v1 | link |
2023-11-29 | Efficient In-Context Learning in Vision-Language Models for Egocentric Videos | Keunwoo Peter Yu et.al. | 2311.17041v2 | null |
2023-11-28 | Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer | Danah Yatim et.al. | 2311.17009v1 | null |
2023-11-28 | MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | Kunchang Li et.al. | 2311.17005v1 | link |
2023-11-28 | Mirković-Vilonen Polytopes from Combinatorics | Mario Sanchez et.al. | 2311.16979v1 | null |
2023-11-28 | Natural Language Processing Through Transfer Learning: A Case Study on Sentiment Analysis | Aman Yadav et.al. | 2311.16965v1 | null |
2023-11-28 | Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models | Munan Ning et.al. | 2311.16103v2 | link |
2023-11-27 | GART: Gaussian Articulated Template Models | Jiahui Lei et.al. | 2311.16099v1 | null |
2023-11-27 | On Bringing Robots Home | Nur Muhammad Mahi Shafiullah et.al. | 2311.16098v1 | link |
2023-11-27 | CG-HOI: Contact-Guided 3D Human-Object Interaction Generation | Christian Diller et.al. | 2311.16097v1 | null |
2023-11-27 | Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling | Zhe Li et.al. | 2311.16096v1 | link |
2023-11-27 | Three-dimensional | Alexander C. Tyner et.al. | 2311.16092v1 | null |
2023-11-27 | BERT Goes Off-Topic: Investigating the Domain Transfer Challenge using Genre Classification | Dmitri Roussinov et.al. | 2311.16083v1 | link |
2023-11-27 | ViT-Lens-2: Gateway to Omni-modal Intelligence | Weixian Lei et.al. | 2311.16081v1 | link |
2023-11-27 | Correlated Spectral and Recurrence Variations of Cygnus X-1 | E. M. Broadbent et.al. | 2311.16070v1 | null |
2023-11-27 | DiffSLVA: Harnessing Diffusion Models for Sign Language Video Anonymization | Zhaoyang Xia et.al. | 2311.16060v1 | link |
2023-11-24 | SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation | Lingchen Meng et.al. | 2311.14671v1 | link |
2023-11-24 | JetLOV: Enhancing Jet Tree Tagging through Neural Network Learning of Optimal LundNet Variables | Mauricio A. Diaz et.al. | 2311.14654v1 | link |
2023-11-24 | Learning in Deep Factor Graphs with Gaussian Belief Propagation | Seth Nabarro et.al. | 2311.14649v1 | null |
2023-11-24 | Continuous football player tracking from discrete broadcast data | Matthew J. Penn et.al. | 2311.14642v1 | null |
2023-11-24 | Emergent Topology in Many-Body Dissipative Quantum Chaos | Antonio M. García-García et.al. | 2311.14640v1 | null |
2023-11-24 | Unsupervised high-throughput segmentation of cells and cell nuclei in quantitative phase images | Julia Sistermanns et.al. | 2311.14639v1 | null |
2023-11-24 | ARIA: On the interaction between Architectures, Aggregation methods and Initializations in federated visual classification | Vasilis Siomos et.al. | 2311.14625v1 | null |
2023-11-24 | Neural Style Transfer for Computer Games | Eleftherios Ioannou et.al. | 2311.14617v1 | null |
2023-11-24 | Animate124: Animating One Image to 4D Dynamic Scene | Yuyang Zhao et.al. | 2311.14603v1 | null |
2023-11-24 | A Metalearned Neural Circuit for Nonparametric Bayesian Inference | Jake C. Snell et.al. | 2311.14601v1 | link |
2023-11-22 | WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space | Katja Schwarz et.al. | 2311.13570v1 | null |
2023-11-22 | Belted sum decompositions of fully augmented links | Porter Morgan et.al. | 2311.13540v1 | null |
2023-11-22 | Learned Nonlinear Predictor for Critically Sampled 3D Point Cloud Attribute Compression | Tam Thuc Do et.al. | 2311.13539v1 | null |
2023-11-22 | Leveraging CNNs and Ensemble Learning for Automated Disaster Image Classification | Archit Rathod et.al. | 2311.13531v1 | null |
2023-11-22 | Applying Dimensionality Reduction as Precursor to LSTM-CNN Models for Classifying Imagery and Motor Signals in ECoG-Based BCIs | Soham Bafana et.al. | 2311.13507v1 | link |
2023-11-22 | Current Topological and Machine Learning Applications for Bias Detection in Text | Colleen Farrelly et.al. | 2311.13495v1 | null |
2023-11-22 | Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning | Bhavya Mehta et.al. | 2311.13490v1 | null |
2023-11-22 | Deep-learning-based acceleration of MRI for radiotherapy planning of pediatric patients with brain tumors | Shahinur Alam et.al. | 2311.13485v1 | link |
2023-11-22 | Solution discovery via reconfiguration for problems in P | Mario Grobler et.al. | 2311.13478v1 | null |
2023-11-22 | Experimentation in Early-Stage Video Game Startups: Practices and Challenges | Henry Edison et.al. | 2311.13462v1 | null |
2023-11-21 | Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models | David Stotko et.al. | 2311.12796v1 | null |
2023-11-21 | Quantifying Impairment and Disease Severity Using AI Models Trained on Healthy Subjects | Boyang Yu et.al. | 2311.12781v1 | link |
2023-11-21 | Swift Parameter-free Attention Network for Efficient Super-Resolution | Cheng Wan et.al. | 2311.12770v1 | link |
2023-11-22 | Investigating Weight-Perturbed Deep Neural Networks With Application in Iris Presentation Attack Detection | Renu Sharma et.al. | 2311.12764v2 | link |
2023-11-21 | High-resolution Image-based Malware Classification using Multiple Instance Learning | Tim Peters et.al. | 2311.12760v1 | link |
2023-11-21 | SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction | Yuanhui Huang et.al. | 2311.12754v1 | link |
2023-11-21 | Image Transformation for IoT Time-Series Data: A Review | Duygu Altunkaya et.al. | 2311.12742v1 | null |
2023-11-21 | Exploring Graph Classification Techniques Under Low Data Constraints: A Comprehensive Study | Kush Kothari et.al. | 2311.12737v1 | null |
2023-11-21 | Not Just Training, Also Testing: High School Youths' Perspective-Taking through Peer Testing Machine Learning-Powered Applications | L. Morales-Navarro et.al. | 2311.12733v1 | null |
2023-11-21 | Cascade Learning Localises Discriminant Features in Visual Scene Classification | Junwen Wang et.al. | 2311.12704v1 | null |
2023-11-20 | Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation | Wenhao Li et.al. | 2311.12028v1 | null |
2023-11-20 | GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration | Naoki Wake et.al. | 2311.12015v1 | null |
2023-11-20 | Evaluating Supervision Levels Trade-Offs for Infrared-Based People Counting | David Latortue et.al. | 2311.11974v1 | null |
2023-11-20 | SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks | Jin Ye et.al. | 2311.11969v1 | link |
2023-11-20 | Correlated Attention in Transformers for Multivariate Time Series | Quang Minh Nguyen et.al. | 2311.11959v1 | null |
2023-11-20 | Tubular Curvature Filter: Implicit Pointwise Curvature Calculation Method for Tubular Objects | Elifnur Sunger et.al. | 2311.11931v1 | null |
2023-11-20 | LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions | Songhao Han et.al. | 2311.11904v1 | null |
2023-11-20 | Multimodal Characterization of Emotion within Multimedia Space | Dayo Samuel Banjo et.al. | 2311.11892v1 | null |
2023-11-20 | SniffyArt: The Dataset of Smelling Persons | Mathias Zinnen et.al. | 2311.11888v1 | null |
2023-11-20 | Multi-Task Faces (MTF) Data Set: A Legally and Ethically Compliant Collection of Face Images for Various Classification Tasks | Rami Haffar et.al. | 2311.11882v1 | link |
2023-11-17 | Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning | Rohit Girdhar et.al. | 2311.10709v1 | null |
2023-11-17 | SpACNN-LDVAE: Spatial Attention Convolutional Latent Dirichlet Variational Autoencoder for Hyperspectral Pixel Unmixing | Soham Chitnis et.al. | 2311.10701v1 | null |
2023-11-17 | A note on the convergence of the Bayesian entropy estimator for exchangeable partitions | Servet Martinez et.al. | 2311.10698v1 | null |
2023-11-17 | Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections | Lihan Zha et.al. | 2311.10678v1 | link |
2023-11-17 | 3D-TexSeg: Unsupervised Segmentation of 3D Texture using Mutual Transformer Learning | Iyyakutti Iyappan Ganapathi et.al. | 2311.10651v1 | null |
2023-11-17 | User Dynamics-Aware Edge Caching and Computing for Mobile Virtual Reality | Mushu Li et.al. | 2311.10645v1 | null |
2023-11-17 | Image-Domain Material Decomposition for Dual-energy CT using Unsupervised Learning with Data-fidelity Loss | Junbo Peng et.al. | 2311.10641v1 | null |
2023-11-17 | Scaling TabPFN: Sketching and Feature Selection for Tabular Prior-Data Fitted Networks | Benjamin Feuer et.al. | 2311.10609v1 | null |
2023-11-17 | Designing Reconfigurable Intelligent Systems with Markov Blankets | Boris Sedlak et.al. | 2311.10597v1 | null |
2023-11-17 | FOCAL: A Cost-Aware Video Dataset for Active Learning | Kiran Kokilepersaud et.al. | 2311.10591v1 | link |
2023-11-16 | Traffic Video Object Detection using Motion Prior | Lihao Liu et.al. | 2311.10092v1 | null |
2023-11-16 | Moduli space of rank three logarithmic connections on the projective line with three poles | Takafumi Matsumoto et.al. | 2311.10071v1 | null |
2023-11-16 | Inherently Interpretable Time Series Classification via Multiple Instance Learning | Joseph Early et.al. | 2311.10049v1 | link |
2023-11-16 | On the potential of Carbon-Enhanced Metal-Poor stars for Galactic Archaeology | Aruna Goswami et.al. | 2311.10043v1 | null |
2023-11-16 | Match and Locate: low-frequency monocular odometry based on deep feature matching | Stepan Konev et.al. | 2311.10034v1 | null |
2023-11-16 | Revolutionizing Customer Interactions: Insights and Challenges in Deploying ChatGPT and Generative Chatbots for FAQs | Feriel Khennouche et.al. | 2311.09976v1 | null |
2023-11-16 | From Pretext to Purpose: Batch-Adaptive Self-Supervised Learning | Jiansong Zhang et.al. | 2311.09974v1 | null |
2023-11-16 | VertDetect: Fully End-to-End 3D Vertebral Instance Segmentation Model | Geoff Klein et.al. | 2311.09958v1 | null |
2023-11-16 | Harnessing Transformers: A Leap Forward in Lung Cancer Image Detection | Amine Bechar et.al. | 2311.09942v1 | null |
2023-11-17 | A Framework for Monitoring and Retraining Language Models in Real-World Applications | Jaykumar Kasundra et.al. | 2311.09930v2 | null |
2023-11-15 | Single-Image 3D Human Digitization with Shape-Guided Diffusion | Badour AlBahar et.al. | 2311.09221v1 | null |
2023-11-15 | ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy | Kirill Vishniakov et.al. | 2311.09215v1 | link |
2023-11-15 | Topology of Pulsar Profiles (ToPP). I. Graph theory method and classification of the EPN | D. Vohl et.al. | 2311.09201v1 | null |
2023-11-15 | ExpM+NF: Differentially Private Machine Learning that Surpasses DPSGD | Robert A. Bridges et.al. | 2311.09200v1 | null |
2023-11-15 | Domain Aligned CLIP for Few-shot Classification | Muhammad Waleed Gondal et.al. | 2311.09191v1 | null |
2023-11-15 | ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models | Jierui Li et.al. | 2311.09182v1 | null |
2023-11-15 | RBPGAN: Recurrent Back-Projection GAN for Video Super Resolution | Dareen Hussein et.al. | 2311.09178v1 | null |
2023-11-15 | Model Agnostic Explainable Selective Regression via Uncertainty Estimation | Andrea Pugnana et.al. | 2311.09145v1 | null |
2023-11-15 | Explainable Text Classification Techniques in Legal Document Review: Locating Rationales without Using Human Annotated Training Text Snippets | Christian Mahoney et.al. | 2311.09133v1 | null |
2023-11-15 | Cross-view and Cross-pose Completion for 3D Human Understanding | Matthieu Armando et.al. | 2311.09104v1 | null |
2023-11-14 | MVSA-Net: Multi-View State-Action Recognition for Robust and Deployable Trajectory Generation | Ehsan Asali et.al. | 2311.08393v1 | null |
2023-11-14 | USLR: an open-source tool for unbiased and smooth longitudinal registration of brain MR | Adrià Casamitjana et.al. | 2311.08371v1 | link |
2023-11-14 | Inverse Learning with Extremely Sparse Feedback for Recommendation | Guanyu Lin et.al. | 2311.08302v1 | null |
2023-11-14 | Level Set KSVD | Omer Sapir et.al. | 2311.08284v1 | null |
2023-11-14 | TENT: Connect Language Models with IoT Sensors for Zero-Shot Activity Recognition | Yunjiao Zhou et.al. | 2311.08245v1 | null |
2023-11-14 | MCMC to address model misspecification in Deep Learning classification of Radio Galaxies | Devina Mohan et.al. | 2311.08243v1 | null |
2023-11-14 | Learning Physics-Inspired Regularization for Medical Image Registration with Hypernetworks | Anna Reithmeir et.al. | 2311.08239v1 | link |
2023-11-14 | Counterfactual Explanation for Regression via Disentanglement in Latent Space | Xuan Zhao et.al. | 2311.08228v1 | null |
2023-11-14 | Uni-COAL: A Unified Framework for Cross-Modality Synthesis and Super-Resolution of MR Images | Zhiyun Song et.al. | 2311.08225v1 | null |
2023-11-14 | Eval-GCSC: A New Metric for Evaluating ChatGPT's Performance in Chinese Spelling Correction | Kunting Li et.al. | 2311.08219v1 | link |
2023-11-13 | GPT-4V(ision) as A Social Media Analysis Engine | Hanjia Lyu et.al. | 2311.07547v1 | link |
2023-11-13 | mlscorecheck: Testing the consistency of reported performance scores and experiments in machine learning | György Kovács et.al. | 2311.07541v1 | null |
2023-11-13 | FEMDA: a unified framework for discriminant analysis | Pierre Houdouin et.al. | 2311.07518v1 | null |
2023-11-13 | Reducing the Need for Backpropagation and Discovering Better Optima With Explicit Optimizations of Neural Networks | Jake Ryland Williams et.al. | 2311.07498v1 | null |
2023-11-13 | Towards Robotic Tree Manipulation: Leveraging Graph Representations | Chung Hee Kim et.al. | 2311.07479v1 | null |
2023-11-13 | Temporal Performance Prediction for Deep Convolutional Long Short-Term Memory Networks | Laura Fieback et.al. | 2311.07477v1 | null |
2023-11-13 | Masked Face Dataset Generation and Masked Face Recognition | Rui Cai et.al. | 2311.07475v1 | link |
2023-11-13 | A Bayesian Approach to Strong Lens Finding in the Era of Wide-area Surveys | Philip Holloway et.al. | 2311.07455v1 | null |
2023-11-13 | On the Robustness of Neural Collapse and the Neural Collapse of Robustness | Jingtong Su et.al. | 2311.07444v1 | null |
2023-11-13 | Optimising Human-AI Collaboration by Learning Convincing Explanations | Alex J. Chan et.al. | 2311.07426v1 | null |
2023-11-10 | Learning Human Action Recognition Representations Without Real Humans | Howard Zhong et.al. | 2311.06231v1 | link |
2023-11-10 | Semantic-aware Video Representation for Few-shot Action Recognition | Yutao Tang et.al. | 2311.06218v1 | null |
2023-11-10 | MultiIoT: Towards Large-scale Multisensory Learning for the Internet of Things | Shentong Mo et.al. | 2311.06217v1 | null |
2023-11-10 | Deep learning segmentation of fibrous cap in intravascular optical coherence tomography images | Juhwan Lee et.al. | 2311.06202v1 | null |
2023-11-10 | An Automated Pipeline for Tumour-Infiltrating Lymphocyte Scoring in Breast Cancer | Adam J Shephard et.al. | 2311.06185v1 | link |
2023-11-10 | Automatic Report Generation for Histopathology images using pre-trained Vision Transformers | Saurav Sengupta et.al. | 2311.06176v1 | null |
2023-11-10 | Two vertex geometrically irreducible algebras | Grzegorz Bobinski et.al. | 2311.06173v1 | null |
2023-11-10 | Time Scale Network: A Shallow Neural Network For Time Series Data | Trevor Meyer et.al. | 2311.06170v1 | null |
2023-11-10 | Deep Fast Vision: A Python Library for Accelerated Deep Transfer Learning Vision Prototyping | Fabi Prezja et.al. | 2311.06169v1 | link |
2023-11-10 | Going beyond persistent homology using persistent homology | Johanna Immonen et.al. | 2311.06152v1 | null |
2023-11-09 | FogROS2-Sky: Optimizing Latency and Cost for Multi-Cloud Robot Applications | Kaiyuan Chen et.al. | 2311.05600v1 | null |
2023-11-09 | A Coefficient Makes SVRG Effective | Yida Yin et.al. | 2311.05589v1 | link |
2023-11-09 | Outlier-Robust Wasserstein DRO | Sloan Nietert et.al. | 2311.05573v1 | link |
2023-11-09 | Exploring Emotion Expression Recognition in Older Adults Interacting with a Virtual Coach | Cristina Palmero et.al. | 2311.05567v1 | null |
2023-11-09 | Disentangling Quantum and Classical Contributions in Hybrid Quantum Machine Learning Architectures | Michael Kölle et.al. | 2311.05559v1 | null |
2023-11-09 | L-WaveBlock: A Novel Feature Extractor Leveraging Wavelets for Generative Adversarial Networks | Mirat Shah et.al. | 2311.05548v1 | null |
2023-11-09 | BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis | Hao-Bin Duan et.al. | 2311.05521v1 | null |
2023-11-09 | Dirichlet Active Learning | Kevin Miller et.al. | 2311.05501v1 | null |
2023-11-09 | Retinal OCT Synthesis with Denoising Diffusion Probabilistic Models for Layer Segmentation | Yuli Wu et.al. | 2311.05479v1 | null |
2023-11-09 | Robust Retraining-free GAN Fingerprinting via Personalized Normalization | Jianwei Fei et.al. | 2311.05478v1 | null |
2023-11-08 | Towards Few-Annotation Learning in Computer Vision: Application to Image Classification and Object Detection tasks | Quentin Bouniot et.al. | 2311.04888v1 | null |
2023-11-08 | Are foundation models efficient for medical image segmentation? | Danielle Ferreira et.al. | 2311.04847v1 | null |
2023-11-08 | Bayesian multi-band fitting of alerts for kilonovae detection | Biswajit Biswas et.al. | 2311.04845v1 | null |
2023-11-08 | Hierarchically Gated Recurrent Neural Network for Sequence Modeling | Zhen Qin et.al. | 2311.04823v1 | link |
2023-11-08 | A Lightweight Architecture for Real-Time Neuronal-Spike Classification | Muhammad Ali Siddiqi et.al. | 2311.04808v1 | null |
2023-11-08 | Determination of toxic comments and unintended model bias minimization using Deep learning approach | Md Azim Khan et.al. | 2311.04789v1 | null |
2023-11-08 | VioLA: Aligning Videos to 2D LiDAR Scans | Jun-Jee Chao et.al. | 2311.04783v1 | null |
2023-11-08 | FetMRQC: an open-source machine learning framework for multi-centric fetal brain MRI quality control | Thomas Sanchez et.al. | 2311.04780v1 | link |
2023-11-08 | GCS-ICHNet: Assessment of Intracerebral Hemorrhage Prognosis using Self-Attention with Domain Knowledge Integration | Xuhao Shan et.al. | 2311.04772v1 | link |
2023-11-08 | An attention-based deep learning network for predicting Platinum resistance in ovarian cancer | Haoming Zhuang et.al. | 2311.04769v1 | null |
2023-11-08 | Video Instance Matting | Jiachen Li et.al. | 2311.04212v2 | link |
2023-11-07 | JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction | Zhongfen Deng et.al. | 2311.04196v1 | link |
2023-11-07 | Linear to circular conversion in the polarized radio emission of a magnetar | Marcus E. Lower et.al. | 2311.04195v1 | null |
2023-11-07 | SpaDeLeF: A Dataset for Hierarchical Classification of Lexical Functions for Collocations in Spanish | Yevhen Kostiuk et.al. | 2311.04189v1 | null |
2023-11-07 | A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis | Dipanjyoti Paul et.al. | 2311.04157v1 | link |
2023-11-07 | Galaxy Spectra neural Network (GaSNet). II. Using Deep Learning for Spectral Classification and Redshift Predictions | Fucheng Zhong et.al. | 2311.04146v1 | null |
2023-11-07 | I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models | Shiwei Zhang et.al. | 2311.04145v1 | null |
2023-11-07 | Modelling Sentiment Analysis: LLMs and data augmentation techniques | Guillem Senabre Prades et.al. | 2311.04139v1 | null |
2023-11-07 | Improved Topological Preservation in 3D Axon Segmentation and Centerline Detection using Geometric Assessment-driven Topological Smoothing (GATS) | Nina I. Shamsi et.al. | 2311.04116v1 | null |
2023-11-07 | Joint modelling of recurrent and terminal events with discretely-distributed non-parametric frailty: application on re-hospitalizations and death in heart failure patients | Chiara Masci et.al. | 2311.04103v1 | null |
2023-11-06 | A Classification of Graphs through Quadratic Embedding Constants and Clique Graph Insights | Edy Tri Baskoro et.al. | 2311.03342v1 | null |
2023-11-06 | Tackling Concept Shift in Text Classification using Entailment-style Modeling | Sumegh Roychowdhury et.al. | 2311.03320v1 | null |
2023-11-06 | A Foundation Model for Music Informatics | Minz Won et.al. | 2311.03318v1 | link |
2023-11-06 | FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data | Lisa Weijler et.al. | 2311.03314v1 | link |
2023-11-06 | A Single 2D Pose with Context is Worth Hundreds for 3D Human Pose Estimation | Qitao Zhao et.al. | 2311.03312v1 | null |
2023-11-06 | Advancing Post Hoc Case Based Explanation with Feature Highlighting | Eoin Kenny et.al. | 2311.03246v1 | null |
2023-11-06 | Machine Learning-Based Tea Leaf Disease Detection: A Comprehensive Review | Faruk Ahmed et.al. | 2311.03240v1 | null |
2023-11-06 | Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources | Haotian Zheng et.al. | 2311.03236v1 | null |
2023-11-06 | Segmentation of Drone Collision Hazards in Airborne RADAR Point Clouds Using PointNet | Hector Arroyo et.al. | 2311.03221v1 | null |
2023-11-06 | Leveraging Transformers to Improve Breast Cancer Classification and Risk Assessment with Multi-modal and Longitudinal Data | Yiqiu Shen et.al. | 2311.03217v1 | null |
2023-11-03 | LOTUS: Continual Imitation Learning for Robot Manipulation Through Unsupervised Skill Discovery | Weikang Wan et.al. | 2311.02058v1 | null |
2023-11-03 | MetaFast: Enabling Fast Metagenomic Classification via Seed Counting and Edit Distance Approximation | Arvid E. Gollwitzer et.al. | 2311.02029v1 | null |
2023-11-03 | A Structured Pruning Algorithm for Model-based Deep Learning | Chicago Park et.al. | 2311.02003v1 | null |
2023-11-03 | Detection of keratoconus Diseases using deep Learning | AKM Enzam-Ul Haque et.al. | 2311.01996v1 | null |
2023-11-03 | Obtaining Explainable Classification Models using Distributionally Robust Optimization | Sanjeeb Dash et.al. | 2311.01994v1 | null |
2023-11-03 | Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation | Shichao Dong et.al. | 2311.01989v1 | null |
2023-11-06 | RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches | Jiayuan Gu et.al. | 2311.01977v2 | null |
2023-11-03 | Welded graphs, Wirtinger groups and knotted punctured spheres | Benjamin Audoux et.al. | 2311.01922v1 | null |
2023-11-03 | Contrast-Agnostic Groupwise Registration by Robust PCA for Quantitative Cardiac MRI | Xinqi Li et.al. | 2311.01916v1 | null |
2023-11-03 | VQPy: An Object-Oriented Approach to Modern Video Analytics | Shan Yu et.al. | 2311.01623v1 | null |
2023-11-02 | Tailoring Mixup to Data using Kernel Warping functions | Quentin Bouniot et.al. | 2311.01434v1 | link |
2023-11-02 | Identifying Alzheimer Disease Dementia Levels Using Machine Learning Methods | Md Gulzar Hussain et.al. | 2311.01428v1 | null |
2023-11-02 | Exploring Deep Learning Techniques for Glaucoma Detection: A Comprehensive Review | Aized Amin Soofi et.al. | 2311.01425v1 | null |
2023-11-02 | Holistic Transfer: Towards Non-Disruptive Fine-Tuning with Partial Target Data | Cheng-Hao Tu et.al. | 2311.01420v1 | null |
2023-11-02 | Learning to See Physical Properties with Active Sensing Motor Policies | Gabriel B. Margolis et.al. | 2311.01405v1 | null |
2023-11-02 | Sim2Real Bilevel Adaptation for Object Surface Classification using Vision-Based Tactile Sensors | Gabriele M. Caddeo et.al. | 2311.01380v1 | link |
2023-11-02 | Deep learning based Image Compression for Microscopy Images: An Empirical Study | Yu Zhou et.al. | 2311.01352v1 | null |
2023-11-02 | Unreading Race: Purging Protected Features from Chest X-ray Embeddings | Tobias Weber et.al. | 2311.01349v1 | null |
2023-11-02 | Scattering Vision Transformer: Spectral Mixing Matters | Badri N. Patro et.al. | 2311.01310v1 | null |
2023-11-02 | Hybrid-Fusion Transformer for Multisequence MRI | Jihoon Cho et.al. | 2311.01308v1 | null |
2023-11-01 | Software Repositories and Machine Learning Research in Cyber Security | Mounika Vanamala et.al. | 2311.00691v1 | null |
2023-11-01 | What User Behaviors Make the Differences During the Process of Visual Analytics? | Shahin Doroudian et.al. | 2311.00690v1 | null |
2023-11-01 | Deep Learning-Based Classification of Gamma Photon Interactions in Room-Temperature Semiconductor Radiation Detectors | Sandeep K. Chaudhuri et.al. | 2311.00682v1 | null |
2023-11-01 | Latent Space Translation via Semantic Alignment | Valentino Maiorca et.al. | 2311.00664v1 | link |
2023-11-01 | Rediscussion of eclipsing binaries. Paper XV. The B-type supergiant system V1765 Cygni | John Southworth et.al. | 2311.00655v1 | null |
2023-11-02 | Emergence of Collective Open-Ended Exploration from Decentralized Meta-Reinforcement Learning | Richard Bornemann et.al. | 2311.00651v2 | null |
2023-11-01 | Understanding the Issues and Causes in WebAssembly Application Development: A Mining-based Study | Muhammad Waseem et.al. | 2311.00646v1 | null |
2023-11-01 | A Bi-level Framework for Traffic Accident Duration Prediction: Leveraging Weather and Road Condition Data within a Practical Optimum Pipeline | Rafat Tabassum Sukonna et.al. | 2311.00634v1 | null |
2023-11-01 | Controllable Music Production with Diffusion Models and Guidance Gradients | Mark Levy et.al. | 2311.00613v1 | null |
2023-11-01 | A Robust Deep Learning Method with Uncertainty Estimation for the Pathological Classification of Renal Cell Carcinoma based on CT Images | Ni Yao et.al. | 2311.00567v1 | null |
2023-10-31 | Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders | Srijan Das et.al. | 2310.20704v1 | null |
2023-10-31 | SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction | Xinyuan Chen et.al. | 2310.20700v1 | null |
2023-10-31 | StairNet: Visual Recognition of Stairs for Human-Robot Locomotion | Andrew Garrett Kurbis et.al. | 2310.20666v1 | null |
2023-10-31 | Performance Improvement in Multi-class Classification via Automated Hierarchy Generation and Exploitation through Extended LCPN Schemes | Celal Alagoz et.al. | 2310.20641v1 | null |
2023-10-31 | Deepfake detection by exploiting surface anomalies: the SurFake approach | Andrea Ciamarra et.al. | 2310.20621v1 | null |
2023-10-31 | Enhanced Synthetic MRI Generation from CT Scans Using CycleGAN with Feature Extraction | Saba Nikbakhsh et.al. | 2310.20604v1 | null |
2023-10-31 | Finiteness properties for Shimura curves and modified diagonal cycles | Congling Qiu et.al. | 2310.20600v1 | null |
2023-10-31 | Brain-like Flexible Visual Inference by Harnessing Feedback-Feedforward Alignment | Tahereh Toosi et.al. | 2310.20599v1 | link |
2023-10-31 | Tracially Complete C*-Algebras | José R. Carrión et.al. | 2310.20594v1 | null |
2023-10-31 | Strongly Magnetized Tidal Disruption Event Disks via Stream Injection in GRMHD | Brandon Curd et.al. | 2310.20592v1 | null |
2023-10-29 | Improved Motor Imagery Classification Using Adaptive Spatial Filters Based on Particle Swarm Optimization Algorithm | Xiong Xiong et.al. | 2310.19202v1 | null |
2023-10-29 | Enhancing Motor Imagery Decoding in Brain Computer Interfaces using Riemann Tangent Space Mapping and Cross Frequency Coupling | Xiong Xiong et.al. | 2310.19198v1 | null |
2023-10-29 | A Survey on Watching Social Issue Videos among YouTube and TikTok Users | Shuo Niu et.al. | 2310.19193v1 | null |
2023-10-29 | Subjective Quality Evaluation of Point Clouds Using a Head Mounted Display | Joao Prazeres et.al. | 2310.19179v1 | null |
2023-10-29 | Robustifying Language Models with Test-Time Adaptation | Noah Thomas McDermott et.al. | 2310.19177v1 | null |
2023-10-29 | Predicting recovery following stroke: deep learning, multimodal data and feature selection using explainable AI | Adam White et.al. | 2310.19174v1 | null |
2023-10-29 | BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping | Srikumar Sastry et.al. | 2310.19168v1 | link |
2023-10-29 | Unified Representation for Non-compositional and Compositional Expressions | Ziheng Zeng et.al. | 2310.19127v1 | null |
2023-10-29 | Efficient IoT Inference via Context-Awareness | Mohammad Mehdi Rastikerdar et.al. | 2310.19112v1 | null |
2023-10-29 | Pushdown Layers: Encoding Recursive Structure in Transformer Language Models | Shikhar Murty et.al. | 2310.19089v1 | null |
2023-10-27 | Addressing GAN Training Instabilities via Tunable Classification Losses | Monica Welfert et.al. | 2310.18291v1 | null |
2023-10-27 | PlantPlotGAN: A Physics-Informed Generative Adversarial Network for Plant Disease Prediction | Felipe A. Lopes et.al. | 2310.18268v1 | null |
2023-10-27 | MalFake: A Multimodal Fake News Identification for Malayalam using Recurrent Neural Networks and VGG-16 | Adhish S. Sujan et.al. | 2310.18263v1 | null |
2023-10-27 | Edge AI-Based Vein Detector for Efficient Venipuncture in the Antecubital Fossa | Edwin Salcedo et.al. | 2310.18234v1 | null |
2023-10-27 | TBDLNet: a network for classifying multidrug-resistant and drug-sensitive tuberculosis | Ziquan Zhu et.al. | 2310.18222v1 | null |
2023-10-27 | ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models | Benjamin Feuer et.al. | 2310.18208v1 | link |
2023-10-27 | Artifact-Robust Graph-Based Learning in Digital Pathology | Saba Heidari Gheshlaghi et.al. | 2310.18192v1 | null |
2023-10-27 | Globular clusters and bar: captured or not captured? | Anton A. Smirnov et.al. | 2310.18172v1 | null |
2023-10-27 | Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN | Neeraj Kumar et.al. | 2310.18169v1 | null |
2023-10-27 | DESiRED -- Dynamic, Enhanced, and Smart iRED: A P4-AQM with Deep Reinforcement Learning and In-band Network Telemetry | Leandro C. de Almeida et.al. | 2310.18159v1 | null |
2023-10-26 | A Coarse-to-Fine Pseudo-Labeling (C2FPL) Framework for Unsupervised Video Anomaly Detection | Anas Al-lahham et.al. | 2310.17650v1 | null |
2023-10-26 | torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP | Yoshitomo Matsubara et.al. | 2310.17644v1 | link |
2023-10-26 | Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models | Tsun-Hsuan Wang et.al. | 2310.17642v1 | null |
2023-10-26 | Skew Products on the Berkovich Projective Line | Richard A. P. Birkett et.al. | 2310.17628v1 | null |
2023-10-26 | A Survey on Transferability of Adversarial Examples across Deep Neural Networks | Jindong Gu et.al. | 2310.17626v1 | link |
2023-10-26 | MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations | Ajay Mandlekar et.al. | 2310.17596v1 | null |
2023-10-26 | Linear | Jerson Caro et.al. | 2310.17592v1 | null |
2023-10-26 | A minimax optimal control approach for robust neural ODEs | Cristina Cipriani et.al. | 2310.17584v1 | null |
2023-10-26 | BLIS-Net: Classifying and Analyzing Signals on Graphs | Charles Xu et.al. | 2310.17579v1 | null |
2023-10-26 | Knots bounding non-isotopic ribbon disks | Jeffrey Meier et.al. | 2310.17564v1 | null |
2023-10-25 | RDBench: ML Benchmark for Relational Databases | Zizhao Zhang et.al. | 2310.16837v1 | link |
2023-10-25 | TD-MPC2: Scalable, Robust World Models for Continuous Control | Nicklas Hansen et.al. | 2310.16828v1 | null |
2023-10-26 | Deep machine learning for meteor monitoring: advances with transfer learning and gradient-weighted class activation mapping | Eloy Peña-Asensio et.al. | 2310.16826v2 | null |
2023-10-25 | Uncovering a new group of T Tauri stars in the Taurus-Auriga molecular complex from Gaia and GALEX data | Ana Inés Gómez de Castro et.al. | 2310.16820v1 | null |
2023-10-25 | Using Diffusion Models to Generate Synthetic Labelled Data for Medical Image Segmentation | Daniel Saragih et.al. | 2310.16794v1 | null |
2023-10-25 | Navigating Socio-Emotional Risk through Comfort-Building in a Physics Teaching Community of Practice: A Case Study | Maggie Mahmood et.al. | 2310.16778v1 | null |
2023-10-25 | IntenDD: A Unified Contrastive Learning Approach for Intent Detection and Discovery | Bhavuk Singhal et.al. | 2310.16761v1 | null |
2023-10-25 | Interferometric Neural Networks | Arun Sehrawat et.al. | 2310.16742v1 | link |
2023-10-25 | A No-Reference Quality Assessment Method for Digital Human Head | Yingjie Zhou et.al. | 2310.16732v1 | null |
2023-10-25 | Spherical Wavefront Near-Field DoA Estimation in THz Automotive Radar | Ahmet M. Elbir et.al. | 2310.16724v1 | null |
2023-10-24 | From Posterior Sampling to Meaningful Diversity in Image Restoration | Noa Cohen et.al. | 2310.16047v1 | null |
2023-10-24 | Finetuning Offline World Models in the Real World | Yunhai Feng et.al. | 2310.16029v1 | null |
2023-10-24 | Human-in-the-Loop Task and Motion Planning for Imitation Learning | Ajay Mandlekar et.al. | 2310.16014v1 | null |
2023-10-24 | CVPR 2023 Text Guided Video Editing Competition | Jay Zhangjie Wu et.al. | 2310.16003v1 | null |
2023-10-24 | Vision-Language Pseudo-Labels for Single-Positive Multi-Label Learning | Xin Xing et.al. | 2310.15985v1 | link |
2023-10-24 | Geometry-Aware Video Quality Assessment for Dynamic Digital Human | Zicheng Zhang et.al. | 2310.15984v1 | null |
2023-10-24 | Minimax Forward and Backward Learning of Evolving Tasks with Performance Guarantees | Verónica Álvarez et.al. | 2310.15974v1 | link |
2023-10-24 | Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection | Manyuan Zhang et.al. | 2310.15955v1 | null |
2023-10-25 | Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles | Xing Shen et.al. | 2310.15952v2 | null |
2023-10-24 | ShARc: Shape and Appearance Recognition for Person Identification In-the-wild | Haidong Zhu et.al. | 2310.15946v1 | null |
2023-10-23 | FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling | Haonan Qiu et.al. | 2310.15169v1 | null |
2023-10-23 | Bitrate Ladder Prediction Methods for Adaptive Video Streaming: A Review and Benchmark | Ahmed Telili et.al. | 2310.15163v1 | null |
2023-10-23 | Linear Representations of Sentiment in Large Language Models | Curt Tigges et.al. | 2310.15154v1 | null |
2023-10-23 | Unlocking the Transferability of Tokens in Deep Models for Tabular Data | Qi-Le Zhou et.al. | 2310.15149v1 | null |
2023-10-23 | When Should the FDA Inspect Pharmaceutical Manufacturing Facilities to Better Mitigate Drug Shortages? | Daniel Kosmas et.al. | 2310.15146v1 | null |
2023-10-23 | Novel-View Acoustic Synthesis from 3D Reconstructed Rooms | Byeongjoo Ahn et.al. | 2310.15130v1 | link |
2023-10-23 | Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models | Gabriel Sarch et.al. | 2310.15127v1 | null |
2023-10-23 | SpVOS: Efficient Video Object Segmentation with Triple Sparse Convolution | Weihao Lin et.al. | 2310.15115v1 | null |
2023-10-23 | The Self 2.0: How AI-Enhanced Self-Clones Transform Self-Perception and Improve Presentation Skills | Qingxiao Zheng et.al. | 2310.15112v1 | null |
2023-10-23 | Matryoshka Diffusion Models | Jiatao Gu et.al. | 2310.15111v1 | null |
2023-10-20 | Using Human-like Mechanism to Weaken Effect of Pre-training Weight Bias in Face-Recognition Convolutional Neural Network | Haojiang Ying et.al. | 2310.13674v1 | null |
2023-10-23 | Explainable Depression Symptom Detection in Social Media | Eliseo Bao Souto et.al. | 2310.13664v2 | null |
2023-10-20 | Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification | Amr Keleg et.al. | 2310.13661v1 | link |
2023-10-20 | Optimal Transport for Measures with Noisy Tree Metric | Tam Le et.al. | 2310.13653v1 | null |
2023-10-20 | Principal | Shigeo Koshitani et.al. | 2310.13621v1 | null |
2023-10-20 | Skin Lesion Segmentation Improved by Transformer-based Networks with Inter-scale Dependency Modeling | Sania Eskandari et.al. | 2310.13604v1 | link |
2023-10-20 | Classification of quantum states of light using random measurements through a multimode fiber | Saroch Leedumrongwatthanakun et.al. | 2310.13599v1 | null |
2023-10-20 | Longer-range Contextualized Masked Autoencoder | Taekyung Kim et.al. | 2310.13593v1 | null |
2023-10-20 | POTLoc: Pseudo-Label Oriented Transformer for Point-Supervised Temporal Action Localization | Elahe Vahdani et.al. | 2310.13585v1 | null |
2023-10-20 | Progressive Dual Priori Network for Generalized Breast Tumor Segmentation | Li Wang et.al. | 2310.13574v1 | null |
2023-10-19 | Putting the Object Back into Video Object Segmentation | Ho Kei Cheng et.al. | 2310.12982v1 | link |
2023-10-19 | Variational Inference for SDEs Driven by Fractional Noise | Rembert Daems et.al. | 2310.12975v1 | null |
2023-10-19 | Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Ziqi Pang et.al. | 2310.12973v1 | link |
2023-10-19 | Bialgebra structures on flat Lie algebras | Amine Bahayou et.al. | 2310.12966v1 | null |
2023-10-19 | End-to-End Delay Minimization based on Joint Optimization of DNN Partitioning and Resource Allocation for Cooperative Edge Inference | Xinrui Ye et.al. | 2310.12937v1 | null |
2023-10-19 | Digital Twin-Enabled Intelligent DDoS Detection Mechanism for Autonomous Core Networks | Yagmur Yigit et.al. | 2310.12924v1 | null |
2023-10-19 | Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning | Juan Rocamonde et.al. | 2310.12921v1 | null |
2023-10-19 | Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey | Oriane Siméoni et.al. | 2310.12904v1 | link |
2023-10-19 | A Markovian dynamics for | Antonio C. Costa et.al. | 2310.12883v1 | link |
2023-10-19 | Perceptual Assessment and Optimization of High Dynamic Range Image Rendering | Peibei Cao et.al. | 2310.12877v1 | null |
2023-10-18 | SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks | Mohammadreza Salehi et.al. | 2310.12126v1 | null |
2023-10-18 | Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture | Daniel Y. Fu et.al. | 2310.12109v1 | null |
2023-10-18 | HSTR-Net: Reference Based Video Super-resolution for Aerial Surveillance with Dual Cameras | H. Umut Suluhan et.al. | 2310.12092v1 | null |
2023-10-18 | Chemical Analysis of the Brightest Star of the Cetus II Ultra-Faint Dwarf Galaxy Candidate | K. B. Webber et.al. | 2310.12090v1 | null |
2023-10-18 | One-Shot Imitation Learning: A Pose Estimation Perspective | Pietro Vitiello et.al. | 2310.12077v1 | null |
2023-10-18 | Exploring Fairness in Pre-trained Visual Transformer based Natural and GAN Generated Image Detection Systems and Understanding the Impact of Image Compression in Fairness | Manjary P. Gangan et.al. | 2310.12076v1 | null |
2023-10-18 | Black-Box Training Data Identification in GANs via Detector Networks | Lukman Olagoke et.al. | 2310.12063v1 | null |
2023-10-19 | Robust Class-Conditional Distribution Alignment for Partial Domain Adaptation | Sandipan Choudhuri et.al. | 2310.12060v2 | null |
2023-10-18 | Exact and efficient solutions of the LMC Multitask Gaussian Process model | Olivier Truffinet et.al. | 2310.12032v1 | link |
2023-10-18 | CORE: A Few-Shot Company Relation Classification Dataset for Robust Domain Adaptation | Philipp Borchert et.al. | 2310.12024v1 | link |
2023-10-17 | DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis | Youngjoong Kwon et.al. | 2310.11449v1 | null |
2023-10-18 | 4K4D: Real-Time 4D View Synthesis at 4K Resolution | Zhen Xu et.al. | 2310.11448v2 | null |
2023-10-18 | EvalCrafter: Benchmarking and Evaluating Large Video Generation Models | Yaofang Liu et.al. | 2310.11440v2 | null |
2023-10-17 | Transitive generalized toggle groups containing a cycle | Jonathan S. Bloom et.al. | 2310.11387v1 | null |
2023-10-17 | DialogueLLM: Context and Emotion Knowledge-Tuned LLaMA Models for Emotion Recognition in Conversations | Yazhou Zhang et.al. | 2310.11374v1 | null |
2023-10-17 | VECHR: A Dataset for Explainable and Robust Classification of Vulnerability Type in the European Court of Human Rights | Shanshan Xu et.al. | 2310.11368v1 | null |
2023-10-17 | Lie Group Decompositions for Equivariant Neural Networks | Mircea Mironenco et.al. | 2310.11366v1 | null |
2023-10-17 | Hybrid quantum-classical graph neural networks for tumor classification in digital pathology | Anupama Ray et.al. | 2310.11353v1 | null |
2023-10-17 | The effect of stemming and lemmatization on Portuguese fake news text classification | Lucca de Freitas Santos et.al. | 2310.11344v1 | null |
2023-10-17 | Influencing factors on false positive rates when classifying tumor cell line response to drug treatment | Priyanka Vasanthakumari et.al. | 2310.11329v1 | null |
2023-10-16 | A Survey on Video Diffusion Models | Zhen Xing et.al. | 2310.10647v1 | link |
2023-10-16 | Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting | Zeyu Yang et.al. | 2310.10642v1 | link |
2023-10-16 | Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models | Kevin Black et.al. | 2310.10639v1 | null |
2023-10-16 | Efficacy of Dual-Encoders for Extreme Multi-Label Classification | Nilesh Gupta et.al. | 2310.10636v1 | null |
2023-10-16 | Overcoming the Rayleigh limit in extremely low SNR | Hyunsoo Choi et.al. | 2310.10633v1 | null |
2023-10-16 | Video Language Planning | Yilun Du et.al. | 2310.10625v1 | null |
2023-10-16 | DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing | Jia-Wei Liu et.al. | 2310.10624v1 | null |
2023-10-16 | BiLL-VTG: Bridging Large Language Models and Lightweight Visual Tools for Video-based Texts Generation | Ji Qi et.al. | 2310.10586v1 | null |
2023-10-16 | RefConv: Re-parameterized Refocusing Convolution for Powerful ConvNets | Zhicheng Cai et.al. | 2310.10563v1 | link |
2023-10-16 | Deep learning applied to EEG data with different montages using spatial attention | Dung Truong et.al. | 2310.10550v1 | null |
2023-10-13 | An Unbiased Look at Datasets for Visuo-Motor Pre-Training | Sudeep Dasari et.al. | 2310.09289v1 | null |
2023-10-13 | Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning | Geri Skenderi et.al. | 2310.09278v1 | null |
2023-10-13 | A Hybrid Approach for Depression Classification: Random Forest-ANN Ensemble on Motor Activity Signals | Anket Patil et.al. | 2310.09277v1 | null |
2023-10-13 | PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Programming | Chufan Gao et.al. | 2310.09265v1 | null |
2023-10-13 | Political claim identification and categorization in a multilingual setting: First experiments | Urs Zaberer et.al. | 2310.09256v1 | null |
2023-10-13 | It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models | Lin Chen et.al. | 2310.09250v1 | null |
2023-10-13 | A Multifaceted Look at Starlink Performance | Nitinder Mohan et.al. | 2310.09242v1 | null |
2023-10-13 | Time CNN and Graph Convolution Network for Epileptic Spike Detection in MEG Data | Pauline Mouches et.al. | 2310.09236v1 | null |
2023-10-13 | Ultrasound Image Segmentation of Thyroid Nodule via Latent Semantic Feature Co-Registration | Xuewei Li et.al. | 2310.09221v1 | null |
2023-10-13 | PaLI-3 Vision Language Models: Smaller, Faster, Stronger | Xi Chen et.al. | 2310.09199v1 | null |
2023-10-12 | Octopus: Embodied Vision-Language Programmer from Environmental Feedback | Jingkang Yang et.al. | 2310.08588v1 | link |
2023-10-12 | Is Generalized Dynamic Novel View Synthesis from Monocular Videos Possible Today? | Xiaoming Zhao et.al. | 2310.08587v1 | null |
2023-10-12 | Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes | Haotong Lin et.al. | 2310.08585v1 | null |
2023-10-12 | Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video | Shashanka Venkataramanan et.al. | 2310.08584v1 | null |
2023-10-12 | Universal Visual Decomposer: Long-Horizon Manipulation Made Easy | Zichen Zhang et.al. | 2310.08581v1 | null |
2023-10-12 | Learning to Act from Actionless Videos through Dense Correspondences | Po-Chen Ko et.al. | 2310.08576v1 | null |
2023-10-12 | Effective isometries of periodic shells | Hussein Nassar et.al. | 2310.08531v1 | null |
2023-10-12 | LLM-augmented Preference Learning from Natural Language | Inwon Kang et.al. | 2310.08523v1 | null |
2023-10-12 | Impact of time and note duration tokenizations on deep learning symbolic music modeling | Nathan Fradet et.al. | 2310.08497v1 | link |
2023-10-12 | GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language Models | Yuanchun Shen et.al. | 2310.08487v1 | link |
2023-10-11 | ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models | Yingqing He et.al. | 2310.07702v1 | link |
2023-10-11 | ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation | Bo Peng et.al. | 2310.07697v1 | null |
2023-10-11 | Large-scale photonic computing with nonlinear disordered media | Hao Wang et.al. | 2310.07690v1 | null |
2023-10-11 | Deep Video Inpainting Guided by Audio-Visual Self-Supervision | Kyuyeon Kim et.al. | 2310.07663v1 | null |
2023-10-11 | Hypercomplex Multimodal Emotion Recognition from EEG and Peripheral Physiological Signals | Eleonora Lopez et.al. | 2310.07648v1 | null |
2023-10-11 | Attention-Map Augmentation for Hypercomplex Breast Cancer Classification | Eleonora Lopez et.al. | 2310.07633v1 | null |
2023-10-11 | Differentiable Euler Characteristic Transforms for Shape Classification | Ernst Roell et.al. | 2310.07630v1 | link |
2023-10-11 | Time-Resolved Reconstruction of Motion, Force, and Stiffness using Spectro-Dynamic MRI | Max H. C. van Riel et.al. | 2310.07622v1 | null |
2023-10-11 | Reinforcement Learning-based Knowledge Graph Reasoning for Explainable Fact-checking | Gustav Nikopensius et.al. | 2310.07613v1 | null |
2023-10-11 | QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking | Liangming Pan et.al. | 2310.07609v1 | link |
2023-10-10 | Convivial Solipsism as a maximally perspectival interpretation | Herve Zwirn et.al. | 2310.06815v1 | null |
2023-10-10 | A Supervised Embedding and Clustering Anomaly Detection method for classification of Mobile Network Faults | R. Mosayebi et.al. | 2310.06779v1 | null |
2023-10-10 | Optical assembly of nanostructures mediated by surface roughness | Robert G. Felsted et.al. | 2310.06774v1 | null |
2023-10-10 | Uni3D: Exploring Unified 3D Representation at Scale | Junsheng Zhou et.al. | 2310.06773v1 | link |
2023-10-10 | Improved convergence rates for some kernel random forest algorithms | Isidoros Iakovidis et.al. | 2310.06760v1 | null |
2023-10-10 | Geographic Location Encoding with Spherical Harmonics and Sinusoidal Representation Networks | Marc Rußwurm et.al. | 2310.06743v1 | link |
2023-10-10 | Multi-domain improves out-of-distribution and data-limited scenarios for medical image analysis | Ece Ozkan et.al. | 2310.06737v1 | null |
2023-10-10 | S4Sleep: Elucidating the design space of deep-learning-based sleep stage classification models | Tiezhi Wang et.al. | 2310.06715v1 | link |
2023-10-10 | Tertiary Lymphoid Structures Generation through Graph-based Diffusion | Manuel Madeira et.al. | 2310.06661v1 | null |
2023-10-10 | Assessing the Impact of a Supervised Classification Filter on Flow-based Hybrid Network Anomaly Detection | Dominik Macko et.al. | 2310.06656v1 | link |
2023-10-09 | FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing | Yuren Cong et.al. | 2310.05922v1 | null |
2023-10-09 | Enumerating Calabi-Yau Manifolds: Placing bounds on the number of diffeomorphism classes in the Kreuzer-Skarke list | Aditi Chandra et.al. | 2310.05909v1 | null |
2023-10-09 | ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models | Kaiwen Zhou et.al. | 2310.05872v1 | null |
2023-10-10 | Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models | Guangzhi Sun et.al. | 2310.05863v2 | link |
2023-10-09 | Latent Wander: an Alternative Interface for Interactive and Serendipitous Discovery of Large AV Archives | Yuchen Yang et.al. | 2310.05835v1 | null |
2023-10-09 | Write What You Want: Applying Text-to-video Retrieval to Audiovisual Archives | Yuchen Yang et.al. | 2310.05825v1 | null |
2023-10-09 | Dipole-Spread Function Engineering for 6D Super-Resolution Microscopy | Tingting Wu et.al. | 2310.05810v1 | null |
2023-10-09 | A Simple Open-Loop Baseline for Reinforcement Learning Locomotion Tasks | Antonin Raffin et.al. | 2310.05808v1 | null |
2023-10-09 | Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis | Haoyu Zhang et.al. | 2310.05804v1 | null |
2023-10-10 | Two-timescale Derivative Free Optimization for Performative Prediction with Markovian Data | Haitong Liu et.al. | 2310.05792v2 | null |
2023-10-06 | Exploiting Transformer Activation Sparsity with Dynamic Inference | Mikołaj Piórczyński et.al. | 2310.04361v1 | null |
2023-10-06 | SwimXYZ: A large-scale dataset of synthetic swimming motions and videos | Fiche Guénolé et.al. | 2310.04360v1 | null |
2023-10-06 | Large-Scale Korean Text Dataset for Classifying Biased Speech in Real-World Online Services | Dasol Choi et.al. | 2310.04313v1 | null |
2023-10-06 | Convergent ADMM Plug and Play PET Image Reconstruction | Florent Sureau et.al. | 2310.04299v1 | null |
2023-10-06 | A Plug-and-Play Image Registration Network | Junhao Hu et.al. | 2310.04297v1 | null |
2023-10-06 | Towards Non-contact 3D Ultrasound for Wrist Imaging | Antony Jerald et.al. | 2310.04296v1 | null |
2023-10-06 | Spectroscopic variability of massive pre-main-sequence stars in M17 | A. R. Derkink et.al. | 2310.04287v1 | null |
2023-10-06 | Multi-Industry Simplex : A Probabilistic Extension of GICS | Maksim Papenkov et.al. | 2310.04280v1 | null |
2023-10-06 | Bringing Quantum Algorithms to Automated Machine Learning: A Systematic Review of AutoML Frameworks Regarding Extensibility for QML Algorithms | Dennis Klau et.al. | 2310.04238v1 | null |
2023-10-06 | Written and spoken corpus of real and fake social media postings about COVID-19 | Ng Bee Chin et.al. | 2310.04237v1 | null |
2023-10-05 | The Un-Kidnappable Robot: Acoustic Localization of Sneaking People | Mengyu Yang et.al. | 2310.03743v1 | null |
2023-10-05 | Agent Instructs Large Language Models to be General Zero-Shot Reasoners | Nicholas Crispino et.al. | 2310.03710v1 | link |
2023-10-05 | OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks | Ofir Bar Tal et.al. | 2310.03707v1 | null |
2023-10-05 | Role of Spatial Coherence in Diffractive Optical Neural Networks | Matthew J. Filipovich et.al. | 2310.03679v1 | null |
2023-10-05 | Certification of Deep Learning Models for Medical Image Segmentation | Othmane Laousy et.al. | 2310.03664v1 | null |
2023-10-05 | Autoregressive Coefficients based Intelligent Protection of Transmission Lines Connected to Type-3 Wind Farms | Pallav Kumar Bera et.al. | 2310.03663v1 | null |
2023-10-05 | Robustness-Guided Image Synthesis for Data-Free Quantization | Jianhong Bai et.al. | 2310.03661v1 | null |
2023-10-05 | Balancing Autonomy and Alignment: A Multi-Dimensional Taxonomy for Autonomous LLM-powered Multi-Agent Architectures | Thorsten Händler et.al. | 2310.03659v1 | null |
2023-10-05 | Strategic Evaluation: Subjects, Evaluators, and Society | Benjamin Laufer et.al. | 2310.03655v1 | null |
2023-10-05 | CLEVRER-Humans: Describing Physical and Causal Events the Human Way | Jiayuan Mao et.al. | 2310.03635v1 | null |
2023-10-04 | SemiReward: A General Reward Model for Semi-supervised Learning | Siyuan Li et.al. | 2310.03013v1 | link |
2023-10-04 | High-dimensional SGD aligns with emerging outlier eigenspaces | Gerard Ben Arous et.al. | 2310.03010v1 | null |
2023-10-05 | IBCL: Zero-shot Model Generation for Task Trade-offs in Continual Learning | Pengyuan Lu et.al. | 2310.02995v2 | link |
2023-10-04 | Multiple Physics Pretraining for Physical Surrogate Models | Michael McCabe et.al. | 2310.02994v1 | null |
2023-10-04 | UniverSLU: Universal Spoken Language Understanding for Diverse Classification and Sequence Generation Tasks with a Single Network | Siddhant Arora et.al. | 2310.02973v1 | null |
2023-10-04 | Fully Automatic Segmentation of Gross Target Volume and Organs-at-Risk for Radiotherapy Planning of Nasopharyngeal Carcinoma | Mehdi Astaraki et.al. | 2310.02972v1 | null |
2023-10-04 | Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model | Kai-Wei Chang et.al. | 2310.02971v1 | null |
2023-10-05 | Co-modeling the Sequential and Graphical Routes for Peptide Representation Learning | Zihan Liu et.al. | 2310.02964v2 | link |
2023-10-04 | CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection | Yang Cao et.al. | 2310.02960v1 | link |
2023-10-04 | HappyFeat -- An interactive and efficient BCI framework for clinical applications | Arthur Desbois et.al. | 2310.02948v1 | null |
2023-10-03 | DREAM: Visual Decoding from Reversing Human Visual System | Weihao Xia et.al. | 2310.02265v1 | null |
2023-10-03 | RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving | Tong Zhao et.al. | 2310.02262v1 | null |
2023-10-03 | Harnessing Pre-Trained Sentence Transformers for Offensive Language Detection in Indian Languages | Ananya Joshi et.al. | 2310.02249v1 | null |
2023-10-04 | Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks | Greg Yang et.al. | 2310.02244v2 | null |
2023-10-03 | MIS-AVioDD: Modality Invariant and Specific Representation for Audio-Visual Deepfake Detection | Vinaya Sree Katamneni et.al. | 2310.02234v1 | null |
2023-10-03 | HoloNets: Spectral Convolutions do extend to Directed Graphs | Christian Koke et.al. | 2310.02232v1 | null |
2023-10-03 | Extraction of Medication and Temporal Relation from Clinical Text by Harnessing Different Deep Learning Models | Hangyu Tu et.al. | 2310.02229v1 | null |
2023-10-03 | Symmetry-based classification of exact flat bands in single and bilayer moiré systems | Siddhartha Sarkar et.al. | 2310.02218v1 | null |
2023-10-03 | Learnable Data Augmentation for One-Shot Unsupervised Domain Adaptation | Julio Ivan Davila Carrazco et.al. | 2310.02201v1 | null |
2023-10-03 | CNN photometric redshifts in the SDSS at | M. Treyer et.al. | 2310.02173v1 | null |
2023-09-29 | A Large Language Model Approach to Educational Survey Feedback Analysis | Michael J. Parker et.al. | 2309.17447v1 | null |
2023-10-02 | LLM-grounded Video Diffusion Models | Long Lian et.al. | 2309.17444v2 | null |
2023-09-29 | Classification of Potholes Based on Surface Area Using Pre-Trained Models of Convolutional Neural Network | Chauhdary Fazeel Ahmad et.al. | 2309.17426v1 | null |
2023-09-29 | CNN-based automatic segmentation of Lumen & Media boundaries in IVUS images using closed polygonal chains | Pavel Sinha et.al. | 2309.17406v1 | null |
2023-09-29 | AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition | Andrew Rouditchenko et.al. | 2309.17395v1 | null |
2023-09-29 | Tree Cross Attention | Leo Feng et.al. | 2309.17388v1 | null |
2023-09-29 | Adversarial Imitation Learning from Visual Observations using Latent Information | Vittorio Giammarino et.al. | 2309.17371v1 | link |
2023-09-29 | SpinView: General interactive visual analysis tool for multiscale computational magnetism | Qichen Xu et.al. | 2309.17367v1 | null |
2023-09-29 | Asynchronous Graph Generators | Christopher P. Ley et.al. | 2309.17335v1 | null |
2023-09-29 | Multi-Depth Branches Network for Efficient Image Super-Resolution | Huiyuan Tian et.al. | 2309.17334v1 | link |
2023-09-29 | Demystifying CLIP Data | Hu Xu et.al. | 2309.16671v2 | link |
2023-09-28 | Decaf: Monocular Deformation Capture for Face and Hand Interactions | Soshi Shimada et.al. | 2309.16670v1 | null |
2023-09-28 | Training a Large Video Model on a Single Machine in a Day | Yue Zhao et.al. | 2309.16669v1 | link |
2023-09-28 | Novel Deep Learning Pipeline for Automatic Weapon Detection | Haribharathi Sivakumar et.al. | 2309.16654v1 | null |
2023-09-28 | ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning | Qiao Gu et.al. | 2309.16650v1 | null |
2023-09-29 | Mixup Your Own Pairs | Yilei Wu et.al. | 2309.16633v2 | link |
2023-09-28 | Class Activation Map-based Weakly supervised Hemorrhage Segmentation using Resnet-LSTM in Non-Contrast Computed Tomography images | Shreyas H Ramananda et.al. | 2309.16627v1 | null |
2023-09-28 | The twisting index in semitoric systems | Jaume Alonso et.al. | 2309.16614v1 | null |
2023-09-28 | Exploiting Edge Features in Graphs with Fused Network Gromov-Wasserstein Distance | Junjie Yang et.al. | 2309.16604v1 | null |
2023-09-28 | Can LLMs Effectively Leverage Structural Information for Graph Learning: When and Why | Jin Huang et.al. | 2309.16595v1 | null |
2023-09-27 | SHACIRA: Scalable HAsh-grid Compression for Implicit Neural Representations | Sharath Girish et.al. | 2309.15848v1 | null |
2023-09-27 | Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing | Brian Yan et.al. | 2309.15826v1 | null |
2023-09-27 | Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation | David Junhao Zhang et.al. | 2309.15818v1 | link |
2023-09-27 | Convolutional Networks with Oriented 1D Kernels | Alexandre Kirchmeyer et.al. | 2309.15812v1 | link |
2023-09-27 | A Quantum-Classical Hybrid Block-Matching Algorithm in Noisy Environment using Dissimilarity Measure | M. Martínez-Felipe et.al. | 2309.15792v1 | null |
2023-09-27 | Large Language Model Routing with Benchmark Datasets | Tal Shnitzer et.al. | 2309.15789v1 | null |
2023-09-27 | One For All: Video Conversation is Feasible Without Video Instruction Tuning | Ruyang Liu et.al. | 2309.15785v1 | null |
2023-09-27 | Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback | Teresa Yeo et.al. | 2309.15762v1 | null |
2023-09-27 | Automated CT Lung Cancer Screening Workflow using 3D Camera | Brian Teixeira et.al. | 2309.15750v1 | null |
2023-09-27 | Data-Driven Latent Space Representation for Robust Bipedal Locomotion Learning | Guillermo A. Castillo et.al. | 2309.15740v1 | null |
2023-09-26 | Classification of symmetry-enriched topological quantum spin liquids | Weicheng Ye et.al. | 2309.15118v1 | null |
2023-09-26 | Doduo: Learning Dense Visual Correspondence from Unsupervised Semantic-Aware Flow | Zhenyu Jiang et.al. | 2309.15110v1 | null |
2023-09-27 | LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models | Yaohui Wang et.al. | 2309.15103v2 | null |
2023-09-26 | VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning | Han Lin et.al. | 2309.15091v1 | null |
2023-09-26 | Video-adverb retrieval with compositional adverb-action embeddings | Thomas Hummel et.al. | 2309.15086v1 | null |
2023-09-26 | Challenges of building medical image datasets for development of deep learning software in stroke | Alessandro Fontanella et.al. | 2309.15081v1 | null |
2023-09-26 | On Excess Risk Convergence Rates of Neural Network Classifiers | Hyunouk Ko et.al. | 2309.15075v1 | null |
2023-09-26 | Language-EXtended Indoor SLAM (LEXIS): A Versatile System for Real-time Visual Scene Understanding | Christina Kassab et.al. | 2309.15065v1 | null |
2023-09-26 | QUILT: Effective Multi-Class Classification on Quantum Computers Using an Ensemble of Diverse Quantum Classifiers | Daniel Silver et.al. | 2309.15056v1 | null |
2023-09-26 | Thalamic nuclei segmentation from T$_1$-weighted MRI: unifying and benchmarking state-of-the-art methods with young and old cohorts | Brendan Williams et.al. | 2309.15053v1 | null |
2023-09-25 | Extreme Parkour with Legged Robots | Xuxin Cheng et.al. | 2309.14341v1 | null |
2023-09-25 | Chop & Learn: Recognizing and Generating Object-State Compositions | Nirat Saini et.al. | 2309.14339v1 | null |
2023-09-25 | Human-Assisted Continual Robot Learning with Foundation Models | Meenal Parakh et.al. | 2309.14321v1 | null |
2023-09-25 | MUTEX: Learning Unified Policies from Multimodal Task Specifications | Rutav Shah et.al. | 2309.14320v1 | null |
2023-09-25 | DeepMesh: Mesh-based Cardiac Motion Tracking using Deep Learning | Qingjie Meng et.al. | 2309.14306v1 | null |
2023-09-25 | NAS-NeRF: Generative Neural Architecture Search for Neural Radiance Fields | Saeejith Nair et.al. | 2309.14293v1 | null |
2023-09-25 | CLIP-DIY: CLIP Dense Inference Yields Open-Vocabulary Semantic Segmentation For-Free | Monika Wysoczańska et.al. | 2309.14289v1 | null |
2023-09-25 | Comparison of One- Two- and Three- Dimensional CNN models for Drawing-Test-Based Diagnostics of the Parkinson's Disease | Xuechao Wang et.al. | 2309.14288v1 | null |
2023-09-26 | Virtual Hyperspectral Images Using Symmetric Autoencoders | Archisman Bhattacharjee et.al. | 2309.14286v2 | null |
2023-09-25 | OmniEvent: A Comprehensive, Fair, and Easy-to-Use Toolkit for Event Understanding | Hao Peng et.al. | 2309.14258v1 | link |
2023-09-22 | Robotic Offline RL from Internet Videos via Value-Function Pre-Training | Chethan Bhateja et.al. | 2309.13041v1 | null |
2023-09-22 | Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception? | Xiaoxiao Sun et.al. | 2309.13038v1 | null |
2023-09-22 | Encoding optimization for quantum machine learning demonstrated on a superconducting transmon qutrit | Shuxiang Cao et.al. | 2309.13036v1 | null |
2023-09-22 | Performance Analysis of UNet and Variants for Medical Image Segmentation | Walid Ehab et.al. | 2309.13013v1 | null |
2023-09-22 | Pursuing Counterfactual Fairness via Sequential Autoencoder Across Domains | Yujie Lin et.al. | 2309.13005v1 | null |
2023-09-22 | Braid groups, elliptic curves, and resolving the quartic | Peter Huxford et.al. | 2309.12999v1 | null |
2023-09-22 | License Plate Recognition Based On Multi-Angle View Model | Dat Tran-Anh et.al. | 2309.12972v1 | null |
2023-09-22 | PI-RADS v2 Compliant Automated Segmentation of Prostate Zones Using co-training Motivated Multi-task Dual-Path CNN | Arnab Das et.al. | 2309.12970v1 | null |
2023-09-22 | Detect Every Thing with Few Examples | Xinyu Zhang et.al. | 2309.12969v1 | link |
2023-09-22 | Massive End-to-end Models for Short Search Queries | Weiran Wang et.al. | 2309.12963v1 | null |
2023-09-21 | ForceSight: Text-Guided Mobile Manipulation with Visual-Force Goals | Jeremy A. Collins et.al. | 2309.12312v1 | null |
2023-09-21 | LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent | Jianing Yang et.al. | 2309.12311v1 | null |
2023-09-21 | TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning | Chaeyoung Jung et.al. | 2309.12306v1 | null |
2023-09-22 | PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation | Shilin Yan et.al. | 2309.12303v2 | link |
2023-09-21 | See to Touch: Learning Tactile Dexterity through Visual Incentives | Irmak Guzey et.al. | 2309.12300v1 | null |
2023-09-21 | The Broad Impact of Feature Imitation: Neural Enhancements Across Financial, Speech, and Physiological Domains | Reza Khanmohammadi et.al. | 2309.12279v1 | null |
2023-09-21 | Enabling Quartile-based Estimated-Mean Gradient Aggregation As Baseline for Federated Image Classifications | Yusen Wu et.al. | 2309.12267v1 | null |
2023-09-21 | Parallelizing non-linear sequential models over the sequence length | Yi Heng Lim et.al. | 2309.12252v1 | null |
2023-09-21 | Adaptive Input-image Normalization for Solving Mode Collapse Problem in GAN-based X-ray Images | Muhammad Muneeb Saad et.al. | 2309.12245v1 | null |
2023-09-21 | Model-based Clustering using Non-parametric Hidden Markov Models | Elisabeth Gassiat et.al. | 2309.12238v1 | null |
2023-09-20 | A Large-scale Dataset for Audio-Language Representation Learning | Luoyi Sun et.al. | 2309.11500v1 | null |
2023-09-20 | FreeU: Free Lunch in Diffusion U-Net | Chenyang Si et.al. | 2309.11497v1 | null |
2023-09-21 | Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning | Tianbao Xie et.al. | 2309.11489v2 | null |
2023-09-20 | First detection of CO$_2$ emission in a Centaur: JWST NIRSpec observations of 39P/Oterma | O. Harrington Pinto et.al. | 2309.11486v1 | null |
2023-09-20 | Multi-Label Takagi-Sugeno-Kang Fuzzy System | Qiongdan Lou et.al. | 2309.11469v1 | null |
2023-09-20 | Budget-Aware Pruning: Handling Multiple Domains with Less Parameters | Samuel Felipe dos Santos et.al. | 2309.11464v1 | null |
2023-09-20 | AudioFool: Fast, Universal and synchronization-free Cross-Domain Attack on Speech Recognition | Mohamad Fakih et.al. | 2309.11462v1 | null |
2023-09-20 | SkeleTR: Towards Skeleton-based Action Recognition in the Wild | Haodong Duan et.al. | 2309.11445v1 | null |
2023-09-20 | A Systematic Review of Few-Shot Learning in Medical Imaging | Eva Pachetti et.al. | 2309.11433v1 | null |
2023-09-21 | Video Screens for Hearing Research: Transmittance and Reflectance of Professional and Other Fabrics | Jan Heeren et.al. | 2309.11430v2 | null |
2023-09-19 | Assessing the capacity of a denoising diffusion probabilistic model to reproduce spatial context | Rucha Deshpande et.al. | 2309.10817v1 | null |
2023-09-19 | Multisource Holography | Grace Kuo et.al. | 2309.10816v1 | null |
2023-09-19 | Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning | Tianhua Zhang et.al. | 2309.10814v1 | link |
2023-09-19 | Semantic Text Compression for Classification | Emrecan Kutay et.al. | 2309.10809v1 | null |
2023-09-19 | Multi-Context Dual Hyper-Prior Neural Image Compression | Atefeh Khoshkhahtinat et.al. | 2309.10799v1 | null |
2023-09-19 | Multi-spectral Entropy Constrained Neural Compression of Solar Imagery | Ali Zafari et.al. | 2309.10791v1 | null |
2023-09-19 | Guide Your Agent with Adaptive Multimodal Rewards | Changyeon Kim et.al. | 2309.10790v1 | link |
2023-09-19 | Physics-Informed Machine Learning for Data Anomaly Detection, Classification, Localization, and Mitigation: A Review, Challenges, and Path Forward | Mehdi Jabbari Zideh et.al. | 2309.10788v1 | null |
2023-09-19 | AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models | Yuan Tseng et.al. | 2309.10787v1 | link |
2023-09-19 | Context-Aware Neural Video Compression on Solar Dynamics Observatory | Atefeh Khoshkhahtinat et.al. | 2309.10784v1 | null |
2023-09-19 | Des-q: a quantum algorithm to construct and efficiently retrain decision trees for regression and binary classification | Niraj Kumar et.al. | 2309.09976v2 | null |
2023-09-18 | Empirical Study of Mix-based Data Augmentation Methods in Physiological Time Series Data | Peikun Guo et.al. | 2309.09970v1 | null |
2023-09-18 | vSHARP: variable Splitting Half-quadratic ADMM algorithm for Reconstruction of inverse-Problems | George Yiasemis et.al. | 2309.09954v1 | null |
2023-09-18 | TransientViT: A novel CNN - Vision Transformer hybrid real/bogus transient classifier for the Kilodegree Automatic Transient Survey | Zhuoyang Chen et.al. | 2309.09937v1 | null |
2023-09-18 | Algebra of Self-Replication | Lawrence S. Moss et.al. | 2309.09931v1 | null |
2023-09-18 | Evaluating Adversarial Robustness with Expected Viable Performance | Ryan McCoppin et.al. | 2309.09928v1 | null |
2023-09-18 | Impact of Augmented reality system on elementary school ESL learners in country side of china: Motivations, achievements, behaviors and cognitive attainment | Ijaz Ul Haq et.al. | 2309.09894v1 | null |
2023-09-18 | Not Enough Labeled Data? Just Add Semantics: A Data-Efficient Method for Inferring Online Health Texts | Joseph Gatto et.al. | 2309.09877v1 | null |
2023-09-18 | Domain Generalization with Fourier Transform and Soft Thresholding | Hongyi Pan et.al. | 2309.09866v1 | null |
2023-09-18 | Unsupervised Open-Vocabulary Object Localization in Videos | Ke Fan et.al. | 2309.09858v1 | null |
2023-09-18 | Closing the Loop on Runtime Monitors with Fallback-Safe MPC | Rohan Sinha et.al. | 2309.08603v2 | null |
2023-09-15 | Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes | Fabien Delattre et.al. | 2309.08588v1 | null |
2023-09-15 | Compositional Foundation Models for Hierarchical Planning | Anurag Ajay et.al. | 2309.08587v1 | null |
2023-09-15 | HINT: Healthy Influential-Noise based Training to Defend against Data Poisoning Attacks | Minh-Hao Van et.al. | 2309.08549v1 | null |
2023-09-15 | Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens | Minsu Kim et.al. | 2309.08531v1 | null |
2023-09-15 | Generalised Probabilistic Diffusion Scale-Spaces | Pascal Peter et.al. | 2309.08511v1 | null |
2023-09-15 | Deep-learning-powered data analysis in plankton ecology | Harshith Bachimanchi et.al. | 2309.08500v1 | link |
2023-09-15 | P-ROCKET: Pruning Random Convolution Kernels for Time Series Classification | Shaowu Chen et.al. | 2309.08499v1 | link |
2023-09-15 | YCB-Ev: Event-vision dataset for 6DoF object pose estimation | Pavel Rojtberg et.al. | 2309.08482v1 | link |
2023-09-15 | Current and future directions in network biology | Marinka Zitnik et.al. | 2309.08478v1 | null |
2023-09-14 | Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning | Zhiwu Qing et.al. | 2309.07911v1 | link |
2023-09-14 | Generative Image Dynamics | Zhengqi Li et.al. | 2309.07906v1 | null |
2023-09-14 | Ambiguity-Aware In-Context Learning with Large Language Models | Lingyu Gao et.al. | 2309.07900v1 | null |
2023-09-14 | SMARTFEAT: Efficient Feature Construction through Feature-Level Foundation Model Interactions | Yin Lin et.al. | 2309.07856v1 | null |
2023-09-14 | Two Timin': Repairing Smart Contracts With A Two-Layered Approach | Abhinav Jain et.al. | 2309.07841v1 | null |
2023-09-14 | Text Classification of Cancer Clinical Trial Eligibility Criteria | Yumeng Yang et.al. | 2309.07812v1 | null |
2023-09-14 | What Matters to Enhance Traffic Rule Compliance of Imitation Learning for Automated Driving | Hongkuan Zhou et.al. | 2309.07808v1 | null |
2023-09-14 | Improving Multimodal Classification of Social Media Posts by Leveraging Image-Text Auxiliary tasks | Danae Sánchez Villegas et.al. | 2309.07794v1 | null |
2023-09-14 | A Multi-In and Multi-Out Dendritic Neuron Model and its Optimization | Yu Ding et.al. | 2309.07791v1 | null |
2023-09-15 | Virchow: A Million-Slide Digital Pathology Foundation Model | Eugene Vorontsov et.al. | 2309.07778v2 | null |
2023-09-13 | Contrastive Deep Encoding Enables Uncertainty-aware Machine-learning-assisted Histopathology | Nirhoshan Sivaroopan et.al. | 2309.07113v1 | null |
2023-09-13 | Data Augmentation via Subgroup Mixup for Improving Fairness | Madeline Navarro et.al. | 2309.07110v1 | null |
2023-09-13 | The end sum of surfaces | Liam K. Axon et.al. | 2309.07101v1 | null |
2023-09-13 | Revisiting the classics: On the evolutionary origin of the "Fe II" and "He/N" spectral classes of novae | E. Aydi et.al. | 2309.07097v1 | null |
2023-09-13 | RadarLCD: Learnable Radar-based Loop Closure Detection Pipeline | Mirko Usuelli et.al. | 2309.07094v1 | null |
2023-09-13 | Mitigating Group Bias in Federated Learning for Heterogeneous Devices | Khotso Selialia et.al. | 2309.07085v1 | null |
2023-09-13 | The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning | Alexander Bastounis et.al. | 2309.07072v1 | null |
2023-09-13 | Aggregating Long-term Sharp Features via Hybrid Transformers for Video Deblurring | Dongwei Ren et.al. | 2309.07054v1 | link |
2023-09-13 | Thurston's theorem and the Nielsen-Thurston classification via Teichmüller's theorem | James Belk et.al. | 2309.06993v1 | null |
2023-09-13 | Neural network-based coronary dominance classification of RCA angiograms | Ivan Kruzhilov et.al. | 2309.06958v1 | null |
2023-09-12 | Learning Disentangled Avatars with Hybrid 3D Representations | Yao Feng et.al. | 2309.06441v1 | null |
2023-09-12 | LEAP Hand: Low-Cost, Efficient, and Anthropomorphic Hand for Robot Learning | Kenneth Shaw et.al. | 2309.06440v1 | null |
2023-09-12 | AGMDT: Virtual Staining of Renal Histology Images with Adjacency-Guided Multi-Domain Transfer | Tao Ma et.al. | 2309.06421v1 | null |
2023-09-12 | Style2Fab: Functionality-Aware Segmentation for Fabricating Personalized 3D Models with Generative AI | Faraz Faruqi et.al. | 2309.06379v1 | null |
2023-09-12 | Padding-free Convolution based on Preservation of Differential Characteristics of Kernels | Kuangdai Leng et.al. | 2309.06370v1 | null |
2023-09-12 | Using Reed-Muller Codes for Classification with Rejection and Recovery | Daniel Fentham et.al. | 2309.06359v1 | link |
2023-09-12 | Eccentric graph of trees and their Cartesian products | Anita Arora et.al. | 2309.06338v1 | null |
2023-09-12 | Exploring Flat Minima for Domain Generalization with Large Learning Rates | Jian Zhang et.al. | 2309.06337v1 | null |
2023-09-12 | Grounded Language Acquisition From Object and Action Imagery | James Robert Kubricht et.al. | 2309.06335v1 | null |
2023-09-12 | Visualising Game Engine Subsystem Coupling | Gabriel C. Ullmann et.al. | 2309.06329v1 | null |
2023-09-11 | Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips | Yufei Ye et.al. | 2309.05663v1 | null |
2023-09-11 | From Capture to Display: A Survey on Volumetric Video | Yili Jin et.al. | 2309.05658v1 | null |
2023-09-11 | Potentials of Deterministic Radio Propagation Simulation for AI-Enabled Localization and Sensing | Albrecht Michler et.al. | 2309.05650v1 | null |
2023-09-11 | A Novel Supervised Deep Learning Solution to Detect Distributed Denial of Service (DDoS) attacks on Edge Systems using Convolutional Neural Networks (CNN) | Vedanth Ramanathan et.al. | 2309.05646v1 | null |
2023-09-11 | Boundary Peeling: Outlier Detection Method Using One-Class Peeling | Sheikh Arafat et.al. | 2309.05630v1 | null |
2023-09-11 | Temporal Action Localization with Enhanced Instant Discriminability | Dingfeng Shi et.al. | 2309.05590v1 | link |
2023-09-11 | Anisotropic Diffusion Stencils: From Simple Derivations over Stability Estimates to ResNet Implementations | Karl Schrader et.al. | 2309.05575v1 | null |
2023-09-11 | On the Meromorphic Integrability of the Critical Systems for Optimal Sums of Eigenvalues | Yuzhou Tian et.al. | 2309.05568v1 | null |
2023-09-11 | OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data | Giuseppe Cartella et.al. | 2309.05551v1 | link |
2023-09-11 | Distance-Aware eXplanation Based Learning | Misgina Tsighe Hagos et.al. | 2309.05548v1 | link |
2023-09-08 | Generalized Cross-domain Multi-label Few-shot Learning for Chest X-rays | Aroof Aimen et.al. | 2309.04462v1 | null |
2023-09-08 | Generalized Variable Selection Algorithms for Gaussian Process Models by LASSO-like Penalty | Zhiyong Hu et.al. | 2309.04455v1 | null |
2023-09-08 | Vis-SPLIT: Interactive Hierarchical Modeling for mRNA Expression Classification | Braden Roper et.al. | 2309.04423v1 | null |
2023-09-08 | Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving | Thomas E. Huang et.al. | 2309.04422v1 | null |
2023-09-08 | Seeing-Eye Quadruped Navigation with Force Responsive Locomotion Control | David DeFazio et.al. | 2309.04370v1 | null |
2023-09-08 | Active Learning for Classifying 2D Grid-Based Level Completability | Mahsa Bazzaz et.al. | 2309.04367v1 | link |
2023-09-08 | Sparse Codesigned Communication and Radar Systems | Hyeon Seok Rou et.al. | 2309.04362v1 | null |
2023-09-08 | Learning from Power Signals: An Automated Approach to Electrical Disturbance Identification Within a Power Transmission System | Jonathan D. Boyd et.al. | 2309.04361v1 | null |
2023-09-08 | Zero-Shot Robustification of Zero-Shot Models With Foundation Models | Dyah Adila et.al. | 2309.04344v1 | null |
2023-09-08 | Encoding Multi-Domain Scientific Papers by Ensembling Multiple CLS Tokens | Ronald Seoh et.al. | 2309.04333v1 | link |
2023-09-07 | A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal Multi-Organ Segmentation | Ziyan Huang et.al. | 2309.03906v1 | link |
2023-09-07 | ImageBind-LLM: Multi-modality Instruction Tuning | Jiaming Han et.al. | 2309.03905v1 | link |
2023-09-07 | Tracking Anything with Decoupled Video Segmentation | Ho Kei Cheng et.al. | 2309.03903v1 | link |
2023-09-07 | Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction | Su-Kai Chen et.al. | 2309.03900v1 | null |
2023-09-07 | The Making and Breaking of Camouflage | Hala Lamdouar et.al. | 2309.03899v1 | null |
2023-09-07 | ProPainter: Improving Propagation and Transformer for Video Inpainting | Shangchen Zhou et.al. | 2309.03897v1 | null |
2023-09-07 | Zero-Shot Audio Captioning via Audibility Guidance | Tal Shaharabany et.al. | 2309.03884v1 | null |
2023-09-07 | Text-to-feature diffusion for audio-visual few-shot learning | Otniel-Bogdan Mercea et.al. | 2309.03869v1 | null |
2023-09-07 | Classification of Killing Magnetic Curves In H^3 | Özgür Kelekçi et.al. | 2309.03859v1 | null |
2023-09-07 | CenTime: Event-Conditional Modelling of Censoring in Survival Analysis | Ahmed H. Shahin et.al. | 2309.03851v1 | link |
2023-09-07 | Terahertz-Band Direction Finding With Beam-Split and Mutual Coupling Calibration | Ahmet M. Elbir et.al. | 2309.03195v2 | null |
2023-09-06 | Signatures of Bayesian inference emerge from energy efficient synapses | James Malkin et.al. | 2309.03194v1 | null |
2023-09-06 | 3D Transformer based on deformable patch location for differential diagnosis between Alzheimer's disease and Frontotemporal dementia | Huy-Dung Nguyen et.al. | 2309.03183v1 | null |
2023-09-06 | PDiscoNet: Semantically consistent part discovery for fine-grained recognition | Robert van der Klis et.al. | 2309.03173v1 | null |
2023-09-06 | ResFields: Residual Neural Fields for Spatiotemporal Signals | Marko Mihajlovic et.al. | 2309.03160v1 | null |
2023-09-06 | Normal mode decomposition of atomic motion in solids | Jaeyun Moon et.al. | 2309.03140v1 | null |
2023-09-06 | Serving Time: Real-Time, Safe Motion Planning and Control for Manipulation of Unsecured Objects | Zachary Brei et.al. | 2309.03111v1 | null |
2023-09-06 | The Secrets of Non-Blind Poisson Deconvolution | Abhiram Gnanasambandam et.al. | 2309.03105v1 | null |
2023-09-06 | On the | Marcos Escartín Ferrer et.al. | 2309.03091v1 | null |
2023-09-06 | Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection | Yu Chen et.al. | 2309.03057v1 | null |
2023-09-05 | ReliTalk: Relightable Talking Portrait Generation from a Single Video | Haonan Qiu et.al. | 2309.02434v1 | link |
2023-09-05 | A Likelihood Approach to Incorporating Self-Report Data in HIV Recency Classification | Wenlong Yang et.al. | 2309.02430v1 | null |
2023-09-05 | Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach | Vimal K B et.al. | 2309.02429v1 | null |
2023-09-05 | EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding | Yue Xu et.al. | 2309.02423v1 | null |
2023-09-05 | Doppelgangers: Learning to Disambiguate Images of Similar Structures | Ruojin Cai et.al. | 2309.02420v1 | link |
2023-09-05 | Classification of La3+ and Gd3+ rare earth ions using surface-enhanced Raman scattering | Hao Jin et.al. | 2309.02409v1 | null |
2023-09-05 | Semantic Communications Based on Adaptive Generative Models and Information Bottleneck | S. Barbarossa et.al. | 2309.02387v1 | null |
2023-09-05 | On the classification of primitive ideals for complex classical Lie algebras, IV | William McGovern et.al. | 2309.02363v1 | null |
2023-09-05 | Generating Infinite-Resolution Texture using GANs with Patch-by-Patch Paradigm | Alhasan Abdellatif et.al. | 2309.02340v1 | null |
2023-09-05 | DEEPBEAS3D: Deep Learning and B-Spline Explicit Active Surfaces | Helena Williams et.al. | 2309.02335v1 | null |
2023-09-01 | Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following | Ziyu Guo et.al. | 2309.00615v1 | link |
2023-09-01 | Amyloid-Beta Axial Plane PET Synthesis from Structural MRI: An Image Translation Approach for Screening Alzheimer's Disease | Fernando Vega et.al. | 2309.00569v1 | null |
2023-09-01 | Powder-Bot: A Modular Autonomous Multi-Robot Workflow for Powder X-Ray Diffraction | Amy M. Lunt et.al. | 2309.00544v1 | null |
2023-09-01 | A Machine Vision Method for Correction of Eccentric Error: Based on Adaptive Enhancement Algorithm | Fanyi Wang et.al. | 2309.00514v1 | null |
2023-09-01 | Multi-stage Deep Learning Artifact Reduction for Computed Tomography | Jiayang Shi et.al. | 2309.00494v1 | null |
2023-09-01 | Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction | Peizhen Bai et.al. | 2309.00483v1 | null |
2023-09-01 | Deep Joint Source-Channel Coding for Adaptive Image Transmission over MIMO Channels | Haotian Wu et.al. | 2309.00470v1 | null |
2023-09-01 | New metrics for analyzing continual learners | Nicolas Michel et.al. | 2309.00462v1 | null |
2023-09-01 | The miniJPAS survey quasar selection IV: Classification and redshift estimation with SQUEzE | Ignasi Pérez-Ràfols et.al. | 2309.00461v1 | null |
2023-09-01 | CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding | Étienne Labbé et.al. | 2309.00454v1 | link |
2023-08-31 | PointLLM: Empowering Large Language Models to Understand Point Clouds | Runsen Xu et.al. | 2308.16911v1 | link |
2023-08-31 | StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation | Yuhan Wang et.al. | 2308.16909v1 | link |
2023-08-31 | Learning to Taste: A Multimodal Wine Dataset | Thoranna Bender et.al. | 2308.16900v1 | null |
2023-08-31 | EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild | Manuel Kaufmann et.al. | 2308.16894v1 | link |
2023-08-31 | On the Role of Non-Localities in Fundamental Diagram Estimation | Jing Liu et.al. | 2308.16878v1 | null |
2023-08-31 | SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation | Jiaben Chen et.al. | 2308.16876v1 | null |
2023-08-31 | Understanding defects in amorphous silicon with million-atom simulations and machine learning | Joe D. Morrow et.al. | 2308.16868v1 | null |
2023-08-31 | Self-pruning Graph Neural Network for Predicting Inflammatory Disease Activity in Multiple Sclerosis from Brain MR Images | Chinmay Prabhakar et.al. | 2308.16863v1 | link |
2023-08-31 | Facing Unknown: Open-World Encrypted Traffic Classification Based on Contrastive Pre-Training | Xiang Li et.al. | 2308.16861v1 | null |
2023-08-31 | Majorization-Minimization for sparse SVMs | Alessandro Benfenati et.al. | 2308.16858v1 | null |
2023-08-30 | Fully Non-Linear Neuromorphic Computing with Linear Wave Scattering | Clara C. Wanjura et.al. | 2308.16181v1 | null |
2023-08-30 | General Purpose Audio Effect Removal | Matthew Rice et.al. | 2308.16177v1 | null |
2023-08-30 | Algebraic, Topological, and Mereological Foundations of Existential Granules | Mani A et.al. | 2308.16157v1 | null |
2023-08-31 | MMVP: Motion-Matrix-based Video Prediction | Yiqi Zhong et.al. | 2308.16154v2 | link |
2023-08-30 | Modality Cycles with Masked Conditional Diffusion for Unsupervised Anomaly Segmentation in MRI | Ziyun Liang et.al. | 2308.16150v1 | null |
2023-08-30 | Spatial Graph Coarsening: Weather and Weekday Prediction with London's Bike-Sharing Service using GNN | Yuta Sato et.al. | 2308.16122v1 | null |
2023-08-30 | Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion | Man Zhou et.al. | 2308.16083v1 | null |
2023-08-30 | A Classification of Observation-Driven State-Space Count Models for Panel Data | Jae Youn Ahn et.al. | 2308.16058v1 | null |
2023-08-30 | Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs | Jiani Liu et.al. | 2308.16056v1 | null |
2023-08-30 | Telepresence Lantern -- Designing an Immersive Video-Mediated Communication Device for Older Adults | Thomas H. Weisswange et.al. | 2308.16052v1 | null |
2023-08-29 | An Adaptive Tangent Feature Perspective of Neural Networks | Daniel LeJeune et.al. | 2308.15478v1 | null |
2023-08-29 | A General-Purpose Self-Supervised Model for Computational Pathology | Richard J. Chen et.al. | 2308.15474v1 | null |
2023-08-29 | Learning Modulated Transformation in GANs | Ceyuan Yang et.al. | 2308.15472v1 | null |
2023-08-30 | Policy composition in reinforcement learning via multi-objective policy optimization | Shruti Mishra et.al. | 2308.15470v2 | null |
2023-08-29 | Input margins can predict generalization too | Coenraad Mouton et.al. | 2308.15466v1 | null |
2023-08-29 | A Comparative Study of Loss Functions: Traffic Predictions in Regular and Congestion Scenarios | Yangxinyu Xie et.al. | 2308.15464v1 | link |
2023-08-29 | Online Overexposed Pixels Hallucination in Videos with Adaptive Reference Frame Selection | Yazhou Xing et.al. | 2308.15462v1 | null |
2023-08-29 | From SMOTE to Mixup for Deep Imbalanced Classification | Wei-Chao Cheng et.al. | 2308.15457v1 | link |
2023-08-29 | Pseudo-Boolean Polynomials Approach To Edge Detection And Image Segmentation | Tendai Mapungwana Chikake et.al. | 2308.15453v1 | null |
2023-08-29 | WrappingNet: Mesh Autoencoder via Deep Sphere Deformation | Eric Lei et.al. | 2308.15413v1 | null |
2023-08-28 | MagicEdit: High-Fidelity and Temporally Coherent Video Editing | Jun Hao Liew et.al. | 2308.14749v1 | null |
2023-08-28 | MagicAvatar: Multimodal Avatar Generation and Animation | Jianfeng Zhang et.al. | 2308.14748v1 | null |
2023-08-28 | CoVR: Learning Composed Video Retrieval from Web Video Captions | Lucas Ventura et.al. | 2308.14746v1 | link |
2023-08-28 | Total Selfie: Generating Full-Body Selfies | Bowei Chen et.al. | 2308.14740v1 | null |
2023-08-28 | PanoSwin: a Pano-style Swin Transformer for Panorama Understanding | Zhixin Ling et.al. | 2308.14726v1 | null |
2023-08-28 | VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Xudong Wang et.al. | 2308.14710v1 | link |
2023-08-28 | Fine-Tuning Llama 2 Large Language Models for Detecting Online Sexual Predatory Chats and Abusive Texts | Thanh Thi Nguyen et.al. | 2308.14683v1 | null |
2023-08-28 | Video-Based Hand Pose Estimation for Remote Assessment of Bradykinesia in Parkinson's Disease | Gabriela T. Acevedo Trebbau et.al. | 2308.14679v1 | null |
2023-08-28 | Noncommutative tensor triangular geometry: classification via noetherian spectra | James Rowe et.al. | 2308.14661v1 | null |
2023-08-28 | Towards Standardized Disturbance Rejection Testing of Legged Robot Locomotion with Linear Impactor: A Preliminary Study, Observations, and Implications | Bowen Weng et.al. | 2308.14636v1 | null |
2023-08-25 | Unveiling the Role of Message Passing in Dual-Privacy Preservation on GNNs | Tianyi Zhao et.al. | 2308.13513v1 | null |
2023-08-25 | Joint Modeling of Feature, Correspondence, and a Compressed Memory for Video Object Segmentation | Jiaming Zhang et.al. | 2308.13505v1 | null |
2023-08-25 | Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning | Pranav Balaji et.al. | 2308.13503v1 | null |
2023-08-25 | Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers | Matthew Dutson et.al. | 2308.13494v1 | link |
2023-08-25 | Temporal Uncertainty Localization to Enable Human-in-the-loop Analysis of Dynamic Contrast-enhanced Cardiac MRI Datasets | Dilek M. Yalcinkaya et.al. | 2308.13488v1 | null |
2023-08-25 | QKSAN: A Quantum Kernel Self-Attention Network | Ren-Xin Zhao et.al. | 2308.13422v1 | null |
2023-08-25 | An investigation into the impact of deep learning model choice on sex and race bias in cardiac MR segmentation | Tiarna Lee et.al. | 2308.13415v1 | null |
2023-08-25 | Self-Supervised Representation Learning with Cross-Context Learning between Global and Hypercolumn Features | Zheng Gao et.al. | 2308.13392v1 | null |
2023-08-25 | Direction-aware Video Demoireing with Temporal-guided Bilateral Learning | Shuning Xu et.al. | 2308.13388v1 | null |
2023-08-25 | On flags of holomorphic foliations associated with singular second-order ordinary differential equations | Fernando Lourenço et.al. | 2308.13370v1 | null |
2023-08-24 | POCO: 3D Pose and Shape Estimation with Confidence | Sai Kumar Dwivedi et.al. | 2308.12965v1 | null |
2023-08-24 | Motion-Guided Masking for Spatiotemporal Representation Learning | David Fan et.al. | 2308.12962v1 | null |
2023-08-24 | Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment | Sheng Zhang et.al. | 2308.12960v1 | link |
2023-08-24 | Beyond Document Page Classification: Design, Datasets, and Challenges | Jordy Van Landeghem et.al. | 2308.12896v1 | null |
2023-08-24 | Large Language Models Vote: Prompting for Rare Disease Identification | David Oniani et.al. | 2308.12890v1 | link |
2023-08-24 | Multi-stage feature decorrelation constraints for improving CNN classification performance | Qiuyu Zhu et.al. | 2308.12880v1 | null |
2023-08-24 | ToonTalker: Cross-Domain Face Reenactment | Yuan Gong et.al. | 2308.12866v1 | null |
2023-08-24 | Learned Local Attention Maps for Synthesising Vessel Segmentations | Yash Deo et.al. | 2308.12861v1 | null |
2023-08-24 | Algebraicity of hypergeometric functions with arbitrary parameters | Florian Fürnsinn et.al. | 2308.12855v1 | null |
2023-08-24 | | Eric Bergshoeff et.al. | 2308.12852v1 | null |
2023-08-23 | Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models | Nancy Tyagi et.al. | 2308.12272v1 | null |
2023-08-23 | Bugsplainer: Leveraging Code Structures to Explain Software Bugs with Neural Machine Translation | Parvez Mahbub et.al. | 2308.12267v1 | null |
2023-08-23 | SPPNet: A Single-Point Prompt Network for Nuclei Image Segmentation | Qing Xu et.al. | 2308.12231v1 | link |
2023-08-23 | Towards Real-Time Analysis of Broadcast Badminton Videos | Nitin Nilesh et.al. | 2308.12199v1 | null |
2023-08-23 | Sign Language Translation with Iterative Prototype | Huijie Yao et.al. | 2308.12191v1 | null |
2023-08-23 | Tumor-Centered Patching for Enhanced Medical Image Segmentation | Mutyyba Asghar et.al. | 2308.12168v1 | null |
2023-08-23 | Constant mean curvature hypersurfaces in Anti-de Sitter space | Enrico Trebeschi et.al. | 2308.12167v1 | null |
2023-08-23 | NPF-200: A Multi-Modal Eye Fixation Dataset and Method for Non-Photorealistic Videos | Ziyu Yang et.al. | 2308.12163v1 | null |
2023-08-23 | A Probabilistic Fluctuation based Membership Inference Attack for Generative Models | Wenjie Fu et.al. | 2308.12143v1 | null |
2023-08-23 | Masking Strategies for Background Bias Removal in Computer Vision Models | Ananthu Aniraj et.al. | 2308.12127v1 | link |
2023-08-22 | StoryBench: A Multifaceted Benchmark for Continuous Story Visualization | Emanuele Bugliarello et.al. | 2308.11606v1 | link |
2023-08-22 | Semantic Multi-Resolution Communications | Matin Mortaheb et.al. | 2308.11604v1 | null |
2023-08-22 | EndoNet: model for automatic calculation of H-score on histological slides | Egor Ushakov et.al. | 2308.11562v1 | null |
2023-08-22 | Open Set Synthetic Image Source Attribution | Shengbang Fang et.al. | 2308.11557v1 | null |
2023-08-22 | Multi-event Video-Text Retrieval | Gengyuan Zhang et.al. | 2308.11551v1 | link |
2023-08-22 | Furnishing Sound Event Detection with Language Model Abilities | Hualei Wang et.al. | 2308.11530v1 | null |
2023-08-22 | LCCo: Lending CLIP to Co-Segmentation | Xin Duan et.al. | 2308.11506v1 | null |
2023-08-23 | Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition | Qitong Wang et.al. | 2308.11489v2 | link |
2023-08-22 | Opening the Vocabulary of Egocentric Actions | Dibyadip Chatterjee et.al. | 2308.11488v1 | null |
2023-08-22 | Free Lunch for Gait Recognition: A Novel Relation Descriptor | Jilong Wang et.al. | 2308.11487v1 | null |
2023-08-21 | Structured World Models from Human Videos | Russell Mendonca et.al. | 2308.10901v1 | null |
2023-08-21 | Unlocking Accuracy and Fairness in Differentially Private Image Classification | Leonard Berrada et.al. | 2308.10888v1 | null |
2023-08-21 | Evaluating quantum generative models via imbalanced data classification benchmarks | Graham R. Enos et.al. | 2308.10847v1 | null |
2023-08-21 | Pixel Adaptive Deep Unfolding Transformer for Hyperspectral Image Reconstruction | Miaoyu Li et.al. | 2308.10820v1 | null |
2023-08-21 | Improving Continuous Sign Language Recognition with Cross-Lingual Signs | Fangyun Wei et.al. | 2308.10809v1 | null |
2023-08-21 | DynED: Dynamic Ensemble Diversification in Data Stream Classification | Soheil Abadifard et.al. | 2308.10807v1 | link |
2023-08-21 | MGMAE: Motion Guided Masking for Video Masked Autoencoding | Bingkun Huang et.al. | 2308.10794v1 | null |
2023-08-21 | Extraction of Text from Optic Nerve Optical Coherence Tomography Reports | Iyad Majid et.al. | 2308.10790v1 | null |
2023-08-21 | Dense Error Map Estimation for MRI-Ultrasound Registration in Brain Tumor Surgery Using Swin UNETR | Soorena Salari et.al. | 2308.10784v1 | null |
2023-08-21 | Superfluid weight in the isolated band limit within the generalized random phase approximation | Minh Tam et.al. | 2308.10780v1 | null |
2023-08-18 | Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization | Soumik Mukhopadhyay et.al. | 2308.09716v1 | link |
2023-08-18 | Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis | Jonathon Luiten et.al. | 2308.09713v1 | null |
2023-08-18 | SimDA: Simple Diffusion Adapter for Efficient Video Generation | Zhen Xing et.al. | 2308.09710v1 | null |
2023-08-18 | Invariant Training 2D-3D Joint Hard Samples for Few-Shot Point Cloud Recognition | Xuanyu Yi et.al. | 2308.09694v1 | null |
2023-08-18 | A Lightweight Transformer for Faster and Robust EBSD Data Collection | Harry Dong et.al. | 2308.09693v1 | link |
2023-08-18 | Audiovisual Moments in Time: A Large-Scale Annotated Dataset of Audiovisual Actions | Michael Joannou et.al. | 2308.09685v1 | link |
2023-08-18 | Quantifying Uncertainties of Contact Classifications in a Human-Robot Collaboration with Parallel Robots | Aran Mohammad et.al. | 2308.09675v1 | null |
2023-08-18 | Classification of modular data up to rank 11 | Siu-Hung Ng et.al. | 2308.09670v1 | null |
2023-08-18 | Collision Isolation and Identification Using Proprioceptive Sensing for Parallel Robots to Enable Human-Robot Collaboration | Aran Mohammad et.al. | 2308.09650v1 | null |
2023-08-18 | Robust Uncertainty Quantification using Conformalised Monte Carlo Prediction | Daniel Bethell et.al. | 2308.09647v1 | link |
2023-08-16 | MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | Henghui Ding et.al. | 2308.08544v1 | link |
2023-08-16 | Deployment and Analysis of Instance Segmentation Algorithm for In-field Grade Estimation of Sweetpotatoes | Hoang M. Nguyen et.al. | 2308.08534v1 | null |
2023-08-16 | Diagnosing Human-object Interaction Detectors | Fangrui Zhu et.al. | 2308.08529v1 | link |
2023-08-17 | Exploiting Point-Wise Attention in 6D Object Pose Estimation Based on Bidirectional Prediction | Yuhao Yang et.al. | 2308.08518v2 | null |
2023-08-17 | Two-and-a-half Order Score-based Model for Solving 3D Ill-posed Inverse Problems | Zirong Li et.al. | 2308.08511v2 | null |
2023-08-16 | ResBuilder: Automated Learning of Depth with Residual Structures | Julian Burghoff et.al. | 2308.08504v1 | null |
2023-08-16 | Galactic Archaeology: Tracing the Milky Way's Formation and Evolution through Stellar Populations | J. Alfredo Collazos et.al. | 2308.08492v1 | null |
2023-08-16 | Label Propagation Techniques for Artifact Detection in Imbalanced Classes using Photoplethysmogram Signals | Clara Macabiau et.al. | 2308.08480v1 | null |
2023-08-16 | DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local Feature Matching | Johan Edstedt et.al. | 2308.08479v1 | link |
2023-08-16 | Classification Committee for Active Deep Object Detection | Lei Zhao et.al. | 2308.08476v1 | null |
2023-08-15 | CoDeF: Content Deformation Fields for Temporally Consistent Video Processing | Hao Ouyang et.al. | 2308.07926v1 | link |
2023-08-15 | Helping Hands: An Object-Aware Ego-Centric Video Recognition Model | Chuhan Zhang et.al. | 2308.07918v1 | link |
2023-08-15 | Relightable and Animatable Neural Avatar from Sparse-View Video | Zhen Xu et.al. | 2308.07903v1 | null |
2023-08-15 | Back to Basics: A Sanity Check on Modern Time Series Classification Algorithms | Bhaskar Dhariyal et.al. | 2308.07886v1 | link |
2023-08-15 | The Challenge of Fetal Cardiac MRI Reconstruction Using Deep Learning | Denis Prokopenko et.al. | 2308.07885v1 | null |
2023-08-15 | Towards Temporal Edge Regression: A Case Study on Agriculture Trade Between Nations | Lekang Jiang et.al. | 2308.07883v1 | link |
2023-08-15 | Synthesizing Political Zero-Shot Relation Classification via Codebook Knowledge, NLI, and ChatGPT | Yibo Hu et.al. | 2308.07876v1 | null |
2023-08-15 | SEDA: Self-Ensembling ViT with Defensive Distillation and Adversarial Training for robust Chest X-rays Classification | Raza Imam et.al. | 2308.07874v1 | link |
2023-08-15 | Sequence Processing with Quantum Tensor Networks | Carys Harvey et.al. | 2308.07865v1 | null |
2023-08-15 | ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition | Yixuan Zhou et.al. | 2308.07815v1 | link |
2023-08-14 | Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification | Olesya Razuvayevskaya et.al. | 2308.07282v1 | null |
2023-08-14 | A Robust Approach Towards Distinguishing Natural and Computer Generated Images using Multi-Colorspace fused and Enriched Vision Transformer | Manjary P Gangan et.al. | 2308.07279v1 | null |
2023-08-14 | EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models | Peng Wang et.al. | 2308.07269v1 | link |
2023-08-14 | Diving with Penguins: Detecting Penguins and their Prey in Animal-borne Underwater Videos via Deep Learning | Kejia Zhang et.al. | 2308.07267v1 | null |
2023-08-14 | Large-kernel Attention for Efficient and Robust Brain Lesion Segmentation | Liam Chalcroft et.al. | 2308.07251v1 | link |
2023-08-14 | LCE -- An Augmented Combination of Bagging and Boosting in Python | Kevin Fauvel et.al. | 2308.07250v1 | link |
2023-08-14 | Large-scale environment mapping and immersive human-robot interaction for agricultural mobile robot teleoperation | Tao Liu et.al. | 2308.07231v1 | null |
2023-08-14 | Almost fine gradings on algebras and classification of gradings up to isomorphism | Alberto Elduque et.al. | 2308.07230v1 | null |
2023-08-14 | Distance Matters For Improving Performance Estimation Under Covariate Shift | Mélanie Roschewitz et.al. | 2308.07223v1 | link |
2023-08-15 | AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes | Zhaohui Li et.al. | 2308.07221v2 | link |
2023-08-11 | ARGUS: Visualization of AI-Assisted Task Guidance in AR | Sonia Castelo et.al. | 2308.06246v1 | null |
2023-08-11 | Exploring Predicate Visual Context in Detecting of Human-Object Interactions | Frederic Z. Zhang et.al. | 2308.06202v1 | link |
2023-08-11 | Weakly Supervised Text Classification on Free Text Comments in Patient-Reported Outcome Measures | Anna-Grace Linton et.al. | 2308.06199v1 | null |
2023-08-11 | Physical Adversarial Attacks For Camera-based Smart Systems: Current Trends, Categorization, Applications, Research Challenges, and Future Outlook | Amira Guesmi et.al. | 2308.06173v1 | null |
2023-08-11 | Extrinsic geometry and linear differential equations of | Boris Doubrov et.al. | 2308.06169v1 | null |
2023-08-11 | Rethinking the Localization in Weakly Supervised Object Localization | Rui Xu et.al. | 2308.06161v1 | null |
2023-08-11 | Identification of the Relevance of Comments in Codes Using Bag of Words and Transformer Based Models | Sruthi S et.al. | 2308.06144v1 | link |
2023-08-11 | Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping | Yasser Abdelaziz Dahou Djilali et.al. | 2308.06112v1 | null |
2023-08-11 | Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation | Philipp Vaeth et.al. | 2308.06100v1 | link |
2023-08-11 | Automated Construction of Time-Space Diagrams for Traffic Analysis Using Street-View Video Sequence | Tanay Rastogi et.al. | 2308.06098v1 | null |
2023-08-10 | Follow Anything: Open-set detection, tracking, and following in real-time | Alaa Maalouf et.al. | 2308.05737v1 | link |
2023-08-10 | FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models | Guangkai Xu et.al. | 2308.05733v1 | null |
2023-08-10 | Optimizing Performance of Feedforward and Convolutional Neural Networks through Dynamic Activation Functions | Chinmay Rane et.al. | 2308.05724v1 | null |
2023-08-10 | Towards the Automorphism Conjecture I: Combinatorial Control and Compensation for Factorials | Bernd S. W. Schröder et.al. | 2308.05715v1 | null |
2023-08-10 | Automatic Extraction of Relevant Road Infrastructure using Connected vehicle data and Deep Learning Model | Adu-Gyamfi Kojo et.al. | 2308.05658v1 | null |
2023-08-10 | Attention-based 3D CNN with Multi-layer Features for Alzheimer's Disease Diagnosis using Brain Images | Yanteng Zhang et.al. | 2308.05655v1 | null |
2023-08-10 | Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization | Zezhong Lv et.al. | 2308.05648v1 | link |
2023-08-10 | Self-Supervised Monocular Depth Estimation by Direction-aware Cumulative Convolution Network | Wencheng Han et.al. | 2308.05605v1 | link |
2023-08-10 | Object Goal Navigation with Recursive Implicit Maps | Shizhe Chen et.al. | 2308.05602v1 | null |
2023-08-10 | You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content | Xinlei He et.al. | 2308.05596v1 | null |
2023-08-09 | Improved Multi-Shot Diffusion-Weighted MRI with Zero-Shot Self-Supervised Learning Reconstruction | Jaejin Cho et.al. | 2308.05103v1 | link |
2023-08-09 | DOST -- Domain Obedient Self-supervised Training for Multi Label Classification with Noisy Labels | Soumadeep Saha et.al. | 2308.05101v1 | null |
2023-08-09 | Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic Role Labeling | Yu Zhao et.al. | 2308.05081v1 | null |
2023-08-10 | Geometric Learning-Based Transformer Network for Estimation of Segmentation Errors | Sneha Sree C et.al. | 2308.05068v2 | null |
2023-08-09 | PAT: Position-Aware Transformer for Dense Multi-Label Action Detection | Faegheh Sardari et.al. | 2308.05051v1 | null |
2023-08-09 | Collaborative Wideband Spectrum Sensing and Scheduling for Networked UAVs in UTM Systems | Sravan Reddy Chintareddy et.al. | 2308.05036v1 | null |
2023-08-09 | Expert load matters: operating networks at high accuracy and low manual effort | Sara Sangalli et.al. | 2308.05035v1 | null |
2023-08-09 | MetRoBERTa: Leveraging Traditional Customer Relationship Management Data to Develop a Transit-Topic-Aware Language Model | Michael Leong et.al. | 2308.05012v1 | null |
2023-08-09 | Exploring Multilingual Text Data Distillation | Shivam Sahni et.al. | 2308.04982v1 | link |
2023-08-09 | CasCIFF: A Cross-Domain Information Fusion Framework Tailored for Cascade Prediction in Social Networks | Hongjun Zhu et.al. | 2308.04961v1 | null |
2023-08-08 | A Deep-Learning Method Using Auto-encoder and Generative Adversarial Network for Anomaly Detection on Ancient Stone Stele Surfaces | Yikun Liu et.al. | 2308.04426v1 | null |
2023-08-08 | A Bi-directional Multi-hop Inference Model for Joint Dialog Sentiment Classification and Act Recognition | Li Zheng et.al. | 2308.04424v1 | null |
2023-08-08 | DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images | Xuechao Zou et.al. | 2308.04417v1 | null |
2023-08-08 | Probabilistic Invariant Learning with Randomized Linear Classifiers | Leonardo Cotta et.al. | 2308.04412v1 | null |
2023-08-08 | Data Augmentation-Based Unsupervised Domain Adaptation In Medical Imaging | Sebastian Nørgaard Llambias et.al. | 2308.04395v1 | null |
2023-08-08 | SSTFormer: Bridging Spiking Neural Network and Memory Support Transformer for Frame-Event based Recognition | Xiao Wang et.al. | 2308.04369v1 | link |
2023-08-08 | Vascular Ageing and Smoking Habit Prediction via a Low-Cost Single-Lead ECG Module | S. Anas Ali et.al. | 2308.04355v1 | null |
2023-08-08 | A Lightweight and Accurate Face Detection Algorithm Based on Retinaface | Baozhu Liu et.al. | 2308.04340v1 | null |
2023-08-08 | Pengembangan Model untuk Mendeteksi Kerusakan pada Terumbu Karang dengan Klasifikasi Citra | Fadhil Muhammad et.al. | 2308.04337v1 | null |
2023-08-08 | Embracing Safe Contacts with Contact-aware Planning and Control | Zhaoting Li et.al. | 2308.04323v1 | null |
2023-08-07 | 3D Motion Magnification: Visualizing Subtle Motions with Time Varying Radiance Fields | Brandon Y. Feng et.al. | 2308.03757v1 | null |
2023-08-07 | What about translation? New coding system for content analysis on the perception of literary translation around the political transformation in 1989 in Hungary as a classification problem on an unbalanced dataset | Dalma Galambos et.al. | 2308.03742v1 | null |
2023-08-07 | Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation | Renjie Liang et.al. | 2308.03725v1 | null |
2023-08-07 | Automated Real Time Delineation of Supraclavicular Brachial Plexus in Neck Ultrasonography Videos: A Deep Learning Approach | Abhay Tyagi et.al. | 2308.03717v1 | null |
2023-08-08 | Communication-Efficient Framework for Distributed Image Semantic Wireless Transmission | Bingyan Xie et.al. | 2308.03713v2 | null |
2023-08-07 | Scaling may be all you need for achieving human-level object recognition capacity with human-like visual experience | A. Emin Orhan et.al. | 2308.03712v1 | link |
2023-08-07 | Video-based Person Re-identification with Long Short-Term Representation Learning | Xuehu Liu et.al. | 2308.03703v1 | null |
2023-08-08 | Screen-based 3D Subjective Experiment Software | Songlin Fan et.al. | 2308.03698v2 | null |
2023-08-07 | Learning Concise and Descriptive Attributes for Visual Recognition | An Yan et.al. | 2308.03685v1 | null |
2023-08-07 | Detecting Spells in Fantasy Literature with a Transformer Based Artificial Intelligence | Marcel Moravek et.al. | 2308.03660v1 | null |
2023-08-04 | Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP | Qihang Yu et.al. | 2308.02487v1 | link |
2023-08-04 | BlindSage: Label Inference Attacks against Node-level Vertical Federated Graph Neural Networks | Marco Arazzi et.al. | 2308.02465v1 | null |
2023-08-04 | Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration | Juan Del Aguila Ferrandis et.al. | 2308.02459v1 | null |
2023-08-04 | Getting the Ball Rolling: Learning a Dexterous Policy for a Biomimetic Tendon-Driven Hand with Rolling Contact Joints | Yasunori Toshimitsu et.al. | 2308.02453v1 | null |
2023-08-04 | Adaptive Preferential Attached kNN Graph With Distribution-Awareness | Shaojie Min et.al. | 2308.02442v1 | link |
2023-08-04 | Scaling Survival Analysis in Healthcare with Federated Survival Forests: A Comparative Study on Heart Failure and Breast Cancer Genomics | Alberto Archetti et.al. | 2308.02382v1 | null |
2023-08-04 | Brain MRI Segmentation using Template-Based Training and Visual Perception Augmentation | Fang-Cheng Yeh et.al. | 2308.02363v1 | null |
2023-08-04 | T-UNet: Triplet UNet for Change Detection in High-Resolution Remote Sensing Images | Huan Zhong et.al. | 2308.02356v1 | link |
2023-08-04 | Adapting to Change: Robust Counterfactual Explanations in Dynamic Data Landscapes | Bardh Prenkaj et.al. | 2308.02353v1 | link |
2023-08-04 | Generative Image Priors for MRI Reconstruction Trained from Magnitude-Only Images | Guanxiong Luo et.al. | 2308.02340v1 | null |
2023-08-03 | FROD: Robust Object Detection for Free | Muhammad et.al. | 2308.01888v1 | null |
2023-08-03 | Similar image retrieval using Autoencoder. I. Automatic morphology classification of galaxies | Eunsuk Seo et.al. | 2308.01871v1 | null |
2023-08-03 | Tag Prediction of Competitive Programming Problems using Deep Learning Techniques | Taha Lokat et.al. | 2308.01863v1 | null |
2023-08-03 | URET: Universal Robustness Evaluation Toolkit (for Evasion) | Kevin Eykholt et.al. | 2308.01840v1 | link |
2023-08-03 | Distribution-Free Inference for the Regression Function of Binary Classification | Ambrus Tamás et.al. | 2308.01835v1 | null |
2023-08-03 | Deep Neural Networks Fused with Textures for Image Classification | Asish Bera et.al. | 2308.01813v1 | null |
2023-08-03 | Deep Learning-based Prediction of Stress and Strain Maps in Arterial Walls for Improved Cardiovascular Risk Assessment | Yasin Shokrollahi et.al. | 2308.01771v1 | null |
2023-08-03 | Focus on Content not Noise: Improving Image Generation for Nuclei Segmentation by Suppressing Steganography in CycleGAN | Jonas Utz et.al. | 2308.01769v1 | null |
2023-08-03 | A Novel Tensor Decomposition of arbitrary order based on Block Convolution with Reflective Boundary Conditions for Multi-Dimensional Data Analysis | Mahdi Molavi et.al. | 2308.01768v1 | null |
2023-08-03 | NuInsSeg: A Fully Annotated Dataset for Nuclei Instance Segmentation in H&E-Stained Histological Images | Amirreza Mahbod et.al. | 2308.01760v1 | link |
2023-08-02 | ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders | Shawn Xu et.al. | 2308.01317v1 | null |
2023-08-02 | More Context, Less Distraction: Visual Classification by Inferring and Conditioning on Contextual Attributes | Bang An et.al. | 2308.01313v1 | link |
2023-08-02 | Revisiting DETR Pre-training for Object Detection | Yan Ma et.al. | 2308.01300v1 | null |
2023-08-02 | A Probabilistic Approach to Self-Supervised Learning using Cyclical Stochastic Gradient MCMC | Masoumeh Javanbakhat et.al. | 2308.01271v1 | null |
2023-08-02 | Incorporating Season and Solar Specificity into Renderings made by a NeRF Architecture using Satellite Images | Michael Gableman et.al. | 2308.01262v1 | link |
2023-08-02 | Quantum Imprint of the Anharmonic Oscillator | Prisco Lo Chiatto et.al. | 2308.01244v1 | null |
2023-08-03 | CMUNeXt: An Efficient Medical Image Segmentation Network based on Large Kernel and Skip Fusion | Fenghe Tang et.al. | 2308.01239v2 | link |
2023-08-02 | LSF-IDM: Lightweight Deep Learning Models for Automotive Intrusion Detection Model Based on Semantic Fusion | Pengzhou Cheng et.al. | 2308.01237v1 | null |
2023-08-02 | JADES. The diverse population of infant Black Holes at 4<z<11: merging, tiny, poor, but mighty | Roberto Maiolino et.al. | 2308.01230v1 | null |
2023-08-02 | TeachCLIP: Multi-Grained Teaching for Efficient Text-to-Video Retrieval | Kaibin Tian et.al. | 2308.01217v1 | null |
2023-08-01 | Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models | Cheng-Yu Hsieh et.al. | 2308.00675v1 | null |
2023-08-01 | Human-M3: A Multi-view Multi-modal Dataset for 3D Human Pose Estimation in Outdoor Scenes | Bohao Fan et.al. | 2308.00628v1 | link |
2023-08-01 | NeRT: Implicit Neural Representations for General Unsupervised Turbulence Mitigation | Weiyun Jiang et.al. | 2308.00622v1 | null |
2023-08-01 | Beyond One-Hot-Encoding: Injecting Semantics to Drive Image Classifiers | Alan Perotti et.al. | 2308.00607v1 | link |
2023-08-01 | Relation-Aware Distribution Representation Network for Person Clustering with Multiple Modalities | Kaijian Liu et.al. | 2308.00588v1 | null |
2023-08-01 | Gradient Scaling on Deep Spiking Neural Networks with Spike-Dependent Local Information | Seongsik Park et.al. | 2308.00558v1 | null |
2023-08-01 | SF-IDS: An Imbalanced Semi-Supervised Learning Framework for Fine-grained Intrusion Detection | Xinran Zheng et.al. | 2308.00542v1 | null |
2023-08-01 | Compressed Private Aggregation for Scalable and Robust Federated Learning over Massive Networks | Natalie Lang et.al. | 2308.00540v1 | link |
2023-08-01 | Predicting Early Dropouts of an Active and Healthy Ageing App | Vasileios Perifanis et.al. | 2308.00539v1 | null |
2023-08-01 | PressureTransferNet: Human Attribute Guided Dynamic Ground Pressure Profile Transfer using 3D simulated Pressure Maps | Lala Shakti Swarup Ray et.al. | 2308.00538v1 | null |
2023-07-31 | A Quantized Interband Topological Index in Two-Dimensional Systems | Tharindu Fernando et.al. | 2307.16893v1 | null |
2023-07-31 | Foundational Models for Fault Diagnosis of Electrical Motors | Sriram Anbalagan et.al. | 2307.16891v1 | null |
2023-07-31 | Discovering Adaptable Symbolic Algorithms from Scratch | Stephen Kelly et.al. | 2307.16890v1 | null |
2023-07-31 | Universal Adversarial Defense in Remote Sensing Based on Pre-trained Denoising Diffusion Models | Weikang Yu et.al. | 2307.16865v1 | null |
2023-07-31 | Nonlinearity-induced topological phase transition characterized by the nonlinear Chern number | Kazuki Sone et.al. | 2307.16827v1 | null |
2023-07-31 | Defense of Adversarial Ranking Attack in Text Retrieval: Benchmark and Baseline via Detection | Xuanang Chen et.al. | 2307.16816v1 | null |
2023-07-31 | Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment | Kun Yuan et.al. | 2307.16813v1 | null |
2023-07-31 | DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures | Hannah Rose Kirk et.al. | 2307.16811v1 | null |
2023-07-31 | DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation | Yue Zhang et.al. | 2307.16803v1 | null |
2023-07-31 | Classification with Deep Neural Networks and Logistic Loss | Zihan Zhang et.al. | 2307.16792v1 | null |
2023-07-28 | Quantum-noise-limited optical neural networks operating at a few quanta per activation | Shi-Yuan Ma et.al. | 2307.15712v1 | null |
2023-07-31 | MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking | Ruopeng Gao et.al. | 2307.15700v2 | null |
2023-07-28 | PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding | Davide Boscaini et.al. | 2307.15692v1 | null |
2023-07-28 | ODTlearn: A Package for Learning Optimal Decision Trees for Prediction and Prescription | Patrick Vossler et.al. | 2307.15691v1 | link |
2023-07-28 | Dynamic Analysis and an Eigen Initializer for Recurrent Neural Networks | Ran Dou et.al. | 2307.15679v1 | null |
2023-07-28 | Bayesian Time-Series Classifier for Decoding Simple Visual Stimuli from Intracranial Neural Activity | Navid Ziaei et.al. | 2307.15672v1 | null |
2023-07-28 | Classifying core collapse supernova remnants by their morphology as shaped by the last exploding jets | Noam Soker et.al. | 2307.15666v1 | null |
2023-07-28 | Multi-layer Aggregation as a key to feature-based OOD detection | Benjamin Lambert et.al. | 2307.15647v1 | null |
2023-07-28 | Scale-aware Test-time Click Adaptation for Pulmonary Nodule and Mass Segmentation | Zhihao Li et.al. | 2307.15645v1 | link |
2023-07-28 | TriadNet: Sampling-free predictive intervals for lesional volume in 3D brain MR images | Benjamin Lambert et.al. | 2307.15638v1 | null |
2023-07-27 | PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking | Yang Zheng et.al. | 2307.15055v1 | null |
2023-07-27 | A Transformer-based Approach for Arabic Offline Handwritten Text Recognition | Saleh Momeni et.al. | 2307.15045v1 | null |
2023-07-27 | Drive Asymmetry, Convergence and the Origin of Turbulence in ICF Implosions | Vincent A. Thomas et.al. | 2307.15028v1 | null |
2023-07-27 | Self-Supervised Graph Transformer for Deepfake Detection | Aminollah Khormali et.al. | 2307.15019v1 | null |
2023-07-27 | The last patch for classifying shuffle groups | Junyang Zhang et.al. | 2307.15012v1 | null |
2023-07-27 | Gzip versus bag-of-words for text classification with KNN | Juri Opitz et.al. | 2307.15002v1 | null |
2023-07-27 | Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs | Or Sharir et.al. | 2307.14988v1 | null |
2023-07-27 | Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models | Ziyi Wang et.al. | 2307.14971v1 | link |
2023-07-27 | Federated Model Aggregation via Self-Supervised Priors for Highly Imbalanced Medical Image Classification | Marawan Elbatel et.al. | 2307.14959v1 | link |
2023-07-27 | Multi-Source Domain Adaptation through Dataset Dictionary Learning in Wasserstein Space | Eduardo Fernandes Montesuma et.al. | 2307.14953v1 | null |
2023-07-26 | MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation | Rajeev Yasarla et.al. | 2307.14336v1 | null |
2023-07-26 | Event-based Vision for Early Prediction of Manipulation Actions | Daniel Deniz et.al. | 2307.14332v1 | null |
2023-07-26 | Waypoint-Based Imitation Learning for Robotic Manipulation | Lucy Xiaoyang Shi et.al. | 2307.14326v1 | null |
2023-07-26 | Unraveling the Complexity of Splitting Sequential Data: Tackling Challenges in Video and Time Series Analysis | Diego Botache et.al. | 2307.14294v1 | null |
2023-07-26 | G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory | Hongxiang Li et.al. | 2307.14277v1 | null |
2023-07-26 | Deepfake Image Generation for Improved Brain Tumor Segmentation | Roa'a Al-Emaryeen et.al. | 2307.14273v1 | null |
2023-07-26 | Sim-to-Real Model-Based and Model-Free Deep Reinforcement Learning for Tactile Pushing | Max Yang et.al. | 2307.14272v1 | null |
2023-07-26 | Artifact Restoration in Histology Images with Diffusion Probabilistic Models | Zhenqi He et.al. | 2307.14262v1 | link |
2023-07-26 | Defending Adversarial Patches via Joint Region Localizing and Inpainting | Junwen Chen et.al. | 2307.14242v1 | null |
2023-07-26 | DisguisOR: Holistic Face Anonymization for the Operating Room | Lennart Bastian et.al. | 2307.14241v1 | link |
2023-07-25 | RED CoMETS: An ensemble classifier for symbolically represented multivariate time series | Luca A. Bennett et.al. | 2307.13679v1 | link |
2023-07-25 | QuickQual: Lightweight, convenient retinal image quality scoring with off-the-shelf pretrained models | Justin Engelmann et.al. | 2307.13646v1 | link |
2023-07-25 | Manifestly Covariant Worldline Actions from Coadjoint Orbits. Part I: Generalities and Vectorial Descriptions | Thomas Basile et.al. | 2307.13644v1 | null |
2023-07-25 | Optical Flow boosts Unsupervised Localization and Segmentation | Xinyu Zhang et.al. | 2307.13640v1 | link |
2023-07-25 | Insights into Cognitive Engagement: Comparing the Effectiveness of Game-Based and Video-Based Learning | Shayla Sharmin et.al. | 2307.13637v1 | null |
2023-07-25 | Contributions to the Improvement of Question Answering Systems in the Biomedical Domain | Mourad Sarrouti et.al. | 2307.13631v1 | null |
2023-07-25 | Chandra X-ray Observatory Observations of 13 Fermi LAT Sources | Blagoy Rangelov et.al. | 2307.13594v1 | null |
2023-07-25 | Reinterpreting survival analysis in the universal approximator age | Sören Dittmer et.al. | 2307.13579v1 | link |
2023-07-25 | PT$\mathrm{L}^{p}$: Partial Transport | Xinran Liu et.al. | 2307.13571v1 | null |
2023-07-25 | Group Activity Recognition in Computer Vision: A Comprehensive Review, Challenges, and Future Perspectives | Chuanchuan Wang et.al. | 2307.13541v1 | null |
2023-07-24 | Leveraging Label Variation in Large Language Models for Zero-Shot Text Classification | Flor Miriam Plaza-del-Arco et.al. | 2307.12973v1 | null |
2023-07-24 | A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning | Benjamin Eysenbach et.al. | 2307.12968v1 | link |
2023-07-24 | Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment | Sarah Ibrahimi et.al. | 2307.12964v1 | null |
2023-07-24 | Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection | Christopher Clarke et.al. | 2307.12935v1 | link |
2023-07-25 | Towards a Visual-Language Foundation Model for Computational Pathology | Ming Y. Lu et.al. | 2307.12914v2 | null |
2023-07-24 | Dyn-E: Local Appearance Editing of Dynamic Neural Radiance Fields | Shangzhan Zhang et.al. | 2307.12909v1 | null |
2023-07-24 | Conditional Residual Coding: A Remedy for Bottleneck Problems in Conditional Inter Frame Coding | Fabian Brand et.al. | 2307.12864v1 | null |
2023-07-24 | Multiscale Video Pretraining for Long-Term Activity Forecasting | Reuben Tan et.al. | 2307.12854v1 | null |
2023-07-25 | Spatiotemporal Modeling Encounters 3D Medical Image Analysis: Slice-Shift UNet with Multi-View Fusion | C. I. Ugwu et.al. | 2307.12853v2 | null |
2023-07-24 | Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization | Hancheng Min et.al. | 2307.12851v1 | null |
2023-07-21 | Advanced Monte Carlo simulation techniques to study polymers under equilibrium conditions | Monika Angwani et.al. | 2307.11722v1 | null |
2023-07-21 | Deep Learning Hyperspectral Pansharpening on large scale PRISMA dataset | Simone Zini et.al. | 2307.11666v1 | null |
2023-07-21 | FEDD -- Fair, Efficient, and Diverse Diffusion-based Lesion Segmentation and Malignancy Classification | Héctor Carrión et.al. | 2307.11654v1 | null |
2023-07-21 | Sparse Cholesky factorization by greedy conditional selection | Stephen Huan et.al. | 2307.11648v1 | link |
2023-07-24 | Morphological Image Analysis and Feature Extraction for Reasoning with AI-based Defect Detection and Classification Models | Jiajun Zhang et.al. | 2307.11643v2 | null |
2023-07-21 | Deep Reinforcement Learning Based System for Intraoperative Hyperspectral Video Autofocusing | Charlie Budd et.al. | 2307.11638v1 | null |
2023-07-21 | Computational Image Formation | Stanley H. Chan et.al. | 2307.11635v1 | null |
2023-07-21 | Finding Optimal Diverse Feature Sets with Alternative Feature Selection | Jakob Bach et.al. | 2307.11607v1 | null |
2023-07-21 | Cascaded multitask U-Net using topological loss for vessel segmentation and centerline extraction | Pierre Rougé et.al. | 2307.11603v1 | null |
2023-07-21 | Mixbiotic society measures: Assessment of community well-going as living system | Takeshi Kato et.al. | 2307.11594v1 | null |
2023-07-20 | GLSFormer: Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos | Nisarg A. Shah et.al. | 2307.11081v1 | link |
2023-07-20 | Driving Policy Prediction based on Deep Learning Models | Fuxiao Liu et.al. | 2307.11058v1 | null |
2023-07-20 | Cascade-DETR: Delving into High-Quality Universal Object Detection | Mingqiao Ye et.al. | 2307.11035v1 | link |
2023-07-20 | Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification | Neel Guha et.al. | 2307.11031v1 | null |
2023-07-20 | Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering | Yijun Dong et.al. | 2307.11030v1 | null |
2023-07-20 | Multi-objective point cloud autoencoders for explainable myocardial infarction prediction | Marcel Beetz et.al. | 2307.11017v1 | null |
2023-07-20 | Treatment And Follow-Up Guidelines For Multiple Brain Metastases: A Systematic Review | Ana Sofia Santos et.al. | 2307.11016v1 | null |
2023-07-21 | Dense Sample Deep Learning | Stephen Josè Hanson et.al. | 2307.10991v2 | null |
2023-07-20 | Deep Spiking-UNet for Image Processing | Hebei Li et.al. | 2307.10974v1 | link |
2023-07-20 | Spinal nerve segmentation method and dataset construction in endoscopic surgical scenarios | Shaowu Peng et.al. | 2307.10955v1 | link |
2023-07-19 | DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering | Wei Cheng et.al. | 2307.10173v1 | link |
2023-07-19 | Adversarial Latent Autoencoder with Self-Attention for Structural Image Synthesis | Jiajie Fan et.al. | 2307.10166v1 | null |
2023-07-19 | Leveraging Visemes for Better Visual Speech Representation and Lip Reading | Javad Peymanfard et.al. | 2307.10157v1 | null |
2023-07-19 | Remarks on a theorem of Pink in presence of bad reduction | Wojciech Gajda et.al. | 2307.10140v1 | null |
2023-07-19 | Gradient Sparsification For Masked Fine-Tuning of Transformers | James O'Neill et.al. | 2307.10098v1 | null |
2023-07-19 | Boundary-Refined Prototype Generation: A General End-to-End Paradigm for Semi-Supervised Semantic Segmentation | Junhao Dong et.al. | 2307.10097v1 | null |
2023-07-19 | Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis | Lingting Zhu et.al. | 2307.10094v1 | null |
2023-07-19 | Divert More Attention to Vision-Language Object Tracking | Mingzhe Guo et.al. | 2307.10046v1 | link |
2023-07-19 | A non-monotone extra-gradient trust-region method with noisy oracles | Natasa Krejic et.al. | 2307.10038v1 | null |
2023-07-20 | Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition | Jia-Xin Zhuang et.al. | 2307.10036v2 | null |
2023-07-18 | AnyDoor: Zero-shot Object-level Image Customization | Xi Chen et.al. | 2307.09481v1 | null |
2023-07-18 | FACTS: Facial Animation Creation using the Transfer of Styles | Jack Saunders et.al. | 2307.09480v1 | null |
2023-07-18 | GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping | Zhuoling Li et.al. | 2307.09472v1 | null |
2023-07-18 | Smooth Attention for Deep Multiple Instance Learning: Application to CT Intracranial Hemorrhage Detection | Yunan Wu et.al. | 2307.09457v1 | link |
2023-07-19 | A comparative analysis of SRGAN models | Fatemeh Rezapoor Nikroo et.al. | 2307.09456v2 | null |
2023-07-19 | Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers | Jaeyoung Kim et.al. | 2307.09455v2 | null |
2023-07-18 | Measuring Student Behavioral Engagement using Histogram of Actions | Ahmed Abdelkawy et.al. | 2307.09420v1 | null |
2023-07-18 | Is this Snippet Written by ChatGPT? An Empirical Study with a CodeBERT-Based Classifier | Phuong T. Nguyen et.al. | 2307.09381v1 | null |
2023-07-18 | CertPri: Certifiable Prioritization for Deep Neural Networks via Movement Cost in Feature Space | Haibin Zheng et.al. | 2307.09375v1 | null |
2023-07-18 | Enhancing Pattern Classification in Support Vector Machines through Matrix Formulation | Sambhav Jain Reshma Rastogi et.al. | 2307.09372v1 | null |
2023-07-17 | Diffusion Models Beat GANs on Image Classification | Soumik Mukhopadhyay et.al. | 2307.08702v1 | null |
2023-07-17 | Neural Video Depth Stabilizer | Yiran Wang et.al. | 2307.08695v1 | link |
2023-07-17 | SEMI-DiffusionInst: A Diffusion Model Based Approach for Semiconductor Defect Classification and Segmentation | Vic De Ridder et.al. | 2307.08693v1 | null |
2023-07-17 | FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning | Tri Dao et.al. | 2307.08691v1 | link |
2023-07-17 | Implementation of a perception system for autonomous vehicles using a detection-segmentation network in SoC FPGA | Maciej Baczmanski et.al. | 2307.08682v1 | null |
2023-07-17 | Neural Image Compression: Generalization, Robustness, and Spectral Biases | Kelsey Lieberman et.al. | 2307.08657v1 | null |
2023-07-17 | PolyGNN: Polyhedron-based Graph Neural Network for 3D Building Reconstruction from Point Clouds | Zhaiyu Chen et.al. | 2307.08636v1 | null |
2023-07-17 | Deficiency-Aware Masked Transformer for Video Inpainting | Yongsheng Yu et.al. | 2307.08629v1 | link |
2023-07-17 | BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs | Yang Zhao et.al. | 2307.08581v1 | null |
2023-07-18 | Deep Learning with Passive Optical Nonlinear Mapping | Fei Xia et.al. | 2307.08558v2 | null |
2023-07-14 | Expressive Monotonic Neural Networks | Ouail Kitouni et.al. | 2307.07512v1 | link |
2023-07-14 | Streaming CTR Prediction: Rethinking Recommendation Task for Real-World Streaming Data | Qi-Wei Wang et.al. | 2307.07509v1 | null |
2023-07-14 | Brain Tumor Detection using Convolutional Neural Networks with Skip Connections | Aupam Hamran et.al. | 2307.07503v1 | null |
2023-07-14 | TALL: Thumbnail Layout for Deepfake Video Detection | Yuting Xu et.al. | 2307.07494v1 | null |
2023-07-14 | DreamTeacher: Pretraining Image Backbones with Deep Generative Models | Daiqing Li et.al. | 2307.07487v1 | null |
2023-07-14 | Multimodal Distillation for Egocentric Action Recognition | Gorjan Radevski et.al. | 2307.07483v1 | null |
2023-07-14 | Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based Tumor Classification | Simon Holdenried-Krafft et.al. | 2307.07482v1 | null |
2023-07-14 | Passage-times for partially-homogeneous reflected random walks on the quadrant | Conrado da Costa et.al. | 2307.07458v1 | null |
2023-07-14 | An equivariant surgery classification of | Kelly Pohland et.al. | 2307.07446v1 | null |
2023-07-14 | Can Large Language Models Empower Molecular Property Prediction? | Chen Qian et.al. | 2307.07443v1 | link |
2023-07-13 | Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition | Syed Talal Wasim et.al. | 2307.06947v1 | link |
2023-07-13 | InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation | Yi Wang et.al. | 2307.06942v1 | link |
2023-07-13 | Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation | Yingqing He et.al. | 2307.06940v1 | link |
2023-07-13 | DRAGON: A Dialogue-Based Robot for Assistive Navigation with Visual Language Grounding | Shuijing Liu et.al. | 2307.06924v1 | null |
2023-07-13 | Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks | Liam Collins et.al. | 2307.06887v1 | null |
2023-07-13 | LVLane: Deep Learning for Lane Detection and Classification in Challenging Conditions | Zillur Rahman et.al. | 2307.06853v1 | link |
2023-07-13 | Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks | Denis Coquenet et.al. | 2307.06795v1 | link |
2023-07-13 | Robotic surface exploration with vision and tactile sensing for cracks detection and characterisation | Francesca Palermo et.al. | 2307.06784v1 | null |
2023-07-13 | Generalizing Supervised Deep Learning MRI Reconstruction to Multiple and Unseen Contrasts using Meta-Learning Hypernetworks | Sriprabha Ramanarayanan et.al. | 2307.06771v1 | link |
2023-07-13 | Pairs of inner projections and two applications | Ramlal Debnath et.al. | 2307.06744v1 | null |
2023-07-12 | Deep Learning of Crystalline Defects from TEM images: A Solution for the Problem of "Never Enough Training Data" | Kishan Govind et.al. | 2307.06322v1 | null |
2023-07-12 | A geometric classification of rod complements in the 3-torus | Connie On Yu Hui et.al. | 2307.06317v1 | null |
2023-07-12 | Facial Reenactment Through a Personalized Generator | Ariel Elazary et.al. | 2307.06307v1 | null |
2023-07-12 | Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution | Mostafa Dehghani et.al. | 2307.06304v1 | null |
2023-07-12 | Feature Embeddings from Large-Scale Acoustic Bird Classifiers Enable Few-Shot Transfer Learning | Burooj Ghani et.al. | 2307.06292v1 | null |
2023-07-12 | Stochastic Light Field Holography | Florian Schiffers et.al. | 2307.06277v1 | null |
2023-07-12 | Machine learning and Topological data analysis identify unique features of human papillae in 3D scans | Rayna Andreeva et.al. | 2307.06255v1 | null |
2023-07-12 | On the Importance of Denoising when Learning to Compress Images | Benoit Brummer et.al. | 2307.06233v1 | link |
2023-07-12 | Ashaar: Automatic Analysis and Generation of Arabic Poetry Using Deep Learning Approaches | Zaid Alyafeai et.al. | 2307.06218v1 | link |
2023-07-12 | Local Conditional Neural Fields for Versatile and Generalizable Large-Scale Reconstructions in Computational Imaging | Hao Wang et.al. | 2307.06207v1 | null |
2023-07-11 | Fractonic Higher-Order Topological Phases in Open Quantum Systems | Jian-Hao Zhang et.al. | 2307.05474v1 | null |
2023-07-11 | Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives | Tom Monnier et.al. | 2307.05473v1 | null |
2023-07-11 | EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone | Shraman Pramanick et.al. | 2307.05463v1 | null |
2023-07-11 | Improving the Security of Smartwatch Payment with Deep Learning | George Webber et.al. | 2307.05437v1 | null |
2023-07-11 | One-Versus-Others Attention: Scalable Multimodal Integration | Michal Golovanevsky et.al. | 2307.05435v1 | link |
2023-07-11 | Identifying Acoustic Wave Sources on the Sun. II. Improved Filter Techniques for Source Wavefield Seismology | Shah Mohammad Bahauddin et.al. | 2307.05433v1 | null |
2023-07-11 | Effective Whitney Stratification of Real Algebraic Varieties | Martin Helmer et.al. | 2307.05427v1 | null |
2023-07-11 | Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform | Mateusz Wójcik et.al. | 2307.05399v1 | link |
2023-07-11 | ShredGP: Guitarist Style-Conditioned Tablature Generation | Pedro Sarmento et.al. | 2307.05324v1 | null |
2023-07-11 | Class Instance Balanced Learning for Long-Tailed Classification | Marc-Antoine Lavoie et.al. | 2307.05322v1 | null |
2023-07-10 | Semantic-SAM: Segment and Recognize Anything at Any Granularity | Feng Li et.al. | 2307.04767v1 | link |
2023-07-10 | Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos | Sagnik Majumder et.al. | 2307.04760v1 | null |
2023-07-10 | Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement | Anthony Simeonov et.al. | 2307.04751v1 | null |
2023-07-10 | RoCo: Dialectic Multi-Robot Collaboration with Large Language Models | Zhao Mandi et.al. | 2307.04738v1 | link |
2023-07-10 | AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning | Yuwei Guo et.al. | 2307.04725v1 | null |
2023-07-10 | Quark/Gluon Discrimination and Top Tagging with Dual Attention Transformer | Minxuan He et.al. | 2307.04723v1 | null |
2023-07-10 | CVPR MultiEarth 2023 Deforestation Estimation Challenge:SpaceVision4Amazon | Sunita Arya et.al. | 2307.04715v1 | null |
2023-07-10 | Multimodal brain age estimation using interpretable adaptive population-graph learning | Kyriaki-Margarita Bintsi et.al. | 2307.04639v1 | null |
2023-07-10 | Learning Fine Pinch-Grasp Skills using Tactile Sensing from Real Demonstration Data | Xiaofeng Mao et.al. | 2307.04619v1 | null |
2023-07-10 | Weakly-supervised positional contrastive learning: application to cirrhosis classification | Emma Sarfati et.al. | 2307.04617v1 | null |
2023-07-07 | On the representation theory of cyclic and dihedral quandles | Mohamed Elhamdadi et.al. | 2307.03728v1 | null |
2023-07-07 | Polybot: Training One Policy Across Robots While Embracing Variability | Jonathan Yang et.al. | 2307.03719v1 | null |
2023-07-07 | Motion Magnification in Robotic Sonography: Enabling Pulsation-Aware Artery Segmentation | Dianye Huang et.al. | 2307.03698v1 | null |
2023-07-07 | Detecting the Sensing Area of A Laparoscopic Probe in Minimally Invasive Cancer Surgery | Baoru Huang et.al. | 2307.03662v1 | null |
2023-07-07 | Physical-aware Cross-modal Adversarial Network for Wearable Sensor-based Human Action Recognition | Jianyuan Ni et.al. | 2307.03638v1 | null |
2023-07-07 | VesselVAE: Recursive Variational Autoencoders for 3D Blood Vessel Synthesis | Paula Feldman et.al. | 2307.03592v1 | null |
2023-07-07 | SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks | Xingyu Lin et.al. | 2307.03567v1 | null |
2023-07-07 | VariGrad: A Novel Feature Vector Architecture for Geometric Deep Learning on Unregistered Data | Emmanuel Hartman et.al. | 2307.03553v1 | null |
2023-07-07 | TBGC: Task-level Backbone-Oriented Gradient Clip for Multi-Task Foundation Model Learning | Zelun Zhang et.al. | 2307.03465v1 | null |
2023-07-07 | A Deep Active Contour Model for Delineating Glacier Calving Fronts | Konrad Heidler et.al. | 2307.03461v1 | null |
2023-07-06 | Synthesizing Artistic Cinemagraphs from Text | Aniruddha Mahapatra et.al. | 2307.03190v1 | null |
2023-07-06 | Long-term follow-up observations of extreme coronal line emitting galaxies | Peter Clark et.al. | 2307.03182v1 | null |
2023-07-06 | Push Past Green: Learning to Look Behind Plant Foliage by Moving It | Xiaoyu Zhang et.al. | 2307.03175v1 | null |
2023-07-06 | VideoGLUE: Video General Understanding Evaluation of Foundation Models | Liangzhe Yuan et.al. | 2307.03166v1 | null |
2023-07-06 | Can Domain Adaptation Improve Accuracy and Fairness of Skin Lesion Classification? | Janet Wang et.al. | 2307.03157v1 | null |
2023-07-06 | MultiVENT: Multilingual Videos of Events with Aligned Natural Text | Kate Sanders et.al. | 2307.03153v1 | null |
2023-07-06 | Topology-Aware Loss for Aorta and Great Vessel Segmentation in Computed Tomography Images | Seher Ozcelik et.al. | 2307.03137v1 | null |
2023-07-06 | Distilling Large Vision-Language Model with Out-of-Distribution Generalizability | Xuanlin Li et.al. | 2307.03135v1 | link |
2023-07-06 | Benchmarking Test-Time Adaptation against Distribution Shifts in Image Classification | Yongcan Yu et.al. | 2307.03133v1 | link |
2023-07-06 | VisKoP: Visual Knowledge oriented Programming for Interactive Knowledge Base Question Answering | Zijun Yao et.al. | 2307.03130v1 | null |
2023-07-05 | Building Cooperative Embodied Agents Modularly with Large Language Models | Hongxin Zhang et.al. | 2307.02485v1 | null |
2023-07-05 | Elastic Decision Transformer | Yueh-Hua Wu et.al. | 2307.02484v1 | null |
2023-07-05 | What Matters in Training a GPT4-Style Language Model with Multimodal Inputs? | Yan Zeng et.al. | 2307.02469v1 | null |
2023-07-05 | Supersymmetric asymptotically locally AdS$_5$ gravitational solitons | Turkuler Durgut et.al. | 2307.02466v1 | null |
2023-07-05 | AxonCallosumEM Dataset: Axon Semantic Segmentation of Whole Corpus Callosum cross section from EM Images | Ao Cheng et.al. | 2307.02464v1 | null |
2023-07-05 | Expert-Agnostic Ultrasound Image Quality Assessment using Deep Variational Clustering | Deepak Raina et.al. | 2307.02462v1 | null |
2023-07-05 | LLCaps: Learning to Illuminate Low-Light Capsule Endoscopy with Curved Wavelet Attention and Reverse Diffusion | Long Bai et.al. | 2307.02452v1 | link |
2023-07-05 | On Deep Learning Classification of Digitally Modulated Signals Using Raw I/Q Data | John A. Snoap et.al. | 2307.02450v1 | null |
2023-07-05 | Vulnerable Source Code Detection using SonarCloud Code Analysis | Alifia Puspaningrum et.al. | 2307.02446v1 | null |
2023-07-05 | Base Layer Efficiency in Scalable Human-Machine Coding | Yalda Foroutan et.al. | 2307.02430v1 | null |
2023-07-03 | Real-time Monocular Full-body Capture in World Space via Sequential Proxy-to-Motion Learning | Yuxiang Zhang et.al. | 2307.01200v1 | null |
2023-07-03 | Segment Anything Meets Point Tracking | Frano Rajič et.al. | 2307.01197v1 | link |
2023-07-03 | Online nearest neighbor classification | Sanjoy Dasgupta et.al. | 2307.01170v1 | null |
2023-07-03 | Don't freeze: Finetune encoders for better Self-Supervised HAR | Vitor Fortes Rey et.al. | 2307.01168v1 | null |
2023-07-03 | Characteristic signatures of accreting binary black holes produced by eccentric minidisks | John Ryan Westernacher-Schneider et.al. | 2307.01154v1 | null |
2023-07-03 | Integral cohomology rings of weighted Grassmann orbifolds and Rigidity properties | Koushik Brahma et.al. | 2307.01153v1 | null |
2023-07-03 | Investigating Data Memorization in 3D Latent Diffusion Models for Medical Image Synthesis | Salman Ul Hassan Dar et.al. | 2307.01148v1 | null |
2023-07-05 | AVSegFormer: Audio-Visual Segmentation with Transformer | Shengyi Gao et.al. | 2307.01146v2 | link |
2023-07-03 | Cross-modality Attention Adapter: A Glioma Segmentation Fine-tuning Method for SAM Using Multimodal Brain MR Images | Xiaoyu Shi et.al. | 2307.01124v1 | null |
2023-07-03 | Supervised Manifold Learning via Random Forest Geometry-Preserving Proximities | Jake S. Rhodes et.al. | 2307.01077v1 | null |
2023-07-03 | SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs | Lijun Yu et.al. | 2306.17842v2 | null |
2023-06-30 | Learning Evacuee Models from Robot-Guided Emergency Evacuation Experiments | Mollik Nayyar et.al. | 2306.17824v1 | null |
2023-06-30 | Act3D: Infinite Resolution Action Detection Transformer for Robotic Manipulation | Theophile Gervet et.al. | 2306.17817v1 | null |
2023-06-30 | Topologically Attributed Graphs for Shape Discrimination | Justin Curry et.al. | 2306.17805v1 | null |
2023-06-30 | Vision Through the Veil: Differential Privacy in Federated Learning for Medical Image Classification | Kishore Babu Nampalle et.al. | 2306.17794v1 | null |
2023-06-30 | Precision Anti-Cancer Drug Selection via Neural Ranking | Vishal Dey et.al. | 2306.17771v1 | null |
2023-06-30 | Improved NL2SQL based on Multi-layer Expert Network | Chenduo Hao et.al. | 2306.17727v1 | null |
2023-06-30 | Content-Preserving Diffusion Model for Unsupervised AS-OCT image Despeckling | Li Sanqian et.al. | 2306.17717v1 | null |
2023-06-30 | Evaluation of the Benefits of Zero Velocity Update in Decentralized EKF-Based Cooperative Localization Algorithms for GNSS-Denied Multi-Robot Systems | Cagri Kilic et.al. | 2306.17703v1 | null |
2023-06-30 | Generalized Time Warping Invariant Dictionary Learning for Time Series Classification and Clustering | Ruiyu Xu et.al. | 2306.17690v1 | null |
2023-06-29 | An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training | Zitian Chen et.al. | 2306.17165v1 | null |
2023-06-29 | Can Machines Garden? Systematically Comparing the AlphaGarden vs. Professional Horticulturalists | Simeon Adebola et.al. | 2306.17162v1 | null |
2023-06-29 | FogROS2-SGC: A ROS2 Cloud Robotics Platform for Secure Global Connectivity | Kaiyuan Chen et.al. | 2306.17157v1 | null |
2023-06-29 | Orbit Classification of asteroids using implementation of radial Basis Function on Support Vector Machines | Yashvir Tiberwal et.al. | 2306.17138v1 | null |
2023-06-29 | On separably integrable symmetric convex bodies | Vladyslav Yaskin et.al. | 2306.17127v1 | null |
2023-06-29 | PVP: Personalized Video Prior for Editable Dynamic Portraits using StyleGAN | Kai-En Lin et.al. | 2306.17123v1 | null |
2023-06-29 | Learning Nuclei Representations with Masked Image Modelling | Piotr Wójcik et.al. | 2306.17116v1 | null |
2023-06-29 | Deep Ensemble for Rotorcraft Attitude Prediction | Hikmat Khan et.al. | 2306.17104v1 | null |
2023-06-29 | Twice Binnable Color Filter Arrays | Mritunjay Singh et.al. | 2306.17078v1 | null |
2023-06-29 | Extremal behavior of reduced type of one dimensional rings | Sarasij Maitra et.al. | 2306.17069v1 | null |
2023-06-28 | Class Numbers, Congruent Numbers and Umbral Moonshine | Miranda C. N. Cheng et.al. | 2306.16414v1 | null |
2023-06-28 | Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise | Ilias Diakonikolas et.al. | 2306.16352v1 | null |
2023-06-28 | Accurate, uncertainty-aware classification of molecular chemical motifs from multi-modal X-ray absorption spectroscopy | Matthew R. Carbone et.al. | 2306.16349v1 | null |
2023-06-28 | DoseDiff: Distance-aware Diffusion Model for Dose Prediction in Radiotherapy | Yiwen Zhang et.al. | 2306.16324v1 | null |
2023-06-28 | Universal theory of spin-momentum-orbital-site locking | Yuntian Liu et.al. | 2306.16312v1 | null |
2023-06-28 | Generalizing Surgical Instruments Segmentation to Unseen Domains with One-to-Many Synthesis | An Wang et.al. | 2306.16285v1 | link |
2023-06-28 | Emotion Analysis of Tweets Banning Education in Afghanistan | Mohammad Ali Hussiny et.al. | 2306.16268v1 | null |
2023-06-28 | Reconfigurable Robot Control Using Flexible Coupling Mechanisms | Sha Yi et.al. | 2306.16265v1 | null |
2023-06-28 | Latent SDEs on Homogeneous Spaces | Sebastian Zeng et.al. | 2306.16248v1 | null |
2023-06-28 | Investigating the Uncanny Valley Phenomenon Through the Temporal Dynamics of Neural Responses to Virtual Characters | Chiara Gorlini et.al. | 2306.16233v1 | null |
2023-06-27 | Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties | Hsiao-Yu Tung et.al. | 2306.15668v1 | null |
2023-06-27 | Enhancing Representation Learning on High-Dimensional, Small-Size Tabular Data: A Divide and Conquer Method with Ensembled VAEs | Navindu Leelarathna et.al. | 2306.15661v1 | null |
2023-06-27 | Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos | Chiori Hori et.al. | 2306.15644v1 | null |
2023-06-27 | Biclustering random matrix partitions with an application to classification of forensic body fluids | Chieh-Hsi Wu et.al. | 2306.15622v1 | null |
2023-06-27 | Recurrent Neural Network-coupled SPAD TCSPC System for Real-time Fluorescence Lifetime Imaging | Yang Lin et.al. | 2306.15599v1 | null |
2023-06-27 | Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning | Sherly Alfonso-Sánchez et.al. | 2306.15585v1 | null |
2023-06-27 | Parity doublet model for baryon octets: diquark classifications and mass hierarchy based on the quark-line diagram | Takuya Minamikawa et.al. | 2306.15564v1 | null |
2023-06-27 | You Can Mask More For Extremely Low-Bitrate Image Compression | Anqi Li et.al. | 2306.15561v1 | link |
2023-06-27 | A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms | Cristina Silvano et.al. | 2306.15552v1 | null |
2023-06-27 | Self-supervised Learning of Event-guided Video Frame Interpolation for Rolling Shutter Frames | Yunfan Lu et.al. | 2306.15507v1 | null |
2023-06-26 | FunQA: Towards Surprising Video Comprehension | Binzhu Xie et.al. | 2306.14899v1 | link |
2023-06-26 | Mapping out phase diagrams with generative classifiers | Julian Arnold et.al. | 2306.14894v1 | null |
2023-06-26 | Fuzzy-Conditioned Diffusion and Diffusion Projection Attention Applied to Facial Image Correction | Majed El Helou et.al. | 2306.14891v1 | link |
2023-06-26 | A Fully Unsupervised Instance Segmentation Technique for White Blood Cell Images | Shrijeet Biswas et.al. | 2306.14875v1 | null |
2023-06-26 | ANYmal Parkour: Learning Agile Navigation for Quadrupedal Robots | David Hoeller et.al. | 2306.14874v1 | null |
2023-06-26 | Leveraging Task Structures for Improved Identifiability in Neural Network Representations | Wenlin Chen et.al. | 2306.14861v1 | null |
2023-06-26 | ViNT: A Foundation Model for Visual Navigation | Dhruv Shah et.al. | 2306.14846v1 | null |
2023-06-26 | An open-source robust machine learning platform for real-time detection and classification of 2D material flakes | Jan-Lucas Uslu et.al. | 2306.14845v1 | null |
2023-06-26 | A Flyweight CNN with Adaptive Decoder for Schistosoma mansoni Egg Detection | Leonardo de Melo Joao et.al. | 2306.14840v1 | null |
2023-06-26 | Label-Aware Hyperbolic Embeddings for Fine-grained Emotion Classification | Chih-Yao Chen et.al. | 2306.14822v1 | link |
2023-06-23 | Adversarial Robustness Certification for Bayesian Neural Networks | Matthew Wicker et.al. | 2306.13614v1 | link |
2023-06-23 | TACOformer:Token-channel compounded Cross Attention for Multimodal Emotion Recognition | Xinda Li et.al. | 2306.13592v1 | null |
2023-06-23 | Estimating Residential Solar Potential Using Aerial Data | Ross Goroshin et.al. | 2306.13564v1 | null |
2023-06-23 | Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning | Takumi Yoshida et.al. | 2306.13561v1 | null |
2023-06-26 | FPGA Implementation of Convolutional Neural Network for Real-Time Handwriting Recognition | Shichen Qiao et.al. | 2306.13557v2 | link |
2023-06-23 | Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation | Massimiliano Patacchiola et.al. | 2306.13554v1 | link |
2023-06-23 | Manifold Contrastive Learning with Variational Lie Group Operators | Kion Fallah et.al. | 2306.13544v1 | null |
2023-06-23 | Torsion Graph Neural Networks | Cong Shen et.al. | 2306.13541v1 | link |
2023-06-23 | Topological learning for the classification of disorder: an application to the design of metasurfaces | Tristan Madeleine et.al. | 2306.13540v1 | null |
2023-06-23 | WBCAtt: A White Blood Cell Dataset Annotated with Detailed Morphological Attributes | Satoshi Tsutsui et.al. | 2306.13531v1 | link |
2023-06-22 | A Comparison of Time-based Models for Multimodal Emotion Recognition | Ege Kesim et.al. | 2306.13076v1 | null |
2023-06-22 | Auditing Predictive Models for Intersectional Biases | Kate S. Boxer et.al. | 2306.13064v1 | null |
2023-06-22 | Impacts and Risk of Generative AI Technology on Cyber Defense | Subash Neupane et.al. | 2306.13033v1 | null |
2023-06-22 | Toward Automated Detection of Microbleeds with Anatomical Scale Localization: A Complete Clinical Diagnosis Support Using Deep Learning | Jun-Ho Kim et.al. | 2306.13020v1 | null |
2023-06-22 | Minimalist and High-Quality Panoramic Imaging with PSF-aware Transformers | Qi Jiang et.al. | 2306.12992v1 | link |
2023-06-22 | Can a single image processing algorithm work equally well across all phases of DCE-MRI? | Adam G. Tattersall et.al. | 2306.12988v1 | null |
2023-06-22 | Radiation Emission during the Erasure of Magnetic Monopoles | Maximilian Bachmaier et.al. | 2306.12958v1 | null |
2023-06-22 | Robust Semantic Segmentation: Strong Adversarial Attacks and Fast Training of Robust Models | Francesco Croce et.al. | 2306.12941v1 | link |
2023-06-22 | Deficit of Hot Dust in Low-redshift Active Galactic Nuclei | Suyeon Son et.al. | 2306.12927v1 | null |
2023-06-22 | Machine-Learning-Assisted and Real-Time-Feedback-Controlled Growth of InAs/GaAs Quantum Dots | Chao Shen et.al. | 2306.12898v1 | null |
2023-06-21 | Spectroscopy of the Supernova H0pe Host Galaxy at Redshift 1.78 | M. Polletta et.al. | 2306.12385v1 | null |
2023-06-21 | Geometric Algorithms for | Diego Ihara Centurion et.al. | 2306.12377v1 | null |
2023-06-21 | M-VAAL: Multimodal Variational Adversarial Active Learning for Downstream Medical Image Analysis Tasks | Bidur Khanal et.al. | 2306.12376v1 | link |
2023-06-21 | One Policy to Dress Them All: Learning to Dress People with Diverse Poses and Garments | Yufei Wang et.al. | 2306.12372v1 | null |
2023-06-21 | Attention Hybrid Variational Net for Accelerated MRI Reconstruction | Guoyao Shen et.al. | 2306.12365v1 | null |
2023-06-21 | Linear and Non-Linear Barrier Coverage in Deterministic and Uncertain environment in WSNs: A New Classification | Adda Boualem et.al. | 2306.12355v1 | null |
2023-06-21 | An efficient, provably exact algorithm for the 0-1 loss linear classification problem | Xi He et.al. | 2306.12344v1 | null |
2023-06-21 | Geometric Pooling: maintaining more useful information | Hao Xu et.al. | 2306.12341v1 | null |
2023-06-22 | Do you still need a manual smart contract audit? | Isaac David et.al. | 2306.12338v2 | null |
2023-06-22 | Beyond Deep Ensembles: A Large-Scale Evaluation of Bayesian Deep Learning under Distribution Shift | Florian Seligmann et.al. | 2306.12306v2 | link |
2023-06-20 | Segment Anything Model (SAM) for Radiation Oncology | Lian Zhang et.al. | 2306.11730v1 | null |
2023-06-20 | Dense Video Object Captioning from Disjoint Supervision | Xingyi Zhou et.al. | 2306.11729v1 | link |
2023-06-20 | How can objects help action recognition? | Xingyi Zhou et.al. | 2306.11726v1 | link |
2023-06-20 | Low-complexity Multidimensional DCT Approximations | V. A. Coutinho et.al. | 2306.11724v1 | null |
2023-06-20 | Meta-Analysis of Transfer Learning for Segmentation of Brain Lesions | Sovesh Mohapatra et.al. | 2306.11714v1 | null |
2023-06-20 | Hexagonal circular 3-webs with polar curves of degree three | Sergey I. Agafonov et.al. | 2306.11707v1 | null |
2023-06-20 | SkyGPT: Probabilistic Short-term Solar Forecasting Using Synthetic Sky Videos from Physics-constrained VideoGPT | Yuhao Nie et.al. | 2306.11682v1 | null |
2023-06-20 | The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks | Yuan Cao et.al. | 2306.11680v1 | null |
2023-06-20 | A primal-dual data-driven method for computational optical imaging with a photonic lantern | Carlos Santos Garcia et.al. | 2306.11679v1 | null |
2023-06-20 | Deep Learning Methods for Retinal Blood Vessel Segmentation: Evaluation on Images with Retinopathy of Prematurity | Gorana Gojić et.al. | 2306.11576v1 | null |
2023-06-16 | Variational quantum algorithms for machine learning: theory and applications | Stefano Mangini et.al. | 2306.09984v1 | null |
2023-06-16 | HePCo: Data-Free Heterogeneous Prompt Consolidation for Continual Federated Learning | Shaunak Halbe et.al. | 2306.09970v1 | null |
2023-06-16 | Training shallow ReLU networks on noisy data using hinge loss: when do we overfit and is it benign? | Erin George et.al. | 2306.09955v1 | null |
2023-06-16 | Towards Better Certified Segmentation via Diffusion Models | Othmane Laousy et.al. | 2306.09949v1 | null |
2023-06-16 | Knowledge Distillation for Efficient Audio-Visual Video Captioning | Özkan Çaylı et.al. | 2306.09947v1 | null |
2023-06-16 | RealImpact: A Dataset of Impact Sound Fields for Real Objects | Samuel Clarke et.al. | 2306.09944v1 | null |
2023-06-16 | A classification of supersymmetric Kaluza-Klein black holes with a single axial symmetry | David Katona et.al. | 2306.09933v1 | null |
2023-06-16 | A Metaheuristic-based Machine Learning Approach for Energy Prediction in Mobile App Development | Seyed Jalaleddin Mousavirad et.al. | 2306.09931v1 | null |
2023-06-16 | Learning to Summarize and Answer Questions about a Virtual Robot's Past Actions | Chad DeChant et.al. | 2306.09922v1 | null |
2023-06-16 | No Strong Feelings One Way or Another: Re-operationalizing Neutrality in Natural Language Inference | Animesh Nighojkar et.al. | 2306.09918v1 | null |
2023-06-16 | UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video | Zhi-Hao Lin et.al. | 2306.09349v2 | null |
2023-06-15 | Causal classification of spatiotemporal quantum correlations | Minjeong Song et.al. | 2306.09336v1 | null |
2023-06-15 | Class-Conditional Conformal Prediction With Many Classes | Tiffany Ding et.al. | 2306.09335v1 | link |
2023-06-15 | Personalized Image Enhancement Featuring Masked Style Modeling | Satoshi Kosugi et.al. | 2306.09334v1 | link |
2023-06-15 | Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers | Dominick Reilly et.al. | 2306.09331v1 | link |
2023-06-15 | WizMap: Scalable Interactive Visualization for Exploring Large Machine Learning Embeddings | Zijie J. Wang et.al. | 2306.09328v1 | link |
2023-06-15 | Language-Guided Music Recommendation for Video via Prompt Analogies | Daniel McKee et.al. | 2306.09327v1 | null |
2023-06-15 | Single-Stage Visual Query Localization in Egocentric Videos | Hanwen Jiang et.al. | 2306.09324v1 | null |
2023-06-15 | Crowd-Powered Photo Enhancement Featuring an Active Learning Based Local Filter | Satoshi Kosugi et.al. | 2306.09321v1 | link |
2023-06-15 | Learnable Weight Initialization for Volumetric Medical Image Segmentation | Shahina Kunhimon et.al. | 2306.09320v1 | link |
2023-06-13 | Classification of branched Willmore spheres | Dorian Martino et.al. | 2306.07965v1 | null |
2023-06-13 | Supervised-Contrastive Loss Learns Orthogonal Frames and Batching Matters | Ganesh Ramachandra Kini et.al. | 2306.07960v1 | link |
2023-06-13 | Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation | Shuai Yang et.al. | 2306.07954v1 | null |
2023-06-13 | MOFI: Learning Image Representations from Noisy Entity Annotated Images | Wentao Wu et.al. | 2306.07952v1 | null |
2023-06-13 | Image Captioners Are Scalable Vision Learners Too | Michael Tschannen et.al. | 2306.07915v1 | null |
2023-06-13 | Massively Multilingual Corpus of Sentiment Datasets and Multi-faceted Sentiment Classification Benchmark | Łukasz Augustyniak et.al. | 2306.07902v1 | link |
2023-06-13 | Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks | Veniamin Veselovsky et.al. | 2306.07899v1 | link |
2023-06-13 | CAMEO: A Causal Transfer Learning Approach for Performance Optimization of Configurable Computer Systems | Md Shahriar Iqbal et.al. | 2306.07888v1 | null |
2023-06-13 | Deep Learning-Enabled Zero-Touch Device Identification: Mitigating the Impact of Channel Variability Through MIMO Diversity | Bechir Hamdaoui et.al. | 2306.07878v1 | null |
2023-06-13 | On the flow unsteadiness and operational characteristics of a novel supersonic fluidic oscillator | Spandan Maikap et.al. | 2306.07849v1 | null |
2023-06-12 | Waffling around for Performance: Visual Classification with Random Words and Broad Concepts | Karsten Roth et.al. | 2306.07282v1 | link |
2023-06-12 | The Cheltsov--Rubinstein problem for strongly asymptotically log del Pezzo surfaces | Chenzi Jin et.al. | 2306.07278v1 | null |
2023-06-12 | Gaussian Membership Inference Privacy | Tobias Leemann et.al. | 2306.07273v1 | null |
2023-06-12 | MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images | Junchen Zhu et.al. | 2306.07257v1 | null |
2023-06-12 | On the Expected Size of Conformal Prediction Sets | Guneet S. Dhillon et.al. | 2306.07254v1 | null |
2023-06-12 | RB-Dust -- A Reference-based Dataset for Vision-based Dust Removal | Peter Buckel et.al. | 2306.07244v1 | null |
2023-06-12 | Strokes2Surface: Recovering Curve Networks From 4D Architectural Design Sketches | S. Rasoulzadeh et.al. | 2306.07220v1 | null |
2023-06-12 | Cyclic objects from surfaces | Ivan Bartulović et.al. | 2306.07216v1 | null |
2023-06-12 | Valley: Video Assistant with Large Language model Enhanced abilitY | Ruipu Luo et.al. | 2306.07207v1 | null |
2023-06-12 | A Survey of Vision-Language Pre-training from the Lens of Multimodal Machine Translation | Jeremy Gwinnup et.al. | 2306.07198v1 | null |
2023-06-09 | Shock Cooling and Possible Precursor Emission in the Early Light Curve of the Type II SN 2023ixf | Griffin Hosseinzadeh et.al. | 2306.06097v1 | null |
2023-06-09 | Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding | Mu Cai et.al. | 2306.06094v1 | null |
2023-06-09 | Virtual Node Tuning for Few-shot Node Classification | Zhen Tan et.al. | 2306.06063v1 | null |
2023-06-09 | Ion-Driven Instabilities in the Inner Heliosphere II: Classification and Multi-Dimensional Mapping | Mihailo M. Martinovic et.al. | 2306.06060v1 | null |
2023-06-09 | Exploring the Impact of Image Resolution on Chest X-ray Classification Performance | Alessandro Wollek et.al. | 2306.06051v1 | null |
2023-06-09 | How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models? | Yifei Ming et.al. | 2306.06048v1 | null |
2023-06-09 | GANeRF: Leveraging Discriminators to Optimize Neural Radiance Fields | Barbara Roessle et.al. | 2306.06044v1 | null |
2023-06-09 | WindowNet: Learnable Windows for Chest X-ray Classification | Alessandro Wollek et.al. | 2306.06038v1 | null |
2023-06-09 | Benchmarking self-supervised video representation learning | Akash Kumar et.al. | 2306.06010v1 | null |
2023-06-09 | Beyond Detection: Visual Realism Assessment of Deepfakes | Luka Dragar et.al. | 2306.05985v1 | null |
2023-06-08 | MIMIC-IT: Multi-Modal In-Context Instruction Tuning | Bo Li et.al. | 2306.05425v1 | link |
2023-06-08 | Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models | Muhammad Maaz et.al. | 2306.05424v1 | link |
2023-06-08 | ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process | Changyao Tian et.al. | 2306.05423v1 | null |
2023-06-08 | Tracking Everything Everywhere All at Once | Qianqian Wang et.al. | 2306.05422v1 | null |
2023-06-08 | 2D Supervised Monocular 3D Object Detection by Global-to-Local 3D Reconstruction | Jiawei He et.al. | 2306.05418v1 | null |
2023-06-08 | Tracking Objects with 3D Representation from Videos | Jiawei He et.al. | 2306.05416v1 | null |
2023-06-08 | Quantum symmetries in 2+1 dimensions: Carroll, (a)dS-Carroll, Galilei and (a)dS-Galilei | Tomasz Trześniewski et.al. | 2306.05409v1 | null |
2023-06-08 | Deformation theory for prismatic | Kazuhiro Ito et.al. | 2306.05361v1 | null |
2023-06-08 | Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models | Nan Liu et.al. | 2306.05357v1 | null |
2023-06-08 | Predictive Modeling of Equine Activity Budgets Using a 3D Skeleton Reconstructed from Surveillance Recordings | Ernest Pokropek et.al. | 2306.05311v1 | null |
2023-06-08 | Integrating Geometric Control into Text-to-Image Diffusion Models for High-Quality Detection Data Generation via Text Prompt | Kai Chen et.al. | 2306.04607v2 | null |
2023-06-07 | MarineVRS: Marine Video Retrieval System with Explainability via Semantic Understanding | Tan-Sang Ha et.al. | 2306.04593v1 | null |
2023-06-07 | A Dataset for Deep Learning-based Bone Structure Analyses in Total Hip Arthroplasty | Kaidong Zhang et.al. | 2306.04579v1 | link |
2023-06-07 | ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models | Sophie Jentzsch et.al. | 2306.04563v1 | link |
2023-06-07 | Contrastive Bootstrapping for Label Refinement | Shudi Hou et.al. | 2306.04544v1 | null |
2023-06-07 | Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications | Paul Pu Liang et.al. | 2306.04539v1 | link |
2023-06-07 | Long-form analogies generated by chatGPT lack human-like psycholinguistic properties | S. M. Seals et.al. | 2306.04537v1 | null |
2023-06-07 | ContriMix: Unsupervised disentanglement of content and attribute for domain generalization in microscopy image analysis | Tan H. Nguyen et.al. | 2306.04527v1 | null |
2023-06-07 | Cross-attention learning enables real-time nonuniform rotational distortion correction in OCT | Haoran Zhang et.al. | 2306.04512v1 | null |
2023-06-07 | Hardness of Deceptive Certificate Selection | Stephan Wäldchen et.al. | 2306.04505v1 | null |
2023-06-06 | CL-UZH at SemEval-2023 Task 10: Sexism Detection through Incremental Fine-Tuning and Multi-Task Learning with Label Descriptions | Janis Goldzycher et.al. | 2306.03907v1 | null |
2023-06-06 | Utterance Classification with Logical Neural Network: Explainable AI for Mental Disorder Diagnosis | Yeldar Toleubay et.al. | 2306.03902v1 | null |
2023-06-06 | Towards Label-free Scene Understanding by Vision Foundation Models | Runnan Chen et.al. | 2306.03899v1 | null |
2023-06-06 | Multi-Label ECG Classification using Temporal Convolutional Neural Network | Eedara Prabhakararao et.al. | 2306.03844v1 | null |
2023-06-06 | Atrial Septal Defect Detection in Children Based on Ultrasound Video Using Multiple Instances Learning | Yiman Liu et.al. | 2306.03835v1 | null |
2023-06-06 | MTS2Graph: Interpretable Multivariate Time Series Classification with Temporal Evolving Graphs | Raneen Younis et.al. | 2306.03834v1 | null |
2023-06-06 | GEO-Bench: Toward Foundation Models for Earth Monitoring | Alexandre Lacoste et.al. | 2306.03831v1 | link |
2023-06-06 | Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How | Sebastian Pineda Arango et.al. | 2306.03828v1 | null |
2023-06-06 | Learning to Ground Instructional Articles in Videos through Narrations | Effrosyni Mavroudi et.al. | 2306.03802v1 | null |
2023-06-06 | Matched Pair Calibration for Ranking Fairness | Hannah Korevaar et.al. | 2306.03775v1 | null |
2023-06-05 | Neuralangelo: High-Fidelity Neural Surface Reconstruction | Zhaoshuo Li et.al. | 2306.03092v1 | null |
2023-06-05 | Dismantling Hate: Understanding Hate Speech Trends Against NBA Athletes | Edinam Kofi Klutse et.al. | 2306.03086v1 | null |
2023-06-05 | MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion | Chiyu Max Jiang et.al. | 2306.03083v1 | null |
2023-06-05 | Of Mice and Mates: Automated Classification and Modelling of Mouse |