I tried building a memory-first AI… and ended up discovering smaller models can beat larger ones
Dataset | Model | Acc | F1 | Δ vs Log | Δ vs Static | Avg Params | Peak Params | Steps | Infer ms | Size
Banking77-20 | Logistic TF-IDF | 92.37% | 0.9230 | +0.00pp | +...
I’ve been reading more about attention mechanisms in transformers and how they effectively learn to weight and prioritize relevant inputs...
AeroDGS presents a novel framework for 4D reconstruction from monocular UAV videos, addressing challenges in depth ambiguity and motion e...
The paper presents RETLLM, a novel framework for multimodal information retrieval (MMIR) that operates without the need for training or l...
This article presents a novel deep learning approach for accurately solving the geodesic problem on continuous surfaces, achieving third-...
The paper introduces VAE-MS, an Asymmetric Variational Autoencoder designed to enhance mutational signature extraction in cancer research...
The article presents CrossLLM-Mamba, a novel framework for RNA interaction prediction that utilizes multimodal state space fusion of larg...
This article presents a framework using multimodal large language models (MLLMs) to analyze the 'hooking period' of video ads, focusing o...
The paper introduces SQaLe, a large-scale text-to-SQL dataset designed to enhance the development of models that convert natural language...
This paper presents a novel approach to reconstruct audio and images from clipped measurements using self-supervised learning, addressing...
The paper presents a method for reducing model disagreement in machine learning by using an anchoring technique, demonstrating its effect...
The paper presents PLADA, a novel method for efficient dataset transmission in machine learning, significantly reducing payload size whil...
FlashOptim introduces innovative optimizers that significantly reduce memory usage in neural network training, enhancing efficiency witho...
CryoNet.Refine introduces a one-step diffusion model for efficiently refining structural models using cryo-EM density maps, offering a si...
The paper 'Poisoned Acoustics' explores training-data poisoning attacks on deep neural networks, demonstrating significant vulnerabilitie...
This article presents efficient algorithms for estimating the mean from coarse data, addressing key questions in Gaussian mean estimation...
This paper presents a novel differentiable approximation to the zero-one loss, enhancing gradient-based optimization in machine learning ...
The paper introduces a novel scoring rule for evaluating generative virtual staining models in high-throughput screening, emphasizing the...
This article introduces 'Inferential Mechanics,' a framework combining causal theories with machine learning in chemical biology, address...
This article explores how single-cell foundation models like scGPT encode biological knowledge through high-dimensional gene representati...
This article presents FedWQ-CP, a novel approach to federated uncertainty quantification that addresses dual heterogeneity in data and mo...
This paper presents a novel approach to offline goal-conditioned reinforcement learning by introducing a physics-informed regularization ...