Seeking Critique on Research Approach to Open Set Recognition (Novelty Detection) [R]
Hey guys, I'm an independent researcher working on a project that tries to address a very specific failure mode in LLMs and embedding bas...
GPUs, training clusters, MLOps, and deployment
Hey guys, I'm an independent researcher working on a project that tries to address a very specific failure mode in LLMs and embedding bas...
I built a cognitive architecture where all computation reduces to three bit operations: XOR, MAJ, POPCNT. No GEMM. No GPU. No floating-po...
I'm profoundly ambivalent re: how to feel about this; is it great -- what a scrappy, bold pivot! Or wildly dumb - its so far from their c...
This article presents a novel Multi-Layer Hierarchical Federated Learning framework (QMLHFL) that enhances scalability and flexibility in...
The Sparse Latent Factor Forecaster (SLFF) proposes a new approach for predicting commodity futures by addressing forecast errors and enh...
This paper explores the application of weak neural networks in mastering impartial games like NIM, utilizing an AlphaZero-inspired multi-...
This article presents Robust Multi-Objective Decoding (RMOD), an innovative algorithm designed to enhance the performance of Large Langua...
The paper presents LO-BCQ, a novel block clustered quantization method for 4-bit LLM inference, achieving less than 1% accuracy loss whil...
The paper proposes BFS-PO, a new reinforcement learning algorithm that enhances the performance of Large Reasoning Models by reducing com...
The paper presents SWIFT, a lightweight model that enhances time series forecasting using wavelet decomposition, achieving state-of-the-a...
The paper introduces Adaptive Width Neural Networks, a novel approach that optimizes the width of neural network layers during training, ...
The paper presents a Bayesian framework for gradient sparsification called Regularized Top-k (RegTop-k), which improves convergence in di...
The paper introduces Lynx, a system designed to enhance the efficiency of Mixture-of-Expert (MoE) models by implementing dynamic batch-aw...
This article presents a new approach to optimizing training in machine learning by introducing a simple one-line modification to existing...
The paper presents Llamdex, a framework for customizing large language models (LLMs) as a service, allowing clients to upload domain-spec...
This survey explores the integration of Foundation Models (FMs) and Federated Learning (FL), termed Federated Foundation Models (FedFM), ...
The paper introduces Sparse MeZO, a novel optimization technique for fine-tuning large language models (LLMs) that reduces memory usage w...
This article explores a structural misalignment in Transformers, particularly regarding residual connections and their impact on next-tok...
This paper presents PIVID, a novel method for inferring distributions over permutations and directed acyclic graphs (DAGs) using variatio...
This paper explores the efficiency of offline policy selection (OPS) in reinforcement learning, connecting it to off-policy evaluation (O...
The paper introduces a novel regression algorithm called Learning with Subset Stacking (LESS), which effectively learns from heterogeneou...
Orcheo is an open-source platform designed to streamline conversational search by offering a modular architecture, production-ready infra...
The paper presents Qute, a quantum-native database that integrates quantum computation into database operations, enhancing performance ov...
Get the latest news, tools, and insights delivered to your inbox.
Daily or weekly digest • Unsubscribe anytime