Natural Language Processing

Text understanding and language tasks

Top This Week

Machine Learning

[P] Trained a small BERT on 276K Kubernetes YAMLs using tree positional encoding instead of sequential

I trained a BERT-style transformer on 276K Kubernetes YAML files, replacing standard positional encoding with learned tree coordinates (d...

Reddit - Machine Learning · 1 min ·
Machine Learning

I am doing a multi-model graph database in pure Rust with Cypher, SQL, Gremlin, and native GNN looking for extreme speed and performance

Hi guys, I'm a PhD student in Applied AI and I've been building an embeddable graph database engine from scratch in Rust. I'd love feedba...

Reddit - Artificial Intelligence · 1 min ·
LLMs

ChatGPT vs purpose-built AI for CRE underwriting: which one can finish the job?

I keep seeing people recommend ChatGPT for financial modeling and I need to push back because I spent a month testing it for multifamily ...

Reddit - Artificial Intelligence · 1 min ·

All Content

Machine Learning

[2602.22255] Deep Sequence Modeling with Quantum Dynamics: Language as a Wave Function

This paper presents a novel sequence modeling framework using quantum dynamics, where language is treated as a wave function evolving und...

arXiv - AI · 4 min ·
LLMs

[2602.22227] To Deceive is to Teach? Forging Perceptual Robustness via Adversarial Reinforcement Learning

The paper introduces AOT-SFT, an adversarial dataset aimed at enhancing the robustness of Multimodal Large Language Models (MLLMs) agains...

arXiv - AI · 3 min ·
AI Infrastructure

I Made an Auto-complete AI from scratch in Python and thought it would be funny to use Family Guy episodes as a database. It was not a good idea.

The article discusses a humorous attempt to create an auto-complete AI using Family Guy episodes as a database, highlighting the unexpect...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

[D] ASURA: Recursive LMs done right

The article discusses the potential of Recursive Language Models (RLMs) and suggests methods to enhance their performance, challenging th...

Reddit - Machine Learning · 1 min ·
LLMs

[D] Categorising 8000+ txt files according to themes

The article discusses the challenge of categorizing over 8000 text files into themes using a hybrid model of Key LLM and HDBSCAN, aiming ...

Reddit - Machine Learning · 1 min ·
LLMs

Do you have to be polite to AI?

The article explores the effectiveness of various communication strategies when interacting with AI chatbots, revealing that common belie...

AI Tools & Products · 9 min ·
Machine Learning

[2512.16902] In-Context Algebra

The paper 'In-Context Algebra' explores how transformers can solve arithmetic problems using variable tokens whose meanings are context-d...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2510.22037] ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality

The paper presents ATLAS, a study on adaptive transfer scaling laws for multilingual pretraining, finetuning, and decoding, based on exte...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2510.11789] Minimax Rates for Learning Pairwise Interactions in Attention-Style Models

This paper examines the convergence rates for learning pairwise interactions in attention-style models, demonstrating a minimax rate that...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2509.14659] Aligning Audio Captions with Human Preferences

The paper presents a novel framework for audio captioning that aligns captions with human preferences using Reinforcement Learning from H...

arXiv - Machine Learning · 3 min ·
LLMs

[2509.11517] PeruMedQA: Benchmarking Large Language Models (LLMs) on Peruvian Medical Exams -- Dataset Construction and Evaluation

The PeruMedQA study evaluates large language models (LLMs) on Peruvian medical exams, creating a specialized dataset and demonstrating th...

arXiv - Machine Learning · 4 min ·
NLP

[2505.04382] Discrete Optimal Transport and Voice Conversion

This paper introduces kDOT, a discrete optimal transport framework for voice conversion, demonstrating improved performance over traditio...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2504.17203] High-Fidelity And Complex Test Data Generation For Google SQL Code Generation Services

This paper presents a method for generating high-fidelity test data for SQL code generation services, addressing limitations of tradition...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2411.19253] Quantum feedback control with a transformer neural network architecture

This article presents a novel approach to quantum feedback control using transformer neural networks, demonstrating their effectiveness i...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2211.02003] Private Blind Model Averaging - Distributed, Non-interactive, and Convergent

This paper presents Private Blind Model Averaging, a method for distributed, non-interactive, and convergent learning that enhances priva...

arXiv - Machine Learning · 4 min ·
LLMs

[2512.16762] NRGPT: An Energy-based Alternative for GPT

The paper presents NRGPT, an energy-based alternative to GPT, proposing a novel approach that integrates energy-based modeling with langu...

arXiv - Machine Learning · 3 min ·
NLP

[2512.02435] Efficient Cross-Domain Offline Reinforcement Learning with Dynamics- and Value-Aligned Data Filtering

This paper presents a novel framework for cross-domain offline reinforcement learning, introducing a method that filters data based on bo...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2510.10625] ImpMIA: Leveraging Implicit Bias for Membership Inference Attack

The paper introduces ImpMIA, a novel Membership Inference Attack that leverages implicit bias in neural networks to identify training sam...

arXiv - Machine Learning · 4 min ·
NLP

[2509.25800] Characterization and Learning of Causal Graphs with Latent Confounders and Post-treatment Selection from Interventional Data

This paper presents a novel approach to causal discovery that accounts for latent confounders and post-treatment selection, enhancing the...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2510.01988] PepCompass: Navigating peptide embedding spaces using Riemannian Geometry

PepCompass introduces a geometry-aware framework for exploring peptide spaces, enhancing antimicrobial peptide discovery through advanced...

arXiv - Machine Learning · 4 min ·