[2602.20202] Evaluating the Reliability of Digital Forensic Evidence Discovered by Large Language Model: A Case Study


arXiv - AI · 4 min read

Summary

This paper evaluates the reliability of digital forensic evidence identified by large language models (LLMs), proposing a structured framework for artifact extraction and validation.

Why It Matters

As AI technologies become integral to forensic investigations, ensuring the reliability of AI-generated evidence is crucial for legal integrity. This study addresses significant challenges in digital forensics, providing a methodology that enhances accuracy and traceability, which is vital for law enforcement and legal proceedings.

Key Takeaways

  • The proposed framework automates forensic artifact extraction and validation.
  • Achieves over 95% accuracy in artifact extraction on a 13 GB forensic image dataset (61 applications, 2,864 databases, 5,870 tables).
  • Utilizes a Digital Forensic Knowledge Graph to enhance evidence reliability.
  • Addresses challenges of credibility and integrity in AI-assisted digital forensics.
  • Supports chain-of-custody adherence and contextual consistency in forensic relationships.
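The takeaway about traceability rests on deterministic Unique Identifiers: the same artifact must always map to the same UID so that independent extraction runs can be cross-referenced. The paper's summary does not specify the actual UID scheme, so the field names and hashing choice below are illustrative assumptions; a minimal sketch of how such an identifier could be derived:

```python
import hashlib

def artifact_uid(image_id: str, app: str, database: str, table: str) -> str:
    """Derive a deterministic UID by hashing the artifact's provenance path.

    The fields (image/app/database/table) mirror the dataset structure
    described in the abstract; the real UID scheme is an assumption here.
    """
    canonical = "|".join([image_id, app, database, table])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# The same inputs always yield the same UID, so two independent
# extraction runs agree on the identifier without shared state.
uid_a = artifact_uid("img-001", "WhatsApp", "msgstore.db", "messages")
uid_b = artifact_uid("img-001", "WhatsApp", "msgstore.db", "messages")
assert uid_a == uid_b
```

Because the UID depends only on the artifact's provenance, it doubles as a chain-of-custody anchor: any later tool that re-derives a different UID for the "same" artifact signals that the underlying data changed.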

Computer Science > Cryptography and Security

arXiv:2602.20202 (cs) · Submitted on 22 Feb 2026

Title: Evaluating the Reliability of Digital Forensic Evidence Discovered by Large Language Model: A Case Study

Authors: Jeel Piyushkumar Khatiwala, Daniel Kwaku Ntiamoah Addai, Weifeng Xu

Abstract: The growing reliance on AI-identified digital evidence raises significant concerns about its reliability, particularly as large language models (LLMs) are increasingly integrated into forensic investigations. This paper proposes a structured framework that automates forensic artifact extraction, refines data through LLM-driven analysis, and validates results using a Digital Forensic Knowledge Graph (DFKG). Evaluated on a 13 GB forensic image dataset containing 61 applications, 2,864 databases, and 5,870 tables, the framework ensures artifact traceability and evidentiary consistency through deterministic Unique Identifiers (UIDs) and forensic cross-referencing. We propose this methodology to address challenges in ensuring the credibility and forensic integrity of AI-identified evidence, reducing classification errors, and advancing scalable, auditable methodologies. A comprehensive case study on this dataset demonstrates the framework's effectiveness, achieving over 95 percent ...
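The abstract describes validating LLM output against a Digital Forensic Knowledge Graph. One way to picture that step: the graph encodes known application-to-database relationships, and an LLM-identified artifact is accepted only if its claimed relationship appears in the graph. The edge labels and example entries below are assumptions for illustration, not the paper's actual DFKG schema:

```python
# A minimal DFKG-style consistency check. Each edge records that an
# application is known to own a database; an LLM-claimed artifact is
# validated against those edges. Entries are illustrative assumptions.

known_edges = {
    ("WhatsApp", "owns_database", "msgstore.db"),
    ("Chrome", "owns_database", "History"),
}

def validate_artifact(app: str, database: str) -> bool:
    """Accept an LLM-identified artifact only if its (app, database)
    relationship exists in the knowledge graph."""
    return (app, "owns_database", database) in known_edges

assert validate_artifact("WhatsApp", "msgstore.db")   # contextually consistent
assert not validate_artifact("WhatsApp", "History")   # flagged for review
```

A check like this is what turns raw LLM classifications into auditable evidence: every accepted artifact can point to the graph edge that justified it, and mismatches are surfaced rather than silently trusted.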


