[2407.03646] Differentiating Between Human-Written and AI-Generated Texts Using Automatically Extracted Linguistic Features
Summary
This study compares human-written essays with ChatGPT-generated essays of equivalent length across automatically extracted phonological, morphological, syntactic, and lexical features, and finds significant differences on several of them.
Why It Matters
Understanding how human-written and AI-generated texts differ matters both for improving AI text generation and for building effective language assessment tools. The research points to a need for better training methodologies if AI is to produce more human-like text, which is increasingly relevant in fields like education and content creation.
Key Takeaways
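- AI-generated texts differ significantly from human-written texts on features such as specific types of consonants, nouns, adjectives, pronouns, adjectival/prepositional modifiers, and use of difficult words.
- The study used Open Brain AI, an online computational tool, to extract measures of phonological, morphological, syntactic, and lexical constituents.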
- Findings suggest a need for improved AI training methodologies to enhance text generation.
- Automated tools can significantly reduce time and effort in language assessment.
- The research contributes to ongoing discussions about AI's capabilities in mimicking human writing.
Computer Science > Computation and Language
arXiv:2407.03646 (cs)
[Submitted on 4 Jul 2024 (v1), last revised 17 Feb 2026 (this version, v4)]
Title: Differentiating Between Human-Written and AI-Generated Texts Using Automatically Extracted Linguistic Features
Authors: Georgios P. Georgiou
Abstract: While extensive research has focused on ChatGPT in recent years, very few studies have systematically quantified and compared linguistic features between human-written and artificial intelligence (AI)-generated language. This exploratory study aims to investigate how various linguistic components are represented in both types of texts, assessing the ability of AI to emulate human writing. Using human-authored essays as a benchmark, we prompted ChatGPT to generate essays of equivalent length. These texts were analyzed using Open Brain AI, an online computational tool, to extract measures of phonological, morphological, syntactic, and lexical constituents. Despite AI-generated texts appearing to mimic human speech, the results revealed significant differences across multiple linguistic features such as specific types of consonants, nouns, adjectives, pronouns, adjectival/prepositional modifiers, and use of difficult words, among others. These findings underscore the importance of integrating automated tools for...
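The general workflow the abstract describes (extract per-text linguistic measures, then compare the human and AI groups) can be illustrated with a minimal Python sketch. This is not the paper's method and does not use Open Brain AI. The function names, the small pronoun list, and the "long word" threshold used as a stand-in for difficult words are all assumptions made for illustration, and the sketch reports only differences in group means rather than any significance test.

```python
import re
from statistics import mean

# Hypothetical, deliberately small pronoun list (an assumption for this sketch);
# a real analysis would rely on a part-of-speech tagger instead.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "me", "him", "her", "us", "them"}


def tokenize(text):
    """Lowercase the text and return its word tokens (letters, optional apostrophe part)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def extract_features(text, long_word_len=7):
    """Return a few crude lexical measures for a single text.

    long_word_len is an assumed cutoff: words at least this long are counted
    as a rough proxy for "difficult words".
    """
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no word tokens")
    n = len(tokens)
    return {
        "type_token_ratio": len(set(tokens)) / n,
        "pronoun_ratio": sum(t in PRONOUNS for t in tokens) / n,
        "long_word_ratio": sum(len(t) >= long_word_len for t in tokens) / n,
    }


def group_means(texts):
    """Average each feature over a list of texts."""
    rows = [extract_features(t) for t in texts]
    return {name: mean(row[name] for row in rows) for name in rows[0]}


def compare_groups(human_texts, ai_texts):
    """Return (AI mean - human mean) for each feature."""
    human = group_means(human_texts)
    ai = group_means(ai_texts)
    return {name: ai[name] - human[name] for name in human}


if __name__ == "__main__":
    human = ["I think we should go home"]
    ai = ["The implementation demonstrates significant capabilities"]
    for name, diff in compare_groups(human, ai).items():
        print(f"{name}: {diff:+.3f}")
```

A positive difference means the feature is more frequent in the AI group. With real essay collections, the per-text feature dictionaries could feed a statistical test of the kind the study reports, which this sketch leaves out.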