→ N-grams (bigrams, trigrams, etc.) are contiguous sequences of N words. Counting them shows which terms co-occur, which is how conceptual or contextual associations between terms are built in natural language processing (NLP). Because the technique works directly on word co-occurrence, it is the best fit among the options for finding relationships between words.
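As a rough illustration of the idea, the sketch below counts bigrams in a toy sentence using only the Python standard library. The helper name `ngrams`, the sample text, and the whitespace tokenization are assumptions made for this example; they are not taken from the study guide or from any particular library.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Toy input, tokenized naively on whitespace (an assumption for brevity).
text = "new york is big and new york is busy"
tokens = text.split()

# Count how often each adjacent word pair occurs.
bigram_counts = Counter(ngrams(tokens, 2))

# Pairs that repeat, such as ("new", "york"), hint at an association.
repeated = [pair for pair, count in bigram_counts.items() if count > 1]
print(repeated)
```

Pairs that recur across a larger corpus are the candidates for associations; real text would normally need proper tokenization and stop-word handling first.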
Why the other options are incorrect:
B: NER (Named Entity Recognition) identifies entities like names or dates; it doesn’t focus on conceptual associations.
C: TF-IDF scores term importance relative to documents, not associations.
D: POS (Part of Speech) tagging identifies word roles (noun, verb, etc.), not direct associations.
Official References:
CompTIA DataX (DY0-001) Official Study Guide – Section 6.3: “n-gram analysis is useful for discovering common patterns and associations in unstructured text data.”
Natural Language Processing with Python (NLTK Book), Chapter 3: “N-grams help capture collocations and associations between words that often co-occur, essential for understanding context.”