Corpus Linguistics: The Basics

Phloneme
7 Aug 2015 05:00

Summary

TLDR: This video introduces corpora and their role in linguistic research and language analysis. The presenter explains how corpora (collections of authentic texts) are used to study language trends, grammatical change, and lexical patterns. Two examples illustrate the benefits of corpus analysis: an analysis of Justin Bieber's tweets and Laurie Bauer's classic research on adjective comparison. Viewers learn how tools like AntConc can report word frequencies, collocations, and concordances, which are useful to linguists and language learners alike.

Takeaways

  • A corpus is a collection of authentic texts used for linguistic research. The word comes from the Latin for 'body'.
  • Corpora can consist of any type of authentic text, such as newspapers, novels, social media posts, or even tweets.
  • Laurie Bauer's study of grammatical change in comparative adjectives, based on historical newspapers, shows how corpora are used in linguistic research.
  • Modern technology makes corpus analysis much easier and faster than earlier methods, which relied on manually scanning printed texts.
  • Word frequency analysis shows how prominent particular words are within a text or dataset, giving insight into its style or audience.
  • Collocation analysis identifies words that tend to appear together, helping reveal patterns and relationships in language use.
  • Concordance analysis lets researchers view larger chunks of context around a search term, showing how it is used and what discourse surrounds it.
  • Corpus linguistics offers objective, data-driven insight into language patterns such as morphosyntax, language change, and lexical traces of discourses.
  • Corpora are useful not only for academic research but also for language learners who want to verify word usage or prepositions in different contexts.
  • Tools like AntConc can analyze corpora and extract linguistic data without manual calculation.
  • Google's frequency count can also serve as a simple tool for checking how common a word or phrase is, especially for beginners.
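
The word frequency idea above can be sketched in a few lines of Python. This is only an illustration of the counting step, not how AntConc works internally; the function names `tokenize` and `word_frequencies` and the three-sentence mini-corpus are assumptions invented for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(texts):
    """Count how often each token occurs across a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


# Hypothetical mini-corpus, invented for illustration only.
corpus = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "A dog and a cat can be friends.",
]

freqs = word_frequencies(corpus)

# Print the two most frequent tokens with their counts.
for word, count in freqs.most_common(2):
    print(word, count)
```

In a real corpus the top of such a list is usually dominated by function words like "the", so researchers often look further down the list or compare frequencies against a reference corpus.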

Q & A

  • What is a corpus in linguistics?

    -A corpus is a collection of authentic texts, often written by native speakers, used for linguistic research. It can be made up of various types of texts, such as newspapers, novels, social media posts, or tweets.

  • Where does the word 'corpus' come from?

    -The word 'corpus' comes from Latin, where it means 'body.' The English word 'corpse' goes back to the same Latin root. In linguistics, 'body' is used metaphorically: a corpus is a body of texts.

  • What role did Laurie Bauer's research play in the development of corpus linguistics?

    -Laurie Bauer used newspapers as a corpus to study grammatical changes, specifically how the rule of comparing adjectives evolved over time. His research would have been much easier with the availability of digital text and computers.

  • How can a corpus be used in modern linguistics research?

    -A corpus can be used to analyze various linguistic features such as word frequency, collocation (how words appear together), and concordance (the surrounding context of a word). It provides objective data for studying language patterns and change.

  • What is word frequency in the context of corpus analysis?

    -Word frequency refers to how often a word appears in a text or corpus. This information can be useful to determine the register and audience of the text, helping to analyze its thematic focus.

  • What is collocation in corpus linguistics?

    -Collocation is the study of how words tend to appear together in close proximity within a text. By analyzing collocates of specific words, linguists can identify common word pairings and understand how words naturally combine in a language.

  • What is concordance, and why is it useful in corpus analysis?

    -Concordance refers to the method of viewing the larger context around a specific word or phrase in a corpus. It helps to understand how words are used in different contexts and is particularly useful for examining word usage patterns and meanings.

  • How does corpus linguistics benefit high-proficiency language learners?

    -Corpus linguistics provides language learners with real-world examples of language use, helping them to understand word usage, collocations, and syntactic patterns. It also allows learners to check correct prepositions or other language structures through corpora.

  • Can Google's frequency count be used for corpus analysis?

    -Yes, Google's frequency count is a useful tool for simple corpus analysis, allowing users to check how often a word or phrase appears in online content, providing a quick way to understand word usage trends.

  • What kinds of linguistic phenomena can be studied using corpora?

    -Corpora can be used to study a wide range of linguistic phenomena, including grammatical rules, language change over time, morphosyntactic patterns, and lexical traces of discourses. This makes corpus analysis highly versatile in linguistics research.
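
The collocation and concordance ideas described in the answers above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the window sizes, and the sample text are all hypothetical, and the collocate counts are raw co-occurrence counts, whereas dedicated corpus tools typically also offer statistical association measures.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, node, window=2):
    """Count tokens found within `window` positions of each occurrence of `node`.

    Sentence boundaries are ignored in this simplified version.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def concordance(tokens, node, width=3):
    """Return keyword-in-context lines with `width` tokens on each side of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text, invented for illustration only.
text = (
    "She made a strong cup of tea. He prefers strong coffee in the morning. "
    "Strong tea and strong coffee both keep me awake."
)
tokens = tokenize(text)

# Which words appear directly next to "strong"?
print(collocates(tokens, "strong", window=1).most_common(1))

# Show each occurrence of "tea" with two tokens of context on either side.
for line in concordance(tokens, "tea", width=2):
    print(line)
```

The collocate count answers "which words keep company with this one?", while the concordance lines let a reader inspect each occurrence in context, which is the kind of check a language learner might use to confirm a preposition or a typical word pairing.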

Related Tags
Corpus Linguistics, Language Research, Text Analysis, Language Learning, Word Frequency, Collocation, Concordance, Discourse Analysis, AntConc, Linguistic Patterns, Morphosyntax