CLAWS Tagger

Purpose: 

A software tool for part-of-speech (POS) tagging: the classification of each word in a body of text into one or more categories based on its definition, its relationship with other words, or other context. CLAWS (Constituent Likelihood Automatic Word-tagging System) uses several methods to identify parts of speech, most notably hidden Markov models (HMMs), which involve counting cases in already-tagged text and building a table of the probabilities of certain sequences of word categories. For example, if an article and a verb appear together, the next word is more likely to be a preposition, article, or noun than another verb.
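The counting step described above can be illustrated with a minimal sketch. It is not the CLAWS implementation: the toy sentences, the simplified tag names (ART, ADJ, NOUN, VERB) and the function name transition_table are all assumptions made for illustration, and CLAWS uses its own, much larger tagsets and training data. The sketch only shows how a table of tag-to-tag probabilities can be built from hand-tagged text.

```python
from collections import Counter, defaultdict

# Toy hand-tagged sentences (hypothetical data, simplified tag names;
# these are NOT CLAWS tags and NOT CLAWS training data).
TAGGED_SENTENCES = [
    [("the", "ART"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "ART"), ("old", "ADJ"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("a", "ART"), ("cat", "NOUN"), ("sees", "VERB"), ("the", "ART"), ("dog", "NOUN")],
]


def transition_table(sentences):
    """Estimate P(next tag | previous tag) by counting adjacent tag pairs."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tags = [tag for _, tag in sentence]
        # Count every pair of neighbouring tags within a sentence.
        for prev_tag, next_tag in zip(tags, tags[1:]):
            counts[prev_tag][next_tag] += 1

    # Turn raw counts into relative frequencies for each previous tag.
    table = {}
    for prev_tag, followers in counts.items():
        total = sum(followers.values())
        table[prev_tag] = {tag: n / total for tag, n in followers.items()}
    return table


TABLE = transition_table(TAGGED_SENTENCES)
print(TABLE["ART"])  # {'NOUN': 0.75, 'ADJ': 0.25}
```

In this toy data an article is followed by a noun three times out of four and never by a verb, so a tagger consulting the table would prefer a noun or adjective reading for an ambiguous word that follows an article.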

Features: 
  • Part-of-speech tagging with an accuracy of approximately 96-97 percent on the text analysed
  • Template tagging
A&H use case 1 description: 
The NECTE project amalgamated two separate corpora of recorded speech collected from local people on Tyneside in the UK. It used the CLAWS tagging service to add part-of-speech tags to the combined corpus.
A&H use case 2 description: 
The 'Grammatical change in recent English (1961-1991)' project used the CLAWS tagger to investigate changes in English grammar between 1961 and 1991.
Publisher: 
University Centre for Computer Corpus Research on Language (UCREL), University of Lancaster
Creator: 
University Centre for Computer Corpus Research on Language (UCREL), University of Lancaster
lifecycleStage: 
Specifications: 
Discipline: 
Software/programming languages used: