text clustering

tool: MONK

Purpose: 

MONK (Metadata Offer New Knowledge) is an online toolset that helps humanities researchers discover and analyse patterns in textual resources. It supports micro analyses of the verbal texture of an individual text and macro analyses of hundreds or thousands of text objects. Each text passes through three steps: it is converted to a TEI-compliant schema using Abbot; normalised using MorphAdorner, which adds tokenization, sentence boundaries, standard spellings, parts of speech and lemmata; and finally ingested using the Prior tool into a database that provides Java access methods for data extraction.
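
The normalisation and ingestion steps described above can be illustrated with a minimal sketch. This is not MONK's code and does not use the Abbot, MorphAdorner or Prior APIs: the lookup tables, the `adorn` and `ingest` functions, and the `token` table layout are all hypothetical, and an in-memory SQLite database stands in for MONK's datastore. It only shows the kind of per-token record (spelling, standard spelling, part of speech, lemma, sentence number) that such a pipeline might produce and query.

```python
import re
import sqlite3

# Hypothetical lookup tables for illustration only; a real adorner
# relies on trained models and large lexicons, not two small dicts.
STANDARD_SPELLINGS = {"loue": "love", "haue": "have"}
LEXICON = {
    "love": ("verb", "love"),
    "have": ("verb", "have"),
    "i": ("pronoun", "i"),
    "thee": ("pronoun", "thou"),
}


def adorn(text):
    """Return one (sentence_no, spelling, standard, pos, lemma) tuple per word."""
    rows = []
    # Naive sentence boundary detection: split after '.', '!' or '?'.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    for sentence_no, sentence in enumerate(sentences, start=1):
        # Naive tokenization: runs of letters and apostrophes.
        for token in re.findall(r"[A-Za-z']+", sentence):
            standard = STANDARD_SPELLINGS.get(token.lower(), token.lower())
            pos, lemma = LEXICON.get(standard, ("unknown", standard))
            rows.append((sentence_no, token, standard, pos, lemma))
    return rows


def ingest(rows):
    """Load adorned tokens into an in-memory database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token (sentence_no INTEGER, spelling TEXT, "
        "standard TEXT, pos TEXT, lemma TEXT)"
    )
    conn.executemany("INSERT INTO token VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


rows = adorn("I loue thee. I haue said it.")
conn = ingest(rows)
# Count tokens per lemma, the sort of extraction a macro analysis builds on.
lemma_counts = conn.execute(
    "SELECT lemma, COUNT(*) FROM token GROUP BY lemma ORDER BY lemma"
).fetchall()
print(lemma_counts)
```

Storing the standard spelling and lemma beside the original spelling is what would let a query treat variants such as "loue" and "love" as the same word across many texts.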

Features: 
  • Micro analysis of the verbal texture of an individual text
  • Macro analysis that allows the user to locate texts in the context of a larger document space consisting of thousands of other texts
A&H use case 1 description: 
The Text Creation Partnership (EEBO and ECCO) and ProQuest (Chadwyck-Healey Nineteenth-Century Fiction) are using MONK to analyse approximately 1,000 works of British literature from the 16th through the 19th century. See the MONK project web site for further information.
A&H use case 2 description: 
The MONK project web site provides a sample collection for analysis: approximately 525 works of American literature from the 18th and 19th centuries, plus 37 plays and 5 works of poetry by William Shakespeare. See the MONK project web site for further information.
Publisher: 
Andrew W. Mellon Foundation, University of Alberta & University of Illinois at Urbana-Champaign
Creator: 
Andrew W. Mellon Foundation, University of Alberta & University of Illinois at Urbana-Champaign
Data analysis: 
Data structuring and enhancement: 
lifecycleStage: 
Platform: 
Licence: 
Software/programming languages used: 
Specifications: