The Libraries are dedicated to furthering the fields of computational linguistics and natural language processing at Columbia. This guide serves as an entry point for researchers in those fields. Because these fields are closely related to text and data mining, we encourage researchers to also consult our research guide on text mining for additional resources that might help answer computational, language-related research questions.
The Columbia University Libraries' Research Data Services (RDS) team provides expert advice on identifying and using corpora for many different types of scholarly projects, including in the field of linguistics. RDS also provides information on corpora licensed by the Libraries for the Columbia community.
TL;DR: Email Research Data Services (data@library.columbia.edu) to set up a consultation for your text mining research project.
This list will grow over time as the Columbia research community builds more bridges to various data sources for computational linguistics and natural language processing.