
Constantine Lignos

Assistant Professor
Michtom School of Computer Science
Volen National Center for Complex Systems
Brandeis University

Email: lastname at brandeis dot edu
Twitter: @ConstantineLig

I direct the Broadening Linguistic Technologies Lab at Brandeis University, where I am affiliated with the Michtom School of Computer Science, Computational Linguistics Program, and Linguistics Program. The overarching goal of my research is to broaden the depth and breadth of human language technology, with a focus on understudied problems in computational linguistics.

The primary thrust of my current work is eliminating the barriers to useful language technology for every living written language, especially lower-resourced and minoritized languages. I also continue to study the representation of language in the mind, including language acquisition, processing, and change.

I did my graduate work in Computer Science at the University of Pennsylvania (Ph.D. 2013), advised by Mitch Marcus and Charles Yang. I then completed a post-doctoral fellowship at The Children's Hospital of Philadelphia exploring clinical applications of statistical models of language processing. I was a researcher at BBN Technologies and USC Information Sciences Institute. In summer 2019, I joined the computational linguistics faculty at Brandeis University.


Latest news

Announcing Multilingual Open Text 1.0

My lab is excited to release Multilingual Open Text 1.0, a new dataset of permissively licensed text in 44 languages, many of them lower-resourced. Many thanks to joint first authors Chester Palen-Michel and June Kim!

Honorable mention for best paper at Eval4NLP 2021

My lab’s paper SeqScore: Addressing Barriers to Reproducible Named Entity Recognition Evaluation received an honorable mention for the best paper award at Eval4NLP 2021. Congratulations to the student authors on the paper, Chester Palen-Michel (Brandeis Ph.D. student) and Nolan Holley (Williams undergraduate)!


Software

SeqScore: a Python package for evaluating named entity recognition (NER) and other chunking tasks.

MORSEL: a cognitively motivated unsupervised morphological analyzer I developed for Morpho Challenge 2010. It achieved state-of-the-art results in English and Finnish.

Codeswitchador: a system for identifying code-switching in social media data. This work enables the creation of large-scale corpora of code-switching and the identification of bilingual users. I developed this as a participant in the SCALE summer workshop at the Johns Hopkins Center of Excellence in Human Language Technology.

Regrettably, much of my research over the last few years is closed-source. However, many projects are publicly available on GitHub.


Teaching

My current teaching at Brandeis includes introductory and advanced courses in computational linguistics. My course materials are all posted on the Brandeis LATTE learning management system, but the names of courses I've taught recently are available on my faculty guide page.

In the past I have taught researchers to write great Python code. The notes for the bootcamps I've done are available at Python Boot Camp for Researchers.

I used to maintain a list of common mistakes that programmers new to Python make: Anti-Patterns in Python Coding. It's now outdated, but still has some good content.

In Spring 2011 and 2012, I taught one of the CIS department's "mini-courses," Python Programming (CIS 192).

In the past I've also led some informal groups for learning Python. The slides from those groups can be found on my Python for Language Researchers Site.