Assistant Professor of Linguistics
Email: lastname at brandeis dot edu
I do research at the intersection of computation and language. My work is fundamentally interdisciplinary: it combines machine learning with theoretical and experimental linguistics to develop simple yet effective models for studying language in the mind and for building natural language processing systems.
I did my graduate work in Computer Science at the University of Pennsylvania (Ph.D. 2013), advised by Mitch Marcus and Charles Yang. I then completed a post-doctoral fellowship at the Children's Hospital of Philadelphia, exploring clinical applications of statistical models of language processing. I was a researcher at BBN Technologies and USC Information Sciences Institute. In summer 2019, I joined the computational linguistics faculty at Brandeis University.
Presenting at an AACL 2020 workshop
I’ll be presenting my paper "Effective Architectures for Low Resource Multilingual Named Entity Transliteration" at the AACL 2020 workshop Technologies for MT of Low Resource Languages. This work was done with my student Molly Moran, who graduated from the Brandeis Computational Linguistics MS Program in 2020.
Presenting at an EMNLP 2020 workshop
I’ll be presenting my paper "If You Build Your Own NER Scorer, Non-replicable Results Will Come" at the EMNLP 2020 Insights from Negative Results Workshop. This work was done with my student Marjan Kamyab, who graduated from the Brandeis Computational Linguistics MS Program in 2020.
Presenting at EMNLP 2019
I’ll be presenting a poster for my EMNLP 2019 paper "The Challenges of Optimizing Machine Translation for Low Resource Cross-Language Information Retrieval." This work was done at USC Information Sciences Institute in collaboration with the UMass Amherst Center for Intelligent Information Retrieval.
Presenting at ACL 2019
I’ll be presenting a demo and poster for SARAL, a cross-language information retrieval system for lower-resourced languages. This work was done in collaboration with many, many researchers at USC Information Sciences Institute and Idiap Research Institute.
MORSEL: a cognitively motivated unsupervised morphological analyzer I developed for Morpho Challenge 2010. It achieved state-of-the-art results in English and Finnish.
Codeswitchador: a system for identifying code-switching in social media data. This work enables the creation of large-scale corpora of code-switching and the identification of bilingual users. I developed this as a participant in the SCALE summer workshop at the Johns Hopkins Center of Excellence in Human Language Technology.
Regrettably, much of my research over the last few years has been closed-source. However, many of my projects are publicly available on GitHub.
I teach researchers to write great Python code. The notes for the bootcamps I've done are available at Python Boot Camp for Researchers.
I maintain a list of common mistakes that programmers new to Python make: Anti-Patterns in Python Coding.
In Spring 2011 and 2012, I taught Python Programming (CIS 192), one of the CIS department's "mini-courses" at the University of Pennsylvania.
In the past, I've also led some informal groups for learning Python. The slides from those groups can be found on my Python for Language Researchers site.