Biography
I am a sixth-year Ph.D. student at the Department of Computer Science at Boston University, currently advised by Prof. Steven Homer. In general, I am interested in Computational Complexity Theory and Algorithm Design. My research focuses on Circuit Complexity and Meta-Complexity. In particular, I study the mysterious Minimum Circuit Size Problem (MCSP) and focus on proving some barriers on proving its hardness as well as its connections with other topics in Theoretical Computer Science.
During the time when I am not occupied with my main research, I tend to participate in Kaggle's and Coursera's challenges on different types of problems such as regression, text classifying, etc. to self-study and enhance my basic knowledge in Machine Learning and Data Science. Visit my Github (link above) for the full collection of side projects I have done. Some are listed below.
In May 2020, I received a B.A. with Summa Cum Laude and High Honors in Computer Science from Clark University where I was advised by
Prof. Frederick Green. I did some research on Human-Computer Interaction (HCI) during
my undergraduate study under the guidance of Prof. John Magee.
Teaching Experience
- Teaching Fellow
- CS630 - Graduates Algorithms (Fall 2021)
- CS332 - Theory of Computation (Spring 2023, Fall 2023, 2024)
- CS237 - Probability in Computing (Summer 2024)
- CS235 - Algebraic Algorithms (Spring 2021, Fall 2025)
- CS132 - Geometric Algorithms (Summer 2022)
- CS131 - Combinatoric Structures (Summer 2022, 2023)
- Grader
- CS535 - Complexity Theory (Fall 2023)
Research Projects
- Algorithm Design
- Simple Circuit Extensions For XOR in PTIME. 2024
- Joint work with Marco Carmosino and Tim Jackman.
- Description: we say an extension g of a total Boolean function f with m extra
variables is simple, if we can take an optimal Boolean circuit C computing f
and add m extra variables and logic gates to C in some way to construct an optimal circuit
computing g. For this work, we consider the DeMorgan Basis (which allows AND, OR, and NOT gates only),
and NOT-gates can be used for free.
When f = XOR, the exclusive-OR function, then the decision problem of finding a
simple extension for XOR is conjectured to be intractable which implies
that this decision problem is a plausible candidate for proving NP-hardness of MCSP.
In this work, we showed an efficient algorithm that decides the simple extension problem for
XOR and ultimately rules out this decision problem as a candidate for solving the
hardness mystery of MCSP once and for all. Thus, our algorithm formally raised another barrier
on proving such hardness.
- A preprint of this work can be found here.
- Circuit Complexity
- On the possibility of a 2n + O(√n) tight-bound for the Multiplexer function. 2024
- Description: In the early 19070s, the circuit lower-bound of 2n - O(1) was shown for the single-output Multiplexer function (MUX), a Boolean function that commonly appears in practice, by Wolfgang Paul.
In this work, we aim to improve this lower-bound to 2n + O(√n) which matches the best known upper-bound for Multiplexer shown by Peter Klein and M.S. Paterson in 1978.
- Work in progress. A survey of what we know so far about MUX can be found here.
- Minimal XOR Circuits: The One True Shape is a Binary Tree. 2023
- Joint work with: Marco Carmosino and Tim Jackman.
- Description: The Exclusive-OR function is one of the most fundamental Boolean functions that has appeared
in many different contexts in both theory and practice. The circuit lower-bound of 3n - O(1) was shown
for XOR by Claus-Peter Schnorr in 1974 under the DeMorgan Basis (which allows AND, OR, and NOT gates only),
and NOT-gates can be used for free, and in this setting, XOR has a trivial matching upper-bound.
In this work, we push our understanding of XOR further by characterizing the structure of optimal
circuits computing XOR.
- Unpublished Manuscript. This work was integrated with Simple Circuit Extensions For XOR in PTIME.
A write-up on this work can be found here (for interest in this part of the our project only).
- Other Publications & Manuscripts
- Marco Carmosino, Ngu Dang, Tim Jackman. 2023. Formalizing Gate Elimination via Term Graphs Rewriting. Work In Progress (preprint will be available soon).
- Mariah Papy, Duncan Calder, Ngu Dang, Aidan McLaughlin, Breanna Desrochers, and John Magee.
2019. Simulation of Motor Impairment with “Reversed Angle Mouse”
in Head-Controlled Pointer Fitts’ Law Task. In Proceedings of the 21st International ACM SIGACCESS
Conference on Computers and Accessibility (ASSETS’19). ACM, Pittsburgh, PA, USA, 545-547.
Selected Personal Projects
- Tweet Dialect Classifier: Built a dialect classifying pipeline in Python with BERTweet-based model that distinguishes African American Vernacular English from Standard and regular African American English and achieved 0.95, 0.99, and 0.97 for accuracy, recall, and F1 score respectively. Integrated the classifier into a bias-aware sentiment analysis pipeline, with statistical analysis to provide insights on fairness in interpretation of social media text across different models (i.e. RoBERTa, RoBERTa-Latest, BERTweet). Github Link.
- Human Activity Recognition Using Deep Learning: Built a deep learning pipeline using Python and PyTorch to classify human activities from Wi-Fi Channel State Information (CSI) data, achieving 0.98 accuracy score with a custom CNN-LSTM model. The Human Activities via Wi-Fi CSI dataset can be found here. Github Link.
- Churn Predictor for Subscription Services (Coursera's Challenge): Implemented an end-to-end churn prediction pipeline in Python for a video streaming service using a real-world imbalanced subscription dataset using an ensemble of three models — a neural network, XGBoost, and Random Forest — using weighted soft voting to optimize class ranking and maximize AUC. Github Link.
- Edible Mushroom Classifier (Kaggle's Challenge): Implemented a Random Forest model classifying edible mushrooms from toxic ones based on physical characteristics. The dataset used in this project (train and test) was generated from a deep learning model trained on the UCI Mushroom dataset . The training set contains 3116945 data points; the test set contains 2077964 data points, with 22 features. The model achieved an accuracy score of 0.987. Github Link .
Certifications
Academic Awards
- Outstanding Academic Achivement. Issued by Department of Mathematics & Computer Science
- Clark University, May 2020
- Phi Beta Kappa Membership. Issued by Phi Beta Kappa
- Lambda of Massachusetts at Clark University, May 2020