Lj Miranda

A collection of notes, projects, and essays.

By Lj Miranda
Based in the Philippines
Roughly four posts per year
First post on 2017-01-17

Posts per year

Year starting	Posts
2022	6
2023	10
2024	4
2025	0

Any gaps could be due to errors when fetching the blog’s feed.

Most recent posts

The missing pieces in Filipino NLP in the age of LLMs

Back when I started working in Filipino NLP, my standard approach to training models was to encode linguistic knowledge via meticulous data annotation, feature engineering, and extensive testing. Take calamanCy for example: we spent countless …

Guest lecture @ DLSU Manila: Artisanal Filipino NLP Resources in the time of Large Language Models

I was invited to give a talk to a graduate-level NLP class about my work on Filipino resources. It was fun preparing and giving that talk because I was able to synthesize my thoughts and …

A lexical view of contrast pairs in preference datasets

Preference data is a staple in the final step of the LLM training pipeline. During RLHF, we train a reward model by showing pairs of chosen and rejected model outputs so that it can teach …