
Lj Miranda

A collection of notes, projects, and essays.

  • By Lj Miranda
  • Based in Philippines
  • Roughly four posts per year
  • First post on

Posts per year

Year starting    Posts
2022             6
2023             10
2024             4
2025             0

Any gaps could be due to errors when fetching the blog’s feed.

Most recent posts

The missing pieces in Filipino NLP in the age of LLMs
Back when I started working in Filipino NLP, my standard approach to training models was to encode linguistic knowledge through meticulous data annotation, feature engineering, and extensive testing. Take calamanCy for example: we spent countless …
On , by LJ MIRANDA, 1,993 words
Guest lecture @ DLSU Manila: Artisanal Filipino NLP Resources in the time of Large Language Models
I was invited to give a talk to a graduate-level NLP class about my work on Filipino resources. It was fun preparing and giving that talk because I was able to synthesize my thoughts and …
On , by LJ MIRANDA, 2,238 words
A lexical view of contrast pairs in preference datasets
Preference data is a staple in the final step of the LLM training pipeline. During RLHF, we train a reward model by showing pairs of chosen and rejected model outputs so that it can teach …
On , by LJ MIRANDA, 1,638 words