Research

How we build and evaluate.

Technical notes on the work behind our models: how we assemble training data, how we measure whether a model is actually useful, and what we learn along the way.

Search & sources · 14 July 2026

Behind IsonAI: How Ison Search Reaches Indonesian Sources

How Ison Search combines BM25 and vector search, merges their results with RRF, then reranks candidates into cited context for IsonAI.

Retrieval · 10 July 2026

Searching the Indonesian web by meaning

Keyword search misses Indonesian questions asked in everyday words. How we added retrieval by meaning to IsonSearch, made vector search over millions of pages fit on our own hardware, and what that changes for IsonAI.

Curation & quality · 14 June 2026

Curating the Indonesian web for AI

The open web is full of spam, scraper traps, and copied pages. How IsonSearch Curator decides what enters the index, judges quality the web does not label, and runs as automation built to be corrected.

Corpus & data · 29 May 2026

Building an Indonesian regulatory corpus

A specialist model is only as good as its corpus. How we assembled a large, clean body of Indonesia's public regulation, from gathering to deduplication, and what surprised us along the way.

Evaluation · 28 May 2026

Evaluating AI for Indonesian government work

Popular benchmarks say little about whether a model can trace a legal basis or draft an official letter. The evaluation we built instead, and how we grade it.