JSONL and NDJSON: The Complete Developer Guide
What JSON Lines is, when to use it over a JSON array, how to validate it, and why it has become the format of choice for ML training data and log pipelines.
What is JSONL?
JSONL (JSON Lines), also known as NDJSON (Newline Delimited JSON), is a text format where each line is a complete, valid JSON value. The format has three simple rules:
- Each line is a valid JSON value (object, array, string, number, etc.)
- Lines are separated by \n (Unix newlines recommended)
- Empty lines are allowed and should be ignored
```json
{"id":1,"name":"Alice","role":"admin"}
{"id":2,"name":"Bob","role":"user"}
{"id":3,"name":"Charlie","role":"user"}
```

Notice there are no surrounding brackets and no commas between lines — every line stands alone.
JSONL vs JSON Array
A JSON array of objects ([{...}, {...}]) is the obvious alternative. JSONL wins in specific scenarios:
Streaming
With a JSON array, you must receive the entire response before parsing begins — the parser needs to see the closing ]. With JSONL, you can parse and process each line as it arrives. This is why streaming LLM APIs, including OpenAI's and Anthropic's, deliver responses as sequences of newline-delimited JSON events.
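To make this concrete, here is a minimal sketch of an incremental JSONL parser (the createJsonlParser helper is illustrative, not a real library's API). It buffers a partial trailing line across chunks and emits each record the moment its newline arrives — no need to wait for the end of the stream:

```javascript
// Feed chunks as they arrive; complete records are emitted immediately.
function createJsonlParser(onRecord) {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep the trailing partial line for the next chunk
    for (const line of lines) {
      if (line.trim()) onRecord(JSON.parse(line));
    }
  };
}
```

Even if a record is split across two network chunks, it is parsed exactly once, as soon as it is complete.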
Large files
Appending to a JSONL file is a single write call. Appending to a JSON array requires reading the file, removing the trailing ], appending a comma and new object, then re-adding the ]. For log files that grow continuously, JSONL wins decisively.
Partial processing
You can grep, head, tail, and wc -l a JSONL file with standard Unix tools. A JSON array requires a proper parser for any meaningful operation.
When to Use a JSON Array Instead
- REST API responses — clients expect standard JSON
- Small datasets where streaming is irrelevant
- When the data has a natural top-level structure beyond a flat list
- Browser-side data where you need JSON.parse() to work directly
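The last point is easy to demonstrate: a JSON array is a single value that JSON.parse() handles directly, while multi-line JSONL is not a single JSON value at all.

```javascript
// A JSON array is one value, so JSON.parse handles it directly:
const records = JSON.parse('[{"id":1},{"id":2}]');

// Multi-line JSONL is not a single JSON value, so JSON.parse throws:
let isSingleValue = true;
try {
  JSON.parse('{"id":1}\n{"id":2}');
} catch {
  isSingleValue = false;
}
```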
Common Uses of JSONL
Machine Learning Training Data
JSONL is the de facto format for LLM fine-tuning datasets. OpenAI's fine-tuning API, Anthropic's training pipelines, and Hugging Face datasets all use JSONL:
```json
{"messages":[{"role":"user","content":"What is 2+2?"},{"role":"assistant","content":"4"}]}
{"messages":[{"role":"user","content":"Translate 'hello' to French."},{"role":"assistant","content":"Bonjour"}]}
```

Application Logs
Structured logging tools (Pino, Winston, Bunyan) output JSONL by default. Each log entry is a self-contained JSON object that log aggregators (Datadog, Loki, CloudWatch) can parse without preprocessing.
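A minimal sketch of the idea — the logLine helper below is hypothetical, not the actual API of Pino, Winston, or Bunyan, but it shows why one JSON object per line suits aggregators:

```javascript
// Each call yields one self-contained JSON object on its own line,
// so an aggregator can parse entries independently, with no preprocessing.
function logLine(level, msg, extra = {}) {
  return JSON.stringify({ level, time: Date.now(), msg, ...extra });
}

const entry = logLine('info', 'server started', { port: 8080 });
```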
Database Exports
MongoDB's mongoexport outputs JSONL. ClickHouse, BigQuery, and DynamoDB all support JSONL as an import/export format because it maps cleanly to a sequence of rows.
Validating JSONL
Validation is simple — parse each non-empty line with JSON.parse() and report errors by line number:
```javascript
function validateJsonl(text) {
  return text.split('\n').map((line, i) => {
    if (!line.trim()) return { line: i + 1, ok: true };
    try {
      JSON.parse(line);
      return { line: i + 1, ok: true };
    } catch (e) {
      return { line: i + 1, ok: false, error: e.message };
    }
  });
}
```

Converting JSONL to a JSON Array
```javascript
const jsonArray = text
  .split('\n')
  .filter(line => line.trim())
  .map(line => JSON.parse(line));

// Then stringify for output:
JSON.stringify(jsonArray, null, 2);
```

File Extension
Both .jsonl and .ndjson are used. .jsonl is more common in ML tooling; .ndjson is preferred in some API and logging contexts. The formats are identical — the difference is naming only.
Validate your JSONL instantly
Paste JSONL data to validate each line and convert valid records to a clean JSON array — all in your browser.
Open JSONL Validator →