CSV vs TSV vs JSON: Which Data Format Should You Use?

A practical comparison of the three most common text-based data formats, with guidance on when each one shines.

Three Formats, One Problem

When you need to move structured data between systems -- exporting from a database, feeding a data pipeline, sharing results with a colleague, or loading data into a web application -- you will almost always reach for one of three formats: CSV, TSV, or JSON. Each has decades of real-world use, broad tooling support, and genuine strengths. But they also have meaningful differences that affect correctness, performance, and developer experience.

Quick Comparison

Feature | CSV | TSV | JSON
Delimiter | , (comma) | \t (tab) | Structural (braces, brackets)
Data types | All strings | All strings | String, number, boolean, null, array, object
Nesting | No | No | Yes (arbitrary depth)
Spreadsheet support | Excellent | Good | Poor (needs conversion)
API support | Rare | Very rare | Universal
File size | Smallest | Small | Larger (key repetition)
Formal spec | RFC 4180 (loose) | IANA (informal) | RFC 8259 (strict)

CSV: The Universal Spreadsheet Format

CSV (Comma-Separated Values) is the lingua franca of tabular data. Nearly every spreadsheet application, database, and data analysis tool can read and write CSV. Its simplicity is its strength: each record is a row, fields are separated by commas, and fields containing commas, newlines, or double quotes are wrapped in double quotes, with any embedded double quote written twice.
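
The quoting rules are easiest to see in a round trip. The following is a minimal sketch using Python's standard csv module; the sample rows are made up for illustration, and other writers may quote more aggressively than this one does.

```python
import csv
import io

# Illustrative rows: one field with a comma, one with double quotes,
# and one with an embedded newline.
rows = [
    ["name", "note"],
    ["Ada", "likes commas, apparently"],
    ["Bob", 'said "hi"'],
    ["Cy", "line one\nline two"],
]

# Write to an in-memory buffer; the writer quotes only fields that need it.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
encoded = buffer.getvalue()
print(encoded)

# Reading the text back recovers the original values, including the
# field that spans two physical lines.
decoded = list(csv.reader(io.StringIO(encoded)))
print(decoded == rows)
```

Only the three awkward fields come out wrapped in double quotes, and the embedded quotes around hi are doubled.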

Strengths

  • Universal compatibility: Excel, Google Sheets, LibreOffice, R, Python pandas, SQL COPY -- everything handles CSV.
  • Human-readable: You can open a CSV in any text editor and understand the data immediately.
  • Compact: Minimal overhead per record. A million-row CSV is typically 30-50% smaller than the same data in JSON.
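  • Streamable: You can process CSV record by record without loading the entire file into memory. A quoted field may contain newlines, so a record is not always a single line; use a real CSV parser rather than splitting on line breaks.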
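
As a rough illustration of the streaming point, the sketch below sums one column while holding only the current record in memory. The function name total_amount and the column names are hypothetical; for a real file you would pass open(path, newline="", encoding="utf-8") instead of the in-memory sample.

```python
import csv
import io

def total_amount(lines):
    """Sum the 'amount' column one record at a time (hypothetical column name)."""
    total = 0.0
    # DictReader pulls lines lazily from any iterable of strings,
    # so memory use stays flat regardless of file size.
    for record in csv.DictReader(lines):
        total += float(record["amount"])
    return total

# The first data record contains a quoted newline, so it spans two lines.
sample = io.StringIO('id,amount,memo\n1,10.5,"first\nentry"\n2,4.5,plain\n')
print(total_amount(sample))
```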

Weaknesses

  • No type information: Everything is a string. The number 42, the boolean true, and the date 2026-01-01 are all just text. Consumers must infer types.
  • Quoting ambiguity: Different tools handle quoting differently. Excel on Windows uses different defaults than Excel on macOS. See our guide on CSV parsing edge cases.
  • No nesting: CSV is flat. If your data has hierarchical relationships, you must flatten them (and lose structure) or use a different format.
  • Encoding issues: No standard encoding declaration. UTF-8 is assumed but not guaranteed. Excel exports often use locale-specific encodings.
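
The missing type information shows up as soon as you parse a row. Here is a small sketch, assuming a hypothetical three-column file, of what a consumer has to do by hand:

```python
import csv
import io
from datetime import date

# Hypothetical input: a number, a boolean, and a date, all stored as text.
raw = "count,active,start\n42,true,2026-01-01\n"
row = next(csv.DictReader(io.StringIO(raw)))
print(row)  # every value is a str

# The consumer must decide how each column is interpreted.
typed = {
    "count": int(row["count"]),
    "active": row["active"].lower() == "true",
    "start": date.fromisoformat(row["start"]),
}
print(typed)
```

The conversion rules here (for example, treating only the text true as a boolean) are one possible convention, not something the CSV file itself specifies.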

Best For

Data exchange with spreadsheet users, database imports/exports, log files, configuration data, and any scenario where the data is naturally tabular with uniform columns.

TSV: CSV Without the Quoting Problem

TSV (Tab-Separated Values) uses tab characters instead of commas as delimiters. This seemingly minor change eliminates the most common source of CSV parsing errors: commas in data values. Tabs almost never appear naturally in data, so quoting is rarely needed.
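
A quick sketch of the difference, using Python's csv module with a tab delimiter (the address is invented sample data):

```python
import csv
import io

# A value with a comma needs no quoting once the delimiter is a tab.
rows = [
    ["city", "address"],
    ["Springfield", "742 Evergreen Terrace, Apt 2"],
]

buffer = io.StringIO()
csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
tsv_text = buffer.getvalue()
print(tsv_text)

# Reading back with the same delimiter recovers the rows.
parsed = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
print(parsed == rows)
```

Note that this module would still fall back to CSV-style quoting if a field contained a tab, which stricter TSV consumers may not understand, so it is safer to strip or replace tabs in values before writing.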

Strengths

  • Less ambiguity: Tabs in data are extremely rare, so the quoting rules that trip up CSV parsers are almost never triggered.
  • Copy-paste friendly: When you copy a table from a web page or spreadsheet, it is typically tab-delimited. TSV matches this native format.
  • Bioinformatics standard: TSV is the dominant format in genomics, proteomics, and scientific data pipelines (BED, GFF, VCF files are all tab-delimited).

Weaknesses

  • Less universal than CSV: Some tools that auto-detect CSV will not auto-detect TSV without configuration.
  • Invisible delimiter: Most text editors render tabs and spaces identically unless whitespace display is turned on, so a tab accidentally replaced by spaces can cause subtle bugs.
  • Same type limitations: Like CSV, everything is a string with no type metadata.

Best For

Scientific data pipelines, bioinformatics, data with commas in values (addresses, descriptions), and clipboard-based data transfer.

JSON: The API Standard

JSON (JavaScript Object Notation) is the default data format for web APIs and modern application development. Unlike CSV and TSV, JSON supports data types (strings, numbers, booleans, null), nesting (objects within objects), and arrays of mixed types.

Strengths

  • Rich types: Numbers are numbers, booleans are booleans, null is null. No type inference needed.
  • Hierarchical: Naturally represents nested data: a customer with multiple orders, each with multiple line items.
  • Self-describing: Keys are embedded in the data, so a JSON document can be understood without external schema documentation.
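  • Strict specification: RFC 8259 leaves little room for ambiguity, so a valid JSON document parses the same way in nearly every implementation. The main exceptions are duplicate object keys and numbers outside the range a parser can represent exactly (for example, integers beyond 2^53 in JavaScript).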
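
To make the type and nesting points concrete, here is a minimal sketch that parses a small, made-up customer document with Python's json module:

```python
import json

# Hypothetical document: typed scalars plus a nested list of orders.
text = """
{
  "name": "Ada",
  "age": 36,
  "active": true,
  "manager": null,
  "orders": [
    {"id": 1, "items": ["pen", "ink"]}
  ]
}
"""

customer = json.loads(text)

# Values arrive with their types intact; no inference step is needed.
print(type(customer["age"]).__name__)     # int
print(customer["active"] is True)
print(customer["manager"] is None)
print(customer["orders"][0]["items"][1])  # nested access
```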

Weaknesses

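  • Verbose: Key names are repeated for every record. A 1M-row dataset stored as an array of objects repeats every column name 1 million times, which commonly doubles the file size relative to CSV and can inflate it further when key names are long compared with the values.
  • Not streamable (by default): Most JSON parsers read the entire document into memory before any data can be accessed. Incremental (streaming) parsers exist but are less common and harder to use. JSON Lines (JSONL), which puts one JSON value per line, solves this but is a separate convention.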
  • Poor spreadsheet support: Opening JSON in Excel or Google Sheets requires conversion or a plugin.
  • No comments: JSON does not support comments, which makes it awkward for configuration files (though JSONC and JSON5 extensions exist).
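
For the streaming limitation, JSON Lines can be read one record at a time with very little code. The helper name iter_jsonl below is hypothetical, and the sketch assumes one complete JSON value per non-empty line:

```python
import io
import json

def iter_jsonl(lines):
    """Yield one parsed record per non-empty line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# In practice `lines` would be an open file; a StringIO stands in here.
sample = io.StringIO('{"id": 1, "ok": true}\n{"id": 2, "ok": false}\n')
ids = [record["id"] for record in iter_jsonl(sample)]
print(ids)
```

Because each line is parsed independently, only one record needs to be in memory at a time.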

Best For

Web APIs, application configuration, document databases, nested or hierarchical data, and any scenario where type preservation matters.

Decision Guide

Choose your format based on who or what will consume the data:

When to Use Each Format

  • CSV: Spreadsheet users, database import/export, data analysis (pandas, R), simple flat data
  • TSV: Scientific workflows, data with commas, clipboard operations, bioinformatics pipelines
  • JSON: Web APIs, nested data, type-sensitive applications, config files, document stores

Converting Between Formats

Our CSV Viewer can open CSV and TSV files and export them as JSON. For the reverse direction, our JSON Formatter can help you inspect and transform JSON data. For flat JSON arrays of objects, the CSV Viewer can convert to CSV/TSV for spreadsheet consumption.
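
If you would rather script the conversion, the sketch below shows one way to do it with Python's standard library. It is not how the tools above are implemented; the function names csv_to_json and json_to_csv are hypothetical, and the reverse direction assumes a flat JSON array of objects that all share the same keys.

```python
import csv
import io
import json

def csv_to_json(csv_text, delimiter=","):
    """Convert CSV (or TSV, with delimiter='\\t') text to a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    # Every value stays a string, because CSV carries no type information.
    return json.dumps(list(reader), indent=2)

def json_to_csv(json_text, delimiter=","):
    """Convert a flat JSON array of objects to CSV/TSV text."""
    records = json.loads(json_text)
    if not records:
        return ""
    out = io.StringIO()
    # Column order is taken from the first record (an assumption of this sketch).
    writer = csv.DictWriter(
        out,
        fieldnames=list(records[0].keys()),
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

csv_text = "id,name\n1,Ada\n2,Bob\n"
as_json = csv_to_json(csv_text)
print(as_json)
print(json_to_csv(as_json))
```

Nested objects or records with differing keys would need to be flattened first, which is where the structural mismatch between the formats becomes visible.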

Further Reading