TSV

Tab-Separated Values

A plain text format for tabular data where columns are separated by tab characters instead of commas.
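
As a minimal sketch of that layout, the snippet below splits TSV text into rows and fields. The `parseTsv` helper is a hypothetical name introduced here for illustration, and it assumes that no field contains a literal tab or newline (no escaping or quoting is handled).

```javascript
// Hypothetical helper: split TSV text into an array of rows,
// where each row is an array of field strings.
// Assumption: fields never contain a literal tab or newline.
function parseTsv(text) {
  return text
    .split(/\r?\n/)                 // one record per line (LF or CRLF)
    .filter((line) => line !== '')  // skip empty lines, e.g. after a trailing newline
    .map((line) => line.split('\t')); // fields are separated by a single tab
}

const rows = parseTsv('name\tcity\nAda\tLondon\n');
console.log(rows);
```

Because the delimiter is a tab rather than a comma, values that contain commas need no quoting in this simple case.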

Technical Details

A TSV file's character encoding determines how its bytes map to characters. ASCII uses 7 bits (128 characters). ISO 8859-1 (Latin-1) extends this to 256 characters. UTF-8 is the de facto standard: it is backward-compatible with ASCII, uses 1 to 4 bytes per character, and can encode every Unicode code point (149,813 assigned characters as of Unicode 15.1). UTF-8 accounts for more than 98% of web pages. The BOM (Byte Order Mark, U+FEFF) optionally identifies the encoding: in UTF-16 it distinguishes big-endian from little-endian byte order, while in UTF-8 it is unnecessary.

```javascript
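// Sample input: a UTF-8 BOM followed by the bytes of "a\tb\n1\t2"
const bytes = new Uint8Array([0xEF, 0xBB, 0xBF, 0x61, 0x09, 0x62, 0x0A, 0x31, 0x09, 0x32]);

// Check for a UTF-8 BOM (Byte Order Mark) in the first three bytes
if (bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF) {
  console.log('UTF-8 with BOM detected');
}

// Decode the bytes as UTF-8; TextDecoder drops a leading BOM by default
const decoder = new TextDecoder('utf-8');
const text = decoder.decode(bytes);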
```
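
The BOM check can be extended to UTF-16, where the mark also reveals byte order. The following is a rough sketch rather than a complete detector: `sniffBom` is a hypothetical helper name, and falling back to UTF-8 when no BOM is present is an assumption, not a rule.

```javascript
// Hypothetical helper: guess an encoding label from a leading BOM, if any.
function sniffBom(bytes) {
  if (bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF) return 'utf-8';
  if (bytes[0] === 0xFE && bytes[1] === 0xFF) return 'utf-16be';
  if (bytes[0] === 0xFF && bytes[1] === 0xFE) return 'utf-16le';
  return null; // no BOM: the encoding must be known some other way
}

// "a<TAB>b" encoded as UTF-16LE, preceded by the little-endian BOM (FF FE)
const utf16Bytes = new Uint8Array([0xFF, 0xFE, 0x61, 0x00, 0x09, 0x00, 0x62, 0x00]);

// Assumption: treat BOM-less input as UTF-8
const label = sniffBom(utf16Bytes) ?? 'utf-8';
const decoded = new TextDecoder(label).decode(utf16Bytes);
console.log(label, decoded.split('\t'));
```

This sketch does not distinguish UTF-32, whose little-endian BOM begins with the same two bytes as the UTF-16LE one, so it should be treated as a starting point only.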
