Meridianmeridian

File Extension

representation.file.extension

File name extension or suffix (txt, pdf, docx, jpg, etc.). May include or exclude leading dot.

Domain
representation
Category
file
Casts to
VARCHAR
Scope
broad_words

Try it

CLI
$ finetype infer -i "txt"
→ representation.file.extension

DuckDB

Detect
SELECT finetype('txt');
-- → 'representation.file.extension'
Cast expression
LOWER(REGEXP_REPLACE({col}, '^\.*', ''))
Safe cast pipeline
-- Normalise and cast in one step
SELECT TRY_CAST(finetype_cast(my_column) AS VARCHAR) AS clean_value
FROM my_table
WHERE finetype(my_column) = 'representation.file.extension';

Struct Expansion

Expression
category: CASE WHEN {col} IN ('txt', 'doc', 'docx', 'pdf', 'rtf') THEN 'document' WHEN {col} IN ('jpg', 'jpeg', 'png', 'gif', 'bmp', 'svg') THEN 'image' WHEN {col} IN ('mp4', 'avi', 'mov', 'mkv', 'webm') THEN 'video' WHEN {col} IN ('mp3', 'wav', 'flac', 'aac', 'm4a') THEN 'audio' WHEN {col} IN ('zip', 'rar', '7z', 'tar', 'gz') THEN 'archive' ELSE 'other' END

JSON Schema

finetype schema representation.file.extension
{
  "$id": "https://meridian.online/schemas/representation.file.extension",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "File name extension or suffix (txt, pdf, docx, jpg, etc.). May include or exclude leading dot.",
  "examples": [
    "txt",
    ".pdf",
    "docx",
    "jpg",
    "xlsx"
  ],
  "pattern": "^\\.?[a-zA-Z0-9]{1,10}$",
  "title": "File Extension",
  "type": "string",
  "x-finetype-broad-type": "VARCHAR",
  "x-finetype-transform": "LOWER(REGEXP_REPLACE({col}, '^\\.*', ''))"
}

Examples

txt.pdfdocxjpgxlsx

Aliases

file_type