Daflip

A fast and flexible data format conversion tool built with Python, pandas, and PyArrow.

Features

Multiple Format Support: Convert between CSV, Parquet, Excel, Feather, ORC, and more
High Performance: Leverages pandas and PyArrow for efficient data processing
Chunked Processing: Handle large files with memory-efficient chunked processing
Schema Management: Export and apply custom schemas for data validation
Compression Support: Optional compression of output files, for example Snappy for Parquet
Command Line Interface: Simple CLI for quick conversions
Python API: Full Python API for programmatic use

Quick Start

# Install Daflip
pip install daflip

# Convert a CSV file to Parquet
daflip data.csv data.parquet

# Convert with compression
daflip data.csv data.parquet --compression snappy

# Convert large files with chunking
daflip large.csv large.parquet --input-chunk-size 10000

# Or use uvx for one-off conversions without installing
uvx daflip data.csv data.parquet

Supported Formats

Input	Output	Notes
CSV	✅	With custom separators
Parquet	✅	With compression
Excel (.xlsx, .xls)	✅	With sheet selection
Feather	✅
ORC	✅
SAS7BDAT	✅
Stata	✅
SPSS	✅
HTML	✅	With table selection

Why Daflip?

Simple: One command converts between supported formats
Fast: Optimized for performance with pandas and PyArrow
Flexible: Support for compression, chunking, and custom schemas
Reliable: Comprehensive test suite and error handling

Get Started

Choose your path:

Installation - How to install Daflip
Quick Start - Your first conversion
User Guide - Detailed usage instructions
API Reference - Complete CLI and Python API docs

Contributing

We welcome contributions! See our Contributing Guide for details.
