PostgreSQL's COPY command
PostgreSQL's COPY command is the core tool for bulk data import and export, and it is typically much faster than loading the same rows with individual INSERT statements. This article takes a detailed look at the command.
1. COPY command basics
1.1 Basic syntax comparison
Command type | Syntax example | Execution location | File access permissions |
---|---|---|---|
Server-side COPY | COPY table FROM '/path/'; | Database server | The file must be readable by the OS user that runs the server (typically postgres); the database role must be a superuser or a member of pg_read_server_files |
Client-side \copy (psql) | \copy table FROM ''; | Client machine | Uses the client user's permissions |
1.2 Core Function Matrix
Function | COPY FROM | COPY TO |
---|---|---|
Data loading speed | Tens of thousands of rows per second (order of magnitude) | Tens of thousands of rows per second (order of magnitude) |
Transaction processing | Single transaction operation | Single transaction operation |
Binary support | yes | yes |
Error handling | Aborts on the first bad row by default; ON_ERROR ignore (PostgreSQL 17+) skips rows that fail data type conversion | - |
2. Advanced usage
2.1 Complex data conversion
    -- Convert data types when importing
    COPY users(id, name, reg_date)
    FROM '/data/'
    WITH (
        FORMAT csv,
        HEADER,
        DELIMITER '|',
        NULL 'NULL',
        FORCE_NOT_NULL (id, name),
        ENCODING 'UTF8'
    );
2.2 Conditional Export
    -- Export query results
    COPY (SELECT * FROM orders WHERE order_date > '2025-01-01')
    TO '/data/recent_orders.csv'
    WITH (FORMAT csv, HEADER);
3. Performance optimization
3.1 Batch Loading Best Practices
    # Use parallel loading (after splitting the file)
    for i in {1..4}; do
      psql -c "COPY large_table FROM '/data/part$' WITH (FORMAT csv)" &
    done
    wait
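The loop above assumes the input file has already been split. A minimal sketch of that step follows, assuming GNU coreutils `split`, a CSV without a header row, and a file that ends with a newline. `split_csv` is a hypothetical helper written for illustration, not part of PostgreSQL.

```shell
# Split a header-less CSV into at most N parts of roughly equal line count,
# so that each part can be loaded by a separate COPY session.
# split_csv is a hypothetical helper; it assumes GNU split (for -d).
split_csv() {
  input="$1"; parts="$2"; prefix="$3"
  total=$(wc -l < "$input")
  # Ceiling division keeps the number of output files at or below $parts.
  per_part=$(( (total + parts - 1) / parts ))
  # Guard against an empty input, where split -l 0 would be rejected.
  [ "$per_part" -gt 0 ] || per_part=1
  # -d: numeric suffixes (00, 01, ...); -a 2: two-character suffix width.
  split -l "$per_part" -d -a 2 "$input" "$prefix"
}
```

Called as `split_csv big.csv 4 part_` (placeholder names), this would write up to four files named `part_00`, `part_01`, and so on. Splitting by line count is only safe when no quoted CSV field contains an embedded newline.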
3.2 Key performance parameters
Parameter | Recommended value | Effect |
---|---|---|
maintenance_work_mem | 1GB+ | Speeds up index builds and foreign key validation that follow a bulk load |
max_wal_size | 4GB+ | Reduces checkpoint frequency during the load |
synchronous_commit | off | Commits return without waiting for the WAL flush, which speeds up imports made of many small transactions |
4. Exception handling
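Of these parameters, `synchronous_commit` and `maintenance_work_mem` can be changed per session, whereas `max_wal_size` is a server-level setting that takes effect after a configuration reload. One way to scope the session-level settings to a single load is the libpq `PGOPTIONS` environment variable. The following is a minimal sketch; `build_pgoptions` is a hypothetical helper, and the database, table, and file names in the usage comment are placeholders.

```shell
# Build a PGOPTIONS string from name=value pairs so that the settings apply
# only to sessions started with it. build_pgoptions is a hypothetical helper.
build_pgoptions() {
  opts=""
  for kv in "$@"; do
    # Each setting becomes a "-c name=value" run-time option.
    opts="${opts:+$opts }-c $kv"
  done
  printf '%s\n' "$opts"
}

# Usage (placeholder database, table, and file names):
#   PGOPTIONS="$(build_pgoptions synchronous_commit=off maintenance_work_mem=1GB)" \
#     psql -d mydb -c "COPY large_table FROM STDIN WITH (FORMAT csv)" < part.csv
```

Setting the values this way leaves postgresql.conf untouched, so other sessions keep the server defaults.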
4.1 Error logging
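    -- Create an error log table
    CREATE TABLE import_errors (
        line_num  integer,
        error_msg text,
        raw_data  text
    );

    -- Import through a staging table and record skipped rows.
    -- Note: a malformed input row still aborts the COPY itself.
    BEGIN;
    CREATE TEMP TABLE temp_import (LIKE target_table);
    COPY temp_import FROM '/data/' WITH (FORMAT csv, HEADER);

    -- PostgreSQL has no built-in function that returns COPY errors, so log
    -- the rows that would conflict before inserting. This assumes that
    -- target_table has a unique column named id (a hypothetical key).
    INSERT INTO import_errors (error_msg, raw_data)
    SELECT 'duplicate key', t::text
    FROM temp_import AS t
    WHERE EXISTS (SELECT 1 FROM target_table AS x WHERE x.id = t.id);

    INSERT INTO target_table
    SELECT * FROM temp_import
    ON CONFLICT DO NOTHING;
    COMMIT;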
4.2 Binary format processing
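    # Export a table in pg_dump's custom archive format (-Fc)
    pg_dump -t table_name -Fc -f dbname
    # List the archive's table of contents
    pg_restore -l >
    # Note: COPY's own binary format is selected with WITH (FORMAT binary)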
5. Monitoring and Maintenance
5.1 Performance monitoring query
    -- Check COPY operation history (requires the pg_stat_statements
    -- extension; column names as of PostgreSQL 13+)
    SELECT query, calls, total_exec_time, mean_exec_time
    FROM pg_stat_statements
    WHERE query LIKE 'COPY%'
    ORDER BY total_exec_time DESC;

    -- Check the import progress (PostgreSQL 14+)
    SELECT pid, relid::regclass AS table_name, command, type,
           bytes_processed, bytes_total, tuples_processed
    FROM pg_stat_progress_copy;
5.2 Maintenance suggestions
- Regularly clean up temporary import files and watch disk usage: large COPY operations can generate a large volume of WAL
- Version upgrade verification: COPY behavior and options may differ between PostgreSQL versions
- Network optimization: Consider compression options when transferring across data centers
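For the network point above, one common approach is to compress the export stream on the client before it is stored or sent on. The following is a minimal sketch, assuming `gzip` is installed; `compress_to` is a hypothetical helper, and the database name, query, and file name in the usage comment are placeholders.

```shell
# Compress whatever arrives on stdin into the given file.
# compress_to is a hypothetical helper name.
compress_to() {
  gzip -c > "$1"
}

# Usage: stream a client-side export through the helper.
#   psql -d mydb -c "\copy (SELECT * FROM orders) TO STDOUT WITH (FORMAT csv, HEADER)" \
#     | compress_to recent_orders.csv.gz
```

Because the data is compressed as it streams, no uncompressed copy of the export needs to be written to disk first.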
The COPY command is the core tool for PostgreSQL data migration, and mastering its advanced usage can significantly improve ETL efficiency. For TB-level data migration, the following are recommended:
- Reduce I/O with the binary format
- Combine parallel loading with table partitioning
- Disable WAL archiving during the maintenance window
- Consider the pg_bulkload extension for very large data sets
For more details, please check the official documentation:
/docs/17/