Python File System Operations
· 約3分
- Use pathlib to handle file paths and operations by default
- For finegraind read/write over file I/O (streaming), use context manager
- Use the
tempfile
Module for Temporary Files/Directories - Use shutil for High-level Operations
Details
Use pathlib
is the modern, oo way to handle file paths, and operations.
from pathlib import Path
# Create Path Objects:
my_file = Path("data")
# Use Path methods
my_file.exists()
# Content I/O
my_file.read_text()
#Path.write_text()
#Path.write_text()
#Path.write_bytes()
Create a Path Object
# From CWD
#
current_dir = Path.cwd()
# From Home
home_dir = Path.home()
# From absolute Paths
abs_path = Path("/usr/local/bin/python")
# From relative paths (relative to CWD)
relative_path = Path("data/input.csv")
# Create path by manipulation
base_dir = Path('/opt/my_app')
config_file = base_dire / "config" / "settings.yaml"
parent_dir = config_file.parent
Dealing with file name
# Get file / directory name
config_file.name
# Getting Stem
config_file.stem # settings
# Getting suffix
config_file.suffix
config_file.suffixes
# Get absolute path
config_file.resolve()
# or
config_file.absolute()
# Get relative path
relative_to = config_file.relative_to(project_root)
Check / Query File System
my_file.exists()
my_file.is_file()
my_file.is_dir()
my_file.is_symlink()
# Statistics
stats = temp_file.stat()
Operations
# Create directories
new_dir.mkdir()
# create empty file
empty_file.touch()
# delete file
file_to_delete.unlink()
# delete empty directories
empty_folder.rmdir()
# rename / move file or directories
old_path.rename(new_path)
# Changing suffix
config_file.with_suffix('.yml')
File Content I/O
config_path = Path("config.txt")
config_path.write_text("debug=True\nlog_level=INFO")
content = config_path.read_text()
binary_data_file = Path("binary_data.bin")
binary_data_file.write_bytes(b'\x01\x02\x03\x04')
data = binary_data_file.read_bytes()
print(f"Binary data: {data}")
directory iteration / traversal
# List
project_root.iterdir()
# Globbing
project_root.glob("*.py")
# Walking Directory Tree (Python 3.12+)
project_root.walk()
Use Context Managers (with open(...)) for File I/O
When you need more fine-grained control over file reading/writing, (streaming large files, specific encoding, or binary modes), use the with statement.
try:
with open("my_large_file.csv", "w", encoding="utf-8") as f:
f.write("Header1,Header2\n")
for i in range(1000):
f.write(f"data_{i},value_{i}\n")
except IOError as e:
print(f"Error writing file: {e}")
Use the tempfile
Module for Temporary Files/Directories
import tempfile
from pathlib import Path
# Using a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir_str:
tmp_dir = Path(tmp_dir_str)
temp_file = tmp_dir / "temp_report.txt"
temp_file.write_text("Ephemeral data.")
print(f"Created temporary file at: {temp_file}")
# At the end of the 'with' block, tmp_dir_str and its contents are deleted
print("Temporary directory removed.")
Use shutil for High-level Operations
shutil Focuses on operations that involing moving, copying, or deleting entire trees of files and directories, or other utility functions that go beyond a single Path obejct's scope.
import shutil
source_dir = Path("my_data")
destination_dir = Path("backup_data")
try:
shutil.copytree(source_dir, destination_dir)
print(f"Copied '{source_dir}' to '{destination_dir}'")
except FileExistsError:
print(f"Destination '{destination_dir}' already exists. Skipping copy.")
except Exception as e:
print(f"Error copying tree: {e}")
import shutil
from pathlib import Path
dir_to_delete = Path("backup_data") # Assuming this exists from the copytree example
if dir_to_delete.exists():
print(f"Deleting '{dir_to_delete}'...")
shutil.rmtree(dir_to_delete)
print("Directory deleted.")
else:
print(f"Directory '{dir_to_delete}' does not exist.")
Zip / Tarring
shutil even can create compressed archieves, and unpack them.
archive_path = shutil.make_archive(archive_name, 'zip', source_dir)
print(f"Created archive: {archive_path}")
Copy File Metadata
- shutil.copystat(src, dst) copy permission bits, last access time, last modification time and flags from one file to another
- shutil.copy2(src, dst) copies the file and metadata
Getting Disk Usage
usage = shutil.disk_usage(Path(".")) # Check current directory's disk
print(f"Total: {usage.total / (1024**3):.2f} GB")
print(f"Used: {usage.used / (1024**3):.2f} GB")
print(f"Free: {usage.free / (1024**3):.2f} GB")
Do not
- Avoid os.system() or subprocess.run() for file operations in most case