メインコンテンツまでスキップ

Python File System Operations

· 約3分
Mikyan
白い柴犬
  • Use pathlib to handle file paths and operations by default
  • For finegraind read/write over file I/O (streaming), use context manager
  • Use the tempfile Module for Temporary Files/Directories
  • Use shutil for High-level Operations

Details

Use pathlib is the modern, oo way to handle file paths, and operations.

from pathlib import Path

# Create Path Objects:
my_file = Path("data")

# Use Path methods
my_file.exists()

# Content I/O

my_file.read_text()
#Path.write_text()
#Path.write_text()
#Path.write_bytes()

Create a Path Object

# From CWD
#
current_dir = Path.cwd()
# From Home
home_dir = Path.home()
# From absolute Paths
abs_path = Path("/usr/local/bin/python")
# From relative paths (relative to CWD)
relative_path = Path("data/input.csv")

# Create path by manipulation
base_dir = Path('/opt/my_app')
config_file = base_dire / "config" / "settings.yaml"

parent_dir = config_file.parent

Dealing with file name

# Get file / directory name
config_file.name

# Getting Stem
config_file.stem # settings

# Getting suffix
config_file.suffix
config_file.suffixes

# Get absolute path
config_file.resolve()
# or
config_file.absolute()

# Get relative path
relative_to = config_file.relative_to(project_root)

Check / Query File System

my_file.exists()

my_file.is_file()
my_file.is_dir()
my_file.is_symlink()

# Statistics
stats = temp_file.stat()

Operations

# Create directories
new_dir.mkdir()

# create empty file
empty_file.touch()

# delete file

file_to_delete.unlink()


# delete empty directories

empty_folder.rmdir()

# rename / move file or directories

old_path.rename(new_path)

# Changing suffix
config_file.with_suffix('.yml')

File Content I/O

config_path = Path("config.txt")
config_path.write_text("debug=True\nlog_level=INFO")
content = config_path.read_text()

binary_data_file = Path("binary_data.bin")
binary_data_file.write_bytes(b'\x01\x02\x03\x04')
data = binary_data_file.read_bytes()
print(f"Binary data: {data}")

directory iteration / traversal

# List
project_root.iterdir()

# Globbing
project_root.glob("*.py")

# Walking Directory Tree (Python 3.12+)
project_root.walk()

Use Context Managers (with open(...)) for File I/O

When you need more fine-grained control over file reading/writing, (streaming large files, specific encoding, or binary modes), use the with statement.

try:
with open("my_large_file.csv", "w", encoding="utf-8") as f:
f.write("Header1,Header2\n")
for i in range(1000):
f.write(f"data_{i},value_{i}\n")
except IOError as e:
print(f"Error writing file: {e}")

Use the tempfile Module for Temporary Files/Directories

import tempfile
from pathlib import Path

# Using a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir_str:
tmp_dir = Path(tmp_dir_str)
temp_file = tmp_dir / "temp_report.txt"
temp_file.write_text("Ephemeral data.")
print(f"Created temporary file at: {temp_file}")
# At the end of the 'with' block, tmp_dir_str and its contents are deleted
print("Temporary directory removed.")

Use shutil for High-level Operations

shutil Focuses on operations that involing moving, copying, or deleting entire trees of files and directories, or other utility functions that go beyond a single Path obejct's scope.

import shutil

source_dir = Path("my_data")
destination_dir = Path("backup_data")


try:
shutil.copytree(source_dir, destination_dir)
print(f"Copied '{source_dir}' to '{destination_dir}'")
except FileExistsError:
print(f"Destination '{destination_dir}' already exists. Skipping copy.")
except Exception as e:
print(f"Error copying tree: {e}")
import shutil
from pathlib import Path

dir_to_delete = Path("backup_data") # Assuming this exists from the copytree example

if dir_to_delete.exists():
print(f"Deleting '{dir_to_delete}'...")
shutil.rmtree(dir_to_delete)
print("Directory deleted.")
else:
print(f"Directory '{dir_to_delete}' does not exist.")

Zip / Tarring

shutil even can create compressed archieves, and unpack them.

archive_path = shutil.make_archive(archive_name, 'zip', source_dir)
print(f"Created archive: {archive_path}")

Copy File Metadata

  • shutil.copystat(src, dst) copy permission bits, last access time, last modification time and flags from one file to another
  • shutil.copy2(src, dst) copies the file and metadata

Getting Disk Usage

usage = shutil.disk_usage(Path(".")) # Check current directory's disk
print(f"Total: {usage.total / (1024**3):.2f} GB")
print(f"Used: {usage.used / (1024**3):.2f} GB")
print(f"Free: {usage.free / (1024**3):.2f} GB")

Do not

  • Avoid os.system() or subprocess.run() for file operations in most case