Session-14: File Handling in Python
Read, Write, CSV, JSON & Context Managers — A Practical Guide
Files are where your programs persist data—logs, configs, user data, analytics, you name it. In this session, you’ll learn how to safely read and write files, process CSV and JSON, and avoid common pitfalls.
Table of Contents
- [Why File Handling?](#why-file-handling)
- [Opening Files: Modes & Encodings](#opening-files-modes--encodings)
- [Reading Files (Text)](#reading-files-text)
- [Writing & Appending](#writing--appending)
- [The with Statement (Context Managers)](#the-with-statement-context-managers)
- [Working with CSV](#working-with-csv)
- [Working with JSON](#working-with-json)
- [Paths, Existence & Errors](#paths-existence--errors)
- [Best Practices & Pitfalls](#best-practices--pitfalls)
- [Mini-Project: Simple Log Analyzer](#mini-project-simple-log-analyzer)
- [FAQs](#faqs)
- [Summary & Next Steps](#summary--next-steps)
Why File Handling?
- Persist data across runs (e.g., configs, user data).
- Exchange data with other systems via CSV/JSON.
- Log activity and errors for debugging & audits.
Opening Files: Modes & Encodings
Common modes
- 'r' — read (default)
- 'w' — write (truncate/create)
- 'a' — append (create if it doesn't exist)
- 'x' — create, fail if it exists
- Add 'b' for binary: e.g., 'rb' (images, PDFs)
- Add '+' for read/write: e.g., 'r+', 'w+'
Encoding: Use encoding="utf-8" for text files (it handles most languages).
Python
# Open a file for reading text with UTF-8
f = open("notes.txt", mode="r", encoding="utf-8")
content = f.read()
f.close()
Tip: Prefer the with statement to auto-close files (see below).
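As a quick illustration of two of the less common modes above, the sketch below uses 'x' (create-only) and 'rb' (binary read). The filename fresh.txt is just an example:

```python
import os

# 'x' creates the file and raises FileExistsError if it already exists --
# useful when silently overwriting would be a bug.
try:
    with open("fresh.txt", "x", encoding="utf-8") as f:
        f.write("created exactly once\n")
except FileExistsError:
    print("fresh.txt already exists; not overwriting.")

# Binary mode ('rb'/'wb') skips text decoding entirely -- required for
# images, PDFs, and other non-text data.
with open("fresh.txt", "rb") as f:
    raw = f.read()          # bytes, not str

print(type(raw).__name__)   # bytes
os.remove("fresh.txt")      # clean up the demo file
```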
Reading Files (Text)
Python
# Read entire file
with open("notes.txt", "r", encoding="utf-8") as f:
    text = f.read()

# Read line by line (memory-friendly)
with open("notes.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line.strip())

# Read all lines into a list
with open("notes.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()  # includes newline characters
Writing & Appending
Python
# Overwrite (or create) a file
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("First line\n")
    f.write("Second line\n")

# Append to an existing file
with open("output.txt", "a", encoding="utf-8") as f:
    f.write("Appended line\n")

# Write multiple lines
lines = ["alpha\n", "beta\n", "gamma\n"]
with open("list.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)
The with Statement (Context Managers)
with ensures the file is closed even if an exception occurs.
Python
from pathlib import Path

path = Path("data/info.txt")
path.parent.mkdir(parents=True, exist_ok=True)  # ensure folder exists

with open(path, "w", encoding="utf-8") as f:
    f.write("Safe write with context manager.\n")
Working with CSV
Use Python’s built-in csv module for spreadsheet-like data.
Python
import csv

# Write CSV (list of dicts)
rows = [
    {"name": "Hitesh", "role": "Engineer", "score": 92},
    {"name": "Anita", "role": "Analyst", "score": 88},
]

with open("people.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "role", "score"])
    writer.writeheader()
    writer.writerows(rows)

# Read CSV
with open("people.csv", "r", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["name"], row["score"])
Note: newline="" prevents extra blank lines on Windows when writing CSVs.
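DictReader/DictWriter are usually the clearest choice, but the csv module also works with plain lists. A minimal sketch (scores.csv and the sample rows are illustrative):

```python
import csv

# Write rows as plain lists, header row first
with open("scores.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "score"])
    writer.writerows([["Hitesh", 92], ["Anita", 88]])

# Read them back; note every field comes back as a string,
# so numbers need an explicit conversion.
with open("scores.csv", "r", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    header = next(reader)  # skip the header row
    rows = [(name, int(score)) for name, score in reader]

print(rows)  # [('Hitesh', 92), ('Anita', 88)]
```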
Working with JSON
Use json for structured data (APIs, configs, settings).
Python
import json

config = {
    "app": "blogtools",
    "version": "1.0.0",
    "features": ["slugify", "reading_time"],
    "debug": True
}

# Write JSON (pretty)
with open("config.json", "w", encoding="utf-8") as f:
    json.dump(config, f, ensure_ascii=False, indent=2)

# Read JSON
with open("config.json", "r", encoding="utf-8") as f:
    data = json.load(f)

# Convert between Python object <-> JSON string
json_str = json.dumps(config)   # to string
config2 = json.loads(json_str)  # from string
Tip: Use ensure_ascii=False to keep non-English characters readable.
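To see what ensure_ascii actually changes, compare the two dumps in this tiny sketch:

```python
import json

city = {"city": "München"}

escaped = json.dumps(city)                      # default: ensure_ascii=True
readable = json.dumps(city, ensure_ascii=False)

print(escaped)   # {"city": "M\u00fcnchen"}
print(readable)  # {"city": "München"}

# Both forms parse back to the same data
assert json.loads(escaped) == json.loads(readable)
```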
Paths, Existence & Errors
Use pathlib for clean, cross-platform paths.
Python
from pathlib import Path

p = Path("data") / "inputs" / "file.txt"

if p.exists():
    print("File exists:", p.resolve())
else:
    print("Not found:", p)

# Safely create directories
p.parent.mkdir(parents=True, exist_ok=True)
Handling exceptions
Python
try:
    with open("missing.txt", "r", encoding="utf-8") as f:
        content = f.read()
except FileNotFoundError:
    print("The file does not exist.")
except PermissionError:
    print("Permission denied.")
except OSError as e:
    print("OS error:", e)
Best Practices & Pitfalls
Do:
- Use with open(...) to ensure closing.
- Specify encoding="utf-8" for text files.
- Use csv.DictReader/DictWriter for clarity.
- Use json.dump(..., indent=2) for human-readable configs.
- Prefer pathlib.Path over raw strings for paths.
Avoid:
- Reading huge files fully into memory—iterate line by line.
- String concatenation for paths (use Path).
- Silent failures—log/raise meaningful errors.
- Writing partial files—consider writing to a temp file then renaming for critical ops.
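The "temp file then rename" idea from that last point can be sketched like this. The helper name and summary.json are illustrative, not a standard API:

```python
import json
import os
import tempfile

def atomic_write_json(data: dict, path: str) -> None:
    """Write JSON to a temp file in the same directory, then atomically
    replace the target, so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file must be on the same filesystem for the rename to be atomic
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp_path, path)  # atomic on both POSIX and Windows
    except BaseException:
        os.remove(tmp_path)         # don't leave temp debris on failure
        raise

atomic_write_json({"status": "ok"}, "summary.json")
```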
Performance Tips:
- For large reads, process in chunks or stream line-by-line.
- Use binary modes ('rb'/'wb') for non-text (images, PDFs).
- Buffering is automatic; rarely tweak it unless profiling suggests otherwise.
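Chunked streaming from the first tip can be sketched like this (the 64 KB chunk size and the big.bin demo file are arbitrary choices):

```python
# Stream a binary file in fixed-size chunks instead of reading it all at once.
def iter_chunks(path, chunk_size=64 * 1024):
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            yield chunk

# Demo: create a 200 KB file, then total its size without
# ever holding the whole file in memory.
with open("big.bin", "wb") as f:
    f.write(b"x" * 200_000)

total = sum(len(chunk) for chunk in iter_chunks("big.bin"))
print(total)  # 200000
```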
Mini-Project: Simple Log Analyzer
Goal: Read an application log, compute stats, and write a summary JSON.
Sample app.log (create it):
2025-12-15 10:01:10 INFO User logged in
2025-12-15 10:03:22 ERROR Failed DB connection
2025-12-15 10:05:05 WARNING High memory usage
2025-12-15 10:06:42 INFO Page rendered
2025-12-15 10:08:00 ERROR Timeout on API
analyze_logs.py:
Python
from pathlib import Path
import json
from collections import Counter

def parse_log(path: Path) -> Counter:
    """
    Count log levels (INFO/WARNING/ERROR) from a simple
    space-delimited log.
    """
    level_counts = Counter()
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split()
            if len(parts) >= 3:
                level = parts[2]  # INFO, WARNING, ERROR
                level_counts[level] += 1
    return level_counts

def write_summary(summary: dict, out_path: Path) -> None:
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(summary, f, indent=2, ensure_ascii=False)

def main():
    log_path = Path("app.log")
    if not log_path.exists():
        print("Log file not found:", log_path)
        return

    counts = parse_log(log_path)
    total = sum(counts.values())
    summary = {
        "total_lines": total,
        "counts": dict(counts),
        "has_errors": counts.get("ERROR", 0) > 0
    }

    write_summary(summary, Path("reports/summary.json"))
    print("Summary written to reports/summary.json")

if __name__ == "__main__":
    main()
Run:
Shell
python analyze_logs.py
Output reports/summary.json:
JSON
{
  "total_lines": 5,
  "counts": {
    "INFO": 2,
    "ERROR": 2,
    "WARNING": 1
  },
  "has_errors": true
}
FAQs
Q1. What encoding should I use?
Use utf-8 unless you know the file’s specific encoding.
Q2. Why do I see extra blank lines in CSV on Windows?
Open the file with newline="" when writing with the csv module to avoid the extra blank lines.
Q3. When should I use JSON vs CSV?
- CSV: Tabular, spreadsheet-like data.
- JSON: Nested, hierarchical, or configuration data.
Q4. How do I read big files efficiently?
Iterate line-by-line (for line in f:), or process in chunks.
Summary & Next Steps
In this session, you learned:
- Safe open/read/write patterns for files.
- Processing CSV and JSON.
- Using context managers and pathlib.
- Handling errors and encodings.
Up next: Exception handling in depth (try/except/else/finally, custom exceptions), and logging best practices to complement file I/O.