Manu

Friday, 27 February 2026

Python Learning Session-12: File Handling in Python

 


Read, Write, CSV, JSON & Context Managers — Practical Guide

Files are where your programs persist data—logs, configs, user data, analytics, you name it. In this session, you’ll learn how to safely read and write files, process CSV and JSON, and avoid common pitfalls.


Table of Contents

  1. Why File Handling?
  2. Opening Files: Modes & Encodings
  3. Reading Files (Text)
  4. Writing & Appending
  5. The with Statement (Context Managers)
  6. Working with CSV
  7. Working with JSON
  8. Paths, Existence & Errors
  9. Best Practices & Pitfalls
  10. Mini-Project: Simple Log Analyzer
  11. FAQs
  12. Summary & Next Steps

Why File Handling?

  • Persist data across runs (e.g., configs, user data).
  • Exchange data with other systems via CSV/JSON.
  • Log activity and errors for debugging & audits.

Opening Files: Modes & Encodings

Common modes

  • 'r' — read (default)
  • 'w' — write (truncate/create)
  • 'a' — append (create if not exists)
  • 'x' — create, fail if exists
  • Add 'b' for binary: e.g., 'rb' (images, PDFs)
  • Add '+' for read/write: e.g., 'r+', 'w+'

Encoding: Use encoding="utf-8" for text files (it handles most languages).

Python

# Open a file for reading text with UTF-8
f = open("notes.txt", mode="r", encoding="utf-8")
content = f.read()
f.close()

Tip: Prefer the with statement to auto-close files (see below).
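For example, the 'x' mode from the list above refuses to clobber existing data, unlike 'w'. A small sketch, using a hypothetical fresh.txt:

```python
# 'x' creates the file, and raises FileExistsError if it already exists,
# which guards against accidentally overwriting data (unlike 'w').
try:
    with open("fresh.txt", "x", encoding="utf-8") as f:
        f.write("created safely\n")
except FileExistsError:
    print("File already exists; not overwriting.")
```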


Reading Files (Text)

Python

# Read entire file
with open("notes.txt", "r", encoding="utf-8") as f:
    text = f.read()

# Read line by line (memory-friendly)
with open("notes.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line.strip())

# Read all lines into a list
with open("notes.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()  # includes newline characters


Writing & Appending

Python

# Overwrite (or create) a file
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("First line\n")
    f.write("Second line\n")

# Append to an existing file
with open("output.txt", "a", encoding="utf-8") as f:
    f.write("Appended line\n")

# Write multiple lines
lines = ["alpha\n", "beta\n", "gamma\n"]
with open("list.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)


The with Statement (Context Managers)

with ensures the file is closed even if an exception occurs.

Python

from pathlib import Path

path = Path("data/info.txt")
path.parent.mkdir(parents=True, exist_ok=True)  # ensure folder exists

with open(path, "w", encoding="utf-8") as f:
    f.write("Safe write with context manager.\n")
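To see why with guarantees closing, note that it is roughly equivalent to this try/finally pattern (a sketch using a hypothetical info.txt):

```python
# Rough equivalent of: with open("info.txt", "w", encoding="utf-8") as f: ...
f = open("info.txt", "w", encoding="utf-8")
try:
    f.write("Closed even if this line raises.\n")
finally:
    f.close()  # runs whether or not an exception occurred

print(f.closed)  # True
```

The with statement packages this cleanup into one line, which is why it is the preferred idiom.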


Working with CSV

Use Python’s built-in csv module for spreadsheet-like data.

Python

import csv

# Write CSV (list of dicts)
rows = [
    {"name": "Hitesh", "role": "Engineer", "score": 92},
    {"name": "Anita", "role": "Analyst", "score": 88},
]
with open("people.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "role", "score"])
    writer.writeheader()
    writer.writerows(rows)

# Read CSV
with open("people.csv", "r", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["name"], row["score"])

Note: newline="" prevents extra blank lines on Windows when writing CSVs.


Working with JSON

Use json for structured data (APIs, configs, settings).

Python

import json

config = {
    "app": "blogtools",
    "version": "1.0.0",
    "features": ["slugify", "reading_time"],
    "debug": True
}

# Write JSON (pretty)
with open("config.json", "w", encoding="utf-8") as f:
    json.dump(config, f, ensure_ascii=False, indent=2)

# Read JSON
with open("config.json", "r", encoding="utf-8") as f:
    data = json.load(f)

# Convert between Python object <-> JSON string
json_str = json.dumps(config)   # to string
config2 = json.loads(json_str)  # from string

Tip: Use ensure_ascii=False to keep non-English characters readable.
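A quick comparison of the two settings, assuming a value with a non-ASCII character:

```python
import json

data = {"city": "München"}

# Default: non-ASCII characters are escaped
print(json.dumps(data))                      # {"city": "M\u00fcnchen"}

# ensure_ascii=False keeps them human-readable
print(json.dumps(data, ensure_ascii=False))  # {"city": "München"}
```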


Paths, Existence & Errors

Use pathlib for clean, cross-platform paths.

Python

from pathlib import Path

p = Path("data") / "inputs" / "file.txt"

if p.exists():
    print("File exists:", p.resolve())
else:
    print("Not found:", p)

# Safely create directories
p.parent.mkdir(parents=True, exist_ok=True)

Handling exceptions

Python

try:
    with open("missing.txt", "r", encoding="utf-8") as f:
        content = f.read()
except FileNotFoundError:
    print("The file does not exist.")
except PermissionError:
    print("Permission denied.")
except OSError as e:  # catch-all for other I/O failures
    print("OS error:", e)


Best Practices & Pitfalls

Do:

  • Use with open(...) to ensure closing.
  • Specify encoding="utf-8" for text files.
  • Use csv.DictReader/DictWriter for clarity.
  • Use json.dump(..., indent=2) for human-readable configs.
  • Prefer pathlib.Path over raw strings for paths.

Avoid:

  • Reading huge files fully into memory—iterate line by line.
  • String concatenation for paths (use Path).
  • Silent failures—log/raise meaningful errors.
  • Writing partial files—consider writing to a temp file then renaming for critical ops.
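The temp-file-then-rename idea can be sketched like this (atomic_write_json is a hypothetical helper, not a standard-library function):

```python
import json
import os
import tempfile

def atomic_write_json(data, dest):
    """Write JSON to dest without ever exposing a half-written file."""
    dir_name = os.path.dirname(os.path.abspath(dest))
    # Write to a temp file in the same directory (same filesystem),
    # so the final rename is an atomic swap.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp_path, dest)  # atomically move into place
    except BaseException:
        os.remove(tmp_path)  # clean up the temp file on failure
        raise

atomic_write_json({"status": "ok"}, "settings.json")
```

Readers of settings.json either see the old complete file or the new complete file, never a partial write.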

Performance Tips:

  • For large reads, process in chunks or stream line-by-line.
  • Use binary modes ('rb'/'wb') for non-text (images, PDFs).
  • Buffering is automatic; rarely tweak unless profiling suggests.
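Chunked streaming can be sketched as follows (sample.bin and the 64 KB chunk size are arbitrary choices for illustration):

```python
# Create a sample binary file for the demo
with open("sample.bin", "wb") as f:
    f.write(b"\x00" * 200_000)

CHUNK_SIZE = 64 * 1024  # 64 KB per read keeps memory usage flat

total = 0
with open("sample.bin", "rb") as f:
    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:  # b"" signals end of file
            break
        total += len(chunk)

print(total)  # 200000
```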

Mini-Project: Simple Log Analyzer

Goal: Read an application log, compute stats, and write a summary JSON.

Sample app.log (create it):

2025-12-15 10:01:10 INFO User logged in
2025-12-15 10:03:22 ERROR Failed DB connection
2025-12-15 10:05:05 WARNING High memory usage
2025-12-15 10:06:42 INFO Page rendered
2025-12-15 10:08:00 ERROR Timeout on API

analyze_logs.py:

Python

from pathlib import Path
import json
from collections import Counter


def parse_log(path: Path) -> Counter:
    """
    Count log levels (INFO/WARNING/ERROR) from a simple space-delimited log.
    """
    level_counts = Counter()
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split()
            if len(parts) >= 3:
                level = parts[2]  # INFO, WARNING, ERROR
                level_counts[level] += 1
    return level_counts


def write_summary(summary: dict, out_path: Path) -> None:
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(summary, f, indent=2, ensure_ascii=False)


def main():
    log_path = Path("app.log")
    if not log_path.exists():
        print("Log file not found:", log_path)
        return

    counts = parse_log(log_path)
    total = sum(counts.values())

    summary = {
        "total_lines": total,
        "counts": dict(counts),
        "has_errors": counts.get("ERROR", 0) > 0
    }

    write_summary(summary, Path("reports/summary.json"))
    print("Summary written to reports/summary.json")


if __name__ == "__main__":
    main()

Run:

Shell

python analyze_logs.py


Output reports/summary.json:

JSON

{
  "total_lines": 5,
  "counts": {
    "INFO": 2,
    "ERROR": 2,
    "WARNING": 1
  },
  "has_errors": true
}


FAQs

Q1. What encoding should I use?
Use utf-8 unless you know the file’s specific encoding.

Q2. Why do I see extra blank lines in CSV on Windows?
Open the file with newline="" when writing using csv to avoid extra blanks.

Q3. When should I use JSON vs CSV?

  • CSV: Tabular, spreadsheet-like data.
  • JSON: Nested, hierarchical, or configuration data.

Q4. How do I read big files efficiently?
Iterate line-by-line (for line in f:), or process in chunks.


Summary & Next Steps

In this session, you learned:

  • Safe open/read/write patterns for files.
  • Processing CSV and JSON.
  • Using context managers and pathlib.
  • Handling errors and encodings.

Up next: Exception handling in depth (try/except/else/finally, custom exceptions), and logging best practices to complement file I/O.