Strings + Text Processing
Immutability, slicing, formatting, and methods
Why it matters
Many interview problems involve text, so fluency with string operations saves time.
The idea
Strings are immutable sequences. Slicing produces new strings; you never modify a string in place. f-strings (f"...") are the modern way to format.
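A minimal sketch of what immutability means in practice. The variable names are illustrative only: assigning to an index fails, so you build a new string instead.

```python
s = "hello"

try:
    s[0] = "J"  # str does not support item assignment
except TypeError as exc:
    error = str(exc)
    print(error)

# Build a new string instead; the original is left untouched.
t = "J" + s[1:]
print(t)  # 'Jello'
print(s)  # 'hello'
```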
Try it
Slicing — s[start:stop:step]:
s = "abcdefghij"
print(s[2:5]) # 'cde'
print(s[:4]) # 'abcd'
print(s[-3:]) # 'hij'
print(s[::2]) # 'acegi'
print(s[::-1]) # 'jihgfedcba' (reverse)
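One detail worth seeing alongside the slices above: slice bounds are clamped to the string's length, while a single out-of-range index raises. The sketch below reuses the same s; the other names are illustrative.

```python
s = "abcdefghij"

tail = s[7:100]   # stop is clamped to len(s)
empty = s[50:]    # start past the end gives an empty string, no error
print(tail)       # 'hij'
print(repr(empty))  # ''

try:
    s[50]         # a single index is checked strictly
except IndexError:
    print("index out of range")

# A negative step walks backwards: start at index 8, stop before index 2.
backwards = s[8:2:-2]
print(backwards)  # 'ige'
```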
Common methods — split, strip, join, replace, find:
raw = " Hello, World, Python "
parts = [p.strip() for p in raw.split(",")]
print(parts) # ['Hello', 'World', 'Python']
print(",".join(parts)) # 'Hello,World,Python'
print("python" in raw.lower()) # True
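The snippet above covers split, strip, and join. A short sketch of the remaining two, replace and find, might look like this (the spacing in raw is assumed for the example):

```python
raw = " Hello, World, Python "

# replace returns a new string; raw itself is unchanged.
swapped = raw.replace("World", "There")
print(swapped.strip())  # 'Hello, There, Python'

# find returns the index of the first match, or -1 when there is none.
found = raw.find("World")
missing = raw.find("Java")
print(found)    # 8
print(missing)  # -1
```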
Formatting — f-strings are usually more readable than .format() and %:
name, score = "Ada", 0.875
print(f"{name:>10} | {score:6.1%}")
print(f"{name!r:<10} | binary: {42:08b}")
# Multi-line + expressions
print(f"""
sum: {2 + 3}
list: {[i*i for i in range(5)]}
""")
String methods you must know
| Category | Methods | What they do |
| --- | --- | --- |
| Clean | strip, lstrip, rstrip | remove leading and/or trailing whitespace |
| Case | lower, upper, title, capitalize | change case |
| Search | find, index, count | locate and count |
| Match | startswith, endswith | prefix/suffix checks |
| Split/Join | split, rsplit, splitlines, 'sep'.join(...) | tokenizing and joining |
| Replace | replace | substitution |
| Check | isalnum, isalpha, isdigit, isnumeric, isspace | validation |
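A few rows of the table in action, as a hedged sketch. The filename value is made up for illustration.

```python
filename = "report_2024.csv"

# Match: endswith accepts a tuple of candidate suffixes.
print(filename.startswith("report"))        # True
print(filename.endswith((".csv", ".tsv")))  # True

# Search: count occurrences of a substring.
print(filename.count("_"))  # 1

# Split: rsplit with maxsplit=1 splits once, starting from the right.
stem, ext = filename.rsplit(".", 1)
print(stem, ext)  # report_2024 csv

# Check: validation methods return booleans.
print("2024".isdigit())  # True
print(stem.isalnum())    # False (the underscore is not alphanumeric)

# splitlines breaks on line boundaries.
lines = "line one\nline two".splitlines()
print(lines)  # ['line one', 'line two']
```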
name = " Ada Lovelace "
print(name.strip().lower()) # 'ada lovelace'
print("data-science".split("-")) # ['data', 'science']
print("-".join(["a", "b"])) # 'a-b'
Palindrome normalization pattern
A reusable recipe: keep only alphanumeric characters and lowercase them, then compare against the reverse.
def normalize(s: str) -> str:
    return "".join(ch.lower() for ch in s if ch.isalnum())
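To complete the recipe, here is one way the comparison step could look. The is_palindrome helper is a hypothetical name, not part of the original lesson; normalize is repeated so the block runs on its own.

```python
def normalize(s: str) -> str:
    # Keep only letters and digits, lowercased.
    return "".join(ch.lower() for ch in s if ch.isalnum())


def is_palindrome(s: str) -> bool:
    # Compare the cleaned text against its reverse.
    cleaned = normalize(s)
    return cleaned == cleaned[::-1]


print(is_palindrome("A man, a plan, a canal: Panama"))  # True
print(is_palindrome("Hello, World"))                    # False
```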
Quick check
- Q: Are strings mutable? A: No.
Mini drills
- Reverse a string two ways.
- Count vowels in a sentence.
- Remove punctuation and lowercase.
Do's and don'ts
- Do use s.strip() before comparisons.
- Don't build strings with + in loops.
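Both rules can be sketched in a few lines. The word list and the sample input are made up for illustration; the concatenation loop is shown only as the pattern to avoid.

```python
words = ["alpha", "beta", "gamma"]

# Don't: each += may copy everything built so far.
slow = ""
for w in words:
    slow += w + ","
slow = slow.rstrip(",")

# Do: collect the pieces, then join once.
fast = ",".join(words)
print(fast)          # 'alpha,beta,gamma'
print(slow == fast)  # True

# Do: strip (and usually lowercase) before comparing.
answer = " Yes \n"
print(answer == "yes")                  # False
print(answer.strip().lower() == "yes")  # True
```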
Going deeper — bytes vs str
Text and binary are different types in Python 3:
- str = text (Unicode)
- bytes = raw binary
Encode to go from text to bytes, decode to come back:
b = "hello".encode("utf-8")
text = b.decode("utf-8")
Common mistakes
- Mistake: Trying to modify s[0]. Fix: Build a new string.
- Mistake: Using + in a loop. Fix: Use "".join(...).
- Mistake: Using find without checking for -1. Fix: Test the result before using it as an index.
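The find mistake is easy to miss because -1 is a valid index. A small sketch, with an illustrative input string, shows the silent failure and two ways around it:

```python
s = "key=value"

pos = s.find(":")   # -1: there is no colon in s
wrong = s[:pos]     # -1 is treated as "up to the last character"
print(wrong)        # 'key=valu'

# Safer: test the result first.
pos = s.find("=")
if pos != -1:
    key, value = s[:pos], s[pos + 1:]
    print(key, value)  # key value

# Or use index, which raises ValueError instead of returning -1.
try:
    s.index(":")
except ValueError:
    print("separator not found")
```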
Key takeaways
- Strings are immutable: s.upper() returns a new string and does not mutate s.
- Slicing creates a new string; [::-1] reverses, [::2] takes every other character.
- f-strings can embed expressions and format specs ({x:>10.2f}, {x!r}, {x:08b}).
- str.split() / "sep".join(iterable) is the split/join idiom.
- Use .join for performance; normalize text before comparisons.