Interview Prep
Interview Questions on Python for Experienced — GIL, Generators, Async/Await, Metaclasses & Memory Management
Experienced Python interviews go beyond syntax. They test your understanding of how Python works internally — GIL, memory management, metaclasses, and concurrency patterns. Here are the questions senior Python roles at product companies and data teams actually ask.

Senior Python interviews test how well you understand the language internals, not just the syntax.
Python for Experienced Developers
Product companies like Google, Microsoft, Flipkart, and Razorpay expect senior Python developers to explain why things work, not just how.
Data teams at companies like Swiggy, Zomato, and PhonePe test generators, multiprocessing, and memory optimization. ML teams ask about C extensions, profiling, and the GIL. Backend teams focus on async/await, design patterns, and system design with Python.
This guide covers 10 advanced Python questions for experienced developers — the questions that separate a 3-year Python developer from a 7-year one.
“Explain the GIL and when you would use threading vs multiprocessing vs asyncio” — this is the question that defines senior Python interviews.
Advanced Concepts
Q1: What are generators? How are they different from lists?
Generators = lazy iterators that produce values one at a time.
Lists = store ALL values in memory at once.
# List comprehension — all values in memory
squares_list = [x**2 for x in range(1_000_000)]
# Uses ~8MB of memory
# Generator expression — one value at a time
squares_gen = (x**2 for x in range(1_000_000))
# Uses ~120 bytes of memory
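You can verify the memory gap yourself with sys.getsizeof — a rough sketch, since exact byte counts vary by CPython version and platform:

```python
import sys

squares_list = [x**2 for x in range(1_000_000)]
squares_gen = (x**2 for x in range(1_000_000))

# The gap is several orders of magnitude regardless of version:
print(sys.getsizeof(squares_list))  # several MB
print(sys.getsizeof(squares_gen))   # a couple hundred bytes at most
```

Note that getsizeof measures only the container (the list of pointers, or the generator frame), not the int objects themselves — the real gap is even larger.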
# Generator function using yield
def fibonacci():
    a, b = 0, 1
    while True:
        yield a  # pauses here, returns value
        a, b = b, a + b
fib = fibonacci()
print(next(fib)) # 0
print(next(fib)) # 1
print(next(fib)) # 1
print(next(fib)) # 2
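Because generators are lazy, infinite sequences like this compose cleanly with itertools — for example, taking the first ten Fibonacci numbers without ever materializing the whole sequence:

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice consumes only as many values as requested
print(list(islice(fibonacci(), 10)))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```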
# send() method — send values INTO a generator
def accumulator():
    total = 0
    while True:
        value = yield total
        total += value
acc = accumulator()
next(acc) # initialize (returns 0)
acc.send(10) # returns 10
acc.send(20) # returns 30
# Generator pipeline pattern
def read_lines(file):
    with open(file) as f:  # ensure the file handle is closed
        for line in f:
            yield line.strip()

def filter_errors(lines):
    for line in lines:
        if "ERROR" in line:
            yield line

def extract_message(lines):
    for line in lines:
        yield line.split("ERROR:")[-1]
# Pipeline: read → filter → extract (memory efficient)
lines = read_lines("app.log")
errors = filter_errors(lines)
messages = extract_message(errors)
for msg in messages:
    print(msg)

Q2: What are metaclasses? When would you use them?
Metaclass = class of a class.
type is the default metaclass in Python.
# Everything in Python is an object:
# int is an instance of type
# str is an instance of type
# Your class is an instance of type (or your metaclass)
type(42) # <class 'int'>
type(int) # <class 'type'>
type(type) # <class 'type'>
# Simple metaclass example:
class ValidateMeta(type):
    def __new__(mcs, name, bases, namespace):
        # Enforce: all methods must have docstrings
        for key, value in namespace.items():
            if callable(value) and not key.startswith('_'):
                if not value.__doc__:
                    raise TypeError(
                        f"Method '{key}' in '{name}' must have a docstring"
                    )
        return super().__new__(mcs, name, bases, namespace)
class MyAPI(metaclass=ValidateMeta):
    def get_users(self):
        """Fetch all users."""  # OK — has docstring
        pass

    def delete_user(self):  # ERROR — no docstring
        pass
# Use cases for metaclasses:
# 1. ORM (Django models) — fields become DB columns
# 2. API validation — enforce method signatures
# 3. Singleton pattern — control instance creation
# 4. Automatic registration — register subclasses
# 5. Abstract base classes (abc.ABCMeta)
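As a sketch of use case 4, a metaclass can register every subclass automatically — names like PluginMeta here are illustrative, not from any real library:

```python
# Auto-registration: the metaclass records each subclass at definition time.
class PluginMeta(type):
    registry = {}

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        if bases:  # skip the abstract base class itself
            mcs.registry[name.lower()] = cls
        return cls

class Plugin(metaclass=PluginMeta):
    pass

class CsvExporter(Plugin):
    pass

class JsonExporter(Plugin):
    pass

print(sorted(PluginMeta.registry))  # ['csvexporter', 'jsonexporter']
```

Django models use the same trick: the ModelBase metaclass inspects class attributes at definition time to build the ORM mapping.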
# Rule: "If you're not sure you need a metaclass,
# you don't need one." — Tim Peters

Q3: What are descriptors? Explain __get__, __set__, __delete__.
Descriptors = objects that control attribute access.
@property is a descriptor under the hood.
# Descriptor protocol:
# __get__(self, obj, objtype) → called on attribute access
# __set__(self, obj, value) → called on assignment
# __delete__(self, obj) → called on deletion
# Custom descriptor for type validation:
class TypedField:
    def __init__(self, name, expected_type):
        self.name = name
        self.expected_type = expected_type

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"'{self.name}' must be {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        obj.__dict__[self.name] = value

    def __delete__(self, obj):
        del obj.__dict__[self.name]
class User:
    name = TypedField("name", str)
    age = TypedField("age", int)

    def __init__(self, name, age):
        self.name = name  # calls TypedField.__set__
        self.age = age
user = User("Alice", 30) # OK
user = User("Alice", "30") # TypeError!
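A quick sketch of why it matters which descriptor methods a class defines — a descriptor with only __get__ (non-data) can be shadowed by the instance __dict__, while one with __set__ (data, like TypedField above) cannot:

```python
class NonData:
    # Only __get__ — this is a non-data descriptor
    def __get__(self, obj, objtype=None):
        return "from descriptor"

class C:
    attr = NonData()

c = C()
print(c.attr)  # "from descriptor" — nothing in the instance dict yet

c.__dict__["attr"] = "from instance"
print(c.attr)  # "from instance" — instance dict wins over non-data descriptors
```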
# Data descriptor vs Non-data descriptor:
# Data descriptor: defines __get__ AND __set__ (or __delete__)
# Non-data descriptor: defines only __get__
# Data descriptors take priority over instance __dict__
# Non-data descriptors can be overridden by instance __dict__

Concurrency & GIL
Q4: What is the GIL? Why does Python have it?
GIL = Global Interpreter Lock
Only ONE thread can execute Python bytecode at a time.
Even on a 16-core machine, only 1 core runs Python code.
Why does the GIL exist?
- CPython's memory management (reference counting) is NOT thread-safe
- Without GIL, every ref count change would need a lock
- GIL is simpler and faster for single-threaded code
Impact:
┌──────────────────────────────────────────┐
│ CPU-bound tasks:                         │
│   Threading does NOT help (GIL blocks)   │
│   4 threads on 4 cores → still 1 core    │
│                                          │
│ I/O-bound tasks:                         │
│   Threading DOES help (GIL released)     │
│   While thread 1 waits for I/O,          │
│   thread 2 can execute Python code       │
└──────────────────────────────────────────┘
Workarounds:
1. multiprocessing — separate processes, each has own GIL
2. C extensions (NumPy) — release GIL during C code
3. asyncio — single thread, event loop for I/O
4. Cython — compile to C, release GIL explicitly
5. Python 3.13+ — experimental free-threaded mode (no GIL)
# Proof: threading doesn't help CPU-bound tasks
# (minimal timing harness; numbers below are illustrative)
import threading, time

def count():
    total = 0
    for _ in range(50_000_000):
        total += 1

start = time.perf_counter()
threads = [threading.Thread(target=count) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"{time.perf_counter() - start:.1f}s")

# Single thread: ~3.5 seconds
# Two threads: ~3.5 seconds (no improvement!)
# Two processes: ~1.8 seconds (real parallelism)

Q5: Threading vs Multiprocessing vs Asyncio — when to use which?
Three concurrency models in Python:
Threading — I/O-bound tasks
- GIL released during I/O (network, file, DB)
- Shared memory (easy data sharing)
- Use for: API calls, file I/O, database queries
Multiprocessing — CPU-bound tasks
- Separate processes, each has own GIL
- True parallelism on multiple cores
- Use for: data processing, ML training, image processing
Asyncio — high-concurrency I/O
- Single thread, event loop
- Thousands of concurrent connections
- Use for: web servers, websockets, many API calls
# Threading example (assumes `requests` is installed and `urls` is defined):
import threading
import requests

def fetch_url(url):
    response = requests.get(url)
    return response.text

threads = [threading.Thread(target=fetch_url, args=(url,))
           for url in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Multiprocessing example (heavy_computation is a placeholder):
from multiprocessing import Pool

def process_data(chunk):
    return heavy_computation(chunk)

with Pool(4) as pool:
    results = pool.map(process_data, data_chunks)
# Asyncio example (assumes `aiohttp` is installed and `urls` is defined):
import asyncio, aiohttp

async def fetch(session, url):
    async with session.get(url) as resp:
        return await resp.text()

async def main():
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)

asyncio.run(main())
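In practice, concurrent.futures wraps both threads and processes behind one Executor API, which is often the most maintainable answer for simple fan-out work. A minimal sketch — fetch_len here is a stand-in for a real I/O-bound call such as requests.get:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_len(n):
    # stand-in for an I/O-bound call; real code would do network I/O here
    return n * 2

# map() distributes work across the pool and preserves input order
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch_len, range(5)))
print(results)  # [0, 2, 4, 6, 8]
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor switches the same code from the I/O-bound to the CPU-bound model.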
# Decision tree:
# CPU-bound? → multiprocessing
# I/O-bound, few connections? → threading
# I/O-bound, many connections? → asyncio

Q6: How does async/await work in Python?
async/await = cooperative multitasking on a single thread.
# Core concepts:
# - Event loop: schedules and runs coroutines
# - Coroutine: async def function
# - await: suspends execution until result is ready
import asyncio
async def fetch_data(name, delay):
    print(f"Start {name}")
    await asyncio.sleep(delay)  # suspends, lets others run
    print(f"Done {name}")
    return f"{name} result"

async def main():
    # Sequential: 3 seconds total
    r1 = await fetch_data("A", 1)
    r2 = await fetch_data("B", 2)

    # Concurrent: 2 seconds total (max of delays)
    r1, r2 = await asyncio.gather(
        fetch_data("A", 1),
        fetch_data("B", 2)
    )
asyncio.run(main())
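A runnable sketch that proves the concurrency claim by timing gather — total elapsed time tracks the longest delay, not the sum:

```python
import asyncio, time

async def work(delay):
    await asyncio.sleep(delay)
    return delay

async def main():
    start = time.perf_counter()
    # gather runs both coroutines concurrently and preserves argument order
    results = await asyncio.gather(work(0.1), work(0.2))
    elapsed = time.perf_counter() - start
    print(results)        # [0.1, 0.2]
    print(elapsed < 0.3)  # True — ~max(delays), not 0.1 + 0.2

asyncio.run(main())
```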
# Real-world: fetching multiple URLs concurrently
import aiohttp
async def fetch_url(session, url):
    async with session.get(url) as response:
        return await response.json()

async def main():
    urls = ["https://api.example.com/1",
            "https://api.example.com/2",
            "https://api.example.com/3"]
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
        # All 3 requests run concurrently!
# Sequential: 3 requests × 200ms = 600ms
# Async: 3 requests concurrent = ~200ms
# Key: await only works inside async def
# Key: asyncio.run() starts the event loop
# Key: asyncio.gather() runs coroutines concurrently
Understanding Python internals is what separates senior developers from those who just know the syntax.
Memory & Internals
Q7: How does Python manage memory? What is garbage collection?
Python memory management:
1. Reference counting (primary)
2. Generational garbage collector (for cycles)
# Reference counting:
import sys
a = [1, 2, 3]
print(sys.getrefcount(a)) # 2 (a + getrefcount arg)
b = a # refcount = 3
c = a # refcount = 4
del b # refcount = 3
del c # refcount = 2
# When refcount reaches 0 → memory freed immediately
# Problem: circular references
class Node:
    def __init__(self):
        self.ref = None
a = Node()
b = Node()
a.ref = b # a → b
b.ref = a # b → a (circular!)
del a, b # refcount never reaches 0!
# Solution: generational garbage collector
import gc
# 3 generations (0, 1, 2)
# New objects start in gen 0
# Surviving objects promoted to gen 1, then gen 2
# Gen 0 collected most frequently
gc.collect() # force collection
gc.get_count() # (gen0, gen1, gen2) counts
gc.disable() # disable GC (rare, for performance)
# Memory pools (pymalloc):
# Objects < 512 bytes → pymalloc (fast, internal pools)
# Objects >= 512 bytes → system malloc
# Weak references (for caching):
import weakref
cache = weakref.WeakValueDictionary()
# Objects can be garbage collected even if in cache
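A short sketch of the WeakValueDictionary behavior — the cache entry disappears as soon as the last strong reference goes away (Session is an illustrative class, not from any library):

```python
import gc, weakref

class Session:
    pass

cache = weakref.WeakValueDictionary()
s = Session()
cache["user_1"] = s
print("user_1" in cache)  # True — a strong reference (s) still exists

del s                     # last strong reference gone
gc.collect()              # not strictly needed on CPython, but portable
print("user_1" in cache)  # False — entry vanished with the object
```

This is why weak references suit caches: the cache never keeps objects alive on its own.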
# __del__ method — called when object is destroyed
# Avoid: unpredictable timing, can prevent GC of cycles

Q8: What are __slots__? When should you use them?
__slots__ = restricts attributes to a fixed set.
Saves memory by NOT creating __dict__ per instance.
# Without __slots__ (default):
class PointDefault:
    def __init__(self, x, y):
        self.x = x
        self.y = y
p = PointDefault(1, 2)
p.z = 3 # OK — dynamic attributes allowed
print(p.__dict__) # {'x': 1, 'y': 2, 'z': 3}
# With __slots__:
class PointSlots:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y
p = PointSlots(1, 2)
p.z = 3 # AttributeError!
# No __dict__ → less memory per instance
# Memory comparison:
import sys
# Without slots: ~152 bytes per instance (includes __dict__)
# With slots: ~56 bytes per instance
# For 1 million instances:
# Without slots: ~152 MB
# With slots: ~56 MB (63% less memory!)
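You can approximate these numbers yourself with sys.getsizeof — a rough sketch, since exact sizes vary by CPython version; note the instance __dict__ must be counted separately from the instance itself:

```python
import sys

class PointDefault:
    def __init__(self, x, y):
        self.x, self.y = x, y

class PointSlots:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x, self.y = x, y

d = PointDefault(1, 2)
s = PointSlots(1, 2)

# The default instance pays for the object AND its __dict__
default_size = sys.getsizeof(d) + sys.getsizeof(d.__dict__)
slots_size = sys.getsizeof(s)  # no __dict__ to count

print(slots_size < default_size)  # True on CPython
```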
# When to use __slots__:
# ✓ Classes with millions of instances (data points, records)
# ✓ Performance-critical inner loops
# ✓ Known, fixed set of attributes
# Trade-offs:
# ✗ No dynamic attributes (can't add new attrs)
# ✗ No __dict__ (some libraries expect it)
# ✗ Multiple inheritance with conflicting slots is tricky
# ✗ Slightly more complex class definition
# Tip: Use dataclasses with slots (Python 3.10+):
from dataclasses import dataclass

@dataclass(slots=True)
class Point:
    x: float
    y: float

Design Patterns
Q9: Implement Singleton, Factory, and Observer patterns in Python.
# Pythonic design patterns (not Java-style!)
# 1. Singleton — using __new__
class Singleton:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance
a = Singleton()
b = Singleton()
print(a is b) # True
# Simpler Singleton: just use a module!
# config.py → module-level instance IS a singleton
# settings = Settings() # created once on import
# 2. Factory — function that returns different classes
import csv, io, json
import xml.etree.ElementTree as ET

class JSONParser:
    def parse(self, data): return json.loads(data)

class XMLParser:
    def parse(self, data): return ET.fromstring(data)

class CSVParser:
    def parse(self, data): return csv.reader(io.StringIO(data))
def get_parser(format_type):
    parsers = {
        "json": JSONParser,
        "xml": XMLParser,
        "csv": CSVParser,
    }
    parser_class = parsers.get(format_type)
    if not parser_class:
        raise ValueError(f"Unknown format: {format_type}")
    return parser_class()
parser = get_parser("json")
result = parser.parse('{"key": "value"}')
# 3. Observer — subject notifies observers on state change
class EventEmitter:
    def __init__(self):
        self._listeners = {}

    def on(self, event, callback):
        self._listeners.setdefault(event, []).append(callback)

    def emit(self, event, *args, **kwargs):
        for callback in self._listeners.get(event, []):
            callback(*args, **kwargs)
# Usage (send_email is a placeholder):
emitter = EventEmitter()
emitter.on("user_created", lambda user: print(f"Welcome {user}"))
emitter.on("user_created", lambda user: send_email(user))
emitter.emit("user_created", "Alice")
# Both callbacks fire

Q10: What are context managers? How do you create a custom one?
Context managers = __enter__ and __exit__ methods.
Used with the 'with' statement for resource management.
# Built-in example:
with open("file.txt") as f:
data = f.read()
# File is automatically closed, even if exception occurs
# Method 1: Class-based context manager
import time

class Timer:
    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.elapsed = time.time() - self.start
        print(f"Elapsed: {self.elapsed:.3f}s")
        return False  # don't suppress exceptions

with Timer() as t:
    time.sleep(1)
# Output: Elapsed: 1.001s
# Method 2: Decorator-based (simpler)
from contextlib import contextmanager

@contextmanager
def timer():
    start = time.time()
    try:
        yield  # code inside 'with' block runs here
    finally:
        # finally ensures the time prints even if the block raises
        elapsed = time.time() - start
        print(f"Elapsed: {elapsed:.3f}s")

with timer():
    time.sleep(1)
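Beyond single resources, contextlib.ExitStack manages a variable number of context managers in one with block — a small sketch using in-memory buffers so it runs anywhere:

```python
from contextlib import ExitStack
import io

# Everything entered via enter_context() is closed in reverse order on exit,
# even if an exception occurs partway through.
with ExitStack() as stack:
    buffers = [stack.enter_context(io.StringIO(f"data {i}")) for i in range(3)]
    contents = [b.read() for b in buffers]

print(contents)                        # ['data 0', 'data 1', 'data 2']
print(all(b.closed for b in buffers))  # True — all closed on exit
```

The same pattern works for opening an unknown number of files or acquiring several locks safely.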
# Use cases:
# 1. File handling (auto-close)
# 2. Database connections (auto-commit/rollback)
@contextmanager
def db_transaction(conn):
    try:
        yield conn.cursor()
        conn.commit()
    except Exception:
        conn.rollback()
        raise
# 3. Locks (auto-release)
# 4. Temporary state changes
import shutil, tempfile

@contextmanager
def temporary_directory():
    tmpdir = tempfile.mkdtemp()
    try:
        yield tmpdir
    finally:
        shutil.rmtree(tmpdir)

How to Prepare
Advanced Python — Priority by Role
Senior Backend
- async/await & asyncio
- Design patterns
- Performance profiling
- Context managers
- Type hints & protocols

Data Engineer
- Generators & itertools
- Multiprocessing
- Memory management
- __slots__ optimization
- Pandas internals

ML Engineer
- GIL & C extensions
- Profiling (cProfile)
- NumPy memory layout
- Multiprocessing pools
- Cython basics

Tech Lead
- Architecture patterns
- Code review patterns
- Testing strategies
- Metaclasses & descriptors
- Package design