THE MODN CHRONICLES

Interview Prep

Interview Questions for Data Analyst — The 5 Rounds You'll Actually Face

Every guide gives you 50 random questions. Here is how data analyst interviews actually work — round by round, with the SQL queries, case studies, and business scenarios that decide who gets the offer.

Data analyst interviews follow a predictable structure — once you know the rounds, you can prepare for each one specifically.

How Data Analyst Interviews Actually Work

Most data analyst interviews follow a 3–5 round structure: recruiter screen, SQL test, case study, tools round, and a business/behavioral round. The exact combination depends on the company type. Indian analytics firms like Mu Sigma, Fractal, and Tiger Analytics run a fast batch-hiring process with SQL tests and guesstimates. Product companies like Flipkart and Swiggy go deeper — they test SQL live, add a statistics round, and evaluate business thinking. GCCs like Google, Amazon, and Microsoft India run the most rigorous process with 4–5 rounds covering everything from SQL to system design for analytics.

But across every company type, one thing is consistent: SQL is the gatekeeper. Fail the SQL round and nothing else matters. You will not get to the case study. You will not get to discuss your Tableau dashboards. For the vast majority of candidates who do not make it, the interview ends at SQL.

This guide walks you through each round in the order you will actually face them — with the exact questions, code, and frameworks that interviewers use to evaluate candidates. No random question lists. Just the rounds, in order, with what you need to pass each one.

SQL is the gatekeeper of every data analyst interview. If you cannot write a window function, you will not pass — no matter how good your Tableau dashboards are.

Round 1: The SQL Round (Make or Break)

This is the most important round in any data analyst interview. 90% of data analyst interviews test SQL, and most companies use it as the first technical filter. If you clear this round, you are in the top 30% of candidates. If you do not, the interview is over regardless of your other skills.

Q1: Find the top 3 products by revenue in each category

Why they ask: This tests window functions — the single most asked SQL concept in data analyst interviews. If you cannot write a window function, you will not pass. Every company, from Mu Sigma to Google, tests this.

SELECT category, product_name, revenue
FROM (
    SELECT category, product_name, revenue,
           ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) as rn
    FROM sales
) ranked
WHERE rn <= 3;

What separates good answers: Explaining why you chose ROW_NUMBER() over RANK() or DENSE_RANK() — ROW_NUMBER returns exactly 3 rows per category even when revenues tie (breaking ties arbitrarily), while RANK can return more than 3 because tied rows share the same rank. The interviewer wants to see that you understand the difference and can choose deliberately.
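The tie-handling difference is easy to demonstrate with SQLite's window functions, available through Python's built-in sqlite3 module (SQLite 3.25+). The category and revenue values below are invented purely for illustration:

```python
# ROW_NUMBER vs RANK on tied revenues — illustrative data, not from the article
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, product_name TEXT, revenue INT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("toys", "A", 100), ("toys", "B", 90),
    ("toys", "C", 80), ("toys", "D", 80),   # C and D tie at rank 3
    ("toys", "E", 70),
])

row_number_top3 = conn.execute("""
    SELECT product_name FROM (
        SELECT product_name,
               ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS rn
        FROM sales
    ) ranked WHERE rn <= 3
""").fetchall()

rank_top3 = conn.execute("""
    SELECT product_name FROM (
        SELECT product_name,
               RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS rnk
        FROM sales
    ) ranked WHERE rnk <= 3
""").fetchall()

print(len(row_number_top3))  # 3 — ties broken arbitrarily, exactly 3 rows
print(len(rank_top3))        # 4 — C and D both get rank 3, so 4 rows qualify
```

If the business question is "every product that ties for top 3", RANK or DENSE_RANK is the deliberate choice; if the report needs exactly 3 rows, ROW_NUMBER is.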

Q2: Calculate month-over-month revenue growth percentage

Why they ask: This tests LAG/LEAD functions and percentage calculations — the bread and butter of business reporting. Every analyst builds MoM and YoY reports. If you cannot write this query, you cannot do the job.

SELECT month,
       revenue,
       LAG(revenue) OVER (ORDER BY month) as prev_month,
       ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0 / 
             LAG(revenue) OVER (ORDER BY month), 2) as growth_pct
FROM monthly_revenue;

Follow-up trap: “What happens in the first row where there is no previous month?” Answer: LAG returns NULL, and dividing by NULL gives NULL — not an error. A good candidate adds COALESCE or filters out the first row. The interviewer is checking if you think about edge cases in your data.
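The edge case is easy to verify by running the query against SQLite from Python (LAG requires SQLite 3.25+); the monthly_revenue figures here are invented:

```python
# First-row edge case for LAG: growth_pct is NULL, not a division error.
# Table contents are made-up illustration data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_revenue (month TEXT, revenue REAL)")
conn.executemany("INSERT INTO monthly_revenue VALUES (?, ?)",
                 [("2024-01", 100.0), ("2024-02", 120.0), ("2024-03", 90.0)])

rows = conn.execute("""
    SELECT month,
           ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0 /
                 LAG(revenue) OVER (ORDER BY month), 2) AS growth_pct
    FROM monthly_revenue
""").fetchall()

print(rows[0])  # ('2024-01', None) — NULL propagates through the arithmetic
print(rows[1])  # ('2024-02', 20.0)
print(rows[2])  # ('2024-03', -25.0)
```

Wrapping the expression in COALESCE(…, 0) or adding WHERE growth_pct IS NOT NULL handles the first row explicitly, which is exactly the edge-case thinking the interviewer is probing for.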

Q3: Find customers who made purchases in January but not in February

Why they ask: This tests subqueries and set operations. The interviewer wants to see if you think in sets, not loops. Candidates who come from a programming background often try to write procedural logic — that is a red flag in a SQL interview.

SELECT DISTINCT customer_id
FROM orders
WHERE EXTRACT(MONTH FROM order_date) = 1
  AND customer_id NOT IN (
      SELECT customer_id FROM orders
      WHERE EXTRACT(MONTH FROM order_date) = 2
  );

Bonus points: Mention that NOT IN can behave unexpectedly with NULLs — if the subquery returns any NULL customer_id, the entire NOT IN returns no results. A safer alternative is NOT EXISTS or a LEFT JOIN with a NULL check. This level of SQL awareness impresses interviewers.
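You can reproduce the trap in a few lines with Python's built-in sqlite3 (SQLite has no EXTRACT, so this sketch uses strftime instead; the orders data is invented):

```python
# The NOT IN NULL trap, reproduced on illustrative data
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INT, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    (1, "2024-01-05"),                      # bought in January only
    (2, "2024-01-10"), (2, "2024-02-03"),   # bought in both months
    (None, "2024-02-15"),                   # NULL customer_id in February
])

# NOT IN: one NULL in the subquery makes every comparison UNKNOWN -> zero rows
not_in = conn.execute("""
    SELECT DISTINCT customer_id FROM orders
    WHERE strftime('%m', order_date) = '01'
      AND customer_id NOT IN (
          SELECT customer_id FROM orders
          WHERE strftime('%m', order_date) = '02')
""").fetchall()

# NOT EXISTS compares row by row, so the February NULL cannot poison the result
not_exists = conn.execute("""
    SELECT DISTINCT o.customer_id FROM orders o
    WHERE strftime('%m', o.order_date) = '01'
      AND NOT EXISTS (
          SELECT 1 FROM orders f
          WHERE f.customer_id = o.customer_id
            AND strftime('%m', f.order_date) = '02')
""").fetchall()

print(not_in)      # [] — the NULL wipes out the entire result set
print(not_exists)  # [(1,)] — customer 1 bought in Jan but not Feb
```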

SQL Round Format by Company Type

Indian analytics firms (Mu Sigma, Fractal) give you a SQL test on HackerRank with 5–8 questions in 60 minutes. Product companies (Flipkart, Meesho) ask you to write SQL live while explaining your approach. Practice both formats — timed tests build speed, live coding builds communication skills.

Round 2: Excel & Case Study

This round tests whether you can analyze a messy dataset and extract insights that matter to the business. The common format: you get a dataset (CSV or Excel) and 30–45 minutes to answer 3–4 business questions. No one tells you which columns to use or which analysis to run. You figure it out.

Example Case: Revenue Drop Investigation

“You are given 6 months of e-commerce order data. The CEO wants to know why revenue dropped 15% last month. Walk me through your analysis.”

The framework that works:

1. Segment by dimension — break revenue down by product category, geography, customer cohort, and acquisition channel. Identify which segment drove the drop.

2. Check for data quality issues — are there missing records, duplicate orders, or a tracking change that affected the numbers? Sometimes the “drop” is a data problem, not a business problem.

3. Compare against external factors — was there a holiday shift, a competitor sale, or a seasonal pattern that explains the change?

4. Present findings with a recommendation — do not just say “Category X dropped.” Say “Category X dropped 25% driven by a stockout in the top 3 SKUs. Recommendation: fix inventory forecasting for high-velocity items.”
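Step 1 of this framework — finding which segment drove the drop — can be sketched in plain Python; the category revenue figures below are made up for illustration:

```python
# Which segment drove the drop? Contribution of each category to the total
# month-over-month decline. Revenue figures are hypothetical.
last_month = {"electronics": 500, "fashion": 300, "grocery": 200}
this_month = {"electronics": 350, "fashion": 310, "grocery": 190}

total_drop = sum(last_month.values()) - sum(this_month.values())  # 150

# Share of the total drop per category (negative = the category actually grew)
contribution = {
    cat: (last_month[cat] - this_month[cat]) / total_drop
    for cat in last_month
}
print(contribution)  # electronics accounts for the entire net drop here
```

The same decomposition works for geography, cohort, or channel — whichever dimension concentrates the drop is where the root-cause investigation goes next.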

Indian companies specifically test pivot tables, VLOOKUP/INDEX-MATCH, and conditional formatting in this round. If you are not comfortable building a pivot table from scratch in under 2 minutes, practice until you are. These are not advanced skills — they are table stakes.

The Real Evaluation Criteria

The case study is not about finding the right answer — it is about showing a structured analytical approach. Interviewers evaluate your thinking process, not your final number. A candidate who methodically segments the data and finds the wrong root cause will score higher than a candidate who guesses the right answer without showing their work.

The case study round tests whether you can turn messy data into a clear business recommendation.

Round 3: Statistics & Probability

Not every company has this round, but product companies and GCCs do. Analytics firms sometimes skip it for entry-level roles. When it does appear, it focuses on practical statistics — not textbook proofs, but whether you can apply statistical thinking to real business problems.

Q: Explain the difference between Type I and Type II errors with a business example

What they want to hear: Use fraud detection as the example. A Type I error (false positive) means flagging a legitimate transaction as fraud — the customer gets blocked and calls support, which costs money and damages trust. A Type II error (false negative) means letting a fraudulent transaction through — the company loses money directly.

The business insight: The cost of each error type is different, and the business decides which one to minimize. A bank might tolerate more false positives (block legitimate users) to reduce fraud losses. An e-commerce platform might tolerate more fraud to avoid blocking real customers during a sale. The analyst needs to understand this tradeoff, not just the statistical definition.

Q: How would you determine if a new checkout flow actually improved conversion?

The answer framework: This is an A/B testing question. You need to cover: defining the metric (conversion rate), calculating the required sample size (based on baseline conversion, minimum detectable effect, and desired statistical power), running the test for a sufficient duration, and interpreting the results using statistical significance (p-value below 0.05) and confidence intervals.

Common trap: Candidates say “just compare the conversion rates.” Without statistical testing, a 2% vs 2.5% difference could be random noise. The interviewer wants to hear you talk about sample size, significance, and confidence intervals — not just “A is higher than B.”
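A minimal version of that significance check is a two-proportion z-test, sketched here with only the standard library (conversion counts are hypothetical):

```python
# Two-sided two-proportion z-test for the checkout A/B example.
# All counts below are invented for illustration.
from math import erf, sqrt

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# 2.0% vs 2.5% looks like a win, but with 2,000 users per arm it is noise
print(two_proportion_p_value(40, 2000, 50, 2000))       # ~0.29, not significant
# the same lift with 40,000 users per arm is convincingly real
print(two_proportion_p_value(800, 40000, 1000, 40000))  # far below 0.05
```

This is exactly the point the interviewer wants made: the same observed lift can be noise or signal depending entirely on sample size.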

Q: What is the Central Limit Theorem and why does it matter for data analysis?

Plain English answer: The Central Limit Theorem says that if your sample is large enough, the sample mean behaves like a draw from an approximately normal distribution — regardless of the shape of the underlying population. This matters because it lets us use normal-distribution tools (z-tests, confidence intervals) for sample means even when the raw data is heavily skewed.

Business application: When you calculate average order value from a sample of 1,000 orders, the CLT means that this average approximately follows a normal distribution. That lets you build confidence intervals and make statements like “we are 95% confident the true average order value is between ₹450 and ₹520.”
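A small simulation makes the point concrete: order values drawn from a skewed (exponential) distribution still yield a usable 95% interval for the mean. Everything here is simulated, not real order data:

```python
# CLT in practice: a 95% confidence interval for average order value
# computed from 1,000 simulated (heavily skewed) orders.
import random
from statistics import mean, stdev

random.seed(42)
orders = [random.expovariate(1 / 485) for _ in range(1000)]  # true mean ≈ ₹485

m, s = mean(orders), stdev(orders)
se = s / len(orders) ** 0.5               # standard error of the mean
ci_low, ci_high = m - 1.96 * se, m + 1.96 * se

print(f"95% CI for average order value: ₹{ci_low:.0f} – ₹{ci_high:.0f}")
```

Even though individual order values are far from normal, the interval around the sample mean is valid — that is the CLT doing the work.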

Statistics Round by Company Type

Indian product companies (Swiggy, PhonePe, CRED) test A/B testing knowledge heavily. If you cannot explain statistical significance, p-values, and confidence intervals in plain English, prepare for this specifically. GCCs go deeper — they may ask about Bayesian vs frequentist approaches, regression assumptions, or experimental design.

Round 4: Tools — Tableau, Power BI, or Python

This round varies by company. Some test Tableau or Power BI live — they give you a dataset and 30 minutes to build a dashboard. Others test Python (pandas) — they give you a messy CSV and ask you to clean, transform, and analyze it. The tool changes, but the evaluation is the same: can you go from raw data to insight quickly?

Tableau Question Example

“Build a dashboard showing regional sales performance with drill-down capability.” They give you a dataset and 30 minutes. The evaluation is not about making it pretty — it is about choosing the right chart types, building useful filters, and creating a layout that answers the business question without requiring explanation.

What impresses interviewers: Adding calculated fields (like YoY growth), using LOD expressions for complex aggregations, and building a dashboard that tells a story — not just displays numbers. A bar chart with a filter is entry-level. A dashboard with KPI cards, trend lines, and drill-down from region to city to store is what gets you the offer.

Python Question Example: Cohort Retention Analysis

“Clean this messy CSV and calculate customer retention by cohort.” This tests your pandas fluency and whether you understand retention as a business concept.

# Cohort retention analysis
import pandas as pd

df['order_date'] = pd.to_datetime(df['order_date'])  # ensure datetime dtype first
df['order_month'] = df['order_date'].dt.to_period('M')
# cohort = the month of each customer's first-ever order
df['cohort'] = df.groupby('customer_id')['order_date'].transform('min').dt.to_period('M')
# active customers per cohort per calendar month
cohort_data = df.groupby(['cohort', 'order_month']).agg(
    n_customers=('customer_id', 'nunique')
).reset_index()

What they evaluate: Can you handle messy data (missing values, duplicates, wrong data types) without being told? Do you know pandas well enough to write this without Googling every function? Can you explain what the cohort analysis shows and why it matters for the business?

Tools by Company Type

Analytics firms test Excel and SQL primarily. Product companies test Python (pandas). GCCs test both Python and SQL deeply. Know your target company's stack before the interview — preparing Tableau when the company uses Power BI wastes your limited preparation time.

Round 5: Business Scenarios (Where Most Candidates Fail)

This is the round that separates analysts from report generators. Technical skills get you to this round. Business thinking gets you the offer. Most candidates fail here because they give technically correct but business-irrelevant answers.

Q: Our app's daily active users dropped 20% this week. How would you investigate?

The framework: Clarify the metric (how is DAU defined — unique logins, unique sessions, unique actions?). Segment the data (did it drop across all platforms or just iOS? All geographies or just one city? All user cohorts or just new users?). Form hypotheses (app update bug, server outage, seasonal pattern, tracking code change). Propose analysis for each hypothesis. Recommend action based on findings.

Common mistake: Jumping to “let me check the data” without first clarifying what DAU means and segmenting the problem. The interviewer wants to see structured thinking before you touch any data.

Q: The marketing team wants to spend ₹50L on a new campaign. How would you measure its success?

The answer: Define success metrics before the campaign launches — not after. Primary metric: incremental revenue or customer acquisition attributable to the campaign. Secondary metrics: brand awareness, engagement rate, cost per acquisition. Set up a control group (geographic or time-based holdout) to measure incrementality. Calculate ROI: (incremental revenue - campaign cost) / campaign cost. Report results with confidence intervals, not just point estimates.

What impresses: Mentioning attribution challenges (last-click vs multi-touch), suggesting a holdout test design, and framing the answer in terms of business ROI rather than vanity metrics like impressions or clicks.
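The incrementality and ROI arithmetic from the answer above, sketched with hypothetical figures (a holdout group scaled to the size of the test group):

```python
# Campaign ROI against a holdout — all figures are hypothetical
campaign_cost = 50_00_000                 # ₹50L
test_revenue = 3_20_00_000                # revenue in campaign cities
holdout_revenue_scaled = 2_40_00_000      # holdout revenue, scaled to test size

incremental_revenue = test_revenue - holdout_revenue_scaled
roi = (incremental_revenue - campaign_cost) / campaign_cost
print(f"Incremental revenue: ₹{incremental_revenue:,}; ROI: {roi:.0%}")
```

Framing the answer this way — incremental revenue against a holdout, not total revenue during the campaign — is what separates measurement from attribution theater.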

Q: We are launching in a new city. What data would you need to decide which city to launch in first?

The answer: Market size (population, target demographic density, smartphone penetration), competition (existing players, market share, pricing), demand signals (search trends, waitlist signups, social media mentions), operational feasibility (logistics infrastructure, regulatory environment, talent availability), and financial projections (estimated CAC, LTV, break-even timeline).

Key insight: The interviewer is not looking for a perfect answer. They want to see that you can break a vague business question into specific, measurable data requirements. The candidates who fail this round list generic factors without explaining how they would actually measure or compare them.

Indian interviewers specifically test whether you can translate a vague business question into a structured analytical plan. The candidates who fail give technically correct but business-irrelevant answers.

How to Prepare — By Company Type

The preparation strategy depends entirely on your target company type. A one-size-fits-all approach wastes time. Here is a realistic plan for each:

Analytics Firms — Mu Sigma, Fractal, Tiger Analytics (2–3 weeks)

Focus on SQL (practice on HackerRank and LeetCode SQL — aim for 50+ problems), Excel case studies (download sample datasets and practice building pivot tables, charts, and summaries under time pressure), and guesstimates (market sizing questions like “how many ATMs are there in Mumbai?”). These companies hire in batches and the process is fast — typically 2–3 rounds completed in a week. Speed matters more than depth here.

Product Companies — Flipkart, Swiggy, Meesho, PhonePe (4–6 weeks)

SQL + Python (pandas) + A/B testing + business scenarios. The bar is higher and the process has 4–5 rounds. Practice writing SQL while explaining your approach out loud — product companies do live coding, not timed tests. Learn pandas well enough to clean and analyze a dataset without documentation. Study A/B testing deeply — sample size calculation, statistical significance, common pitfalls (peeking, novelty effects, Simpson's paradox). Prepare 5–6 business scenario answers using the framework: clarify metric → segment data → form hypotheses → propose analysis → recommend action.

GCCs — Google, Amazon, Microsoft India (6–8 weeks)

Everything from the product company prep, plus: statistics depth (probability distributions, hypothesis testing, regression analysis), system design for analytics (how would you design a real-time dashboard for a million daily users?), and behavioral rounds (STAR method answers for leadership principles). The acceptance rate is very low — prepare for 4–5 rounds with potential rejection after any round. Mock interviews are essential at this level.

Practice With Real Interview Simulations

Reading questions is not the same as answering them under pressure. Practice with timed mock interviews that simulate real data analyst rounds — with feedback on your SQL, case study approach, and business thinking.

The best data analyst interview preparation is not memorizing SQL queries. It is practicing the full loop — from understanding the business question to writing the query to presenting the insight.

Data analyst interviews reward structured thinking, not raw intelligence. The candidates who get offers are the ones who can take a vague business question, break it into analytical steps, execute with the right tools, and present findings that drive decisions. Know your target company type, prepare for the specific rounds they run, and practice under realistic conditions. SQL gets you in the door. Business thinking gets you the offer.

Prepare for Your Data Analyst Interview

Practice with AI-powered mock interviews, get your resume ATS-ready, and walk into your next data analyst interview with confidence.

Free · AI-powered · Instant feedback