Introduction: Complex SQL Interview Questions
Are you gearing up for a technical interview involving SQL? Understanding complex SQL concepts is crucial for acing those challenging interviews. In this comprehensive guide, we’ll delve into the depths of SQL complexity to equip you with the knowledge needed to navigate through technical interviews confidently.
First, you need to build strength on the basics of SQL. Then, you can progressively apply your knowledge to solving complex SQL interview questions.
In this journey don’t feel frustrated if you can’t solve complex SQL interview questions in your first or second attempt. Refresh your knowledge and keep trying.
Unveiling the Essence of SQL Complexity
Let us explore the significance of complex SQL concepts and Complex SQL Interview Questions in technical interviews and why they matter.
Key Topics You Must Master to clear complex SQL interview questions
- Subqueries and Nested Queries: Understanding the nuances and practical applications.
- Joins and Join Types: A detailed look at various join operations and their usage.
- Window Functions: Harnessing the power of window functions with practical examples.
- Common Table Expressions (CTEs): Differentiating CTEs from subqueries and their benefits.
Business Analyst Questions: Meesho SQL Assignment Round 1
These are Complex SQL Interview questions asked in the First Technical Round in Meesho. These are SQL Complex Interview Questions. To solve these questions you must have a revision of Subqueries in SQL, Joins, and Self Join in SQL, Window Functions in SQL Like Rank(), Dense_rank(), Lead(), and Lag() functions. Here are SQL Interview Questions including advanced SQL query examples with answers.
- Employee Tracker
A company records its employees’ Movement In and Out of the office on a table with 3 columns
(Employee ID, Action (In/Out), Created)
Data structure:
Employee ID Action Created
1 In 2019-04-01 12:00:00
1 Out 2019-04-01 15:00:00
1 In 2019-04-01 17:00:00
1 Out 2019-04-01 21:00:00
The first entry for each employee is “In”
“In” is succeeded by an “Out” and vice versa
No data gaps and, employees can work across days (eg. an employee can work for 3 days without leaving) These are SQL Complex Interview Questions.
Question:1.Find the number of employees inside the Office at the current time.
SOLUTION:
LOGIC APPLIED: Each “IN” Action is canceled by each “OUT” Action. If any “IN” is balanced. then it will count inside Employees.
The time frame is less than or equal to the Current Time.
Assumption: If we want the current number and our data is accurate (no duplicate INs and so on), then by aggregation we can get:
Complex SQL Interview Questions: SQL Query
SELECT SUM(CASE WHEN Action = 'In' THEN 1 Action = 'Out' THEN -1 ELSE Null END)
AS Num_Of_Emp_Inside
FROM Employee_Tracker
WHERE Created <= Now();
Question:2. Find the number of employees inside the Office at “2019-05-01 19:05:00”
Solution:
LOGIC APPLIED: Each “IN” Action is canceled by each “OUT” Action. If any “IN” is balanced. then it will count inside Employees.
and Given time Frame is less than or equal to “2019-05-01 19:05:00”
SQL Query in MYSQL:
SELECT SUM(CASE WHEN Action = ‘In’ THEN 1
WHEN Action = ‘Out’ THEN -1 ELSE Null END) Num_Of_Emp_Inside
FROM Employee_Tracker
WHERE Created <= “2019-05-01 19:05:00”;
Question: 3. a) Measure amount of hours spent by each employee inside the office since the day they started (Account for current shift if she/he is working)
Solution :
Assumptions: Assuming Total hours worked since the first day of action.
Logic Used: first inserted Row Number to track every in and out Action in the first common table Expression (cte) table. then in the second common table Expression (cte) table TIMESTAMPDIFF is used to calculate the number of hours worked. This is Asked in Complex SQL Interview Questions for Experienced Data Analysts or Data Engineers.
MySQL SQL Query :
WITH New_Table1 AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY Employee_Id,
Action ORDER BY Created) row_num
FROM Employee_Tracker),
New_Table2 AS (SELECT Employee_Id, row_num,
TIMESTAMPDIFF(HOUR,
MAX(CASE WHEN Action = 'In' THEN Created END),
MAX(CASE WHEN max(Created)=min(Created)
AND Action = 'Out' then "2019-04-02 10:00:00"
WHEN max(Created)<>min(Created)
AND Action = 'Out' THEN Created END)) AS num_hours
FROM New_Table1
GROUP BY Employee_Id, row_num)
SELECT Employee_Id, SUM(num_hours) AS total_num_hours
FROM New_Table2
GROUP BY Employee_Id;
Question: 3. b) Measure amount of hours spent by each employee inside the office between “2019-04-01 14:00:00” and “2019-04-02 10:00:00”
For each problem, the solution should summarize the logic used and the SQL query to execute it. Moderate-Level Questions in SQL Complex Interview Questions
Solution :
WITH New_Table1 AS (SELECT *,
ROW_NUMBER() OVER (PARTITION BY Employee_Id, Action ORDER BY Created) row_num
FROM Employee_Tracker
WHERE Created BETWEEN “2019-04-01 14:00:00” AND “2019-04-02 10:00:00”),
New_Table2 AS (SELECT Employee_Id, row_num,
TIMESTAMPDIFF(HOUR,
MAX(CASE WHEN Action = 'In' THEN Created END),
MAX(CASE WHEN max(Created)=min(Created) AND Action = 'Out' then "2019-04-02 10:00:00"
WHEN max(Created)<>min(Created) AND Action = 'Out' THEN Created END)) AS num_hours
FROM New_Table1
GROUP BY Employee_Id, row_num)
SELECT Employee_Id, SUM(num_hours) AS total_num_hours
FROM New_Table2
GROUP BY Employee_Id;
Question 4.) Ranker System Question in SQL Complex Interview Questions
A company wants to build a recommendation system for its users, they plan to show 50 products to the customers that they are most likely to buy (highest score). To improve diversity they want to keep products from all categories 5, in the system
Given table with columns user_id, catalog_id, score, category
Write a query that gives 50 catalogs, the top 10 for each category for each user.
Solution:
SQL Query:
WITH ranking_table AS (SELECT catalog_id,user_id, score, category,
ROW() OVER (Partition by user_id, Category order by Score DESC) AS rnk
FROM Table);
SELECT user_id,category,catalog_id FROM ranking_table
WHERE rnk<=10
Question: 5) Retention Cohort-based Complex SQL Interview Questions
A company has the transaction data for its e-commerce site stored in a table that contains columns/fields (users_id,transaction_id, timestamp )
For users acquired in a given week (the first transaction for that user was in that week), the week starting 29th Sep 2022. Find the D+7, D+30, and D+60 retained transactors.
D+i → Retained transactors are users who transacted at least i days after their onboarding date
So for eg. if a user was onboarded on 31st Nov 2022 and did his 2nd transaction and last transaction on 1st Jan, that seller will be D+7 and D+30 retained but not D+60 retained
Write a query to get the desired output. And this is SQL Complex Interview Questions in This round
SQL Query for Retention Cohort
SELECT first_Day,
SUM(CASE WHEN Day_number >= 7 THEN 1 ELSE 0 END) AS "D+7",
SUM(CASE WHEN Day_number >=30 THEN 1 ELSE 0 END) AS "D+30",
SUM(CASE WHEN Day_number >=60 THEN 1 ELSE 0 END) AS "D+60",
FROM
(select a.user_id,a.Transaction_Day,b.first_Day as first_Day,
a.Transaction_Day-b.first_Day as Day_number
FROM
(
SELECT
user_id,
Date(timestamp) AS Transaction_Day
FROM Transaction_table
GROUP BY user_id,Date(timestamp)) a,
(SELECT
user_id,
min(Day(timestamp)) AS first_Day
FROM Transaction_table
GROUP BY user_id) b
where a.user_id=b.user_id ) as with_Day_number
group by first_Day
order by first_Day
Real Complex SQL Interview Questions of Bold Technology
Data Tables:
portal: Contains the list of all resume portals
portal_id | portal_code | portal_name |
1 | MPR | My Perfect Resume |
2 | RN | Resume Now |
3 | ZETY | Zety |
4 | LC | Live Career |
5 | GEN | Resume Genius |
6 | HELP | Resume Help |
portal_id – Unique identifier for the portal
portal_code – short code for the portal portal_name – Full portal name
user_registration: User-level data which contains one record per user. A new record is created for every new user registration on any portal
user_id | portal_id | registration_timestamp | subscription_flag | subscription_timestamp |
12323128 | 2 | 08/08/2020 09:27:44 | N | |
22324238 | 5 | 08/11/2020 22:07:11 | Y | 08/12/2020 02:27:26 |
37435434 | 5 | 09/19/2020 22:05:11 | Y | 09/20/2020 07:27:26 |
53232933 | 1 | 09/18/2020 23:48:04 | Y | 09/20/2020 19:27:26 |
13982456 | 1 | 09/21/2020 18:30:24 | N |
user_id – Unique identifier for the table which is the id of the user
registration_timestamp – Timestamp when the user registered
subscription_flag – if the user purchased a subscription or not
subscription_timestamp – timestamp when the user purchased a subscription
resume_doc: Contains all the resume documents created by users. Data complexity is at a resume document level, a user can create multiple resumes
user_id | resume_id | date_created | experience_years |
37435434 | 2398 | 09/19/2020 23:01:11 | 4 |
37435434 | 3274 | 10/08/2020 15:18:45 | 5 |
54027535 | 1529 | 08/15/2020 06:30:24 | 3 |
87082621 | 9466 | 10/08/2020 11:27:44 | 9 |
user_id – unique ID of the user
resume_id – document id for a resume (unique identifier)
date_created – date timestamp when a resume is created
experience_years – total years of experience added to the resume
SQL Query Questions Based On Given Tables
Question:01) What are the registrations every month on the ‘Resume Now’ portal for 2020?
Solution:
SELECT
MonthName(UR.registration_timestamp) || '-' || Year(UR.registration_timestamp) AS Month_Year,
COUNT(UR.USER_ID) AS Registration_count
FROM USER_REGISTRATION UR
JOIN
PORTAL P ON UR.PORTAL_ID = P.PORTAL_ID
WHERE
Year(UR.registration_timestamp) = 2020 AND P.portal_name = 'Resume Now'
GROUP BY 1;
Question:02) Which portal has the highest reg to sub rate for users registered in the last 30 days?
Solution:
WITH cte AS (
SELECT
portal_id AS portal,
COUNT(DISTINCT user_id) AS num_of_registrations,
COUNT(CASE WHEN subscription_flag = 'Y' THEN 1 ELSE NULL END) AS num_subs_count,
COUNT(CASE WHEN subscription_flag = 'Y' THEN 1 ELSE NULL END) / COUNT(DISTINCT user_id) AS subs_rate
FROM
user_registration
WHERE
DATEDIFF(CURDATE(), DATE(registration_timestamp)) = 30
GROUP BY 1
ORDER BY 4 DESC)
SELECT
portal
FROM cte
TOP 1;
Question: 3) Create a list of users who subscribed in 2020 on the Zety portal and get the experience_years on their first resume.
Solution:
WITH Zety_users AS (
SELECT
r.user_id AS Users,
MIN(rd.date_created) AS first_resume_building_time,
rd.experience_year AS Exp_years
FROM user_registration r
JOIN
portal p ON r.portal_id = p.portal_id
JOIN
resume_doc rd ON r.portal_id = rd.portal_id
WHERE
p.portal_code = 'ZETY'
AND YEAR(r.subscription_timestamp) = 2020
GROUP BY 1, 2 )
SELECT
Users
FROM
Zety_users
ORDER BY 1;
Hacker Rank SQL Questioned Asked In Swiggy Data Analyst Interview
Samantha interviews many candidates from different colleges using coding challenges and contests. Write a query to print the contest_id, hacker_id, name, and the sums of total_submissions, total_accepted_submissions, total_views, and total_unique_views for each contest sorted by contest_id. Exclude the contest from the result if all four sums are .
Note: A specific contest can be used to screen candidates at more than one college, but each college only holds screening contest.
The following tables hold interview data:
- Contests: The contest_id is the id of the contest, hacker_id is the ID of the hacker who created the contest, and the name is the name of the hacker.
- Colleges: The college_id is the id of the college, and contest_id is the id of the contest that Samantha used to screen the candidates.
- Challenges: The challenge_id is the id of the challenge that belongs to one of the contests whose contest_id Samantha forgot, and college_id is the id of the college where the challenge was given to candidates.
- View_Stats: The challenge_id is the id of the challenge, total_views is the number of times the challenge was viewed by candidates, and total_unique_views is the number of times the challenge was viewed by unique candidates.
- Submission_Stats: The challenge_id is the id of the challenge, total_submissions is the number of submissions for the challenge, and total_accepted_submission is the number of submissions that achieved full scores.
- Sample Input
- Contests Table: Colleges Table: Challenges Table:
- View_Stats Table: Submission_Stats Table:
- Sample Output
66406 17973 Rose 111 39 156 56 66556 79153 Angela 0 0 11 10 94828 80275 Frank 150 38 41 15
Solution: Complex SQL Interview Questions of Hacker Rank
select con.contest_id,
con.hacker_id,
con.name,
sum(total_submissions),
sum(total_accepted_submissions),
sum(total_views), sum(total_unique_views)
from contests con
join colleges col on con.contest_id = col.contest_id
join challenges cha on col.college_id = cha.college_id
left join
(select challenge_id, sum(total_views) as total_views, sum(total_unique_views) as total_unique_views
from view_stats group by challenge_id) vs on cha.challenge_id = vs.challenge_id
left join
(select challenge_id, sum(total_submissions) as total_submissions, sum(total_accepted_submissions) as total_accepted_submissions from submission_stats group by challenge_id) ss on cha.challenge_id = ss.challenge_id
group by con.contest_id, con.hacker_id, con.name
having sum(total_submissions)!=0 or
sum(total_accepted_submissions)!=0 or
sum(total_views)!=0 or
sum(total_unique_views)!=0
order by contest_id;
Conclusion
In mastering Complex SQL Interview Questions, you’ve embarked on a path toward interview success in 2024. Understanding intricate SQL concepts is pivotal for excelling in technical interviews. Armed with insights into subqueries, joins, window functions, and more, you’re better equipped to tackle challenging queries. These concepts are fundamental in showcasing your expertise and problem-solving abilities to potential employers. Keep practicing and applying these principles in real-world scenarios to reinforce your skills further.
Remember, thorough preparation and confidence in handling complex SQL Interview Questions are key to securing success in your interviews. Elevate your SQL proficiency, stand out in interviews, and pave the way for a promising career in 2024 and beyond.
Thanks for Your Support. I am Pramod Kumar A Data Analyst working with a Health and Wellness Company. I used to write articles based on SQL Complex Interview Questions, Data Analysis, Machine Learning, and Business Analysis Tools like Power BI. Bookmark this page for different upcoming articles for SQL join query interview questions, complex SQL queries asked in interviews, and more.
1 thought on “Complex SQL Interview Questions: Detailed Easy Explanation with 8 + examples”