SQL for data analysts: Essential queries for data extraction and transformation


Introduction
Data analysts need to work with large amounts of information stored in databases. Before they can create reports or gain insights, they must first pull the relevant data and prepare it for use. That's where SQL (Structured Query Language) comes in. SQL is a tool that helps data analysts retrieve data, clean it, and organize it in the desired format.
In this article, we will look at the most important SQL queries that every data analyst should know.
1. Selecting data with SELECT
The SELECT statement is the foundation of SQL. You can select specific columns or use * to retrieve all available fields.
SELECT name, age, salary FROM employees;
This query retrieves only the name, age, and salary columns from the employees table.
2. Filtering data with WHERE
The WHERE clause narrows the rows to those that meet your conditions. It supports comparison and logical operators to create specific filters.
SELECT * FROM employees WHERE department = 'Finance';
The WHERE clause returns only the employees in the Finance department.
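As a rough illustration of combining comparison and logical operators in one filter, here is a minimal sketch that runs the SQL through Python's built-in sqlite3 module. The sample rows and the salary and age thresholds are made up for the example and do not come from any real dataset.

```python
import sqlite3

# Hypothetical sample data; the article does not show the table contents.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, age INTEGER, salary REAL, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        ("Ana", 28, 52000, "Finance"),
        ("Ben", 41, 78000, "Finance"),
        ("Cy", 35, 61000, "Sales"),
        ("Dee", 52, 90000, "Finance"),
    ],
)

# A comparison (salary > 60000) combined with logical operators (AND, OR).
query = """
    SELECT name
    FROM employees
    WHERE department = 'Finance'
      AND (salary > 60000 OR age < 30)
    ORDER BY name
"""
finance_names = [row[0] for row in conn.execute(query)]
print(finance_names)
```

The parentheses matter here: AND binds more tightly than OR, so without them the age condition would apply to every department.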
3. Sorting results with ORDER BY
The ORDER BY clause sorts query results in ascending or descending order. It is used to rank records by numeric, text, or date values.
SELECT name, salary FROM employees ORDER BY salary DESC;
This query sorts employees by salary in descending order, so the highest-paid employees appear first.
4. Removing duplicates with DISTINCT
The DISTINCT keyword returns only unique values in a column. It is useful when creating a clean list of categories or attributes.
SELECT DISTINCT department FROM employees;
DISTINCT removes duplicate entries, returning each department name only once.
5. Limiting results with LIMIT
The LIMIT clause restricts the number of rows returned by the query. It is often paired with ORDER BY to display top results or to sample data from large tables.
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;
This returns the five highest-paid employees by combining ORDER BY and LIMIT.
6. Aggregating data with GROUP BY
The GROUP BY clause groups rows that share the same values in a specified column. It is used with aggregate functions such as SUM(), AVG(), or COUNT() to generate summaries.
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
GROUP BY organizes the rows by department, and AVG(salary) calculates the average salary for each group.
7. Filtering groups with HAVING
The HAVING clause filters grouped results after aggregation. It is used when conditions depend on aggregate values, such as totals or averages.
SELECT department, COUNT(*) AS num_employees
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
The query counts the employees in each department and then filters to keep only departments with more than 10 employees.
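To make the difference between the two filters concrete, the sketch below uses WHERE and HAVING in the same query: WHERE removes individual rows before grouping, and HAVING removes whole groups after aggregation. It runs on an in-memory SQLite database through Python's sqlite3 module. The rows are invented, and the group threshold is lowered to 2 so the tiny sample produces a result.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary REAL, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ana", 50000, "Finance"),
        ("Ben", 70000, "Finance"),
        ("Cy", 40000, "Sales"),
        ("Dee", 45000, "Sales"),
        ("Eli", 90000, "IT"),
    ],
)

query = """
    SELECT department, COUNT(*) AS num_employees, AVG(salary) AS avg_salary
    FROM employees
    WHERE salary >= 45000      -- row filter, applied before grouping
    GROUP BY department
    HAVING COUNT(*) >= 2       -- group filter, applied after aggregation
    ORDER BY department
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Sales starts with two employees, but one is removed by WHERE before the count is taken, so only Finance passes the HAVING condition.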
8. Joining tables with JOIN
The JOIN clause combines rows from two or more tables based on a related column. It helps retrieve linked data, such as employees together with their departments.
SELECT e.name, d.name AS department
FROM employees e
JOIN departments d ON e.dept_id = d.id;
Here, the JOIN matches employees with their corresponding department names.
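A plain JOIN is an inner join, so employees without a matching department are left out of the result. The sketch below runs the query above on a small invented dataset and contrasts it with a LEFT JOIN, which keeps unmatched employees. It assumes SQLite through Python's sqlite3 module, and the rows are hypothetical.

```python
import sqlite3

# Hypothetical tables; 'Cy' has no department assigned.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (1, 'Finance'), (2, 'Sales');
    INSERT INTO employees VALUES ('Ana', 1), ('Ben', 2), ('Cy', NULL);
""")

# Inner join: only employees with a matching department row.
inner_rows = conn.execute("""
    SELECT e.name, d.name AS department
    FROM employees e
    JOIN departments d ON e.dept_id = d.id
    ORDER BY e.name
""").fetchall()

# Left join: every employee, with NULL where no department matches.
left_rows = conn.execute("""
    SELECT e.name, d.name AS department
    FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.id
    ORDER BY e.name
""").fetchall()

print(inner_rows)
print(left_rows)
```

Which variant is right depends on the question: the inner join suits reports about assigned staff, while the left join helps spot records with missing links.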
9. Combining results with UNION
UNION combines the results of two or more queries into a single dataset. It automatically removes duplicates unless you use UNION ALL, which keeps them.
SELECT name FROM employees UNION SELECT name FROM customers;
This query combines the names from both the employees and customers tables into a single result set.
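The duplicate handling is easiest to see side by side. This minimal sketch, assuming SQLite through Python's sqlite3 module and two invented tables that share one name, runs the same query with UNION and with UNION ALL.

```python
import sqlite3

# Hypothetical data: 'Ben' appears in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT);
    CREATE TABLE customers (name TEXT);
    INSERT INTO employees VALUES ('Ana'), ('Ben');
    INSERT INTO customers VALUES ('Ben'), ('Cy');
""")

# UNION removes the duplicate 'Ben'.
union_names = [r[0] for r in conn.execute(
    "SELECT name FROM employees UNION SELECT name FROM customers ORDER BY name"
)]

# UNION ALL keeps every row from both queries.
union_all_names = [r[0] for r in conn.execute(
    "SELECT name FROM employees UNION ALL SELECT name FROM customers ORDER BY name"
)]

print(union_names)
print(union_all_names)
```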
10. String functions
String functions in SQL are used to manipulate and change text data. They help with tasks such as combining words, changing case, trimming spaces, or extracting parts of a string.
SELECT CONCAT(first_name, ' ', last_name) AS full_name, LENGTH(first_name) AS name_length FROM employees;
This query creates a full name by combining the first and last names and calculates the length of the first name.
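Function names for text handling vary between databases. As a sketch of the four tasks mentioned above (combining, changing case, trimming, and extracting), the example below uses SQLite through Python's sqlite3 module, where the || operator concatenates strings. The untidy sample row is invented for the example.

```python
import sqlite3

# Hypothetical row with stray spaces and lowercase text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO employees VALUES ('  ada', 'lovelace ')")

row = conn.execute("""
    SELECT TRIM(first_name) || ' ' || TRIM(last_name) AS full_name,  -- combine
           UPPER(TRIM(last_name))                      AS last_upper, -- change case
           SUBSTR(TRIM(first_name), 1, 1)              AS initial,    -- extract
           LENGTH(TRIM(first_name))                    AS name_length
    FROM employees
""").fetchone()
print(row)
```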
11. Date and time functions
Date and time functions in SQL allow you to work with temporal data for analysis and reporting. They can calculate differences, extract parts such as the year or month, and shift dates by adding or subtracting intervals. For example, DATEDIFF() and CURRENT_DATE can measure how long an employee has been with the company.
SELECT name, hire_date, DATEDIFF(CURRENT_DATE, hire_date) AS days_at_company FROM employees;
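It calculates how many days each employee has been with the company by subtracting the hire date from today's date. The two-argument DATEDIFF() shown here follows MySQL syntax; other databases provide similar functionality under different names or signatures.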
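Date functions are among the least portable parts of SQL. The sketch below shows the same three operations (difference, extraction, and shifting by an interval) in SQLite, run through Python's sqlite3 module. It uses a fixed reference date instead of CURRENT_DATE so the result is reproducible, and the probation_end column is a hypothetical name chosen for the example.

```python
import sqlite3

# Hypothetical row; dates are stored as ISO 8601 text, as SQLite expects.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, hire_date TEXT)")
conn.execute("INSERT INTO employees VALUES ('Ana', '2024-01-01')")

row = conn.execute("""
    SELECT name,
           -- difference in days between a fixed reference date and hire_date
           CAST(julianday('2024-03-01') - julianday(hire_date) AS INTEGER)
               AS days_at_company,
           -- extract the year part
           strftime('%Y', hire_date) AS hire_year,
           -- shift the date forward by an interval
           date(hire_date, '+30 days') AS probation_end
    FROM employees
""").fetchone()
print(row)
```

Replacing the literal '2024-03-01' with CURRENT_DATE gives the live figure, at the cost of a result that changes every day.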
12. Creating new columns with CASE
The CASE expression creates new columns with conditional logic, similar to if-else statements. It allows you to categorize or transform data dynamically in your queries.
SELECT name,
CASE
WHEN age < 30 THEN 'Junior'
WHEN age BETWEEN 30 AND 50 THEN 'Mid-level'
ELSE 'Senior'
END AS experience_level
FROM employees;
The CASE expression creates a new column called experience_level based on age ranges.
13. Handling missing values with COALESCE
COALESCE handles missing values by returning the first non-null value in its argument list. It is often used to replace NULL fields with a default value, such as “N/A.”
SELECT name, COALESCE(phone, 'N/A') AS contact_number FROM customers;
Here, COALESCE replaces missing phone numbers with “N/A.”
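Because COALESCE accepts more than two arguments, it can also express a chain of fallbacks. The sketch below assumes SQLite through Python's sqlite3 module and adds a hypothetical email column, which is not part of the original example, to show the order in which the arguments are tried.

```python
import sqlite3

# Hypothetical customers table; the email column is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, phone TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ana", "555-0100", None),
        ("Ben", None, "ben@example.com"),
        ("Cy", None, None),
    ],
)

# Try phone first, then email, then fall back to the literal 'N/A'.
contacts = conn.execute("""
    SELECT name, COALESCE(phone, email, 'N/A') AS contact
    FROM customers
    ORDER BY name
""").fetchall()
print(contacts)
```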
14. Subqueries
Subqueries are queries nested within another query to provide intermediate results. They are used inside WHERE, FROM, or SELECT clauses to filter, compare, or build datasets dynamically.
SELECT name, salary FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);
This query uses a nested subquery to compare each employee's salary with the company-wide average and returns only the employees who earn more than that average.
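The example above places the subquery in WHERE. As a sketch of the other two positions, the query below uses one subquery in FROM (a derived table) and one in SELECT (a scalar value repeated on each row). It assumes SQLite through Python's sqlite3 module, and the rows are invented.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary REAL, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ana", 50000, "Finance"),
        ("Ben", 70000, "Finance"),
        ("Cy", 60000, "Sales"),
    ],
)

query = """
    SELECT name,
           salary,
           (SELECT AVG(salary) FROM employees) AS company_avg  -- subquery in SELECT
    FROM (
        SELECT name, salary
        FROM employees
        WHERE department = 'Finance'
    ) AS finance                                                -- subquery in FROM
    ORDER BY name
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The scalar subquery still averages over all employees, not just Finance, because it reads from the base table rather than the derived one.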
15. Window functions
Window functions perform calculations across a set of rows while still returning a value for each row. They are often used for ranking, running totals, and comparing values between rows.
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank FROM employees;
The RANK() function assigns each employee a position based on salary, without collapsing the rows into groups.
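Two behaviors are worth seeing on real rows: how RANK() treats ties, and how a running total is built with SUM() as a window function. The sketch below assumes Python's sqlite3 module linked against SQLite 3.25 or newer, the first version with window function support, and uses invented salaries with a deliberate tie.

```python
import sqlite3

# Hypothetical data; Ana and Ben share the top salary on purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", 90000), ("Ben", 90000), ("Cy", 70000)],
)

query = """
    SELECT name,
           salary,
           -- tied salaries share a rank, and the next rank is skipped
           RANK() OVER (ORDER BY salary DESC) AS salary_rank,
           -- running total; name breaks ties so each row adds one salary
           SUM(salary) OVER (ORDER BY salary DESC, name) AS running_total
    FROM employees
    ORDER BY salary DESC, name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every input row is still present in the output, which is the key difference from GROUP BY.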
Conclusion
Mastering SQL is one of the most important skills for any data analyst, because it provides the foundation for extracting, transforming, and interpreting data. From filtering and sorting to aggregating and joining data, SQL empowers analysts to transform raw data into meaningful insights. By mastering these key queries, analysts not only streamline their workflow but also ensure accuracy and precision in their analysis.
Jaita Gulati is a machine learning enthusiast and technical writer driven by a passion for building machine learning models. Gulati holds a master's degree in computer science from the University of Liverpool.



