Python for Data Science (FREE 7-Day Mini-Course)


Image by Editor | ChatGPT
Introduction
Welcome to Python for Data Science, a free 7-day mini-course for beginners! Whether you're just starting out in data science or want to build up your core Python skills, this beginner-friendly course is for you. Over the next seven days, you'll learn to work through common data tasks using only essential Python.
You will learn how to:
- Work with Python's basic data structures
- Clean and standardize messy text data
- Summarize and group data with dictionaries (much like you would in SQL or Excel)
- Write reusable functions to keep your code tidy and running smoothly
- Handle errors gracefully so your scripts don't crash on messy input data
- And finally, build a simple profiling tool for inspecting any CSV dataset
Let's get started!
🔗 Link to the code on GitHub
Day 1: Variables, Data Types, and File I/O
In data science, everything starts with raw data: survey responses, logs, spreadsheets, scraped web pages, and so on. Before analyzing anything, you need to:
- Load the data
- Understand its structure and types
- Start cleaning or validating it
Today, you will learn:
- Basic Python data types
- How to read and write raw .txt files
// 1. Variables
In Python, a variable is a name that refers to a value. For data purposes, you can think of variables as fields, columns, or metadata.
filename = "responses.txt"
survey_name = "Q3 Customer Feedback"
max_entries = 100
// 2. Data types you'll use often
Don't worry about the subtle types yet. You will mostly use the following:
| Python Type | What it's used for | Examples |
|---|---|---|
| str | Raw text, column names | "Yes", "Unknown" |
| int | Counts, discrete variables | 42, 0, -3 |
| float | Continuous variables | 3.14, 0.0, -100.5 |
| bool | Flags / binary outcomes | True, False |
| None | Missing values / null | None |
Knowing the types in front of you, and how to check or convert them, is step zero in any data task.
// 3. File input: Reading raw data
Real-world data usually lives in .txt, .csv, or .log files. You'll often need to load it line by line, not all at once (especially when files are large).
Suppose you have a file called responses.txt:
Yes
No
Yes
Maybe
No
Here's how you read it:
with open("responses.txt", "r") as file:
    lines = file.readlines()

for i, line in enumerate(lines):
    cleaned = line.strip()  # removes \n and spaces
    print(f"{i + 1}: {cleaned}")
Output:
1: Yes
2: No
3: Yes
4: Maybe
5: No
// 4. File output: Saving processed data
Suppose you want to save only the "Yes" answers to a new file:
with open("responses.txt", "r") as infile:
    lines = infile.readlines()

yes_responses = []
for line in lines:
    if line.strip().lower() == "yes":
        yes_responses.append(line.strip())

with open("yes_only.txt", "w") as outfile:
    for item in yes_responses:
        outfile.write(item + "\n")
This is a simple version of the load-filter-save pipeline used every day in data work.
// ⏭️ Exercise: Write your first data script
Create a file called survey.txt with a few survey answers (one per line, e.g., Yes, No, Maybe).
Now write a Python script that:
- Reads the file
- Counts how often "Yes" appears, ignoring case (you'll work with string methods more on Day 3, but give it a try!)
- Prints the count
- Writes a cleaned version of the data (lowercased, no extra whitespace) to cleaned_survey.txt
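If you get stuck, here is one possible solution sketch. The sample survey.txt contents written below are just an assumption for demonstration; use your own answers.

```python
# Create a small sample file for demonstration (contents are illustrative).
with open("survey.txt", "w") as f:
    f.write("Yes\nno \n YES\nMaybe\nNo\n")

# Read the raw survey answers.
with open("survey.txt", "r") as file:
    lines = file.readlines()

# Count how often "Yes" appears, ignoring case and whitespace.
yes_count = sum(1 for line in lines if line.strip().lower() == "yes")
print(f"'Yes' count: {yes_count}")

# Write a cleaned version (lowercased, stripped) to cleaned_survey.txt.
with open("cleaned_survey.txt", "w") as out:
    for line in lines:
        out.write(line.strip().lower() + "\n")
```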
Day 2: Python Data Structures
Data science is about ordering and structuring data so it can be cleaned, analyzed, or compared. Today you'll learn the four key data structures in core Python and how to use them on real data:
- List: For ordered rows of data
- Tuple: For fixed-size records
- Dict: For labeled records (like rows)
- Set: For tracking unique values
// 1. Lists: For ordered rows of data
Lists are the most flexible and common structure, which makes them great for:
- A column of values
- A collection of records
- Data of unknown size
Example: Read scores from a file into a list.
with open("scores.txt", "r") as file:
    scores = [float(line.strip()) for line in file]

print(scores)
This prints the list of scores as floats. You can now compute summary statistics:
average = sum(scores) / len(scores)
print(f"Average score: {average:.2f}")
// 2. Tuples: For fixed records
Tuples are like lists, but immutable, and they work well for rows with a known structure, e.g., (name, age).
Example: Read names and ages from a file.
Suppose we have the following people.txt:
Alice, 34
Bob, 29
Eve, 41
Now let's parse the contents of the file:
with open("people.txt", "r") as file:
    records = []
    for line in file:
        name, age = line.strip().split(",")
        records.append((name.strip(), int(age.strip())))
Now you can access fields by position:
for person in records:
    name, age = person
    if age > 30:
        print(f"{name} is over 30.")
// 3. Dicts: For labeled records (like rows)
Dictionaries store key-value pairs, the closest thing in core Python to a row with named columns.
Example: Convert each person's record into a dict:
people = []
with open("people.txt", "r") as file:
    for line in file:
        name, age = line.strip().split(",")
        person = {
            "name": name.strip(),
            "age": int(age.strip())
        }
        people.append(person)
Now your data is much more readable and flexible:
for person in people:
    if person["age"] < 60:
        print(f"{person['name']} is perhaps a working professional.")
// 4. Sets: For uniqueness and fast membership tests
Sets remove duplicates automatically. That makes them great for:
- Counting distinct categories
- Checking whether a value has been seen before
- Tracking unique values without order
Example: From a file of emails, find all the distinct domains.
domains = set()
with open("emails.txt", "r") as file:
    for line in file:
        email = line.strip().lower()
        if "@" in email:
            domain = email.split("@")[1]
            domains.add(domain)

print(domains)
Output:
{'gmail.com', 'yahoo.com', 'example.org'}
// ⏭️ Exercise: Build a mini data inspector
Create a file called dataset.txt with one record per line in the format name,age,role.
Now write a Python script that:
- Reads each line and stores it as a dict with the keys: name, age, role
- Counts how many people hold each role (use a dictionary) and collects the unique ages (use a set)
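One possible solution sketch, assuming a small illustrative dataset.txt (swap in your own records):

```python
# Sample records in name,age,role format (illustrative).
with open("dataset.txt", "w") as f:
    f.write("Alice,34,Engineer\nBob,29,Analyst\nEve,34,Engineer\n")

people = []
with open("dataset.txt", "r") as file:
    for line in file:
        name, age, role = line.strip().split(",")
        people.append({"name": name, "age": int(age), "role": role})

# Count people per role with a dictionary.
role_counts = {}
for person in people:
    role_counts[person["role"]] = role_counts.get(person["role"], 0) + 1

# Collect unique ages with a set.
unique_ages = {person["age"] for person in people}

print(role_counts)
print(unique_ages)
```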
Day 3: Working with Strings
Strings are everywhere in real-world data work: survey responses, bios, job titles, product reviews, emails, and more.
Today, you will learn to:
- Clean and normalize raw text
- Extract information from strings
- Build simple text features (which you can later use for sorting or modeling)
// 1. Cleaning basic text
Suppose you received this raw list of job titles from a CSV:
titles = [
    "  Data Scientist\n",
    "data scientist",
    "Senior Data Scientist  ",
    "DATA scientist",
    "Data engineer",
    "Data Scientist"
]
Your job? Normalize them.
cleaned = [title.strip().lower() for title in titles]
Now everything is lowercase with no stray whitespace.
Output:
['data scientist', 'data scientist', 'senior data scientist', 'data scientist', 'data engineer', 'data scientist']
// 2. Standardizing categories
Suppose you only care about identifying data scientists.
standardized = []
for title in cleaned:
    if "data scientist" in title:
        standardized.append("data scientist")
    else:
        standardized.append(title)
// 3. Counting words, checking patterns
Useful text features include:
- Number of words
- Whether the string contains a keyword
- Whether the string contains a number or an email
For example:
text = " The price is $5,000! "
# Clean up
clean = text.strip().lower().replace("$", "").replace(",", "").replace("!", "")
print(clean)
# Word count
word_count = len(clean.split())
# Contains digit
has_number = any(char.isdigit() for char in clean)
print(word_count)
print(has_number)
Output:
the price is 5000
4
True
// 4. Splitting and extracting parts
Let's take an email example:
email = "  Alice.Johnson@example.com  "
email = email.strip().lower()
username, domain = email.split("@")
print(f"User: {username}, Domain: {domain}")
This prints:
User: alice.johnson, Domain: example.com
This kind of extraction shows up in user behavior analysis, spam detection, and the like.
// 5. Finding simple text patterns
You don't need regular expressions for basic patterns.
Example: Check whether someone mentions "Python" in a free-text response:
comment = "I'm learning Python and SQL for data jobs."
if "python" in comment.lower():
    print("Mentioned Python")
// ⏭️ Exercise: Clean a comments file
Create a file called comments.txt with the following lines:
Great course! Loved the pacing.
Not enough Python examples.
Too basic for experienced users.
python is exactly what I needed!
Would like more SQL content.
Excellent – very beginner-friendly.
Now write a Python script that:
- Cleans each comment (strips whitespace, lowercases, removes punctuation)
- Prints the total number of comments, how many mention "python", and the average number of words per comment
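Here is one possible solution sketch; the punctuation characters stripped below are just a reasonable starting set, not an exhaustive one:

```python
# Recreate the comments file so the example is self-contained.
with open("comments.txt", "w") as f:
    f.write(
        "Great course! Loved the pacing.\n"
        "Not enough Python examples.\n"
        "Too basic for experienced users.\n"
        "python is exactly what I needed!\n"
        "Would like more SQL content.\n"
        "Excellent – very beginner-friendly.\n"
    )

comments = []
with open("comments.txt", "r") as file:
    for line in file:
        # Strip whitespace, lowercase, and drop common punctuation.
        clean = line.strip().lower()
        for ch in "!.,–-":
            clean = clean.replace(ch, "")
        if clean:
            comments.append(clean)

total = len(comments)
python_mentions = sum(1 for c in comments if "python" in c)
avg_words = sum(len(c.split()) for c in comments) / total if total else 0

print(f"Total comments: {total}")
print(f"Mention 'python': {python_mentions}")
print(f"Average words per comment: {avg_words:.1f}")
```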
Day 4: Grouping, Counting, and Summarizing with Dictionaries
You've used dicts to store labeled records. Today, you'll go deeper: using dictionaries to group, count, and summarize data, much like a pivot table or a GROUP BY in SQL.
// 1. Counting occurrences by field
Suppose you have this data:
data = [
    {"name": "Alice", "city": "London"},
    {"name": "Bob", "city": "Paris"},
    {"name": "Eve", "city": "London"},
    {"name": "John", "city": "New York"},
    {"name": "Dana", "city": "Paris"},
]
Goal: Count how many people are in each city.
city_counts = {}
for person in data:
    city = person["city"]
    if city not in city_counts:
        city_counts[city] = 1
    else:
        city_counts[city] += 1

print(city_counts)
Output:
{'London': 2, 'Paris': 2, 'New York': 1}
// 2. Summing a field by group
Now let's say we have this:
salaries = [
    {"role": "Engineer", "salary": 75000},
    {"role": "Analyst", "salary": 62000},
    {"role": "Engineer", "salary": 80000},
    {"role": "Manager", "salary": 95000},
    {"role": "Analyst", "salary": 64000},
]
Goal: Calculate the total and average salary for each role.
totals = {}
counts = {}
for person in salaries:
    role = person["role"]
    salary = person["salary"]
    totals[role] = totals.get(role, 0) + salary
    counts[role] = counts.get(role, 0) + 1

averages = {role: totals[role] / counts[role] for role in totals}
print(averages)
Output:
{'Engineer': 77500.0, 'Analyst': 63000.0, 'Manager': 95000.0}
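As a side note, the standard library's collections.defaultdict can remove the .get() boilerplate. Here's a sketch of the same grouping done by collecting each role's salaries into a list first:

```python
from collections import defaultdict

salaries = [
    {"role": "Engineer", "salary": 75000},
    {"role": "Analyst", "salary": 62000},
    {"role": "Engineer", "salary": 80000},
    {"role": "Manager", "salary": 95000},
    {"role": "Analyst", "salary": 64000},
]

# defaultdict(list) creates an empty list the first time a key is seen.
by_role = defaultdict(list)
for person in salaries:
    by_role[person["role"]].append(person["salary"])

averages = {role: sum(vals) / len(vals) for role, vals in by_role.items()}
print(averages)  # {'Engineer': 77500.0, 'Analyst': 63000.0, 'Manager': 95000.0}
```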
// 3. Frequency tables (finding the mode)
Find the most common age in the data:
ages = [29, 34, 29, 41, 34, 29]
freq = {}
for age in ages:
    freq[age] = freq.get(age, 0) + 1

most_common = max(freq.items(), key=lambda x: x[1])
print(f"Most common age: {most_common[0]} (appears {most_common[1]} times)")
Output:
Most common age: 29 (appears 3 times)
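For what it's worth, the standard library's collections.Counter packages this exact pattern:

```python
from collections import Counter

ages = [29, 34, 29, 41, 34, 29]
freq = Counter(ages)

# most_common(1) returns a list holding the single most frequent (value, count) pair.
age, count = freq.most_common(1)[0]
print(f"Most common age: {age} (appears {count} times)")
```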
// ⏭️ Exercise: Analyze employee data
Create a file called employees.txt with the following content:
Alice,London,Engineer,75000
Bob,Paris,Analyst,62000
Eve,London,Engineer,80000
John,New York,Manager,95000
Dana,Paris,Analyst,64000
Now write a Python script that:
- Loads the data into a list of dictionaries
- Prints the number of employees per city and the average salary for each role
Day 5: Writing Functions
You've written code that loads, cleans, filters, and summarizes data. Now you'll wrap that logic in functions so you can:
- Reuse your code
- Build processing pipelines
- Keep your scripts readable and testable
// 1. A text-cleaning function
Let's write a function for basic text cleanup:
def clean_text(text):
    return text.strip().lower().replace(",", "").replace("$", "")
Now you can apply this to every field you read from a file.
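For example, a quick check of clean_text on a couple of messy inputs:

```python
def clean_text(text):
    return text.strip().lower().replace(",", "").replace("$", "")

print(clean_text("  $5,000 "))          # 5000
print(clean_text(" Data Scientist\n"))  # data scientist
```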
// 2. Parsing rows into records
Next, here's a simple function that parses each line of a file into a record:
def parse_row(line):
    parts = line.strip().split(",")
    return {
        "name": parts[0],
        "city": parts[1],
        "role": parts[2],
        "salary": int(parts[3])
    }
Now your file-loading code becomes:
with open("employees.txt") as file:
    rows = [parse_row(line) for line in file]
// 3. Aggregation helpers
So far, you've written counting and averaging logic inline. Let's turn some of those basic operations into functions too:
def average(values):
    return sum(values) / len(values) if values else 0

def count_by_key(data, key):
    counts = {}
    for item in data:
        k = item[key]
        counts[k] = counts.get(k, 0) + 1
    return counts
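A quick usage check of the two helpers on a few illustrative records:

```python
def average(values):
    return sum(values) / len(values) if values else 0

def count_by_key(data, key):
    counts = {}
    for item in data:
        k = item[key]
        counts[k] = counts.get(k, 0) + 1
    return counts

# Illustrative sample records.
rows = [
    {"city": "London", "salary": 75000},
    {"city": "Paris", "salary": 62000},
    {"city": "London", "salary": 80000},
]

print(count_by_key(rows, "city"))                      # {'London': 2, 'Paris': 1}
print(round(average([r["salary"] for r in rows]), 2))  # 72333.33
```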
// ⏭️ Exercise: Refactor your earlier work
Rework your Day 4 solution into these reusable functions:
- load_data(filename)
- average_salary_by_role(data)
- count_by_city(data)
Then use them in a script that prints the same results as on Day 4.
Day 6: Reading, Writing, and Basic Error Handling
Data files are often imperfect, corrupted, or inconsistent. So how do you deal with them?
Today you will learn:
- How to read and write structured files safely
- How to handle errors gracefully
- How to skip or flag bad lines without crashing
// 1. Safe file reading
What happens when you try to read a missing file? Here's how to try to open a file and catch the FileNotFoundError raised when it's missing.
try:
    with open("employees.txt") as file:
        lines = file.readlines()
except FileNotFoundError:
    print("Error: File not found.")
    lines = []
// 2. Handling bad lines gracefully
Now let's skip bad lines and process only the well-formed ones.
records = []
for line in lines:
    try:
        parts = line.strip().split(",")
        if len(parts) != 4:
            raise ValueError("Incorrect number of fields")
        record = {
            "name": parts[0],
            "city": parts[1],
            "role": parts[2],
            "salary": int(parts[3])
        }
        records.append(record)
    except Exception as e:
        print(f"Skipping bad line: {line.strip()} ({e})")
// 3. Writing cleaned data to a file
Finally, let's write the validated data to a file.
with open("cleaned_employees.txt", "w") as out:
    for r in records:
        out.write(f"{r['name']},{r['city']},{r['role']},{r['salary']}\n")
// ⏭️ Exercise: Make a script that tolerates bad data
Create a raw_employees.txt with a few imperfect or messy lines such as:
Alice,London,Engineer,75000
Bob,Paris,Analyst
Eve,London,Engineer,eighty thousand
John,New York,Manager,95000
Write a script that:
- Loads only the valid records
- Prints the number of valid lines
- Writes the valid records to validated_employees.txt
Day 7: Build a Mini Data Profiler (Project Day)
Great work making it this far. Today, you'll build a standalone Python script that:
- Loads a CSV file
- Reads the column names and detects their types
- Computes basic statistics
- Writes a summary report
// Step-by-Step Framework
1. Load the file:
def load_csv(filename):
    with open(filename) as f:
        lines = [line.strip() for line in f if line.strip()]
    header = lines[0].split(",")
    rows = [line.split(",") for line in lines[1:]]
    return header, rows
2. Detect column types:
def detect_type(value):
    try:
        float(value)
        return "numeric"
    except ValueError:
        return "text"
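A few quick checks of detect_type (note it treats anything float() accepts, including plain integers, as numeric):

```python
def detect_type(value):
    try:
        float(value)
        return "numeric"
    except ValueError:
        return "text"

print(detect_type("3.14"))    # numeric
print(detect_type("75000"))   # numeric
print(detect_type("London"))  # text
```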
3. Profile each column:
def profile_columns(header, rows):
    summary = {}
    for i, col in enumerate(header):
        values = [row[i].strip() for row in rows if len(row) == len(header)]
        col_type = detect_type(values[0])
        unique = set(values)
        summary[col] = {
            "type": col_type,
            "unique_count": len(unique),
            "most_common": max(set(values), key=values.count)
        }
        if col_type == "numeric":
            nums = [float(v) for v in values if v.replace('.', '', 1).isdigit()]
            summary[col]["average"] = sum(nums) / len(nums) if nums else 0
    return summary
4. Write the summary:
def write_summary(summary, out_file):
    with open(out_file, "w") as f:
        for col, stats in summary.items():
            f.write(f"Column: {col}\n")
            for k, v in stats.items():
                f.write(f"  {k}: {v}\n")
            f.write("\n")
You can then tie the functions together like this:
header, rows = load_csv("employees.csv")
summary = profile_columns(header, rows)
write_summary(summary, "profile_report.txt")
// ⏭️ Final exercise
Point the profiler at your own CSV file (or reuse the earlier employees data), run it, and check the resulting report.
Wrapping Up
Congratulations! You have completed the Python for Data Science mini-course. 🎉
This week, you went from Python's basic data types to writing functions and scripts that handle real data problems. These are the basics, and important ones at that. I suggest you use this as a starting point and go on to explore Python's data science libraries (by building things).
Thanks for reading. Happy coding and data crunching!
Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.



