
Python for Data Science (FREE 7-Day Mini-Course)

Image by Editor | ChatGPT

Introduction

Welcome to Python for Data Science, a free 7-day mini-course for beginners! Whether you're just getting started with data science or want to build solid core Python skills, this beginner-friendly course is for you. Over the next seven days, you'll learn to handle everyday data tasks using nothing but core Python.

You will learn to:

  • Work with Python's basic data structures
  • Clean and normalize messy text data
  • Summarize and group data with dictionaries (much like you would in SQL or Excel)
  • Write functions that keep your code tidy and reusable
  • Handle errors gracefully so your scripts don't crash on messy input data
  • And finally, build a simple profiling tool for inspecting any CSV data

Let's get started!

🔗 Link to the code on GitHub

Day 1: Variables, Data Types, and File I/O

In data science, everything starts with raw data: survey responses, logs, spreadsheets, scraped websites, etc. Before analyzing anything, you need to:

  • Load the data
  • Understand its structure and types
  • Start cleaning or validating it

Today, you will learn:

  • Basic Python data types
  • How to read and write raw .txt files

// 1. Variables

In Python, a variable is a name that refers to a value. For data purposes, you can think of variables as fields, columns, or metadata.

filename = "responses.txt"
survey_name = "Q3 Customer Feedback"
max_entries = 100

// 2. Data types you'll use often

Don't worry about the more obscure types yet. You will mostly use the following:

Python Type    What it's used for             Examples
str            Raw text, column names         "Yes", "Unknown"
int            Counts, discrete variables     42, 0, -3
float          Continuous variables           3.14, 0.0, -100.5
bool           Flags / binary outcomes        True, False
None           Missing values / nulls         None

Knowing the types in front of you – and how to check or convert them – is step zero in data work.
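
As a quick illustration, here's a minimal sketch of checking and converting types (the values are made up for demonstration):

```python
raw_value = "42"           # values read from text files always arrive as str
print(type(raw_value))     # <class 'str'>

as_int = int(raw_value)    # convert to int for counting
as_float = float("3.14")   # convert to float for continuous values

# Missing values are commonly represented as None
maybe_missing = None
print(maybe_missing is None)  # True
```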

// 3. File input: Reading raw data

Real-world data lives in .txt, .csv, or .log files. You'll often need to load it line by line, not all at once (especially when the files are large).

Suppose you have a file called responses.txt with the following contents:

Yes
No
Yes
Maybe
No

Here's how you read it:

with open("responses.txt", "r") as file:
    lines = file.readlines()

for i, line in enumerate(lines):
    cleaned = line.strip()  # removes \n and surrounding spaces
    print(f"{i + 1}: {cleaned}")

Output:

1: Yes
2: No
3: Yes
4: Maybe
5: No

// 4. File output: Writing processed data

Suppose you want to save only “Yes” answers to a new file:

with open("responses.txt", "r") as infile:
    lines = infile.readlines()

yes_responses = []

for line in lines:
    if line.strip().lower() == "yes":
        yes_responses.append(line.strip())

with open("yes_only.txt", "w") as outfile:
    for item in yes_responses:
        outfile.write(item + "\n")

This is a simple version of the load-filter-save pipeline used in everyday data work.

// ⏭️ Exercise: Write your first data script

Create a file called survey.txt and copy in the following lines:

Now write a Python script that:

  1. Reads the file
  2. Counts how often “Yes” appears, ignoring case (good practice with string methods — give it a try!)
  3. Prints the count
  4. Writes a cleaned version of the data (lowercased, no extra whitespace) to cleaned_survey.txt
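
If you get stuck, here is one possible solution sketch (the function name and structure are my own; it assumes survey.txt has one answer per line):

```python
def analyze_survey(in_path="survey.txt", out_path="cleaned_survey.txt"):
    """Count "Yes" answers and write a cleaned copy of the file."""
    yes_count = 0
    cleaned = []
    with open(in_path) as f:
        for line in f:
            answer = line.strip().lower()  # lowercase, no surrounding whitespace
            if not answer:
                continue                   # skip blank lines
            cleaned.append(answer)
            if answer == "yes":
                yes_count += 1
    with open(out_path, "w") as f:
        for answer in cleaned:
            f.write(answer + "\n")
    print(f"'Yes' appears {yes_count} times")
    return yes_count
```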

Day 2: Python Data Structures

Data science is all about organizing and structuring data so it can be cleaned, analyzed, or compared. Today you'll learn four key data structures in core Python and how to use them on real data:

  • List: For ordered rows of data
  • Tuple: For fixed-structure records
  • Dict: For labeled data (like rows with named columns)
  • Set: For tracking unique values

// 1. List: For ordered rows of data

Lists are the most versatile and common structure, perfect for:

  • A column of values
  • A collection of records
  • Data of unknown size

Example: Read scores from a file into a list.

with open("scores.txt", "r") as file:
    scores = [float(line.strip()) for line in file]

print(scores)

This prints the list of scores as floats. You can now compute summary statistics:

average = sum(scores) / len(scores)
print(f"Average score: {average:.2f}")

This prints the average score, formatted to two decimal places.

// 2. Tuple: For fixed-structure records

Tuples are like lists, but they are immutable and work well for rows with a known structure, e.g., (name, age).

Example: Read a file of names and ages.
Suppose we have the following people.txt:

Alice, 34
Bob, 29
Eve, 41

Now let's parse the file's contents:

with open("people.txt", "r") as file:
    records = []
    for line in file:
        name, age = line.strip().split(",")
        records.append((name.strip(), int(age.strip())))

Now you can access fields by position:

for person in records:
    name, age = person
    if age > 30:
        print(f"{name} is over 30.")

// 3. Dict: For labeled data (like rows with named columns)

Dictionaries store key-value pairs, the closest thing in core Python to a row with named columns.

Example: Convert each person's record into a dict:

people = []

with open("people.txt", "r") as file:
    for line in file:
        name, age = line.strip().split(",")
        person = {
            "name": name.strip(),
            "age": int(age.strip())
        }
        people.append(person)

Now your data is far more readable and flexible:

for person in people:
    if person["age"] < 60:
        print(f"{person['name']} is perhaps a working professional.")

// 4. Set: For uniqueness and fast membership tests

Sets remove duplicates automatically. That makes them perfect for:

  • Counting distinct categories
  • Checking whether a value has been seen before
  • Tracking unique values without caring about order

Example: From a file of email addresses, find all the unique domains.

domains = set()

with open("emails.txt", "r") as file:
    for line in file:
        email = line.strip().lower()
        if "@" in email:
            domain = email.split("@")[1]
            domains.add(domain)

print(domains) 

Output:

{'gmail.com', 'yahoo.com', 'example.org'}

// ⏭️ Exercise: Build a small data inspector

Create a file called dataset.txt with the following content:

Now write a Python script that:

  1. Reads each line and stores it as a dict with the keys: name, age, role
  2. Reports how many people hold each role (use a dictionary) and the unique ages (use a set)

Day 3: Working with Strings

Text strings are everywhere in real-world data: survey responses, bios, job titles, product reviews, emails, and more.

Today, you will learn to:

  • Clean and normalize raw text
  • Extract information from strings
  • Build simple text features (which you can later use for sorting or modeling)

// 1. Basic cleaning and normalization

Suppose you received this raw list of job titles from a CSV:

titles = [
    "  Data Scientist\n",
    "data scientist",
    "Senior Data Scientist ",
    "DATA scientist",
    "Data engineer",
    "Data Scientist"
]

Your job? Normalize them:

cleaned = [title.strip().lower() for title in titles]

Now everything is lowercase and free of surrounding whitespace.

Output:

['data scientist', 'data scientist', 'senior data scientist', 'data scientist', 'data engineer', 'data scientist']

// 2. Standardizing values

Suppose you only care about identifying data scientists. Collapse every variant into a single category:

standardized = []

for title in cleaned:
    if "data scientist" in title:
        standardized.append("data scientist")
    else:
        standardized.append(title)

// 3. Counting words and testing patterns

Useful text features include:

  • Number of words
  • Whether the string contains a keyword
  • Whether the string contains a digit or an email address

For example:

text = " The price is $5,000!  "

# Clean up
clean = text.strip().lower().replace("$", "").replace(",", "").replace("!", "")
print(clean)  

# Word count
word_count = len(clean.split())

# Contains digit
has_number = any(char.isdigit() for char in clean)

print(word_count)
print(has_number)

Output:

the price is 5000
4
True

// 4. Splitting and extracting parts

Let's take an email example:

email = "  alice.johnson@example.com  "
email = email.strip().lower()

username, domain = email.split("@")

print(f"User: {username}, Domain: {domain}")

This prints:

User: alice.johnson, Domain: example.com

This kind of extraction shows up in user-behavior analysis, spam detection, and more.

// 5. Finding specific text patterns

You don't need regular expressions for basic patterns.

Example: Check whether someone mentions “Python” in a free-text response:

comment = "I'm learning Python and SQL for data jobs."

if "python" in comment.lower():
    print("Mentioned Python")

// ⏭️ Exercise: Clean and analyze comments

Create a file called comments.txt with the following lines:

Great course! Loved the pacing.
Not enough Python examples.
Too basic for experienced users.
python is exactly what I needed!
Would like more SQL content.
Excellent – very beginner-friendly.

Now write a Python script that:

  1. Cleans each comment (strip, lowercase, remove punctuation)
  2. Prints the total number of comments, how many of them mention “python”, and the average number of words per comment

Day 4: Grouping, Counting, and Summarizing with Dictionaries

So far you've used dicts to store labeled records. Today, you'll go deeper: using dictionaries to group, count, and summarize data, much like a pivot table in Excel or GROUP BY in SQL.

// 1. Counting by field

Suppose you have this data:

data = [
    {"name": "Alice", "city": "London"},
    {"name": "Bob", "city": "Paris"},
    {"name": "Eve", "city": "London"},
    {"name": "John", "city": "New York"},
    {"name": "Dana", "city": "Paris"},
]

Goal: Count how many people live in each city.

city_counts = {}

for person in data:
    city = person["city"]
    if city not in city_counts:
        city_counts[city] = 1
    else:
        city_counts[city] += 1

print(city_counts)

Output:

{'London': 2, 'Paris': 2, 'New York': 1}
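
As an aside, the same count can be written more compactly with dict.get, which returns a default when a key is missing (the same pattern appears again in the next section):

```python
data = [
    {"name": "Alice", "city": "London"},
    {"name": "Bob", "city": "Paris"},
    {"name": "Eve", "city": "London"},
    {"name": "John", "city": "New York"},
    {"name": "Dana", "city": "Paris"},
]

city_counts = {}
for person in data:
    city = person["city"]
    # get() returns 0 the first time a city is seen
    city_counts[city] = city_counts.get(city, 0) + 1

print(city_counts)  # {'London': 2, 'Paris': 2, 'New York': 1}
```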

// 2. Summing a field by category

Now let's say we have this data:

salaries = [
    {"role": "Engineer", "salary": 75000},
    {"role": "Analyst", "salary": 62000},
    {"role": "Engineer", "salary": 80000},
    {"role": "Manager", "salary": 95000},
    {"role": "Analyst", "salary": 64000},
]

Goal: Calculate the total and average salary for each role.

totals = {}
counts = {}

for person in salaries:
    role = person["role"]
    salary = person["salary"]
    
    totals[role] = totals.get(role, 0) + salary
    counts[role] = counts.get(role, 0) + 1

averages = {role: totals[role] / counts[role] for role in totals}

print(averages)

Output:

{'Engineer': 77500.0, 'Analyst': 63000.0, 'Manager': 95000.0}

// 3. Frequency table (finding the mode)

Find the most common age in the data:

ages = [29, 34, 29, 41, 34, 29]

freq = {}

for age in ages:
    freq[age] = freq.get(age, 0) + 1

most_common = max(freq.items(), key=lambda x: x[1])

print(f"Most common age: {most_common[0]} (appears {most_common[1]} times)")

Output:

Most common age: 29 (appears 3 times)

// ⏭️ Exercise: Analyze employee data

Create a file employees.txt with the following content:

Alice,London,Engineer,75000
Bob,Paris,Analyst,62000
Eve,London,Engineer,80000
John,New York,Manager,95000
Dana,Paris,Analyst,64000

Write a Python script that:

  1. Loads the data into a list of dictionaries
  2. Prints the number of employees per city and the average salary for each role

Day 5: Writing Functions

You've written code that loads, cleans, filters, and summarizes data. Now you'll wrap that logic in functions, so you can:

  • Reuse your code
  • Build composable pipelines
  • Keep scripts readable and testable

// 1. Cleaning text input

Let's write a function for basic text cleaning:

def clean_text(text):
    return text.strip().lower().replace(",", "").replace("$", "")

Now you can apply this to every field you read from a file.
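
For example, a minimal sketch of applying it to every line of a file (clean_file and raw_prices.txt are illustrative names, not part of the course code):

```python
def clean_text(text):
    # Same helper as above
    return text.strip().lower().replace(",", "").replace("$", "")

def clean_file(in_path):
    """Return a list of cleaned, non-empty lines from a text file."""
    with open(in_path) as f:
        return [clean_text(line) for line in f if line.strip()]
```

Calling clean_file("raw_prices.txt") would turn a line like "  $5,000\n" into "5000".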

// 2. Parsing line records

Next, here's a simple function that parses each line of a file into a record:

def parse_row(line):
    parts = line.strip().split(",")
    return {
        "name": parts[0],
        "city": parts[1],
        "role": parts[2],
        "salary": int(parts[3])
    }

Now your file-loading code becomes:

with open("employees.txt") as file:
    rows = [parse_row(line) for line in file]

// 3. Aggregation helpers

You've already counted and averaged things by hand. Let's wrap those patterns in small, reusable functions too:

def average(values):
    return sum(values) / len(values) if values else 0

def count_by_key(data, key):
    counts = {}
    for item in data:
        k = item[key]
        counts[k] = counts.get(k, 0) + 1
    return counts
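
Putting the helpers to work on a few salary records like the ones from Day 4 (a sketch; the helper definitions are repeated so the snippet runs on its own):

```python
def average(values):
    return sum(values) / len(values) if values else 0

def count_by_key(data, key):
    counts = {}
    for item in data:
        k = item[key]
        counts[k] = counts.get(k, 0) + 1
    return counts

salaries = [
    {"role": "Engineer", "salary": 75000},
    {"role": "Analyst", "salary": 62000},
    {"role": "Engineer", "salary": 80000},
]

print(count_by_key(salaries, "role"))  # {'Engineer': 2, 'Analyst': 1}
print(average([s["salary"] for s in salaries]))
```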

// ⏭️ Exercise: Refactor your earlier work

Refactor your Day 4 solution into these functions:

  • load_data(filename)
  • average_salary_by_role(data)
  • count_by_city(data)

Then use them in a script that prints the same results as on Day 4.

Day 6: Reading, Writing, and Basic Error Handling

Data files are often imperfect: malformed, corrupted, or missing entirely. So how do you deal with them?

Today you will learn:

  • How to read and write structured files safely
  • How to handle errors gracefully
  • How to skip or log bad lines without crashing

// 1. Safe file reading

What happens when you try to read a missing file? Here's how to try to open a file and catch the FileNotFoundError if it doesn't exist:

try:
    with open("employees.txt") as file:
        lines = file.readlines()
except FileNotFoundError:
    print("Error: File not found.")
    lines = []

// 2. Handling bad lines gracefully

Now let's skip bad lines and process only the well-formed ones.

records = []

for line in lines:
    try:
        parts = line.strip().split(",")
        if len(parts) != 4:
            raise ValueError("Incorrect number of fields")
        record = {
            "name": parts[0],
            "city": parts[1],
            "role": parts[2],
            "salary": int(parts[3])
        }
        records.append(record)
    except Exception as e:
        print(f"Skipping bad line: {line.strip()} ({e})")

// 3. Writing cleaned data to a file

Finally, let's write the validated records to a file.

with open("cleaned_employees.txt", "w") as out:
    for r in records:
        out.write(f"{r['name']},{r['city']},{r['role']},{r['salary']}\n")

// ⏭️ Exercise: Build an error-tolerant loader

Create a raw_employees.txt file with a few malformed or dirty lines, such as:

Alice,London,Engineer,75000
Bob,Paris,Analyst
Eve,London,Engineer,eighty thousand
John,New York,Manager,95000

Write a script that:

  1. Loads only the valid records
  2. Prints the number of valid lines
  3. Writes them to validated_employees.txt

Day 7: Build a Small Data Profiler (Project Day)

Great work making it this far. Today, you'll create a standalone Python script that:

  • Loads a CSV file
  • Detects column names and types
  • Computes basic statistics
  • Writes a summary report

// Step-by-Step Framework

1. Load the file:

def load_csv(filename):
    with open(filename) as f:
        lines = [line.strip() for line in f if line.strip()]
    header = lines[0].split(",")
    rows = [line.split(",") for line in lines[1:]]
    return header, rows

2. Detect column types:

def detect_type(value):
    try:
        float(value)
        return "numeric"
    except ValueError:
        return "text"
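
A quick check of how the detector behaves; note that a numeric-looking string such as "42" counts as numeric (the function is repeated so the snippet runs on its own):

```python
def detect_type(value):
    # Same detector as above
    try:
        float(value)
        return "numeric"
    except ValueError:
        return "text"

print(detect_type("3.14"))    # numeric
print(detect_type("42"))      # numeric
print(detect_type("London"))  # text
```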

3. Profile each column:

def profile_columns(header, rows):
    summary = {}
    for i, col in enumerate(header):
        values = [row[i].strip() for row in rows if len(row) == len(header)]
        col_type = detect_type(values[0])
        unique = set(values)
        summary[col] = {
            "type": col_type,
            "unique_count": len(unique),
            "most_common": max(set(values), key=values.count)
        }
        if col_type == "numeric":
            nums = [float(v) for v in values if v.replace('.', '', 1).isdigit()]
            summary[col]["average"] = sum(nums) / len(nums) if nums else 0
    return summary

4. Write the summary report:

def write_summary(summary, out_file):
    with open(out_file, "w") as f:
        for col, stats in summary.items():
            f.write(f"Column: {col}\n")
            for k, v in stats.items():
                f.write(f"  {k}: {v}\n")
            f.write("\n")

You can then use the functions like this:

header, rows = load_csv("employees.csv")
summary = profile_columns(header, rows)
write_summary(summary, "profile_report.txt")

// ⏭️ Final exercise

Use your own CSV file (or reuse data from the earlier days). Run the profiler and inspect the report.

Wrapping Up

Congratulations! You've completed the Python for Data Science mini-course. 🎉

In one week, you've gone from basic Python data types to writing functions and scripts that handle real data problems. These are the fundamentals, and the fundamentals matter. I suggest you use this course as a starting point and go on to learn Python's data science libraries (by building things).

Thanks for reading. Happy coding and data wrangling!

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.
