Useful Python One-Liners for CSV Work


Introduction
CSV files are everywhere in data work: database exports, API responses, spreadsheet handoffs. While pandas works well, sometimes you need quick solutions in plain Python without installing anything.
Python's built-in csv module, combined with list comprehensions and generator expressions, can handle common CSV tasks in a single line of code. These one-liners are handy for quick data checks, debugging ETL issues, or working in locked-down environments where external libraries are not available.
We'll use a sample dataset with 50 records (data.csv). Let's get started!
🔗 Link to the code on GitHub
1. Find the Column Sum
Calculate the total of a numeric column across all rows.
print(f"Total: ${sum(float(r[3]) for r in __import__('csv').reader(open(path)) if r[0] != 'transaction_id'):,.2f}")
Here, path is a variable holding the path to the sample CSV file. For this example on Google Colab, path = "/content/data.csv".
Output:
Here, __import__('csv') imports the built-in csv module inline. The generator expression skips the header row, converts the column values to float, sums them, and formats the result with a thousands separator. Adjust the column index (3) and the header check as needed.
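If you prefer readability over brevity, here is a multi-line sketch of the same computation, assuming the same data.csv layout with the amount in column 3 (it skips the header with next() instead of checking for 'transaction_id'):

import csv

path = "/content/data.csv"  # adjust to where your copy of data.csv lives

with open(path) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    total = sum(float(row[3]) for row in reader)  # sum the amount column

print(f"Total: ${total:,.2f}")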
2. Find the Group with the Highest Total
Find out which group has the highest combined value in your data.
print(max({r[5]: sum(float(row[3]) for row in __import__('csv').reader(open(path)) if row[5] == r[5] and row[0] != 'transaction_id') for r in __import__('csv').reader(open(path)) if r[0] != 'transaction_id'}.items(), key=lambda x: x[1]))
Output:
('Mike Rodriguez', 502252.0)
The dictionary comprehension groups rows by column 5, summing the column 3 values for each group. One pass collects the group keys and a second pass sums the values. max() with a lambda returns the group with the highest total. Adjust the indices to group by a different column.
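If the double pass bothers you, a single-pass sketch with a plain dictionary does the same grouping, assuming column 5 is the group key and column 3 the amount:

import csv

path = "/content/data.csv"

totals = {}
with open(path) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        totals[row[5]] = totals.get(row[5], 0.0) + float(row[3])  # accumulate per group

print(max(totals.items(), key=lambda x: x[1]))  # group with the highest total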
3. Filter and Display a Subset of Rows
Display only the rows that match a specific condition, with formatted output.
print("n".join(f"{r[1]}: ${float(r[3]):,.2f}" for r in __import__('csv').reader(open(path)) if r[7] == 'Enterprise' and r[0] != 'transaction_id'))
Output:
Acme Corp: $45,000.00
Gamma Solutions: $78,900.00
Zeta Systems: $156,000.00
Iota Industries: $67,500.25
Kappa LLC: $91,200.75
Nu Technologies: $76,800.25
Omicron LLC: $128,900.00
Sigma Corp: $89,700.75
Phi Corp: $176,500.25
Omega Technologies: $134,600.50
Alpha Solutions: $71,200.25
Matrix Systems: $105,600.25
The generator expression filters rows where column 7 equals Enterprise, then formats columns 1 and 3. You use "\n".join(...) instead of printing inside a comprehension to avoid producing a list of None values.
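Expanded for readability, the same filter-and-format logic might look like this, assuming column 7 holds the tier and columns 1 and 3 the name and amount:

import csv

path = "/content/data.csv"

with open(path) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        if row[7] == 'Enterprise':  # keep only matching rows
            print(f"{row[1]}: ${float(row[3]):,.2f}")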
4. Group By with Sum
Sum a numeric column separately for each distinct value in a grouping column.
print({g: f"${sum(float(row[3]) for row in __import__('csv').reader(open(path)) if row[6] == g and row[0] != 'transaction_id'):,.2f}" for g in set(row[6] for row in __import__('csv').reader(open(path)) if row[0] != 'transaction_id')})
Output:
{'Asia Pacific': '$326,551.75', 'Europe': '$502,252.00', 'North America': '$985,556.00'}
The dictionary comprehension first extracts the distinct values from column 6 using a set comprehension, then sums column 3 for each group. It stays memory-friendly because it uses generator expressions. Change the indices to group by different fields.
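Because the one-liner re-reads the file once per group, a single-pass sketch with collections.defaultdict may be friendlier for larger files (same assumed layout: region in column 6, amount in column 3):

import csv
from collections import defaultdict

path = "/content/data.csv"

totals = defaultdict(float)
with open(path) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        totals[row[6]] += float(row[3])  # one pass, one running sum per region

print({region: f"${amount:,.2f}" for region, amount in totals.items()})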
5. Filter by a Threshold and Sort
Find and sort all records above a certain numeric threshold.
print([(n, f"${v:,.2f}") for n, v in sorted([(r[1], float(r[3])) for r in list(__import__('csv').reader(open(path)))[1:] if float(r[3]) > 100000], key=lambda x: x[1], reverse=True)])
Output:
[('Phi Corp', '$176,500.25'), ('Zeta Systems', '$156,000.00'), ('Omega Technologies', '$134,600.50'), ('Omicron LLC', '$128,900.00'), ('Matrix Systems', '$105,600.25')]
This filters rows where column 3 exceeds 100000, builds (name, value) tuples, sorts them by value in descending order, and then formats the values for display. Adjust the threshold and columns as required.
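Written step by step, the threshold and sort order are easier to tweak; a sketch under the same column assumptions:

import csv

path = "/content/data.csv"
threshold = 100000  # adjust as needed

with open(path) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    big = [(row[1], float(row[3])) for row in reader if float(row[3]) > threshold]

big.sort(key=lambda x: x[1], reverse=True)  # largest first
print([(name, f"${value:,.2f}") for name, value in big])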
6. Count Distinct Values
Quickly find how many distinct values appear in a column.
print(len(set(r[2] for r in __import__('csv').reader(open(path)) if r[0] != 'transaction_id')))
Output:
Here, set() collects the distinct values from column 2, and len() counts them. This is helpful for assessing data variety or finding how many categories exist.
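Expanded, the distinct-count pattern is just a set built while streaming the file; a sketch assuming column 2 holds the category:

import csv

path = "/content/data.csv"

with open(path) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    distinct = {row[2] for row in reader}  # a set keeps only unique values

print(len(distinct), "distinct values:", sorted(distinct))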
7. Conditional Aggregation
Calculate averages or other statistics for a specific subset of your data.
print(f"Average: ${sum(float(r[3]) for r in __import__('csv').reader(open(path)) if r[6] == 'North America' and r[0] != 'transaction_id') / sum(1 for r in __import__('csv').reader(open(path)) if r[6] == 'North America' and r[0] != 'transaction_id'):,.2f}")
Output:
This one-liner averages column 3 over the rows that match the condition on column 6, dividing the sum by the count (computed with a second generator expression). It reads the file twice but keeps memory use low.
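To avoid the double read, you can accumulate the sum and the count in a single pass; a sketch assuming the region is in column 6 and the amount in column 3:

import csv

path = "/content/data.csv"

total = count = 0
with open(path) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        if row[6] == 'North America':
            total += float(row[3])
            count += 1

print(f"Average: ${total / count:,.2f}" if count else "No matching rows")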
8. Multi-Column Filter
Apply multiple filters at once across different columns.
print("n".join(f"{r[1]} | {r[2]} | ${float(r[3]):,.2f}" for r in __import__('csv').reader(open(path)) if r[2] == 'Software' and float(r[3]) > 50000 and r[0] != 'transaction_id'))
Output:
Zeta Systems | Software | $156,000.00
Iota Industries | Software | $67,500.25
Omicron LLC | Software | $128,900.00
Sigma Corp | Software | $89,700.75
Phi Corp | Software | $176,500.25
Omega Technologies | Software | $134,600.50
Nexus Corp | Software | $92,300.75
Apex Industries | Software | $57,800.00
This chains multiple filters with the and operator, mixing string equality and numeric comparisons, and formats the matching rows for display.
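The same multi-condition filter, expanded (category assumed in column 2, amount in column 3):

import csv

path = "/content/data.csv"

with open(path) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        if row[2] == 'Software' and float(row[3]) > 50000:  # both conditions must hold
            print(f"{row[1]} | {row[2]} | ${float(row[3]):,.2f}")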
9. Column Statistics
Produce the min, max, and average of a numeric column in one shot.
vals = [float(r[3]) for r in __import__('csv').reader(open(path)) if r[0] != 'transaction_id']; print(f"Min: ${min(vals):,.2f} | Max: ${max(vals):,.2f} | Avg: ${sum(vals)/len(vals):,.2f}"); print(vals)
Output:
Min: $8,750.25 | Max: $176,500.25 | Avg: $62,564.13
[45000.0, 12500.5, 78900.0, 23400.75, 8750.25, 156000.0, 34500.5, 19800.0, 67500.25, 91200.75, 28750.0, 43200.5, 76800.25, 15600.75, 128900.0, 52300.5, 31200.25, 89700.75, 64800.0, 22450.5, 176500.25, 38900.75, 27300.0, 134600.5, 71200.25, 92300.75, 18900.5, 105600.25, 57800.0]
This builds a list of floats from column 3, then computes the min, max, and average in one line. The semicolons separate the statements. It uses more memory than streaming, but it is faster than reading the file multiple times for these statistics.
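If you want a few more statistics without extra dependencies, the standard-library statistics module covers the basics; a sketch on the same column:

import csv
import statistics

path = "/content/data.csv"

with open(path) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    vals = [float(row[3]) for row in reader]

print(f"Min: ${min(vals):,.2f} | Max: ${max(vals):,.2f}")
print(f"Mean: ${statistics.mean(vals):,.2f} | Median: ${statistics.median(vals):,.2f}")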
10. Export Filtered Data
Create a new CSV file that contains only the rows that meet your condition.
__import__('csv').writer(open('filtered.csv','w',newline="")).writerows([r for r in list(__import__('csv').reader(open(path)))[1:] if float(r[3]) > 75000])
This reads the CSV, filters rows based on a specific condition, and writes them to a new file. The newline="" parameter prevents extra blank lines. Note that this example skips the header (using [1:]), so include it explicitly if you need a header in the output file.
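If you do want the header in the output, and tidier file handling, an expanded version might look like this (same assumed layout and the same filtered.csv output name):

import csv

path = "/content/data.csv"

with open(path) as src, open('filtered.csv', 'w', newline='') as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    writer.writerow(next(reader))  # copy the header row
    writer.writerows(row for row in reader if float(row[3]) > 75000)  # keep rows above the threshold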
Wrapping Up
I hope you find these CSV one-liners useful.
These one-liners are useful for:
- Quick data checks and verification
- Simple data transformations
- Prototyping before writing full scripts
But you should avoid them for:
- Production data processing
- Files that need complex error handling
- Multi-step transformations
These techniques work with Python's built-in csv module whenever you need quick answers without setup overhead. Happy analyzing!
Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. She also creates engaging resource overviews and coding tutorials.



