What's in My Bookmarks Bar: Data Science Edition


Photo by the Author
Getting Started
Keeping up with data science isn't always easy. New libraries, papers, datasets, and tools appear every day, and I can't remember them all. I've found that just following news feeds or threads doesn't really work; it helps far more to keep a few tools within reach. My bookmarks act as a small hub where research, code repositories, datasets, visualizations, and quick references live in one place. After trying a bunch of options, I've settled on 10 bookmarks that I use all the time. They help me stay focused, save time, and know what's going on. I open them every morning, and they set the tone for my day. Here's a look at my top bookmarks and why I keep them:
1. arXiv: Machine Learning (cs.LG) New Papers
arXiv is where I explore the latest machine learning research. The cs.LG section covers everything from theory to applications of machine learning in NLP, vision, and RL. I keep it bookmarked and check back often so I don't miss papers that might inspire new ideas or projects. It's a great way to stay ahead and learn about new methods before they show up in blog posts or on GitHub.
2. GitHub Trending Python Repos
This page features the most popular Python projects each week, including new libraries and experimental tools. I keep it bookmarked because data science isn't only about algorithms; it's also about tooling. Scanning the trending list helps me spot useful libraries or patterns early, before they become mainstream. Just 10 minutes a week here usually gives me one or two things to try.
3. Data Is Plural
Data Is Plural is a newsletter and archive full of unusual and interesting datasets. I keep it bookmarked because it's a great source of ideas for projects, tutorials, or hackathon challenges. Each dataset comes with a short description and a link. It's an easy way to explore new data and find ideas beyond Kaggle or other standard sources.
4. The Rundown AI
The Rundown AI rounds up the top AI and machine learning news and papers, saving hours of searching. Whether it's a new paper, a tool release, or an emerging method, it gives a quick overview so I can see what's relevant. Basically, it's an easy way to stay informed and keep up with trends.
5. RAWGraphs
RAWGraphs is a free, browser-based tool for creating clean charts quickly. I can build intuitive visualizations from CSV or JSON without writing complex Matplotlib or Seaborn code. It's great for spotting trends and outliers, or for adding charts to reports. Charts export easily in vector formats, so they look sharp in slides or articles.
6. Quartz Bad Data Guide
The Quartz Bad Data Guide is one of my go-tos whenever I'm cleaning up messy data. It walks through common problems like missing values, garbled text, inconsistent formatting, and erroneous numbers, and gives tips on how to fix them. Dirty data is part of the job, and this guide saves me a lot of troubleshooting. I also like that it's organized by who should fix what, which makes tracking down and resolving issues much easier.
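Two of the problems mentioned above, missing values and inconsistent formatting, can be flagged with a short script. The sketch below is only an illustration and is not taken from the guide; the sample rows, the `signup_date` column, and the `audit_rows` helper are all hypothetical, and it uses only the Python standard library.

```python
import csv
import io
import re

# Hypothetical sample data: row 2 has an empty amount and a non-ISO date.
SAMPLE_CSV = """id,signup_date,amount
1,2024-01-05,10.5
2,05/01/2024,
3,2024-01-07,7.25
"""

# Assumption for this sketch: dates are expected in YYYY-MM-DD form.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def audit_rows(text):
    """Return (row_number, column, problem) tuples for the given CSV text."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        # Flag empty or absent cells in any column.
        for column, value in row.items():
            if value is None or value.strip() == "":
                problems.append((row_number, column, "missing value"))
        # Flag dates that do not match the expected format.
        date = row.get("signup_date") or ""
        if date and not ISO_DATE.match(date):
            problems.append((row_number, "signup_date", "non-ISO date format"))
    return problems


issues = audit_rows(SAMPLE_CSV)
for issue in issues:
    print(issue)
```

A check like this only surfaces problems; deciding how to fix them (and who should fix them) is where the guide itself is more useful.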
7. Five Minute Stats
Five Minute Stats is a quick reference for important statistics concepts and formulas. I can review topics like hypothesis testing, probability distributions, correlation, and descriptive statistics in just a few minutes. It comes in handy when I'm checking my math, preparing lessons, or writing tutorials without digging through textbooks.
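As a small illustration of the kind of concepts listed above, here is a minimal sketch that computes descriptive statistics and a Pearson correlation coefficient with the Python standard library. The data and the `pearson_r` helper are made up for this example and do not come from the reference itself.

```python
import math
import statistics

# Hypothetical paired observations, e.g. hours studied and quiz score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [2.0, 4.0, 5.0, 4.0, 5.0]


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Descriptive statistics for the scores.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
stdev_score = statistics.stdev(scores)  # sample standard deviation

r = pearson_r(hours, scores)

print(f"mean={mean_score}, median={median_score}, stdev={stdev_score:.3f}")
print(f"pearson r={r:.3f}")
```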
8. Awesome Data Analysis
Awesome Data Analysis is a GitHub collection of tools and resources covering every part of the data workflow. I keep it bookmarked because it's great for data cleaning, manipulation, visualization, and machine learning pipelines. When I'm trying new libraries, updating my toolkit, or sharing resources with colleagues or students, it helps me quickly find reliable, well-maintained tools.
9. Mockaroo
Mockaroo is a tool for generating random data and mocking APIs. I can quickly create realistic data in CSV, JSON, SQL, or Excel without typing everything by hand. It's great for testing code, dashboards, or machine learning workflows, including tricky edge cases. Mock APIs also let me work on the frontend and backend at the same time.
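To show the general idea of generated test data, here is a minimal standard-library sketch that builds a few fake user rows, appends one deliberate edge case, and writes them out as CSV. It does not use or imitate the tool's own API; the field names, the name list, and the `make_mock_users` and `to_csv` helpers are hypothetical.

```python
import csv
import io
import random

FIELDS = ["id", "name", "email", "age"]


def make_mock_users(n, seed=0):
    """Build n random user rows plus one deliberate edge-case row."""
    rng = random.Random(seed)  # seeded so test runs are repeatable
    first_names = ["Ada", "Grace", "Alan", "Edsger"]
    rows = []
    for i in range(1, n + 1):
        name = rng.choice(first_names)
        rows.append({
            "id": i,
            "name": name,
            "email": f"{name.lower()}{i}@example.com",
            "age": rng.randint(18, 90),
        })
    # Edge case: empty strings and a zero age to exercise validation code.
    rows.append({"id": n + 1, "name": "", "email": "", "age": 0})
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = make_mock_users(5)
csv_text = to_csv(rows)
print(csv_text)
```

Seeding the generator keeps the output stable between runs, which helps when the mock data feeds automated tests.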
10. Foorilla
Foorilla is a platform for tech and data jobs. I use it to search for new openings, follow companies, and filter jobs by title, location, or remote options. You can also export listings in CSV or JSON, which makes it easy to track opportunities. It's an easy way to stay up to date on the job market without jumping between multiple sites.
Kanwal Mehreen is a machine learning engineer and technical writer with a strong interest in data science and the intersection of AI and medicine. She co-authored the ebook "Maximizing Productivity with ChatGPT". As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She has also been recognized as a Teradata Diversity in Tech Scholar, a Mitacs Globalink Research Scholar, and a Harvard WeCode Scholar. Kanwal is a passionate advocate for change who founded FEMCodes to empower women.



