Shining a Light on Dirty Data

Mori
Sep 15, 2023

The Intriguing World of Dirty Data

Welcome to our journey into the world of dirty data! If you’re a data scientist, today’s discussion should pique your interest. This post shines a light on an aspect of data science that can make or break the outcomes of your projects: the quality of the data itself.

We’ve all heard the phrase “garbage in, garbage out”. But how often do we pause to consider the quality and integrity of the data we’re feeding into our models? Quite often, the data we start with isn’t perfect. It’s messy, noisy, and, yes, you guessed it, dirty. Even the best-prepared data scientists find themselves wrestling with raw, dirty data at some point in their work.

But fear not! This post aims to serve as your guide to managing and cleaning dirty data and to building more effective data strategies. We will cover the basics, explore where dirty data comes from, and, critically, examine the consequences it can have on your models. Finally, we’ll delve into some practical Python-based…

Written by Mori

Data Scientist/Machine Learning Engineer | Passionate about solving real-world problems | PhD in Computer Science