Are you a Data Hygienist?

My first real job in my formative years was as a line cook at a Big Boy family restaurant in north-central Indiana. I learned many lessons about working with people, teamwork, customer service, inventory management, and managing a business. I remember a grizzled cook who had been in that kitchen for many years telling me during a break in the dinner rush: “If you’ve got time to lean, you’ve got time to clean.” That mantra has stuck with me for the rest of my career.

In a conversation today with my partners, Jason Thompson and Michael Epstein, we discussed the dusty data landscape of Microsoft Teams and other collaboration sites. Every organization, including Kindato, struggles daily to ensure that team members don’t accidentally rely on erroneous chats and meeting notes. Naturally, the conversation then turned to how to manage SharePoint and shared folders across a myriad of cloud platforms.

My career in technology allows me to play with terabytes (and sometimes petabytes) of unstructured data. Now I’m spending quite a bit of time in Microsoft Teams, Slack, Trello, and a few other collaboration technologies across the different projects in which we are engaged. These tools are amazing at providing the foundation for remote teams to continue working together, especially in the middle of a pandemic.

The challenge is that collaboration technologies often become a “data dumping ground” for all sorts of data such as meeting notes, files, meeting recordings, and chats. This makes the data very difficult to manage on a day-to-day basis and raises the risk of retaining large volumes of unknown data.

Building data hygiene and data management into the culture of your organization is a healthy way to manage the exploding volume of data that is stored across these platforms. 

Here are some approaches to good data hygiene. 

  • Define “Sources of Truth” – All systems are potentially relevant in an eDiscovery matter. However, you can designate specific sources for retaining records and documents, such as a document management system (DMS) or even a simple file share or cloud storage location. These sources need to be communicated to team members on a regular basis and included in the onboarding process. This of course ties into an Information Governance program that captures those types of data along with their data sources.

  • Use Version Control & Tags – Collaboration tools have some amazing new tricks to manage data. For example, the Microsoft 365 suite tracks version history for its files (e.g., Word, PowerPoint, etc.) so that you don’t have to create a new file with each change. Now the actual document itself becomes its own “Source of Truth”. Tagging can also help track documents. This includes “Forced Tagging”, which requires a user to select the tag that best fits a document before it is stored in the system of record. You can also embed custom tags in a document and use them for searching down the road.
  • Hire a “Chief Data Hygienist” – For an organization that really wants to manage data across the enterprise, having a Chief Data Hygienist (aka Chief Information Governance Officer, or CIGO) can help build and maintain a program that tracks what data you have, where it lives, and how it has been used. These types of organizations embrace the idea that data carries value as well as risk.
  • Act as your own “Data Hygienist” – I like to go through my data sources and clean things up when I have some down time. I find it cathartic to look at folders and files and decide whether I really need to keep that data. I make it a practice to spend 15-20 minutes every couple of days seeing what’s going on in my data sources, which calls back to my “Got time to lean, got time to clean” mantra. This also helps me mentor my team members on the best practices of managing data inside a particular platform (I’m looking at you, Microsoft Teams!!).
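
For a local folder or a synced file share, that periodic review can be partly automated. The following is only a minimal Python sketch, not a feature of any of the platforms mentioned above: the function name find_stale_files and the one-year threshold are assumptions made for illustration. It lists files that have not been modified in a given number of days so that a person can decide what to keep. It deletes nothing.

```python
import time
from pathlib import Path


def find_stale_files(root, max_age_days=365, now=None):
    """Return (path, age_in_days) pairs for files not modified in
    more than max_age_days, oldest first.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    now = time.time() if now is None else now
    cutoff_seconds = max_age_days * 86400
    stale = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            age_seconds = now - path.stat().st_mtime
            if age_seconds > cutoff_seconds:
                stale.append((path, int(age_seconds // 86400)))
    # Oldest files first, so the likeliest cleanup candidates lead the list.
    return sorted(stale, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Report only: review the list by hand before removing anything.
    for path, age_days in find_stale_files(".", max_age_days=365):
        print(f"{age_days:>5} days  {path}")
```

Last-modified time is only a rough signal. An old file may still be a record you are required to retain, so a list like this is a starting point for review rather than a deletion list.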

It is very easy to go down the data rabbit hole. No storage system or technology is perfect, and eDiscovery practitioners will still need to track down relevant data sources in their practice. But if you, as an individual contributor, take some time to keep good data hygiene in mind, you’ll find that even incremental cleaning will make a world of difference.