What is Dark Data in Data Analytics – Data Science Jargon for Beginners

Updated December 4, 2017

Dark data, not to be confused with Darth Vader…

Haha, well, I just couldn’t resist the Star Wars reference, especially with the newest movie being released in less than two weeks! I am a really big Star Wars fan!

Anyway, the reason you are here… dark data.

Dark data is data that a company stores in a database but never uses. This is information that was gathered and processed but never analyzed. A data analyst gathered the data but never translated it into usable insights.

Most businesses have a lot of dark data lying around in their databases. There is so much data in the world today that they simply can’t make use of all of it. So we are left with a lot of “dark data” that is ready to be used but sits unused.

Why hang on to dark data? Isn’t it just taking up valuable database space? Well, yes, it is… but think about when your boss comes to you and says, “Why did you spend $XXX,XXX on X???” If you kept the dark data that relates to your decision but was never analyzed at the time, you can go back to it and add that proof to your earlier research.

The power of data lies in fact-based decisions. What sounds better to the executives upstairs?

  • I had a really, really good feeling about this decision.
  • The data clearly shows that this decision is the best decision.

I would vote for the data. Keeping dark data around helps a company back up and explain the decisions it makes.


