A long, long time ago, when the Earth was still young, an innocent computer science graduate entered a castle, looking for work. The castle’s occupants were kind and took him in. Surrounded by experienced analysts, designers, programmers and testers, he developed his skills and flourished for a number of years, gaining a reputation for answering difficult questions and eventually developing a particular liking for the now archaic skill of “database design”.
One day, a great ogre (whose name was “IT Director”) challenged him to a contest: “Find me a database that will meet all my needs, now and in the future!”
The database designer winced at such a mighty test of his skills. There were many database platforms available, but which one to choose? Would it be a relational database, with the confidence of structure, but inflexible in dealing with the unknown? What about an indexed sequential flat-file, record-based solution, with quicker access times but little structure? Perhaps a hierarchical solution, with rapid search times, but much duplication? He even considered no database at all, just a large blob of data that would hold everything, but that would be nearly impossible to use.
He deliberated for many years without reaching a conclusion. Many fads came and went, with each vendor adding to the capabilities of their preferred solution.
The indexed sequential vendors suffered because they spoke only the laughably outdated COBOL language, although their method of accessing records was swift and efficient.
The Oracle vendor insisted that their database was the best, adding many tools and techniques and inflating the price accordingly. And then a bit more. And then a bit more.
The XML vendor insisted that his structure could hold anything known or unknown and it seemed to be true, until the designer realised that he would never be able to find anything that he’d stored in time to use it, unless the database was very small.
The MySQL vendor’s offering looked promising, with all the benefits of the other relational database vendors, but without the cost, and the designer started to take a great interest in his wares. Unfortunately for the MySQL vendor, the Oracle vendor ate him.
Several decades later, our hero was no closer to his goal. The ogre grew impatient. “All this time and nothing to show for it? Why, you have not even decided which vendor to use!”
The designer grew sad and started to wander the land, visiting conference after conference; surely there was an answer to his problem?
Then, out of nowhere, he heard a small voice: “The answer you seek lies with NoSQL”.
The designer turned to see a small gnome with a brown beard and a tweed cap and waistcoat standing in the shadows. “What is this NoSQL?” enquired the designer, “and how can there be such a thing?”
“Ah”, answered the gnome, “it is the future, my friend. For in the future there will be no need to store data efficiently, as servers and storage will be dirt cheap and plentiful”.
“And where will these plentiful servers be kept?” queried the designer, for he knew that physical space was limited in the farms available to the kingdom.
“In the very clouds themselves” answered the gnome, mysteriously.
Ignoring the fact that servers are heavy and certainly don’t float in the air (as well as the poorly conceived, obvious attempt at a gag), the designer started to take interest. If the gnome spoke the truth, any amount of data could be stored in structures that could be created dynamically in real time. It seemed that the great and exponentially increasing volumes of data that the ogre had to provide to his masters, the Giants, could easily be accommodated within such silos. There was no end to their capacity, if the gnome was to be believed, as the wide array of these cloud-based servers could be increased at any time.
“But how do I get the data out again?” asked the designer. “For surely, the data is not as important in and of itself; it is the information that can be derived from it that matters.”
Nervously, the gnome laughed. “Er, don’t you worry about that. You’ll be able to see all sorts of trends in the creation of the data through this freely distributed magic item: the window of Hadoop”.
“That seems like a made up name…” thought the designer.
He looked sceptical. “What about the data itself? Retrieving the objects is fine and dandy when I need the whole thing, but how do I get access to it at an individual property level? I need operational data for the ogre to run his machinations for the Giants.”
The gnome looked sheepish and turned defensive. “Operational data? No one does that anymore! It’s all social media and ‘Internet of Things’ now, matey! Billions of data items, swilling around the place and you need to store it all!”
The designer considered this. Was the gnome correct? Was there really no need to hold and use operational data? He reached a decision.
“No – you’re wrong. For if there is no operational data, the machinery will halt and no amount of social media selfies or fridge content warnings will change the fact that the Giants won’t be able to add to their piles of gold. Goodbye!” And, with that, the designer turned his back on the charlatan and continued on his merry way.
It was on his return to the castle that a revelation hit the designer. Picking himself back up, he ran to his whiteboard and started to draw…
“So, have you solved the problem?” demanded the ogre.
“Yes, Sire, I believe I have” responded the designer.
“Well, which database is the one we shall use? SQL or NoSQL?”
“Both” replied the designer, smiling.
“SQL databases are perfect for storing structured operational data (as long as they are well designed by someone who knows what they are doing),” he winked, “and NoSQL databases are great at storing large amounts of unstructured, little-used data objects. By combining the two, using the external connectivity available to relational database platforms, we can store the high-volume, low-information data in a NoSQL database and the low-volume, higher-value operational data in a SQL database. In that way, we gain the benefits of low-cost, open-source relational database options such as PostgreSQL, distributed over a small number of higher-powered servers, together with low-cost NoSQL repositories such as Cassandra, spread over the cloud.”
The ogre was gobsmacked. He grinned. “I believe you might be onto something!” For now the ogre could serve his operational data machine, and everyone knows that Giants like BIG data!