LESS IS MORE: DEEP LEARNING, BIG DATA, SMALL DATA AND NO DATA

The idea that Deep Learning is directly tied to Big Data (not mere correlation, but full-blown implication) appears in courses, lectures, and articles everywhere you look… and that is unfortunate.

Theory doesn’t always agree with practice, and that is where I’m coming from. Aside from the fact that the term Big Data is usually misused, mostly to mean large quantities of information rather than what it actually describes (multi-modal, unstructured data stores), I truly believe this association is almost oxymoronic, and here’s why:

I often read that “in order to train neural networks, large quantities of data are required”. That used to be true, I’ll give you that, and it still holds for some particular problems. However, training approaches such as few-shot, one-shot, and even zero-shot learning have been known for quite some time and have been used successfully by many, including my team and me (a minimal sketch of the few-shot setup follows below). Yet another dimension of this argument, although a wee bit “lighter” than the “no Deep Learning” one, is promoted by those who believe in classical statistical approaches and shallow machine learning: those arguing that classic machine learning methods are “almost as powerful as Deep Learning and require a fraction of the data that a neural model needs to be properly trained”. Then again, you might come along and argue that shallow machine learning is the only answer for certain domains, but we are not going to address that part of the subject in this article.
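To make the “small data” point concrete, here is a minimal sketch of an episodic, prototypical-network style few-shot step. The tiny encoder, the 5-way/1-shot setup, and the random tensors are illustrative assumptions made for the example only, not a description of any particular production system.

```python
# Minimal sketch of one episodic few-shot step (prototypical-network style).
# Encoder size, shot/way counts, and tensor shapes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Tiny embedding network; any backbone could stand in here."""
    def __init__(self, in_dim=64, emb_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, emb_dim))

    def forward(self, x):
        return self.net(x)

def few_shot_episode(encoder, support_x, support_y, query_x, n_way):
    """One N-way, K-shot episode: build class prototypes from the support set,
    then classify queries by distance to the nearest prototype."""
    emb_s = encoder(support_x)                      # (N*K, emb_dim)
    emb_q = encoder(query_x)                        # (Q, emb_dim)
    prototypes = torch.stack([emb_s[support_y == c].mean(0) for c in range(n_way)])
    dists = torch.cdist(emb_q, prototypes)          # (Q, N)
    return F.log_softmax(-dists, dim=1)             # nearer prototype => higher score

# Toy usage: a 5-way, 1-shot episode on random data.
encoder = Encoder()
support_x, support_y = torch.randn(5, 64), torch.arange(5)
query_x, query_y = torch.randn(10, 64), torch.randint(0, 5, (10,))
log_p = few_shot_episode(encoder, support_x, support_y, query_x, n_way=5)
loss = F.nll_loss(log_p, query_y)  # this loss would drive an ordinary optimizer step
```

The point of the episodic formulation is that the model learns to compare, not to memorize classes, which is exactly why it can generalize from a handful of labelled examples.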

For now, let’s just take an objective look at this idea: I, together with my team and many other practitioners in this field, strongly believe that the only path to simulating intelligent processes and cognitive pathways is the automated discovery of their representations and the formalization of those representations into mathematical graphs, or Deep Learning, in short. For all the other concerns presented in this short argument we have a short, four-word answer as well: machine generated artificial data. As some prominent figures in our field say, probably the most important discovery in recent years is the emergence of generative adversarial training. Well, that is exactly what my team and I successfully employ in order to simulate and generate virtually infinite amounts of data for our training processes.
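For the curious, here is a hedged, toy sketch of what adversarial training looks like when used as a synthetic-data generator. The network sizes, the stand-in “real” distribution, and the hyper-parameters are assumptions made purely for illustration; they do not describe our actual pipeline.

```python
# Toy sketch of adversarial training as a synthetic-data generator.
# All sizes, distributions, and hyper-parameters are illustrative assumptions.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 8

G = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def real_batch(n=128):
    # Stand-in for the small real dataset we want to amplify.
    return torch.randn(n, data_dim) * 0.5 + 2.0

for step in range(1000):
    # --- discriminator: tell real samples from generated ones ---
    real = real_batch()
    fake = G(torch.randn(real.size(0), latent_dim)).detach()
    d_loss = bce(D(real), torch.ones(real.size(0), 1)) + \
             bce(D(fake), torch.zeros(fake.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # --- generator: try to fool the discriminator ---
    fake = G(torch.randn(128, latent_dim))
    g_loss = bce(D(fake), torch.ones(128, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Once trained, the generator yields "virtually infinite" synthetic samples:
synthetic = G(torch.randn(10000, latent_dim)).detach()
```

Once the two networks reach a reasonable equilibrium, sampling more training data is just a forward pass through the generator, which is what makes the “infinite data” claim practical rather than rhetorical.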

From this point a different set of issues might arise, such as: resemblance of the artificial data to confidential real data, protection of certain patterns that could raise privacy issues, and other privacy-related challenges. Besides carefully tackling all of these, we are also actively promoting Differential Privacy, Federated Machine Learning, and Secured Encrypted Deep Learning together with a growing international community.
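As a rough illustration of the Differential Privacy side of that effort, the sketch below shows one DP-SGD style update: clip each per-example gradient, add Gaussian noise, then average. The clipping norm and noise scale are arbitrary placeholders, not calibrated privacy parameters, and the model is a toy.

```python
# Rough sketch of one DP-SGD style update: per-example gradient clipping
# followed by Gaussian noise. Clip norm and noise scale are illustrative
# placeholders, not calibrated privacy parameters.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()
clip_norm, noise_std, lr = 1.0, 0.5, 0.1

def dp_sgd_step(batch_x, batch_y):
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(batch_x, batch_y):                 # per-example gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        grads = [p.grad.detach().clone() for p in model.parameters()]
        total = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = min(1.0, clip_norm / (total + 1e-12))  # clip to bound sensitivity
        for s, g in zip(summed, grads):
            s += g * scale
    with torch.no_grad():
        for p, s in zip(model.parameters(), summed):
            noisy = (s + noise_std * clip_norm * torch.randn_like(s)) / len(batch_x)
            p -= lr * noisy                            # noisy averaged update

dp_sgd_step(torch.randn(32, 10), torch.randn(32, 1))
```

The clipping bounds how much any single example can influence the update, and the added noise masks what remains, which is the core intuition behind training on sensitive data without memorizing individuals.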