July 27, 2018 at 03:00PM
In recent years I have run into a number of misconceptions regarding AI, and sometimes when discussing AI with people from outside the field, I feel like we are talking about two different topics. This article is an attempt to clarify what AI practitioners mean by AI and where the field currently stands.
The first misconception has to do with Artificial General Intelligence, or AGI:
- Applied AI systems are just limited versions of AGI
Despite what many think, the state of the art in AI is still far behind human intelligence. Artificial General Intelligence (AGI) has been the motivating fuel for AI scientists from Turing to today. Somewhat analogous to alchemy, the eternal quest for an AGI that replicates and exceeds human intelligence has resulted in the creation of many techniques and scientific breakthroughs. The pursuit of AGI has helped us understand facets of human and natural intelligence, and as a result, we’ve built effective algorithms inspired by our understanding and models of them.
However, when it comes to practical applications of AI, AI practitioners do not necessarily restrict themselves to pure models of human decision making, learning, and problem solving. Rather, in the interest of solving the problem and achieving acceptable performance, AI practitioners often do what it takes to build practical systems. At the heart of the algorithmic breakthroughs that resulted in Deep Learning systems, for instance, is a technique called back-propagation. This technique, however, is not how the brain builds models of the world. This brings us to the next misconception:
- There is a one-size-fits-all AI solution.
A common misconception is that AI can be used to solve every problem out there, i.e. that state-of-the-art AI has reached a level where minor configurations of ‘the AI’ allow us to tackle different problems. I’ve even heard people assume that moving from one problem to the next makes the AI system smarter, as if the same AI system is now solving both problems at the same time. The reality is quite different: AI systems need to be engineered, sometimes heavily, and require specifically trained models in order to be applied to a problem. And while similar tasks, especially those involving sensing the world (e.g., speech recognition, image or video processing), now have a library of available reference models, these models need to be specifically engineered to meet deployment requirements and may not be useful out of the box. Furthermore, AI systems are seldom the only component of AI-based solutions. It often takes many tailor-made, classically programmed components working together to augment one or more AI techniques used within a system. And yes, there is a multitude of different AI techniques out there, used alone or in hybrid solutions in conjunction with others. It is therefore incorrect to say:
- AI is the same as Deep Learning
Back in the day, we thought the term artificial neural networks (ANNs) was really cool. Until, that is, the initial euphoria around its potential backfired due to its inability to scale and its tendency to overfit. Now that those problems have, for the most part, been resolved, we’ve avoided the stigma of the old name by “rebranding” artificial neural networks as “Deep Learning”. Deep Learning or Deep Networks are ANNs at scale, and the ‘deep’ refers not to deep thinking, but to the number of hidden layers we can now afford within our ANNs (previously it was a handful at most, and now they can be in the hundreds). Deep Learning is used to generate models from labeled data sets. The ‘learning’ in Deep Learning methods refers to the generation of the models, not to the models being able to learn in real time as new data becomes available. The ‘learning’ phase of Deep Learning models actually happens offline, needs many iterations, is time- and processing-intensive, and is difficult to parallelize.
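To make the offline nature of this ‘learning’ concrete, here is a minimal sketch. It is not a deep network: as a deliberately simplified stand-in, it fits a two-parameter linear model by plain gradient descent. The data, the function names `train_offline` and `predict`, and the learning rate and iteration count are all assumptions chosen for illustration. What it shares with Deep Learning is the workflow: many iterations over labeled data up front, followed by a model that stays fixed while it is being used.

```python
def train_offline(examples, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on mean squared error.

    All of the 'learning' happens inside this loop, before deployment.
    """
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(model, x):
    """Apply the frozen model; nothing here updates its parameters."""
    w, b = model
    return w * x + b


# Hypothetical labeled data set generated from y = 2x + 1.
labeled_data = [(float(x), 2.0 * x + 1.0) for x in range(5)]

# Offline phase: thousands of passes over the same labeled examples.
model = train_offline(labeled_data)

# Deployment phase: the model answers queries but does not learn from them.
new_inputs = [10.0, 20.0]
predictions = [predict(model, x) for x in new_inputs]
print("learned parameters:", model)
print("predictions:", predictions)
```

However many new inputs pass through `predict`, the parameters in `model` stay exactly as training left them. Adapting to new data means going back and running the training phase again.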
More recently, Deep Learning models have been used in online learning applications. The online learning in such systems is achieved using different AI techniques such as Reinforcement Learning or online neuro-evolution. A limitation of such systems is that the Deep Learning model can only contribute if most of the domain of use was experienced during the offline learning period. Once the model is generated, it remains static and is not entirely robust to changes in the application domain. A good example of this is in ecommerce applications: seasonal changes or short sales periods on ecommerce websites would require a deep learning model to be taken offline and retrained on sale items or new stock. However, platforms like Sentient Ascend, which use evolutionary algorithms to power website optimization, no longer need large amounts of historical data to be effective. Instead, they use neuro-evolution to shift and adjust the website in real time based on the site’s current environment.
For the most part, though, Deep Learning systems are fueled by large data sets, and so the prospect of new and useful models being generated from large and unique datasets has fueled the misconception that…
- It’s all about BIG data
It’s not. It’s actually about good data. Large, imbalanced datasets can be deceptive, especially if they only partially capture the data most relevant to the domain. Furthermore, in many domains, historical data can become irrelevant quickly. In high-frequency trading on the New York Stock Exchange, for instance, recent data is of much more relevance and value than, for example, data from before 2001, when the exchange had not yet adopted decimalization.
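A small, hedged illustration of how a large, imbalanced dataset can deceive: the sketch below uses a made-up dataset of 1,000 labels in which only 10 belong to the class we care about, and a hypothetical "model" that always predicts the majority class. The numbers and helper names are assumptions for illustration only.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def recall(predictions, labels, positive=1):
    """Fraction of the truly positive cases that were predicted positive."""
    predicted_for_positives = [
        p for p, y in zip(predictions, labels) if y == positive
    ]
    hits = sum(p == positive for p in predicted_for_positives)
    return hits / len(predicted_for_positives)


# Hypothetical imbalanced dataset: 990 negatives, 10 positives.
labels = [0] * 990 + [1] * 10

# A "model" that has learned nothing and always predicts the majority class.
always_negative = [0] * len(labels)

baseline_accuracy = accuracy(always_negative, labels)
baseline_recall = recall(always_negative, labels)
print(f"accuracy: {baseline_accuracy:.2f}, recall: {baseline_recall:.2f}")
```

The do-nothing predictor is right on 99% of the examples while catching none of the cases that matter. Adding more data with the same imbalance would not change that picture.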
Finally, a general misconception I run into quite often:
- If a system solves a problem that we think requires intelligence, that means it is using AI
This one is a bit philosophical in nature, and it does depend on your definition of intelligence. Indeed, Turing’s definition would not refute this. However, as far as mainstream AI is concerned, a fully engineered system, say to enable self-driving cars, which does not use any AI techniques, is not considered an AI system. If the behavior of the system is not the result of the emergent behavior of AI techniques used under the hood, if programmers write the code from start to finish, in a deterministic and engineered fashion, then the system is not considered an AI-based system, even if it seems so.
AI paves the way for a better future
Despite the common misconceptions around AI, the one correct assumption is that AI is here to stay and is, indeed, the window to the future. AI still has a long way to go before it can be used to solve every problem out there and be industrialized for wide-scale use. Deep Learning models, for instance, take many expert PhD-hours to design effectively, often requiring elaborately engineered parameter settings and architectural choices depending on the use case. Currently, AI scientists are hard at work on simplifying this task and are even using other AI techniques, such as reinforcement learning and population-based or evolutionary architecture search, to reduce this effort. The next big step for AI is to become creative and adaptive while, at the same time, being powerful enough to exceed human capacity to build models.
by Babak Hodjat, co-founder & CEO, Sentient Technologies