May 2014 Monthly Meeting Summary
Test Automation for Self-Modifying AI Systems - roundtable discussion facilitated by Rick Hower
'Artificial intelligence' includes software that can adapt and learn, i.e., self-modify. How can test automation
help us determine whether an AI-based system does what it should and doesn't do what it should not?
Such questions become more critical as AI-based systems are utilized more widely and grow in complexity.
* AI code examples and what aspects are 'self-modifying'
* Strongly Self-modifying AI vs Weakly Self-modifying AI - examples and challenges in test automation for such systems
* How AI test automation approaches can help with test automation for 'everyday' types of non-AI projects
Meeting took place on: Wed. May 21 2014 6:30 PM
- Initial discussion revolved around the introductory slides (pdf, 0.6MB)
and basics of AI, machine learning, and some example machine learning approaches, with a focus on
artificial neural networks.
- Some attendees had past experience with machine-learning projects. One indicated that their testing was mostly at the system level,
and that they also tried randomizing the order of training data and of data sets as part of their test approach.
- Some other discussion of AI testing involved sampling and the use of constraints.
- There was much discussion of weakly vs. strongly self-modifying systems. An example of the 'weak' type was
considered to be current ANN systems, where the weights between node connections are modified during 'learning'.
Somewhat more strongly self-modifying systems might be those that self-adjust the number of layers of nodes, or add
or delete nodes during learning.
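The distinction can be made concrete with a toy sketch (hypothetical code, not from the meeting): weak self-modification changes only numeric parameter values, while a somewhat stronger form alters the network structure itself.

```python
import random

# Weak self-modification: only the numeric weight values change during
# learning; the network structure stays fixed.
def weak_update(weights, gradient, lr=0.1):
    """One gradient-descent step over a flat list of weights."""
    return [w - lr * g for w, g in zip(weights, gradient)]

# Somewhat stronger self-modification: the structure itself changes,
# e.g. a randomly initialized node is appended when training error stalls.
def grow_layer(layer_weights, error, threshold=0.5):
    """Add a node (a new weight row) to the layer if error is too high."""
    if error > threshold:
        layer_weights.append([random.uniform(-1, 1) for _ in layer_weights[0]])
    return layer_weights
```

Both functions are illustrative stand-ins; `grow_layer`'s error-threshold rule is an invented heuristic, not a standard algorithm.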
- The discussion of strongly self-modifying systems noted that such AI systems do not yet exist. They might include systems
that learn how to modify the basic learning algorithms they utilize, or that self-modify their methods of gathering/pre-processing/filtering
data. Other possibilities could include systems that can self-replicate all or parts of their architecture, systems
that can self-modify the methods by which they interface with the world around them (data or physical interfaces), or
systems that can 'recruit' other AI or non-AI systems and learn to cooperate.
- It was speculated that machine-learning systems in the financial markets might be able to produce emergent behavior
(eg, difficult to predict and thus difficult to test) - such systems may learn to compete indirectly since they
often interface indirectly via a particular financial market such as the NASDAQ. Such competition could lead to
a form of cooperation.
- Some backpropagation ANN code was examined and aspects that were 'self-modifying' were discussed. There was further
discussion around the term 'self-modifying', noting it was something of a subjective term and could be taken to mean
self-modifying of system state, code parameters, system configurations, algorithms employed, system architecture,
network configurations, algorithm generation methods, node connection architecture, layer size and configuration,
methods of cooperation or competition among multiple systems, etc. 'Weak' vs. 'Strong' self-modification would
thereby just be a means of grouping them to facilitate discussion.
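As an illustration of where the 'self-modifying' aspect lives in such code, here is a minimal single-neuron backpropagation sketch (a hypothetical stand-in, not the code examined at the meeting): only the weight and bias values change as a side effect of training, making it 'weakly' self-modifying in the sense above.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(x, target, epochs=1000, lr=1.0):
    """Fit one sigmoid neuron to a single input/target pair.

    The only state that 'self-modifies' is the pair (w, b); the
    learning algorithm and structure are fixed by the programmer.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        out = sigmoid(w * x + b)
        # Gradient of squared error propagated back through the sigmoid.
        delta = (out - target) * out * (1.0 - out)
        w -= lr * delta * x
        b -= lr * delta
    return w, b

w, b = train(x=1.0, target=0.9)
# After training, the neuron's output approaches the 0.9 target.
```

A full ANN applies the same delta rule layer by layer; the point here is only to show which state changes during learning.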
- It was noted that the new Baxter robot from Rethink Robotics is capable of learning movements/object manipulation
and image learning/recognition. According to the company's web site it utilizes a multi-pronged approach to manage
risk and safety such that the robot can work safely alongside humans (a relatively new development in robotics).
It was also noted that Baxter is a fixed-in-place robot with no locomotion - it can only move its upper portions.
- There was some discussion of the significantly different risk levels and testing challenges of an AI-based
robotic system that was mobile instead of fixed.
- There was some discussion of Continuous Delivery software development approaches and how such approaches could
facilitate automated testing of self-modifying systems, especially strongly self-modifying systems. Given that CD
approaches include automated deployment of test environments and tests, a test environment could be deployed for each
modification of an AI-based algorithm/system, and then test automation run against it. Given an appropriate
infrastructure this could be done rapidly and continuously for any number of modifications (eg, rapid deployment
of thousands of environments and test runs). This approach would not of course test all possible self-modifications,
but could be part of an overall risk-management strategy.
- There was also discussion that, like with any large complex software-based system, testing could never be exhaustive
nor guarantee safety or behavior, but could effectively reduce risks.
- There was discussion of attendees' experience testing large, difficult-to-test systems. 'Stochastic' approaches
were mentioned - test automation which was essentially throwing enormous amounts of data (real, synthetic, or random)
at a system and analyzing results for anomalous behavior by either comparing to a test 'oracle' or by aggregating
results data and looking for anomalous patterns/outliers. There was discussion of mocking and other assistive
methods, and other possible approaches.
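The oracle-based stochastic approach might be sketched as follows (hypothetical code; `buggy_abs` is an invented system under test with a deliberate defect):

```python
import random

def stochastic_test(system, oracle, n=10_000, seed=0):
    """Throw large volumes of random inputs at the system and record
    every disagreement with a trusted oracle implementation."""
    rng = random.Random(seed)
    anomalies = []
    for _ in range(n):
        x = rng.uniform(-1000, 1000)
        got, expected = system(x), oracle(x)
        if got != expected:
            anomalies.append((x, got, expected))
    return anomalies

# Example: an absolute-value 'system' that is wrong for -500 < x < 0,
# caught by comparison against Python's built-in abs as the oracle.
buggy_abs = lambda x: x if x > -500 else -x
anomalies = stochastic_test(buggy_abs, abs)
```

The variant without an oracle would instead aggregate the outputs and search for statistical outliers, as noted above.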
- Employment of heavy logging capabilities was suggested as a part of testing self-modifying systems, and the
resulting need for methods to analyze large log data volumes. AI-based log analysis was suggested - using machine
learning to monitor system logs to provide indicators of anomalous behavior. Such indicators or 'trained' monitoring
systems might then be used for real-time monitoring of 'live' self-modifying systems.
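As a toy stand-in for such log monitoring (hypothetical; a real system would train a model on the logs rather than use this simple z-score rule):

```python
import math

def learn_baseline(values):
    """Learn the 'normal' mean and standard deviation of a log metric."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)

def flag_anomalies(values, mean, std, z=3.0):
    """Flag entries deviating more than z standard deviations from normal."""
    return [v for v in values if std > 0 and abs(v - mean) > z * std]

baseline = [100, 102, 98, 101, 99, 100, 103, 97]   # 'normal' response times
mean, std = learn_baseline(baseline)
alerts = flag_anomalies([100, 101, 250, 99], mean, std)
```

The same learned baseline could then drive real-time alerting on a 'live' system, per the suggestion above.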
- There was some discussion of testing of analogous systems, such as testing of human functioning/behavior
in flight simulators, credit scoring, etc., along with related risk-containment strategies.