INFER Software

Development of Computational Intelligence Platform for Evolving and Robust Predictive Systems

During the development of the INFER platform as a software product, various software architectures that are both easy to maintain and deliver high performance for the implemented methods have been investigated. Maintainability has been achieved through a highly modular design: a plug-and-play environment has been developed and applied, so that modules can be added, removed or swapped in as required. To achieve high performance, parallel computing solutions have been developed. Departing from classical software development processes such as the V-model, RUP or Waterfall, the project adopted the modern Scrum method.

Considerable effort has also been devoted to research into modern user interfaces that allow seamless and efficient interaction between the user and the platform, and an intuitive presentation of both the data flowing through the system and the pool of complex, dynamically changing predictive models.

 

Highlights

  • The INFER platform operates in the domain of adaptive predictive systems
  • INFER is an environment for predictive models, which can be generated automatically or manually
  • The predictive models within the INFER environment compete with each other for the right to produce a prediction
  • The models can also collaborate by forming ensembles
  • The platform constantly looks to improve existing predictive models by performing Hyper Parameter Optimization in the background, i.e. a search for optimal values of the non-trainable parameters of the models

 

High level view of the INFER platform

  • Server implemented in Java
  • External Data Sources – CSV, SQL database
  • Internal Storage – H2 Database
  • Client-Server connection – Java Web Services or RMI

 

Data flow project

  • A project groups the configuration of the whole data transformation flow
  • Multiple projects can be defined and run in parallel

 

Data Source

  • Imported from CSV file or external DB
  • H2, PostgreSQL and SQL Server supported
  • Column selection and type definition possible at import
  • Online data streaming
  • Can be shared by multiple projects

 

Data Stream

  • One data stream defined per project
  • Responsible for fetching samples from data source and feeding them to trainers at defined pace
  • Samples can be sent in batches
  • Chain of preprocessing nodes can be defined
  • Preprocessing node logic defined in a customizable method

 

Trainer

  • Multiple trainers per project allowed
  • Holds a group of predictive elements
  • Predictive elements can be added and removed at project runtime
  • Responsible for generating, training, validating and feeding samples to predictive elements
  • Defines training data range
  • Trainer functionality is defined in a set of customizable methods:
    • Instance selection method – defines how training data set is divided among predictive elements
    • Validation method – defines validation algorithm which can be used to judge PE quality before running the project
    • Generation method – defines algorithm for automatically generating predictive elements at project startup

 

Predictive Element (PE)

  • Generates and sends predictions to Global Performance Evaluation (GPE)
  • Main processing logic defined in customizable processing method
  • Generated by trainer or manually defined by user
  • Can be switched between trainers or switched off at runtime by unassigning from any trainer
  • Composite processing methods – contain Directed Acyclic Graphs (DAGs) of processing methods
  • DAGs can be easily edited using drag and drop

 

Global Performance Evaluation

  • Calculates final prediction based on predictions received from all connected predictive elements
  • Predictive Elements compete to produce the final (global) prediction on the basis of past performance

  • GPE functionality is defined in a set of customizable methods:
    • Performance function – calculates each PE performance based on prediction and target values
    • Global prediction function – generates final prediction based on performance values
    • Change detector – triggers PE adaptation based on performance values
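
The interplay of the three customizable GPE methods can be sketched as follows. The concrete choices are assumptions made for illustration and are not INFER's implementation: the performance function is an exponentially smoothed absolute error, the global prediction function is winner-takes-all, and the change detector is a fixed error threshold. The class name `GpeSketch` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and formulas are assumptions, not INFER's implementation.
final class GpeSketch {
    private final double decay;     // weight given to the past error, in [0, 1)
    private final double threshold; // smoothed error above which a change is signalled
    private final Map<String, Double> error = new HashMap<>(); // PE id -> smoothed error

    GpeSketch(double decay, double threshold) {
        this.decay = decay;
        this.threshold = threshold;
    }

    // Performance function: exponentially smoothed absolute error (lower is better).
    // Change detector: returns true when the smoothed error exceeds the threshold.
    boolean update(String peId, double prediction, double target) {
        double abs = Math.abs(prediction - target);
        Double previous = error.get(peId);
        double smoothed = previous == null ? abs : decay * previous + (1 - decay) * abs;
        error.put(peId, smoothed);
        return smoothed > threshold;
    }

    // Global prediction function: the PE with the lowest smoothed error wins.
    double globalPrediction(Map<String, Double> predictions) {
        String best = null;
        double bestError = Double.POSITIVE_INFINITY;
        for (String id : predictions.keySet()) {
            double e = error.getOrDefault(id, Double.MAX_VALUE); // unseen PEs rank last
            if (best == null || e < bestError) {
                best = id;
                bestError = e;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no predictions supplied");
        }
        return predictions.get(best);
    }
}
```

A weighted average of all predictions would be an equally valid global prediction function; winner-takes-all is used here only because it matches the "compete" wording most directly.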

 

Method Pool

  • Most of the functionality is defined in plugins, called methods
  • All available methods are shown to the user in the Method Pool View
  • Processing methods from method pool can be dragged directly to PE graphs
  • Processing methods from WEKA and JSAT libraries are integrated (over 160 methods available)
  • Method pool can be further extended by implementing custom methods

 

Custom Methods

  • User can implement custom methods as Java classes
  • One of the predefined interfaces must be implemented, based on method usage context
  • The JAR file containing the custom method class must be delivered to the server pickup directory
  • Custom method is imported to method pool at runtime – no need to restart the server

 

Adaptivity

  • Change detector method in GPE triggers PE adaptation based on performance values
  • When change detection event is triggered, trainer starts PE retraining using data range defined by change detector
  • Adaptive training is run in background – doesn’t affect PE or Trainer processing
  • When adaptive training is finished, PE is updated
  • Method based adaptation – adaptive processing methods can update their state during regular processing. Such adaptation can be switched on or off at PE level
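
The idea of retraining in the background while the existing model keeps serving predictions can be sketched with standard Java concurrency utilities. This is a minimal illustration under assumptions: the class `AdaptivePeSketch` is hypothetical, a model is reduced to a `DoubleUnaryOperator`, and retraining runs on the common fork-join pool rather than on whatever scheduler INFER actually uses.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.DoubleUnaryOperator;
import java.util.function.Supplier;

// Hypothetical sketch; not INFER's actual adaptation mechanism.
final class AdaptivePeSketch {
    // Assumption: a trained model is represented as a function from input to prediction.
    private final AtomicReference<DoubleUnaryOperator> model;

    AdaptivePeSketch(DoubleUnaryOperator initialModel) {
        this.model = new AtomicReference<>(initialModel);
    }

    // Uses whichever model is current; never blocks on retraining.
    double predict(double x) {
        return model.get().applyAsDouble(x);
    }

    // Runs the trainer on a background thread and swaps the model in atomically
    // when training finishes. The old model keeps serving predictions until then.
    CompletableFuture<Void> retrainInBackground(Supplier<DoubleUnaryOperator> trainer) {
        return CompletableFuture.supplyAsync(trainer).thenAccept(model::set);
    }
}
```

The returned future lets a caller wait for the update if needed, for example `pe.retrainInBackground(trainer).join()`.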

 

Hyper Parameter Optimization (HPO)

  • Can be used to improve PE performance by selecting better non-trainable processing method parameters
  • HPO task can be defined for each PE
  • User defines range of values for PE processing method parameters
  • HPO functionality defined in customizable methods:
    • Optimization method – defines the strategy by which candidate method models are selected
    • Validation method – estimates error of a candidate method model
  • HPO task runs in background – doesn’t affect PE or Trainer processing
  • When an HPO task finds an improved parameter set, the PE is updated
  • Triggered by the user or by the change detector
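
A minimal sketch of an HPO task for a single numeric parameter is given below. It assumes grid search as the optimization method and takes the validation method as a function from a candidate parameter value to an estimated error. Both choices, along with the name `HpoSketch`, are illustrative assumptions rather than INFER's actual methods.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch; grid search is only one possible optimization method.
final class HpoSketch {
    // Evaluates `steps` evenly spaced candidates in the user-defined range
    // [min, max] and returns the candidate with the lowest validation error.
    static double gridSearch(double min, double max, int steps,
                             DoubleUnaryOperator validationError) {
        if (steps < 2) {
            throw new IllegalArgumentException("steps must be at least 2");
        }
        double best = min;
        double bestError = Double.POSITIVE_INFINITY;
        for (int i = 0; i < steps; i++) {
            double candidate = min + i * (max - min) / (steps - 1);
            double err = validationError.applyAsDouble(candidate);
            if (err < bestError) { // ties keep the earlier candidate
                bestError = err;
                best = candidate;
            }
        }
        return best;
    }
}
```

For example, `HpoSketch.gridSearch(0, 4, 5, p -> (p - 3) * (p - 3))` evaluates the candidates 0, 1, 2, 3 and 4 and returns 3.0.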

 

The INFER client application

  • Implemented in Java, using Eclipse Rich Client Platform
  • Eclipse CDO Model Repository technology used

 

Output

  • All processing data is stored in an internal database
  • Performance/accuracy data is stored for predictive elements and GPE
  • Each system element’s processing data can be observed by user in table view
  • Table data can be exported to CSV
  • Data from each table can be plotted

 

Graphs

  • Project graph can be used to configure project structure
  • PE graph can be used for composite methods to modify DAG structure
  • For each method in a DAG one can examine:
    • Input and output data table
    • Method state information

Please contact the INFER project coordinator Prof. Bogdan Gabrys if you need any further information regarding the INFER software.