Multilayered Iterative GMDH algorithm (MIA or CML)

As with the Combinatorial algorithm, the output variable must be specified in advance by the person in charge of modeling, which corresponds to the use of so-called explicit templates [16,25]. In each layer, the F best models are used to successively extend the input data sample.

In the Multilayered Iterative (MIA) recurrent algorithm, the iteration rule remains unchanged from one layer to the next. As the scheme of the algorithm shows, the first layer tests the models that can be derived from the information contained in any two columns of the sample. The second layer uses information from any four columns; the third, from any eight columns, and so on. The exhaustive-search termination rule is the same as for the Combinatorial algorithm: in each layer, the optimal models are selected by the minimum of the external criterion.
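The layer step described above can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: the function names (`design`, `mia_layer`), the form of the partial description (linear terms plus one product term), and the use of mean squared error on a separate validation part as the external criterion are all assumptions made for the example.

```python
import itertools
import numpy as np

def design(a, b):
    # Assumed partial description: terms 1, a, b and the product a*b.
    return np.column_stack([np.ones_like(a), a, b, a * b])

def mia_layer(X_train, y_train, X_val, y_val, F):
    """Fit one model per pair of columns, rank the models by the external
    (validation) error, and append the outputs of the F best to the sample."""
    scored = []
    for i, j in itertools.combinations(range(X_train.shape[1]), 2):
        A_train = design(X_train[:, i], X_train[:, j])
        # Coefficients are estimated on the training part only.
        coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
        out_train = A_train @ coef
        out_val = design(X_val[:, i], X_val[:, j]) @ coef
        # External criterion (assumed here): MSE on the validation part.
        err = float(np.mean((y_val - out_val) ** 2))
        scored.append((err, (i, j), out_train, out_val))
    scored.sort(key=lambda s: s[0])
    best = scored[:F]
    # The outputs of the F best models extend the input data sample.
    X_train_ext = np.column_stack([X_train] + [b[2] for b in best])
    X_val_ext = np.column_stack([X_val] + [b[3] for b in best])
    return best, X_train_ext, X_val_ext

# Synthetic data: y depends only on columns 0 and 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + 0.5 * X[:, 0] * X[:, 1]
best, X_tr_ext, X_va_ext = mia_layer(X[:40], y[:40], X[40:], y[40:], F=2)
print("best pair:", best[0][1], "external error:", best[0][0])
```

Calling `mia_layer` again on the extended samples gives the next layer: a pair of first-layer outputs then carries information from up to four original columns.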

Output model: $Y_{k+1} = d_0 + d_1 x_{1k} + d_2 x_{2k} + \dots + d_m x_{Mk} x_{M-1,k}$

where the numbered elements of the algorithm scheme are:
         1 - data sampling;
         2 - layers of partial descriptions complexing;
         3 - form of partial descriptions;
         4 - choice of optimal models;
         5 - additional model definition by discriminating criterion;
         F1 and F2 - number of variables for data sampling extension.

MIA should be used when a large number of variables (up to 500) has to be handled. The algorithm can also be modified so that at each layer a set of the F best variables is selected and only these variables are used at the next layer. In some cases MIA may exhibit a "multilayerness error", in which effective variables are not selected; this is analogous to the statistical error of control systems.
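The modified variant, in which only the F best variables pass to the next layer, might look like the sketch below. The names `design` and `run_mia`, the product-term partial description, the validation MSE used as the external criterion, and the stopping rule (stop as soon as the best external error no longer decreases) are assumptions made for illustration, not details given in the text.

```python
import itertools
import numpy as np

def design(a, b):
    # Assumed partial description: terms 1, a, b and the product a*b.
    return np.column_stack([np.ones_like(a), a, b, a * b])

def run_mia(X_tr, y_tr, X_va, y_va, F, max_layers=5):
    """Layers are built until the best external error stops decreasing.
    Only the outputs of the F best models are passed on to the next layer."""
    best_err = np.inf
    history = []
    for _ in range(max_layers):
        if X_tr.shape[1] < 2:
            break  # no pairs left to combine
        scored = []
        for i, j in itertools.combinations(range(X_tr.shape[1]), 2):
            A_tr = design(X_tr[:, i], X_tr[:, j])
            coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
            out_tr = A_tr @ coef
            out_va = design(X_va[:, i], X_va[:, j]) @ coef
            err = float(np.mean((y_va - out_va) ** 2))
            scored.append((err, out_tr, out_va))
        scored.sort(key=lambda s: s[0])
        if scored[0][0] >= best_err:
            break  # assumed stopping rule: no further improvement
        best_err = scored[0][0]
        history.append(best_err)
        # Replace the sample: the next layer sees only the F best outputs.
        X_tr = np.column_stack([s[1] for s in scored[:F]])
        X_va = np.column_stack([s[2] for s in scored[:F]])
    return best_err, history

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 5))
y = X[:, 0] * X[:, 1] + X[:, 2] + 0.1 * rng.normal(size=80)
best_err, history = run_mia(X[:50], y[:50], X[50:], y[50:], F=4)
print("external error per layer:", history)
```

Because variables that are dropped at one layer cannot return later, a small F makes it easier to lose an effective variable, which is the situation the "multilayerness error" describes.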

Multilayered GMDH algorithms can be used to solve incorrect and ill-defined modeling problems, i.e. cases in which the number of observations N is smaller than the number of variables M (N < M). Regression analysis methods are inapplicable here, because they cannot produce a unique model that is adequate to the process.
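The non-uniqueness for N < M can be checked numerically. The following sketch uses random data with assumed sizes N = 5 and M = 8; it constructs two clearly different coefficient vectors that both reproduce the observations exactly, so ordinary least squares alone cannot single out one model.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5, 8                      # fewer observations than variables
X = rng.normal(size=(N, M))
y = rng.normal(size=N)

# lstsq returns the minimum-norm solution among infinitely many exact fits.
w_min, *_ = np.linalg.lstsq(X, y, rcond=None)

# The last rows of Vt span the null space of X (rank of X is at most N < M).
_, _, Vt = np.linalg.svd(X)
null_vec = Vt[-1]

# Adding any multiple of a null-space vector leaves the fit unchanged.
w_other = w_min + 10.0 * null_vec

print("both fit exactly:",
      np.allclose(X @ w_min, y), np.allclose(X @ w_other, y))
print("distance between the two solutions:",
      np.linalg.norm(w_min - w_other))
```

Since the fit on the sample itself cannot tell these models apart, some additional information is needed to choose among them; in GMDH this role is played by the external criterion.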

Originally, GMDH was proposed as an extension of regression analysis by two procedures:
1) generation of sets of candidate models: the different algorithms differ from one another mainly in the way these sets are generated;
2) search for the optimal model by an external criterion.

Recently, two additional procedures have been added:
3) preliminary processing of the data sample by a clustering algorithm: the initial data sample is replaced by the set of cluster-center coordinates;
4) use of the models obtained as active neurons in a twice-multilayered neural network to further increase modeling accuracy.