Fig. 6

Performance of classification trees in distinguishing phenotypes. Each row corresponds to a classification tree built from a different subset of the overall examination data. The left part of the figure indicates which subsets of variables were available to the model (black dots). The central part shows which variables were retained in the decision tree, whose complexity is indicated by its number of nodes. The first row corresponds to a tree that has only 3 nodes and retained the three variables ‘cough score’, ‘BALF neutrophils’ and ‘BALF mast cells’. Some trees have up to 7 nodes, which is indicative of overfitting and makes these trees useless in a realistic setting. Finally, the trees' performance is shown as a heatmap on the right. This analysis highlights the critical importance of BALF cytology in differentiating the EA phenotypes. Furthermore, it indicates that miEA as defined in this study is the most challenging phenotype to diagnose with the selected examinations and requires BALF cytology. (Abbr.: BALF = bronchoalveolar lavage fluid, EA = equine asthma, miEA = mild EA, modEA = moderate EA, sEA = severe EA, RR = respiratory rate, RT = rectal temperature, PaO2 = arterial oxygen partial pressure, PaCO2 = arterial carbon dioxide partial pressure, AaDO2 = alveolar-arterial oxygen gradient)