Classification in MerQur: From ROC Curves to Gradient Boosting
DOI:
https://doi.org/10.53463/merqur.20260449
Keywords:
classification, ROC, AUC, TSS
Abstract
Classification, the assignment of an observation to one of a set of predefined categories, is one of the most common tasks in academic research and applied data science. Because the task spans classical statistics (logistic regression) and modern machine learning (random forest, gradient boosting, SVM), measuring and reporting classification performance correctly requires particular methodological care. This study describes in detail the six analyses offered in MerQur’s Classification category: ROC Curve, TSS (True Skill Statistic), Confusion Matrix Metrics, Random Forest Classification, Support Vector Machine (SVM), and Gradient Boosting Classification. For each analysis it covers (i) the basis of the method and its place in the classification task, (ii) hyperparameters and selection strategies, (iii) form fields and options in MerQur, (iv) the reported performance metrics (accuracy, precision, recall, F1, AUC, TSS, kappa, Matthews correlation coefficient), and (v) interpretation guidance for a typical research question. The study also discusses the role of ROC and AUC in threshold-independent evaluation, the choice of appropriate metrics for imbalanced classes (F1 / MCC / TSS), the interpretation of variable importance in Random Forest, kernel selection in SVM, and overfitting control in Gradient Boosting. MerQur’s Classification category covers the spectrum from classical diagnostic threshold analysis to modern ensemble methods within a single graphical interface and includes k-fold cross-validation as standard.
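The point about metric choice for imbalanced classes can be illustrated with a minimal sketch. It uses the standard textbook definitions of the metrics named above and is not MerQur’s implementation; the function name `confusion_metrics` and the example counts are hypothetical, and the sketch assumes every denominator is non-zero.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; assumes all denominators are non-zero.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)

    # True Skill Statistic: sensitivity + specificity - 1 (Youden's J).
    tss = recall + specificity - 1

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (accuracy - p_expected) / (1 - p_expected)

    # Matthews correlation coefficient.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "tss": tss,
        "kappa": kappa,
        "mcc": mcc,
    }


# Made-up imbalanced example: 10 positives and 90 negatives.
metrics = confusion_metrics(tp=5, fp=5, fn=5, tn=85)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.90 even though only half of the positives are detected; F1 is 0.50, and TSS, kappa, and MCC are each about 0.44, which is why accuracy alone can mislead when classes are imbalanced.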
License
Copyright (c) 2026 MerQur

This article is published under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Under this license you may:
- Share: Copy and redistribute the material in any medium or format.
- Adapt: Remix, transform, and build upon the material for any purpose, including commercial use.
These rights apply on the following condition:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made.