open access publication

Article, 2024

Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring

IEEE Transactions on Network and Service Management, ISSN 1932-4537, 2373-7379, Volume PP, Issue 99, Page 1, DOI 10.1109/tnsm.2024.3368501

Contributors

  • Camacho, José (ORCID 0000-0001-9804-8122) [1]
  • Wasielewska, Katarzyna (ORCID 0000-0001-8087-790X) [1]
  • Bro, Rasmus (ORCID 0000-0002-7641-4854) [2]
  • Kotz, David [3]

Affiliations

  1. [1] University of Granada [NORA names: Spain; Europe, EU; OECD]
  2. [2] University of Copenhagen [NORA names: KU University of Copenhagen; University; Denmark; Europe, EU; Nordic; OECD]
  3. [3] Dartmouth College [NORA names: United States; America, North; OECD]

Abstract

There is an increasing interest in the development of new data-driven models useful to assess the performance of communication networks. For many applications, like network monitoring and troubleshooting, a data model is of little use if it cannot be interpreted by a human operator. In this paper, we present an extension of the Multivariate Big Data Analysis (MBDA) methodology, a recently proposed interpretable data analysis tool. In this extension, we propose a solution to the automatic derivation of features, a cornerstone step for the application of MBDA when the amount of data is massive. The resulting network monitoring approach allows us to detect and diagnose disparate network anomalies, with a data-analysis workflow that combines the advantages of interpretable and interactive models with the power of parallel processing. We apply the extended MBDA to two case studies: UGR’16, a benchmark flow-based real-traffic dataset for anomaly detection, and Dartmouth’18, the longest and largest Wi-Fi trace known to date.

Keywords

UGR’16, Wi-Fi, Wi-Fi traces, amount, amount of data, analysis, analysis tools, anomalies, anomaly detection, applications, approach, automatic derivation, big data analysis, case study, cases, communication networks, data, data analysis, data analysis tools, data analysis workflow, data model, data-driven models, dataset, derivation of features, detection, development, extension, feature learning, features, human operator, increasing interest, interaction model, interest, interpretation, learning, model, monitoring, monitoring approach, network, network anomalies, network monitoring, network monitoring approach, operation, parallel processing, performance, performance of communication networks, power, power of parallel processing, process, solution, study, tools, trace, troubleshooting, workflow

Funders

  • European Commission

Data Provider: Digital Science