Modelling Inherent Risk of Data Intensive Technologies: Quantitatively-differentiated Risk Management Framework Proposal

Petre-Cornel Grigorescu https://orcid.org/0009-0000-6950-415X
Iulia-Cristina Ciurea https://orcid.org/0009-0002-7881-8833

Keywords

Data-Intensive Applications, Risk Management, Regulatory Compliance, Data Management, Risk Modelling

Abstract

This study introduces a systematic methodology for risk management in data-intensive systems operating in regulated environments, with particular emphasis on the European Union. The framework addresses the challenge of reconciling regulatory compliance with the need for technical innovation. It traces risk across the phases of the data pipeline: collection, intake, processing, modelling, and application. Each phase is mapped to specific risk controls, ranging from basic validations at lower risk tiers to stringent security and accountability protocols at higher tiers. By implementing appropriate controls at each phase, organisations can mitigate negative effects while still making effective use of data-driven insights. The proposed approach incorporates a quantitative risk formula that combines data volume, parameter complexity, and sensitive data items into an overall risk score. Risk levels are assigned through Monte Carlo simulations, which gives the assessment a probabilistic basis. To enhance applicability, the framework defines risk thresholds and proposes differentiated controls, enabling organisations to simulate risk scenarios before implementation. This flexible framework seeks to promote the secure and responsible development of data-intensive applications, allowing European companies to enhance their competitiveness globally while upholding ethical and legal standards such as the EU AI Act or the EU Digital Services Act.
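
The abstract does not state the risk formula itself, so the following is only a minimal illustrative sketch of how a score built from data volume, parameter complexity, and sensitive data items could be combined with Monte Carlo sampling and tier thresholds. The weights, normalisation caps, thresholds, sampling ranges, and all function names (`risk_score`, `tier`, `simulate`) are hypothetical assumptions and are not taken from the article.

```python
import math
import random
from collections import Counter

# Hypothetical weights and tier thresholds; NOT the article's values.
WEIGHTS = {"volume": 0.4, "complexity": 0.3, "sensitive": 0.3}
TIERS = [(0.33, "low"), (0.66, "medium"), (1.0, "high")]


def risk_score(volume_records, n_parameters, n_sensitive_items):
    """Combine three normalised factors into a score in [0, 1].

    The log scaling and the caps (1e9 records, 1e12 parameters,
    20 sensitive items) are assumptions made for illustration only.
    """
    v = min(math.log10(1 + volume_records) / 9.0, 1.0)
    c = min(math.log10(1 + n_parameters) / 12.0, 1.0)
    s = min(n_sensitive_items / 20.0, 1.0)
    return (WEIGHTS["volume"] * v
            + WEIGHTS["complexity"] * c
            + WEIGHTS["sensitive"] * s)


def tier(score):
    """Map a score to the first tier whose upper bound it does not exceed."""
    for upper, name in TIERS:
        if score <= upper:
            return name
    return "high"  # guard against floating-point overshoot above 1.0


def simulate(n_trials=10_000, seed=0):
    """Sample hypothetical system profiles and return the share per tier."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_trials):
        volume = 10 ** rng.uniform(3, 9)    # assumed range of record counts
        params = 10 ** rng.uniform(1, 9)    # assumed range of model parameters
        sensitive = rng.randint(0, 20)      # assumed count of sensitive items
        counts[tier(risk_score(volume, params, sensitive))] += 1
    return {name: counts[name] / n_trials for _, name in TIERS}


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions, an organisation could replace the sampling ranges with estimates for a planned system and inspect how often it falls into each tier before implementation, then select the controls associated with the dominant tier.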

