• Abstract

    Time series novelty detection, also called anomaly detection, is the process of identifying new patterns or peculiarities in otherwise regular time series data. Although it is one of the most difficult areas of data mining, it is gaining popularity because it applies readily to real-world problems. The use of machine learning (ML) techniques to find outliers in time series data has increased recently, and this research proposes an ML-based approach to time series novelty detection. Using a dataset from Stack Overflow, it investigates the use of machine learning for anomaly and novelty identification in time series data. Initial data preparation involved handling missing values, examining the dataset's tags, and visualising the data through individual words. Models were compared using the MAE, RMSE, and MSE assessment metrics. The best-performing model, the Random Forest Regressor, achieved MAE = 0.0629, RMSE = 0.089, and MSE = 0.007. The second-best model, ARIMA, yielded MAE = 0.068, RMSE = 0.0936, and MSE = 0.008. The Decision Tree Regressor performed worst on this task, with the highest error rates. The results confirm the suitability of the Random Forest Regressor for improving accuracy on time series data, with a particular focus on detecting novel and abnormal data points, and underscore the importance of model choice.
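
    For readers less familiar with the three error metrics reported in the abstract, the sketch below shows how MAE, MSE, and RMSE are conventionally computed from a set of forecasts. The function name `error_metrics` and the sample values are illustrative assumptions; they are not the study's code or data.

```python
import math


def error_metrics(actual, predicted):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Per-point forecast errors (actual minus predicted).
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / len(errors)   # mean absolute error
    mse = sum(e * e for e in errors) / len(errors)    # mean squared error
    rmse = math.sqrt(mse)                             # root mean squared error
    return mae, mse, rmse


# Hypothetical normalised values, chosen only to illustrate the calculation.
actual = [0.50, 0.60, 0.55, 0.70]
predicted = [0.45, 0.65, 0.55, 0.60]

mae, mse, rmse = error_metrics(actual, predicted)
print(f"MAE={mae:.4f}  MSE={mse:.5f}  RMSE={rmse:.4f}")
```

    Because RMSE is the square root of MSE, it penalises large individual errors more heavily than MAE does, which is why the two can rank models differently when a series contains a few abrupt deviations.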

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Copyright (c) 2025 The Author

How to cite

Sinha, H. (2024). Analysis of anomaly and novelty detection in time series data using machine learning techniques. Multidisciplinary Science Journal, 7(6), 2025299. https://doi.org/10.31893/multiscience.2025299