Empirical Predictors of Juvenile Crime: A PRISMA-Guided Review for Machine Learning Feature Selection
List of Authors
  • Ganthan Narayana Samy, Noor Hafizah Hassan, Norziha Megat Mohd Zainuddin, Nurazean Maarop, Pritheega Magalingam, Roslina Mohammad, Wan Faezah Abbas

Keywords
  • juvenile crime, systematic review, machine learning, feature selection

Abstract
  • Juvenile delinquency is a significant challenge for policymakers and academics worldwide, and it calls for predictive and preventive solutions grounded in empirical data. With the emergence of data-driven methodologies, especially machine learning (ML), predictors of youth criminality must be identified and refined for effective modelling and intervention. The objective of this systematic review is to consolidate empirical predictors of adolescent criminality across geographic and methodological settings and to present a thematic framework for guiding feature selection in ML prediction models. Following PRISMA guidelines, 42 academic studies published from 2015 to 2025 were selected from three databases, covering six geographical regions. Studies were evaluated according to methodological approach, type of data source, and identified predictors. Recurring variables were thematically synthesised and assessed for suitability as ML predictive features. Quantitative methodologies dominated the literature (n=30), especially regression and econometric studies, with increasing use of predictive modelling after 2020. Variables associated with education, parental influences, socioeconomic indicators, and behavioural aspects emerged as the predominant predictors. Five super themes were identified, namely Family and Home Environment, Education and School Context, Socioeconomic Disadvantage, Community and Behavioural Influence, and Legal/Systemic Context, each with corresponding applicability for ML feature engineering. This article presents a systematic and scalable approach for integrating empirically verified variables into ML models for predicting adolescent criminality. It highlights significant gaps in regional representation and advocates data-driven strategies in adolescent crime prevention and policy formulation.
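
As a rough illustration of how the five super themes could guide feature selection, the sketch below groups candidate dataset columns by theme. Only the theme names come from the abstract; the column names, the `FEATURE_THEMES` mapping, and the `group_by_theme` helper are hypothetical and are not taken from the reviewed studies.

```python
# The five super themes named in the abstract.
THEMES = (
    "Family and Home Environment",
    "Education and School Context",
    "Socioeconomic Disadvantage",
    "Community and Behavioural Influence",
    "Legal/Systemic Context",
)

# Hypothetical column names (assumptions for illustration only),
# each assigned to one of the five themes.
FEATURE_THEMES = {
    "parental_supervision": "Family and Home Environment",
    "household_size": "Family and Home Environment",
    "school_attendance_rate": "Education and School Context",
    "dropout_flag": "Education and School Context",
    "household_income": "Socioeconomic Disadvantage",
    "peer_delinquency": "Community and Behavioural Influence",
    "substance_use": "Community and Behavioural Influence",
    "prior_police_contact": "Legal/Systemic Context",
}


def group_by_theme(columns):
    """Group dataset columns by theme; return (grouped, unmapped)."""
    grouped = {theme: [] for theme in THEMES}
    unmapped = []
    for col in columns:
        theme = FEATURE_THEMES.get(col)
        if theme is None:
            # Columns without a theme are set aside for manual review.
            unmapped.append(col)
        else:
            grouped[theme].append(col)
    return grouped, unmapped


if __name__ == "__main__":
    grouped, unmapped = group_by_theme(
        ["household_income", "dropout_flag", "case_id"]
    )
    for theme, cols in grouped.items():
        print(f"{theme}: {cols}")
    print(f"Unmapped: {unmapped}")
```

Grouping columns this way makes it easy to see which themes a given dataset covers before any model is trained, and which columns fall outside the framework.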
