Age Classification in Forensic Medicine Using Machine Learning Techniques

Zolotenkova G.V., Rogachev A.I., Pigolkin Y.I., Edelev I.S., Borshchevskaya V.N., Cameriere R.

Key words: forensic medicine; age diagnostics; age groups; machine learning techniques; nonlinear dimensionality reduction methods.

The aim of the study was to assess how well age at death (age group) can be determined by applying classification techniques to histomorphometric characteristics of osseous and cartilaginous tissue aging.

Materials and Methods. The study material was a database of morphometric findings from histological specimens of osseous and cartilaginous tissue taken from 294 male corpses of known age (10–93 years). For data analysis and classification we used machine learning methods: k-NN, SVM, logistic regression, CatBoost, SGD, naive Bayes, and random forest; nonlinear dimensionality reduction methods (t-SNE and UMAP); and recursive feature elimination for feature selection.

Results. The techniques (algorithms) used gave an effective representation of a complex data set (76 histomorphometric features) and revealed a cluster structure in the low-dimensional feature space, which makes fitting a classifier better justified. During feature selection, we estimated the importance of each feature for age group classification and studied how classification quality depends on the number of features in the feature space. Data pre-processing removed noise and retained the most informative features, thereby accelerating learning and improving classification quality. Projection of the data showed a more clearly defined cluster structure in the space of selected features. The accuracy of assigning certain age groups was 90%. This demonstrates the high efficiency of machine learning techniques for forensic age diagnostics based on histomorphometric findings.

Introduction

Age diagnostics is a key element of personal identification [1, 2]. Morphological changes of tissues and organs occurring in postnatal ontogenesis are not always the effects of aging. A variety of factors, both endogenous (genetic predisposition, comorbidities, height and body mass, and others) and exogenous (occupation, harmful habits, environmental conditions), cause a discrepancy between biological and chronological age, with the greatest bias of the resulting estimate in the middle-aged and elderly groups [3, 4]. Linear regression, which is used in expert practice, increases the error of the final age estimate. Since tissue and organ aging has complex dynamics and cannot be described by simple linear dependencies, most researchers agree that this approach is inexpedient because it does not solve the task [5–7].

A considerable body of quantitative aging indices of various organs and tissues has now been accumulated [7–14]. For the most part, the resulting databases are “noisy”: they contain a large number of heterogeneous indices, which hampers their processing and the reaching of a final decision. In such cases, it is reasonable to use nonlinear data mining methods with pronounced generalization properties [15]. Modern information technologies (machine learning techniques) are promising in this respect [16–18].

The aim of the study was to assess how well age at death (age group) can be determined by applying classification techniques to histomorphometric characteristics of bone and cartilaginous tissue aging.

Materials and Methods

To achieve this aim, the following study design was adopted:

1) feature selection and reduction of feature space using the chosen algorithm;

2) comparative analysis and selection of the classifier that provides the maximum accuracy when predicting the age of an unknown person;

3) defining an optimal age interval to achieve the best performance in terms of maximum accuracy and reliability.

In the forensic medical examination of an unknown person, the probable age is usually stated as a confidence or expected interval; that is, a synthesizing assessment of numerous parameters is used to determine the boundaries of the age group to which the person belongs. This is an example of a classification task: a set of previously labeled objects described in some feature space serves as a training sample for fitting a model that can then classify unlabeled data. To solve the task, we built a model that classifies objects of a similar nature, using the age group of an object as the class label. For training and validation of the different classifiers, we used databases of age-related changes of bone and cartilaginous tissues from 294 male corpses of known race (ethnically homogeneous) and age (from 10 to 93 years). Numerical values were taken from the literature [2, 8–10] and came from micro-osteometric studies of histological specimens of the diaphysis and epiphysis of long bones (B1–B24) and of the thyroid cartilage (C1–C28).

On the basis of the default age group labeling recommended by the VII All-Union scientific conference on age morphology, physiology and biochemistry, 7 age groups were distinguished: under 12 years; from 13 to 18 years; from 19 to 21 years; from 22 to 35 years; from 36 to 60 years; from 61 to 75 years; over 75 years. When forming the groups, we took into consideration literature data and the results of our previous studies. For example, the upper bound of the second age group was based on the presence of the epiphyseal plate, which is found only in persons under 18 years of age.

For differentiated study, we divided the material into 10-year age intervals (Table 1).

Table 1. Distribution of the studied autopsy objects (corpses) by ten-year intervals

To fit a model that classifies the age group of a person from the given features, the following machine learning algorithms were used: random forest, CatBoost, k-NN, logistic regression, SGD, SVM, and naive Bayes; t-SNE and UMAP were used for dimensionality reduction. The work was carried out in the Python programming language with the scikit-learn library. The techniques applied in the study are based on conceptually different approaches; they are briefly summarized below.

The k-nearest neighbors algorithm (k-NN) stores all objects of the training sample. For a new object to be classified, the k closest training points are found according to a distance metric chosen in advance. The most common class among these k nearest points is returned as the model output.
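The k-NN rule described above can be sketched in a few lines of standard-library Python. This is an illustration only, not the implementation used in the study: the Euclidean metric, the function name `knn_predict`, the two toy features, and the class labels are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority class among the k training points closest to `query`."""
    # Euclidean distance is an assumption; the paper does not name its metric.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    top_k = [label for _, label in dists[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Toy data (invented): two features per object, labels stand in for age groups.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.95)]
train_y = ["young", "young", "young", "old", "old", "old"]

pred_young = knn_predict(train_X, train_y, (0.2, 0.2), k=3)
pred_old = knn_predict(train_X, train_y, (0.8, 0.8), k=3)
print(pred_young, pred_old)
```

In practice a library class such as scikit-learn's `KNeighborsClassifier` would normally be used instead of a hand-written loop.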

For classical logistic regression, we used two implementations: logistic regression from the sklearn library and stochastic gradient descent (SGD) from Vowpal Wabbit. The latter is based on gradient descent, an iterative process during which the model weights are updated.
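The iterative weight update mentioned above can be illustrated with a minimal binary logistic regression trained by stochastic gradient descent. The sketch does not reproduce sklearn or Vowpal Wabbit; the learning rate, the number of epochs, the function names, and the one-feature toy data are assumptions chosen for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logreg_sgd(X, y, lr=0.5, epochs=200, seed=0):
    """Fit weights w and bias b by one-sample-at-a-time gradient steps on log-loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            err = p - y[i]  # gradient of log-loss with respect to the linear score
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b

# Toy data (invented): one feature, class 1 for larger values.
X = [(0.0,), (0.2,), (0.4,), (0.6,), (0.8,), (1.0,)]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logreg_sgd(X, y)
preds = [int(sigmoid(w[0] * x[0] + b) >= 0.5) for x in X]
print(preds)
```

Each inner-loop step changes the weights using a single object, which is what distinguishes stochastic from full-batch gradient descent.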

The support vector machine (SVM), in contrast to logistic regression, which rests on probabilistic assumptions, is based on the geometry of the object set in feature space. SVM constructs a maximum-margin hyperplane between objects of different classes. The hyperplane can be constructed either in the initial feature space or in a new representation obtained with a kernel, a function that implicitly maps the initial data into another feature space. Thus, a separating hyperplane in the new feature space corresponds to a different, generally nonlinear, boundary in the initial space.
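The effect of moving to a new feature space can be shown on a toy example. The sketch below does not fit a maximum-margin hyperplane; it only illustrates, under assumed data and an assumed map phi(x) = (x, x^2), that classes which no single threshold separates in the initial space become linearly separable after the mapping, and that the corresponding kernel returns inner products in the new space without computing the map.

```python
def feature_map(x):
    """Map a 1-D point to 2-D: phi(x) = (x, x^2)."""
    return (x, x * x)

def kernel(x, y):
    """Kernel equal to the inner product of the mapped points: x*y + x^2*y^2."""
    return x * y + (x * x) * (y * y)

# Toy 1-D data (invented): class 1 lies on both sides of class 0, so no single
# threshold on x separates the classes in the initial space.
xs = [-2.0, -1.0, 1.0, 2.0]
labels = [1, 0, 0, 1]

# In the mapped space the hyperplane z2 = 2.5 separates the classes.
mapped = [feature_map(x) for x in xs]
preds = [int(z2 > 2.5) for _, z2 in mapped]

# The kernel reproduces the inner product in the mapped space directly.
a, b = feature_map(-2.0), feature_map(1.0)
inner = a[0] * b[0] + a[1] * b[1]
print(preds, inner, kernel(-2.0, 1.0))
```

The flat boundary z2 = 2.5 in the mapped space corresponds to the two thresholds x = ±sqrt(2.5) in the initial space, which is why the boundary there is no longer a single hyperplane.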

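The naive Bayes classifier assigns objects to classes on the basis of Bayes’ theorem, under the assumption that features are conditionally independent given the class. The approach can work even with a small amount of data available for learning, parameter estimation, and classification.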

A decision tree models the decision-making process of an expert. The model has a graph structure: each internal node holds a decision rule that determines the next node, and each leaf holds a resulting class label. Classification consists of traversing the tree from the root, following the rules according to the feature values of the object, until a leaf is reached.
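The traversal described above can be sketched with a tree stored as nested dictionaries. The tree below is hypothetical: the feature names, thresholds, and group labels are invented for illustration and are not taken from the study data.

```python
def classify(node, sample):
    """Walk from the root to a leaf, applying one decision rule per node."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

# Hypothetical tree: feature names, thresholds, and labels are invented.
tree = {
    "feature": "feature_a", "threshold": 0.5,
    "left": {"label": "group 1"},
    "right": {
        "feature": "feature_b", "threshold": 10.0,
        "left": {"label": "group 2"},
        "right": {"label": "group 3"},
    },
}

result_1 = classify(tree, {"feature_a": 0.3, "feature_b": 12.0})
result_2 = classify(tree, {"feature_a": 0.9, "feature_b": 7.0})
result_3 = classify(tree, {"feature_a": 0.9, "feature_b": 12.0})
print(result_1, result_2, result_3)
```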

Random forest is an ensemble of decision trees. Several trees are fitted independently, each using a different subset of features during fitting. The final result is obtained by voting: each tree outputs a class label, and the label receiving the majority of votes in the ensemble is returned.
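The voting step can be sketched as follows. Only the aggregation is shown; the three "trees" are hand-written stand-in rules over hypothetical features `a` and `b`, not fitted models, and the labels are placeholders.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Collect one class label per tree and return the majority label."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Stand-in "trees" (invented rules); each looks at a different subset of features.
trees = [
    lambda s: "older" if s["a"] > 0.5 else "younger",
    lambda s: "older" if s["b"] > 0.3 else "younger",
    lambda s: "older" if s["a"] + s["b"] > 1.2 else "younger",
]

sample = {"a": 0.7, "b": 0.4}
votes = [tree(sample) for tree in trees]
prediction = forest_predict(trees, sample)
print(votes, prediction)
```

Here two of the three trees vote "older", so the ensemble returns "older" even though one tree disagrees.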

CatBoost is based on gradient boosting, in contrast to random forest, which is an example of bagging; boosting is an alternative way of building a model ensemble. The essence of boosting is the combination of weak functions (those with low generalization ability; shallow trees were used in the present study), which are fitted iteratively so that at every step a new model is trained on the errors of the previous ones.
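The idea of fitting each new weak model to the errors of the previous ones can be sketched with depth-1 trees (stumps). For brevity the sketch uses squared-error regression on one invented feature rather than classification, and it is not CatBoost, whose actual algorithm is considerably more elaborate; the function names, learning rate, and number of rounds are assumptions.

```python
def fit_stump(xs, residuals):
    """Best single-threshold regressor (a depth-1 tree) under squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, rounds=20, lr=0.5):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # errors of the ensemble so far
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + lr * sum(s(x) for s in stumps)

# Toy data (invented).
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]

model = boost(xs, ys)
baseline = sum(ys) / len(ys)
mse_baseline = sum((y - baseline) ** 2 for y in ys) / len(ys)
mse_boosted = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
print(mse_baseline, mse_boosted)
```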

Classification quality was evaluated by cross-validation on 5 folds. The quality of the classification algorithms was estimated by the F1-score, and the model with the highest value was chosen. To assess the classification, an error matrix (confusion matrix) was plotted, which showed how the model made errors, to which classes misclassified objects were assigned, and which classes caused the most errors. This information was used in the experiments to search for ten-year age intervals for each class, which is possible when there is an optimal balance between classifier quality and intraclass age dispersion. We also analyzed ROC curves plotted for each class: the shape of the curve and the area under it revealed problem classes (age groups).
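How an error matrix and an F1-score are obtained from predictions can be sketched as follows. The paper does not state how per-class F1 values were averaged; macro averaging is assumed here, and the labels and predictions are invented toy values.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        m[index[t]][index[p]] += 1
    return m

def macro_f1(matrix):
    """Unweighted mean of per-class F1 = 2*TP / (2*TP + FP + FN)."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                 # true class i, predicted otherwise
        fp = sum(row[i] for row in matrix) - tp  # predicted class i, true otherwise
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels (invented): one object of class 2 is misassigned to class 3.
classes = [1, 2, 3]
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 3, 3]

cm = confusion_matrix(y_true, y_pred, classes)
score = macro_f1(cm)
print(cm, round(score, 4))
```

The off-diagonal entry in row 2, column 3 shows directly which neighboring class absorbed the misclassified object, which is the kind of information used above to identify problem classes.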

Results

The initial data were visualized as two-dimensional images obtained with the nonlinear dimensionality reduction techniques t-SNE and UMAP, which reduce the dimension of the feature space for subsequent visualization. These techniques produce a low-dimensional representation in which objects that are similar in the initial feature space are modeled by nearby points, while dissimilar objects are placed as far apart as possible. Figure 1 shows the results: before feature selection the age groups are closely intermingled, without evident clustering. The exceptions are age groups 1 (under 12 years) and 2 (13–18 years), which are visually separable at this stage.

Figure 1. Two-dimensional display of initial data before feature selection:

(a) using t-SNE; (b) using UMAP

The next stage was feature selection by recursive feature elimination based on decision trees and the Gini coefficient (Figure 2). In this procedure, a model based on the selected algorithm is first trained on all features, and the least informative feature is identified. In our case, importance is the contribution of a feature to the decrease of the Gini coefficient when searching for the optimal split, i.e. how efficiently objects of different classes can be separated using that feature. The least informative feature is eliminated, the importance of the remaining features is recalculated, and the procedure is repeated.
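The importance criterion described above, the decrease of the Gini index produced by a split, can be computed as in the sketch below. It shows only the criterion for a single split on a single feature, not the full elimination loop; the function names, threshold, and toy values are assumptions.

```python
from collections import Counter

def gini(labels):
    """Gini index of a set of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_decrease(values, labels, threshold):
    """Drop in the Gini index when objects are split at `threshold` on one feature."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Toy data (invented): one feature separates the classes, the other does not.
labels = ["A", "A", "B", "B"]
informative = [0.1, 0.2, 0.8, 0.9]
noisy = [0.1, 0.8, 0.2, 0.9]

d_informative = gini_decrease(informative, labels, threshold=0.5)
d_noisy = gini_decrease(noisy, labels, threshold=0.5)
print(d_informative, d_noisy)
```

A feature such as `noisy`, whose best split yields no decrease, is the kind of feature that would be eliminated first. In scikit-learn the whole procedure is available as the `RFE` class in `sklearn.feature_selection`.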

Figure 2. Informativeness of features [2, 8–10]

According to the findings, quantitative characteristics of age-related changes in the thyroid cartilage make the major contribution to target clustering; these changes relate to maturation and subsequent ossification of the cartilaginous tissue itself and to the replacement of reticular tissue by adipose tissue. Figure 2 shows the ranked list, in which the leading positions are held by the following characteristics: osseous tissue area per field of vision of a histological specimen of the thyroid cartilage (C2), and weighting (C28) in the thyroid cartilage radiograph of osseous (C1) and cartilaginous (C17) tissues. In addition, one should consider both the cartilaginous tissue area and the thickness of the new cartilage zone (C23), as well as its relationship to the mature cartilage zone (C25). The areas of adipose (C14) and reticular (C15) tissues per field of vision of the thyroid cartilage also characterize cartilage aging and the atrophic process in it. It is worth noting that it is justified to select not only the osseous tissue area but also the dimensional characteristics of trabeculae per field of vision: their thickness (C13) and area (C6).

Expert and comparative analysis confirmed the practicability of selecting micro-osteometric indices as objective markers of age-related osseous tissue changes. Characteristics reflecting morphological changes in histostructure were significant primarily in the compact bone of the diaphysis: the thickness of the inner (B10) and outer (B12) circumferential lamellae, osteon area (B11), and their quantitative characteristic (B23). Remodeling parameters such as Haversian canal diameter (B21) and its relationship to osteon diameter (B22) serve as objective predictors of bone age. The importance of these features was revealed only by random forest during exploratory data analysis (EDA), since their correlation coefficients obtained by descriptive statistics were just 0.4 and 0.3, respectively. This confirms that random forest is effective for the stated objective. Quantitative estimation of the processes of osseous tissue change is extensively used as an objective measure of biological age. As a person grows, develops, and ages, evidence of an increasing number of cycles of structural elements accumulates; therefore, the number of osteons with a restricted central part (B18) is also consistently a characteristic age index. The importance of age-related changes in the spongy substance of the lower epiphysis was also noted: the number of osteons (B13) and the dimensional characteristics of the cartilaginous tissue area (B7).

The study analyzed how classifier quality depends on the number of features, taken in order of decreasing importance. For the algorithms based on decision trees (CatBoost, random forest), and for those applied to the two-dimensional projections used for subsequent visualization, quality remains nearly the same up to the use of all initial features; this can be explained by the fact that these algorithms select features themselves. The other algorithms tend to degrade in quality when more than 28 features are used. For this reason, exactly 28 features were selected (Figure 3).

Figure 3. Dependence of F1-score value on the number of features used

After feature selection, the data were visualized again (Figure 4). A more noticeable cluster structure was revealed. Moreover, neighboring age groups are located close together, while persons with a greater age gap are far from each other. The t-SNE and UMAP techniques cannot handle missing data, so each missing value in the initial data was filled with the value of the same feature in the previous person in the age-sorted data. This could cause some points to fall into neighboring clusters, but even so, no person fell into a cluster whose average age differed greatly from his own.
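The filling rule described above (take the value of the previous person in the age-sorted data) amounts to a forward fill. A minimal sketch follows; the record layout, the function name `forward_fill_by_age`, the feature name `feature_x`, and the values are hypothetical, and `None` is assumed to mark a missing value.

```python
def forward_fill_by_age(records, features):
    """Sort by age, then replace each missing value with the last one seen."""
    ordered = sorted(records, key=lambda r: r["age"])
    last_seen = {}
    filled = []
    for rec in ordered:
        new = dict(rec)  # copy so the input records are left unchanged
        for f in features:
            if new.get(f) is None:
                new[f] = last_seen.get(f)  # stays None if no younger person has a value
            else:
                last_seen[f] = new[f]
        filled.append(new)
    return filled

# Toy records (invented): the 40-year-old has no measurement for feature_x.
records = [
    {"age": 40, "feature_x": None},
    {"age": 25, "feature_x": 1.5},
    {"age": 60, "feature_x": 2.5},
]

filled = forward_fill_by_age(records, ["feature_x"])
print(filled)
```

Because the donor value comes from the nearest younger person, a filled object can drift toward the neighboring younger cluster, which is consistent with the effect noted above.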

Figure 4. Two-dimensional display of initial data after feature selection:

(a) using t-SNE; (b) using UMAP

We then compared the performance of the classifiers; the results are given in Table 2.

Table 2. Performance of the algorithms on the studied data

For further experiments, we chose the random forest algorithm, since it showed the best performance on the data considered. The model quality was evaluated using 5-fold cross-validation. A confusion matrix (a tabulation of actual against predicted class labels) and ROC curves for each class were plotted (Figure 5). The receiver operating characteristic (ROC) curve makes it possible to assess classification quality: as the decision threshold is varied, it shows the relation between the proportion of positive-class objects that were classified correctly (the sensitivity of the classification algorithm) and the proportion of negative-class objects that were mistakenly assigned to the positive class (the false positive rate, i.e. 1 − specificity). The area under the curve serves as a numerical characteristic of model quality.
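The construction of a ROC curve for one class against the rest, and of the area under it, can be sketched as follows. The labels and scores are invented; `1` marks objects of the class of interest, and the trapezoidal rule is assumed for the area.

```python
def roc_points(y_true, scores):
    """Sweep the decision threshold and return (false positive rate, sensitivity) pairs."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the piecewise-linear curve (trapezoidal rule)."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Toy one-vs-rest problem (invented): 1 = object belongs to the class of interest.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
area = auc(points)
print(points, area)
```

The area here equals 8/9: of the nine positive-negative pairs, eight are ranked correctly by the scores. A class whose curve lies close to the diagonal (area near 0.5) would be flagged as a problem class.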

Figure 5. ROC curves for each age group

According to the error matrix and ROC curves, the classification accuracy of the developed model is 100% for objects belonging to the 1st, 2nd, 3rd, and 6th age groups, 90% for group 4, and 92% for group 5. The stable separation of these groups reflects fundamental processes of postnatal ontogenesis in the chondro-osseous system: the maturation stage in groups 1–3 is followed by stabilization (group 4) and ends with involutive transformations (groups 5–7). The lowest accuracy was found in age group 7. In people over 75, the condition of tissues and organs is determined both by age involution and by the combined effect of numerous concomitant factors (diseases, medication intake, harmful habits, nutrition, lifestyle, etc.), which have a marked cumulative effect.

Figure 6 shows the classification results for groups corresponding to ten-year intervals. The classification accuracy of group 2 (11–20 years) is 67%. This interval includes periods of active growth and development of all organs and systems and the puberty period, which can result in uneven and heterogeneous indices; the relatively low accuracy is therefore related to the age range covered by this group. The decrease in classification accuracy in group 7 (61–70 years) is related, on the one hand, to the effect of age-associated diseases and, on the other hand, to a certain slowing of age involution that we observed in this decade. In our opinion, these circumstances can explain the considerable scatter of the data and, as a direct consequence, the assignment of objects to neighboring groups. According to the error matrix, the classification accuracy on the validation sample is 100% for the 1st, 3rd, 4th, 5th, 6th, and 8th age intervals and 80% for the 9th interval.

Figure 6. Error matrix

Discussion

Histomorphometry of osseous tissue has long been used to determine age [19–21]. There are many modifications of techniques based on quantitative accounting of remodeling processes in osseous and cartilaginous tissues, and they differ in the specimens selected for investigation [22–24]. We agree with the author of the study [24] on the need to abandon the claim that osseous tissue remodeling occurs at an unpredictable rate. This, in turn, implies abandoning linear models and calls for further development of histomorphometric methods.

The analysis of quantitative histological variables demonstrates that they have complex relations with age. Their relationship with gender, health condition (the presence of diseases, medication intake), and biomechanics is no less important. This argues against creating “simple” models, i.e. linear regression equations, based on universal indices of histological age estimation. That is why, to study a complex of histomorphometric indices, we used algorithms that can analyze nonlinear dependencies (for instance, SVM with a suitable kernel, decision tree) and techniques based on them, such as random forest or CatBoost. For visualizing such data, t-SNE and UMAP are expected to work better than classical methods such as principal component analysis (PCA), which is used for similar purposes and gives a notably less informative result [15]. The classification accuracy (±30 years) achieved by the authors of the work [15] may be related to the fact that they studied databases of qualitative assessments of morphological changes in the pubic articulation.

The histomorphometric method, whose findings were the material for the present study, is quantitative and allows an impartial approach to age diagnostics. The results showed that reduction of the feature space is a necessary measure that entails no loss of classification quality. The performance of the algorithms depends on the number of features: with 20–30 features, adequate accuracy is attained; with more, accuracy decreases to that of linear classifiers. It should be emphasized that this refers to the features taken in aggregate. Maximum accuracy and reliability of the final result were achieved with an integrated assessment of age-related changes in different types of osseous and cartilaginous tissues.

Applying the advanced nonlinear dimensionality reduction technique UMAP together with the well-established t-SNE after feature selection made it possible to observe a cluster structure in the data that was absent at the initial stage. This shows, on the one hand, the practicability of filtering features by their informative value and, on the other hand, the correctness of the selection made. Classification of objects using a ranked list of histomorphometric indices yielded significant results in terms of diagnostic accuracy and reliability for the required age group. The decision tree technique demonstrated the capability to select features independently in the course of its work: regardless of their number, it gives preference to the most informative ones. The random forest algorithm proved to be the most productive of the classifiers considered in the study, which confirms its advantage for the stated objective.

Conclusion

The results demonstrate that machine learning techniques are promising for age determination in forensic medical expert practice, since they showed rather high (about 90%) accuracy of the final result.

Using data mining, the study produced an optimal set of informative histomorphometric features of age-related changes, which can reasonably serve as the basis of a digital database for accumulating and systematizing data in forensic age determination. Compiling such lists makes it possible to unify further research in age morphology and thereby to extend the “training” data array for prediction. In fact, problems with the training sample (a great number of various features in a small number of observations) are the main constraining factor when implementing machine learning techniques in medicine.

The study of the principles of age involution, with the creation of a data warehouse of quantitative characteristics (aging biomarkers), is a fundamental scientific challenge. The high social significance of such research is due to the increasing proportion of elderly people. The results are likely to be of interest to various fields of medicine and biology, including personalized medicine.

Study funding. The study was supported by the Russian Foundation for Basic Research, grant No. 19-07-00982a.

Conflicts of interest. The authors declare no conflicts of interest related to the present study.

References

1. Garvin H., Passalacqua N.V., Uh N.M., Gipson D.R., Overbury R.S., Cabo L.L. Developments in forensic anthropology: age-at-death estimation. In: Dirkmaat D.C. (editor). A companion to forensic anthropology. Chichester: Wiley-Blackwell; 2012; p. 202–223, https://doi.org/10.1002/9781118255377.ch10.
2. Glybochko P.V., Pigolkin Yu.I., Nikolenko V.N., Zolotenkova G.V., Efimov A.A., Alekseev Yu.D., Fedulova M.V., Savenkova E.N., Kurzin L.M., Goncharova N.N., Yurchenko M.A., Miroshnichenko N.V. Sudebno-meditsinskaya diagnostika vozrasta [Forensic diagnostics of age]. Moscow: Pervyy MGMU imeni I.M. Sechenova; 2016.
3. Schmitt A., Murail P., Cunha E., Rougé D. Variability of the pattern of aging on the human skeleton: evidence from bone indicators and implications on age at death estimation. J Forensic Sci 2002; 47(6): 1203–1209, https://doi.org/10.1520/jfs15551j.
4. Mays S. The effect of factors other than age upon skeletal age indicators in the adult. Ann Hum Biol 2015; 42(4): 332–341, https://doi.org/10.3109/03014460.2015.1044470.
5. Ferrante L., Skrami E., Gesuita R., Cameriere R. Bayesian calibration for forensic age estimation. Stat Med 2015; 34(10): 1779–1790, https://doi.org/10.1002/sim.6448.
6. Bucci A., Skrami E., Faragalli A., Gesuita R., Cameriere R., Carle F., Ferrante L. Segmented Bayesian calibration approach for estimating age in forensic science. Biom J 2019; 61(6): 1575–1594, https://doi.org/10.1002/bimj.201900016.
7. Hartnett K.M. Analysis of age-at-death estimation using data from a new, modern autopsy sample – part I: pubic bone. J Forensic Sci 2010; 55(5): 1145–1151, https://doi.org/10.1111/j.1556-4029.2010.01399.x.
8. Pigolkin Yu.I., Zolotenkova G.V., Sereda A.P., Zolotenkov D.D., Gridina N.V. Histometric symptoms of age-sensitive changes of bone tissue. Adv Gerontol 2018; 31(2): 203–210.
9. Pigolkin Yu.I., Poletaeva M.P., Zolotenkova G.V., Volkov A.V. The age-specific changes in the histological structure of the thyroid cartilage in the men. Sudebno-medicinskaja ekspertiza 2017; 60(5): 11–14, https://doi.org/10.17116/sudmed201760511-14.
10. Pigolkin Yu.I., Poletaeva M.P., Zolotenkova G.V. Age determine by the age of the thyroid cartilage by the radiological method in forensic medicine. Rossijskij ehlektronnyj zhurnal luchevoj diagnostiki 2017; 7(4): 23–29, https://doi.org/10.21569/2222-7415-2017-7-4-23-29.
11. Pigolkin Yu.I., Zolotenkova G.V., Berezovskii D.P. Methodological basis for determining a person’s age. Sudebno-meditsinskaya ekspertisa 2020; 63(3): 45–50, https://doi.org/10.17116/sudmed20206303145.
12. Kovalev A.V., Ametrin M.D., Zolotenkova G.V., Gerasimov A.N., Gornostaev D.V., Poletaeva M.P. Forensic medical determination of the age based on the analysis of CT-scanograms of the skull and the craniovertebral region in the sagittal projection. Sudebno-meditsinskaya ekspertisa 2018; 61(1): 21–27, https://doi.org/10.17116/sudmed201861121-27.
13. Pigolkin Yu.I., Tkachenko S.B., Zolotenkova G.V., Velenko P.S., Zolotenkov D.D., Safroneeva Yu.L. The comprehensive evaluation of the age-specific changes in the skin. Sudebno-meditsinskaya ekspertisa 2018; 61(3): 15–18, https://doi.org/10.17116/sudmed201861315-18.
14. Pigolkin Yu.I., Zolotenkova G.V. Age-specific changes in the cerebral cortex capillaries. Sudebno-meditsinskaya ekspertisa 2014; 57(1): 4–10.
15. Buk Z., Kordik P., Bruzek J., Schmitt A., Snorek M. The age at death assessment in a multi-ethnic sample of pelvic bones using nature-inspired data mining methods. Forensic Sci Int 2012; 220(1–3): 294.e1–294.e9, https://doi.org/10.1016/j.forsciint.2012.02.019.
16. Moskalenko V.A., Nikolskiy A.V., Zolotykh N.Yu., Kozlov A.A., Kosonogov K.A., Kalyakulina A.I., Yusipov I.I., Levanov V.M. Cyberheart-diagnostics software package for automated electrocardiogram analysis based on machine learning techniques. Sovremennye tehnologii v medicine 2019; 11(2): 86–91, https://doi.org/10.17691/stm2019.11.2.12.
17. Andryushchenko V.S., Uglov A.S., Zamyatin A.V. Statistical classification of immunosignatures under significant reduction of the feature space dimensions for early diagnosis of diseases. Sovremennye tehnologii v medicine 2018; 10(3): 14–20, https://doi.org/10.17691/stm2018.10.3.2.
18. Samoyavcheva S.V., Shkarin V.V. Capabilities of cluster analysis in interpretation of 24-hour blood pressure monitoring data in patients with arterial hypertension and left ventricular remodeling. Sovremennye tehnologii v medicine 2015; 7(4): 113–118, https://doi.org/10.17691/stm2015.7.4.15.
19. Kerley E.R. The microscopic determination of age in human bone. Am J Phys Anthropol 1965; 23(2): 149–164, https://doi.org/10.1002/ajpa.1330230215.
20. Stout S.D. The use of cortical bone histology to estimate age at death. In: Işcan M.Y. (editor). Age markers in the human skeleton. Springfield: Charles C. Thomas; 1989, https://doi.org/10.1002/ajhb.1310030516.
21. Crowder C.M., Pfeiffer S. The application of cortical bone histomorphometry to estimate age at death. In: Latham K.E., Finnegan J.M., Rhine S. (editors). Age estimation of the human skeleton. Springfield: Charles C. Thomas; 2010.
22. Crowder C.M., Dominguez V.M. A new method for histological age estimation of the femur. In: Proceedings of the American Academy of Forensic Sciences; Vol. 18. Atlanta; 2012; p. 374–375.
23. Doyle E., Márquez-Grant N., Field L., Holmes T., Arthurs O.J., van Rijn R.R., Hackman L., Kasper K., Lewis J., Loomis P., Elliott D., Kroll J., Viner M., Blau S., Brough A., de las Heras S.M., Garamendi P.M. Guidelines for best practice: imaging for age estimation in the living. J Forensic Radiol Imaging 2019; 16: 38–49, https://doi.org/10.1016/j.jofri.2019.02.001.
24. Crowder C. Evaluating the use of quantitative bone histology to estimate adult age at death. PhD Thesis. Toronto: University of Toronto, Department of Anthropology; 2005.