
Category Archives: Machine Learning

Use of machine learning to assess the prognostic utility of radiomic … – Nature.com



Using Machine Learning to Predict the 2023 Kentucky Derby … – DataDrivenInvestor

Can the forecasted weather be used to predict the winning race time?

My hypothesis is that the weather has a major impact on the Kentucky Derby's winning race time. In this analysis I will use the Kentucky Derby's forecasted weather to predict the winning race time using Machine Learning (ML). In previous articles I discussed the importance of using explainable ML in a business setting to provide business insights and help with buy-in and change management. In this analysis, because I'm striving purely for accuracy, I will disregard this advice and go directly to the more complex, but accurate, black-box Gradient Boosted Machine (GBM), because we want to win some money!

The data I will use comes from the National Weather Service:

# Read in data
data <- read.csv("...KD Data.csv")

# Declare year variable
year <- data[, 1]

# Declare numeric x variables
numeric <- data[, c(2, 3, 4)]

# Scale numeric x variables
scaled_x <- scale(numeric)
# Check that we get mean of 0 and sd of 1
colMeans(scaled_x)
apply(scaled_x, 2, sd)

# One-hot encoding
data$Weather <- as.factor(data$Weather)
xfactors <- model.matrix(data$Year ~ data$Weather)[, -1]

# Bring prepped data all back together
# (y, the winning race time, was dropped from the original listing)
scaled_df <- as.data.frame(cbind(year, y, scaled_x, xfactors))

# Isolate pre-2023 data
old_data <- scaled_df[-1, ]
new_data <- scaled_df[1, ]

# Gradient Boosted Machine #
# Find max interaction depth
floor(sqrt(NCOL(old_data)))

# Fit the GBM (the fitting call itself was lost from the original post;
# tree_mod is reconstructed here so the next step runs)
library(gbm)
tree_mod <- gbm(y ~ . - year, data = old_data, distribution = "gaussian",
                n.trees = 5000, interaction.depth = 2, shrinkage = 0.01)

# Find index for n.trees with minimum OOB error
best.iter <- gbm.perf(tree_mod, method = "OOB", plot.it = TRUE,
                      oobag.curve = TRUE, overlay = TRUE)
print(best.iter)

In this article, I chose a more accurate, but complex, black-box model to predict the Kentucky Derby's winning race time. This is because I don't care about generating insights or winning buy-in for change management; rather, I want to use the model that is the most accurate so I can make a data-driven gamble. In most business cases you will give up accuracy for explainability; however, there are some instances (like this one) in which accuracy is the primary requirement of a model.

This prediction is based on the forecasted weather for Saturday, May 6th, taken on Thursday, May 4th, so obviously it should be taken with a grain of salt. As everyone knows, even with huge amounts of technology, predicting the weather is very difficult. Using forecasted weather to predict the winning race time adds even more uncertainty. That being said, I will take whichever of the over or the under matches my predicted winning time of 122.12 seconds.
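For readers working in Python rather than R, the same pipeline (scaling, one-hot encoding, GBM fit, prediction) might look like the sketch below with scikit-learn; the weather columns and values are illustrative stand-ins, not the post's actual data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.preprocessing import StandardScaler

# Illustrative data: year, three numeric weather features, weather label,
# and winning time in seconds (2023 is the row to predict)
df = pd.DataFrame({
    "year":     [2023, 2022, 2021, 2020, 2019],
    "temp_f":   [72, 68, 75, 61, 66],
    "humidity": [60, 55, 70, 80, 65],
    "wind_mph": [8, 5, 10, 12, 7],
    "weather":  ["Cloudy", "Sunny", "Rain", "Sunny", "Rain"],
    "time_s":   [np.nan, 122.61, 122.85, 124.56, 123.93],
})

# Scale numeric features and one-hot encode the weather category
X_num = StandardScaler().fit_transform(df[["temp_f", "humidity", "wind_mph"]])
X_cat = pd.get_dummies(df["weather"], drop_first=True).to_numpy()
X = np.hstack([X_num, X_cat])

# Fit on pre-2023 rows, predict the 2023 winning time
train = df["time_s"].notna().to_numpy()
gbm = GradientBoostingRegressor(n_estimators=200, max_depth=2, learning_rate=0.05)
gbm.fit(X[train], df.loc[train, "time_s"])
pred_2023 = gbm.predict(X[~train])[0]
print(round(pred_2023, 2))
```

With tree ensembles the prediction stays within the range of the training targets, which is why a sanity check against historical winning times is a reasonable first validation.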


Machine Learning to Estimate Breast Cancer Recurrence | CLEP – Dove Medical Press

Introduction

Cancer recurrence is considered to be an important cancer outcome metric to measure the burden of the disease and success of (neo)adjuvant therapies. Despite this, high-quality breast cancer recurrence rates currently remain unknown in most countries, including Belgium. To date, cancer recurrence is not systematically registered in most population-based cancer registries, due to the difficulty and labor-intensity of registering follow-up for recurrences.

Recurrence definitions used for registration purposes differ among countries, due to the lack of consensus regarding a standardized clinical definition. Defining recurrence clinically is a challenge, since various methods exist to detect recurrences after (neo)adjuvant treatments of a patient, such as physical examination, pathological examination, imaging, or tumor markers. Unlike the guidelines and definitions that currently exist in the clinical trial setting,1,2 no guidelines are set to correctly and consistently register a recurrence in a patient with stage I–III breast cancer at diagnosis.

Real-world recurrence data could give an estimation of cancer burden and efficacy of cancer treatment modalities outside a conventional clinical trial setting, which could eventually lead to improvements in quality of care.3,4 Administrative data from health insurance companies on medical treatments and procedures, also known as bill claims, and hospital discharge data could represent an alternative source for the assessment of disease evolution after breast cancer treatment.

Recently, machine learning algorithms based on classification and regression trees (CART) have been developed to detect cancer recurrence at the population level using claims data.5 However, research teams in only a limited number of countries have been able to successfully construct algorithms to detect breast cancer recurrences, and only for a small number of centers (USA,6,7 Canada,8,9 Denmark,10,11 and Sweden12). Our aim was to develop, test and validate an algorithm using administrative data features allowing the estimation of breast cancer recurrence rates for all Belgian patients with breast cancer.

To construct and validate an algorithm to detect distant recurrences, female patients with breast cancer diagnosed between January 1, 2009 and December 31, 2014 were included from nine different centers located in all three Belgian regions. We did not include patients with stage IV breast cancer at diagnosis, patients with a history of cancer (any second primary cancer, multiple tumors, and contralateral tumors), or patients who could not be coupled to administrative data sources. All breast cancers, regardless of molecular subtype, were included. Among the nine centers were centers from the Flemish region (University Hospitals Leuven, General Hospital Groeninge, Jessa Hospital, Imelda Hospital, and AZ Delta), Brussels-Capital region (Cliniques universitaires Saint-Luc and Institut Jules Bordet) and Walloon region (CHR Mons-Hainaut and CHU UCL Namur). For all nine centers, 300 patients were included per center, by randomly selecting from the study population 50 patients per incidence year. The study population of six centers was divided by randomization (60–40% split-sample validation) into a training set to develop the algorithm, and an independent test set to perform an internal validation.13 The algorithm was additionally validated with an external validation set of the three remaining centers, to check reproducibility of the algorithm in a dataset with patients from other centers.
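The sampling design can be sketched as follows; the patient IDs are synthetic, and the resulting counts differ slightly from the study's reported 975 training / 713 internal test patients.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic patient IDs: six development centers and three external centers,
# using the cohort sizes reported in the study (1688 + 819 = 2507 patients)
dev_patients = np.arange(1688)
ext_patients = np.arange(1688, 2507)

# 60-40 split-sample validation within the six development centers
shuffled = rng.permutation(dev_patients)
cut = int(0.6 * len(shuffled))
train, internal_test = shuffled[:cut], shuffled[cut:]

print(len(train), len(internal_test), len(ext_patients))
```

Keeping the three external centers entirely out of model development is what makes the final validation a genuine test of reproducibility rather than a second internal split.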

For the selection of the nine centers, we aimed for a reasonable variety of center characteristics based on teaching vs non-teaching hospital, the spread across the three regions in Belgium, and center size.

For each patient in the study population, recurrence status (yes, no, unknown) and recurrence date (day, month, year) were extracted and collected from electronic medical files and reviewed by trained data managers from each of the nine hospitals. Recurrence was defined as the occurrence of a distant recurrence or metastasis between 120 days after the primary diagnosis and within 10 years of follow-up after diagnosis or end of study (December 31, 2018). Data managers were instructed to consider death due to breast cancer in our definition of a recurrence. Loco-regional recurrence was not considered as an outcome in our study. Both patients with a progression (without a disease-free interval) and patients with a recurrence (with a disease-free interval) were considered as outcomes in our definition of recurrence. Patients with an unknown recurrence status, due to the lack of follow-up for example, were excluded from the analysis. Patients with a recurrence within 120 days were considered de novo stage IV and therefore excluded, because interference of first-line treatment complicates recurrence detection. Starting from diagnosis to detect recurrent disease might cause more false positive recurrence cases, due to the treatment of the initial breast cancer overlapping with the immediate first-line treatment of metastatic disease. The recurrence diagnosis date was the time-point (day, month, and year) confirmed by pathological examination or imaging (CT, PET-CT, bone scintigraphy or MRI scan), or defined by physicians in the multidisciplinary team meeting (MDT).

In the course of an extensive data linking process with pseudonymization of the patient data, the recurrence data from the hospitals (i.e., the gold standard) were linked to several population-based data sources. These included cancer registration data from the Belgian Cancer Registry (BCR), and administrative data sources, including claims or reimbursement data (InterMutualistic Agency, IMA),14 hospital discharge data (Technische Cel, TCT),15 information on vital status (Crossroads Bank for Social Security, CBSS)16 and cause of death (Agentschap Zorg en Gezondheid, Observatoire de la Santé et du Social de Bruxelles-Capitale, and Agence pour une Vie de Qualité, AVIQ).17 Information on data sources and data used is presented in Appendix 1.

To build a robust algorithm to detect distant recurrences, pre-processing and extraction of features were performed. Expert-driven features to potentially detect recurrences in administrative data were created based on recommendations from breast oncologists (P.N. and H.W.). First, a comprehensive list of reimbursement codes for diagnostic and therapeutic procedures and medications was selected, and code groups were created based on their relevance for the diagnosis and/or treatment of distant metastasis in breast cancer patients (See Appendix 2).

Potential features were further refined based on the exploration of data from patients with a recurrence, including time-frames starting from time points after diagnosis (0 days, 90 days, 160 days, 270 days, and 365 days after diagnosis). We assessed different time-frames to obtain the most accurate feature to detect recurrences, and because starting from the date of diagnosis might result in noise from the treatment of the initial breast cancer. We additionally created features based on counts of codes, by assessing the maximum number of codes per year or per pre-defined time-frame (starting from 0, 90, 160, 270, and 365 days after diagnosis) (Table 1). The best performing time-frame was selected for each feature by maximizing Youden's J index:18 J = sensitivity + specificity − 1.
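Time-frame selection by Youden's J (sensitivity + specificity − 1) can be sketched as below; the confusion counts per candidate start day are illustrative, not the study's.

```python
def youden_j(tp, fn, tn, fp):
    """Youden's J = sensitivity + specificity - 1."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens + spec - 1.0

# Hypothetical (tp, fn, tn, fp) counts for one feature, evaluated with
# time-frames starting 0, 90, 160, 270, and 365 days after diagnosis
candidates = {
    0:   (60, 18, 820, 77),
    90:  (63, 15, 850, 47),
    160: (62, 16, 870, 27),
    270: (66, 12, 880, 17),
    365: (58, 20, 885, 12),
}

# Pick the start day whose confusion counts maximize J
best_start = max(candidates, key=lambda d: youden_j(*candidates[d]))
print(best_start)
```

Maximizing J balances sensitivity against specificity with equal weight, which suits a screening-style feature where both missed recurrences and false alarms are costly.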

Table 1 List of Potential Markers for Recurrence (Available Within Administrative Data) Based on Recommendations from Breast Oncologists

After a feature list was obtained (as described in the previous section), it was narrowed down using the ensemble method of bootstrapping.19 In total, 1000 bootstrap samples were used to generate 1000 classification and regression trees (CART) from the same training set, and the best-performing features were selected based on their frequency across trees.19,20

Cost-complexity pruning was applied for each bootstrap sample, to obtain the best performing model and avoid over-fitting of the model to the dataset.20 CART inherently uses entropy for the selection of nodes or features. The higher the entropy, the more informative and useful the feature is.20 A 10-fold cross-validation was also performed to ensure robustness of the model in different training sets. Collinearity of the selected features was accounted for by the one standard error (1-SE) rule, to eliminate redundant features. The 1-SE rule selects the least complex tree that is within 1 standard error from the best performing tree.21
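The bootstrap-and-count feature selection can be sketched with scikit-learn's CART implementation and its cost-complexity pruning parameter (`ccp_alpha`); the synthetic data, pruning strength, and 50% frequency cut-off are illustrative assumptions (the study used SAS and 1000 bootstrap samples).

```python
import numpy as np
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the claims-derived feature matrix, with a
# recurrence rate of roughly 10% as in the study's training set
X, y = make_classification(n_samples=600, n_features=8, n_informative=3,
                           weights=[0.9], random_state=0)

rng = np.random.default_rng(0)
counts = Counter()
n_boot = 200  # the study used 1000 bootstrap samples

for _ in range(n_boot):
    idx = rng.integers(0, len(X), len(X))            # bootstrap resample
    tree = DecisionTreeClassifier(ccp_alpha=0.002,   # cost-complexity pruning
                                  random_state=0).fit(X[idx], y[idx])
    used = tree.tree_.feature[tree.tree_.feature >= 0]  # internal-node features
    counts.update(np.unique(used).tolist())

# Keep features that appear in the majority of the bootstrap trees
selected = sorted(f for f, c in counts.items() if c > 0.5 * n_boot)
print(selected)
```

Counting how often a feature survives pruning across resamples is what stabilizes the selection: a feature chosen by chance in one tree rarely recurs in hundreds of bootstrap trees.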

Based on the selected features from the bootstrapping, a principal CART model was built to classify patients as having a recurrence or not by using the complete training set.

Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and classification accuracy were calculated for evaluating and comparing the performance of the principal CART model. All models were created and trained in SAS 9.4 (SAS Institute, Cary, NC, USA) within the SAS Enterprise Guide software (version 7.15 of the SAS System for Windows).
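These metrics follow directly from the confusion counts. A minimal sketch, with hypothetical counts chosen to be consistent with the reported training-set figures (78 recurrences among 975 patients):

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# 62 true positives, 16 false negatives (78 recurrences),
# 881 true negatives, 16 false positives (897 non-recurrences)
m = classification_metrics(tp=62, fp=16, tn=881, fn=16)
print(round(m["sensitivity"], 3), round(m["specificity"], 3), round(m["accuracy"], 3))
```

Note how accuracy is dominated by the large non-recurrence class here, which is why sensitivity and specificity are reported separately for an imbalanced outcome like recurrence.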

Data for a total of 2507 patients could be retrieved from nine Belgian centers and were included in the final dataset to train, test and externally validate the algorithm (Figure 1 and Table 2). The mean follow-up period was 7.4 years. For the split sample validation, the patients from six centers were split into the training set (N = 975 of which 78 distant recurrences, 8.0%) and internal validation set (N = 713 of which 56 distant recurrences, 7.9%). The external validation set consisted of three independent centers with 819 patients, of which 82 had distant recurrences (10.0%). The training, internal validation, and external validation sets did not have differences in distribution of baseline tumor and patient characteristics (Table 2).

Table 2 Baseline Patient and Tumor Characteristics

Figure 1 Patient inclusion flow diagram.

Based on bootstrap aggregation, 1000 CART models were built using the following features: (1) Presence of a follow-up MDT meeting, starting from 270 days after diagnosis (feature present in 975 out of 1000 CART models), (2) Maximum number of CT codes present (with a moving average over time) of 5 or more times a year (851 CART models), and (3) Death due to breast cancer (412 CART models) (see Supplementary Figure 1). Afterwards, the final CART model was constructed with these three features and calculated by using all data of the training set (Figure 2).

Figure 2 Final CART model to detect recurrences based on the three selected features after bootstrapping. Nodes represent selected features by the algorithm to classify patients.

Abbreviations: MDT, multidisciplinary team meeting; CT, computed tomography scan.

The sensitivity of the principal CART model to detect recurrences for the training set was 79.5% (95% confidence interval [CI] 68.8–87.8%), specificity was 98.2% (95% CI 97.1–99.0%), with an overall accuracy of 96.7% (95% CI 95.4–97.7%) (Table 3), and an AUC (area under the curve) of 94.2%. After 10-fold cross-validation within the training set, we found a sensitivity of 71.8% (95% CI 66.4–86.7%), specificity of 98.2% (95% CI 96.3–98.5%) and overall accuracy of 96.1% (95% CI 94.7–97.2%). The internal validation (i.e. based on the test set) resulted in a sensitivity of 83.9% (95% CI 71.7–92.4%), a specificity of 96.7% (95% CI 95.0–98.9%), and accuracy of 95.7% (95% CI 93.9–97.0%). After external validation was performed on three additional centers, the sensitivity was 84.1% (95% CI 74.4–91.3%), with a specificity of 98.2% (95% CI 97.0–99.1%) and accuracy of 96.8% (95% CI 95.4–97.9%).

Table 3 Performance of Training Set, Cross Validation, Internal Validation Set and External Validation Set

In this study, we successfully developed a machine learning algorithm to detect distant recurrence in patients with breast cancer, achieving an accuracy of 96.8% after external validation in multiple centers across Belgium. The final list of selected parameters was: presence of a follow-up MDT meeting, a CT scan code appearing 5 or more times a year, and death due to breast cancer. Recurrence data are lacking in many population-based cancer registries due to the cost and labor-intensity of registration.3 The true incidence of cancer recurrence should be known across age groups and regions in Belgium, to measure burden of illness and eventually improve quality of care. Current recurrence numbers are often extrapolated from clinical trials, which typically exclude older and frail patients. Older patients are more likely to receive under-treatment and to experience recurrences,22,23 so recurrence numbers could therefore be underestimated.

The administrative data sources used in our algorithm cover virtually all residents of Belgium,14 which was useful for achieving population-based recurrence data. We were also able to accomplish a multi-centric study by developing the training model and performing an external validation based on data from multiple centers. Furthermore, a relatively large population and a reliable gold standard are highly important for developing and training a machine learning model in such studies, to avoid a prolonged and complicated feature selection process caused by conflicting recurrence and treatment data.

The definition of a distant recurrence in medical files was the occurrence of a distant recurrence or metastases after a period of 120 days. This time-frame until detection of recurrence varied among previous studies.24–27 The most common exclusions were made either from 120 days (Chubak et al 2012) or 180 days after diagnosis (Amar et al 2020). Disease progression can be difficult to measure accurately and can be overestimated because of the timing of therapeutic procedures, which might be delayed. A limitation of our study was that we could not make a distinction between disease progression and disease recurrence. Defining medical recurrence in the clinic is a challenge, which makes it even more difficult to define recurrence with a proxy based on administrative data.28 Therefore, setting a clear definition of the window of treatment and the time-frame for detection of recurrence is considered important for future studies.

We chose to restrict our definition to distant recurrences to achieve a straightforward feature selection. We included death due to breast cancer as an outcome in our definition of recurrences. Cause-specific death and accurate source of cause of death is of utmost importance when studying recurrences, since recurrence and death are closely related to each other.29

The machine learning algorithm used in this study was a decision tree, i.e. a Classification And Regression Tree (CART), combined with an ensemble method. Ensemble learning combines multiple decision trees sequentially (boosting) or in parallel (bootstrap aggregation). The key advantages of using bootstrap aggregation are better predictive accuracy, less variance, and less bias than a single decision tree. Likewise, recent studies increasingly make use of ensemble methods.7,9,12
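The contrast between a single tree and bootstrap aggregation (bagging) can be sketched as follows; the data are synthetic and the study's actual model was a pruned CART built in SAS.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification problem standing in for recurrence detection
X, y = make_classification(n_samples=600, n_features=8, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, random_state=1)

# One CART versus 100 CARTs fit in parallel on bootstrap resamples
single = DecisionTreeClassifier(random_state=1).fit(X_tr, y_tr)
bagged = BaggingClassifier(DecisionTreeClassifier(random_state=1),
                           n_estimators=100, random_state=1).fit(X_tr, y_tr)

s_single = single.score(X_te, y_te)
s_bag = bagged.score(X_te, y_te)
print(round(s_single, 3), round(s_bag, 3))
```

Averaging over bootstrap resamples typically smooths out the high variance of individual trees, which is the property the text credits to bootstrap aggregation.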

Among the recurrence detection features selected by the bootstrapping method for the cohort of six different Belgian centers, no treatment features were chosen, which could indicate that centers are more similar in their diagnostic regimens than in their treatment regimens. During pre-processing of the features, we performed additional checks to improve the accuracy of the model. For instance, we generated a treatment feature that included only metastasis-specific chemotherapy agent codes; however, this feature was not included in the final model. Next, we tried a model without diagnostic features, but this did not improve accuracy. Previous studies mostly make use of metastatic diagnosis codes (secondary malignant neoplasm or SMN codes from ICD-9 or ICD-10) in their algorithms, which would be useful if highly reliable. We also applied the algorithm to subgroups, testing different models for patients younger or older than 70 years and for different incidence years, to check whether the algorithm performed better in specific subgroups. As expected, we found higher performance in younger patients (Supplementary Table 1).

Our algorithm performance was comparable to previous studies using decision trees.9,12,24,30–32 We found greater accuracy compared with the pooled accuracy of previous algorithms.5

Although algorithms with the highest overall accuracy are often sought after in earlier studies, some studies also provide multiple algorithms to choose from based on preference, e.g. high-sensitivity or high-specificity algorithms.6,10,24,26,30 Finally, we also investigated the false negative cases from University Hospitals Leuven to explain why these cases were misclassified. We found that most false negative cases were missed due to the lack of attestation of the claims or management of the patients' procedures. These were most likely patients for whom there was a decision to withhold treatment because of comorbid disease, older age, or the prognosis of the recurrence, or patients whose treatments were reimbursed by the sponsor of a clinical trial.

Previously, algorithms based on administrative claims data to detect breast cancer recurrences at the population level have been established.5,7–10,12 For example, research groups from the USA, Canada, and Sweden have built algorithms to detect recurrences in a delimited region within a population. Recent results from these groups have proven that machine learning algorithms based on administrative data can be used to detect recurrences in the absence of systematic registration. These studies, however, only encompassed a few centers and were thus not validated in a larger cohort of a population. Moreover, most of these algorithms included complete metastasis-specific International Classification of Diseases (ICD) codes to detect recurrences. Since metastasis-specific codes are not complete in our database, we were not able to use these codes in our algorithm. Notably, the Danish registry has actively collected recurrence information in the Danish Breast Cancer Group (DBCG) clinical database, which they were able to use to construct and validate population-based recurrence algorithms to complete their recurrence database.10,11 Additionally, they were able to look into long-term recurrences beyond 10 years after the incidence date.4,33

The objective of this study was to develop an algorithm that could be used on a nation-wide level to estimate population-wide distant recurrences. Compared with other studies, we used a large sample size and reported both internal and external validation, which was rarely reported in earlier studies.5 Another strength of our study is that, unlike many other studies from the USA using Medicare claims,34–38 we were able to include all eligible patients with a breast cancer diagnosis, and not just patients older than 65 years.

Although we used different diagnosis and treatment code sources, it should be noted that treatment regimens change over time, so the features should be adapted before later use. Adapting the algorithm to changes in diagnosis or treatment regimens might be necessary to obtain accurate recurrence rates for future incidence years. Ideally, we would also have long-term follow-up and claims data to detect long-term recurrences. However, due to regulations and the large volume of data generated, longer follow-up of the codes was not possible within the current study. Longer follow-up of recurrences and administrative data would likely improve accuracy and lead to a more robust algorithm.

In conclusion, our machine learning algorithm to detect metastatic breast cancer recurrences performed with high accuracy after external validation. In Belgium, claims data on medical procedures and medications, hospital discharge data, and vital status and cause-of-death data are available at the whole-population level, which allows such models to be developed. This substantiates the feasibility of developing and validating recurrence algorithms at the population level and might encourage other population-based registries to develop recurrence models or actively register recurrences, as these become progressively more important. Such rates are valuable for gaining insight into recurrences outside the clinical trial setting and might underscore the importance of active registration of recurrences.

AUC, Area under the curve; ATC, Anatomical Therapeutic Chemical classification; AVIQ, Agence pour une Vie de Qualité; BCR, Belgian Cancer Registry; CA15-3, Cancer antigen 15-3; CART, Classification and regression tree; CBSS, Crossroads Bank for Social Security; CT, Computed tomography; FN, False negatives; FP, False positives; ICD, International Classification of Diseases and Related Health Problems; IMA, InterMutualistic Agency; MDT, Multidisciplinary team meeting; MRI, Magnetic Resonance Imaging; MZG, Minimale Ziekenhuis Gegevens; NPV, Negative predictive value; PPV, Positive predictive value; PET-CT, Positron emission tomography computed tomography; SE, Standard error; SMN, Secondary malignant neoplasm; TN, True negatives; TP, True positives.

The data that support the findings of this study are available upon reasonable request. The data can be given within the secured environment of the Belgian Cancer Registry, according to its regulations, and only upon approval by the Information Security Committee.

This retrospective chart review study involving human participants was in accordance with the ethical standards of the institutional and national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. This study was approved by the Ethics Committee of University Hospitals Leuven (S60928). Informed consent for use of data of all participants was obtained.

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

This work was supported by VZW THINK-PINK (Belgium).

The authors report no conflicts of interest in this work.

1. Gourgou-Bourgade S, Cameron D, Poortmans P, et al. Guidelines for time-to-event end point definitions in breast cancer trials: results of the DATECAN initiative (Definition for the Assessment of Time-to-event Endpoints in CANcer trials). Ann Oncol. 2015;26(5):873–879. doi:10.1093/annonc/mdv106

2. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer. 2009;45(2):228–247. doi:10.1016/j.ejca.2008.10.026

3. Warren JL, Yabroff KR. Challenges and opportunities in measuring cancer recurrence in the United States. J Natl Cancer Inst. 2015;107:djv134. doi:10.1093/jnci/djv134

4. Negoita S, Ramirez-Pena E. Prevention of late recurrence: an increasingly important target for breast cancer research and control. J Natl Cancer Inst. 2021. doi:10.1093/jnci/djab203

5. Izci H, Tambuyzer T, Tuand K, et al. A systematic review of estimating breast cancer recurrence at the population level with administrative data. J Natl Cancer Inst. 2020;112:979–988. doi:10.1093/jnci/djaa050

6. Ritzwoller DP, Hassett MJ, Uno H, et al. Development, validation, and dissemination of a breast cancer recurrence detection and timing informatics algorithm. J Natl Cancer Inst. 2018;110:273–281. doi:10.1093/jnci/djx200

7. Amar T, Beatty JD, Fedorenko C, et al. Incorporating breast cancer recurrence events into population-based cancer registries using medical claims: cohort study. JMIR Cancer. 2020;6(2):1–10.

8. Cairncross ZF, Nelson G, Shack L, Metcalfe A. Validation in Alberta of an administrative data algorithm to identify cancer recurrence. Curr Oncol. 2020;27(3):e343–e346. doi:10.3747/co.27.5861

9. Lambert P, Pitz M, Singh H, Decker K. Evaluation of algorithms using administrative health and structured electronic medical record data to determine breast and colorectal cancer recurrence in a Canadian province. BMC Cancer. 2021;21(1):1–10. doi:10.1186/s12885-021-08526-9

10. Pedersen RN, Öztürk B, Mellemkjær L, et al. Validation of an algorithm to ascertain late breast cancer recurrence using Danish medical registries. Clin Epidemiol. 2020;12:1083–1093. doi:10.2147/CLEP.S269962

11. Rasmussen LA, Jensen H, Virgilsen LF, et al. A validated algorithm for register-based identification of patients with recurrence of breast cancer based on Danish Breast Cancer Group (DBCG) data. Cancer Epidemiol. 2019;59:129–134. doi:10.1016/j.canep.2019.01.016

12. Valachis A, Carlqvist P, Szilcz M, et al. Use of classifiers to optimise the identification and characterisation of metastatic breast cancer in a nationwide administrative registry. Acta Oncol. 2021;60(12):1604–1610. doi:10.1080/0284186X.2021.1979645

13. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J. 2014;35:1925–1931. doi:10.1093/eurheartj/ehu207

14. Het Intermutualistisch Agentschap [The Intermutualistic Agency] (IMA) - L'Agence InterMutualiste (AIM). https://ima-aim.be/.

15. Technische Cel voor het beheer van de MZG-MFG data [Technical cell for management of MZG-MFG data] - La Cellule Technique pour la gestion des données RHM-RFM. https://tct.fgov.be/.

16. CBSS - Crossroads Bank for Social Security. Available from: https://www.ksz-bcss.fgov.be/nl/documents-list. Accessed April 28, 2023.

17. Agence pour une Vie de Qualité [Walloon Agency for quality of life] (AViQ). https://www.aviq.be/.

18. Smits N. A note on Youden's J and its cost ratio. BMC Med Res Methodol. 2010;10(1):1–4. doi:10.1186/1471-2288-10-89

19. Sutton CD. Classification and regression trees, bagging, and boosting. Handb Stat. 2005;24:303–329.

20. Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and regression trees. 1984:1–358.

21. Chen Y, Yang Y. The one standard error rule for model selection: does it work? Stats. 2021;4(4):868–892. doi:10.3390/stats4040051

22. Enger SM, Soe ST, Buist DSM, et al. Breast cancer treatment of older women in integrated health care settings. J Clin Oncol. 2006;24(27):4377–4383. doi:10.1200/JCO.2006.06.3065

23. Han Y, Sui Z, Jia Y, et al. Metastasis patterns and prognosis in breast cancer patients aged ≥80 years: a SEER database analysis. J Cancer. 2021;12(21):6445. doi:10.7150/jca.63813

24. Xu Y, Kong S, Cheung WY, et al. Development and validation of case-finding algorithms for recurrence of breast cancer using routinely collected administrative data. BMC Cancer. 2019;19(1):1–10. doi:10.1186/s12885-019-5432-8

25. Chubak J, Onega T, Zhu W, et al. An electronic health record-based algorithm to ascertain the date of second breast cancer events. Med Care. 2017;55:e81–e87. doi:10.1097/MLR.0000000000000352

26. Kroenke CH, Chubak J, Johnson L, et al. Enhancing breast cancer recurrence algorithms through selective use of medical record data. J Natl Cancer Inst. 2016;108. doi:10.1093/jnci/djv336

27. Cronin-Fenton D, Kjærsgaard A, Nørgaard M, et al. Breast cancer recurrence, bone metastases, and visceral metastases in women with stage II and III breast cancer in Denmark. Breast Cancer Res Treat. 2018;167(2):517–528. doi:10.1007/s10549-017-4510-3

28. In H, Bilimoria KY, Stewart AK, et al. Cancer recurrence: an important but missing variable in national cancer registries. Ann Surg Oncol. 2014;21(5):1520–1529. doi:10.1245/s10434-014-3516-x

29. Nout RA, Fiets WE, Struikmans H, et al. The in- or exclusion of non-breast cancer related death and contralateral breast cancer significantly affects estimated outcome probability in early breast cancer. Breast Cancer Res Treat. 2008;109(3):567–572. doi:10.1007/s10549-007-9681-x

30. Chubak J, Yu O, Pocobelli G, et al. Administrative data algorithms to identify second breast cancer events following early-stage invasive breast cancer. J Natl Cancer Inst. 2012;104(12):931–940. doi:10.1093/jnci/djs233

31. Nordstrom B, Whyte J, Stolar M, Mercaldi C, Kallich JD. Identification of metastatic cancer in claims data. Pharmacoepidemiol Drug Saf. 2012;21(2):21–28. doi:10.1002/pds.3247

32. Nordstrom BL, Simeone JC, Malley KG, et al. Validation of claims algorithms for progression to metastatic cancer in patients with breast, non-small cell lung, and colorectal cancer. Pharmacoepidemiol Drug Saf. 2015;24(1, SI):5–11.

33. Pedersen RN, Öztürk Esen B, Mellemkjær L, et al. The incidence of breast cancer recurrence 10–32 years after primary diagnosis. J Natl Cancer Inst. 2021. doi:10.1093/jnci/djab202

34. Lamont EB, Herndon JE II, Weeks JC, et al. Measuring disease-free survival and cancer relapse using Medicare claims from CALGB breast cancer trial participants (companion to 9344). J Natl Cancer Inst. 2006;98(18):1335–1338. doi:10.1093/jnci/djj363

35. Chawla N, Yabroff KR, Mariotto A, et al. Limited validity of diagnosis codes in Medicare claims for identifying cancer metastases and inferring stage. Ann Epidemiol. 2014;24(9):666–672.e2. doi:10.1016/j.annepidem.2014.06.099

36. Hassett MJ, Ritzwoller DP, Taback N, et al. Validating billing/encounter codes as indicators of lung, colorectal, breast, and prostate cancer recurrence using 2 large contemporary cohorts. Med Care. 2014;52(10):e65–e73. doi:10.1097/MLR.0b013e318277eb6f

37. Sathiakumar N, Delzell E, Yun H, et al. Accuracy of Medicare claim-based algorithm to detect breast, prostate, or lung cancer bone metastases. Med Care. 2017;55:e144–e149. doi:10.1097/MLR.0000000000000539

38. McClish D, Penberthy L, Pugh A. Using Medicare claims to identify second primary cancers and recurrences in order to supplement a cancer registry. J Clin Epidemiol. 2003;56(8):760–767. doi:10.1016/S0895-4356(03)00091-X

View original post here:
Machine Learning to Estimate Breast Cancer Recurrence | CLEP - Dove Medical Press


Securing weak spots in AML: Optimizing Model Evaluation with … – Finextra

Manually evaluating transaction monitoring models is slow and error-prone, with mistakes resulting in potentially large fines. To avoid this, banks are increasingly turning to automated machine learning.

Regulators increasingly expect banks and financial institutions to be able to demonstrate the effectiveness of their transaction monitoring systems.

As part of this process, banks need to evaluate the models they use and verify (and document) that they're up to the task. Institutions that fail to maintain a sufficiently effective anti-money laundering program are frequently hit with huge fines, including several that have totaled over USD 1 billion.

Lisa Monaco, the deputy attorney general at the US Department of Justice (DoJ), said, while announcing a recent fine for Danske Bank, to "expect companies to invest in robust compliance programs. Failure to do so may well be a one-way ticket to a multi-billion-dollar guilty plea."

Such threats put added pressure on smaller banks and FIs. While larger institutions often struggle less thanks to their armies of data scientists, model validation and evaluation can be a burden for players with more limited resources.

What is a model?

In the US, banks commonly monitor transactions using a rule-based system of parameters and thresholds. Common rules track the total value of transactions over a period of time, or an increase in the volume or value of transactions. If sufficient conditions are met, an alert is triggered.

Even in their simplest incarnation, regulators consider such systems to be models. According to supervisory guidance OCC 2011-12, a model is defined as any quantitative approach that processes inputs and produces reports. In practice, a typical rule-based transaction monitoring system involves multiple layers of rules.
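The kind of value-over-a-period rule described above can be sketched as follows (the data layout, threshold, and window length are invented for illustration; they are not from the article or any specific vendor system):

```python
from datetime import datetime, timedelta

def value_over_period_alerts(transactions, threshold, window_days=30):
    """Flag accounts whose total transaction value within any rolling
    `window_days` window exceeds `threshold` (illustrative rule only).

    `transactions` is a list of (account_id, timestamp, amount) tuples.
    """
    alerts = set()
    by_account = {}
    for account, ts, amount in transactions:
        by_account.setdefault(account, []).append((ts, amount))
    for account, txns in by_account.items():
        txns.sort()
        window, total = [], 0.0
        for ts, amount in txns:
            window.append((ts, amount))
            total += amount
            # Evict transactions that fell out of the rolling window
            while ts - window[0][0] > timedelta(days=window_days):
                total -= window[0][1]
                window.pop(0)
            if total > threshold:
                alerts.add(account)
    return alerts
```

A production system would layer many such rules, which is why evaluating the resulting model quickly becomes complex.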

Regardless of complexity, banks must manage model risks appropriately. There are three main types of model risk that banks need to consider:

These risks are easy to name, but assessing them can be extremely challenging. The OCC supervisory guidance stipulates that banks should manage model risks just like any other type of risk, which includes critical analysis by objective, informed parties who can identify model limitations and assumptions and produce appropriate change.

This guidance states that banks should ensure their models are performing as expected, in line with their design objectives and business uses. It defines the key elements of an effective validation framework as:

Regulatory compliance

Regulators have continued to raise the bar as the US seeks to restrict sanctioned countries' and individuals' access to the financial system and to crack down on financial crime in general.

Since 2018, the New York State Department of Financial Services has required boards or senior officers to submit an annual compliance finding that certifies the effectiveness of an institution's transaction monitoring and sanctions filtering programs.

Taking this a step further, the DoJ announced in 2022 that it was considering a requirement for chief executives and chief compliance officers to certify the design and implementation of their compliance program. With continued geopolitical tensions as the war in Ukraine drags on, the potential cost of a compliance failure is only going to increase.

The regulation of models comes under these broad requirements for effective risk controls. While the approach that banks take to evaluate models will vary on a case-by-case basis, the general principles apply equally.

Similarly, the frequency of model evaluation should be determined using a risk-based approach, typically prompted by any significant changes to the institution's risk profile, such as a merger or acquisition, or expansion into new products, services, customer types or geographic areas. However, regulators increasingly expect models to be evaluated as often as every 12-18 months.

Model evaluation challenges

Rule-based models are being asked to do much more as the nature and volume of financial transactions have evolved. As new threats have emerged, models have become more and more complex (though not necessarily more effective). Unfortunately, many are not up to the task.

In many cases, the model has become a confusing black box that few people within the institution understand. Over the years, changes to data feeds, scenario logic, system functions, and staffing can mean that documentation explaining how the model works is incomplete or inaccurate. All of this can make evaluation very difficult for smaller banks. A first-time assessment will almost certainly be time-consuming and costly, and possibly flawed.

However, the challenges are not going away. Changes in consumer behavior, which accelerated during the pandemic, are here to stay. Banks and FIs have digitized their operations, vastly increasing their range of online services and payment methods. Consumers are also showing greater willingness to switch to challenger banks with digital-first business models.

These changes have created more vulnerabilities. Competitive pressures are putting compliance budgets under pressure, while the expansion of online services has created more opportunities for AML failures. To keep up, FIs need to respond quickly and flexibly to new threats.

Better model evaluation with Automated Machine Learning

This process of model evaluation can be optimized using automated machine learning (AutoML), which allows models to be evaluated continuously (or on short cycles) with a standardized process, leading to higher-quality evaluations. By contrast, the manual approach is slow and highly error-prone.

AutoML models take huge sets of data, learn from the behaviors encoded in that data, and reveal patterns that indicate evidence of money laundering. The rapidly changing landscape of AML regulations, combined with the growing number of transactions and customers, leaves almost no room for a traditional manual, project-by-project approach. That is why the industry is increasingly looking at a more disruptive approach: models that are trained on customers' good behavior. The results of this non-traditional method, in combination with AutoML, let banks adapt to the new reality and stay ahead of almost any new criminal pattern.

Continued here:
Securing weak spots in AML: Optimizing Model Evaluation with ... - Finextra


IEEE Computer Society Emerging Technology Fund Recipient … – Benzinga

Presentation at The Eleventh International Conference on Learning Representations (ICLR) debuts new findings for end-to-end neural network Trojan removal techniques

LOS ALAMITOS, Calif., May 5, 2023 /PRNewswire/ -- Today, at the virtual Backdoor Attacks and Defenses in Machine Learning (BANDS) workshop during The Eleventh International Conference on Learning Representations (ICLR), participants in the IEEE Trojan Removal Competition presented their findings and success rates at effectively and efficiently mitigating the effects of neural trojans while maintaining high performance. Evaluated on clean accuracy, poisoned accuracy, and attack success rate, the competition's winning team from the Harbin Institute of Technology in Shenzhen, with its entry HZZQ Defense, formulated a highly effective solution, achieving a 98.14% poisoned accuracy rate and only a 0.12% attack success rate. This team will be awarded the first-place prize of $5,000 USD.
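The three evaluation criteria named here can be computed from a model's predictions on clean and trigger-stamped inputs. A minimal sketch using generic definitions (the competition's exact scoring formula is not given in the article):

```python
def trojan_metrics(clean_preds, clean_labels, trig_preds, trig_labels, target_label):
    """Clean accuracy, poisoned accuracy, and attack success rate (ASR)
    under generic backdoor-evaluation definitions:
    - clean accuracy: accuracy on unmodified inputs
    - poisoned accuracy: accuracy w.r.t. true labels on trigger-stamped inputs
    - ASR: fraction of trigger-stamped inputs classified as the attacker's target
    """
    clean_acc = sum(p == y for p, y in zip(clean_preds, clean_labels)) / len(clean_labels)
    poisoned_acc = sum(p == y for p, y in zip(trig_preds, trig_labels)) / len(trig_labels)
    asr = sum(p == target_label for p in trig_preds) / len(trig_preds)
    return clean_acc, poisoned_acc, asr
```

A successful trojan removal keeps clean and poisoned accuracy high while driving ASR toward zero, as in the winning entry's 98.14% / 0.12% result.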

"The IEEE Trojan Removal Competition is a fundamental solution to improve the trustworthy implementation of neural networks from implanted backdoors," said Prof. Meikang Qiu, chair of IEEE Smart Computing Special Technical Committee (SCSTC) and full professor of Beacom College of Computer and Cyber Science at Dakota State University, Madison, S.D., U.S.A. He also was named the distinguished contributor of IEEE Computer Society in 2021. "This competition's emphasis on Trojan Removal is vital because it encourages research and development efforts toward enhancing an underexplored but paramount issue."

In 2022, IEEE CS established its Emerging Technology Fund and, for the first time, awarded $25,000 USD to IEEE SCSTC for the "Annual Competition on Emerging Issues of Data Security and Privacy (EDISP)," which yielded the IEEE Trojan Removal Competition (TRC '22). The proposal offered a novel take on a cybersecurity topic: unlike most existing competitions, which focus only on backdoor model detection, this competition encouraged participants to explore solutions that enhance the security of the neural networks themselves. By developing general, effective, and efficient white-box trojan removal techniques, participants have contributed to building trust in deep learning and artificial intelligence, especially for pre-trained models in the wild, which is crucial to protecting artificial intelligence from potential attacks.

With 1,706 valid submissions from 44 teams worldwide, six groups successfully developed techniques that achieved better results than the state-of-the-art baseline metrics published in top machine-learning venues. The benchmarks summarizing the models and attacks used during the competition are being released to enable additional research and evaluation.

"We're hoping that this benchmark provides diverse and easy access to model settings for people coming up with new AI security techniques," shared Yi Zeng, the competition chair of IEEE TRC'22 and a research assistant at the Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, Va., U.S.A. "This competition has yielded new data sets consisting of trained poisoned pre-trained models that are of different architectures and trained on diverse kinds of data distributions, with really high attack success rates, and now developers can explore new defense methods and get rid of remaining vulnerabilities."

During the competition, collective participant results yielded two key findings:

These findings point to the fact that for the time being, a generalized approach to mitigating attacks on neural networks is not advisable. Zeng emphasized the urgent need for a comprehensive AI security solution: "As we continue to witness the widespread impact of pre-trained foundation models on our daily lives, ensuring the security of these systems becomes increasingly critical. We hope that the insights gleaned from this competition, coupled with the release of the benchmark, will galvanize the community to develop more robust and adaptable security measures for AI systems."

"As the world becomes more dependent on AI and machine learning, it is important to deal with the security and privacy issues that these technologies bring up," said Qiu. "The IEEE TRC '22 competition for EDISP has made a big difference in this area. I'd like to offer a special thanks to my colleagues on the steering committee, Professors Ruoxi Jia from Virginia Tech, Neil Gong from Duke, Tianwei Zhang from Nanyang Technological University, Shu-Tao Xia from Tsinghua University, and Bo Li from the University of Illinois Urbana-Champaign, for their help and support."

Ideas and insights coming out of the event, along with the public benchmark data, will help make the future of machine learning and artificial intelligence safer and more dependable. The team plans to run the competition for a second year, and those findings will further strengthen the security parameters of neural networks.

"This is precisely the kind of work we want the Emerging Technology Fund to fuel," said Nita Patel, 2023 IEEE Computer Society President. "It goes a long way toward bolstering iterative developments that will strengthen the security of machine learning and AI platforms as the technologies advance."

For more information about the Emerging Technology Grants Program overall, visit https://www.computer.org/communities/emerging-technology-fund.

About IEEE Trojan Removal Competition

The IEEE TRC'22 aims to encourage the development of innovative end-to-end neural network backdoor removal techniques to counter backdoor attacks. For more information, visit https://www.trojan-removal.com/.

About IEEE Computer Society

The IEEE Computer Society is the world's home for computer science, engineering, and technology. A global leader in providing access to computer science research, analysis, and information, the IEEE Computer Society offers a comprehensive array of unmatched products, services, and opportunities for individuals at all stages of their professional careers. Known as the premier organization that empowers the people who drive technology, the IEEE Computer Society offers international conferences, peer-reviewed publications, a unique digital library, and training programs. Visit computer.org for more information.

SOURCE IEEE Computer Society

Read the original post:
IEEE Computer Society Emerging Technology Fund Recipient ... - Benzinga


Computer science research team explores how machine learning … – The College of New Jersey News

Services like Google Translate can help millions of people communicate in over 100 languages. Users can type or speak words to be translated, or even translate text in photos and videos using augmented reality.

Now, computer science professor Andrea Salgian and Ben Guerrieri 26 are working to add one more language to the list: American Sign Language.

Using computer vision and machine learning, the researchers are setting out to create a program to serve as a Google Translate tool for ASL speakers to sign to the camera and receive a direct translation.

"Right now, we're looking at recognizing letters and words that have static gestures," Salgian said, referring to letters in the ASL alphabet with no hand movement. The program will act more like a dictionary at first; the pair will then develop the automated translation, she explained.

Salgian's research utilizes a free machine-learning framework called MediaPipe, which is developed by Google and uses a camera to detect joint locations in real time. The program tracks the user's movements, provides the coordinates of every joint in the hand, and uses those coordinates to extract gestures that are matched to ASL signs.
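Once per-joint coordinates are available, a static-gesture "dictionary" of the kind described can be as simple as nearest-neighbor matching against stored templates. A sketch under that assumption (the template values and the matching method are invented for illustration; the article does not describe the project's actual matching algorithm):

```python
import math

# Hypothetical templates: sign name -> flattened (x, y) joint coordinates.
# A real system would store coordinates for all 21 hand landmarks.
TEMPLATES = {
    "A": [0.0, 0.0, 0.1, 0.2, 0.2, 0.3],
    "B": [0.0, 0.0, 0.1, 0.8, 0.2, 0.9],
}

def match_sign(landmarks, templates=TEMPLATES):
    """Return the template sign whose joint coordinates are closest
    (Euclidean distance) to the observed landmarks."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(templates, key=lambda sign: dist(landmarks, templates[sign]))
```

This dictionary-lookup step handles only static signs; recognizing signs with movement would require comparing sequences of frames rather than single coordinate snapshots.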

Computer science major Ben Guerrieri '26 discovered Salgian's project shortly after arriving at TCNJ and is now working alongside her on this AI research.

"It's such a hands-on thing for me to do," he said of his contribution to the project, which consists of researching and developing the translator algorithms. "We get to incrementally develop algorithms that have super fascinating real-time results."

This project is part of Salgian's ongoing interest and research into visual gesture recognition, which also includes applications to musical conducting and exercising.

"ASL is a fascinating application, especially looking at the accessibility aspect of it," Salgian said. "To make communication possible for those who don't speak ASL but would love to understand would mean so much."

Kaitlyn Bonomo '23

See the original post here:
Computer science research team explores how machine learning ... - The College of New Jersey News
