Category Archives: Machine Learning

Sliding Out of My DMs: Young Social Media Users Help Train … – Drexel University

In a first-of-its-kind effort, social media researchers from Drexel University, Vanderbilt University, Georgia Institute of Technology and Boston University are turning to young social media users to help build a machine learning program that can spot unwanted sexual advances on Instagram. Trained on data from more than 5 million direct messages, annotated and contributed by 150 adolescents who had experienced conversations that made them feel sexually uncomfortable or unsafe, the technology can quickly and accurately flag risky DMs.

The project, which was recently published by the Association for Computing Machinery in its Proceedings of the ACM on Human-Computer Interaction, is intended to address concerns that an increase in teens using social media, particularly during the pandemic, is contributing to rising trends of child sexual exploitation.

In the year 2020 alone, the National Center for Missing and Exploited Children received more than 21.7 million reports of child sexual exploitation, a 97% increase over the year prior. "This is a very real and terrifying problem," said Afsaneh Razi, PhD, an assistant professor in Drexel's College of Computing & Informatics, who was a leader of the research.

Social media companies are rolling out new technology that can flag and remove sexually exploitative images and help users report these illegal posts more quickly. But advocates are calling for greater protections for young users: tools that could identify and curtail these risky interactions sooner.

The group's efforts are part of a growing field of research looking at how machine learning and artificial intelligence can be integrated into platforms to help keep young people safe on social media, while also ensuring their privacy. Its most recent project stands apart for its collection of a trove of private direct messages from young users, which the team used to train a machine learning-based program that is 89% accurate at detecting sexually unsafe conversations among teens on Instagram.

"Most of the research in this area uses public datasets, which are not representative of real-world interactions that happen in private," Razi said. "Research has shown that machine learning models based on the perspectives of those who experienced the risks, such as cyberbullying, provide higher performance in terms of recall. So, it is important to include the experiences of victims when trying to detect the risks."

Each of the 150 participants, who range in age from 13 to 21, had used Instagram for at least three months between the ages of 13 and 17, exchanged direct messages with at least 15 people during that time, and had at least two direct messages that made them or someone else feel uncomfortable or unsafe. They contributed their Instagram data, more than 15,000 private conversations, through a secure online portal designed by the team, and were then asked to review their messages and label each conversation as safe or unsafe, according to how it made them feel.

"Collecting this dataset was very challenging due to the sensitivity of the topic and because the data is being contributed by minors in some cases," Razi said. "Because of this, we drastically increased the precautions we took to preserve the confidentiality and privacy of the participants and to ensure that the data collection met high legal and ethical standards, including reporting child abuse and the possibility of uploads of potentially illegal artifacts, such as child abuse material."

The participants flagged 326 conversations as unsafe and, in each case, they were asked to identify what type of risk it presented (nudity/porn, sexual messages, harassment, hate speech, violence/threat, sale or promotion of illegal activities, or self-injury) and the level of risk they felt: high, medium or low.

This level of user-generated assessment provided valuable guidance when it came to preparing the machine learning programs. Razi noted that most social media interaction datasets are collected from publicly available conversations, which are much different than those held in private. And they are typically labeled by people who were not involved with the conversation, so it can be difficult for them to accurately assess the level of risk the participants felt.

"With self-reported labels from participants, we not only detect sexual predators but also assessed the survivors' perspectives of the sexual risk experience," the authors wrote. "This is a significantly different goal than attempting to identify sexual predators. Built upon this real-user dataset and labels, this paper also incorporates human-centered features in developing an automated sexual risk detection system."

Specific combinations of conversation and message features were used as the input to the machine learning models. These included contextual features, like the age, gender and relationship of the participants; linguistic features, such as word count, the focus of questions, or topics of the conversation; whether it was positive, negative or neutral; how often certain terms were used; and whether or not a set of 98 pre-identified sexually related words were used.
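To make the idea concrete, here is a minimal Python sketch of how conversation-level features of this kind might be assembled. The lexicon, field names and relationship categories below are invented for illustration; they are not the study's actual feature set.

```python
# Illustrative only: turn one conversation into a small feature vector of the
# general kind described above (counts, lexicon matches, simple context fields).
from dataclasses import dataclass

RISK_LEXICON = {"nude", "pics", "send", "meet", "alone"}  # stand-in for the 98-word list

@dataclass
class Conversation:
    messages: list          # list of message strings
    participant_age: int    # hypothetical contextual field
    relationship: str       # e.g. "friend", "stranger" (hypothetical)

def extract_features(conv: Conversation) -> dict:
    text = " ".join(conv.messages).lower()
    tokens = [t.strip(".,!?") for t in text.split()]
    return {
        "word_count": len(tokens),
        "question_count": text.count("?"),
        "lexicon_hits": sum(t in RISK_LEXICON for t in tokens),
        "participant_age": conv.participant_age,
        "is_stranger": int(conv.relationship == "stranger"),
    }

example = Conversation(["hey", "can you send pics?"], participant_age=15, relationship="stranger")
print(extract_features(example))
```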

This allowed the machine learning programs to designate a set of attributes of risky conversations, and thanks to the participants' assessments of their own conversations, the program could also rank the relative level of risk.

The team put its model to the test against a large set of public sample conversations created specifically for sexual predation risk-detection research. The best performance came from its Random Forest classifier program, which can rapidly assign features to sample conversations and compare them to known sets that have reached a risk threshold. The classifier accurately identified 92% of unsafe sexual conversations from the set. It was also 84% accurate at flagging individual risky messages.
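As a rough sketch of how a Random Forest classifier is trained and scored on such feature vectors, the snippet below uses scikit-learn with placeholder data; it illustrates the general workflow, not the paper's actual pipeline or its reported numbers.

```python
# Illustrative only: train a Random Forest on conversation-level features and
# report accuracy and recall on a held-out split. X and y are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 5))                                   # placeholder feature matrix
y = (X[:, 2] + 0.3 * rng.random(500) > 0.9).astype(int)    # placeholder "unsafe" labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("accuracy:", accuracy_score(y_te, pred), "recall:", recall_score(y_te, pred))
```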

By incorporating its user-labeled risk assessment training, the models were also able to tease out the most relevant characteristics for identifying an unsafe conversation. "Contextual features, such as age, gender and relationship type, as well as linguistic inquiry and word count, contributed the most to identifying conversations that made young users feel unsafe," they wrote.

This means that a program like this could be used to automatically warn users, in real-time, when a conversation has become problematic, as well as to collect data after the fact. Both of these applications could be tremendously helpful in risk prevention and the prosecution of crimes, but the authors caution that their integration into social media platforms must preserve the trust and privacy of the users.

"Social service providers find value in the potential use of AI as an early detection system for risks, because they currently rely heavily on youth self-reports after a formal investigation has occurred," Razi said. "But these methods must be implemented in a privacy-preserving manner so as not to harm the trust and relationship of the teens with adults. Many parental monitoring apps are privacy-invasive since they share most of the teen's information with parents; these machine learning detection systems can help, with minimal sharing of information and guidelines to resources when they are needed."

They suggest that if the program is deployed as a real-time intervention, then young users should be provided with a suggestion rather than an alert or automatic report and they should be able to provide feedback to the model and make the final decision.

While the groundbreaking nature of its training data makes this work a valuable contribution to the field of computational risk detection and adolescent online safety research, the team notes that it could be improved by expanding the size of the sample and looking at users of different social media platforms. The training annotations for the machine learning models could also be revised to allow outside experts to rate the risk of each conversation.

The group plans to continue its work and to further refine its risk detection models. It has also created an open-source community to safely share the data with other researchers in the field, recognizing how important it could be for the protection of this vulnerable population of social media users.

"The core contribution of this work is that our findings are grounded in the voices of youth who experienced online sexual risks and were brave enough to share these experiences with us," they wrote. "To the best of our knowledge, this is the first work that analyzes machine learning approaches on private social media conversations of youth to detect unsafe sexual conversations."

This research was supported by the U.S. National Science Foundation and the William T. Grant Foundation.

In addition to Razi, Ashwaq Alsoubai and Pamela J. Wisniewski, from Vanderbilt University; Seunghyun Kim and Munmun De Choudhury, from Georgia Institute of Technology; and Shiza Ali and Gianluca Stringhini, from Boston University, contributed to the research.

Read the full paper here: https://dl.acm.org/doi/10.1145/3579522

See the original post here:
Sliding Out of My DMs: Young Social Media Users Help Train ... - Drexel University


Synthetic data could be better than real data – Nature.com


When more than 155,000 students from all over the world signed up to take free online classes in electronics in 2012, offered through the fledgling US provider edX, they set in motion an explosion in the popularity of online courses.

The edX platform, created by the Massachusetts Institute of Technology (MIT) and Harvard University, both in Cambridge, Massachusetts, was not the first attempt at teaching classes online, but the number of participants it attracted was unusual. The activity created a massive amount of information on how people interact with online education, and presented researchers with an opportunity to garner answers to questions such as "What might encourage people to complete courses?" and "What might give them a reason to drop out?"

"We had a tonne of data," says Kalyan Veeramachaneni, a data scientist at MIT's Laboratory for Information and Decision Systems. Although the university had long dealt with large data sets generated by others, that was the first time that MIT had big data in its own backyard, says Veeramachaneni.

Hoping to take advantage, Veeramachaneni assigned 20 MIT students to run analyses of the information. But he soon ran into a roadblock: legally, the data had to be private. This wealth of information was held on a single computer in his laboratory, with no connection to the Internet, to prevent hacking. The researchers had to schedule a time to use it. "It was a nightmare," Veeramachaneni says. "I just couldn't get the work done because the barrier to the data was very high."


His solution, eventually, was to create synthetic students: computer-generated versions of edX participants that shared characteristics with real students using the platform, but that did not give away private details. The team then applied machine-learning algorithms to the synthetic students' activity, and in doing so discovered several factors associated with a person failing to complete a course1. For instance, students who tended to submit assignments right on a deadline were more likely to drop out. Other groups took the findings of this analysis and used them to help create interventions to help real people complete future courses2.

This experience of building and using a synthetic data set led Veeramachaneni and his colleagues to create the Synthetic Data Vault, a set of open-source software that allows users to model their own data and then use those models to generate alternative versions of the data3. In 2020, he co-founded a company called DataCebo, based in Boston, Massachusetts, which helps other companies to do this.
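As a rough idea of how such tooling is used, the sketch below assumes the open-source SDV package's single-table API (version 1.x); the DataFrame and column names are placeholders, and details may differ between releases.

```python
# Hypothetical sketch: fit a synthesizer to a small real table, then sample
# synthetic rows that preserve its statistical structure (assumes SDV 1.x).
import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.single_table import GaussianCopulaSynthesizer

real = pd.DataFrame({
    "age": [19, 22, 34, 41, 27, 30],
    "assignments_submitted": [3, 5, 12, 9, 7, 4],
    "completed_course": [0, 0, 1, 1, 1, 0],
})

metadata = SingleTableMetadata()
metadata.detect_from_dataframe(real)           # infer column types from the real table

synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(real)                          # learn the statistical relationships
synthetic = synthesizer.sample(num_rows=100)   # generate new, non-real rows
print(synthetic.head())
```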

The desire to preserve privacy is one of the driving forces behind synthetic-data research. Because artificial intelligence (AI) and machine learning have expanded rapidly, finding their way into areas as diverse as health care, art and financial analysis, concerns about the data used to train the systems are also growing. To learn, these algorithms must consume vast amounts of information, much of which relates to individuals. The system could reveal private details, or be used to discriminate against people when making decisions on hiring, lending or housing, for example. The data fed to these machines might also be owned by an individual or company that does not want the information to be used to create a tool that might then compete with them, or, at the least, might not want to give the data away for free.

Some researchers think that the answer to these concerns could lie in synthetic data. Getting computers to manufacture data that is close enough to the real thing, without recycling real information, could help to address privacy problems. But it could also do much more. "I want to move away from just privacy," says Mihaela van der Schaar, a machine-learning researcher and director of the UK Cambridge Centre for AI in Medicine. "I hope that synthetic data could help us create better data."

All data sets come with issues that go beyond privacy considerations. They can be expensive to produce and maintain. In some cases, for example trying to diagnose a rare medical condition using imaging, there simply might not be enough real-world data available to train a system to do the task reliably. Bias is also a problem: both social biases, which might cause systems to favour one group of people over another, and subtler issues, such as a training set of photos that includes only a handful taken at night. Synthetic data, its proponents say, can get around these problems by adding absent information to data sets faster and more cheaply than gathering it from the real world, assuming it were possible to obtain the real thing at all.

"To me, it's about making data this living, controllable object that you can change towards your application and your goals," says Phillip Isola, a computer scientist at MIT who specializes in machine vision. "It's a fundamental new way of working with data."

There are several ways to synthesize data, but they all involve the same concept. A computer, using a machine-learning algorithm or a neural network, analyses a real data set and learns about the statistical relationships within it. It then creates a new data set containing different data points than the original, but retaining the same relationships. A familiar example is ChatGPT, the text generation engine. ChatGPT is based on a large language model, Generative Pre-trained Transformer, which pored over billions of examples of text written by humans, analysed the relationships between the words and built a model of how they fit together. When given a prompt, "Write me an ode to ducks", ChatGPT takes what it has learnt about odes and ducks and produces a string of words, with each word choice informed by the statistical probability of it following the previous one:

Oh ducks, feathered and free,

Paddling in ponds with such glee,

Your quacks and waddles are a delight,

A joy to behold, day or night.

With the right training, machines can produce not only text but also images, audio or the rows and columns of tabular data. The question is, how accurate is the output? "That's one of the challenges in synthetic data," says Thomas Strohmer, a mathematician who directs the Center for Data Science and Artificial Intelligence Research at the University of California, Davis (UC Davis).

Jason Adams, Thomas Strohmer and Rachael Callcut (left to right) are part of the synthetic data research team at UC Davis Health.

"You first have to figure out what you mean by accuracy," he says. To be useful, a synthetic data set must retain the aspects of the original that are relevant to the outcome: the all-important statistical relationships. But AI has accomplished many of its impressive feats by identifying patterns in data that are too subtle for humans to notice. If we could understand the data well enough to easily identify the relationships in medical data that suggest someone is at risk of a disease, we would have no need for a machine to find those relationships in the first place, Strohmer says.

This catch-22 means that the clearest way to know whether a synthetic data set has captured the important nuances of the original is to see if an AI system trained on the synthetic data makes similarly accurate predictions to a system trained on the original. The more capable the machine, the harder it is for humans to distinguish the real from the fake. AI-generated images and text are already at the point where they seem realistic to most people, and the technology is advancing rapidly. "We're getting close to the level where, even to the expert, the imagery looks correct, but it still might not be correct," Isola says. It is therefore important that users treat synthetic data with some caution, and don't lose sight of the fact that it isn't real data, he says. "It still might be misleading."
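One common version of this check, sometimes called "train on synthetic, test on real", can be sketched as follows. The data here are placeholders, and the synthetic set is simulated by perturbing the real one purely for illustration.

```python
# Illustrative comparison: fit the same model on real and on synthetic training
# data, then score both on the same held-out real data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X_real = rng.normal(size=(1000, 4))
y_real = (X_real[:, 0] + X_real[:, 1] > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X_real, y_real, test_size=0.3, random_state=1)

# Stand-in for generator output: real training rows plus noise.
X_syn = X_tr + rng.normal(scale=0.3, size=X_tr.shape)
y_syn = y_tr

auc_real = roc_auc_score(y_te, LogisticRegression().fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
auc_syn = roc_auc_score(y_te, LogisticRegression().fit(X_syn, y_syn).predict_proba(X_te)[:, 1])
print(f"AUC (trained on real): {auc_real:.3f}   AUC (trained on synthetic): {auc_syn:.3f}")
```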

Last April, Strohmer and two of his colleagues at UC Davis Health in Sacramento, California, won a four-year, US$1.2-million grant from the US National Institutes of Health to work out ways to generate high-quality synthetic data that could help physicians to predict, diagnose and treat diseases. As part of the project, Strohmer is developing mathematical methods of proving just how accurate synthetic data sets are.

He also wants to include a mathematical guarantee of privacy, especially given the stringent medical-privacy laws around the world, such as the Health Insurance Portability and Accountability Act in the United States and the European Union's General Data Protection Regulation. The difficulty is that the utility and privacy of data are in tension; increasing one means decreasing the other.

To increase privacy in data, scientists add statistical noise to a data set. If, for instance, one of the data points collected is a person's age, they throw in some random ages to make individuals less identifiable. It's easier to pinpoint a 45-year-old man with diabetes than a person with diabetes who might be 38, or 51, or 62. But, if the age of diabetes onset is one of the factors being studied, this privacy-protecting measure will lead to less accurate results.
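A toy example of this trade-off, assuming the standard Laplace mechanism from differential privacy (the article does not name a specific method): smaller privacy budgets mean noisier, less useful ages.

```python
# Illustrative only: add Laplace noise to an age column. Lower epsilon means
# stronger privacy and larger distortion; the sensitivity value is an assumption.
import numpy as np

rng = np.random.default_rng(42)
ages = np.array([45.0, 38.0, 51.0, 62.0, 29.0])
sensitivity = 1.0                      # assumed contribution of one record

for epsilon in (0.1, 1.0, 10.0):
    noisy = ages + rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=ages.shape)
    print(f"epsilon={epsilon:>4}: {np.round(noisy, 1)}")
```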


Part of the difficulty of guaranteeing privacy is that scientists are not completely sure how synthetic data reveals private information, or how to measure how much it reveals, says Florimond Houssiau, a computer scientist at the Alan Turing Institute in London. One way in which secrets could be spilled is if the synthetic data are too similar to the original data. In a data set that contains many pieces of information associated with an individual, it can be hard to grasp the statistical relationships. In this case, the system generating the synthetic version is more likely to replicate what it sees rather than make up something entirely new. "Privacy is not actually that well understood," Houssiau says. Scientists can assign a numerical value to the privacy level of a data set, but "we don't exactly know which values should be considered safe or not. And so it's difficult to do that in a way that everyone would agree on."

The varied nature of medical data sets also makes generating synthetic versions of them challenging. They might include notes written by physicians, X-rays, temperature measurements, blood-test results and more. A medical professional with years of training and experience might be able to put those factors together and come up with a diagnosis. Machines, so far, cannot. "We just don't know enough, in terms of machine learning, to extract information from different modalities," Strohmer says. That's a problem for analysis tools, but it's also a problem for machines tasked with creating synthetic data sets that retain the all-important relationships. "We don't understand yet how to automatically detect these relationships," he says.

There are also fundamental theoretical limits to how much improvement data can undergo, says Isola. Information theory contains a principle called the data-processing inequality, which states that processing data can only reduce the amount of information available, not add to it4. And all synthetic data must have real data at its root, so all the problems with real data (privacy, bias, expense and more) still exist at the start of the pipeline. "You're not getting something for free; you're still ultimately learning from the world, from data. You're just reformatting that into an easier-to-work-with format that you can control better," Isola says. With synthetic data, data comes in and a better version of the data comes out.
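Stated formally (a standard information-theory fact, paraphrased rather than quoted from the article): if synthetic data Z is produced only from real data Y, it cannot carry more information about an underlying quantity X than Y already did.

```latex
% Data-processing inequality: for any Markov chain X -> Y -> Z,
% where Z is computed from Y alone,
\[
  X \rightarrow Y \rightarrow Z
  \quad\Longrightarrow\quad
  I(X; Z) \,\le\, I(X; Y)
\]
```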

Although synthetic data in medicine haven't yet made their way into clinical use, there are some areas where such data sets have taken off. They are being widely used in finance, Strohmer says, with many companies springing up to help financial institutions create new data that protect privacy. Part of the reason for this difference might be that the stakes are lower in finance than in medicine. "If in finance you get it wrong, it still hurts, but it doesn't lead to death, so they can push things a little bit faster than in the medical field," Strohmer says.

In 2021, the US Census Bureau announced that it was looking at creating synthetic data to enhance the privacy of people who respond to its annual American Community Survey, which provides detailed information about households in subsections of the country. Some researchers have objected, however, on the grounds that the move could undermine the data's usefulness. In February, Administrative Data Research UK, a partnership that enables the sharing of public-sector data, announced a grant to study the value of synthetic versions of data sets that have been created by the Office for National Statistics and the UK Data Service.


Some people are also using synthetic data to test software that they hope to eventually use on real data that they do not yet have access to, says Andrew Elliott, a statistician at the University of Glasgow, UK. These fake data have to look something like the real data, but they can be meaningless, because they only exist for testing the code. A scientist who wants to analyse a sensitive data set that they are granted only limited access to can perfect the code first with synthetic data, and not have to waste time when they get hold of the real data.

For now, synthetic data are a relatively niche pursuit. Van der Schaar thinks that more people should be talking about synthetic data and their potential impact, and not just scientists. "It's important that not only computer scientists understand, but also the general public," she says. People need to wrap their heads around this technology because it could affect everyone.

The issues around synthetic data not only raise interesting research questions for scientists but also important issues for society at large, Strohmer says. "Data privacy is so important in the age of surveillance capitalism," he says. Creating good synthetic data that both preserve privacy and reflect diversity, and that are made widely available, has the potential not just to improve the performance of AI and expand its uses, but also to help democratize AI research. "A lot of data is owned by a few big companies, and that creates an imbalance. Synthetic data could help to re-establish this balance a little bit," Strohmer says. "I think that's an important, bigger goal behind synthetic data."

See the original post:
Synthetic data could be better than real data - Nature.com


David Higginson of Phoenix Children’s Hospital on using machine … – Chief Healthcare Executive

Chicago - David Higginson has some advice for hospitals and health systems looking to use machine learning.

"Get started," he says.

Higginson, the chief innovation officer of Phoenix Children's Hospital, offered a presentation on machine learning at the HIMSS Global Health Conference & Exhibition. He described how machine learning models helped identify children with malnutrition and people who would be willing to donate to the hospital's foundation.

After the session, he spoke with Chief Healthcare Executive and offered some guidance for health systems looking to do more with machine learning.

"I would say get started by thinking about how you going to use it first," Higginson says. "Don't get tricked into actually building the model."

"Think about the problem, frame it up as a prediction problem," he says, while adding that not all problems can be framed that way.

"But if you find one that is a really nice prediction problem, ask the operators, the people that will use it everyday: 'Tell me how you'd use this,'" Higginson says. "And work with them on their workflow and how it's going to change the way they do their job.

"And when they can see it and say, 'OK, I'm excited about that, I can see how it's going to make a difference,' then go and build it," he says. "You'll have more motivation to do it, you'll understand what the goal is. But when you finally do get it, you'll know it's going to be used."

See the article here:
David Higginson of Phoenix Children's Hospital on using machine ... - Chief Healthcare Executive


How ChatGPT might help your family doctor and other emerging health trends – Toronto Star

Health innovation in Canada has always been strong, but the sector is now experiencing growth at a pace we haven't seen before.

While COVID-19 helped accelerate change, new technologies like OpenAIs ChatGPT are also having an impact. Plus, Canadian companies are leveraging machine learning to develop new therapies, diagnostics and patient platforms.

"There's a lot of really interesting drivers out there for innovation," says Jacki Jenuth, partner and chief operating officer at Lumira Ventures. "We're starting to better define some of the underlying mechanisms and therapeutic approaches for diseases that up until now had no options, such as neurodegenerative diseases. And researchers are starting to define biomarkers to select patients more likely to respond in clinical settings; that's really good news."

Next week, the annual MaRS Impact Health conference will bring together health care professionals, entrepreneurs, investors, policymakers and other stakeholders. Here's a sneak preview of some of the emerging trends in the health care and life sciences space they'll be exploring.

There's huge revenue opportunities in women's health, says Annie Thériault, managing partner at Cross-Border Impact Ventures. (Fryer, Tim)

Women's health funding isn't where it should be, says Annie Thériault, managing partner at Cross-Border Impact Ventures. Bayer recently announced it's stopping R&D for women's health to focus on other areas. Other pharmaceutical companies such as Merck have made similar decisions in recent years. "It's hard to imagine why groups are moving in that direction, because we're seeing huge revenue opportunities in these markets," says Thériault. "A lot of exciting things are happening."

One area that Thériault has been watching closely has been personalized medicine that uses artificial intelligence, machine learning or sophisticated algorithms to tailor treatment for women and children. For instance, there are tools that provide targeted cancer treatments that use gender as a key input. "In the past, that maybe wouldn't have been thought of as an important variable," she says.

In prenatal care, there are new tools related to diagnosing anomalies in pregnancies through data. "What we see in maternal health is a lot of inequalities," Thériault says. "But if the exam is performed with the same level of care, accuracy, and specificity, then analyzed through AI to spot problems, you can make positive health outcomes and hopefully a less unequal health system."


With the right protections and security measures, AI could help create efficiencies in health care, says Frank Rudzicz, a faculty member at the Vector Institute for Artificial Intelligence. (Fryer, Tim)

New technologies like ChatGPT have shown the potential of not just getting AI and machine learning to take large data sets and make sense of them, but also to create efficiencies when it comes to doing paperwork with that information.

"I always thought we'd get to this point, but I just didn't think we'd get here so soon, where we are talking about AI really changing the nature of jobs," says Frank Rudzicz, a faculty member at the Vector Institute for Artificial Intelligence. "And it's just getting started."

There are a lot of inefficiencies in health care that AI can help with. Doctors, for instance, spend up to half their time working on medical records and filling out forms. (A recent study from the Canadian Federation of Independent Business found that collectively they are spending some 18.5 million hours on unnecessary paperwork and administrative work each year, the equivalent of more than 55 million patient visits.) "That's not what they signed up for," he says. "They signed up to help people."

While people are becoming more comfortable with using technology to track and monitor their health, whether that be through smartwatches, smartphone apps or genetic testing, there aren't as many connection points for them to use that data with their family doctor. There is an opportunity, Rudzicz says, to use data and technologies such as machine learning, with proper guardrails and patient consent, to sync the data with your doctor's records to help with diagnosis and prescribing.

"Ultimately, doctors are trained professionals and they need to be the ones who make the diagnosis and come up with treatment plans with the patients," he says. "But once you get all the pieces together, the results could be more accurate and safer than they have been."

Plus, there are a lot of possible futures for technologies like ChatGPT in health care, such as automating repetitive tasks like filling out forms or writing requisitions and referral letters for doctors to review before submitting. "The barrier to entry for anything that will speed up your workflow is going to be very low and easily integrated," Rudzicz says.

While there's been a slowdown in venture capital investments, there's still funding to be found, says Jacki Jenuth, partner and chief operating officer at Lumira Ventures. (Fryer, Tim)

While there's been a slowdown in venture capital funding, with fewer dollars available as markets become more rational after the record highs of the last few years, there's still funding to be found, says Lumira's Jenuth. Management teams in the life sciences space just have to be more resourceful and explore all possible avenues of funding, including corporations, non-dilutive sources, foundations and disease-specific funders, she adds.

It helps to build deep relationships with investors who want to make an impact in the health sectors, she says. "The pitch needs to be targeted for each one of these groups. You'll hear a lot of nos, so you need to be tenacious. It's not easy."

Discover more of the technologies and ideas that will transform health care at the MaRS Impact Health conference on May 3 and 4.

Disclaimer: This content was produced as part of a partnership and therefore it may not meet the standards of impartial or independent journalism.

Read more:
How ChatGPT might help your family doctor and other emerging health trends - Toronto Star


Machine learning: As AI tools gain heft, the jobs that could be at stake – The Indian Express

Watch out for the man with the silicon chip
Hold on to your job with a good firm grip
'Cause if you don't you'll have had your chips
The same as my old man

Scottish revival singer-songwriter Ewan MacColl's 1986 track "My Old Man" was an ode to his father, an iron-moulder who faced an existential threat to his job because of the advent of technology. The lyrics could find some resonance nearly four decades on, as industry leaders and tech stalwarts predict that the advancement of large language models such as OpenAI's GPT-4, and their ability to write essays, code, and do maths with greater accuracy and consistency, heralds a fundamental tech shift, almost as significant as the creation of the integrated circuit, the personal computer, the web browser or the smartphone. But there still are question marks over how advanced chatbots could impact the job market. And if blue-collar work was the focus of MacColl's ballad, artificial intelligence (AI) models of the generative pretrained transformer type signify a greater threat for white-collar workers, as more powerful word-predicting neural networks that manage to carry out a series of operations on arrays of inputs end up producing output that is significantly humanlike. So, will this latest wave impact the current level of employment?

According to Goldman Sachs economists Joseph Briggs and Devesh Kodnani, the answer is a resounding yes, and they predict that as many as 300 million full-time jobs around the world are set to get automated, with workers replaced by machines or AI systems. What lends credence to this stark prediction is the new wave of AI, especially large language models that include neural networks such as Microsoft-backed OpenAI's ChatGPT.

The Goldman Sachs economists predict that such technology could bring significant disruption to the labour market, with lawyers, economists, writers, and administrative staff among those projected to be at greatest risk of becoming redundant. In a new report, The Potentially Large Effects of Artificial Intelligence on Economic Growth, they calculate that approximately two-thirds of jobs in the US and Europe are set to be exposed to AI automation, to various degrees.

In general, white-collar workers, and workers in advanced economies, are projected to be at greater risk than blue-collar workers in developing countries. "The combination of significant labour cost savings, new job creation, and a productivity boost for non-displaced workers raises the possibility of a labour productivity boom like those that followed the emergence of earlier general-purpose technologies like the electric motor and personal computer," the report said.

And OpenAI itself predicts that a vast majority of workers will have at least part of their jobs automated by GPT models. In a study published on the arXiv preprint server, researchers from OpenAI and the University of Pennsylvania said that 80 percent of the US workforce could have at least 10 percent of their tasks affected by the introduction of GPTs.

Central to these predictions is the way models such as ChatGPT get better with more usage. GPT stands for Generative Pre-trained Transformer and is a marker for how the platform works: being pre-trained by human developers initially and then primed to learn for itself as more and more queries are posed by users to it. The OpenAI study also said that around 19 per cent of US workers will see at least 50 per cent of their tasks impacted, with the qualifier that GPT exposure is likely greater for higher-income jobs, but spans across almost all industries. These models, the OpenAI study said, will end up as general-purpose technologies like the steam engine or the printing press.

A January 2023 paper, by Anuj Kapoor of the Indian Institute of Management Ahmedabad and his co-authors, explored the question of whether AI tools or humans were more effective at helping people lose weight. The authors conducted the first causal evaluation of the effectiveness of human vs. AI tools in helping consumers achieve their health outcomes in a real-world setting by comparing the weight loss outcomes achieved by users of a mobile app, some of whom used only an AI coach while others used a human coach as well.

Interestingly, while human coaches scored higher broadly, users with a higher BMI did not fare as well with a human coach as those who weighed less.

"The results of our analysis can extend beyond the narrow domain of weight loss apps to that of healthcare domains more generally. We document that human coaches do better than AI coaches in helping consumers achieve their weight loss goals. Importantly, there are significant differences in this effect across different consumer groups. This suggests that a one-size-fits-all approach might not be most effective," Kapoor told The Indian Express.

The findings: Human coaches help consumers achieve their goals better than AI coaches for consumers below the median BMI relative to consumers who have above-median BMI. Human coaches help consumers achieve their goals better than AI coaches for consumers below the median age relative to consumers who have above-median age.

Human coaches help consumers achieve their goals better than AI coaches for consumers below the median time in a spell relative to consumers who spent above-median time in a spell. Further, human coaches help consumers achieve their goals better than AI coaches for female consumers relative to male consumers.

While Kapoor said the paper did not go deeper into why AI+human plans were more effective for low-BMI individuals than for high-BMI individuals, he speculated on what could be the reasons for that trend: "Humans can feel emotions like shame and guilt while dealing with other humans. This is not always true, but in general, and there's ample evidence to suggest this, research has shown that individuals feel shameful while purchasing contraceptives and also while consuming high-calorie indulgent food items. Therefore, high BMI individuals might find it difficult to interact with other human coaches. This doesn't mean that health tech platforms shouldn't suggest human plans for high BMI individuals. Instead, they can focus on (1) training their coaches well to make the high BMI individuals feel comfortable and heard and (2) deciding the optimal mix of the AI and human components of the guidance for weight loss," he added.

Similarly, the female consumers responding well to the human coaches can be attributed to recent advancements in the literature on human-AI interaction, which suggests that the adoption of AI differs between females and males, and that there is differential adoption across ages, Kapoor said, adding that this can be a potential reason for the differential impact of human coaches for females over males.

An earlier OECD paper on AI and employment, titled "New Evidence from Occupations most exposed to AI", asserted that the impact of these tools would be skewed in favour of high-skilled, white-collar ones, including: business professionals; managers; science and engineering professionals; and legal, social and cultural professionals.

This contrasts with the impact of previous automating technologies, which have tended to take over primarily routine tasks performed by lower-skilled workers. The 2021 study noted that higher exposure to AI may be a good thing for workers, as long as they have the skills to use these technologies effectively. The research found that over the period 2012-19, greater exposure to AI was associated with higher employment in occupations where computer use is high, suggesting that workers who have strong digital skills may have a greater ability to adapt to and use AI at work and, hence, to reap the benefits that these technologies bring. By contrast, there is some indication that higher exposure to AI is associated with lower growth in average hours worked in occupations where computer use is low. On the whole, the study findings suggested that the adoption of AI may increase labour market disparities between workers who have the skills to use AI effectively and those who do not. Making sure that workers have the right skills to work with new technologies is therefore a key policy challenge, which policymakers will increasingly have to grapple with.

View post:
Machine learning: As AI tools gain heft, the jobs that could be at stake - The Indian Express


How AI, automation, and machine learning are upgrading clinical trials – Clinical Trials Arena

Artificial intelligence (AI) is set to be the most disruptive emerging technology in drug development in 2023, unlocking advanced analytics, enabling automation, and increasing speed across the clinical trial value chain.

Today's clinical trials landscape is being shaped by macro trends that include the Covid-19 pandemic, geopolitical uncertainty, and climate pressures. Meanwhile, advancements in adaptive design, personalisation and novel treatments mean that clinical trials are more complex than ever. Sponsors seek greater agility and faster time to commercialisation while maintaining quality and safety in an evolving global market. Across every stage of clinical research, AI offers optimisation opportunities.

A new whitepaper from digital technology solutions provider Taimei examines the transformative impact of AI on the clinical trials of today and explores how it will shape the future.

"The big delay areas are always patient recruitment, site start-up, querying, data review, and data cleaning," explains Scott Clark, chief commercial officer at Taimei.

Patient recruitment is typically the most time-consuming stage of a clinical trial. Sponsors must find and identify a set of subjects, gather information, and use inclusion/exclusion criteria to filter and select participants. And high-quality patient recruitment is vital to a trial's success.

Once patients are recruited, they must be managed effectively. Patient retention has a direct impact on the quality of the trial's results, so their management is crucial. In today's clinical trials, these patients can be distributed over more than a hundred sites and across multiple geographies, presenting huge data management challenges for sponsors.

AI can be leveraged across patient recruitment and management to boost efficiency, quality, and retention. Algorithms can gather subject information and screen and filter potential participants. They can analyse data sources such as medical records and even social media content to detect subgroups and geographies that may be relevant to the trial. AI can also alert medical staff and patients to clinical trial opportunities.

The result? Faster, more efficient patient recruitment, with the ability to reach more diverse populations and more relevant participants, as well as increase quality and retention. "[Using AI], you can develop the correct cohort," explains Clark. "It's about accuracy, efficiency, and safety."
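At its simplest, the screening step described above is a rule-based filter over candidate records; the sketch below is hypothetical (column names, thresholds and criteria are invented), and real systems add machine learning and NLP over medical records on top of this kind of filtering.

```python
# Hypothetical inclusion/exclusion screening over a candidate table.
import pandas as pd

candidates = pd.DataFrame({
    "patient_id": [101, 102, 103, 104],
    "age":        [34, 71, 55, 48],
    "hba1c":      [8.1, 6.2, 9.4, 7.8],
    "on_insulin": [False, True, False, True],
})

eligible = candidates[
    candidates["age"].between(18, 65)       # inclusion: adults under 65
    & (candidates["hba1c"] >= 7.5)          # inclusion: poorly controlled diabetes
    & ~candidates["on_insulin"]             # exclusion: already on insulin
]
print(eligible["patient_id"].tolist())      # -> [101, 103]
```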

Study build can be a laborious and repetitive process. Typically, data managers must read the study protocol and generate as many as 50-60 case report forms (CRFs). Each trial has different CRF requirements. CRF design and database building can take weeks and has a direct impact on the quality and accuracy of the clinical trial.

Enter AI. Automated text reading can parse, categorise, and stratify corpora of words to automatically generate eCRFs and the data capture matrix. "In study building, AI is able to read the protocols and pull the best CRF forms for the best outcomes," adds Clark.

It can then use the data points from the CRFs to build the study base, creating the whole database in a matter of minutes rather than weeks. The database is structured for export to the biostatisticians programming. AI can then facilitate the analysis of data and develop all of the required tables, listings and figures (TLFs). It can even come to a conclusion on the outcomes, pending review.

Optical character recognition (OCR) can address structured and unstructured native documents. Using built-in edit checks, AI can reduce the timeframe for study build from ten weeks to just one, freeing up data managers' time. "We are able to do up to 168% more edit checks than are done currently in the human manual process," says Clark. AI can also automate remote monitoring to identify outliers and suggest the best route of action, to be taken with approval from the project manager.

AI data management is flexible, agile, and robust. Using electronic data capture (EDC) removes the need to manage paper-based documentation. This is essential for modern clinical trials, which can present huge amounts of unstructured data thanks to the rise of advances such as decentralisation, wearables, telemedicine, and self-reporting.

"Once the trial is launched, you can use AI to do automatic querying and medical coding," says Clark. When there's a piece of data that doesn't make sense or is not coded, AI can flag it and provide suggestions automatically. "The data manager just reviews what it's corrected," adds Clark. "That's a big time-saver." By leveraging AI throughout data input, sponsors also cut out the lengthy process of data cleaning at the end of a trial.
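A simplified, hypothetical illustration of an automated edit check of this kind: values that are missing or outside expected ranges raise queries for the data manager to review. The field names and limits are invented, not any particular EDC system's rules.

```python
# Hypothetical edit-check pass over one case report form record.
EXPECTED_RANGES = {"systolic_bp": (70, 250), "temperature_c": (34.0, 42.0)}

def run_edit_checks(record: dict) -> list:
    queries = []
    for field, (low, high) in EXPECTED_RANGES.items():
        value = record.get(field)
        if value is None:
            queries.append(f"{field}: value missing")
        elif not (low <= value <= high):
            queries.append(f"{field}: {value} outside expected range {low}-{high}")
    return queries

print(run_edit_checks({"systolic_bp": 300, "temperature_c": None}))
# -> ['systolic_bp: 300 outside expected range 70-250', 'temperature_c: value missing']
```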

Implementing AI means establishing the proof of concept, building a customised knowledge base, and training the model to solve the problem on a large scale. Algorithms must be trained on large amounts of data to remove bias and ensure accuracy. Today, APIs enable best-in-class advances to be integrated into clinical trial applications.

By taking repetitive tasks away from human personnel, AI accelerates the time to market for life-saving drugs and frees up man-hours for more specialist tasks. By analysing past and present trial data, AI can be used to inform future research, with machine learning able to suggest better study design. In the long term, AI has the potential to shift the focus away from trial implementation and towards drug discovery, enabling improved treatments for patients who need them.

To find out more, download the whitepaper below.

See the original post here:
How AI, automation, and machine learning are upgrading clinical trials - Clinical Trials Arena
