

Category Archives: Machine Learning

Machine learning to transform delivery of major rail projects in UK – Global Railway Review

By utilising machine learning, Network Rail can increase prediction accuracy, reduce delays, unlock early risk detection and enable significant cost savings.

Credit: Network Rail

Network Rail has announced that it is working with technology startup nPlan to use machine learning technology across its portfolio of projects, which has the potential to transform the way major rail projects are delivered across Britain.

By using data from past projects to produce accurate cost and time forecasts, the partnership will deliver efficiencies in the way projects are planned and carried out, and improve service reliability for passengers by reducing the risk of overruns.

In a world-first for such work on this scale, Network Rail tested nPlan's risk analysis and assurance solution on two of its largest rail projects, the Great Western Main Line and the Salisbury to Exeter Signalling project, together representing over £3 billion of capital expenditure.

This exercise showed that, by leveraging past data, cost savings of up to £30 million could have been achieved on the Great Western Main Line project alone. This was primarily achieved by flagging unknown risks to the project team (those that are invisible to the human eye due to the size and complexity of the project data) and allowing them to mitigate those risks before they occur, at a significantly lower cost than if they are missed or ignored.

The machine learning technology works by learning from patterns in historical project performance. Put simply, the algorithm learns by comparing what was planned against what actually happened on a project at an individual activity level. This facilitates transparency and a shared, improved view of risk between project partners.
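nPlan's actual models are not public, so the sketch below is only a rough illustration of the planned-versus-actual idea: a toy regressor trained on synthetic schedule records, where every feature name and number is an assumption.

```python
# Hypothetical sketch: forecasting activity overruns from planned-vs-actual records.
# Synthetic data and an off-the-shelf regressor stand in for nPlan's system.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
planned = rng.uniform(5, 200, n)        # planned activity duration, in days
complexity = rng.uniform(0, 1, n)       # assumed activity complexity score
# Synthetic "what actually happened": long, complex activities overrun more.
actual = planned * (1 + 0.4 * complexity + rng.normal(0, 0.1, n))

X = np.column_stack([planned, complexity])
y = actual - planned                    # overrun in days, the quantity to forecast

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor().fit(X_train, y_train)
print("Predicted overrun for a 100-day, high-complexity activity:",
      model.predict([[100.0, 0.9]])[0])
```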

Following the success of this trial, nPlan and Network Rail will now embark on the next phase of deployment, rolling out the software on 40 projects before scaling up to all Network Rail projects by mid-2021. Using data from over 100,000 programmes, Network Rail will increase prediction accuracy, reduce delays, allow for better budgeting and unlock early risk detection, leading to greater certainty in the outcome of these projects.

Network Rail's Programme Director for Affordability, Alastair Forbes, said: "By championing innovation and using forward-thinking technologies, we can deliver efficiencies in the way we plan and carry out rail upgrade and maintenance projects. It also has the benefit of reducing the risk of project overruns, which means, in turn, we can improve reliability for passengers."

Dev Amratia, CEO and co-founder of nPlan, said: "Network Rail is amongst the largest infrastructure operators in Europe, and adopting technology to forecast and assure projects can lead to better outcomes for all of Britain's rail industry, from contractors to passengers. I look forward to significantly delayed construction projects, and the disruption that they cause for passengers, becoming a thing of the past, with our railways becoming safer and more resilient."

See original here:
Machine learning to transform delivery of major rail projects in UK - Global Railway Review


A beginner's guide to the math that powers machine learning – The Next Web

How much math knowledge do you need for machine learning and deep learning? Some people say not much. Others say a lot. Both are correct, depending on what you want to achieve.

There are plenty of programming libraries, code snippets, and pretrained models that can help you integrate machine learning into your applications without deep knowledge of the underlying math functions.

But there's no escaping the mathematical foundations of machine learning. At some point in your exploration and mastery of artificial intelligence, you'll need to come to terms with the lengthy and complicated equations that adorn AI whitepapers and machine learning textbooks.

In this post, I will introduce some of my favorite machine learning math resources. And while I don't expect you to have fun with machine learning math, I will also try my best to give you some guidelines on how to make the journey a bit more pleasant.

Khan Academy's online courses are an excellent resource for acquiring the math skills needed for machine learning.

Many machine learning books tell you that a working knowledge of linear algebra is enough. I would argue that you need a lot more than that. Extensive experience with linear algebra is a must-have: machine learning algorithms squeeze every last bit out of vector spaces and matrix mathematics.

You also need to know a good bit of statistics and probability, as well as differential and integral calculus, especially if you want to become more involved in deep learning.

There are plenty of good textbooks, online courses, and blogs that explore these topics. But my personal favorite is Khan Academy's math courses. Sal Khan has done a great job of putting together a comprehensive collection of videos that explain different math topics. And it's free, which makes it even better.

Although each of the videos (which are also available on YouTube) explains a separate topic, going through the courses end-to-end provides a much richer experience.

I recommend the linear algebra course in particular. Here, you'll find everything you need about vector spaces, linear transformations, matrix transformations, and coordinate systems. The course has not been tailored for machine learning, and many of the examples are about 2D and 3D graphic systems, which are much easier to visualize than the multidimensional spaces of machine learning problems. But they discuss the same concepts you'll encounter in machine learning books and whitepapers. The course also contains some hidden gems, like least-squares calculations and eigenvectors, which are important topics in machine learning.
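For a flavor of those two gems, here is a short NumPy sketch, my own illustration rather than course material, that solves a least-squares fit and computes the eigenvectors of a small matrix:

```python
import numpy as np

# Least squares: find coefficients b minimizing ||Xb - y||^2.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # column of ones + one feature
y = np.array([2.0, 2.9, 4.1])
b, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print("intercept and slope:", b)

# Eigenvectors: directions a matrix only stretches (A v = lambda v).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print("eigenvalues:", eigenvalues)       # 3 and 1 for this symmetric matrix
print("first eigenvector:", eigenvectors[:, 0])
```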

The calculus courses are a bit more fragmented, but that might be a good feature for readers who already have a strong foundation and just want to brush up on their skills. Khan includes precalculus, differential calculus, and integral calculus courses that cover the foundations. The multivariable calculus course discusses some of the topics that are central to deep learning, such as gradient descent and partial derivatives.
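To make that connection concrete, here is a minimal gradient descent sketch on a toy function, assuming f(x, y) = x^2 + 3y^2 rather than a real model loss; the partial derivatives are exactly the objects the multivariable course teaches:

```python
# Gradient descent on f(x, y) = x**2 + 3*y**2, whose partial derivatives are
# df/dx = 2x and df/dy = 6y. The minimum is at (0, 0).
learning_rate = 0.1
x, y = 4.0, -2.0                    # arbitrary starting point

for step in range(50):
    grad_x, grad_y = 2 * x, 6 * y   # the gradient: a vector of partial derivatives
    x -= learning_rate * grad_x     # step against the gradient
    y -= learning_rate * grad_y

print(f"after 50 steps: x={x:.5f}, y={y:.5f}")  # both values approach 0
```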

There are also several statistics courses on Khan Academy's platform, and there is some overlap between them. They all discuss some of the key concepts you need in data science and machine learning, such as random variables, distributions, confidence intervals, and the difference between continuous and categorical data. I recommend the college statistics course, which includes some extra material that is relevant to machine learning, such as Bayes' theorem.
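As a quick illustration of Bayes' theorem in code, here is a toy screening example with assumed numbers, not material from the course:

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
p_disease = 0.01            # prior: 1% prevalence (assumed)
p_pos_given_disease = 0.99  # test sensitivity (assumed)
p_pos_given_healthy = 0.05  # false positive rate (assumed)

# Law of total probability: overall chance of a positive test.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"P(disease | positive test) = {p_disease_given_pos:.3f}")  # about 0.167
```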

To be clear, Khan Academy's courses are not a replacement for the math textbook and classroom. They are not very rich in exercises. But they are very rich in examples, and for someone who just needs to blow the dust off their algebra knowledge, they're great. Sal talks very slowly, probably to make the videos usable for the wider audience of non-native English speakers. I run the videos at 1.5x speed and have no problem understanding them, so don't let the video lengths daunt you.

Vanilla algebra and calculus are not enough to get comfortable with the mathematics of machine learning. Machine learning concepts such as loss functions, learning rate, activation functions, and dimensionality reduction are not covered in classic math books. There are more specialized resources for that.
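To make two of those terms concrete, here is a minimal NumPy sketch of a standard activation function and a standard loss function; these are generic textbook definitions, not drawn from any particular resource reviewed here:

```python
import numpy as np

def sigmoid(z):
    """Activation: squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def mse_loss(y_true, y_pred):
    """Loss: average squared difference between targets and predictions."""
    return np.mean((y_true - y_pred) ** 2)

z = np.array([-2.0, 0.0, 2.0])
print("sigmoid:", sigmoid(z))  # approximately [0.119, 0.5, 0.881]
print("loss:", mse_loss(np.array([1.0, 0.0]), np.array([0.9, 0.2])))  # 0.025
```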

My favorite is Mathematics for Machine Learning. Written by three AI researchers, the book provides you with a strong foundation for exploring the workings of the different components of machine learning algorithms.

The book is split into two parts. The first part is mathematical foundations, which is basically a revision of key linear algebra and calculus concepts. The authors cover a lot of material in little more than 200 pages, so most of it is skimmed over with one or two examples. If you have a strong foundation, this part will be a pleasant read. If you find it hard to grasp, you can combine the chapters with select videos from Khan's YouTube channel. It'll become much easier.

The second part of the book focuses on machine learning mathematics. You'll get into topics such as regression, dimensionality reduction, support vector machines, and more. There's no discussion of artificial neural networks and deep learning concepts, but the focus on the basics makes this book a very good introduction to the mathematics of machine learning.

As the authors write on their website: "The book is not intended to cover advanced machine learning techniques because there are already plenty of books doing this. Instead, we aim to provide the necessary mathematical skills to read those other books."

For a more advanced take on deep learning, I recommend Hands-on Mathematics for Deep Learning. This book also contains an intro to linear algebra, calculus, and probability and statistics. Again, this section is for people who just want to jog their memory. It's not a basic introductory book.

The real value of this book comes in the second section, where you go into the mathematics of multilayer perceptrons, convolutional neural networks (CNN), and recurrent neural networks (RNN). The book also goes into the logic of other crucial concepts, such as regularization (the L1 and L2 norms), dropout layers, and more.
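As a generic illustration (not an example from the book) of how L2 regularization and dropout typically appear in practice, here is a short PyTorch sketch:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # randomly zeroes half the activations during training
    nn.Linear(128, 10),
)

# weight_decay adds an L2 penalty on the weights to the optimization objective.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

x = torch.randn(32, 64)  # a batch of 32 dummy inputs
model.train()            # training mode: dropout is active
logits = model(x)
print(logits.shape)      # torch.Size([32, 10])
```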

These are concepts that youll encounter in most books on machine learning and deep learning. But knowing the mathematical foundations will help you better understand the role hyperparameters play in improving the performance of your machine learning models.

A bonus section dives into advanced deep learning concepts, such as the attention mechanism that has made Transformers so efficient and popular, generative models such as autoencoders and generative adversarial networks, and the mathematics of transfer learning.

Admittedly, mathematics is not the most fun way to start a machine learning education, especially if you're self-learning. Fortunately, as I said at the beginning of this article, you don't need to begin your machine learning education by poring over double integrals, partial derivatives, and mathematical equations that span a page's width.

You can start with some of the more practical resources on data science and machine learning. A good introductory book is Principles of Data Science, which gives you a good overview of data science and machine learning fundamentals along with hands-on coding examples in Python and light mathematics. Hands-on Machine Learning and Python Machine Learning are two other books that are a little more advanced and also give deeper coverage of the mathematical concepts. Udemy's Machine Learning A-Z is an online course that combines coding with visualization in a very intuitive way.

I would recommend starting with one or two of the above-mentioned books and courses. They will give you a working knowledge of the basics of machine learning and deep learning and prepare your mind for the mathematical foundations. Once you have a solid grasp of different machine learning algorithms, learning the mathematical foundations becomes much more pleasant.

As you master the mathematics of machine learning, you will find it easier to discover new ways to optimize your models and tweak them for better performance. You'll also be able to read the latest cutting-edge papers that explain the latest findings and techniques in deep learning, and you'll be able to integrate them into your applications. In my experience, the mathematics of machine learning is an ongoing educational experience. Always look for new ways to hone your skills.

This article was originally published by Ben Dickson on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech and what we need to look out for. You can read the original article here.

Published October 2, 2020 10:00 UTC

See the original post:
A beginner's guide to the math that powers machine learning - The Next Web


Trust Algorithms? The Army Doesn’t Even Trust Its Own AI Developers – War on the Rocks

Last month, an artificial intelligence agent defeated human F-16 pilots in a Defense Advanced Research Projects Agency challenge, reigniting discussions about lethal AI and whether it can be trusted. Allies, non-government organizations, and even the U.S. Defense Department have weighed in on whether AI systems can be trusted. But why is the U.S. military worried about trusting algorithms when it does not even trust its AI developers?

Any organization's adoption of AI and machine learning requires three technical tools: usable digital data that machine learning algorithms learn from, computational capabilities to power the learning process, and the development environment that engineers use to code. However, the military's precious few uniformed data scientists, machine learning engineers, and data engineers who create AI-enabled applications are currently hamstrung by a lack of access to these tools. Simply put, uniformed personnel cannot get the data, computational tools, or computing capabilities to create AI solutions for the military. The problem is not that the systems or software are inherently unsafe, but that users cannot get approvals to access or install them.

Without data, computing power, and a development environment, AI engineers are forced to cobble together workarounds with the technical equivalent of duct-tape and WD-40 or jump through bureaucratic hoops to get access to industry-standard software libraries that would take only a few seconds to download on a personal computer. Denying AI engineers these tools is the equivalent of denying an infantryman her rifle and gear (body armor, helmet, and first aid kit). If the military can trust small-unit leaders to avoid fratricide or civilian casualties while leading soldiers in a firefight or to negotiate with tribal leaders as part of counter-insurgency operations, it can trust developers to download software libraries with hundreds of millions of registered downloads.

The Defense Department's Joint AI Center has initiated a multi-year contract to build the Joint Common Foundation, a platform to equip uniformed AI developers with the tools needed to build machine learning solutions. However, tools alone are not enough. The Joint Common Foundation should be part of a broader shift toward empowering developers with both tools and trust.

Developers Need Data

Data is the lifeblood of modern machine learning, but much of the Defense Department's data is neither usable nor accessible, making the military data rich but information poor. The military is hardly alone in its inability to harness the potential of data. A survey by Kaggle, the world's largest data science community, showed that dirty data was the biggest barrier to data science work.

A recent article about the Joint Common Foundation mentioned the difficulties of object detection using MQ-9 Reaper drone videos because position data was burned into the images, confusing the machines. Our most trying experience with dirty data comes from the Army human resources system, which, as you might have guessed, keeps copies of soldiers' personnel records in image or PDF form rather than in a searchable, analyzable database. Instead of using AI to address talent management, the Army is struggling to make evaluations and records computer-readable. Once cleaned and structured, the data should also be accessible by users and their tools.

Military data owners frequently refuse to share their data, siloing it away from other data sources. Uniformed developers often spend hours finding the right authority to request access to a dataset. When they do, overly restrictive and nonsensical data sharing practices are common. For example, in one author's experience, a data-owning organization shipped a laptop to that individual with preconfigured programs on it, because the data-owning organization did not trust the AI engineer to download the information or configure their own tools. Other times, the approval process takes weeks, as legal, G-6, G-8, and Network Enterprise Technology Command entities take turns saying: "It's not my decision," "I don't know," or "This seems scary."

While the services have information system owners at regional network enterprise centers to manage users and networks, there is no such role or process for data. The Joint Common Foundation may put some of the Defense Department's data under one technical roof, but it doesn't solve the problem of bureaucratic silos and gatekeepers. Without an established framework for identifying and labeling which AI engineers have need-to-know, and a streamlined process for access requests, the data will still be effectively locked away.

And an Advanced Development Environment

In the rare event that data is accessible, uniformed AI engineers are not allowed to install software or configure their machines. The government computers with data access may only have data science languages like R (and, much more rarely, Python and Julia) and may also prohibit or severely inhibit the installation of software libraries that allow for data exploration, visualization, or machine learning. These libraries are critical to making machine learning accessible to any AI researchers (of which the military has few). Denying these tools to uniformed AI engineers forces them to reinvent the wheel, rebuilding algorithms from scratch.

In simple terms, the current options are blunt, general-purpose tools, but most AI engineers need advanced ones. For comparison, a financial analyst could do complex math by hand or with a basic calculator, but Microsoft Excel is a far more robust tool. The Army's AI engineers face an equivalent situation.

Without these tools and libraries, AI engineers are forced to recreate the research of several academics in whatever coding language is allowed to do anything even as basic as matrix multiplication. As uniformed technologists, we build side projects on our personal computers with much more ease (and modern tools) than on government equipment. Such disparity is not surprising, but the central issues are permission, control, and speed rather than security or risk.

The Joint Common Foundation is expected to provide a secure software engineering environment and access to other resources, but a centralized solution of individually allocating software permissions will never keep pace with user needs. For comparison, the Defense Information Systems Agency has spent nearly $150 million since 2018 to address the backlog of more than 700,000 personnel awaiting security clearances, with some success. The importance of AI in future warfare means that a backlog of hundreds of AI developers waiting for software tools to do their jobs is a critical national security risk. A long process is not necessarily a thorough one, while scalability comes from educating, trusting, and empowering many users. In order to actually enable the uniformed AI workforce to do its job, there needs to be greater trust in what tools and programs its members are allowed to install and use on their government-furnished equipment.

The common refrain is that those tools are not safe, but that reasoning is draconian and lacks critical thinking. Fighter jets are expensive and precious, yet military pilots still fly and occasionally crash them. Soldiers on a combat patrol or even the rifle range are at increased risk, but they patrol and train because that is their mission. Security is a balance of risk and effectiveness, and we need to re-evaluate our digital network policies. It's unreasonable that minor version updates of TensorFlow and PyTorch (key machine learning libraries created and maintained by Google and Facebook, respectively) would suddenly be a threat. It's also unlikely that a widely used open-source library would be a threat, or that a threat would be detected in a review yet somehow missed by millions of other users. Moreover, government networks should be secure enough to detect and isolate malicious behavior, or at least built with zero trust, minimizing the time a network user has elevated privileges such that the blast radius is minimized. The U.S. military can do better, and the Joint Common Foundation alone will not suffice.

Plus, More Computing Power

Once an AI engineer has access to data and the necessary software tools to build machine learning algorithms, they will need computational power, or compute, to train the machine to learn using the data. Computing power, like data, is currently siloed within some data-focused organizations like the Center for Army Analysis, the G-8, and the Office of Business Transformation, and is inaccessible to AI engineers outside of these organizations. Even if an AI developer is granted an account on the systems, the computational environments are only accessible via government laptops maintained by specific IT administrators.

This purely bureaucratic restriction means that a substantial number of the military's AI workforce, who may be training with industry, getting a degree in machine learning from Carnegie Mellon, or otherwise in an environment without a computer on the .mil domain, would not be able to use their new skills on military problems.

Connectivity and access have been issues at the Army's Data Science Challenge. When participants raised the issue last year, the sponsors of the challenge made the data available to military members without access to government computers (and no data leaks transpired). This year, however, the bureaucratic access control issue will prevent last year's competition winner, along with however many AI engineers are currently in school, training with industry, or simply unable to get to a government computer due to novel coronavirus teleworking restrictions, from competing.

Do Both: Centralize and Delegate

Ongoing platform efforts like the Coeus system proposed by the Army's AI Task Force and the Joint Common Foundation being built by the Joint AI Center are much-needed efforts to put tools in the hands of AI developers. We strongly support them. Both may take years to reach full operational capability, but the military needs AI tools right now. The Joint Common Foundation contract has options for four years, which is a long time in the fast-moving field of AI. Few people in the Pentagon understand AI, and no one there knows what AI will look like in four years. Four years ago, the federal government spent half as much on AI as it does now; the Defense Department had not established the Joint AI Center or even the Pentagon's first large AI effort, Project Maven; and the Pentagon had no AI strategy at all. Who can predict with confidence on such a time horizon? While fully functioning platforms are being developed, the Pentagon can take immediate steps.

The Defense Department and the services should formally track people in AI or software engineering roles, giving them skill identifiers similar to those of medical professionals, and giving digital experts specific permissions: access to data sources, authority to use low-risk software locally (including virtual machines), and secure access to compute resources. The services have IT admins who are entrusted with elevated network permissions (the bar is only a CompTIA Security+ certification), and it is time to create a new user profile for developers. AI and software engineers (many of whom have degrees in computer science) require access to customize their own devices and use many specialty tools. The process to become an authorized user should be clear and fast, with incentives for approval authorities to hit speed benchmarks.

First, the Defense Department needs to update its policies related to data sharing (from 2007 and 2019). Department leadership needs to formally address issues with permissions, approval processes, privacy, confidentiality, and sensitivity for data sharing, and recognize AI engineers as a new user group that is distinctly different from data scientists. Moreover, access to data gets lost in bureaucracy because there is no executive role to manage it. The Defense Department should consider creating an information data owner role, modeled on the information security owner role that controls network security, to perform this function. Data scientists and AI experts need access to data to do their jobs. This should not mean carte blanche, but parity with contractors may be a fair target.

Current policies restricting access to data for uniformed AI experts are especially frustrating when one considers that the Defense Department pays contractors like Palantir billions of dollars for aggregation and analysis of sensitive, unclassified, and classified data. Given that military leadership trusts contractors, who have little allegiance to the military beyond a contract, with wide latitude in data access, shouldn't the military also extend at least the same trust with data to its own people?

Second, the Defense Department should set a goal to rapidly put as many tools as possible in the hands of engineers. The Joint AI Center and AI hubs within the services should drive expansion of existing virtual software stores with well-known, vetted-safe software libraries like Pandas, Scikit-Learn, PyTorch, and TensorFlow, and allow AI and software engineers to freely install these packages onto government computers. Such a capability to manage software licenses already exists but needs a major upgrade to meet the new demands of uniformed digital technologists.

Concurrently, the Defense Department should lower the approval authority for software installation from one-star generals to colonels (O-6) in small-scale use cases. For example, if an AI team's commanding officer is comfortable using an open-source tool, the team should be able to use it locally or in secure testing environments, but it should not be pushed to production until approved by the Defense Information Systems Agency. Once the agency approves the tool, it can be added to the software store and made available to all uniformed personnel with the AI engineer user role described above. The chief information officer/G-6 and deputy secretary of defense should provide incentives for the Defense Information Systems Agency to accelerate its review processes. The net benefit will allow engineers to refine and validate prototypes while security approvals are running in parallel.

In particular, designated users should be authorized to install virtualization software (like VMware or Docker) and virtual private network servers onto government computers. Virtualization creates a logically isolated compartment on a client and gives developers full configuration control over software packages and operating systems on a virtual machine. The virtual machine can break without affecting the government hardware it sits on, thus making local authority for software installation less risky. VPN technology will allow approved users to connect to .mil systems without government equipment except for a common access card. These products are secure and widely recognized as solutions to enterprise security problems.

The military will also benefit from giving AI developers access to virtualization tools. They will become beta testers: users who encounter problems with security or AI workflows. They can identify issues and give feedback to the teams building the Joint Common Foundation and Coeus, or the teams reviewing packages at the Defense Information Systems Agency. This would be a true win for digital modernization and part of a trust-building flywheel.

Risk Can Be Mitigated

If the military truly wants an AI-enabled force, it should give its AI developers access to tools and trust them to use those tools. Even if the military does build computational platforms, like Coeus or the Joint Common Foundation, the problem of having grossly insufficient computational tools will persist if the services still do not trust their AI engineers to access or configure their own tools. We fully recognize that allowing individual AI engineers to install various tools, configure operating systems, and have access to large amounts of data poses some level of additional risk to the organization. On its face, in a world of cyber threats and data spillage, this is a scary thought. But the military, over hundreds of years of fighting, has recognized that risk cannot be eliminated, only mitigated. Small, decentralized units closest to the problems should be trusted with the authority to solve those problems.

The military trusts personnel to handle explosives, drop munitions, and maneuver in close proximity under fire. Uniformed AI engineers need to be entrusted with acquiring and configuring their computational tools. Without that trust and the necessary tools to perform actual AI engineering work, the military may soon find itself without the AI engineers as well.

Maj. Jim Perkins is an Army Reservist with the 75th Innovation Command. After 11 years on active duty, he now works in national security cloud computing with a focus on machine learning at the edge. From 2015 to 2017, he led the Defense Entrepreneurs Forum, a 501(c)(3) nonprofit organization driving innovation and reform in national security. He is a member of the Military Writers Guild and he tweets at @jim_perkins1.

The opinions expressed here are the authors' own and do not reflect the official policy of the Department of Defense, the Department of the Army, or other organizations.

Image: U.S. Army Cyber Command

Original post:
Trust Algorithms? The Army Doesn't Even Trust Its Own AI Developers - War on the Rocks


GreenFlux, Eneco eMobility and Royal HaskoningDHV implement smart charging based on machine learning – Green Car Congress

Royal HaskoningDHV's office in the city of Amersfoort, the Netherlands, is the first location in the world where electric vehicles are smart-charged using machine learning. The charging stations are managed by the charging point operator Eneco eMobility, with smart charging technology provided by the GreenFlux platform.

With the number of electric vehicles ever increasing, so is the pressure to increase the number of charging stations on office premises. This comes at a cost: electric vehicles require a significant amount of power, which can lead to large investments in the electrical installation. With smart charging, these costs can be significantly reduced by ensuring that not all vehicles charge at the same time.

With this innovation, developed by GreenFlux, deployed by Eneco eMobility, and applied at Royal HaskoningDHV's Living Lab Charging Plaza in Amersfoort, the Netherlands, smart charging is now taken to the next level, allowing up to three times more charging stations on a site than with regular smart charging.

The novelty in this solution is that machine learning is used to determine or estimate how charge station sites are physically wired, data that is commonly incomplete and unreliable. At Royal HaskoningDHV, the algorithm determines over time the topology of how the three-phase electricity cables are connected to each individual charge station.

Using this topology, the algorithm can optimize between single-phase and three-phase charging electric vehicles. Though this may seem like a technicality, it allows up to three times as many charging stations to be installed on the same electrical infrastructure.
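GreenFlux has not published its algorithm, but the benefit of knowing the phase topology can be illustrated with a toy allocation sketch; the capacity figure, names, and greedy rule below are assumptions for illustration only:

```python
# Hypothetical phase-aware load balancing: if the operator knows which phase each
# single-phase charger is wired to, the headroom on each phase can be used
# independently rather than reserving full three-phase capacity for every vehicle.

PHASE_LIMIT_A = 50.0  # assumed available current per phase, in amps

def allocate(sessions):
    """sessions: list of (name, phase, requested_amps) for single-phase chargers."""
    headroom = {"L1": PHASE_LIMIT_A, "L2": PHASE_LIMIT_A, "L3": PHASE_LIMIT_A}
    plan = {}
    for name, phase, requested in sessions:
        granted = min(requested, headroom[phase])  # greedy, first come first served
        headroom[phase] -= granted
        plan[name] = granted
    return plan

sessions = [("car_a", "L1", 32.0), ("car_b", "L2", 32.0),
            ("car_c", "L3", 32.0), ("car_d", "L1", 32.0)]
# Cars on different phases draw full current in parallel; car_d shares phase L1
# with car_a and is throttled to the remaining headroom.
print(allocate(sessions))  # {'car_a': 32.0, 'car_b': 32.0, 'car_c': 32.0, 'car_d': 18.0}
```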

"Now that this part has been tested and proven, there is so much more we can add. We can use the same technology to, for instance, predict a driver's departure time or how much energy they will need. With these kinds of inputs, we can optimize the charging experience even further."

Lennart Verheijen, head of innovation at GreenFlux

Visit link:
GreenFlux, Eneco eMobility and Royal HaskoningDHV implement smart charging based on machine learning - Green Car Congress


Four steps to accelerate the journey to machine learning – SiliconANGLE News

This is the golden age of machine learning. Once considered peripheral, machine learning technology is becoming a core part of businesses around the world. From healthcare to agriculture, fintech to media and entertainment, machine learning holds great promise for industries universally.

Although standing up machine learning projects can seem daunting, ingraining a machine learning-forward mindset within the workplace is critical. In 2018, according to Deloitte Insights' State of AI in the Enterprise report, 63% of companies invested in machine learning to catch up with their rivals or to narrow their lead. International Data Corp. estimates that by 2021, global spending on AI and other cognitive technologies will exceed $50 billion.

So, the question is no longer whether your company should have a machine learning strategy, but rather, how can your company get its machine learning strategy in motion as quickly and effectively as possible?

Whether your company is just getting started with machine learning, or in the middle of your first implementation, here are the four steps that you should take in order to have a successful journey.

When it comes to adopting machine learning, data is often cited as the No. 1 challenge. In our experience with customers, more than half the time spent building machine learning models can go to data wrangling, data cleanup, and pre-processing. If you don't invest in establishing a strong data strategy, any machine learning talent you hire will be forced to spend a significant proportion of their time dealing with data cleanup and management instead of inventing new algorithms.

When starting out, the three most important questions to ask are: What data is available today? What data can be made available? And a year from now, what data will we wish we had started collecting today?

In order to determine what data is available today, you'll need to overcome "data hugging," the tendency for teams to guard the data they work with most closely and not share it with other groups in the organization. Breaking down silos between teams for a more expansive view of the data landscape is crucial for long-term success. And along the way, you'll need to make sure you have the right access control and data governance.

On top of that, you'll need to know what data actually matters as part of your machine learning approach. When you plan your data strategy, think about the best ways to store data, and invest early in data processing tools for de-identification and anonymization if needed. For example, Cerner Corp. needed to tackle this challenge to effectively leverage its data for predictive and digital diagnostic insights. Today, the company uses a fully managed service to build, deploy, and manage machine learning models at scale.

When evaluating what and how to apply machine learning, you should focus on assessing the problem across three dimensions: data readiness, business impact, and machine learning applicability (the chance of success based on your team's skills).

Balancing speed with business value is key. You should first look for places where you already have a lot of untapped data. Next, evaluate whether the area will benefit from machine learning, or whether you're fixing something that isn't actually broken. Avoid picking a problem that's flashy but has unclear business value, as it will end up becoming a one-off experiment that never sees the light of day.

A good example of solving for the right problems can be seen in Formula One World Championship Ltd. The motorsport company was looking for new ways to deliver race metrics that could change the way fans and teams experience racing, but had more than 65 years of historical race data to sift through. After aligning their technical and domain experts to determine what type of untapped data had the most potential to deliver value for its teams and fans, Formula 1 data scientists then used Amazon SageMaker to train deep learning models on this historical data to extract critical performance statistics, make race predictions, and relay engaging insights to fans about the split-second decisions and strategies adopted by teams and drivers.

Next, in order to move from a few pilots to scaling machine learning, you need to champion a culture of machine learning. Leaders and developers alike should always be thinking about how they can apply machine learning across various business problems.

A common mistake a lot of companies make is putting tech experts on a separate team. Working in a silo, they may end up building machine learning models that are mostly proofs of concept and don't actually solve real business problems. Instead, businesses need to combine a blend of technical and domain experts who can work backwards from the customer problem. Assembling the right group of people also helps eliminate the cultural barrier to adoption, with quicker buy-in from the business.

Similarly, leaders should constantly find ways to make it easier for their developers to apply machine learning. Building the infrastructure to do machine learning at scale is a labor-intensive process that slows down innovation. They should encourage their teams not to focus on the undifferentiated heavy lifting portions of building machine learning models. By using tools that cover the entire machine learning workflow to build, train and deploy machine learning models, companies can get to production faster with much less effort and at a lower cost.

For instance, Intuit Inc. wanted to simplify the expense sorting process for its self-employed TurboTax customers to help identify potential deductions. Using Amazon SageMaker for its ExpenseFinder tool, which automatically pulls a year's worth of bank transactions, Intuit's machine learning algorithm helps its customers discover $4,300 on average in business expenses. Intuit's time to build machine learning models also decreased from six months to less than a week.

Finally, to build a successful machine learning culture, you need to focus on developing your team. This includes building the right skills for your engineers and ensuring that your line of business leaders are also getting the training needed to understand machine learning.

Recruiting highly experienced talent in an already limited field is highly competitive and often too expensive, so companies are well served to develop internal talent as well. You can cultivate your developers' machine learning skills through robust internal training programs, which also help attract and retain talent.

Morningstar Inc., the global financial services firm, used hands-on training with AWS DeepRacer to accelerate the application of machine learning across the company's investing products, services, and processes. More than 445 of Morningstar's employees are currently involved in the AWS DeepRacer League, which has created an engaging way to upskill and unite its global teams.

If your organization follows these steps, the machine learning culture you build will play a vital role in setting it up for long-term success. There will be growing pains, but at its core, machine learning is experimentation that gets better over time, so your organization must also embrace failures and take a long-term view of what's possible.

No longer an aspirational technology for fringe use cases, machine learning is making meaningful transformation possible for organizations around the world and can make a tangible impact on yours too.

Swami Sivasubramanian is vice president of Amazon AI, running AI and machine learning services for Amazon Web Services Inc. He wrote this article for SiliconANGLE.


Go here to see the original:
Four steps to accelerate the journey to machine learning - SiliconANGLE News


A machine learning approach to define antimalarial drug action from heterogeneous cell-based screens – Science Advances

Abstract

Drug resistance threatens the effective prevention and treatment of an ever-increasing range of human infections. This highlights an urgent need for new and improved drugs with novel mechanisms of action to avoid cross-resistance. Current cell-based drug screens are, however, restricted to binary live/dead readouts with no provision for mechanism of action prediction. Machine learning methods are increasingly being used to improve information extraction from imaging data. These methods, however, work poorly with heterogeneous cellular phenotypes and generally require time-consuming human-led training. We have developed a semi-supervised machine learning approach, combining human- and machine-labeled training data from mixed human malaria parasite cultures. Designed for high-throughput and high-resolution screening, our semi-supervised approach is robust to natural parasite morphological heterogeneity and correctly orders parasite developmental stages. Our approach also reproducibly detects and clusters drug-induced morphological outliers by mechanism of action, demonstrating the potential power of machine learning for accelerating cell-based drug discovery.

Cell-based screens have substantially advanced our ability to find new drugs (1). However, most screens are unable to predict the mechanism of action (MoA) of identified hits, necessitating years of follow-up after discovery. In addition, even the most complex screens frequently find hits against cellular processes that are already targeted (2). Limitations in finding new targets are becoming especially important in the face of rising antimicrobial resistance across bacterial and parasitic infections. This rise in resistance is driving increasing demand for screens that can intuitively find new antimicrobials with novel MoAs. Demand for innovation in drug discovery is exemplified in efforts on targeting Plasmodium falciparum, the parasite that causes malaria. Malaria continues to be a leading cause of childhood mortality, killing nearly half a million children each year (3). Drug resistance has emerged to every major antimalarial to date including rapidly emerging resistance to frontline artemisinin-based combination therapies (4). While there is a healthy pipeline of developmental antimalarials, many target common processes (5) and may therefore fail quickly because of prevalent cross-resistance. Thus, solutions are urgently sought for the rapid identification of new drugs that have a novel MoA at the time of discovery.

Parasite cell morphology within the human contains inherent MoA-predictive capacity. Intracellular parasite morphology can distinguish broad stages along the developmental continuum of the asexual parasite (responsible for all disease pathology). This developmental continuum includes early development (early and late ring form), feeding (trophozoite), genome replication or cell division (schizont), and extracellular emergence [merozoite; see (6) for definitions]. Hence, drugs targeting a particular stage should manifest a break in the continuum. Morphological variation in the parasite cell away from the continuum of typical development may also aid drug MoA prediction if higher information granularity can be generated during a cell-based screen. Innovations in automated fluorescence microscopy have markedly expanded available data content in cell imaging (7). By using multiple intracellular markers, an information-rich landscape can be generated from which morphology, and, potentially, drug MoA can be deduced. This increased data content is, however, currently inaccessible both computationally and because it requires manual expert-level analysis of cell morphology. Thus, efforts to use cell-based screens to find drugs and define their MoA in a time-efficient manner are still limited.

Machine learning (ML) methods offer a powerful alternative to manual image analysis, particularly deep neural networks (DNNs) that can learn to represent data succinctly. To date, supervised ML has been the most successful application for classifying imaging data, commonly based on binning inputs into discrete, human-defined outputs. Supervised methods using this approach have been applied to study mammalian cell morphologies (8, 9) and host-pathogen interactions (10). However, discrete outputs are poorly suited for capturing a continuum of morphological phenotypes, such as those that characterize either malaria parasite development or compound-induced outliers, since it is difficult or impossible to generate labels of all relevant morphologies a priori. A cell imaging approach is therefore needed that can function with minimal discrete human-derived training data before computationally defining a continuous analytical space, which mirrors the heterogeneous nature of biological space.

Here, we have created a semi-supervised model that discriminates diverse morphologies across the asexual life cycle continuum of the malaria parasite P. falciparum. By receiving input from a deep metric network (11) trained to represent similar consumer images as nearby points in a continuous coordinate space (an embedding), our DNN can successfully define unperturbed parasite development with a much finer information granularity than human labeling alone. The same DNN can quantify antimalarial drug effects both in terms of life cycle distribution changes [e.g., killing specific parasite stage(s) along the continuum] and morphological phenotypes or outliers not seen during normal asexual development. Combining life cycle and morphology embeddings enabled the DNN to group compounds by their MoA without directly training the model on these morphological outliers. This DNN analysis approach toward cell morphology therefore addresses the combined needs of high-throughput cell-based drug discovery that can rapidly find new hits and predict MoA at the time of identification.

Using ML, we set out to develop a high-throughput, cell-based drug screen that can define cell morphology and drug MoA from primary imaging data. From the outset, we sought to embrace asynchronous (mixed stage) asexual cultures of the human malaria parasite, P. falciparum, devising a semi-supervised DNN strategy that can analyze fluorescence microscopy images. The workflow is summarized in Fig. 1 (A to C).

(A) To ensure all life cycle stages were present during imaging and analysis, two transgenic malaria cultures, continuously expressing sfGFP, were combined (see Materials and Methods); these samples were incubated with or without drugs before being fixed and stained for automated multichannel high-resolution, high-throughput imaging. Resulting datasets (B) contained parasite nuclei (blue), cytoplasm (not shown), and mitochondrion (green) information, as well as the red blood cell (RBC) plasma membrane (red) and brightfield (not shown). Here, canonical examples of a merozoite, ring, trophozoite, and schizont stage are shown. These images were processed for ML analysis (C), with parasites segregated from full fields of view using the nuclear stain channel before transformation into embedding vectors. Two networks were used; the first (green) was trained on canonical examples from human-labeled imaging data, providing ML-derived labels (pseudolabels) to the second semi-supervised network (gray), which predicted life cycle stage and compound phenotype. Example images from human-labeled datasets (D) show that disagreement can occur between human labelers when categorizing parasite stages (s, schizont; t, trophozoite; r, ring; m, merozoite). Each thumbnail image shows (from top left, clockwise) merged channels, nucleus staining, cytoplasm, and mitochondria. Scale bar, 5 μm.

The P. falciparum life cycle commences when free micron-sized parasites (called merozoites; Fig. 1B, far left) target and invade human RBCs. During the first 8 to 12 hours after invasion, the parasite is referred to as a ring, describing its diamond ringlike morphology (Fig. 1B, left). The parasite then feeds extensively (trophozoite stage; Fig. 1B, right), undergoing rounds of DNA replication, and eventually divides into ~20 daughter cells (the schizont stage; Fig. 1B, far right), which precedes merozoite release back into circulation (6). This discrete categorization belies a continuum of morphologies between the different stages.

The morphological continuum of asexual development represents a challenge when teaching ML models, as definitions of each stage will vary between experts (Fig. 1D and fig. S1). To embrace this, multiple human labels were collected. High-resolution three-dimensional (3D) images of a 3D7 parasite line continuously expressing superfolder green fluorescent protein (sfGFP) in the cytoplasm (3D7/sfGFP) were acquired using a widefield fluorescence microscope (see Materials and Methods), capturing brightfield, DNA [4',6-diamidino-2-phenylindole (DAPI)], cytoplasm (constitutively expressed sfGFP), mitochondria (MitoTracker, abbreviated subsequently to MITO), and the RBC membrane [fluorophore-conjugated wheat germ agglutinin (WGA)]. 3D datasets were converted to 2D images using maximum intensity projection. Brightfield was converted to 2D using both maximum and minimum projection, resulting in six channels of data for the ML. Labels (5382) were collected from human experts, populating the categories of ring, trophozoite, schizont, merozoite, cluster-of-merozoites (multiple extracellular merozoites attached after RBC rupture), or debris. For initial validation and as a test of reproducibility between experts, an additional 448 parasites were collected, each labeled by five experts (Fig. 1D).

As demonstrated (Fig. 1D and fig. S1A), human labelers show some disagreement, particularly between ring and trophozoite stages. This disagreement is to be expected, with mature ring stage and early trophozoite stage images challenging to define even for experts. When comparing the human majority vote versus the model classification (fig. S1B and note S1), some disagreement was seen, particularly for human-labeled trophozoites being categorized as ring stages by the ML algorithm.

Image patches containing parasites within the RBC or after merozoite release were transformed into input embeddings using the deep metric network architecture originally trained on consumer images (11) and previously shown for microscopy images (12). Embeddings are vectors of floating point numbers representing a position in high-dimensional space, trained so related objects are located closer together. For our purposes, each image channel was individually transformed into an embedding of 64 dimensions before being concatenated to yield one embedding of 384 dimensions per parasite image.
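A rough sketch of that step is below; the pretrained deep metric network itself is not reproduced, so embed_channel is a placeholder standing in for it:

```python
import numpy as np

def embed_channel(channel_image):
    """Placeholder for the pretrained deep metric network (64-d output per channel)."""
    rng = np.random.default_rng(int(channel_image.sum() * 1000) % 2**32)
    return rng.standard_normal(64)

# Six channels per parasite image, as described in the text.
channels = [np.random.rand(128, 128) for _ in range(6)]
embedding = np.concatenate([embed_channel(c) for c in channels])
print(embedding.shape)  # (384,) -- one concatenated embedding per parasite
```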

Embeddings generated from parasite images were next transformed using a two-stage workflow to represent either on-cycle (for mapping the parasite life cycle continuum) or off-cycle effects (for mapping morphology or drug induced outliers). Initially, an ensemble of fully connected two-layer DNN models was trained on the input embeddings to predict the categorical human life cycle labels for dimethyl sulfoxide (DMSO) controls. DMSO controls consisted of the vehicle liquid for drug treatments (DMSO) being added to wells containing no drugs. For consistency, the volume of DMSO was normalized in all wells to 0.5%. This training gave the DNN robustness to control for sample heterogeneity and, hence, sensitivity for identifying unexpected results (outliers). The ensemble was built from three pairs of fully supervised training conditions (six total models). Models only differed in the training data they received. Each network pair was trained on separate (nonoverlapping) parts of the training data, providing an unbiased estimate of the model prediction variance.

After initial training, the supervised DNN predicted its own labels (i.e., pseudolabels) for previously unlabeled examples. As with human-derived labels, DNN pseudolabeling was restricted to DMSO controls (with high confidence) to preserve the models sensitivity to off-cycle outliers (which would not properly fit into on-cycle outputs). High confidence was defined as images given the same label prediction from all six models and when all models were confident of their own prediction (defined as twice the probability of selecting the correct label at random). This baseline random probability is a fixed number for a dataset or classification and provided a suitable baseline for model performance.

A new ensemble of models was then trained using the combination of human-derived labels and DNN pseudolabels. The predictions from this new ensemble were averaged to create the semi-supervised model.
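A minimal sketch of the pseudolabeling rule described above, with an assumed class count and array shapes (the authors' code is not reproduced here):

```python
import numpy as np

NUM_CLASSES = 6                       # e.g., merozoite, ring, trophozoite, schizont,
                                      # cluster-of-merozoites, debris (assumed)
CONFIDENCE_FLOOR = 2.0 / NUM_CLASSES  # twice the probability of a random guess

def pseudolabel(probs_per_model):
    """probs_per_model: (6 models, NUM_CLASSES) softmax outputs for one image."""
    labels = probs_per_model.argmax(axis=1)
    confidences = probs_per_model.max(axis=1)
    # Keep only if all six models agree AND each clears the confidence floor.
    if np.all(labels == labels[0]) and np.all(confidences >= CONFIDENCE_FLOOR):
        return int(labels[0])
    return None                       # discard: models disagree or are unsure

probs = np.full((6, NUM_CLASSES), 0.6 / (NUM_CLASSES - 1))
probs[:, 2] = 0.4                     # every model favors class 2 at 0.4 >= 1/3
print(pseudolabel(probs))             # 2
```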

The semi-supervised model was first used to represent the normal (on-cycle) life cycle continuum. We selected the subset of dimensions in the unnormalized final prediction layer that corresponded to merozoites, rings, trophozoites, and schizonts. This was projected into 2D space using principal components analysis (PCA) and shifted such that its centroid was at the origin. This resulted in a continuous variable where angles represent life cycle stage progression, referred to as Angle-PCA. This Angle-PCA approach permitted the full life cycle to be observed as a continuum with example images despite data heterogeneity (Fig. 2A and fig. S2) and 2D projection (Fig. 2B) following the expected developmental order of parasite stages. This ordered continuum manifested itself without specific constraints being imposed, except those provided by the categorical labels from human experts (see note S2).

After learning from canonical human-labeled parasite images (for examples, please see Fig. 1B) and filtering debris and other outliers, the remaining life cycle data from asynchronous cultures was successfully ordered by the model. The parasites shown are randomly selected DMSO control parasites from multiple imaging runs, sorted by Angle PCA (A). The colored, merged images show RBC membrane (red), mitochondria (green), and nucleus (blue). For a subset of parasites on the right, the colored, merged image plus individual channels are shown: (i) merged, (ii) brightfield minimum projection, (iii) nucleus, (iv) cytoplasm, (v) mitochondria, and (vi) RBC membrane (brightfield maximum projection was also used in ML but is not shown here). The model sorts the parasites in life cycle stage order, despite heterogeneity of signal due to nuances such as imaging differences between batches. The order of the parasites within the continuum seen in (A) is calculated from the angle within the circle created by projecting model outputs using PCA, creating a 2D scatterplot (B). This represents a progression through the life cycle stages of the parasite, from individual merozoites (purple) to rings (yellow), trophozoites (green), schizonts (dark green), and finishing with a cluster of merozoites (blue). The precision-recall curve (C) shows that human labelers and the model have equivalent accuracy in determining the earlier/later parasite in pairs. The consensus of the human labelers was taken as ground truth, with individual labelers (orange) agreeing with the consensus on 89.5 to 95.8% of their answers. Sweeping through the range of too close to call values with the ML model yields the ML curve shown in black. Setting this threshold to 0.11 radians, the median angle variance across the individual models used in the ensemble yields the blue dot.
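A minimal sketch of the Angle-PCA computation on synthetic stand-in outputs (not the authors' code; the real inputs would be the four unnormalized life cycle logits per parasite):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
outputs = rng.standard_normal((500, 4))          # stand-in for 4 life cycle logits

coords = PCA(n_components=2).fit_transform(outputs)
coords -= coords.mean(axis=0)                    # shift the centroid to the origin
angles = np.arctan2(coords[:, 1], coords[:, 0])  # one angle per parasite, in radians

order = np.argsort(angles)                       # parasites sorted along the continuum
print(angles[order][:5])
```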

To validate the accuracy of the continuous life cycle prediction, pairs of images were shown to human labelers to define their developmental order (earlier/later), with the merozoite stage defined as earliest. Image pairs assessed also included those considered indistinguishable (i.e., too close to call). Of the 295 pairs selected for labeling, 276 covered every possible pairing between 24 parasites, while the remaining 19 pairs were specifically selected to cross the trophozoite/schizont boundary. Human expert agreement with the majority consensus was between 89.5 and 95.8% (note S3), with parasite pairs called equal (too close to call) 25.7 to 44.4% of the time. These paired labels had greater consensus than the categorical (merozoite, ring, trophozoite, and schizont) labels, for which agreement between individual human labels and the majority consensus was between 60.9 and 78.4%.

The Angle-PCA projection provides an ordering along the life cycle continuum, allowing us to compare this sort order to that of human experts. With our ensemble of six models, we could also evaluate the consensus and variation between angle predictions for each example. The consensus between models for the relative angle between two examples was greater than 96.6% (with an area under the precision-recall curve of 0.989; see note S4 for definition), and the median angle variation across all labeled examples was 0.11 radians. The sensitivity of this measurement can be tuned by selecting a threshold for when two parasites are considered equal, resulting in a precision-recall curve (Fig. 2C). When we use the median angle variation of the model as the threshold for examples that are too close to call, we get performance (light blue point) representative of the human expert average. These results demonstrate that our semi-supervised model successfully identified and segregated asynchronous parasites and infected RBCs from images containing >90% uninfected RBCs (i.e., <10% parasitaemia) and classified parasite development logically along the P. falciparum asexual life cycle.
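The threshold sweep behind this precision-recall curve can be sketched as follows (illustrative names; handling of angle wrap-around is omitted for brevity): a pair is called "too close" when the predicted angle difference falls below the threshold, and otherwise the sign of the difference gives the earlier/later call, which is then scored against the human consensus.

    import numpy as np

    def order_calls(angle_a, angle_b, threshold):
        diff = angle_b - angle_a
        # 0 = too close to call; +1 = b later than a; -1 = a later than b
        return np.where(np.abs(diff) < threshold, 0, np.sign(diff))

Sweeping threshold over a range of radians traces out the ML curve in Fig. 2C; setting it to the 0.11-radian median ensemble variance reproduces the blue point.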

Having demonstrated that the semi-supervised model can classify asynchronous life cycle progression consistently and with fine granularity, the model was next applied to quantify on-cycle differences (i.e., life cycle stage-specific drug effects) in asynchronous, asexual cultures treated with known antimalarial drugs. Two drug treatments were initially chosen that give rise to aberrant cellular development: the ATP4ase inhibitor KAE609 (also called Cipargamin) (13) and the mitochondria-inhibiting combination therapy of atovaquone and proguanil (14) (here referred to as Ato/Pro). KAE609 reportedly induces cell swelling (15), while Ato/Pro reduces mitochondrial membrane potential (16). Drug treatments were first tested at standard screening concentrations (2 μM) for two incubation periods (6 and 24 hours). Next, drug dilutions were carried out to test the semi-supervised model's sensitivity to lower concentrations, using the half-maximal inhibitory concentration (IC50) of each compound (table S1). IC50 and 2 μM datasets were processed through the semi-supervised model and overlaid onto DMSO control data as histograms to explore on-cycle drug effects (Fig. 3). KAE609 treatment exhibited a consistent skew toward ring stage parasite development (8 to 12 hours after RBC invasion; Fig. 3) without an increase within this stage of development, while the Ato/Pro treatment led to reduced trophozoite stages (~12 to 30 hours after RBC invasion; Fig. 3). This demonstrates that the fine-grained continuum has the sensitivity to detect whether drugs affect specific stages of the parasite life cycle.

Asynchronous Plasmodium falciparum cultures were treated with the ATP4ase inhibitor KAE609 or the combination MITO treatment of atovaquone and proguanil (Ato/Pro), with samples fixed and imaged 6 (A) and 24 (B) hours after drug addition. Top panels show histograms indicating the number of parasites across the life cycle continuum. Compared to DMSO controls (topmost black histogram), both treatments demonstrated reduced parasite numbers after 24 hours. Four drug/concentration conditions are shown: low-dose Ato/Pro (yellow), high-dose Ato/Pro (orange), low-dose KAE609 (light blue), and high-dose KAE609 (dark blue). Box plots below show the life cycle classifications of images in the DMSO condition, from merozoites (purple) to rings (yellow), trophozoites (green), and finishing with schizonts (dark green).

This improved information granularity was next extended to test whether the model could identify drug-induced morphological phenotypes (off-cycle) toward determining MoA. Selecting the penultimate 32-dimensional layer of the semi-supervised model meant that, unlike the Angle-PCA model, outputs were not restricted to discrete on-cycle labels but instead represented both on- and off-cycle changes. This 32-dimensional representation is referred to as the morphology embedding.

Parasites were treated with 1 of 11 different compounds targeting either PfATP4ase (ATP4) or mitochondria (MITO), alongside DMSO controls (table S1). The semi-supervised model was used to evaluate three conditions: random, where compound labels were shuffled; Angle-PCA, where the two PCA coordinates were used; and full embedding, where the 32-dimensional embedding was combined with the Angle-PCA coordinates. To add statistical support for compound-level evaluation, the analysis was bootstrapped by sampling a subpopulation of parasites 100 times.
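The bootstrap can be sketched as below; the classifier choice (logistic regression) and subsample fraction are assumptions for illustration, as the paper specifies only that a subpopulation of parasites was sampled 100 times:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def bootstrap_accuracy(features, compound_labels, n_rounds=100, frac=0.5):
        # features: per-parasite vectors for one condition (shuffled labels,
        # Angle-PCA coordinates, or the combined full embedding)
        rng = np.random.default_rng(0)
        n = len(features)
        scores = []
        for _ in range(n_rounds):
            idx = rng.choice(n, int(frac * n), replace=True)
            held = np.setdiff1d(np.arange(n), idx)
            clf = LogisticRegression(max_iter=1000)
            clf.fit(features[idx], compound_labels[idx])
            scores.append(clf.score(features[held], compound_labels[held]))
        return np.array(scores)  # distribution of compound-level accuracies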

As expected, the randomized labels led to low accuracy (Fig. 4A), serving as a baseline for the log odds (probability). When using the 2D Angle-PCA (on-cycle) information, there was a significant increase over random in the log odds ratio (Fig. 4A). This represents the upper-bound information limit for binary live/dead assays, given their insensitivity to parasite stages. When using the combined full embedding, there was a significant log odds ratio increase over both the random and Angle-PCA conditions (Fig. 4A). To validate that this improvement was not simply a consequence of having a larger dimensional space than the Angle-PCA, we tested an equivalent embedding from the fully supervised model trained only on expert labels (and not on pseudolabels); it demonstrated approximately the same accuracy and log odds ratio as Angle-PCA. Thus, our semi-supervised model can create an embedding sensitive to the phenotypic changes under compound treatments with distinct MoA.

To better define drug effects on Plasmodium falciparum cultures, five mitochondrial (orange text) and five PfATP4ase (blue text) compounds were used; after a 24-hour incubation, images were collected and analyzed by the semi-supervised model. Various conditions were used to test performance (A). For random, images and drug names were scrambled, leading the model to group compounds incorrectly with respect to known MoA (B). Using life cycle stage definition (as in Fig. 3), the model generated improved grouping of compounds versus random (C). Last, combining the life cycle stage information with the penultimate layer of the model (morphological information, before life cycle stage definition) led to correct segregation of drugs based on their known MoA (D).

To better understand drug MoA, we evaluated how the various compounds were grouped by the three approaches (random, Angle-PCA, and morphology embedding) using hierarchical linkage dendrograms (Fig. 4, B to D). As expected, the random approach revealed no MoA similarities between compounds. For the Angle-PCA output, the MITO inhibitors atovaquone and antimycin grouped together, but the remaining clusters were mixtures of compounds from the two MoA groups. Last, the morphology embedding gave rise to an accurate separation between the two groups of compounds with different MoA. The one exception was atovaquone (when used alone), which clustered poorly with either group (branching at the base of the dendrogram; Fig. 4D). This result is likely explained by the drug dosage used, as atovaquone is known to have much-enhanced potency when used in combination with proguanil (16).
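The grouping step can be approximated with standard hierarchical clustering; in this sketch, averaging per-compound embeddings into centroids and the linkage method are assumptions, since the paper reports only a hierarchical linkage dendrogram:

    import numpy as np
    from scipy.cluster.hierarchy import dendrogram, linkage

    def compound_dendrogram(embeddings, compounds):
        names = sorted(set(compounds))
        labels = np.asarray(compounds)
        # one centroid per compound in morphology-embedding space
        centroids = np.stack([embeddings[labels == c].mean(axis=0)
                              for c in names])
        return dendrogram(linkage(centroids, method="average"), labels=names)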

The semi-supervised model was able to consistently cluster MITO inhibitors away from ATP4ase compounds in a manner that suggested a common MoA. Our semi-supervised model can therefore successfully define drug efficacy in vitro and simultaneously assign a potential drug MoA from asynchronous (and heterogeneous) P. falciparum parasite cultures using an imaging-based screening assay with high-throughput capacity.

Driven by the need to accelerate novel antimalarial drug discovery with defined MoA from phenotypic screens, we applied ML to images of asynchronous P. falciparum cultures. This semi-supervised ensemble model could identify effective drugs and cluster them according to MoA, based on life cycle stage (on-cycle) and morphological outliers (off-cycle).

Recent image-based ML approaches have been applied to malaria cultures but have focused on automated diagnosis of gross parasite morphologies from either Giemsa- or Leishman-stained samples (17–19), rather than phenotypic screening for drug MoA. ML analyses of fluorescence microscopy images have reported malaria identification in patient-derived blood smears (20) and the use of nucleus- and mitochondria-specific dyes for stage categorization and viability (21), although the latter algorithmic approach did not include deep learning. Previous unsupervised and semi-supervised ML approaches have been applied to identify phenotypic similarities in other biological systems, such as cancer cells (12, 22–24), but none have addressed the challenge of capturing the continuum of biology within the heterogeneity of control conditions. We therefore believe our study represents a key milestone in the use of high-resolution imaging data beyond diagnostics: predicting the life cycle continuum of a cell type (coping with biological heterogeneity), using this information to flag drug-induced outliers, and successfully grouping those outliers toward drug MoA.

Through semi-supervised learning, only a small number of human-derived discrete but noisy labels from asynchronous control cultures were required for our DNN method to learn and distribute data as a continuous variable, with images following the correct developmental order. By reducing expert human input, which can lead to image identification bias (see note S2), this approach can control for interexpert disagreement and is more time efficient. This semi-supervised DNN therefore extends the classification parameters beyond human-based outputs, leading to finer information granularity learned from the data automatically through pseudolabels. This improved information, derived from high-resolution microscopy data, permits the inclusion of subtle but important features to distinguish parasite stages and phenotypes that would otherwise be unavailable.

Our single-model approach was trained on life cycle stages through embedding vectors, whose distribution allows identification of two readouts: on-cycle (sensitive to treatments that slow the life cycle or kill a specific parasite stage) and off-cycle (sensitive to treatments that cluster away from control distributions). We show that this approach with embeddings was sensitive to stage-specific effects at IC50 drug concentrations (Fig. 3), much lower than those used in standard screening assays. Drug-based outliers were grouped in an MoA-dependent manner (Fig. 4), with data from similar compounds grouped more closely than data from compounds with unrelated mechanisms.

The simplicity of fluorescence imaging means that this method could be applied to different subcellular parasite features, potentially improving discrimination of cultures treated with other compounds. In addition, imaging the sexual (gametocyte) parasite stages with and without compound treatment would address the increasing need for drugs that target multiple stages of the parasite life cycle (25). Current efforts to find drugs targeting the sexual stages of development are hampered by the challenge of defining MoA for a nonreplicating parasite life cycle stage (25). This demonstrates the potential power of an MoA approach applied from the outset of drug discovery, based simply on cell morphology.

In the future, we envisage that the on-cycle readout could elucidate the power of combination treatments (distinguishing treatments targeting different life cycle stages) for a more complete therapy. Using the off-cycle readout, this approach could identify previously unrecognized combination treatments based on MoA. Because of the simplicity of sample preparation, this approach is also compatible with drug-resistant parasite lines.

New drugs against malaria are seen as a key component of the innovation required to bend the curve toward the disease's eradication, or risk a return to premillennium rates (3, 26). Seen in this light, application of ML-driven screens should enable rapid, large-scale screening and identification of drugs with concurrent determination of predicted MoA. Since ML-identified drugs will start from the advanced stage of having a predicted MoA, they should bolster the much-needed development of new chemotherapeutics for the fight against malaria.

To generate parasite line 3D7/sfGFP, 3D7 ring stages were transfected with both plasmids pkiwi003 (p230p-bsfGFP) and pDC2-cam-co.Cas9-U6.2-hDHFR_P230p (50 μg each; fig. S3) following standard procedures (27) and selected on 4 nM WR99210 (WR) for 10 days. pDC2-cam-co.Cas9-U6.2-hDHFR_P230p encodes Cas9 and the guide RNA for the P230p locus. pkiwi003 comprises the repair sequence that integrates into the P230p locus after the successful double-strand break induced by Cas9. pkiwi003 (p230p-bsfGFP) was obtained by inserting two polymerase chain reaction (PCR) fragments, both encoding parts of P230p (PF3D7_0208900), consecutively into the pBluescript SK(−) vector with Xho I/Hind III and Not I/Sac I, respectively. sfGFP together with the hsp70 (bip) 5′ untranslated region was PCR-amplified from pkiwi002 and cloned into pkiwi003 with Hind III/Not I. pkiwi002 is based on pBSp230pDiCre (28), where the FRB (binding domain of the FKBP12–rapamycin-associated protein) and Cre60 cassette (including promoter and terminator) were removed with Afe I/Spe I, and the following linkers were inserted: L1_F cctttttgcccccagcgctatataactagtACAAAAAAGTATCAAG and L1_R CTTGATACTTTTTTGTactagttatatagcgctgggggcaaaaagg. In a second step, FKBP (the immunophilin FK506-binding protein) and Cre59 were removed with Nhe I/Pst I and replaced by sfGFP, which was PCR-amplified from pCK301 (29). pDC2-cam-co.Cas9-U6.2-hDHFR_P230p was obtained by inserting the guide RNA (AGGCTGATGAAGACATCGGG) into pDC2-cam-co.Cas9-U6.2-hDHFR (30) with Bbs I. Integration of pkiwi003 into the P230p locus was confirmed by PCR using primers #99 (ACCATCAACATTATCGTCAG), #98 (TCTTCATCAGCCTGGTAAC), and #56 (CATTTACACATAAATGTCACAC; fig. S3).

The transgenic 3D7/sfGFP P. falciparum asexual parasites were cultured at 37°C (under a gas mixture of 90% N2, 5% O2, and 5% CO2) in human O+ erythrocytes under standard conditions (31), with RPMI-HEPES medium supplemented with 0.5% AlbuMAX-II. Two independent stocks (culture 1 and culture 2; Fig. 1A) of 3D7/sfGFP parasites were maintained in culture and synchronized separately with 5% D-sorbitol on consecutive days to ensure acquisition of all stages of the asexual cycle on the day of sample preparation. Samples used for imaging derived from cultures harboring an approximately 1:1:1 ratio of rings, trophozoites, and schizonts, with a parasitaemia of around 10%.

Asexual cultures were diluted 50:50 in fresh medium before 50 nM MitoTracker CMXRos (Thermo Fisher Scientific) was added for 20 min at 37°C. Samples were then fixed in phosphate-buffered saline (PBS) containing 4% formaldehyde and 0.25% glutaraldehyde and placed on a roller at room temperature, protected from light, for 20 min. The sample was then washed 3× in PBS before 10 nM DAPI and WGA (5 μg/ml) conjugated to Alexa Fluor 633 were added for 10 min, protected from light. The sample was then washed 1× in PBS and diluted 1:30 in PBS before pipetting 100 μl into each well of a CellVis (Mountain View, CA) 96-well plate.

Samples were imaged using a Nikon Ti-Eclipse widefield microscope and Hamamatsu electron-multiplying charge-coupled device camera, with a 100× Plan Apo 1.4 numerical aperture (NA) oil objective lens (Nikon); the NIS-Elements JOBS software package (Nikon) was used to automate the plate-based imaging. The five channels [brightfield, DNA (DAPI), cytoplasm (sfGFP-labeled), mitochondria (MitoTracker or MITO), and RBC (WGA-633)] were collected serially at Nyquist sampling as a 6-μm z-stack, with fluorescent excitation from a CoolLED light source. To collect sufficient parasite numbers per treatment, 32 fields of view (sites) were randomly generated and collected within each well, with treatments run in technical triplicate. Data were saved directly onto an external hard drive for short-term storage and processing (see below).

The 3D images were processed via a custom ImageJ macro and transformed into 2D maximum intensity projection images. Brightfield channels were also projected using the minimum intensity projection, as this was found to improve analysis of the food vacuole and anomalies including double infections. Converting each whole-site image to per-parasite embedding vectors was performed as previously described (12), with some modifications: the Otsu threshold was set to the minimum of the calculated threshold or 1.25× the foreground mean of the image, and centers closer than 100 pixels were pruned. Each channel image was separately fed as a grayscale image into the deep metric network for conversion into a 64-dimension embedding vector. The six embedding vectors (one from each fluorescent channel and from both the minimum and maximum projections of the brightfield channel) were concatenated to yield a final 384-dimension embedding vector.
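A minimal sketch of this featurization step, with embed_image standing in for the pretrained deep metric network (not reproduced here):

    import numpy as np

    def parasite_embedding(channel_images, embed_image):
        # channel_images: six grayscale patches per parasite (four fluorescent
        # channels plus brightfield minimum and maximum projections)
        parts = [embed_image(img) for img in channel_images]  # six 64-D vectors
        return np.concatenate(parts)                          # one 384-D vector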

All labels were collected using the annotation tool originally built for collecting diabetic retinopathy labels (32). For each set of labels gathered, tiled images were stitched together to create a collage for all parasites to be labeled. These collages contained both stains in grayscale and color overlays to aid identification. Collages and a set of associated questions were uploaded to the annotation tool, and human experts (Imperial College London) provided labels (answers). In cases where multiple experts labeled the same image, a majority vote was used to determine the final label.

Initial labels for training classified parasites into 1 of 11 classes: merozoite, ring, trophozoite, schizont, cluster of merozoites, multiple infection, bad image, bad patch (region of interest) location, parasite debris, unknown parasite inside an RBC, or other. Subsequent labels were collected with parasite debris classified further into the following: small debris remnant, cluster of debris, and death inside an RBC (table S2). For training, the following labels were dropped: bad image, bad patch location, unknown parasite inside an RBC, unspecified parasite debris, and other. To collect these labels, five parasites were randomly sampled from each well of the experiments.

To validate the model performance, an additional 448 parasites were labeled by five experts. The parasites were selected from eight separate experimental plates using only control image data (DMSO only).

Last, paired labels were collected to validate the sort-order results. For these labels, the collage included two parasites, and experts identified which parasite was earlier in the life cycle or whether the parasites were too close to call. Here, data from the 448 parasite validation set were used, limited to cases where all experts agreed that the images were of a parasite inside an RBC. From this set, 24 parasites were selected, and all possible pairings of these 24 parasites were uploaded as questions (24 choose 2 = 276 questions uploaded). In addition, another 19 pairs were selected that were near the trophozoite/schizont boundary to enable angle resolution analysis.

To prepare the data for analysis, the patch embeddings were first joined with the ground truth labels for those patches that had labels. Six separate models were trained on these embeddings to classify asexual life cycle stages and normal anomalies such as multiple infection, cell death, and cellular debris. Each model was a two-layer (64 and 32 dimensions), fully connected neural network with ReLU nonlinearities. To create training data for the six models, human-labeled examples were randomly assigned to one of four partitions, balanced for each life cycle category. Each model was then trained on one of the six possible ways to select a pair of partitions from the four. Training was carried out with a batch size of 128 for 1000 steps using the Adam optimizer (33) with a learning rate of 2 × 10⁻⁴. Following the initial training, labels were predicted on all unlabeled data using all six models, and for each class, 400 examples were selected with the highest mean probability (and at least a mean probability of 0.4) and with an SD of the probability less than 0.07 (which encompasses the majority of the predictions that had labels). The training procedure was repeated with the original human labels plus the predicted (pseudo-) labels to generate our final model. The logits were extracted from the trained model, and a subspace representing the normal life cycle stages was projected into 2D by PCA. The life cycle angle was computed as arctan(y/x), where x and y are the first and second coordinates of the projection, respectively.
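A condensed sketch of this training recipe follows; the layer sizes, optimizer, learning rate, and step count are taken from the text, while the framework (Keras) and data handling are assumptions:

    from itertools import combinations
    import numpy as np
    import tensorflow as tf

    def make_model(n_classes):
        return tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(32, activation="relu"),  # morphology embedding
            tf.keras.layers.Dense(n_classes),              # logits
        ])

    def train_ensemble(partitions, labels, n_classes):
        # partitions/labels: four balanced splits of the human-labeled embeddings
        models = []
        for i, j in combinations(range(4), 2):  # the six partition pairs
            x = np.concatenate([partitions[i], partitions[j]])
            y = np.concatenate([labels[i], labels[j]])
            model = make_model(n_classes)
            model.compile(optimizer=tf.keras.optimizers.Adam(2e-4),
                          loss=tf.keras.losses.SparseCategoricalCrossentropy(
                              from_logits=True))
            ds = (tf.data.Dataset.from_tensor_slices((x, y))
                  .shuffle(len(x)).repeat().batch(128))
            model.fit(ds, steps_per_epoch=1000, epochs=1)  # 1000 training steps
            models.append(model)
        return models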

For each drug at a given dose and application duration, the evaluation of its effect is based on the histogram of the classified asexual life cycle stages and on finer-binned stages obtained from the estimated life cycle angle. A breakdown of labeled images for drug morphologies is given in table S3.
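A minimal sketch of this readout, with the bin count chosen arbitrarily for illustration:

    import numpy as np

    def stage_histogram(angles, n_bins=36):
        # angles: per-parasite life cycle angles for one treatment condition;
        # comparing the counts against the DMSO histogram reveals stage-
        # specific (on-cycle) drug effects
        bins = np.linspace(-np.pi, np.pi, n_bins + 1)
        counts, _ = np.histogram(angles, bins=bins)
        return counts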

WHO, World Malaria Report (Geneva, 2019).

J. Wang, Y. Song, T. Leung, C. Rosenberg, J. Wang, J. Philbin, B. Chen, Y. Wu, Learning fine-grained image similarity with deep ranking, paper presented at the Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2014), pp. 1386–1393.

Read more here:
A machine learning approach to define antimalarial drug action from heterogeneous cell-based screens - Science Advances

Posted in Machine Learning | Comments Off on A machine learning approach to define antimalarial drug action from heterogeneous cell-based screens – Science Advances