

Category Archives: Machine Learning

The security threat of adversarial machine learning is real – TechTalks

The Adversarial ML Threat Matrix provides guidelines that help detect and prevent attacks on machine learning systems.

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI.

With machine learning becoming increasingly popular, one thing that has been worrying experts is the security threats the technology will entail. We are still exploring the possibilities: The breakdown of autonomous driving systems? Inconspicuous theft of sensitive data from deep neural networks? Failure of deep learning-based biometric authentication? Subtle bypass of content moderation algorithms?

Meanwhile, machine learning algorithms have already found their way into critical fields such as finance, health care, and transportation, where security failures can have severe repercussions.

Parallel to the increased adoption of machine learning algorithms in different domains, there has been growing interest in adversarial machine learning, the field of research that explores ways learning algorithms can be compromised.

And now, we finally have a framework to detect and respond to adversarial attacks against machine learning systems. Called the Adversarial ML Threat Matrix, the framework is the result of a joint effort between AI researchers at 13 organizations, including Microsoft, IBM, Nvidia, and MITRE.

While still in early stages, the ML Threat Matrix provides a consolidated view of how malicious actors can take advantage of weaknesses in machine learning algorithms to target organizations that use them. And its key message is that the threat of adversarial machine learning is real and organizations should act now to secure their AI systems.

The Adversarial ML Threat Matrix is presented in the style of ATT&CK, a tried-and-tested framework developed by MITRE to deal with cyber-threats in enterprise networks. ATT&CK provides a table that summarizes different adversarial tactics and the types of techniques that threat actors perform in each area.

Since its inception, ATT&CK has become a popular guide for cybersecurity experts and threat analysts to find weaknesses and speculate on possible attacks. The ATT&CK format of the Adversarial ML Threat Matrix makes it easier for security analysts to understand the threats to machine learning systems. It is also an accessible document for machine learning engineers who might not be deeply acquainted with cybersecurity operations.

"Many industries are undergoing digital transformation and will likely adopt machine learning technology as part of service/product offerings, including making high-stakes decisions," Pin-Yu Chen, AI researcher at IBM, told TechTalks in written comments. "The notion of system has evolved and become more complicated with the adoption of machine learning and deep learning."

For instance, Chen says, an automated financial loan application recommendation can change from a transparent rule-based system to a black-box neural network-oriented system, which could have considerable implications for how the system can be attacked and secured.

"The adversarial threat matrix analysis (i.e., the study) bridges the gap by offering a holistic view of security in emerging ML-based systems, as well as illustrating their causes from traditional means and new risks induced by ML," Chen says.

The Adversarial ML Threat Matrix combines known and documented tactics and techniques used in attacking digital infrastructure with methods that are unique to machine learning systems. Like the original ATT&CK table, each column represents one tactic (or area of activity) such as reconnaissance or model evasion, and each cell represents a specific technique.
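
To make the table's layout concrete, here is a minimal sketch in Python of how such a matrix can be represented as data. The tactic and technique names below are illustrative paraphrases, not an authoritative copy of the matrix's contents:

```python
# Illustrative, abbreviated ATT&CK-style matrix: each key is a tactic
# (a column), each value a list of techniques (the cells in that column).
# Entries are paraphrased examples, not the official matrix contents.
threat_matrix = {
    "reconnaissance": [
        "acquire public ML artifacts (arXiv papers, GitHub repos)",
        "probe the model's prediction API",
    ],
    "initial_access": [
        "use valid accounts",
        "compromise the ML supply chain",
    ],
    "model_evasion": [
        "craft adversarial examples offline",
        "query the live model to refine evasion inputs",
    ],
    "exfiltration": [
        "extract the model via repeated API queries",
        "recover training data through inference attacks",
    ],
}

def techniques_for(tactic):
    """Return the known techniques for a tactic, or an empty list."""
    return threat_matrix.get(tactic, [])

for tactic, techniques in threat_matrix.items():
    print(f"{tactic}: {len(techniques)} technique(s)")
```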

For instance, to attack a machine learning system, a malicious actor must first gather information about the underlying model (reconnaissance column). This can be done through the gathering of open-source information (arXiv papers, GitHub repositories, press releases, etc.) or through experimentation with the application programming interface that exposes the model.

Each new type of technology comes with its unique security and privacy implications. For instance, the advent of web applications with database backends introduced the concept of SQL injection. Browser scripting languages such as JavaScript ushered in cross-site scripting attacks. The internet of things (IoT) introduced new ways to create botnets and conduct distributed denial of service (DDoS) attacks. Smartphones and mobile apps created new attack vectors for malicious actors and spying agencies.

The security landscape has evolved and continues to develop to address each of these threats. We have anti-malware software, web application firewalls, intrusion detection and prevention systems, DDoS protection solutions, and many more tools to fend off these threats.

For instance, security tools can scan binary executables for the digital fingerprints of malicious payloads, and static analysis can find vulnerabilities in software code. Many platforms, such as GitHub and the Google Play Store, have already integrated many of these tools and do a good job of finding security holes in the software they host.

But in adversarial attacks, malicious behavior and vulnerabilities are deeply embedded in the thousands and millions of parameters of deep neural networks, where they are both hard to find and beyond the capabilities of current security tools.

"Traditional software security usually does not involve the machine learning component because it's a new piece in the growing system," Chen says, adding that "adopting machine learning into the security landscape gives new insights and risk assessment."

The Adversarial ML Threat Matrix comes with a set of case studies of attacks that involve traditional security vulnerabilities, adversarial machine learning, and combinations of both. What's important is that, contrary to the popular belief that adversarial attacks are limited to lab environments, the case studies show that production machine learning systems can be, and have been, compromised with adversarial attacks.

For instance, in one case study, the security team at Microsoft Azure used open-source data to gather information about a target machine learning model. They then used a valid account in the server to obtain the machine learning model and its training data. They used this information to find adversarial vulnerabilities in the model and develop attacks against the API that exposed its functionality to the public.

Other case studies show how attackers can compromise various aspects of the machine learning pipeline and the software stack to conduct data poisoning attacks, bypass spam detectors, or force AI systems to reveal confidential information.

The matrix and these case studies can guide analysts in finding weak spots in their software and can guide security tool vendors in creating new tools to protect machine learning systems.

"Inspecting a single dimension (machine learning vs. traditional software security) only provides an incomplete security analysis of the system as a whole," Chen says. As the old saying goes: security is only as strong as its weakest link.

Unfortunately, developers and adopters of machine learning algorithms are not taking the necessary measures to make their models robust against adversarial attacks.

"The current development pipeline merely ensures that a model trained on a training set can generalize well to a test set, while neglecting the fact that the model is often overconfident about unseen (out-of-distribution) data or about maliciously embedded Trojan patterns in the training set, which offers unintended avenues to evasion attacks and backdoor attacks that an adversary can leverage to control or misguide the deployed model," Chen says. "In my view, similar to car model development and manufacturing, a comprehensive in-house collision test for different adversarial threats on an AI model should become the norm, to better understand and mitigate potential security risks."
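
To make the notion of an evasion attack concrete, here is a minimal sketch of the fast gradient sign method (FGSM), one of the simplest and best-known evasion attacks, written in PyTorch. The model and input are toy placeholders; attacks on deployed systems, like those in the matrix's case studies, are considerably more elaborate:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, label, epsilon=0.03):
    """Fast gradient sign method: nudge the input in the direction that
    maximally increases the model's loss, bounded per-pixel by epsilon."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # One signed-gradient step yields the adversarial example.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0, 1).detach()

# Toy usage with a placeholder model and a fake "image".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(1, 1, 28, 28)
label = torch.tensor([3])             # the input's true class
x_adv = fgsm_attack(model, x, label)  # visually similar, often misclassified
```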

In his work at IBM Research, Chen has helped develop various methods to detect and patch adversarial vulnerabilities in machine learning models. With the advent of the Adversarial ML Threat Matrix, the efforts of Chen and other AI and security researchers will put developers in a better position to create secure and robust machine learning systems.

"My hope is that with this study, model developers and machine learning researchers will pay more attention to the security (robustness) aspect of the model and look beyond a single performance metric such as accuracy," Chen says.


Altruist: A New Method To Explain Interpretable Machine Learning Through Local Interpretations of Predictive Models – MarkTechPost

Artificial intelligence (AI) and machine learning (ML) are the digital world's trendsetters in recent times. Although ML models can make accurate predictions, the logic behind those predictions often remains unclear to users. A lack of evaluation and selection criteria makes it difficult for the end-user to select the most appropriate interpretation technique.

How do we extract insights from the models? Which features should be prioritized while making predictions, and why? These questions remain prevalent. Interpretable Machine Learning (IML) grew out of the questions mentioned above. IML is a layer in ML models that helps human beings understand the procedure and logic behind a machine learning model's inner workings.

Ioannis Mollas, Nick Bassiliades, and Grigorios Tsoumakas have introduced a new methodology to make IML more reliable and understandable for end-users. Altruist, a meta-learning method, aims to help the end-user choose an appropriate technique based on feature importance by providing interpretations through logic-based argumentation.

The meta-learning methodology is composed of several components, detailed in the paper and repository below:

Paper: https://arxiv.org/pdf/2010.07650.pdf

Github: https://github.com/iamollas/Altruist
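
Altruist's full method is described in the paper and repository above. As a rough, hedged sketch of the core idea, one can test whether a feature-importance claim holds up under a local perturbation of the input; this illustration uses a toy scikit-learn model and fabricated data, and is not Altruist's actual code:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def importance_is_truthful(model, x, feature, importance, delta=0.1):
    """Local 'truthfulness' check: if a feature has positive importance,
    nudging it upward should not lower the predicted probability
    (and vice versa). Illustration only, not Altruist's implementation."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    x_up = x.copy()
    x_up[feature] += delta
    moved = model.predict_proba(x_up.reshape(1, -1))[0, 1]
    return moved >= base if importance > 0 else moved <= base

# Toy usage: fabricated data, linear model, coefficients as importances.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = rng.integers(0, 2, size=200)
clf = LogisticRegression().fit(X, y)
x = X[0]
for f, w in enumerate(clf.coef_[0]):
    print(f"feature {f}: truthful={importance_is_truthful(clf, x, f, w)}")
```

An interpretation technique whose importance claims survive such checks for a given instance can then, in Altruist's logic-based framing, be argued to be the more trustworthy choice.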



ATL Special Report Podcast: Tactical Use Cases And Machine Learning With Lexis+ – Above the Law

Welcome back, listeners, to this exclusive Above the Law Lexis+ Special Report Podcast: Introducing a New Era in Legal Research, brought to you by LexisNexis. This is the second episode in our special series.

Join us once again as LexisNexis Chief Product Officer for North America Jeff Pfeifer (@JeffPfeifer) and Evolve the Law Contributing Editor Ian Connett (@QuantumJurist) dive deeper into Lexis+, sharing tactical use cases, new tools like brief analysis and Ravel view utilizing data visualization, and how Jeff's engineering team at Lexis Labs took Google machine learning technology to law school to provide Lexis+ users with the ultimate legal research experience.

This is the second episode of our special four-part series. You can listen to our first episode with Jeff Pfeifer here for more on Lexis+. We hope you enjoy this special report featuring Jeff Pfeifer and will stay tuned for the next episodes in the series.



Efficient audits with machine learning and Slither-simil – Security Boulevard

by Sina Pilehchiha, Concordia University

Trail of Bits has manually curated a wealth of data (years of security assessment reports) and now we're exploring how to use this data to make the smart contract auditing process more efficient with Slither-simil.

Based on accumulated knowledge embedded in previous audits, we set out to detect similar vulnerable code snippets in new clients' codebases. Specifically, we explored machine learning (ML) approaches to automatically improve on the performance of Slither, our static analyzer for Solidity, and make life a bit easier for both auditors and clients.

Currently, human auditors with expert knowledge of Solidity and its security nuances scan and assess Solidity source code to discover vulnerabilities and potential threats at different granularity levels. In our experiment, we explored how much of this security assessment process we could automate.

Slither-simil, the statistical addition to Slither, is a code similarity measurement tool that uses state-of-the-art machine learning to detect similar Solidity functions. When it began as an experiment last year under the codename crytic-pred, it was used to vectorize Solidity source code snippets and measure the similarity between them. This year, we're taking it to the next level and applying it directly to vulnerable code.

Slither-simil currently uses its own representation of Solidity code, SlithIR (Slither Intermediate Representation), to encode Solidity snippets at the granularity level of functions. We thought function-level analysis was a good place to start our research since it's not too coarse (like the file level) and not too detailed (like the statement or line level).

Figure 1: A high-level view of the process workflow of Slither-simil.

In the process workflow of Slither-simil, we first manually collected vulnerabilities from the previous archived security assessments and transferred them to a vulnerability database. Note that these are the vulnerabilities auditors had to find with no automation.

After that, we compiled previous clients' codebases and matched the functions they contained with our vulnerability database via an automated function extraction and normalization script. By the end of this process, our vulnerabilities were represented as normalized SlithIR tokens, ready to serve as input to our ML system.

Here's how we used Slither to transform a Solidity function into the intermediate representation SlithIR, then further tokenized and normalized it to serve as input to Slither-simil:

Figure 2: A complete Solidity function from the contract TurtleToken.sol.

Figure 3: The same function with its SlithIR expressions printed out.

First, we converted every statement or expression into its SlithIR correspondent, then tokenized the SlithIR sub-expressions and further normalized them so that more similar matches would occur despite superficial differences between the tokens of this function and the vulnerability database.

Figure 4: Normalized SlithIR tokens of the previous expressions.
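
As a rough illustration of what this normalization accomplishes (a sketch, not Slither-simil's actual implementation), concrete identifiers and literals can be mapped to placeholder tokens so that two structurally identical functions yield the same token stream:

```python
import re

def normalize_tokens(tokens):
    """Map superficial details (identifiers, numeric literals) to
    placeholders so structurally similar code yields identical streams.
    Illustration only; Slither-simil normalizes SlithIR, not raw text."""
    keywords = {"require", "return", "if", "true", "false",
                "+", "-", "=", ">=", "<="}
    normalized = []
    for tok in tokens:
        if tok in keywords:
            normalized.append(tok)          # keep structural tokens
        elif re.fullmatch(r"\d+", tok):
            normalized.append("<CONST>")    # any numeric literal
        else:
            normalized.append("<VAR>")      # any identifier
    return normalized

print(normalize_tokens(["require", "balances", ">=", "42"]))
# -> ['require', '<VAR>', '>=', '<CONST>']
```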

After obtaining the final form of token representations for this function, we compared its structure to that of the vulnerable functions in our vulnerability database. Due to the modularity of Slither-simil, we used various ML architectures to measure the similarity between any number of functions.

Figure 5: Using Slither-simil to test a function from a smart contract with an array of other Solidity contracts.

Let's take a look at the function transferFrom from the ETQuality.sol smart contract to see how its structure resembled our query function:

Figure 6: Function transferFrom from the ETQuality.sol smart contract.

Comparing the statements in the two functions, we can easily see that they both contain, in the same order, a binary comparison operation (>= and <=), the same type of operand comparison, and another similar assignment operation with an internal call statement and an instance of returning a true value.

As the similarity score approaches 0, these sorts of structural similarities are observed less often; in the other direction, the functions become more alike, and two functions with a similarity score of 1.0 are identical to each other.

Research on automatic vulnerability discovery in Solidity has taken off in the past two years, and tools like Vulcan and SmartEmbed, which use ML approaches to discovering vulnerabilities in smart contracts, are showing promising results.

However, all the current related approaches focus on vulnerabilities already detectable by static analyzers like Slither and Mythril, while our experiment focused on the vulnerabilities these tools were not able to identify: specifically, those undetected by Slither.

Much of the academic research of the past five years has focused on taking ML concepts (usually from the field of natural language processing) and using them in a development or code analysis context, typically referred to as code intelligence. Based on previous, related work in this research area, we aim to bridge the semantic gap between the performance of a human auditor and an ML detection system to discover vulnerabilities, thus complementing the work of Trail of Bits human auditors with automated approaches (i.e., Machine Programming, or MP).

We still face the challenge of data scarcity concerning the scale of smart contracts available for analysis and the frequency of interesting vulnerabilities appearing in them. We can focus on the ML model because it's sexy, but it doesn't do much good in the case of Solidity, where even the language itself is very young and we need to tread carefully in how we treat the amount of data we have at our disposal.

Archiving previous client data was a job in itself, since we had to deal with the different solc versions to compile each project separately. For someone with limited experience in that area this was a challenge, and I learned a lot along the way. (The most important takeaway of my summer internship is that if you're doing machine learning, you will not realize how major a bottleneck the data collection and cleaning phases are unless you have to do them.)

Figure 7: Distribution of 89 vulnerabilities found among 10 security assessments.

The pie chart shows how 89 vulnerabilities were distributed among the 10 client security assessments we surveyed. We documented both the notable vulnerabilities and those that were not discoverable by Slither.

This past summer we resumed the development of Slither-simil and SlithIR with two goals in mind.

We implemented a baseline text-based model with FastText, to be compared against an improved model with a tangibly significant difference in results; e.g., one not relying on software complexity metrics but focusing solely on graph-based models, as they are the most promising ones right now.

For this, we have proposed a slew of techniques to try out with the Solidity language at the highest abstraction level, namely, source code.

To develop ML models, we considered both supervised and unsupervised learning methods. First, we developed a baseline unsupervised model based on tokenizing source code functions and embedding them in a Euclidean space (Figure 8) to measure and quantify the distance (i.e., dissimilarity) between different tokens. Since functions are constituted from tokens, we just added up the differences to get the (dis)similarity between any two different snippets of any size.

The diagram below shows the SlithIR tokens from a set of training Solidity data spherized in a three-dimensional Euclidean space, with similar tokens closer to each other in vector distance. Each purple dot shows one token.

Figure 8: Embedding space containing SlithIR tokens from a set of training Solidity data
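
A hedged sketch of such a baseline, assuming gensim's FastText implementation (Slither-simil's real pipeline operates on normalized SlithIR tokens and differs in detail): each function is embedded as the mean of its token vectors, and similarity is the cosine between embeddings, so 1.0 means identical and scores near 0 mean unrelated.

```python
import numpy as np
from gensim.models import FastText

# Toy corpus: each "sentence" is the normalized token stream of one function.
corpus = [
    ["require", "<VAR>", ">=", "<CONST>", "return", "true"],
    ["require", "<VAR>", "<=", "<CONST>", "return", "true"],
    ["if", "<VAR>", "=", "<CONST>", "return", "false"],
]
model = FastText(sentences=corpus, vector_size=32, min_count=1, epochs=50)

def embed(tokens):
    """Embed a function as the mean of its token vectors."""
    return np.mean([model.wv[t] for t in tokens], axis=0)

def similarity(a, b):
    """Cosine similarity between two embedded functions."""
    va, vb = embed(a), embed(b)
    return float(np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb)))

print(similarity(corpus[0], corpus[1]))  # high: nearly identical structure
print(similarity(corpus[0], corpus[2]))  # lower: different structure
```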

We are currently developing a proprietary database consisting of our previous clients' and other publicly available vulnerable smart contracts, along with references from papers and other audits. Together they'll form one unified, comprehensive database of Solidity vulnerabilities for queries, later training, and testing newer models.

We're also working on other unsupervised and supervised models, using data labeled by static analyzers like Slither and Mythril. We're examining deep learning models that have much more expressivity for modeling source code; specifically, graph-based models utilizing abstract syntax trees and control flow graphs.

And we're looking forward to checking out Slither-simil's performance on new audit tasks to see how it improves our assurance team's productivity (e.g., in triaging and finding the low-hanging fruit more quickly). We're also going to test it on Mainnet when it gets a bit more mature and automatically scalable.

You can try Slither-simil now via its GitHub PR. For end users, it's a simple CLI tool.

Slither-simil is a powerful tool with the potential to measure the similarity between function snippets of any size written in Solidity. We are continuing to develop it, and based on current results and recent related research, we hope to see impactful real-world results before the end of the year.

Finally, I'd like to thank my supervisors Gustavo, Michael, Josselin, Stefan, Dan, and everyone else at Trail of Bits, who made this the most extraordinary internship experience I've ever had.

*** This is a Security Bloggers Network syndicated blog from Trail of Bits Blog authored by Nol Ponthieux. Read the original post at: https://blog.trailofbits.com/2020/10/23/efficient-audits-with-machine-learning-and-slither-simil/


AutoML Alleviates the Process of Machine Learning Analysis – Analytics Insight

Machine learning (ML) is constantly being adopted by diverse organizations in their enthusiasm to acquire answers and analysis. As this embrace increases, it is often forgotten that machine learning has flaws that need to be addressed for it to deliver sound solutions.

Applications of artificial intelligence and machine learning are using new tools to find practical answers to difficult problems. Companies move forward with these emerging technologies to gain a competitive edge in their working style and systems. Along the way, organizations are learning an important lesson: one strategy doesn't fit all. Business organizations want machine learning to analyze data that is large, complex, and difficult, neglecting the fact that machine learning can't perform well on every kind of data store, and even when it runs, it may conclude with a wrong prediction.

Analysing unstructured and overwhelmingly large datasets with machine learning is risky. Machine learning might conclude with a wrong solution while performing predictive analysis on such data, and implementing that misconception in a company's working system might drag down its improvement. Many products that incorporate machine learning capabilities use predetermined algorithms and many diverse ways to handle data. However, each organization's data has different technical characteristics that might not go well with the existing machine learning configuration.

To address the problems where machine learning falls short, AutoML takes them head-on from the company's data analysis perspective. AutoML takes over the labour-intensive job of choosing and tuning machine learning models. The technology takes on many repetitive tasks where skilful problem definition and data preparation are needed, reducing the need to understand algorithm parameters and shortening the compute time needed to produce better models.

Machine learning is an application of artificial intelligence that provides systems with the ability to automatically learn and improve from experience without being explicitly programmed. The technology focuses on the development of computer programs that can access data and use it for themselves. A model is created and trained on a set of previously gathered data and outcomes, and can then be used to make predictions on new data.

However, machine learning can't get accurate results all the time; much depends on the data scientist handling the machine learning configurations and data inputs. A data scientist studies the input data and understands the desired output to solve business problems. They choose an apt mathematical algorithm from dozens, tune its parameters (called hyperparameters), and evaluate the resulting models. The data scientist has the responsibility to adjust the algorithm's tuning parameters again and again until the machine learning model produces the desired result. If the results are not satisfactory, the data scientist might even start over from the very beginning.

Machine learning systems struggle to function when the data is too large or unorganised. Some of the other machine learning issues are:

Classification: The process of labeling data can be thought of as a discrimination problem, modeling the similarities between groups.

Regression: Machine learning struggles to predict the value of new, unseen data.

Clustering: Data can be divided into groups based on similarity and other measures of natural structure in the data, but human hands are still needed to assign names to the groups.

As mentioned earlier, machine learning alone can't address an organisation's datasets to produce predictions. Here are some reasons why tuning a machine learning algorithm is challenging, and how AutoML can prove useful in such instances.

Choosing the right algorithm: It is not always obvious which algorithm will work well for building real-value prediction, anomaly detection, and classification models for a particular data set. Data scientists have to go through many well-known machine learning algorithms that could suit the real-world situation. Arriving at the right one could take weeks or even months.

Selecting relevant information: Data stores hold diverse data variables, or predictors. Hence, it is hard to tell which of those data points are significant for making a decision. This process of selecting relevant information to include in data models is called feature selection.

Training machine learning models: The most difficult process in machine learning is choosing a subset of data that can be used for training a machine learning model. In some cases, training against certain data variables or predictors can increase training time while actually reducing the accuracy of the ML model.

Automated machine learning (AutoML) basically involves automating the end-to-end process of applying machine learning to real-world problems that are actually relevant in the industry. AutoML makes well-educated guesses to select a suitable ML algorithm and effective initial hyperparameters. The technology tests the accuracy of training the chosen algorithms with those parameters, makes tiny adjustments, and tests the results again. AutoML also automates the creation of small, accurate subsets of data to use for those iterative refinements, yielding excellent results in a fraction of the time.
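
As a simplified sketch of the search loop such systems automate, here is scikit-learn's RandomizedSearchCV trying hyperparameter combinations that a data scientist would otherwise test by hand; real AutoML products go further, searching across algorithms, feature pipelines, and data subsets as well.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# The search space a data scientist would otherwise explore by hand.
param_distributions = {
    "n_estimators": randint(50, 300),
    "max_depth": randint(2, 12),
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=20,       # try 20 hyperparameter combinations
    cv=3,            # score each with 3-fold cross-validation
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```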

In a nutshell, AutoML acts as the right tool to quickly choose, build, and deploy machine learning models that deliver accurate results.


Synopsys and SiMa.ai Collaborate to Bring Machine Learning Inference at Scale to the Embedded Edge – AiThority

Engagement Leverages Synopsys DesignWare IP, Verification Continuum, and Fusion Design Solutions to Accelerate Development of SiMa.ai MLSoC Platform

Synopsys, Inc. announced its collaboration with SiMa.ai to bring its machine learning inference at scale to the embedded edge. Through this engagement, SiMa.ai has adopted key products from Synopsys DesignWare IP, Verification Continuum Platform, and Fusion Design Platform for the development of its MLSoC, a purpose-built machine-learning platform targeted at specialized computer vision applications, such as autonomous driving, surveillance, and robotics.


SiMa.ai selected Synopsys for its expertise in functional safety, its complete set of proven solutions and models, and a silicon-proven IP portfolio that will help SiMa.ai deliver high-performance computing at the lowest power. With Synopsys automotive-grade solutions, SiMa.ai can accelerate its SoC-level ISO 26262 functional safety assessments and qualification while achieving its target ASILs.

"Working closely with top-tier customers, we have developed a software-centric architecture that delivers high-performance machine learning at the lowest power. Our purpose-built, highly integrated MLSoC supports legacy compute along with industry-leading machine learning to deliver more than 30x better compute-power efficiency compared to industry alternatives," said Krishna Rangasayee, founder and CEO at SiMa.ai. "We are delighted to collaborate with Synopsys toward our common goal of bringing high-performance machine learning to the embedded edge. Leveraging Synopsys industry-leading portfolio of IP, verification, and design platforms enables us to reduce development risk and accelerate the design and verification process."


"We are pleased to support SiMa.ai as it brings its MLSoC chip to market," said Manoj Gandhi, general manager of the Verification Group at Synopsys. "Our collaboration aims to address SiMa.ai's mission to enable customers to build low-power, high-performance machine learning solutions at the embedded edge across a diverse set of industries."

Since its inception, SiMa.ai has strategically collaborated with Synopsys to support all aspects of its MLSoC architecture design and verification.

