ForSE 2021 Abstracts


Area 1 - Application of Formal Methods Techniques

Full Papers
Paper Nr: 1
Title:

Malware Classification with Word Embedding Features

Authors:

Aparna S. Kale, Fabio Di Troia and Mark Stamp

Abstract: Malware classification is an important and challenging problem in information security. Modern malware classification techniques rely on machine learning models that can be trained on features such as opcode sequences, API calls, and byte n-grams, among many others. In this research, we consider opcode features. We implement hybrid machine learning techniques, where we engineer feature vectors by training hidden Markov models—a technique that we refer to as HMM2Vec—and Word2Vec embeddings on these opcode sequences. The resulting HMM2Vec and Word2Vec embedding vectors are then used as features for classification algorithms. Specifically, we consider support vector machine (SVM), k-nearest neighbor (k-NN), random forest (RF), and convolutional neural network (CNN) classifiers. We conduct substantial experiments over a variety of malware families. Our experiments extend well beyond any previous related work in this field.

Paper Nr: 2
Title:

Malware Classification using Long Short-term Memory Models

Authors:

Dennis Dang, Fabio Di Troia and Mark Stamp

Abstract: Signature and anomaly based techniques are the quintessential approaches to malware detection. However, these techniques have become increasingly ineffective as malware has become more sophisticated and complex. Researchers have therefore turned to deep learning to construct better performing model. In this paper, we create four different long-short term memory (LSTM) based models and train each to classify malware samples from 20 families. Our features consist of opcodes extracted from malware executables. We employ techniques used in natural language processing (NLP), including word embedding and bidirection LSTMs (biLSTM), and we also use convolutional neural networks (CNN). We find that a model consisting of word embedding, biLSTMs, and CNN layers performs best in our malware classification experiments.

Paper Nr: 4
Title:

Malware Classification with GMM-HMM Models

Authors:

Jing Zhao, Samanvitha Basole and Mark Stamp

Abstract: Discrete hidden Markov models (HMM) are often applied to malware detection and classification problems. However, the continuous analog of discrete HMMs, that is, Gaussian mixture model-HMMs (GMM-HMM), are rarely considered in the field of cybersecurity. In this paper, we use GMM-HMMs for malware classification and we compare our results to those obtained using discrete HMMs. As features, we consider opcode sequences and entropy-based sequences. For our opcode features, GMM-HMMs produce results that are comparable to those obtained using discrete HMMs, whereas for our entropy-based features, GMM-HMMs generally improve significantly on the classification results that we have achieved with discrete HMMs.

Paper Nr: 5
Title:

Unconventional Attack against Voting Machines Enlarging the Scope of Cybersecurity Risk Analysis

Authors:

Eric Filiol

Abstract: Most modern democracies and states have adopted a large number of standards and norms to promote and harmonize international trade. The precautionary principle has come to complete this regulatory arsenal especially in the field of security of states and citizens, their health, their private life ... The aim is also to protect government agencies against wrong decisions, especially when uncertain, immature technologies are concerned. Social, political, institutional security and stability and now cybersecurity has become heavily dependent on these new forms of regulation. In this article we will show how this regulation arsenal could be exploited by cybercriminals. It is indeed possible through a broader vision of the notion of cyber attack to turn these norms and standards and this precautionary principle precisely against those they are supposed to protect. Among many possible scenarios, we consider a specific one for illustration with respect to the attack of voting machines. The main conclusion is that any (cyber)security risk analysis should now extend the mostly favoured technical view to a more operational vision in which non technical aspects also be included.

Paper Nr: 7
Title:

A New Dataset for Smartphone Gesture-based Authentication

Authors:

Elliu Huang, Fabio Di Troia, Mark Stamp and Preethi Sundaravaradhan

Abstract: In this paper, we consider the problem of authentication on a smartphone, based on gestures. Specifically, the gestures consist of users holding a smartphone while writing their initials in the air. Accelerometer data from 80 subjects was collected and we provide a preliminary analysis of this data using machine learning techniques. The machine learning techniques considered include principal component analysis (PCA) and support vector machines (SVM). The results presented here are intended to provide a baseline for additional research based on our dataset.

Paper Nr: 8
Title:

On Formalising and Analysing the Tweetchain Protocol

Authors:

Mariapia Raimondo, Simona Bernardi and Stefano Marrone

Abstract: Distributed Ledger Technology is demonstrating its capability to provide flexible frameworks for information assurance capable of resisting to byzantine failures and multiple target attacks. The availability of development frameworks allows the definition of many applications using such a technology. On the contrary, the verification of such applications are far from being easy since testing is not enough to guarantee the absence of security problems. The paper describes an experience in the modelling and security analysis of one of these applications by means of formal methods: in particular, we consider the Tweetchain protocol as a case study and we use the Tamarin Prover tool, which supports the modelling of a protocol as a multiset rewriting system and its analysis with respect to temporal first-order properties. With the aim of making the modeling and verification process reproducible and independent of the specific protocol, we present a general structure of the Tamarin Prover model and of the properties to verified. Finally, we discuss the strengths and limitations of the Tamarin Prover approach considering three aspects: modelling, analysis and the verification process.

Paper Nr: 9
Title:

Experimental Evaluation of Description Logic Concept Learning Algorithms for Static Malware Detection

Authors:

Peter Švec, Štefan Balogh and Martin Homola

Abstract: In this paper, we propose a novel approach for malware detection by using description logics learning algorithms. Over the last years, there has been a huge growth in the number of detected malware, leading to over a million unique samples observed per day. Although traditional machine learning approaches seem to be ideal for the malware detection task, we see very few of them deployed in real world solutions. Our proof-of-concept solution performs learning task from semantic input data and provides fully explainable results together with a higher robustness against adversarial attacks. Experimental results show that our solution is suitable for malware detection and we can achieve higher detection rates with additional improvements, such as enhancing the ontology with a larger amount of expert knowledge.

Paper Nr: 10
Title:

Towards Formal Security Verification of Over-the-Air Update Protocol: Requirements, Survey and UpKit Case Study

Authors:

Christophe Ponsard and Denis Darquennes

Abstract: The fast growing number of connected devices through the Internet of Things requires the capability of performing secure and efficient over-the-air updates in order to manage the deployment of innovative features and to correct security issues. Pushing an updated image in a device requires a complex protocol exposed to security threats which could be exploited to block, spy or even take control of the updated device. Hence, such update protocols need to be carefully designed and verified. In the scope of this paper, we review some representative update protocols and related threats based on MQTT, TUF (Uptane) and the blockchain. We then show how the adequate management of those threats can be verified using a formal modelling and verification approach using the Tamarin tooling. Our work is applied to the concrete case of the UpKit protocol which exhibits an interesting design.

Short Papers
Paper Nr: 3
Title:

Colluding Covert Channel for Malicious Information Exfiltration in Android Environment

Authors:

Rosangela Casolare, Fabio Martinelli, Francesco Mercaldo and Antonella Santone

Abstract: Mobile devices store a lot of sensitive and private information. It is easy from the developer point of view to release the access to sensitive and critical assets in mobile application development, such as Android. For this reason it can happen that the developer inadvertently causes sensitive data leak, putting users’ privacy at risk. Recently, a type of attack that creates a capability to transfer sensitive data between two (or more) applications is emerging i.e., the so-called colluding covert channel. To demonstrate this possibility, in this work we design and develop a set of applications exploiting covert channels for malicious purposes, which uses the smartphone accelerometer to perform a collusion between two Android applications. The vibration engine sends information from the source application to the sink application, translating it into a vibration pattern. The applications have been checked by more than sixty antimalware which did not classify them as malware, except for two antimalware which returned a false positive.