FE - DI | Master's Dissertations and Doctoral Theses

Recent Submissions

Now showing 1 - 10 of 371
  • False-negative Reduction in Mammography Breast Cancer Diagnosis Through Radiomics and Deep Learning
    Publication . Grinet, Marco António Vieira Macedo; Gomes, Abel João Padrão; Gouveia, Ana Isabel Rodrigues
    Traditional breast cancer diagnostic methods are heavily reliant on different medical imaging modalities. These imaging modalities, such as MG, MRI, US, and DBT, are used in breast cancer screening, treatment planning, and tracking disease progression. However, evaluating each diagnostic image and extracting relevant information from it requires a trained and experienced professional. This can be very time-consuming for the medical professional and thwarts efforts to expand breast cancer screening to areas with a shortage of medical staff, such as rural areas away from major metropolitan centers. With the rise of digital imaging methods, DICOM, and PACS systems, it has become possible to connect patients with medical staff who reside in a different location. […]
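    The workflow described above hinges on extracting quantitative descriptors (radiomics) from each diagnostic image. As a rough illustration of that step only, the Python sketch below loads a DICOM image and computes a few first-order intensity features; the file name and feature set are hypothetical and are not taken from the thesis.

    ```python
    # Illustrative only: simple first-order radiomic features from a DICOM image.
    # The file name is hypothetical; the thesis's actual feature pipeline differs.
    import numpy as np
    import pydicom  # common DICOM reader, assumed available

    def first_order_features(dicom_path: str) -> dict:
        """Read a DICOM image and compute basic intensity statistics."""
        img = pydicom.dcmread(dicom_path).pixel_array.astype(np.float64)
        hist, _ = np.histogram(img, bins=64)
        p = hist[hist > 0] / hist.sum()
        return {
            "mean": float(img.mean()),
            "std": float(img.std()),
            "min": float(img.min()),
            "max": float(img.max()),
            "entropy": float(-(p * np.log2(p)).sum()),  # Shannon entropy of intensities
        }

    print(first_order_features("mammogram.dcm"))  # hypothetical file
    ```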
  • Portable, multi-task, on-the-edge and low-cost computer vision framework based on deep learning: Precision agriculture application
    Publication . Assunção, Eduardo Timóteo; Proença, Hugo Pedro Martins Carriço; Gaspar, Pedro Miguel de Figueiredo Dinis Oliveira
    Precision agriculture is a new concept that has been introduced worldwide to increase production, reduce labor, and ensure efficient management of fertilizers and irrigation processes. Computer vision is an essential component of precision agriculture and plays an important role in many agricultural tasks. It serves as a perceptual tool for the mechanical interface between robots and environments or sensed objects, as well as for many other tasks such as crop yield prediction. Another important consideration is that some vision applications must run on edge devices, which typically have very limited processing power and memory. Therefore, computer vision models that are to run on edge devices must be optimized to achieve good performance. Due to the significant impact of Deep Learning and the advent of mobile devices with accelerators, there has been increased research in recent years on computer vision for general-purpose applications with the potential to increase the efficiency of precision agriculture tasks. This thesis explores how deep learning models running on edge devices are affected by optimization, in terms of inference accuracy and inference time. Lightweight models for weed segmentation, peach fruit detection, and fruit disease classification serve as case studies. First, a case study of peach fruit detection is performed with the well-known Faster R-CNN object detector, using the breakthrough AlexNet Convolutional Neural Network (CNN) as the image feature extractor. A detection accuracy of 0.90 Average Precision (AP) was achieved. The AlexNet CNN, however, is not a model optimized for use on mobile devices. To explore a lightweight model, a case study of peach fruit disease classification is next conducted using the MobileNet CNN. The MobileNet was trained on a small dataset of images of healthy, rotten, mouldy, and scabby peach fruit and achieved an F1 score of 0.96. Lessons learned from this work led to using this model as a baseline CNN for other computer vision applications (e.g., fruit detection and weed segmentation). Next, a study was conducted on robotic weed control using an automated herbicide spot sprayer. The DeepLab semantic segmentation model with the MobileNet backbone was used to segment weeds and determine spatial coordinates for the mechanism. The model was optimized and deployed on the Jetson Nano device and integrated with the robotic vehicle to evaluate real-time performance. An inference time of 0.04 s was achieved, and the results obtained in this work provide insight into how the performance of the semantic segmentation model of plants and weeds degrades when the model is adapted through optimization for operation on edge devices. Finally, to extend the application of lightweight deep learning models and the use of edge devices and accelerators, the Single Shot Detector (SSD) was trained to detect peach fruit of three different varieties and was deployed on a Raspberry Pi device with an integrated Tensor Processing Unit (TPU) accelerator. Several variations of MobileNet as a backbone were explored to investigate the tradeoff between accuracy and inference time. MobileNetV1 yielded the best inference time with 21.01 Frames Per Second (FPS), while MobileDet achieved the best detection accuracy (88.2% AP). In addition, an image dataset of three peach cultivars from Portugal was developed and published. This thesis aims to contribute to future steps in the development of precision agriculture and agricultural robotics, especially when computer vision needs to be processed on small devices.
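    Because the central question here is how optimization for edge hardware affects accuracy and inference time, a timing harness along the lines of the Python sketch below illustrates the kind of measurement involved; the model file name is hypothetical, and the code assumes the standard TensorFlow Lite interpreter API rather than the thesis's exact tooling.

    ```python
    # Minimal sketch of timing a converted TFLite model (hypothetical file name).
    import time
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="detector.tflite")
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Dummy frame matching the model's expected input shape and dtype.
    shape = inp["shape"]
    if inp["dtype"] == np.uint8:
        frame = np.random.randint(0, 256, size=shape, dtype=np.uint8)
    else:
        frame = np.random.rand(*shape).astype(np.float32)

    times = []
    for _ in range(50):
        interpreter.set_tensor(inp["index"], frame)
        start = time.perf_counter()
        interpreter.invoke()
        times.append(time.perf_counter() - start)
        _ = interpreter.get_tensor(out["index"])  # fetch detections

    print(f"mean inference time: {np.mean(times)*1e3:.1f} ms "
          f"({1.0/np.mean(times):.1f} FPS)")
    ```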
  • How Can Deep Learning Aid Human Behavior Analysis?
    Publication . Roxo, Tiago Filipe Dias dos Santos; Proença, Hugo Pedro Martins Carriço; Inácio, Pedro Ricardo Morais
    With the increase in available surveillance data and the robustness of state-of-the-art deep learning models, much recent research focuses on human biometric assessment, tracking, and person re-identification. However, another area that can combine surveillance and vision-based models, and that has not been extensively explored, is the assessment of human behavior. The lack of work on this topic is not surprising given the inherent difficulty of categorizing human behavior in such conditions, in particular without subject cooperation. In the psychology literature, human behavior analysis typically requires controlled experimental environments, with subject cooperation and features assessed via grid-based surveys. As such, it is not clear how deep learning models can aid psychology experts in human behavior analysis, which is where this thesis intends to contribute to the body of knowledge. We extensively review the psychology literature to define a set of features that have been shown to influence human behavior and that can be assessed via camera in surveillance-like conditions. This way, we define human behavior via subject profiling using seven behavioral features: interaction, relative position, clothing, soft biometrics, subject proximity, pose, and use of handheld devices. Note that this analysis does not categorize human behavior into specific states (e.g., aggressive, depressive) but rather creates a set of features that can be used to profile subjects, usable to aid/complement behavioral experts and to compare behavioral traits between subjects in a scene. Furthermore, to motivate the development of work in these areas, we review state-of-the-art approaches and datasets, highlight the limitations of certain areas, and discuss topics worth exploring in future work. After defining the set of behavioral features, we start by exploring the limitations of current biometric models in surveillance conditions, in particular the resilience of gender inference approaches. We demonstrate that these models underperform on surveillance-like data, using PAR datasets, highlighting the limitations of training in cooperative settings when performing in wilder conditions. Supported by the findings of our initial experiments, complementing face and body information arose as a viable strategy to increase model robustness in these conditions, which led us to design and propose a new model for wild gender inference based on this premise. This way, we extend the knowledge of an extensively discussed literature topic (gender classification) by exploring its application in settings where current models do not typically perform well (surveillance). We also explore the topic of human interaction, namely Active Speaker Detection (ASD), particularly in more uncooperative scenarios such as surveillance conditions. Contrary to the gender/biometrics topic, this is a less explored area where works are mainly based on assessing active speakers via face and audio information in cooperative conditions with good audio and image quality (movie settings). As such, to clearly demonstrate the limitations of state-of-the-art ASD models, we start by creating a wilder ASD dataset (WASD), composed of different categories with increasing ASD challenges, namely audio and image quality degradation, and containing uncooperative subjects. This dataset highlighted the limitations of current models in dealing with unconstrained scenarios (e.g., surveillance conditions), while also displaying the importance of body information in conditions where audio quality is subpar and face access is not guaranteed. Following this premise, we design the first model that complements audio, face, and body information to achieve state-of-the-art performance in challenging conditions, in particular surveillance settings. Furthermore, this model also introduced a novel way to combine data via Squeeze-and-Excitation (SE) blocks, which provides reasoning behind the model's decisions through visual interpretability. The use of SE blocks was also extended to other models and ASD-related areas to highlight the viability of this approach for model-agnostic interpretability. Although this initial model was superior to the state-of-the-art on challenging data, its performance in cooperative settings was not as robust. As such, we develop a new model that simultaneously combines face and body information in visual data extraction which, in conjunction with pretraining on challenging data, leads to state-of-the-art performance in both cooperative and challenging conditions (such as surveillance settings). These works pave a new way to assess human interaction in more challenging data and with model interpretability, serving as baselines for future work.
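    For readers unfamiliar with the SE blocks mentioned above, the PyTorch sketch below shows the standard squeeze-and-excitation channel recalibration (Hu et al., 2018); it is a generic building block, not the thesis's audio-face-body fusion architecture.

    ```python
    # Generic Squeeze-and-Excitation (SE) block in PyTorch; a standard
    # channel-recalibration module, NOT the thesis's fusion model.
    import torch
    import torch.nn as nn

    class SEBlock(nn.Module):
        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, channels, height, width)
            b, c, _, _ = x.shape
            squeeze = x.mean(dim=(2, 3))               # global average pool -> (b, c)
            scale = self.fc(squeeze).view(b, c, 1, 1)  # per-channel weights in (0, 1)
            return x * scale                           # recalibrate feature maps

    features = torch.randn(2, 64, 32, 32)
    print(SEBlock(64)(features).shape)  # torch.Size([2, 64, 32, 32])
    ```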
  • Improving the Robustness of Demonstration Learning
    Publication . Correia, André Rosa de Sousa Porfírio; Alexandre, Luís Filipe Barbosa de Almeida
    With the fast improvement of machine learning, Reinforcement Learning (RL) has been used to automate human tasks in different areas. However, training such agents is difficult and restricted to expert users. Moreover, it is mostly limited to simulation environments due to the high cost and safety concerns of interactions in the real world. Demonstration Learning is a paradigm in which an agent learns to perform a task by imitating the behavior of an expert shown in demonstrations. It is a relatively recent area of machine learning, but it is gaining significant traction due to its tremendous potential for learning complex behaviors from demonstrations. Learning from demonstration accelerates the learning process by improving sample efficiency, while also reducing the effort of the programmer. Because it learns without interacting with the environment, demonstration learning can allow the automation of a wide range of real-world applications such as robotics and healthcare. Demonstration learning methods still struggle with a plethora of problems. The estimated policy is reliant on the coverage of the data set, which can be difficult to collect. Direct imitation through behavior cloning learns the distribution of the data set. However, this is often not enough, and the methods may struggle to generalize to unseen scenarios. If the agent visits out-of-distribution cases, not only will it not know what to do, but the consequences in the real world can be catastrophic. Because of this, offline RL methods try to specifically reduce the distributional shift. In this thesis, we focused on proposing novel methods to tackle some of the open problems in demonstration learning. We start by introducing the fundamental concepts, methodologies, and algorithms that underpin the proposed methods in this thesis. Then, we provide a comprehensive study of the state-of-the-art of Demonstration Learning methods. This study allowed us to understand existing methods and expose the open problems that motivate this thesis. We then developed five methods that improve upon the state-of-the-art and solve different problems. The first method tackles the context problem, where policies are restricted to the context in which they were trained. We propose a method to learn context-invariant image representations with contrastive learning, making use of a multi-view demonstration data set. We show that these representations can be used in lieu of the original images to learn a policy with standard reinforcement learning algorithms. This work also contributed a benchmark environment and a demonstration data set. Next, we tackled the potential of combining reinforcement learning with demonstration learning to cover the weaknesses of both paradigms. Specifically, we developed a method to improve the safety of reinforcement learning agents during their learning process. The proposed method makes use of a demonstration data set with safe and unsafe trajectories. Before each interaction, the method evaluates the trajectory and stops it if it deems it unsafe. The method was used to augment state-of-the-art reinforcement learning methods, and it reduced the crash rate significantly, which also resulted in a slight increase in performance. In the following work, we acknowledged the significant strides made in sequence modelling and their impact on a plethora of machine learning problems. We noticed that these methods had recently been applied to demonstration learning. However, the state-of-the-art method relied on task knowledge and user interaction to perform. We proposed a hierarchical method that identifies important states in each demonstration and uses them to guide the sequence model. The result is a method that is task- and user-independent while also achieving better performance than the previous state-of-the-art. Next, we made use of the novel Mamba architecture to improve upon the previous sequence modelling method. By replacing the Transformer architecture with Mamba, we proposed two methods that reduce complexity and inference time while also improving performance. Finally, we apply demonstration learning to under-explored applications. Specifically, we apply demonstration learning to teach an agent to dance to music. We describe the insight of modelling the task of learning to dance as a translation task, where the agent learns to translate from the language of music to the language of dance. We used the experience gained from the two sequence modelling methods to propose two variants, using the Transformer and Mamba architectures. The method modifies the standard sequence modelling architecture to process sequences of audio features and translate them into dance poses. Results show that the method can translate diverse and unseen music into high-quality dance motions coherent within the genre. The results obtained by the proposed methods advance the state-of-the-art in Demonstration Learning and provide solutions to open problems in the field. All the proposed methods were evaluated against state-of-the-art baselines on several tasks and diverse data sets, improving performance and tackling their respective problems.
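    The first method's idea of pulling together representations of the same state seen from different camera views can be illustrated with a standard InfoNCE-style contrastive loss; the Python sketch below is a generic formulation under that assumption, not the thesis's exact objective.

    ```python
    # Generic InfoNCE contrastive loss over two views of the same states;
    # a sketch of the multi-view, context-invariant representation idea.
    import torch
    import torch.nn.functional as F

    def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1):
        """z1[i] and z2[i] are embeddings of the same state from two views."""
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        logits = z1 @ z2.T / temperature    # cosine similarities, (N, N)
        targets = torch.arange(z1.size(0))  # matching pairs lie on the diagonal
        return F.cross_entropy(logits, targets)

    view_a = torch.randn(8, 128)
    view_b = view_a + 0.05 * torch.randn_like(view_a)  # stand-in second view
    print(info_nce(view_a, view_b).item())
    ```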
  • Detection of Stealthy Distributed Denial of Service Attacks Using Artificial Intelligence Methods
    Publication . Rios, Vinícius de Miranda; Freire, Mário Marques; Magoni, Damien
    Distributed Denial of Service (DDoS) attacks have been used to disrupt various online activities. The significant traffic volume of these distributed attacks has enabled the identification of signatures and behavior profiles that fostered the development of detection mechanisms for mitigating these attacks. However, as new attack types emerge, such as low-rate Denial of Service (DoS) attacks, new detection mechanisms need to be developed to combat these evolving threats effectively. Many detection mechanisms rely primarily on statistical analysis to identify low-rate DoS attacks in data traffic. However, these methods often exhibit a high rate of false negatives and are only applicable to small-scale data. Artificial intelligence techniques have been widely employed in various fields, including social network analysis and disease monitoring, and have gradually gained prominence in the field of cybersecurity in recent years. This thesis focuses on studying and developing detection mechanisms that perform effectively against two specific types of low-rate DoS attacks: the Reduction of Quality (RoQ) attack and the Slowloris attack. For the RoQ attack, we examine the traffic transmission format to create a similar one, as there is no existing software capable of generating this type of attack traffic on the internet. For the Slowloris attack, we utilized free and open-source software specifically developed for this purpose. Subsequently, we analyze the traffic from both attacks and extract features that can be used by detection mechanisms. In this thesis, two approaches have been developed for classifying and detecting RoQ and Slowloris attacks: the first is based on the separate use of a set of traditional Machine Learning (ML) algorithms, and the second is based on fuzzy logic plus one traditional ML algorithm (one that previously led to good classification results) and the Euclidean distance. For RoQ attack detection, the first approach uses eleven separate machine learning algorithms, namely K-Nearest Neighbors (K-NN), Multilayer Perceptron Neural Network (MLP), Support Vector Machine (SVM), Multinomial Naive Bayes (MNB), Gaussian Naive Bayes (GNB), Decision Tree (DT), Random Forest (RF), Gradient Boosting (XGB), Logistic Regression (LR), AdaBoost, and Light Gradient Boosting Machine (LGBM), while the second approach consists of our proposed method, which combines fuzzy logic, the MLP algorithm, and the Euclidean distance method. For Slowloris attack detection, the first approach utilizes nine machine learning algorithms, namely K-NN, GNB, MLP, SVM, DT, MNB, RF, XGB, and LGBM, while the second approach consists of our proposed method, which combines fuzzy logic, the RF algorithm, and the Euclidean distance method. Both approaches use previously selected features to classify the data traffic as either attack traffic or legitimate traffic. The obtained results show that some ML algorithms (namely MLP and RF), as well as our approach based on fuzzy logic, one ML algorithm, and the Euclidean distance, are good candidates for classifying RoQ and Slowloris attacks, although the latter approach has a slightly longer runtime for detecting them.
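    The hybrid idea of pairing a traditional classifier with a Euclidean-distance decision can be sketched as below; this omits the fuzzy-logic stage entirely, uses synthetic data, and the feature names are hypothetical, so it is only a rough illustration of the combination, not the thesis's method.

    ```python
    # Sketch: Random Forest vote combined with a nearest-centroid (Euclidean)
    # decision on synthetic flow features. The fuzzy-logic stage is omitted.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    # Hypothetical flow features, e.g. packet rate, mean inter-arrival time, ...
    X = rng.normal(size=(400, 4))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # 1 = attack, 0 = legitimate

    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    centroids = {c: X[y == c].mean(axis=0) for c in (0, 1)}

    def classify(flow: np.ndarray) -> int:
        """Trust a confident RF vote; otherwise use the nearest centroid."""
        proba = clf.predict_proba(flow[None])[0]
        rf_label = int(proba.argmax())
        nc_label = min(centroids, key=lambda c: np.linalg.norm(flow - centroids[c]))
        # Illustrative fusion rule: fall back to the Euclidean-distance
        # decision when the RF is uncertain.
        return rf_label if proba.max() >= 0.7 else nc_label

    print(classify(rng.normal(size=4)))
    ```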
  • Contributions to Permissionless Decentralized Networks for Digital Currencies Based on Delegated Proof of Stake
    Publication . Morais, Rui Pedro Bernardo de; Crocker, Paul Andrew; Sousa, Simão Melo de
    With the growth and flourishing of human societies came the desire to exchange what was deemed valuable, be it a good or a service. Initially, this exchange was made directly through barter, either synchronously or asynchronously with debt. The first had the downside of requiring a coincidence of wants, and the second the need for trust. Both were very inefficient and did not scale well. So, what we call money was invented, which is nothing more than a good that is used as a medium of exchange between other goods and services. Since then, money has changed form and has acquired new functions, namely unit of account and store of value. The most recent form of money is digital currency. This money cannot be transferred physically like other forms, so it needs a digital network to be transferred, and such networks can have different characteristics. This thesis concerns a specific type of network for digital currencies: permissionless, meaning that any participant can have read and write access to the network; decentralized, meaning that no single entity controls the network; and using Delegated Proof of Stake (DPoS) as a Sybil defence mechanism, to prevent the network from being controlled by malicious actors that create numerous false identities. The research tries to fulfil the vision that a network for digital currencies, besides being permissionless and decentralized, should be scalable, monetary policy agnostic, anonymous, and have high performance. Three different layers of the network are studied: the communication layer, responsible for sending and receiving messages; the transaction layer, responsible for validating those messages; and the consensus layer, responsible for reaching agreement on the state of the network. The first two goals can be achieved in the communication layer. On one hand, a vertical way to scale the system is proposed, composed of a peer management and traffic prioritization design based on DPoS, offering an alternative to widely disseminated fee-based models. On the other hand, a horizontal way to scale is presented through database sharding. In the transaction layer, a general framework to make DPoS compatible with anonymity is described. More specifically, two different approaches to achieve amount anonymity are proposed: one based on multi-party computation and the other on the Diffie-Hellman key exchange. Finally, a new decoy selection algorithm, called SimpleDSA, is developed to improve sender anonymity. The consensus layer features two innovative consensus algorithms, Nero and Echidna, and two methods for state machine replication: Sphinx (leader-based) and Cerberus (leaderless). These developments aim to enhance the performance of the network, specifically by decreasing the latency of its state changes and increasing the throughput, i.e., the number of state changes per unit of time. A protocol that instantiates the transaction and consensus layers, called Adamastor, is formalized with security proofs and implemented as a prototype in the Rust language. Benchmarks demonstrate the practicality of the scheme and its potential application to decentralized payment systems. While further research is needed, particularly in implementing a fully operational network, this work sets a foundation for future advancements. In conclusion, this thesis contributes to the area of knowledge that results from the fusion of economics and computer science, offering technical solutions for implementing a vision of a more inclusive, fairer, more efficient, and more secure financial system. The implications of this work are far-reaching, suggesting a future where digital currencies play a significant role in shaping global finance and technology.
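    As background for the amount-anonymity approach based on the Diffie-Hellman key exchange, the toy Python sketch below shows the basic exchange with textbook-sized numbers; real protocols, including the one formalized in the thesis, use far larger parameters and typically elliptic-curve groups, so this is purely illustrative.

    ```python
    # Toy Diffie-Hellman key exchange (textbook parameters, no authentication;
    # NOT the thesis's Adamastor protocol, just the underlying primitive).
    import secrets

    p, g = 23, 5                      # tiny public prime and generator (demo only)

    a = secrets.randbelow(p - 2) + 1  # Alice's secret exponent
    b = secrets.randbelow(p - 2) + 1  # Bob's secret exponent

    A = pow(g, a, p)                  # Alice publishes g^a mod p
    B = pow(g, b, p)                  # Bob publishes g^b mod p

    shared_alice = pow(B, a, p)       # (g^b)^a mod p
    shared_bob = pow(A, b, p)         # (g^a)^b mod p
    assert shared_alice == shared_bob  # both sides derive the same secret
    print("shared secret:", shared_alice)
    ```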
  • Development of a Salesforce Solution: From Discovery to Implementation
    Publication . Martins, Rita Ribeiro; Silva, Frutuoso Gomes Mendes da; Dinkhuysen, Gabriela Levy
    This report details the internship experience focused on learning and implementing Salesforce, a prominent platform in the realm of CRM. The initial phase emphasizes an in-depth immersion into Salesforce, covering essential knowledge assimilation, architectural understanding, and tool familiarization. Concurrently, practical skills in conducting project discoveries are honed, emphasizing the significance of the Discovery phase in a project’s lifecycle. As the internship progresses, the objectives shift towards the hands-on application of Salesforce knowledge and discovery methodologies. The intern is challenged to develop diverse functionalities to fulfill the goals identified during the Discovery phase. This later stage not only requires the application of theoretical knowledge but also demands the effective translation of identified requirements into functional and efficient solutions. The overarching objectives of the internship seamlessly intertwine Salesforce proficiency, a deep comprehension of the discovery process, and the skillful application of this knowledge to create functionalities. The report provides a comprehensive overview of the internship journey, capturing the learning curve and practical applications in the dynamic environment of Salesforce and project discovery.
  • Autonomous emergency braking for highway trajectory planning
    Publication . Ribeiro, Ricardo André Pereira; Pombo, Nuno Gonçalo Coelho Costa
    Autonomous vehicles (AVs) require critical skills in several scenarios: awareness, intelligent decision-making, and executive control. The improvement of these capabilities is a natural response to the emergence of sensing systems that deliver increasingly precise measurements and a wider variety of data types, combined with new technologies and mathematical approaches to existing problems in this sector, as well as advances in artificial intelligence that are propelling the transport industry to a new level of automation. All this growth has led to the sector’s rapid development in recent years. The main idea of AVs is to create an intelligent decision-making module capable of controlling all essential processes associated with a vehicle, ranging from creating trajectories or steering control to risk analysis, as is the case with the Autonomous Emergency Braking (AEB) system. The latter, as an Advanced Driver Assistance System (ADAS), is deployed mainly to mitigate human errors such as driver distraction, to perform risk analysis with mathematical precision, to compensate for deficiencies in human perception in scenarios with adverse environmental or physical conditions (fatigue, stress, anxiety), and to prevent car accidents. This will reduce road traffic, minimize human casualties and injuries, and save millions in monetary losses for all road users. AVs are the future of strengthening and improving safety policies in various scenarios. However, the high cost of their development and testing process has proven to be a significant deterrent to developing these technologies. Consequently, available solutions in the area of autonomous emergency braking are addressed, and new solutions and studies are presented, along with their strengths and weaknesses, the state of the development process, test systems, quality, and reliability. This work aims to create AVs with well-designed trajectory planning using an adaptive Model Predictive Control (MPC) scheme capable of achieving outstanding performance in critical highway scenarios. Furthermore, it also aims to integrate an emergency braking system that reacts to multidimensional analyses, including collision detection, time to collision (TTC), and braking distance. This study also shows the necessity of placing particular emphasis on verification, validation, and testing (VVT) in the automobile industry, which has contributed significantly to the development of automation systems. These practices allow developers to test software in a low-cost, low-risk cycle, finding hidden faults in the preliminary phase and increasing confidence in security, functional, and transaction analysis for autonomous prototypes on existing road networks. This is increasingly becoming a norm in the automobile industry thanks to the cost-benefit ratio, allowing the removal of errors before reaching the final testing phase, where the cost of mistakes, both monetary and human, can be catastrophic. The models in this work were implemented and simulated in MATLAB Simulink.
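    Two of the quantities the braking system reacts to, time to collision and braking distance, follow from elementary kinematics: TTC = gap / closing speed, and stopping distance d = v² / (2·μ·g). The Python sketch below works through both with illustrative parameter values; the thesis's MPC-based controller is of course far more elaborate.

    ```python
    # Standard kinematic quantities used in AEB decision logic
    # (illustrative values, not the thesis's controller).
    G = 9.81  # gravitational acceleration, m/s^2

    def time_to_collision(gap_m: float, ego_speed: float, lead_speed: float) -> float:
        """TTC = gap / closing speed; infinite if the gap is not closing."""
        closing = ego_speed - lead_speed
        return gap_m / closing if closing > 0 else float("inf")

    def braking_distance(speed: float, friction: float = 0.7) -> float:
        """d = v^2 / (2 * mu * g) for a full stop on a surface with friction mu."""
        return speed ** 2 / (2 * friction * G)

    # Ego car at 33.3 m/s (~120 km/h), lead car at 25 m/s, 40 m ahead.
    ttc = time_to_collision(40.0, 33.3, 25.0)           # ~4.8 s
    print(f"TTC: {ttc:.1f} s, braking distance: {braking_distance(33.3):.1f} m")
    # A simple trigger would brake when TTC drops below a threshold, e.g. 2.0 s.
    ```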
  • ML Orchestrator: Development and Optimization of Machine Learning Pipelines and Platforms
    Publication . Marques, Pedro Joel da Silva Miroto; Neves, João Carlos Raposo; Lopes, Vasco Ferrinho; Degardin, Bruno Manuel
    Machine Learning Pipelines play a crucial role in the efficient development of large-scale models, a complex process that involves several stages and faces intrinsic challenges. This document explores the depth of these structures, from the initial preparation of datasets to the final stage of model deployment, as well as the importance of optimizing them. Emphasis is also placed on the critical relevance of this process in Cloud Computing environments, where flexibility, scalability, and efficiency are imperative. By understanding and properly applying optimized strategies, we not only improve model performance but also maximize the benefits offered by cloud computing, thus shaping the future of Machine Learning development at scale. The Google Cloud Platform, more specifically the Vertex AI tool, offers a comprehensive solution for building and deploying Machine Learning Pipelines, as it allows development teams to take advantage of pre-trained models, task automation, and simplified management of tasks and resources, leading to improved scalability and enabling efficient processing of large volumes of data. In addition, an analysis is made of how the Google Kubernetes Engine (GKE) tool plays a key role in the management and scaling of these structures, since the ability to manage containers on a large scale ensures efficient execution of Machine Learning processes, providing a dynamic response to client requests. To efficiently build and optimize an ML pipeline, essential objectives were set to ensure robustness and efficiency. These include creating a Google Kubernetes cluster with its complementary services in GKE for the Playground Tool service, employing scalability strategies such as KEDA, and deploying the DeepNeuronicML model for object and action predictions from real-time video streams. Additionally, a Copilot is used to monitor computational resources, ensuring the ML pipeline can manage multiple clients and their AI models in an optimized and scalable manner. To conclude, it is important to note that optimizing Machine Learning Pipelines in cloud environments is not just a necessity but a strategic advantage. By adopting innovative approaches and integrating the tools mentioned above (Vertex AI and Google Kubernetes Engine), business organizations can overcome the complex challenges of these structures and boost efficiency and innovation in their Machine Learning services.
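    As an illustration of the KEDA-based scaling strategy mentioned above, the Python sketch below builds a minimal ScaledObject manifest and prints it as YAML. The deployment name, namespace, and thresholds are hypothetical, and the exact trigger fields should be checked against the KEDA version actually deployed.

    ```python
    # Minimal KEDA ScaledObject for CPU-based autoscaling of a hypothetical
    # "playground-tool" Deployment. Names and thresholds are illustrative;
    # verify the trigger schema against the installed KEDA version.
    import yaml  # PyYAML, assumed available

    scaled_object = {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": "playground-tool-scaler", "namespace": "default"},
        "spec": {
            "scaleTargetRef": {"name": "playground-tool"},  # Deployment to scale
            "minReplicaCount": 1,
            "maxReplicaCount": 10,
            "triggers": [
                {"type": "cpu", "metricType": "Utilization",
                 "metadata": {"value": "70"}}  # scale out above 70% CPU
            ],
        },
    }

    print(yaml.safe_dump(scaled_object, sort_keys=False))
    # The printed manifest can then be saved and applied with kubectl apply -f.
    ```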
  • Angular Atomic Components Architecture
    Publication . Pena, Nuno Rodrigo Lopes; Pombo, Nuno Gonçalo Coelho Costa; Aniceto, Alexandre Miguel Coelho
    The main objective of this report is to present all the work carried out during the internship at the company Emvenci. The internship focused on the reformulation/creation of several pages of the company's web application using the Atomic Design methodology. Through this restructuring, one of the main goals is to improve the website's performance, with optimizations in content loading time, the number of lines of code, and the size each file occupies. The document thus begins by presenting the company's field of work and its objectives. Next, the platform used by the company is described and the methodology to be applied is explained. The third chapter presents the tools used, along with an explanation of each. Finally, the course of the internship is described, highlighting the most important work carried out.