| Name | Description | Size | Format |
|---|---|---|---|
| | | 1.74 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
Mental health is one of the greatest global challenges today, with depression and anxiety among the most common and debilitating disorders. Social media, because of its dynamic nature and the way it captures emotional states, has become an important source of data for mental health research. In this context, this dissertation explores the use of textual data from social media to identify signs associated with mental disorders and with behaviors frequently described in the clinical literature.
The methodology began with the collection of several public English-language text datasets, supplemented by the SNCrawler dataset (English data only). Specific classification models for anxiety and for depression were then developed and cross-applied to ensure greater consistency in the labels. In the next phase, behaviors were associated with the texts using an expanded vocabulary and semantic embedding techniques, which gave the dataset a new layer of information. The final step consisted of training a multitask model capable of classifying disorders and behaviors simultaneously. This model was applied to the SNCrawler dataset, producing an annotated resource that links the two disorders, anxiety and depression, to the associated behaviors described in the clinical literature.
The results confirm the feasibility of using textual data and Natural Language Processing (NLP) techniques to support digital phenotyping in mental health, even in the face of limitations such as the scarcity of annotated data and the difficulty of exploring modalities beyond text. This work thus contributes new resources and methodologies that could be developed further in future research, particularly through multimodal integration and clinical validation.
Description
Keywords
Anxiety; Depression; Embeddings; Digital Phenotyping; Hultig; Multi-Task Model; Natural Language Processing; Mental Health
