dc.contributor.author: Thompson, Paul
dc.contributor.author: Ananiadou, Sophia
dc.contributor.author: Basinas, Ioannis
dc.contributor.author: Brinchmann, Bendik Christian
dc.contributor.author: Cramer, Christine
dc.contributor.author: Galea, Karen S.
dc.contributor.author: Ge, Calvin B.
dc.contributor.author: Georgiadis, Panagiotis
dc.contributor.author: Kirkeleit, Jorunn
dc.contributor.author: Kuijpers, Eelco
dc.contributor.author: Nguyen, Nhung
dc.contributor.author: Nuñez, Roberto
dc.contributor.author: Schlünssen, Vivi
dc.contributor.author: Stokholm, Zara Ann
dc.contributor.author: Taher, Evana Amir
dc.contributor.author: Tinnerberg, Håkan
dc.contributor.author: Van Tongeren, Martie
dc.contributor.author: Xie, Qianqian
dc.date.accessioned: 2024-08-26T07:18:21Z
dc.date.available: 2024-08-26T07:18:21Z
dc.date.created: 2024-08-22T14:42:23Z
dc.date.issued: 2024
dc.identifier.citation: PLOS ONE. 2024, 19 (8), .
dc.identifier.issn: 1932-6203
dc.identifier.uri: https://hdl.handle.net/11250/3148435
dc.description.abstract: An individual’s likelihood of developing non-communicable diseases is often influenced by the types, intensities and duration of exposures at work. Job exposure matrices provide exposure estimates associated with different occupations. However, due to their time-consuming expert curation process, job exposure matrices currently cover only a subset of possible workplace exposures and may not be regularly updated. Scientific literature articles describing exposure studies provide important supporting evidence for developing and updating job exposure matrices, since they report on exposures in a variety of occupational scenarios. However, the constant growth of scientific literature is increasing the challenges of efficiently identifying relevant articles and important content within them. Natural language processing methods emulate the human process of reading and understanding texts, but in a fraction of the time. Such methods can increase the efficiency of both finding relevant documents and pinpointing specific information within them, which could streamline the process of developing and updating job exposure matrices. Named entity recognition is a fundamental natural language processing method for language understanding, which automatically identifies mentions of domain-specific concepts (named entities) in documents, e.g., exposures, occupations and job tasks. State-of-the-art machine learning models typically use evidence from an annotated corpus, i.e., a set of documents in which named entities are manually marked up (annotated) by experts, to learn how to detect named entities automatically in new documents. We have developed a novel annotated corpus of scientific articles to support machine learning based named entity recognition relevant to occupational substance exposures. Through incremental refinements to the annotation process, we demonstrate that expert annotators can attain high levels of agreement, and that the corpus can be used to train high-performance named entity recognition models. The corpus thus constitutes an important foundation for the wider development of natural language processing tools to support the study of occupational exposures.
dc.description.abstract: Supporting the working life exposome: Annotating occupational exposure for enhanced literature search
dc.language.iso: eng
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Supporting the working life exposome: Annotating occupational exposure for enhanced literature search
dc.title.alternative: Supporting the working life exposome: Annotating occupational exposure for enhanced literature search
dc.type: Peer reviewed
dc.type: Journal article
dc.description.version: publishedVersion
dc.source.pagenumber: 27
dc.source.volume: 19
dc.source.journal: PLOS ONE
dc.source.issue: 8
dc.identifier.doi: 10.1371/journal.pone.0307844
dc.identifier.cristin: 2288672
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1

