Monday October 7, 2024 1:46pm - 2:05pm EDT
ZOOM PASSCODE: ld4-2024
The intersection of human understanding and machine processing in cultural heritage presents a fundamental challenge: humans naturally express their interpretations through textual descriptions, while machines reason most reliably over structured data.

Researchers, developers and the public all need data to be available in numerous formats. However, manual translation between formats requires time and intimate knowledge of both the data and the chosen ontology, while machine learning approaches generally require many paired training examples to perform well.

However, large quantities of linked data and natural language samples already exist separately. We use cycle-consistency training, an unsupervised approach to learning bidirectional translation between linked data and natural language. Using two sequence-to-sequence language models and two unpaired datasets, we learn to align their feature spaces through iterative back-translation: one model generates a synthetic example as input to the second model, which attempts to recreate the real, original input to the first model. Once trained, the models can translate arbitrary data from one representation to the other. This approach has already been shown to be effective in a graph-to-text setting (Q. Guo et al., 2020) but has yet to be applied in cultural heritage.
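The back-translation loop described above can be sketched roughly as follows. This is a minimal illustration, not the implementation used in this work: `Seq2Seq`, `ToyTranslator` and `cycle_epoch` are hypothetical names, and the toy translator simply memorises pairs in place of a real sequence-to-sequence model with gradient updates.

```python
from typing import Dict, List, Protocol


class Seq2Seq(Protocol):
    """Assumed minimal interface for a translation model (hypothetical)."""

    def generate(self, source: str) -> str: ...

    def train_step(self, source: str, target: str) -> float: ...


class ToyTranslator:
    """Hypothetical stand-in for a seq2seq model: it memorises training pairs."""

    def __init__(self) -> None:
        self.memory: Dict[str, str] = {}

    def generate(self, source: str) -> str:
        # Inputs never seen in training are echoed back unchanged.
        return self.memory.get(source, source)

    def train_step(self, source: str, target: str) -> float:
        # Loss is 0 if the model already maps source to target, otherwise 1.
        loss = 0.0 if self.memory.get(source) == target else 1.0
        self.memory[source] = target
        return loss


def cycle_epoch(g2t: Seq2Seq, t2g: Seq2Seq,
                graphs: List[str], texts: List[str]) -> float:
    """Run one pass of back-translation over two unpaired datasets."""
    total = 0.0
    # Graph cycle: g2t produces synthetic text, and t2g is trained to
    # reconstruct the real graph from that synthetic text.
    for graph in graphs:
        synthetic_text = g2t.generate(graph)
        total += t2g.train_step(synthetic_text, graph)
    # Text cycle: t2g produces a synthetic graph, and g2t is trained to
    # reconstruct the real text from that synthetic graph.
    for text in texts:
        synthetic_graph = t2g.generate(text)
        total += g2t.train_step(synthetic_graph, text)
    return total / max(1, len(graphs) + len(texts))
```

In this sketch only the reconstructing model is updated in each half of the cycle; the generating model is used purely to produce the synthetic input. Neither dataset needs to contain an aligned counterpart in the other.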

This presentation gives an overview of the datasets, the key differences between them, and the implications these differences have for the translation task, particularly with respect to our training paradigm. I will then close with some proposed remedies before opening up to questions.

I look forward to seeing you all there!
Speakers
William Thorne

PhD Candidate, University of Sheffield; National Gallery (London)
I'm Liam. I am studying for a joint PhD between the University of Sheffield and the National Gallery (London) on information extraction, organisation and searching of art historical text collections. My key areas of research interest are in reducing computational and data costs of language...
Zoom