Article, 2024

Is medieval distant viewing possible? : Extending and enriching annotation of legacy image collections using visual analytics

Digital Scholarship in the Humanities, ISSN 0268-1145, 2055-7671, 1477-4615, 2055-768x, Volume 39, Issue 2, Pages 638-656, DOI 10.1093/llc/fqae020

Contributors

Meinecke, Christofer, ORCID 0000-0002-5637-9975 (Corresponding author) [1]

Guéville, Estelle [2]

Wrisley, David Joseph, ORCID 0000-0002-0355-1487 [3]

Jänicke, Stefan, ORCID 0000-0001-9353-5212 [4]

Affiliations

[1] Leipzig University

[2] Yale University

[3] New York University Abu Dhabi

[4] University of Southern Denmark

Abstract

Distant viewing approaches have typically used image datasets close to the contemporary image data used to train machine learning models. Working with images from other historical periods requires expert-annotated data, and the quality of the labels is crucial for the quality of the results. Annotating data, or re-annotating legacy data, is an arduous task, especially when working with cultural heritage collections that contain myriad uncertainties. In this paper, we describe working with two pre-annotated sets of medieval manuscript images that exhibit conflicting and overlapping metadata. Since a manual reconciliation of the two legacy ontologies would be very expensive, we aim (1) to create a more uniform set of descriptive labels to serve as a “bridge” in the combined dataset, and (2) to establish a high-quality hierarchical classification that can be used as a valuable input for subsequent supervised machine learning. To achieve these goals, we developed visualization and interaction mechanisms that enable medievalists to combine, regularize, and extend the vocabulary used to describe these and other cognate image datasets. The visual interfaces provide experts with an overview of relationships in the data that goes beyond the sum total of the metadata. Word and image embeddings, as well as co-occurrences of labels across the datasets, enable batch re-annotation of images and recommendation of label candidates, and support composing a hierarchical classification of labels.
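
As an illustration of the co-occurrence idea mentioned in the abstract, the following is a minimal sketch of how label candidates might be recommended from labels that frequently appear together on the same image. It is not the authors' implementation: the function names, the scoring rule (summed pair counts), and the toy annotations are all assumptions made for this example, and the word and image embeddings described in the abstract are not modeled here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(annotations):
    """Count how often each unordered pair of labels appears on the same image."""
    counts = Counter()
    for labels in annotations.values():
        # Sort so that each pair is stored under one canonical key (a, b) with a < b.
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


def recommend_labels(image_labels, counts, top_k=3):
    """Rank labels not yet on the image by summed co-occurrence with its labels."""
    present = set(image_labels)
    scores = Counter()
    for (a, b), n in counts.items():
        if a in present and b not in present:
            scores[b] += n
        elif b in present and a not in present:
            scores[a] += n
    # Highest score first; ties broken alphabetically for a deterministic order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked[:top_k]]


# Hypothetical toy annotations (image id -> labels); not data from the paper.
annotations = {
    "img1": ["knight", "horse", "sword"],
    "img2": ["knight", "horse"],
    "img3": ["king", "crown", "sword"],
    "img4": ["knight", "sword"],
}

counts = cooccurrence_counts(annotations)
suggestions = recommend_labels(["knight"], counts)
print(suggestions)
```

In a setting like the one the abstract describes, such suggestions would presumably be shown to an expert for confirmation rather than applied automatically.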

Keywords

analytes, annotated data, annotation, approach, batch, candidates, cognates, collection, cultural heritage collections, data, dataset, distant view, embedding, experts, goal, heritage collections, hierarchical classification, historical period, image collection, image data, image datasets, image embeddings, images, input, interaction, interaction mechanism, interface, labeling, learning, learning models, legacy, legacy data, legacy ontologies, machine learning, machine learning models, manuscript images, mechanism, medievalists, metadata, model, myriad uncertainties, ontology, overview, overview of relationships, period, quality, quality of labels, quality of results, re-annotation, recommendations, reconciliation, relationship, results, supervised machine learning, task, train machine learning models, uncertainty, views, visual analytics, visual interface, visualization, vocabulary, words

Data Provider: Digital Science