Abstract
Automatic language processing and computational human sciences: artificial intelligence in the service of the past
This talk will present examples of the use of automatic language processing methods in the humanities, particularly in textual scholarship and the philology of ancient and medieval texts in French and Hebrew. We will start with text/image alignment techniques, which facilitate the supervised creation of ground truth data for the automatic transcription of handwritten scripts, and which help resolve abbreviations and reconstruct copies of the same text. We will continue with the challenges posed by the normalization and lemmatization of earlier states of a language, which show considerable orthographic variation, and show how normalized or lemmatized text can then be used to detect intertextuality or to apply stylometric methods that identify the authors of anonymous or disputed texts. Finally, we will show how automatic language processing and artificial intelligence can be used to build large corpora spanning long periods of time, and how such corpora can then be analyzed with methods such as word and document embeddings or large language models to track major thematic evolutions over time.