Incorporating Audio Data?

While the Occasion Maps papers, which are the object of this project, do not include any audio recordings, the question of how to deal with audio recordings arose because of a number of transcripts in the collection. In cooperation with the Garfinkel Archive it became clear that those transcripts likely corresponded to tape recordings held in the archive, some of which had already been digitised. While not within the scope of this digitisation project, this discovery led to an exploration of these transcripts and audio recordings and to an exchange with other projects dealing with audio [Audio Taskforce Call link].

“I have a lot of backlog of transcriptions from previous sessions of topics similar to the ones we’re discussing, but I’m not going to be able to get to them. They may be [XXXX] to you, but not in the present shape. The typist was instructed, ‘Oh… just type what you hear’. They worked unsupervised and they are filled with all kinds of… crud. Whosoever volunteers to do an editorial job on a transcript will have [XXXX], in return for which you can have the further task, after it has been editorialized, and I mean heavily editorialized, I’m talking ‘taking out the crud’. For example, if you have a page, in the beginning will be… it’s… like just the sort of thing we’re talking about now, that will be pointless. That needs to be taken out. There is a lot of half-formed sentences. Umms and ahhs. Come on… so it needs to be turned into scripts of sorts.”

  • 1831 audio files
  • 2,000 to 3,000 hours
  • recorded lectures, talks and conversations
  • various media: reel-to-reel, Compact Cassette, digital recordings on memory card

Audio Recordings

As part of the digitisation process of the Harold Garfinkel Archive in Newburyport, MA, over [number] audio files and [] hours of audio were archived from various physical media. (A complete collection of audio will be made available for interested researchers at the Harold Garfinkel Archive at a later date.) Garfinkel had established a way of working that relied heavily on recording almost all of his seminars, lectures and, in some cases, even office hours and telephone calls. In the case of these lectures on occasion maps, Garfinkel first had every lecture recorded and later had assistants (and students) type transcriptions of the lectures and hand them back in. Some students received extra credit for this work (this is mentioned on some of the recordings). Because of the circumstances of the recordings, which were often made simply by setting up a small consumer-grade recorder in a lecture room full of students talking and moving around, the audio quality is often poor and many passages of Garfinkel’s lectures are inaudible or hard to understand.

Transcripts

It would, of course, be a desirable long-term goal to generate transcriptions of the audio in the archive and to display these automatic transcripts synchronized with the audio and, in cases where they are available, with the historical typescripts. For this to be plausible and feasible, an automatic solution would have to be found. While automatic audio-to-text tools exist, the audio quality of some of the files poses a challenge. Some experiments were done with automatic AI-assisted transcription tools, and although clear improvements could be seen over the course of 2023, the results for audio files of this quality were still unusable. Another challenge that the material poses for the technology, especially for LLM-based tools, is the very particular language and sociological terminology used by Garfinkel. For example, the term “occasion maps” is very specific to Garfinkel and is even a small niche within his work. LLM transcriptions have reliably transcribed this as “economical maps” or a similar variation, reflecting the fact that within their sample set and language model the term “economical maps” is of course vastly more frequent and thus, in an unclear situation, a much more likely candidate than “occasion maps”.
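The mis-hearing described above suggests one modest form of post-processing: flagging known substitutions for human review. The following is a minimal sketch in Python, not a description of any tool used in the project. The glossary `KNOWN_MISHEARINGS` and the function `flag_suspect_terms` are hypothetical names introduced here for illustration; only the pair “economical maps” / “occasion maps” comes from the text.

```python
import re

# Hypothetical glossary: known mis-hearings mapped to the intended term.
# Only the "economical maps" entry is taken from the text; a real list
# would have to be compiled by hand from reviewed transcripts.
KNOWN_MISHEARINGS = {
    "economical maps": "occasion maps",
}


def flag_suspect_terms(transcript, glossary=KNOWN_MISHEARINGS):
    """Return (offset, found_text, suggested_term) for each glossary hit.

    Hits are only flagged, never replaced automatically, because the
    "wrong" phrase may in some passages be what was actually said.
    """
    hits = []
    for wrong, suggestion in glossary.items():
        # Whole-phrase, case-insensitive match.
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(transcript):
            hits.append((match.start(), match.group(0), suggestion))
    return sorted(hits)


sample = "Today we return to Economical Maps and what they show."
for offset, found, suggestion in flag_suspect_terms(sample):
    print(f"{offset}: '{found}' -> possibly '{suggestion}'?")
```

Such a list only catches substitutions that have already been noticed by a human reader; it does nothing for passages that are inaudible in the first place.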

Typescripts

These typescripts, found in the collection, give further insight into Garfinkel’s process, as they exist in different states of editing and revision. Looking at the transcripts, three groups can roughly be distinguished, which vary in completeness and in accuracy to the source audio. The first group consists of relatively raw transcriptions that closely follow the audio but have obvious lacunae left by the transcriber where they could not make out utterances in the audio. Some also omit small utterances, asides or preliminary remarks that are present in the audio. The second group consists of these raw transcriptions with added handwritten annotations and revision remarks, likely by Garfinkel or an assistant. These were sometimes made in order to fill gaps in the raw transcriptions and in some cases to reorder and add thoughts and explanations not in the audio. This is a first hint that Garfinkel used these transcriptions to work further on his ideas and to bring the transcript closer in form and content to a research paper. A third category of typescripts is apparently yet another step further along in the editing process. These typescripts no longer follow the audio but are restructured and rewritten, likely on the basis of previous transcripts. These reordered typescripts were not the end of the process either, as they include further handwritten revisions. This leads to an interesting observation regarding the relation between the audio recording, and thus the performative act of holding the lecture, and the written word. The much more common case (examples) in history-of-science