2 Replies Latest reply on Nov 5, 2018 11:59 PM by Mason Klank

    What are best practices for transcribing non-Standard English characters?

    Julia Gabbard Ledbetter Newbie

      I started transcribing an 1859 letter from one A. B. Pikard and initially thought the unusual spellings were just a sign of limited literacy. In the second body paragraph, though, the writer refers to his system as the "Fonetik Alfabet". Looking more closely at the spellings, I saw that he uses different characters for distinct vowel sounds that standard English spelling represents with the same letter; for example, a letter that looks like a typical cursive e for an "eh" sound and a letter resembling a Greek epsilon (ε) for an "ee" sound. Unusual characters are also used for oo, th, sh, and long i. (Based on a few Google searches, it turns out Reverend Pickard was a member of the American Phonetics Association and had some affiliation with the Association for Spelling Reform.)


      Most of these characters have no direct Unicode equivalent, let alone a counterpart in the English alphabet, so a visual transcription is challenging, to say the least. Some bear a slight resemblance to characters in the International Phonetic Alphabet but aren't assigned to the same phonemes. A sample of the rough transcription I started, without substituting any special characters, goes like this:

      I am veri much oblijd dat u shud

      tink enuf ov min tu ansr it. I trust

      u wil hav no difikulti in redin

      dis;– u se it is ritn in de Fonetik


      Obviously, this is unsearchable for all practical purposes, but that can be addressed with tags. A bigger issue is that it likely won't be screenreader accessible either, even without adding special characters. As someone with an amateur interest in linguistics, I'm having fun thinking up ways to address it, but since it's such an unusual situation I wanted to get input from the Community Managers. It seems like the most usable option might be to create two transcriptions for each page: first, a visual transcription with the most visually similar Unicode characters standing in for the special characters Rev. Pickard devised (to display his spelling system), and then a Standard English transcription for accessibility (both for screenreaders and for people who are unable to parse the odd notation) and searchability. Thoughts?


      Thank you!

        • Re: What are best practices for transcribing non-Standard English characters?
          Mason Klank Newbie

          I have a similar question, although not as complicated. I've come across some documents written in other languages, mostly French. Should I use the French (or other foreign) characters, or use the normal alphabet for better searchability?

          • Re: What are best practices for transcribing non-Standard English characters?
            Victoria Van Hyning Adventurer

            What a wonderful and thoughtful post, Julia, thank you for this. Your ideas are very much in line with some of the discussions we've had here with our curatorial and reference librarian colleagues about improving access to materials in languages other than English and scripts other than the Latin alphabet.


            In brief, we'd like volunteers to transcribe documents as they appear, because that may be helpful for people who are doing research into spelling, linguistics, or, in this case, phonetics. My fellow Community Manager, Lauren, and I have both looked at the letter, and we concur with your observation that it's not a straightforward task in this case. As with much palaeography or scholarly editing, there are always layers of interpretation.


            For now, I'd suggest that you, and anyone else encountering phonetic spelling or materials in languages other than English, transcribe things as closely to the original as possible and then offer a translation at the bottom of the page if you can. As you observe, this will make the materials accessible to screen readers.


            If something has gone through review and is effectively locked on crowd.loc.gov, posting a link to the original page and offering a translation here would be a helpful alternative. Then alert us so we can look into how to bring the translations back into the catalog.


            We have not yet agreed on a pathway for bringing translations back into the catalog, and of course translations are even more open to interpretation than transcription, so there's some more to think through there.


            I hope that this is a helpful answer for now.


            Thanks again for your thoughtful questions and ideas.


