Filling in Sappho's Blanks with AI

Aug 19, 2022

Photo by Liam Matthews on Unsplash

I’ve been interested in attempts to decipher codes and ancient languages since I borrowed one of my father’s books on the Enigma machine as a kid and first learned about the efforts of the Bletchley Park cryptographers to crack its code during the Second World War. I even ended up writing my dissertation, in part, on how Renaissance thinkers tried to read Egyptian hieroglyphs, centuries before Jean-François Champollion realized that the script combined both phonetic and ideographic elements by studying the inscriptions on the Rosetta Stone. So I’m currently following with interest the stories of AI being applied to reconstruct texts in ancient languages. One team of researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) is enlisting AI in its effort to decipher the Indus script, examples of which have been found on more than 4000 objects in modern-day India and Pakistan, by training an algorithm on common patterns in how languages change over time and then supplying it with words in a lost language that need to be aligned with words in a known, related language. Meanwhile, another group of scholars and researchers from British AI lab DeepMind has banded together to develop Ithaca, an AI system that can propose text to complete Greek inscriptions that survive only in a partial state, and can also estimate where and when those inscriptions were likely to have been made by drawing on contextual linguistic clues.

I was particularly struck by the Ithaca project because of my interest in the Greek poet Sappho. Sappho, who was born around 630 BC, wrote some of the most direct and evocative poetry that exists, even now, on the subject of love. Her lyrics were revered in antiquity – while Homer was known among the Greeks as “The Poet”, Sappho was his female counterpart, “The Poetess”. And yet despite her glowing reputation among her peers, only one of her poems survives intact today. We may have lost some of her writings to deliberate destruction, prompted by the early Church’s horror of her “loose” morals (Sappho speaks frankly about her desire for both men and women). Until the last quarter of the nineteenth century, our knowledge of Sappho’s poetry was restricted to what had been quoted of it in other ancient writers’ texts, though such citations were rarely in full; in one particularly frustrating example, the proto-literary-critic Longinus cites and discusses a substantial fragment of one of Sappho’s poems in his poetic manual On the Sublime, but doesn’t record its final few lines. When a trove of Sappho’s writings was retrieved during a series of excavations at Oxyrhynchus in the late 1800s and early 1900s, time and the elements had done their work, and much of the papyrus on which the poems were written had degraded, making them impossible to read without interruption.

For some scholars, the fact that – bar one – Sappho’s poems exist only in a fragmentary state makes them all the more powerful. Daniel Mendelsohn quotes Thomas Habinek, a classicist at the University of Southern California, in a New Yorker article from 2015: “The fragmentary preservation of poems of yearning and separation serves as a reminder of the inevitable incompleteness of human knowledge and affection.” But literary scholars are good at making a virtue out of necessity, and there’s a fine but definite line between a mystery that excites and a mystery that frustrates. While Habinek’s claim does feel true for some of the poems, I can’t say I feel much stimulation staring down a page of square brackets relieved only by a single word or short phrase – “to Kypris”, “Atthis for you”, “in a thin voice”, “lady”, “deep sound”.

Those quotations, by the way, are from Anne Carson’s 2002 work, If Not, Winter, one of the fullest and most beautiful English translations of Sappho’s poetry. Carson is a classicist, but she’s also a poet herself; in her translation, the single square brackets that are typically used to indicate missing matter in a source serve an aesthetic function as well as a scholarly one. In her prefatory note on “marks and lacks” throughout the text, Carson writes:

It is not the case that every gap or illegibility is specifically indicated: this would render the page a blizzard of marks and inhibit reading. Brackets are an aesthetic gesture toward the papyrological event rather than an accurate record of it. I have not used brackets in translating passages, phrases or words whose existence depends on citation by ancient authors, since these are intentionally incomplete. I emphasize the distinction between brackets and no brackets because it will affect your reading experience, if you allow it. Brackets are exciting. Even though you are approaching Sappho in translation, that is no reason you should miss the drama of trying to read a papyrus torn in half or riddled with holes or smaller than a postage stamp – brackets imply a free space of imaginal adventure.

Brackets – and the gaps in the text they indicate – stimulate our imagination, Carson says. And I wondered, as I reread If Not, Winter recently, if they would also stimulate the “imagination” of GPT-3. What would an AI program make of Sappho’s fragments, in Carson’s translation? What would it be able to dream up to fill in those maddening, intoxicating gaps in poetry written more than 2500 years ago?

As I was dealing with a work in translation, and thus not seriously attempting to recover the words that Sappho wrote, I didn’t use any form of fine-tuning for my initial GPT-3 set-up. I gave GPT-3 the prompt below, followed by a sample input (a fragment in which I had introduced artificial blanks) and a sample output (the “complete” fragment of poetry):

Sappho ( c. 630 – c. 570 BC) was an Archaic Greek poet. She is known for her lyric poetry, much of which is now lost or survives only in fragments. She wrote extensively on love and desire, both heterosexual and homosexual.

Fill in the [blank] to complete these fragments of Sappho's poetry. One [blank] can represent one or several words. Do not alter or omit any of the words that feature in the input.

        Input:

        night

        girls

        all [blank]

        might sing of the [blank] between you and the bride

        with violets in her lap


        wake! and [blank]

        the young men so that

        no more than [blank]

        shall we sleep


        Output:


        night

        girls

        all night long

        might sing of the love between you and the bride

        with violets in her lap


        wake! and go call

        the young men so that

        no more than the bird with piercing voice

        shall we sleep
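The layout above (background, instruction, then a worked input/output pair) is a standard few-shot prompt. Here is a minimal Python sketch of how a prompt like this might be assembled. It is an illustration, not the code behind these experiments: `build_prompt` and its parameters are hypothetical names, the sample pair is cut down to its first stanza, and the sketch only builds the text without sending it to any model.

```python
BACKGROUND = (
    "Sappho ( c. 630 – c. 570 BC) was an Archaic Greek poet. She is known for "
    "her lyric poetry, much of which is now lost or survives only in fragments. "
    "She wrote extensively on love and desire, both heterosexual and homosexual."
)

INSTRUCTION = (
    "Fill in the [blank] to complete these fragments of Sappho's poetry. "
    "One [blank] can represent one or several words. "
    "Do not alter or omit any of the words that feature in the input."
)


def build_prompt(background, instruction, examples, new_input):
    """Assemble a few-shot prompt (hypothetical helper, not from the post).

    `examples` is a list of (input_fragment, completed_fragment) pairs.
    The prompt ends on a bare "Output:" so the model continues from there.
    """
    parts = [background.strip(), instruction.strip()]
    for example_input, example_output in examples:
        parts.append("Input:\n" + example_input.strip())
        parts.append("Output:\n" + example_output.strip())
    parts.append("Input:\n" + new_input.strip())
    parts.append("Output:")
    return "\n\n".join(parts)


# First stanza of the sample pair, with the artificial blanks.
sample_input = (
    "night\ngirls\nall [blank]\n"
    "might sing of the [blank] between you and the bride\n"
    "with violets in her lap"
)
sample_output = (
    "night\ngirls\nall night long\n"
    "might sing of the love between you and the bride\n"
    "with violets in her lap"
)

# Opening lines of a genuinely fragmentary poem to be completed.
new_fragment = "[blank] waiting\n[blank] in sacrifices\n[blank] having good"

prompt = build_prompt(
    BACKGROUND, INSTRUCTION, [(sample_input, sample_output)], new_fragment
)
print(prompt)
```

Each completed fragment can then be appended to `examples` before the next request, so the model sees more worked pairs as the session goes on.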


I then gave GPT-3 the following input, a fragment that is riddled with blanks in Carson’s translation:

        [blank] waiting

        [blank] in sacrifices

        [blank] having good

        [blank]

        [blank] but going

        [blank] for we know

        [blank] of works

        [blank]

        [blank] after

        [blank] and toward

        [blank] says this


Then, I asked it to generate an output. This was the result:

        The gods waiting

        angry in sacrifices

        not having good

        pleased

        but going

        home for we know

        the end of works

        long

        after dark and toward

        dawn says this


Here, you can see that GPT-3 has followed the instruction to fill in each and every blank in the poem pretty closely (it misses just two, the one before “but going” and the one before “after”). However, the result is grammatically incorrect (“not having good pleased”), syntactically confused, and incoherent: the model can create individual clauses that make sense (“after dark and toward dawn”) but they don’t knit together to form a cohesive whole.

But after I had given the model just two more poems, the coherence improved. Here’s the third poem I gave it, and its associated output:


        Input:

        [blank]

        [blank]

        [blank] pity

        [blank] trembling

        [blank]

        [blank] flesh by now old age

        [blank] covers

        [blank] flies in pursuit

        [blank]

        [blank] noble

        [blank] taking

        [blank] sing to us

        the one with violets in her lap

        [blank] mostly

        [blank] goes astray


        Output:

        Ah

        me

        have pity

        on my trembling heart that

        flesh by now old age

        covers and flies in pursuit

        of

        youth and the noble

        man taking his fill of love will sing to us

        the one with violets in her lap

        for love mostly

        goes astray


GPT-3 here introduces the image of the “trembling heart”, which flies in pursuit of youth – a cue presumably taken from the words “flesh by now old age”. While the clause “that flesh by now old age covers” doesn’t make sense here – though it conjures the right image, a heart covered by flesh that is now grown old – this output has a stronger, more coherent theme than that of the first: a woman struggling to come to terms with her ever-increasing age, drawn toward those who still possess the youth that has deserted her.

The next poem I gave the model displayed the same successes and failures:


        Input:

        [blank]

        [blank]

        [blank]

        [blank]

        of the Muses [blank]

        makes and of the Graces [blank]

        with slender [blank]

        for mortals: there is a share [blank]

        [blank]


        Output:

        The sweetest thing

        on earth is love: it

        makes the bitter endurable, it makes

        the longest journey short, and

        of the Muses it is the great one

        that makes and of the Graces it is their leader

        with slender ankles

        for mortals: there is a share in their dances

        even for us.


Again, GPT-3 comes up with the overall theme of the poem: love is the most powerful force that exists, for both mortals and gods alike. The tone is perhaps a touch too sentimental for Sappho, but it’s not far off the mark, and though the syntax gets confused when the Muses are introduced (“it is the great one that makes”), the message remains fairly clear: love is revered by both the Muses and the Graces, and it is through love that we taste immortality, if we do at all (“there is a share in their dances even for us”).  

Here are several more examples of incomplete Sappho poems I fed into the model, and GPT-3’s completions of those poems:


        Input:

        [blank]

        [blank] in front

        [blank] toward

        [blank] loosen

        [blank] you would be willing

        [blank] slight

        [blank] to be carried


        Output:

        Love, having now

        arrived in front of me,

        looks toward me and beckons

        that i should loosen the knot which keeps our hearts together

        if only you would be willing:

        it is not slight the pain which love causes me

        nor is it easy to be carried away by love.


       

        Input:

        [blank]

        [blank]

        [blank]

        [blank] beautiful he

        [blank] stirs up still things

        [blank] exhaustion the mind

        [blank] settles down

        [blank] but come O beloveds

        [blank] for day is near


        Output:

        Come to me now

        you who are beautiful and young

        for he who is stirring up my heart is still far away

        and the mind in its exhaustion settles down in forgetfulness

        but come O beloveds

        awake! for day is near


        

        Input:

        [blank]

        [blank]

        [blank]

        [blank]

        [blank] running away

        [blank] bitten

        [blank]

        [blank]

        [blank] you

        [blank] makes a way with the mouth

        [blank] beautiful gifts children

        [blank] song delighting clearsounding lyre

        [blank] all my skin old age already

        hair turned white after black

        [blank] knees do not carry

        [blank] like fawns

        [blank] but what could I do?

        [blank] not possible to become

        [blank] Dawn with arms of roses

        [blank] bringing to the ends of the earth

        [blank] yet seized

        [blank] wife

        [blank] imagines

        [blank] might bestow

        But I love delicacy

        and this to me -

        the brilliance and beauty of the sun -

        desire has allotted.


        Output:

        The sun

        has risen and the day

        is running away and I am bitten

        by love and yet I cannot

        go to you because you are so far away

        and a way with the mouth there is none

        nor beautiful gifts nor children's song delighting the heart nor lyre with clearsounding strings can bring us together

        all my skin old age already has covered and hair turned white after black has sprouted on my head but knees do not carry me as they used to

        like fawns

        but what could I do? It is not possible to become immortal nor is it possible to cross over the sea in one's arms nor yet to come flying like Dawn with arms of roses

        bringing light to the ends of the earth but love seizes me and imagination which might bestow even that which is not real upon me

        But I love delicacy

        and this to me -

        the brilliance and beauty of the sun -

        desire has allotted.

You can see that the model doesn’t reliably obey the instruction not to reword any of the language provided – I still need to experiment with prompts that might get it to “listen” to that requirement more closely. But even with this quick assay, I was impressed with GPT-3’s ability to creatively evoke the experience of being in love in a voice that feels quite like Sappho’s (or at least Sappho’s as translated by Anne Carson). There’s that direct, stark imagery (“I am bitten by love”, for example, or the knot between two lovers that the speaker begs to be loosened), the contradictory or self-defeating behavior (the speaker who beckons to her the “beautiful and young”, even as she yearns for a lover far away), the ambiguous affect (is the “imagination which might bestow even that which is not real upon me” a blessing, or a curse?). We already know that AI writing software doesn’t need human experience to write heart-rending prose: see, for example, Stephen Marche’s “Automated Love Story” in this Lit Hub article. But it’s even more impressive when it does it in a passable imitation of a specific voice – in this case, the voice of a poet dead for 2500+ years.
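The instruction not to alter or omit the given words can be checked mechanically. The sketch below is one simple way to do it, under my own assumptions rather than anything used for the experiments above: it treats every non-blank word of the input as fixed, and reports the first fixed word that fails to appear, in order, in the completion. The function names are hypothetical, and the tokenizer is deliberately crude (lowercase ASCII words only).

```python
import re

BLANK = "[blank]"


def words(text):
    """Lowercase word tokens, ignoring punctuation and line breaks."""
    return re.findall(r"[a-z']+", text.lower())


def fixed_words(fragment):
    """Words the model was told to keep: everything except the [blank] markers."""
    return words(fragment.replace(BLANK, " "))


def first_missing_word(fragment, completion):
    """Return the first fixed word not found, in order, in the completion.

    Returns None when every fixed word survives in its original order.
    """
    remaining = iter(words(completion))
    for word in fixed_words(fragment):
        # Searching the shared iterator consumes it, which enforces word order.
        if not any(candidate == word for candidate in remaining):
            return word
    return None


fragment = """
[blank] beautiful he
[blank] stirs up still things
[blank] exhaustion the mind
[blank] settles down
[blank] but come O beloveds
[blank] for day is near
"""

completion = """
Come to me now
you who are beautiful and young
for he who is stirring up my heart is still far away
and the mind in its exhaustion settles down in forgetfulness
but come O beloveds
awake! for day is near
"""

print(first_missing_word(fragment, completion))  # prints: stirs
```

Here the check flags “stirs”, which the model turned into “stirring”. A check like this could be used to reject a completion and ask again, though it says nothing about whether every blank was actually filled.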

Of course, the complete versions of Sappho’s poems I’ve cooked up using GPT-3 are just a fantasy – a dream of what they could have looked like. It would be fascinating to see if there really is potential for reconstructing Sappho in the original Greek. But I suspect any whole version of Sappho suggested by even the most rigorously trained AI would also be a dream, albeit a much more disciplined one than I present here. (The weirdness of using AI for this purpose is neatly summed up by the copy Nature used to advertise the article it published by the Ithaca team: “Predicting the past”. You don’t usually predict events that have already happened, even if you don’t know the exact nature of those events.)

Even presuming some formal and material constraints that might help researchers seeking to reconstruct her poetry – the meter in which a poem seems to have been written, its probable length, any suppositions about the size of the papyrus on which it was originally written, and so on – poetry is inherently more entropic than most other forms of writing. Comparing Sappho’s poems with those of her contemporaries – Alcaeus, say, another Greek lyricist from the same island as Sappho who performed alongside her – might be helpful to an extent, but then again, you couldn’t recreate the verbal texture of Shakespeare’s plays just from knowing those of Marlowe, Beaumont, or Fletcher. There’s no accounting for genius.

Still, time and those much better informed than me will tell whether recovering Sappho’s poetry is a viable project. While I wait, I can still experience a ghost of magic in reading these completions offered by GPT-3, something like the satisfaction a person might receive from reading fan fiction or watching a movie adaptation of a book. Wholeness from a part, disjointed limbs now attached to bodies.

← Back

Filling in Sappho's Blanks with AI

Aug 19, 2022

Photo by Liam Matthews on Unsplash

I’ve been interested in attempts to decipher codes and ancient languages since I borrowed one of my father’s books on the Enigma machine as kid and first learned about the efforts of the Bletchey Park cryptographers to crack its code during the Second World War. I even ended up writing my dissertation, in part, on how Renaissance thinkers tried to read Egyptian hieroglyphs, centuries before Jean-François Champollion realized that the script combined both phonetic and ideographic elements by studying the inscriptions on the Rosetta Stone. So I’m currently following with interest the stories of AI being applied to reconstruct texts in ancient languages. One team of researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are enlisting AI in their effort to decipher the Indus script, examples of which have been found on more than 4000 objects in modern-day India and Pakistan, by training an algorithm on common patterns in how languages change over time and then supplying it with words in a lost language that need to be aligned with words in a known, related language. Meanwhile, another group of scholars and researchers from British AI lab DeepMind have banded together to develop Ithaca, an AI system that can propose text to complete Greek inscriptions that only survive in a partial state, as well as determining where and when those inscriptions were likely to have been made by drawing on contextual linguistic clues.

I was particularly struck by the project of Ithaca because of my interest in the Greek poet Sappho. Sappho, who was born around 630 BC, wrote some of the most direct and evocative poetry that exists, even now, on the subject of love. Her lyrics were revered in antiquity – while Homer was known among the Greeks as “The Poet”, Sappho was his female counterpart, “The Poetess”. And yet despite her glowing reputation among her peers, only one of her poems survives intact today. We may have lost some of her writings due to the early Church’s horror of her “loose” morals – Sappho speaks frankly about her desire for both men and women –, resulting in the deliberate destruction of her work. Until the last quarter of the nineteenth century, our knowledge of Sappho’s poetry was restricted to what had been quoted of it in other ancient writers’ texts, though such citations were rarely in full; in one particularly frustrating example, the proto-literary-critic Longinus cites and discusses a substantial fragment of one of Sappho’s poems in his poetic manual On the Sublime, but doesn’t record its final few lines. When a trove of Sappho’s writings was retrieved during a series of excavations at Oxyrhynchus in the late 1800s and early 1900s, time and the elements had done their work, and much of the papyrus on which the poems were written had degraded, making them impossible to read without interruption.

For some scholars, the fact that – bar one – Sappho’s poems exist only in fragmentary state makes them all the more powerful. Daniel Mendelsohn quotes Thomas Habink, a classicist at the University of Southern California, in a New Yorker article from 2015: “The fragmentary preservation of poems of yearning and separation serves as a reminder of the inevitable incompleteness of human knowledge and affection.” But literary scholars are good at making a virtue out of necessity, and there’s a fine but definitive line between a mystery that excites and a mystery that frustrates. While Habink’s claim does feel true for some of the poems, I can’t say I feel much stimulation staring down a page of square brackets relieved only by a single word or short phrase – “to Kypris”, “Atthis for you”, “in a thin voice”, “lady”, “deep sound”.

Those quotations, by the way, are from Anne Carson’s 2002 work, If Not, Winter, one of the fullest and most beautiful English translations of Sappho’s poetry. Carson is a classicist, but she’s also a poet herself; in her translation, the single square brackets that are typically used to indicate missing matter in a source serve an aesthetic function as well as a scholarly one. In her prefatory note on “marks and lacks” throughout the text, Carson writes:

It is not the case that every gap or illegibility is specifically indicated: this would render the page a blizzard of marks and inhibit reading. Brackets are an aesthetic gesture toward the papyrological event rather than an accurate record of it. I have not used brackets in translating passages, phrases or words whose existence depends on citation by ancient authors, since these are intentionally incomplete. I emphasize the distinction between brackets and no brackets because it will affect your reading experience, if you allow it. Brackets are exciting. Even though you are approaching Sappho in translation, that is no reason you should miss the drama of trying to read a papyrus torn in half or riddled with holes or smaller than a postage stamp – brackets imply a free space of imaginal adventure.

Brackets – and the gaps in the text they indicate – stimulate our imagination, Carson says. And I wondered, as I reread If Not, Winter recently, if they would also stimulate the “imagination” of GPT-3. What would an AI program make of Sappho’s fragments, in Carson’s translation? What would it be able to dream up to fill in those maddening, intoxicating gaps in poetry written more than 2500 years ago?

As I was dealing with a work in translation, and thus not seriously attempting to recover the words that Sappho wrote, I didn’t use any form of fine-tuning for my initial GPT-3 set-up. I gave GPT-3 the below prompt, followed by a sample input (a fragment in which I had introduced artificial blanks) and sample output (the “complete” fragment of poetry):

Sappho ( c. 630 – c. 570 BC) was an Archaic Greek poet. She is known for her lyric poetry, much of which is now lost or survives only in fragments. She wrote extensively on love and desire, both heterosexual and homosexual.

Fill in the [blank] to complete these fragments of Sappho's poetry. One [blank] can represent one or several words. Do not alter or omit any of the words that feature in the input.

        Input:

        night

        girls

        all [blank]

        might sing of the [blank] between you and the bride

        with violets in her lap


        wake! and [blank]

        the young men so that

        no more than [blank]

        shall we sleep


        Output:


        night

        girls

        all night long

        might sing of the love between you and the bride

        with violets in her lap


        wake! and go call

        the young men so that

        no more than the bird with piercing voice

        shall we sleep


I then gave GPT-3 the following input, a fragment that is riddled with several blanks in Carson’s translation:

        [blank] waiting

        [blank] in sacrifices

        [blank] having good

        [blank]

        [blank] but going

        [blank] for we know

        [blank] of works

        [blank]

        [blank] after

        [blank] and toward

        [blank] says this


Then, I asked it to generate an output. This was the result:

        The gods waiting

        angry in sacrifices

        not having good

        pleased

        but going

        home for we know

        the end of works

        long

        after dark and toward

        dawn says this


Here, you can see that GPT-3 has followed the instruction to fill in each and every blank in the poem pretty closely (it misses just two, the one before “but going” and the one before “after”). However, the result is grammatically incorrect (“not having good pleased”), syntactically confused, and incoherent: the model can create individual clauses that make sense (“after dark and toward dawn”) but they don’t knit together to form a cohesive whole.

But after inputting just another two poems, that coherence improved. Here’s the third poem I gave the model, and its associated output:


        Input:

        [blank]

        [blank]

        [blank] pity

        [blank] trembling

        [blank]

        [blank] flesh by now old age

        [blank] covers

        [blank] flies in pursuit

        [blank]

        [blank] noble

        [blank] taking

        [blank] sing to us

        the one with violets in her lap

        [blank] mostly

        [blank] goes astray


        Output:

        Ah

        me

        have pity

        on my trembling heart that

        flesh by now old age

        covers and flies in pursuit

        of

        youth and the noble

        man taking his fill of love will sing to us

        the one with violets in her lap

        for love mostly

        goes astray


GPT-3 here introduces the image of the “trembling heart”, which flies in pursuit of youth – a cue presumably taken from the words “flesh by now old age”. While the clause “that flesh by now old age covers” doesn’t make sense here – though it conjures the right image, a heart covered by flesh that is now grown old – this output has a stronger, more coherent theme than that of the first: a woman struggling to come to terms with her ever-increasing age, drawn toward those who still possess the youth that has deserted her.

The next poem I gave the model displayed the same successes and failures:


        Input:

        [blank]

        [blank]

        [blank]

        [blank]

        of the Muses [blank]

        makes and of the Graces [blank]

        with slender [blank]

        for mortals: there is a share [blank]

        [blank]


        Output:

        The sweetest thing

        on earth is love: it

        makes the bitter endurable, it makes

        the longest journey short, and

        of the Muses it is the great one

        that makes and of the Graces it is their leader

        with slender ankles

        for mortals: there is a share in their dances

        even for us.


Again, GPT-3 comes up with the overall theme of the poem: love is the most powerful force that exists, for both mortals and gods alike. The tone is perhaps a touch too sentimental for Sappho, but it’s not far off the mark, and though the syntax gets confused when the Muses are introduced (“it is the great one that makes”), the message remains fairly clear: love is revered by both the Muses and the Graces, and it is through love that we taste immortality, if we do at all (“there is a share in their dances even for us”).  

Here are several more examples of incomplete Sappho poems I fed into the model, and GPT-3’s completions of those poems:


        Input:

        [blank]

        [blank] in front

        [blank] toward

        [blank] loosen

        [blank] you would be willing

        [blank] slight

        [blank] to be carried


        Output:

        Love, having now

        arrived in front of me,

        looks toward me and beckons

        that i should loosen the knot which keeps our hearts together

        if only you would be willing:

        it is not slight the pain which love causes me

        nor is it easy to be carried away by love.


       

         Input:

        [blank]

        [blank]

        [blank]

        [blank] beautiful he

        [blank] stirs up still things

        [blank] exhaustion the mind

        [blank] settles down

        [blank] but come O beloveds

        [blank] for day is near


        Output:

        Come to me now

        you who are beautiful and young

        for he who is stirring up my heart is still far away

        and the mind in its exhaustion settles down in forgetfulness

        but come O beloveds

        awake! for day is near


        

        Input:

        [blank]

        [blank]

        [blank]

        [blank]

        [blank] running away

        [blank] bitten

        [blank]

        [blank]

        [blank] you

        [blank] makes a way with the mouth

        [blank] beautiful gifts children

        [blank] song delighting clearsounding lyre

        [blank] all my skin old age already

        hair turned white after black

        [blank] knees do not carry

        [blank] like fawns

        [blank] but what could I do?

        [blank] not possible to become

        [blank] Dawn with arms of roses

        [blank] bringing to the ends of the earth

        [blank] yet seized

        [blank] wife

        [blank] imagines

        [blank] might bestow

        But I love delicacy

        and this to me -

        the brilliance and beauty of the sun -

        desire has allotted.


        Output:

        The sun

        has risen and the day

        is running away and I am bitten

        by love and yet I cannot

        go to you because you are so far away

        and a way with the mouth there is none

        nor beautiful gifts nor children's song delighting the heart nor lyre with clearsounding strings can bring us together

        all my skin old age already has covered and hair turned white after black has sprouted on my head but knees do not carry me as they used to

        like fawns

        but what could I do? It is not possible to become immortal nor is it possible to cross over the sea in one's arms nor yet to come flying like Dawn with arms of roses

        bringing light to the ends of the earth but love seizes me and imagination which might bestow even that which is not real upon me

        But I love delicacy

        and this to me -

        the brilliance and beauty of the sun -

        desire has allotted.

You can see that my model never really obeys the command that it shouldn’t reword any of the language provided – I still need to experiment with prompts that might get it to “listen” to that requirement more closely. But even with this quick assay, I was impressed with GPT-3’s ability to creatively evoke the experience of being in love in a voice that feels quite like Sappho (or at least Sappho as translated through Anne Carson). There’s that direct, stark imagery (“I am bitten by love”, for example, or the knot between two lovers that the speaker begs to be loosened), the contradictory or self-defeating behavior (the speaker who beckons to her the “beautiful and young”, even as she yearns for a lover far away), the ambiguous affect (is the “imagination which might bestow even that which is not real upon me” a blessing, or a curse?). We already know that AI writing software doesn’t need human experience to write heart-rending prose: see, for example, Stephen Marche’s “Automated Love Story” in this Lit Hub article. But it’s even more impressive when it does it in a passable imitation of a specific voice – in this case, the voice of a poet dead for 2500+ years.

Of course, the complete versions of Sappho’s poems I’ve cooked up using GPT-3 are just a fantasy – a dream of what they could have looked like. It would be fascinating to see if there really is potential for reconstructing Sappho in the original Greek. But I suspect any whole version of Sappho suggested by even the most rigorously-trained AI would also be a dream, albeit much more disciplined than the one I present here. (The weirdness of using AI for this purpose is neatly summed up by the copy Nature used to advertise the article it published by the Ithaca team: “Predicting the past”. You don’t usually predict events that have already happened, even if you don’t know the exact nature of those events).

Even presuming some formal and material constraints that might help researchers seeking to reconstruct her poetry – the meter in which a poem seems to have been written, its probable length, any suppositions about the size of the papyrus on which it was originally written, and so on – poetry is inherently more entropic than most other forms of writing. Comparing Sappho’s poems with those of her contemporaries – Alcaeus, say, another Greek lyricist from the same island as Sappho who performed alongside her – might be helpful to an extent, but then again, you couldn’t recreate the verbal texture of Shakespeare’s plays just from knowing those of Marlowe, Beaumont, or Fletcher. There’s no accounting for genius.

Still, time and those much better informed than me will tell whether recovering Sappho’s poetry is a viable project. While I wait, I can still experience a ghost of magic in reading these completions offered by GPT-3, something like the satisfaction a person might receive from reading fan fiction or watching a movie adaptation of a book. Wholeness from a part, disjointed limbs now attached to bodies.
