Critical Response #2: “Power to the People / Humans in the Loop”
Generally, when people react to “Artificial Intelligence”, a science-fiction image of a sentient being – Skynet, androids, and the like – prevails. Media such as “Black Mirror” portray AI as a kind of human brain transplant. This framing immediately places humans and AI on parallel, equal footing, which may hinder our understanding of AI. Under this view, the idea of a Big Red Button AI predominates, and our interactions with it become an afterthought. I would like to reflect on the benefits of human-in-the-loop design from Ge Wang’s article “Humans in the Loop: The Design of Interactive AI Systems”.
Why do we continue to raise the dead to demonstrate AI? We seem to exploit well-known artists’ established symbolism and historical influence to generate skeuomorphic outcomes – as in the examples of Deep Cat, Honeymoon in Hell?, Beethoven’s 10th Symphony, and the Bach chorales. The appeal lies in economy of demonstration and a false sense of equivalence with past mastery. However, as Wang notes in the article, such style transfer misses the semantics of both the artist and the audience. It also offers no significant gains in transparency, nor does it incorporate human preference and agency.
Can AI generate more than a style transfer – perhaps a reinterpretation? Volodos’ take on Mozart’s Turkish March is frequently played by pianists; it takes the motif and presents it in a grand, virtuosic way. Volodos’ reinterpretation carries clear intention throughout the piece. Would an AI be able to reinterpret? Certainly no human being can reinterpret without accumulated data and experience. However, there is something beyond data: humans can translate longing for their home country into music, or metaphorize falling and landing into life’s struggles and resilience. This kind of recontextualization at a fundamental level may be what current AI lacks.
Perhaps this is where human-in-the-loop design may be most effective: putting something more than data into the loop, such as reinterpretation, intention, meaning, and emotion. The three design principles (value human agency, granularity is a virtue, and interfaces should extend us) resonate greatly in this sense. Humans retain primary control – like in an automatic vehicle: the gear changes are automated, yet the rest of the control remains with the human. Granularity allows human nuance to enter, making the interaction richer. Lastly, an extended interface provides embodiment and room for expression.
When designing with AI, I struggled to locate the North Star, as the AI often disagreed with my navigation toward it. Perhaps the critical thinking I should heed before deciding on a North Star lies in the two principles (7.11A and 7.11B). What can be automated? What is more than data, and how can I intervene? What is my role in this? One more question I would add: why is this automation needed at all?
<List of AI-supported activities>
1. Transcription supporter – a system that suggests chord progressions based on a given input, from which users can select.
2. Reharmonization supporter – a system that suggests chord progressions, from conventional to unusual, based on the given context, from which users can select.
3. Hand-gesture customization in VR – simple gestures like a pinch or wave can be further customized to the user, and users can intervene during the training phase.
4. Video game selection supporter – based not on genre but on emotional fulfillment (triumphant, challenging, therapeutic, flow-inducing, energetic); if the user disagrees with a suggestion, they can provide fine-tuning words.
5. Activity suggestion and tracker – starting from generalized data on emotional states before and after an activity, make suggestions, compare them against the user’s actual emotional outcome, and further customize the set of suggestions.
6. Piano trio AI – similar to iRealB Pro, where piano, drums, and bass accompany based on a score and tempo; have AI-generated instruments accompany, with user input and corrections on their styles.
7. Customized bike gear change – using user data, the bike automatically shifts gears based on user preference and corrections.
8. Customized voice-to-text – users can correct falsely transcribed text arising from accents or frequently used proper nouns.
9. Reassurance supporter – from a dataset of reassuring sentences and words, the user can input their current emotion and receive reassuring quotes provided in the past.
10. Play games with a piano – a system that maps major chords to moving right, minor chords to moving left, trills to jumping, and so on.
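The shared pattern behind ideas 1 and 2 can be sketched in a few lines of code: the system never commits to a single answer, but proposes several candidates and leaves the final choice to the human. The following is a minimal, hypothetical sketch (not from Wang’s article); the toy chord table and function names are my own inventions, standing in for a learned model and an interactive interface.

```python
# Toy "model": common next chords in C major, keyed by the current chord.
# A real system would learn these candidates from data.
NEXT_CHORDS = {
    "C":  ["F", "G", "Am", "Dm"],
    "F":  ["G", "C", "Dm"],
    "G":  ["C", "Am", "Em"],
    "Am": ["F", "Dm", "G"],
    "Dm": ["G", "F"],
}

def suggest(current_chord, k=3):
    """Machine step: return up to k candidate chords, never one fixed answer."""
    return NEXT_CHORDS.get(current_chord, ["C"])[:k]

def harmonize(start, picks):
    """The loop: the model suggests, the human picks. Here picks[i] is an
    index into the suggestion list, standing in for an interactive choice."""
    progression = [start]
    for pick in picks:
        options = suggest(progression[-1])
        progression.append(options[pick % len(options)])  # human agency
    return progression

print(harmonize("C", [1, 0, 0]))  # ['C', 'G', 'C', 'F']
```

Keeping the selection step granular (one chord at a time, several options each time) is what lets human nuance steer the outcome, in the spirit of the design principles above.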