1. Text to image
generate a photo of 刻晴
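The result made no sense: the request was passed straight to Stable Diffusion (sd) for image generation, and sd does not recognize 刻晴 (Keqing).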
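One possible workaround, sketched here as an assumption rather than something that was tested in these notes, is to expand names the image model does not know into plain visual descriptions before the prompt reaches it. The alias table, its tag text, and the `expand_prompt` helper are all hypothetical and only illustrate the idea.

```python
# Hypothetical alias table: character names mapped to visual descriptions
# that a general-purpose text-to-image model is more likely to understand.
# The tag text is illustrative, not a tuned prompt.
CHARACTER_TAGS = {
    "刻晴": "Keqing from Genshin Impact, anime girl, purple hair, twin tails",
}


def expand_prompt(prompt: str, aliases: dict[str, str] = CHARACTER_TAGS) -> str:
    """Replace each known name in the prompt with its descriptive tags."""
    for name, tags in aliases.items():
        prompt = prompt.replace(name, tags)
    return prompt


print(expand_prompt("generate a photo of 刻晴"))
```

Prompts without a known name pass through unchanged, so the helper could sit in front of every text-to-image request.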
generate a photo of a cute anime girl who wears jk
The model seems to have misunderstood the task: besides the image, it also generated audio describing the image.
Please show me the full body photo of this girl
2. Image to text
please describe the picture https://www.shutterstock.com/image-illustration/set-handdrawn-watercolor-birds-macaw-600w-1742478266.jpg
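The URL I gave points to a picture of parrots:
The model can be seen taking many (unnecessary) steps:
In the end it described the picture with audio: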
3. Video generation (not possible)
It cannot generate video; it can only produce an image plus audio.
generate a video of a cute anime girl who wears jk and has white hair saying good morning to me
4. Speech generation
generate a photo of a cute anime girl who wears jk and a video where the girl says good morning to me
generate a photo of a cute anime girl who wears jk and has white hair. Based on the generated photo, please generate a video and audio that the girl says “good morning” to me
Please describe /images/1bd3.jpg with women’s voice
5. Speech recognition
generate an audio of a woman saying some funny jokes. After that, transcribe the audio into text
transcribe the audio file "/audios/13cf.wav" into text
This fails with an error every time.
Please generate subtitles for this youtube video https://www.youtube.com/watch?v=3_5FRLYS-2A
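For the recurring transcription errors on the local WAV file, one thing worth ruling out is the file itself. The sketch below is an assumption about how to debug this, not a confirmed cause: it uses only Python's standard `wave` module to check that a path exists and parses as a non-empty WAV file. The `check_wav` helper is hypothetical, and it has to run on the machine where the agent actually reads the file.

```python
import os
import wave


def check_wav(path: str) -> str:
    """Return a short diagnosis of whether `path` is a readable WAV file."""
    if not os.path.isfile(path):
        return "missing"
    try:
        with wave.open(path, "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
    except (wave.Error, EOFError):
        # wave.Error: bad header or unsupported format; EOFError: truncated file.
        return "not a valid WAV file"
    if frames == 0:
        return "empty"
    return f"ok: {frames / rate:.1f}s at {rate} Hz"


# Path taken from the prompt above; adjust it to where the agent stores audio.
print(check_wav("/audios/13cf.wav"))
```

If the file checks out, the error more likely comes from the agent's task planning or the speech-recognition model than from the input.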
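6. Multi-turn conversation
First ask the AI to describe a picture, then ask it to change the color of the bird in the picture to red.
In this multi-turn attempt, asking the AI to modify the image from the previous turn raises an error; the cause is unclear.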
First ask the AI to describe an image from the file system, then ask: What’s the color of the cat
This time there is no error, although the answer is wrong.