2023-12-25 Calling LLM stream APIs with Spring Boot WebFlux

2024-01-11

Summary of calling Google Gemini
Getting the Gemini response as a Mono works fine, but three problems came up with the stream call:

  1. If the JSON returned by .bodyToFlux(String.class) contains newline characters, it cannot be parsed.
    Solution: use .bodyToFlux(GeminiResponse.class) directly; the decoder then ignores the newlines and hands you the result objects, as sketched below.
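A minimal sketch of that stream call. The v1beta streamGenerateContent path, the gemini-pro model name, the key query parameter, and the webClient/apiKey/geminiRequest variables are assumptions based on the public Generative Language API (with the client's baseUrl pointing at generativelanguage.googleapis.com); GeminiRequest and GeminiResponse are the entities shared further below.

Flux<GeminiResponse> responses = webClient.post()
        .uri("/v1beta/models/gemini-pro:streamGenerateContent?key=" + apiKey)
        .contentType(MediaType.APPLICATION_JSON)
        .bodyValue(geminiRequest)
        .retrieve()
        // binding straight to GeminiResponse lets the JSON decoder handle the chunked array, newlines included
        .bodyToFlux(GeminiResponse.class);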
  2. The Gemini stream gives no way to tell whether an element is the last one.
    To stay compatible with the earlier OpenAI interface, the service has to actively send the client one extra message marking the end once the call finishes.
Flux<CustomResponse> results = webClient.post()...;
CustomResponse last = new CustomResponse();
return results.collectList()
        .flatMapIterable(list -> {
            list.add(last);
            return list;
        });
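Note that collectList() waits for the whole stream before re-emitting it. If the responses should keep streaming and the marker should simply arrive last, a sketch with the same assumed CustomResponse is:

CustomResponse last = new CustomResponse();
return results.concatWith(Mono.just(last)); // the marker is emitted only after the upstream Flux completes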
  3. Gemini requires that consecutive messages with the same role in the chat list be merged into a single Content.
List<Content> contents = messageList.stream().map...;
List<Content> mergeContents = new ArrayList<>();
contents.forEach(content -> {
    if (mergeContents.isEmpty()) {
        mergeContents.add(content);
    } else {
        int lastIndex = mergeContents.size() - 1;
        Content lastContent = mergeContents.get(lastIndex);
        if (lastContent.getRole().equals(content.getRole())) {
            // same role as the previous entry: fold the parts into that Content
            lastContent.getParts().addAll(content.getParts());
        } else {
            mergeContents.add(content);
        }
    }
});
request.setContents(mergeContents);

Since Gemini does not have a Java SDK yet, here are the corresponding entity classes.

@NoArgsConstructor
@Data
public class GeminiRequest {
    private List<Content> contents;
    private List<SafetySetting> safetySettings;
    private GenerationConfig generationConfig;
}
@NoArgsConstructor
@Data
public class Content {
    private GeminiRole role;
    private List<Part> parts;
}
@NoArgsConstructor
@Data
public class Part {

    private String text;
    private InlineData inline_data;

    @NoArgsConstructor
    @Data
    public static class InlineData {
     /**
       * mime_type : image/jpeg
       * data : '$(base64 -w0 image.jpg)'
       */
        private String mime_type;
        private String data;
    }
}
public enum GeminiRole {
    user, model
}

@Data
@NoArgsConstructor
public class SafetySetting {
    private HarmCategory category;
    private HarmBlockThreshold threshold;
}
public enum HarmCategory {
    HARM_CATEGORY_UNSPECIFIED,
    HARM_CATEGORY_DEROGATORY,
    HARM_CATEGORY_TOXICITY,
    HARM_CATEGORY_VIOLENCE,
    @Deprecated
    HARM_CATEGORY_SEXUAL,
    HARM_CATEGORY_MEDICAL,
    HARM_CATEGORY_DANGEROUS,
    HARM_CATEGORY_HARASSMENT,
    HARM_CATEGORY_HATE_SPEECH,
    HARM_CATEGORY_SEXUALLY_EXPLICIT,
    HARM_CATEGORY_DANGEROUS_CONTENT
}
public enum HarmBlockThreshold {
    HARM_BLOCK_THRESHOLD_UNSPECIFIED,
    BLOCK_LOW_AND_ABOVE,
    BLOCK_MEDIUM_AND_ABOVE,
    BLOCK_ONLY_HIGH,
    BLOCK_NONE
}
@NoArgsConstructor
@Data
public class GenerationConfig {
    private Double temperature;
    private Integer maxOutputTokens;
    private Double topP;
    private Integer topK;
    private List<String> stopSequences;
}

@Data
public class GeminiResponse {
    private PromptFeedback promptFeedback;
    private List<Candidate> candidates;
}
@Data
public class PromptFeedback {
    public List<SafetyRating> safetyRatings;
}
@Data
public class SafetyRating {
    private String category;
    private String probability;
}
@Data
public class Candidate {
    private Content content;
    private String finishReason;
    private int index;
    private List<SafetyRating> safetyRatings;
}
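A minimal sketch of how these entities fit together when building a request; the values are only illustrative.

GeminiRequest request = new GeminiRequest();

Part part = new Part();
part.setText("hi");
Content userContent = new Content();
userContent.setRole(GeminiRole.user);
userContent.setParts(new ArrayList<>());
userContent.getParts().add(part); // keep parts mutable so same-role messages can be merged later

List<Content> contents = new ArrayList<>();
contents.add(userContent);
request.setContents(contents);

SafetySetting safetySetting = new SafetySetting();
safetySetting.setCategory(HarmCategory.HARM_CATEGORY_HARASSMENT);
safetySetting.setThreshold(HarmBlockThreshold.BLOCK_ONLY_HIGH);
List<SafetySetting> safetySettings = new ArrayList<>();
safetySettings.add(safetySetting);
request.setSafetySettings(safetySettings);

GenerationConfig generationConfig = new GenerationConfig();
generationConfig.setTemperature(0.7);
generationConfig.setMaxOutputTokens(256);
request.setGenerationConfig(generationConfig);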

2023-12-25

Previously, when calling the ChatGPT API with okhttp-sse, handling API errors felt awkward. After switching to WebFlux the code became a lot simpler, so here is the setup.
1. Add the WebFlux dependency

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-webflux</artifactId>
        </dependency>

2. A simple example call

// imports assume Spring Boot 2.4+ (reactor-netty 1.x), plus fastjson and commons-lang3 on the classpath
import com.alibaba.fastjson.JSON;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.netty.http.client.HttpClient;
import reactor.netty.transport.ProxyProvider;

import java.util.ArrayList;
import java.util.List;

@Slf4j
public class WebFluxDemo {
    private final static String host = "https://api.openai.com/";
    private final static String uri = "v1/chat/completions";

    public static void main(String[] args) throws InterruptedException {
        //proxy
        HttpClient httpClient = HttpClient.create()
                .proxy(proxy -> proxy.type(ProxyProvider.Proxy.HTTP)
                        .host("127.0.0.1")
                        .port(1080));

        ReactorClientHttpConnector connector = new ReactorClientHttpConnector(httpClient);
        //default build
        WebClient build = WebClient.builder()
                .baseUrl(host)
                .clientConnector(connector)
                .defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
                .build();
        //build request
        OpenAiRequest openAiRequest = new OpenAiRequest();
        openAiRequest.stream = true;
        openAiRequest.setTemperature(0.7);
        openAiRequest.setModel("gpt-3.5-turbo-16k");
        openAiRequest.setMax_tokens(256);
        List<GptMessage> message = new ArrayList<>();
        GptMessage gptMessage = new GptMessage();
        gptMessage.setRole("system");
        gptMessage.setContent("You will role-play as a Female named 'lily'\\nCharacter information:Romantic,Flirty,Lover\\nCharacter background:U are most beautiful girl in school and like me");

        message.add(gptMessage);
        GptMessage question = new GptMessage();
        question.setRole("user");
        question.setContent("hi");
        message.add(question);
        openAiRequest.setMessages(message);
        //send post
        Flux<String> response = build.post()
                .uri(uri)
                .contentType(MediaType.APPLICATION_JSON)
                .header("Authorization", "Bearer " + "your token")
                .bodyValue(JSON.toJSONString(openAiRequest))
                .retrieve()
                .bodyToFlux(String.class)
                .map(result -> {
                    //build response
                    if (StringUtils.isEmpty(result)) {
                        return "";
                    } else if (result.equals(SseData.DONE.value())) {
                        return result;
                    } else {
                        OpenAISteamResponse openAISteamResponse = JSON.parseObject(result, OpenAISteamResponse.class);
                        String content = openAISteamResponse.choices.get(0).delta.getContent();
                        if (StringUtils.isNotEmpty(content)) {
                            return content;
                        } else {
                            return "";
                        }
                    }
                });
        //subscribe
        response.subscribe((content) -> log.info("content [{}]", content));
        //keep alive
        while (true) {
            Thread.sleep(1000);
        }
    }
}

3. Appendix
3.1 When OpenAI fails unexpectedly (rate limiting, exhausted quota, etc.), the call can fall back to another provider

.onErrorResume(e -> {
    log.error("event source failure openai {}", e.getMessage());
    return huggingFaceService.chatStreamFlux(dto);
})
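If a rate limit or billing problem needs to be told apart from other failures before falling back, the exception can be inspected first. This is only a sketch around the same fallback; WebClientResponseException is what retrieve() throws for non-2xx responses.

.onErrorResume(e -> {
    if (e instanceof WebClientResponseException) {
        WebClientResponseException ex = (WebClientResponseException) e;
        // 429 usually means rate limiting, 401/403 point at key or billing issues
        log.error("openai returned {} body {}", ex.getStatusCode(), ex.getResponseBodyAsString());
    }
    return huggingFaceService.chatStreamFlux(dto);
})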

3.2 Flux vs Mono in WebFlux
In one method that returned a Mono, I had OpenAI separate multiple results with \n newline characters, yet the Mono always carried a single element. Debugging showed that the practical difference here comes down to those newlines: bodyToFlux(String.class) splits the body on line breaks and emits one element per line, while bodyToMono(String.class) aggregates the whole body (for example one JSON string) into a single element.
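A minimal sketch of the difference, assuming a hypothetical /demo endpoint that returns the plain-text body "a\nb\nc":

// one element containing the whole body: "a\nb\nc"
Mono<String> whole = webClient.get().uri("/demo").retrieve().bodyToMono(String.class);

// three elements, split on the line breaks: "a", "b", "c"
Flux<String> lines = webClient.get().uri("/demo").retrieve().bodyToFlux(String.class);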
3.3 Entity classes

@Data
public class OpenAiRequest {
    public String model;
    public List<GptMessage> messages;
    public Double temperature;
    public Integer max_tokens;
    public Boolean stream;
    public List<GptFunction> functions;
    public String function_call;
}
@Data
public class GptMessage {
    private String role;
    private String content;
    private String name;
    private FunctionCall function_call;
}

@Data
public class FunctionCall {
    private String name;
    private String arguments;
}
@Data
public class GptFunction {
    private Object parameters;
    private String description;
    private String name;
}
public enum SseData {
    DONE("[DONE]"),
    ;

    private final String value;

    SseData(final String value) {
        this.value = value;
    }

    public String value() {
        return value;
    }
}
@Data
public class OpenAISteamResponse {
    private String id;
    private String object;
    private int created;
    private String model;
    private List<Choice> choices;
}
@Data
public class Choice {
    private int index;
    private GptMessage delta;
    private String finish_reason;
}