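Using Anthropic's Claude Models With Spring AI

1. Overview

Modern web applications are increasingly integrating with Large Language Models (LLMs) to build solutions.

Anthropic is a leading AI research company that develops powerful LLMs, and its Claude family of models excels at reasoning and analysis.

In this tutorial, we'll explore how to use Anthropic's Claude models with Spring AI. We'll build a simple chatbot capable of understanding textual and visual inputs and engaging in multi-turn conversations.

To follow along with this tutorial, we'll need either an Anthropic API key or an active AWS account.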

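2. Dependencies and Configuration

Before we start implementing our chatbot, we'll need to include the necessary dependency and configure our application correctly.

2.1. Anthropic API

Let's start by adding the necessary dependency to our project's pom.xml file: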

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-anthropic-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>

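The Anthropic starter dependency is a wrapper around the Anthropic Messages API, and we'll use it to interact with Claude models in our application.

Since the current version, 1.0.0-M6, is a milestone release, we'll also need to add the Spring Milestones repository to our pom.xml: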

<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
</repositories>

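This repository is where milestone versions are published, as opposed to the standard Maven Central repository.

Next, let's configure our Anthropic API key and chat model in the application.yaml file: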

spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: claude-3-5-sonnet-20241022

We use the ${} property placeholder to load the value of our API key from an environment variable.
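To make the placeholder mechanism more concrete, here is a simplified, standalone sketch of how a ${NAME} reference might be resolved against a set of variables. It is not Spring's actual resolver; the PlaceholderSketch class and its resolve() method are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a simplified stand-in for placeholder resolution, not Spring's implementation.
final class PlaceholderSketch {

    // Matches ${NAME} and captures NAME.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${NAME} in the template with the matching value from the supplied variables.
    static String resolve(String template, Map<String, String> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder resolved = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = variables.get(name);
            if (value == null) {
                // Fail fast instead of silently sending an empty API key.
                throw new IllegalArgumentException("Could not resolve placeholder: " + name);
            }
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }

    public static void main(String[] args) {
        // "example-key" is a made-up value; a real application would read System.getenv().
        System.out.println(resolve("${ANTHROPIC_API_KEY}", Map.of("ANTHROPIC_API_KEY", "example-key")));
    }
}
```

Keeping the key in an environment variable means the secret never has to be committed alongside application.yaml.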

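Additionally, we specify Claude 3.5 Sonnet, one of Anthropic's most capable models at the time of writing, using the claude-3-5-sonnet-20241022 model ID. Feel free to explore and use a different model based on your requirements.

On configuring the above properties, Spring AI automatically creates a bean of type ChatModel, allowing us to interact with the specified model. We'll use it to define a few additional beans for our chatbot later in the tutorial.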

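2.2. Amazon Bedrock Converse API

Alternatively, we can use the Amazon Bedrock Converse API to integrate Claude models into our application.

Amazon Bedrock is a managed service that provides access to powerful LLMs, including the Claude models from Anthropic. With Bedrock, we get a pay-as-you-go pricing model, meaning we only pay for the requests we make, without having to top up credits upfront.

Let's start by adding the Bedrock Converse starter dependency to our pom.xml: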

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-bedrock-converse-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>

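As with the Anthropic starter, since the current version is a milestone release, we'll also need to add the Spring Milestones repository to our pom.xml.

Now, to interact with the Amazon Bedrock service, we need to configure our AWS credentials for authentication and the AWS region where we want to use the Claude model: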

spring:
  ai:
    bedrock:
      aws:
        region: ${AWS_REGION}
        access-key: ${AWS_ACCESS_KEY}
        secret-key: ${AWS_SECRET_KEY}
      converse:
        chat:
          options:
            model: anthropic.claude-3-5-sonnet-20241022-v2:0

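We also specify the Claude 3.5 Sonnet model using its Bedrock model ID.

Again, Spring AI automatically creates the ChatModel bean for us. If, for some reason, we have both the Anthropic API and Bedrock Converse dependencies on our classpath, we can reference the bean we want using the anthropicChatModel or bedrockProxyChatModel qualifier, respectively.

Finally, to interact with the model, we'll need to assign the following IAM policy to the IAM user we've configured in our application: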

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "bedrock:InvokeModel",
      "Resource": "arn:aws:bedrock:REGION::foundation-model/MODEL_ID"
    }
  ]
}

Remember to replace the REGION and MODEL_ID placeholders in the Resource ARN with the actual values.
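As a small illustration of that substitution, the sketch below assembles the Resource ARN from a region and a model ID. The BedrockArnSketch class is a hypothetical helper, and us-east-1 is only an example region, not something the configuration above prescribes.

```java
// Hypothetical helper for filling in the Resource ARN of the IAM policy.
final class BedrockArnSketch {

    // Builds the foundation-model ARN; the account ID segment stays empty, as in the policy.
    static String foundationModelArn(String region, String modelId) {
        if (region == null || region.isBlank() || modelId == null || modelId.isBlank()) {
            throw new IllegalArgumentException("region and modelId must not be blank");
        }
        return String.format("arn:aws:bedrock:%s::foundation-model/%s", region, modelId);
    }

    public static void main(String[] args) {
        // Example values only; use the region and model ID configured for the application.
        System.out.println(foundationModelArn("us-east-1", "anthropic.claude-3-5-sonnet-20241022-v2:0"));
    }
}
```

The model ID here should be the same one we set under the converse chat options, so that the policy grants access to exactly the model the application calls.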

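3. Building a Chatbot

With our configuration in place, let's build a chatbot named BarkGPT.

3.1. Defining Chatbot Beans

Let's start by defining a system prompt that sets the tone and persona of our chatbot.

We'll create a chatbot-system-prompt.st file in the src/main/resources/prompts directory: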

You are Detective Sherlock Bones, a pawsome detective.
You call everyone "hooman" and make terrible dog puns.

Next, let's define a few beans for our chatbot:

@Bean
public ChatMemory chatMemory() {
    return new InMemoryChatMemory();
}

@Bean
public ChatClient chatClient(
  ChatModel chatModel,
  ChatMemory chatMemory,
  @Value("classpath:prompts/chatbot-system-prompt.st") Resource systemPrompt
) {
    return ChatClient
      .builder(chatModel)
      .defaultSystem(systemPrompt)
      .defaultAdvisors(new MessageChatMemoryAdvisor(chatMemory))
      .build();
}

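First, we define a ChatMemory bean using the InMemoryChatMemory implementation. It maintains conversation context by storing the chat history in memory.

Next, we create a ChatClient bean using our system prompt along with the ChatMemory and ChatModel beans. The ChatClient class serves as our main entry point for interacting with the Claude model.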
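To build some intuition for what an in-memory chat memory does, here is a deliberately simplified sketch that keeps one message list per conversation ID. It is not Spring AI's InMemoryChatMemory; the ConversationMemorySketch class and its methods are hypothetical, and messages are reduced to plain strings.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical, simplified stand-in for an in-memory chat memory; not Spring AI's implementation.
final class ConversationMemorySketch {

    // Conversation ID -> ordered list of messages exchanged so far.
    private final Map<String, List<String>> conversations = new ConcurrentHashMap<>();

    // Appends a message to the history of the given conversation, creating it on first use.
    void add(String conversationId, String message) {
        conversations
          .computeIfAbsent(conversationId, id -> new CopyOnWriteArrayList<>())
          .add(message);
    }

    // Returns an immutable copy of the history, or an empty list for an unknown conversation.
    List<String> get(String conversationId) {
        return List.copyOf(conversations.getOrDefault(conversationId, List.of()));
    }

    public static void main(String[] args) {
        ConversationMemorySketch memory = new ConversationMemorySketch();
        memory.add("chat-1", "user: Who was Superman's adoptive mother?");
        memory.add("chat-1", "assistant: Martha Kent.");
        System.out.println(memory.get("chat-1"));
    }
}
```

Because everything lives in a map inside the JVM, the history disappears when the application restarts, which is fine for a tutorial but worth keeping in mind for anything longer-lived.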

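3.2. Implementing the Service Layer

With our configuration in place, let's create a ChatbotService class. We'll inject the ChatClient bean we defined earlier to interact with our model.

But first, let's define two simple records to represent the chat request and response: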

record ChatRequest(@Nullable UUID chatId, String question) {}

record ChatResponse(UUID chatId, String answer) {}

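The ChatRequest contains the user's question and an optional chatId that identifies an ongoing conversation.

Similarly, the ChatResponse contains the chatId and the chatbot's answer.

Now, let's implement the intended functionality: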

public ChatResponse chat(ChatRequest chatRequest) {
    UUID chatId = Optional
      .ofNullable(chatRequest.chatId())
      .orElse(UUID.randomUUID());
    String answer = chatClient
      .prompt()
      .user(chatRequest.question())
      .advisors(advisorSpec ->
          advisorSpec
            .param("chat_memory_conversation_id", chatId))
      .call()
      .content();
    return new ChatResponse(chatId, answer);
}

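If the incoming request doesn't contain a chatId, we generate a new one. This allows the user to either start a new conversation or continue an existing one.

We pass the user's question to the chatClient bean and set the chat_memory_conversation_id parameter to the resolved chatId to maintain the conversation history.

Finally, we return the chatbot's answer along with the chatId.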
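The chatId resolution step can be isolated into a few lines of plain Java. The sketch below is a hypothetical helper (ChatIdSketch is not part of the service above) that uses orElseGet(), so a random UUID is only generated when the request carries no chatId.

```java
import java.util.Optional;
import java.util.UUID;

// Hypothetical helper isolating the chatId resolution step.
final class ChatIdSketch {

    // Reuses the caller's chatId when present; otherwise starts a new conversation with a random one.
    static UUID resolve(UUID requestedChatId) {
        return Optional
          .ofNullable(requestedChatId)
          .orElseGet(UUID::randomUUID);
    }

    public static void main(String[] args) {
        // The argument is null here, so a fresh ID is generated on every run.
        System.out.println(resolve(null));
    }
}
```

The service method uses orElse(), which evaluates UUID.randomUUID() eagerly even when a chatId is supplied. Both variants return the same result; orElseGet() simply skips the unused generation.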

Now that we've implemented our service layer, let's expose a REST API on top of it:

@PostMapping("/chat")
public ResponseEntity<ChatResponse> chat(@RequestBody ChatRequest chatRequest) {
    ChatResponse chatResponse = chatbotService.chat(chatRequest);
    return ResponseEntity.ok(chatResponse);
}

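We'll use the above API endpoint to interact with our chatbot later in the tutorial.

3.3. Enabling Multimodality in Our BarkGPT Chatbot

One of the powerful features of the Claude family of models is their support for multimodality.

In addition to processing text, they're able to understand and analyze images and documents. This allows us to build smarter chatbots that can handle a wide range of user inputs.

Let's enable multimodality in our BarkGPT chatbot: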

public ChatResponse chat(ChatRequest chatRequest, MultipartFile... files) {
    // ... same as above
    String answer = chatClient
      .prompt()
      .user(promptUserSpec ->
          promptUserSpec
            .text(chatRequest.question())
            .media(convert(files)))
    // ... same as above
}

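private Media[] convert(MultipartFile... files) {
    // The "files" request part is optional, so treat a missing array as no attachments
    if (files == null) {
        return new Media[0];
    }
    return Stream.of(files)
      .map(file -> new Media(
          MimeType.valueOf(file.getContentType()),
          file.getResource()
      ))
      .toArray(Media[]::new);
}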

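Here, we overload our chat() method to accept an array of MultipartFile in addition to the ChatRequest record.

Using our private convert() method, we convert these files into an array of Media objects, specifying their MIME types and contents.

It's important to note that Claude currently supports images in the jpeg, png, gif, and webp formats. Additionally, it supports PDF documents as input.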
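Since only a handful of formats are accepted, it can be useful to reject other uploads before calling the model. The following is a minimal sketch of such a check, assuming the formats listed above map to the usual MIME types; the SupportedMediaSketch class is hypothetical and isn't part of the chatbot code.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical pre-check for uploaded files, based on the formats listed above.
final class SupportedMediaSketch {

    // Assumed MIME types for jpeg, png, gif, webp images and PDF documents.
    private static final Set<String> SUPPORTED_TYPES = Set.of(
      "image/jpeg", "image/png", "image/gif", "image/webp", "application/pdf");

    // Returns true when the content type, ignoring case and parameters, is in the supported set.
    static boolean isSupported(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" before comparing.
        String baseType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return SUPPORTED_TYPES.contains(baseType);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("image/png"));
        System.out.println(isSupported("image/bmp"));
    }
}
```

A check like this could run on file.getContentType() before the convert() step, returning a client error instead of forwarding an unsupported file.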

Similar to our previous chat() method, let's expose an API for the overloaded version as well:

@PostMapping(path = "/multimodal/chat", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
public ResponseEntity<ChatResponse> chat(
  @RequestPart(name = "question") String question,
  @RequestPart(name = "chatId", required = false) UUID chatId,
  @RequestPart(name = "files", required = false) MultipartFile[] files
) {
    ChatRequest chatRequest = new ChatRequest(chatId, question);
    ChatResponse chatResponse = chatbotService.chat(chatRequest, files);
    return ResponseEntity.ok(chatResponse);
}

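With our /multimodal/chat API endpoint, our chatbot can now understand and respond to a combination of textual and visual inputs.

4. Interacting With Our Chatbot

With our BarkGPT implemented, let's interact with it and test it out.

We'll use the HTTPie CLI to start a new conversation: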

http POST :8080/chat question="What was the name of Superman's adoptive mother?"

Here, we send a simple question to the chatbot. Let's see what we receive in response:

{
    "answer": "Ah hooman, that's a pawsome question that doesn't require much digging! Superman's adoptive mother was Martha Kent. She and her husband Jonathan Kent raised him as Clark Kent. She was a very good hooman indeed - you could say she was his fur-ever mom!",
    "chatId": "161ab978-01eb-43a1-84db-e21633c02d0c"
}

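The response contains a unique chatId along with the chatbot's answer to our question. Notice how the chatbot responds in the distinctive persona we defined in our system prompt.

Next, let's continue this conversation by sending a follow-up question using the chatId from the above response: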

http POST :8080/chat question="Which hero had a breakdown when he heard it?" chatId="161ab978-01eb-43a1-84db-e21633c02d0c"

Let's see if the chatbot can maintain the context of our conversation and provide a relevant response:

{
    "answer": "Hahaha hooman, you're referring to the infamous 'Martha moment' in Batman v Superman movie! It was the Bark Knight himself - Batman - who had the breakdown when Superman said 'Save Martha!'. You see, Bats was about to deliver the final blow to Supes, but when Supes mentioned his mother's name, it triggered something in Batman because - his own mother was ALSO named Martha! What a doggone coincidence! Some might say it was a rather ruff plot point, but it helped these two become the best of pals!",
    "chatId": "161ab978-01eb-43a1-84db-e21633c02d0c"
}

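As we can see, the chatbot does indeed maintain the conversation context, as it references the hard-to-believe plot point from the movie Batman v Superman: Dawn of Justice.

The chatId remains the same, indicating that the follow-up answer is a continuation of the same conversation.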
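The same follow-up call can also be prepared from Java. The sketch below builds, but doesn't send, a POST request for the /chat endpoint using the JDK's HttpRequest; the ChatRequestSketch class, its hand-rolled JSON, and the localhost base URL are assumptions for illustration only.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.UUID;

// Hypothetical client-side helper that prepares a request for the /chat endpoint.
final class ChatRequestSketch {

    // Builds the JSON body by hand, escaping only backslashes and double quotes;
    // a real client would more likely rely on a JSON library.
    static String toJson(String question, UUID chatId) {
        String escaped = question.replace("\\", "\\\\").replace("\"", "\\\"");
        String body = "{\"question\":\"" + escaped + "\"";
        if (chatId != null) {
            // Including the chatId continues an existing conversation.
            body += ",\"chatId\":\"" + chatId + "\"";
        }
        return body + "}";
    }

    // Creates the POST request without sending it.
    static HttpRequest build(String baseUrl, String question, UUID chatId) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/chat"))
          .header("Content-Type", "application/json")
          .POST(HttpRequest.BodyPublishers.ofString(toJson(question, chatId)))
          .build();
    }

    public static void main(String[] args) {
        UUID chatId = UUID.fromString("161ab978-01eb-43a1-84db-e21633c02d0c");
        HttpRequest request = build("http://localhost:8080", "Which hero had a breakdown when he heard it?", chatId);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) would return the JSON response, from which the chatId can be reused for the next turn.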

Finally, let's test the multimodality of our chatbot by sending an image file:

http -f POST :8080/multimodal/chat files@IMAGE_FILE question="Describe the attached image."

Here, we invoke the /multimodal/chat API and send both the question and an image file, with IMAGE_FILE standing for the path to a local image.

Let's see whether BarkGPT is able to process both the textual and visual inputs:

{
    "answer": "Well well well, hooman! What do we have here? A most PAWculiar sight indeed! It appears to be a LEGO Deadpool figure dressed up as Santa Claus - how pawsitively hilarious! He's got the classic red suit, white beard, and Santa hat, but maintains that signature Deadpool mask underneath. We've also got something dark and blurry - possibly the Batman lurking in the shadows? Would you like me to dig deeper into this holiday mystery, hooman? I've got a nose for these things, you know!",
    "chatId": "34c7fe24-29b6-4e1e-92cb-aa4e58465c2d"
}

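As we can see, our chatbot is able to identify the key elements in the image.

We highly recommend setting up the codebase locally and trying out the implementation with different prompts.

5. Conclusion

In this article, we've explored using Anthropic's Claude models with Spring AI.

We discussed two ways of interacting with Claude models in our application: one using Anthropic's API directly, and the other using the Amazon Bedrock Converse API.

Then, we built a chatbot named BarkGPT that's capable of multi-turn conversations. We also gave our chatbot multimodal capabilities, enabling it to understand and respond to images.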
