Rate Limiting a Spring API Using Bucket4j

REST,Spring Boot
07:19 AM · Dec 01, 2025

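1. Overview

In this tutorial, we'll focus on how to use Bucket4j to rate limit a Spring REST API.

We'll explore API rate limiting, learn about Bucket4j, and then work through a few ways of rate limiting REST APIs in a Spring application.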

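2. API Rate Limiting

Rate limiting is a strategy to limit access to APIs. It restricts the number of API calls that a client can make within a certain time frame. This helps defend the API against overuse, both unintentional and malicious.

Rate limits are often applied to an API by tracking the IP address, or in a more business-specific way, such as API keys or access tokens. As API developers, we have several options when a client reaches the limit:

  • Queueing the request until the remaining time period has elapsed
  • Allowing the request immediately, but charging extra for it
  • Rejecting the request (HTTP 429 Too Many Requests)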

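3. Bucket4j Rate Limiting Library

3.1. What Is Bucket4j?

Bucket4j is a Java rate-limiting library based on the token-bucket algorithm. It is thread-safe and can be used in either a standalone JVM application or a clustered environment. It also supports in-memory or distributed caching via the JCache (JSR107) specification.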

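3.2. Token-bucket Algorithm

Let's look at the algorithm intuitively, in the context of API rate limiting.

Say we have a bucket whose capacity is defined as the number of tokens that it can hold. Whenever a consumer wants to access an API endpoint, it must get a token from the bucket. If a token is available, we remove it from the bucket and accept the request. Conversely, if the bucket has no tokens, we reject the request.

As requests consume tokens, we also replenish them at a fixed rate, in such a way that the bucket never exceeds its capacity.

Let's consider an API that has a rate limit of 100 requests per minute. We can create a bucket with a capacity of 100 tokens and a refill rate of 100 tokens per minute.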
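
To make the algorithm concrete, here is a minimal sketch in plain Java. It is not how Bucket4j is implemented: SimpleTokenBucket and its method names are made up for illustration, tokens are refilled continuously in proportion to elapsed time, and the current time is passed in explicitly so the behaviour is easy to reason about.

```java
// Illustrative token bucket. SimpleTokenBucket is a hypothetical class, not part of Bucket4j.
class SimpleTokenBucket {
    private final long capacity;
    private final long refillTokens;       // tokens added per refill period
    private final long refillPeriodNanos;  // length of the refill period
    private double tokens;                 // fractional tokens accumulate between requests
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, long refillTokens, long refillPeriodNanos, long nowNanos) {
        this.capacity = capacity;
        this.refillTokens = refillTokens;
        this.refillPeriodNanos = refillPeriodNanos;
        this.tokens = capacity;            // the bucket starts full
        this.lastRefillNanos = nowNanos;
    }

    // Accepts the request (true) if a token is available, otherwise rejects it (false).
    synchronized boolean tryConsume(long nowNanos) {
        refill(nowNanos);
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }

    // Adds tokens in proportion to the elapsed time, never exceeding the capacity.
    private void refill(long nowNanos) {
        long elapsed = nowNanos - lastRefillNanos;
        if (elapsed <= 0) {
            return;
        }
        double added = (double) elapsed * refillTokens / refillPeriodNanos;
        tokens = Math.min(capacity, tokens + added);
        lastRefillNanos = nowNanos;
    }
}
```

With a capacity of 100 and a refill of 100 tokens per 60 seconds, a burst of 100 requests at the same instant is accepted and the next one is rejected until enough time has passed for a new token to accumulate.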

4. Getting Started With Bucket4j

4.1. Maven Configuration

Let’s begin by adding the bucket4j-core dependency to our pom.xml:


<dependency>
    <groupId>com.bucket4j</groupId>
    <artifactId>bucket4j-core</artifactId>
    <version>8.1.0</version>
</dependency>

4.2. Terminology

Before we look at how to use Bucket4j, we’ll briefly discuss some of the core classes, and how they represent the different elements in the formal model of the token-bucket algorithm.

The Bucket interface represents the token bucket with a maximum capacity. It provides methods such as tryConsume and tryConsumeAndReturnRemaining for consuming tokens. tryConsume returns true if the request conforms with the limits and the token was consumed, while tryConsumeAndReturnRemaining reports the same outcome through a ConsumptionProbe.

The Bandwidth class is the key building block of a bucket, as it defines the limits of the bucket. We use Bandwidth to configure the capacity of the bucket and the rate of refill.

The Refill class is used to define the fixed rate at which tokens are added to the bucket. We can configure the rate as the number of tokens that would be added in a given time period, for example, 10 tokens per second or 200 tokens per 5 minutes.

The tryConsumeAndReturnRemaining method in Bucket returns ConsumptionProbe. ConsumptionProbe contains, along with the result of consumption, the status of the bucket, such as the tokens remaining, or the time remaining until the requested tokens are available in the bucket again.

4.3. Basic Usage

Let’s test some basic rate limit patterns.

For a rate limit of 10 requests per minute, we’ll create a bucket with capacity 10 and a refill rate of 10 tokens per minute:

Refill refill = Refill.intervally(10, Duration.ofMinutes(1));
Bandwidth limit = Bandwidth.classic(10, refill);
Bucket bucket = Bucket.builder()
    .addLimit(limit)
    .build();

for (int i = 1; i <= 10; i++) {
    assertTrue(bucket.tryConsume(1));
}
assertFalse(bucket.tryConsume(1));

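Refill.intervally refills the bucket in whole batches: it waits until the full period has elapsed and then adds all of that period's tokens at once, which in this case is 10 tokens every minute. By contrast, Refill.greedy adds tokens gradually across the period.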
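
The difference between interval-style and greedy-style refill can be sketched with simple arithmetic. The RefillMath helper below is hypothetical and only models how many tokens would have been regenerated after a given elapsed time; it ignores the bucket capacity and does not reflect Bucket4j's internal code.

```java
import java.time.Duration;

// Hypothetical helper for illustration; it models the arithmetic only, not Bucket4j internals.
class RefillMath {

    // Interval-style: nothing is added until a whole period has elapsed.
    static long intervalTokens(long tokensPerPeriod, Duration period, Duration elapsed) {
        long completedPeriods = elapsed.toNanos() / period.toNanos();
        return completedPeriods * tokensPerPeriod;
    }

    // Greedy-style: tokens are spread evenly across the period.
    // Note: the multiplication can overflow for very large durations; fine for a sketch.
    static long greedyTokens(long tokensPerPeriod, Duration period, Duration elapsed) {
        return elapsed.toNanos() * tokensPerPeriod / period.toNanos();
    }
}
```

For 10 tokens per minute, after 30 seconds the interval-style calculation yields 0 tokens while the greedy-style one yields 5; after a full minute both yield 10.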

Next, let’s see refill in action.

We’ll set a refill rate of 1 token per 2 seconds, and throttle our requests to honor the rate limit:

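Bandwidth limit = Bandwidth.classic(1, Refill.intervally(1, Duration.ofSeconds(2)));
Bucket bucket = Bucket.builder()
    .addLimit(limit)
    .build();
assertTrue(bucket.tryConsume(1));     // first request

// schedule another request for 2 seconds later and wait for its result,
// so that a failure is reported on the test thread instead of being lost in the executor
ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
ScheduledFuture<Boolean> secondRequest = executor.schedule(() -> bucket.tryConsume(1), 2, TimeUnit.SECONDS);
assertTrue(secondRequest.get());      // get() throws checked exceptions, so the test method must declare them
executor.shutdown();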

Suppose we have a rate limit of 10 requests per minute. At the same time, we may wish to avoid spikes that would exhaust all the tokens in the first 5 seconds. Bucket4j allows us to set multiple limits (Bandwidth) on the same bucket. Let’s add another limit that allows only 5 requests in a 20-second time window:

Bucket bucket = Bucket.builder()
    .addLimit(Bandwidth.classic(10, Refill.intervally(10, Duration.ofMinutes(1))))
    .addLimit(Bandwidth.classic(5, Refill.intervally(5, Duration.ofSeconds(20))))
    .build();

for (int i = 1; i <= 5; i++) {
    assertTrue(bucket.tryConsume(1));
}
assertFalse(bucket.tryConsume(1));
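
The sixth request is rejected even though the per-minute limit still has tokens left, because a request has to satisfy every limit on the bucket. The following plain-Java sketch shows that rule in isolation. MultiLimitCounter is a hypothetical class, and refill is deliberately left out to keep it short.

```java
// Illustrative only: MultiLimitCounter is a hypothetical class and refill is omitted for brevity.
class MultiLimitCounter {
    private final long[] available;   // remaining tokens, one entry per limit

    MultiLimitCounter(long... capacities) {
        this.available = capacities.clone();
    }

    // A request passes only if every limit still has a token; then one token is taken from each.
    synchronized boolean tryConsume() {
        for (long tokens : available) {
            if (tokens < 1) {
                return false;
            }
        }
        for (int i = 0; i < available.length; i++) {
            available[i]--;
        }
        return true;
    }
}
```

With capacities of 10 and 5, the first five calls succeed and the sixth fails because the smaller limit is exhausted, which mirrors the assertions above.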

5. Rate Limiting a Spring API Using Bucket4j

Let’s use Bucket4j to apply a rate limit in a Spring REST API.

5.1. Area Calculator API

We’ll implement a simple, but extremely popular, area calculator REST API. Currently, it calculates and returns the area of a rectangle given its dimensions:

@RestController
class AreaCalculationController {

    @PostMapping(value = "/api/v1/area/rectangle")
    public ResponseEntity<AreaV1> rectangle(@RequestBody RectangleDimensionsV1 dimensions) {
        return ResponseEntity.ok(new AreaV1("rectangle", dimensions.getLength() * dimensions.getWidth()));
    }
}

Let’s ensure that our API is up and running:

$ curl -X POST http://localhost:9001/api/v1/area/rectangle \
    -H "Content-Type: application/json" \
    -d '{ "length": 10, "width": 12 }'

{ "shape":"rectangle","area":120.0 }

5.2. Applying Rate Limit

Now we’ll introduce a naive rate limit, allowing the API 20 requests per minute. In other words, the API rejects a request if it’s already received 20 requests in a time window of 1 minute.

Let’s modify our Controller to create a Bucket and add the limit (Bandwidth):

@RestController
class AreaCalculationController {

    private final Bucket bucket;

    public AreaCalculationController() {
        Bandwidth limit = Bandwidth.classic(20, Refill.greedy(20, Duration.ofMinutes(1)));
        this.bucket = Bucket.builder()
            .addLimit(limit)
            .build();
    }
    //..
}

In this API, we can check whether the request is allowed by consuming a token from the bucket using the method tryConsume. If we’ve reached the limit, we can reject the request by responding with an HTTP 429 Too Many Requests status:

public ResponseEntity<AreaV1> rectangle(@RequestBody RectangleDimensionsV1 dimensions) {
    if (bucket.tryConsume(1)) {
        return ResponseEntity.ok(new AreaV1("rectangle", dimensions.getLength() * dimensions.getWidth()));
    }

    return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS).build();
}

# 21st request within 1 minute
$ curl -v -X POST http://localhost:9001/api/v1/area/rectangle \
    -H "Content-Type: application/json" \
    -d '{ "length": 10, "width": 12 }'

< HTTP/1.1 429

5.3. API Clients and Pricing Plan

Now we have a naive rate limit that can throttle the API requests. Next, we’ll introduce pricing plans for more business-centered rate limits.

Pricing plans help us monetize our API. Let’s assume that we have the following plans for our API clients:

  • Free: 20 requests per hour per API client
  • Basic: 40 requests per hour per API client
  • Professional: 100 requests per hour per API client

Each API client gets a unique API key that they must send along with each request. This helps us identify the pricing plan linked with the API client.

Let’s define the rate limit (Bandwidth) for each pricing plan:

enum PricingPlan {
    FREE {
        Bandwidth getLimit() {
            return Bandwidth.classic(20, Refill.intervally(20, Duration.ofHours(1)));
        }
    },
    BASIC {
        Bandwidth getLimit() {
            return Bandwidth.classic(40, Refill.intervally(40, Duration.ofHours(1)));
        }
    },
    PROFESSIONAL {
        Bandwidth getLimit() {
            return Bandwidth.classic(100, Refill.intervally(100, Duration.ofHours(1)));
        }
    };
    abstract Bandwidth getLimit();
    //..
}

Then let’s add a method to resolve the pricing plan from the given API key:

enum PricingPlan {
    //.. FREE, BASIC and PROFESSIONAL constants and getLimit() as defined above

    static PricingPlan resolvePlanFromApiKey(String apiKey) {
        if (apiKey == null || apiKey.isEmpty()) {
            return FREE;
        } else if (apiKey.startsWith("PX001-")) {
            return PROFESSIONAL;
        } else if (apiKey.startsWith("BX001-")) {
            return BASIC;
        }
        return FREE;
    }
}

Next, we need to store the Bucket for each API key, and retrieve the Bucket for rate limiting:

class PricingPlanService {

    private final Map<String, Bucket> cache = new ConcurrentHashMap<>();

    public Bucket resolveBucket(String apiKey) {
        return cache.computeIfAbsent(apiKey, this::newBucket);
    }

    private Bucket newBucket(String apiKey) {
        PricingPlan pricingPlan = PricingPlan.resolvePlanFromApiKey(apiKey);
        return Bucket.builder()
            .addLimit(pricingPlan.getLimit())
            .build();
    }
}

Now we have an in-memory store of buckets per API key. Let’s modify our Controller to use the PricingPlanService:

@RestController
class AreaCalculationController {

    private final PricingPlanService pricingPlanService;

    // constructor injection; assumes PricingPlanService is registered as a Spring bean (for example with @Service)
    AreaCalculationController(PricingPlanService pricingPlanService) {
        this.pricingPlanService = pricingPlanService;
    }

    @PostMapping(value = "/api/v1/area/rectangle")
    public ResponseEntity<AreaV1> rectangle(@RequestHeader(value = "X-api-key") String apiKey,
        @RequestBody RectangleDimensionsV1 dimensions) {

        Bucket tokenBucket = pricingPlanService.resolveBucket(apiKey);
        ConsumptionProbe probe = tokenBucket.tryConsumeAndReturnRemaining(1);
        if (probe.isConsumed()) {
            return ResponseEntity.ok()
                .header("X-Rate-Limit-Remaining", String.valueOf(probe.getRemainingTokens()))
                .body(new AreaV1("rectangle", dimensions.getLength() * dimensions.getWidth()));
        }

        // integer division truncates the wait time to whole seconds
        long waitForRefill = probe.getNanosToWaitForRefill() / 1_000_000_000;
        return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS)
            .header("X-Rate-Limit-Retry-After-Seconds", String.valueOf(waitForRefill))
            .build();
    }
}

Let’s walk through the changes. The API client sends the API key with the X-api-key request header. We use the PricingPlanService to get the bucket for this API key, and check whether the request is allowed by consuming a token from the bucket.

In order to enhance the client experience of the API, we’ll use the following additional response headers to send information about the rate limit:

  • X-Rate-Limit-Remaining: number of tokens remaining in the current time window
  • X-Rate-Limit-Retry-After-Seconds: remaining time, in seconds, until the bucket is refilled

We can call the ConsumptionProbe methods getRemainingTokens and getNanosToWaitForRefill to get the count of remaining tokens in the bucket and the time remaining until the next refill, respectively. The getNanosToWaitForRefill method returns 0 if we’re able to consume the token successfully.
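
The controller converts nanoseconds to seconds with integer division, which truncates: a wait of 299.4 seconds is reported as 299. Whether that matters depends on the client, but one conservative option is to round up so that a client waiting the advertised time does not retry too early. The RetryAfter helper below is a hypothetical sketch of that conversion and is not part of Bucket4j.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Bucket4j.
class RetryAfter {

    // Converts a wait time in nanoseconds to whole seconds, rounding any fraction up.
    static long toSecondsRoundedUp(long nanosToWait) {
        if (nanosToWait <= 0) {
            return 0;
        }
        long wholeSeconds = TimeUnit.NANOSECONDS.toSeconds(nanosToWait);
        boolean hasRemainder = nanosToWait % 1_000_000_000L != 0;
        return hasRemainder ? wholeSeconds + 1 : wholeSeconds;
    }
}
```

For instance, a wait of 299,400,000,000 nanoseconds becomes 300 seconds here, where plain integer division would give 299.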

Let’s call the API:

## successful request
$ curl -v -X POST http://localhost:9001/api/v1/area/rectangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "length": 10, "width": 12 }'

< HTTP/1.1 200
< X-Rate-Limit-Remaining: 9
{"shape":"rectangle","area":120.0}

## rejected request
$ curl -v -X POST http://localhost:9001/api/v1/area/rectangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "length": 10, "width": 12 }'

< HTTP/1.1 429
< X-Rate-Limit-Retry-After-Seconds: 299
{ "status": 429, "error": "Too Many Requests", "message": "You have exhausted your API Request Quota" }

5.4. Using Spring MVC Interceptor

Suppose we now have to add a new API endpoint that calculates and returns the area of a triangle given its height and base:

@PostMapping(value = "/api/v1/area/triangle")
public ResponseEntity<AreaV1> triangle(@RequestBody TriangleDimensionsV1 dimensions) {
    return ResponseEntity.ok(new AreaV1("triangle", 0.5d * dimensions.getHeight() * dimensions.getBase()));
}

As it turns out, we need to rate-limit our new endpoint as well. We can simply copy and paste the rate limit code from our previous endpoint. Alternatively, we can use Spring MVC’s HandlerInterceptor to decouple the rate limit code from the business code.

Let’s create a RateLimitInterceptor and implement the rate limit code in the preHandle method:

public class RateLimitInterceptor implements HandlerInterceptor {

    private final PricingPlanService pricingPlanService;

    // constructor injection; assumes this interceptor and PricingPlanService are registered as Spring beans
    public RateLimitInterceptor(PricingPlanService pricingPlanService) {
        this.pricingPlanService = pricingPlanService;
    }

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) 
      throws Exception {
        String apiKey = request.getHeader("X-api-key");
        if (apiKey == null || apiKey.isEmpty()) {
            response.sendError(HttpStatus.BAD_REQUEST.value(), "Missing Header: X-api-key");
            return false;
        }

        Bucket tokenBucket = pricingPlanService.resolveBucket(apiKey);
        ConsumptionProbe probe = tokenBucket.tryConsumeAndReturnRemaining(1);
        if (probe.isConsumed()) {
            // report the remaining quota, as the controller did before
            response.addHeader("X-Rate-Limit-Remaining", String.valueOf(probe.getRemainingTokens()));
            return true;
        } else {
            long waitForRefill = probe.getNanosToWaitForRefill() / 1_000_000_000;
            response.addHeader("X-Rate-Limit-Retry-After-Seconds", String.valueOf(waitForRefill));
            response.sendError(HttpStatus.TOO_MANY_REQUESTS.value(),
              "You have exhausted your API Request Quota"); 
            return false;
        }
    }
}

Finally, we must add the interceptor to the InterceptorRegistry:

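public class Bucket4jRateLimitApp implements WebMvcConfigurer {

    private final RateLimitInterceptor interceptor;

    // constructor injection; assumes RateLimitInterceptor is registered as a Spring bean (for example with @Component)
    public Bucket4jRateLimitApp(RateLimitInterceptor interceptor) {
        this.interceptor = interceptor;
    }

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        registry.addInterceptor(interceptor)
            .addPathPatterns("/api/v1/area/**");
    }
}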

The RateLimitInterceptor intercepts each request to our area calculation API endpoints.

Let’s try out our new endpoint:

## successful request
$ curl -v -X POST http://localhost:9001/api/v1/area/triangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "height": 15, "base": 8 }'

< HTTP/1.1 200
< X-Rate-Limit-Remaining: 9
{"shape":"triangle","area":60.0}

## rejected request
$ curl -v -X POST http://localhost:9001/api/v1/area/triangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "height": 15, "base": 8 }'

< HTTP/1.1 429
< X-Rate-Limit-Retry-After-Seconds: 299
{ "status": 429, "error": "Too Many Requests", "message": "You have exhausted your API Request Quota" }

It looks like we’re done. We can keep adding endpoints, and the interceptor will apply the rate limit for each request.

6. Bucket4j Spring Boot Starter

Let’s look at another way of using Bucket4j in a Spring application. The Bucket4j Spring Boot Starter provides auto-configuration for Bucket4j that helps us achieve API rate limiting via Spring Boot application properties or configuration.

Once we integrate the Bucket4j starter into our application, we’ll have a completely declarative API rate limiting implementation, without any application code.

6.1. Rate Limit Filters

In our example, we used the value of the request header X-api-key as the key for identifying and applying the rate limits.

The Bucket4j Spring Boot Starter provides several predefined configurations for defining our rate limit key:

  • a naive rate limit filter, which is the default
  • filter by IP Address
  • expression-based filters

Expression-based filters use the Spring Expression Language (SpEL). SpEL provides access to root objects, such as HttpServletRequest, that can be used to build filter expressions on the IP Address (getRemoteAddr()), request headers (getHeader('X-api-key')), and so on.

The library also supports custom classes in the filter expressions, which is discussed in the documentation.

6.2. Maven Configuration

Let’s begin by adding the bucket4j-spring-boot-starter dependency to our pom.xml:

<dependency>
    <groupId>com.giffing.bucket4j.spring.boot.starter</groupId>
    <artifactId>bucket4j-spring-boot-starter</artifactId>
    <version>0.8.1</version>
</dependency>

We used an in-memory Map to store the Bucket per API key (consumer) in our earlier implementation. Here, we can use Spring’s caching abstraction to configure an in-memory store, such as Caffeine or Guava.

Let’s add the caching dependencies:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
    <version>3.3.2</version>
</dependency>
<dependency>
    <groupId>javax.cache</groupId>
    <artifactId>cache-api</artifactId>
</dependency>
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>2.8.2</version>
</dependency>
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>jcache</artifactId>
    <version>2.8.2</version>
</dependency>

Note: We added the jcache dependencies as well, to conform with Bucket4j’s caching support.

We must remember to enable the caching feature by adding the @EnableCaching annotation to any of the configuration classes.

6.3. Application Configuration

Let’s configure our application to use the Bucket4j starter library. First, we’ll configure Caffeine caching to store the API key and Bucket in-memory:

spring:
  cache:
    cache-names:
    - rate-limit-buckets
    caffeine:
      spec: maximumSize=100000,expireAfterAccess=3600s

Next, let’s configure Bucket4j:

bucket4j:
  enabled: true
  filters:
  - cache-name: rate-limit-buckets
    url: /api/v1/area.*
    strategy: first
    http-response-body: "{ \"status\": 429, \"error\": \"Too Many Requests\", \"message\": \"You have exhausted your API Request Quota\" }"
    rate-limits:
    - cache-key: "getHeader('X-api-key')"
      execute-condition: "getHeader('X-api-key').startsWith('PX001-')"
      bandwidths:
      - capacity: 100
        time: 1
        unit: hours
    - cache-key: "getHeader('X-api-key')"
      execute-condition: "getHeader('X-api-key').startsWith('BX001-')"
      bandwidths:
      - capacity: 40
        time: 1
        unit: hours
    - cache-key: "getHeader('X-api-key')"
      bandwidths:
      - capacity: 20
        time: 1
        unit: hours

So, what did we just configure?

  • bucket4j.enabled=true – enables Bucket4j auto-configuration
  • bucket4j.filters.cache-name – gets the Bucket for an API key from the cache
  • bucket4j.filters.url – indicates the path expression for applying the rate limit
  • bucket4j.filters.strategy=first – stops at the first matching rate limit configuration
  • bucket4j.filters.rate-limits.cache-key – retrieves the key using Spring Expression Language (SpEL)
  • bucket4j.filters.rate-limits.execute-condition – decides whether to execute the rate limit or not using SpEL
  • bucket4j.filters.rate-limits.bandwidths – defines the Bucket4j rate limit parameters

We replaced the PricingPlanService and the RateLimitInterceptor with a list of rate limit configurations that are evaluated sequentially.
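
To illustrate what sequential evaluation with strategy: first means, here is a plain-Java model of the three rules from the configuration. It is only a sketch: FirstMatchRules and Rule are hypothetical types, not classes from the starter, and a non-null API key is assumed.

```java
import java.util.List;
import java.util.function.Predicate;

// Illustrative model of "strategy: first". These types are hypothetical, not the starter's classes.
class FirstMatchRules {

    static final class Rule {
        final Predicate<String> condition;
        final int capacityPerHour;

        Rule(Predicate<String> condition, int capacityPerHour) {
            this.condition = condition;
            this.capacityPerHour = capacityPerHour;
        }
    }

    // Same order as the YAML configuration; the last rule has no condition, so it always matches.
    static final List<Rule> RULES = List.of(
        new Rule(key -> key.startsWith("PX001-"), 100),
        new Rule(key -> key.startsWith("BX001-"), 40),
        new Rule(key -> true, 20));

    // Returns the hourly capacity of the first rule whose condition matches the (non-null) API key.
    static int capacityFor(String apiKey) {
        for (Rule rule : RULES) {
            if (rule.condition.test(apiKey)) {
                return rule.capacityPerHour;
            }
        }
        throw new IllegalStateException("no rule matched");
    }
}
```

Because evaluation stops at the first match, the order of the rules matters: if the unconditional rule came first, every key would get the 20-per-hour limit.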

Let’s try it out:

## successful request
$ curl -v -X POST http://localhost:9000/api/v1/area/triangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "height": 20, "base": 7 }'

< HTTP/1.1 200
< X-Rate-Limit-Remaining: 7
{"shape":"triangle","area":70.0}

## rejected request
$ curl -v -X POST http://localhost:9000/api/v1/area/triangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "height": 7, "base": 20 }'

< HTTP/1.1 429
< X-Rate-Limit-Retry-After-Seconds: 212
{ "status": 429, "error": "Too Many Requests", "message": "You have exhausted your API Request Quota" }

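7. Conclusion

In this article, we demonstrated several different approaches to rate limiting a Spring API using Bucket4j. To learn more, be sure to check out the official documentation.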
