globalMaxParsingRequestsPerMin property

int globalMaxParsingRequestsPerMin
final

The maximum number of requests per minute that the job is allowed to make to the LLM in this project. Consult https://cloud.google.com/vertex-ai/generative-ai/docs/quotas and consider your document size when choosing a value. If this value is not specified, max_parsing_requests_per_min is used by the indexing pipeline job as the global limit.
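
The fallback rule can be illustrated with a minimal sketch. The helper effectiveGlobalLimit is hypothetical and not part of this library, and modelling "not specified" as null is an assumption made only for illustration; the actual resolution happens in the indexing pipeline job on the service side.

```dart
/// Hypothetical helper (not part of this library): returns the limit that
/// would act as the global cap under the fallback rule described above.
///
/// Assumption: an unspecified globalMaxParsingRequestsPerMin is modelled
/// as null here.
int effectiveGlobalLimit({
  int? globalMaxParsingRequestsPerMin,
  required int maxParsingRequestsPerMin,
}) {
  // Use the global value when it is given; otherwise fall back to the
  // per-job value.
  return globalMaxParsingRequestsPerMin ?? maxParsingRequestsPerMin;
}

void main() {
  // Global limit not specified: the per-job limit applies.
  print(effectiveGlobalLimit(maxParsingRequestsPerMin: 120)); // 120

  // Global limit specified: it takes precedence.
  print(effectiveGlobalLimit(
    globalMaxParsingRequestsPerMin: 300,
    maxParsingRequestsPerMin: 120,
  )); // 300
}
```

The numbers 120 and 300 are placeholders; a real value should be derived from the project's quota and the size of the documents being parsed.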

Implementation

final int globalMaxParsingRequestsPerMin;