maxParsingRequestsPerMin property

int maxParsingRequestsPerMin
final

The maximum number of requests per minute that the job is allowed to make to the LLM. Consult the Vertex AI quota documentation (https://cloud.google.com/vertex-ai/generative-ai/docs/quotas) and consider your document size when choosing a value. If unspecified, a default of 5000 QPM is used.
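
Example

A minimal sketch of how this property might be set to throttle LLM calls during a parsing job. The ParsingConfig class name and the other members below are illustrative assumptions, not part of this API.

// Hypothetical config class; only maxParsingRequestsPerMin mirrors this API.
class ParsingConfig {
  /// Maximum requests per minute the job may make to the LLM.
  /// Falls back to 5000 QPM when the caller does not specify a value.
  final int maxParsingRequestsPerMin;

  const ParsingConfig({this.maxParsingRequestsPerMin = 5000});
}

void main() {
  // Stay well under the project's Vertex AI quota for large documents.
  const config = ParsingConfig(maxParsingRequestsPerMin: 600);
  print('Throttle: ${config.maxParsingRequestsPerMin} requests/min');
}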

Implementation

final int maxParsingRequestsPerMin;