Handling LLM Failures
LLM failures are inevitable, and your agent design must account for them. Railtracks gives you the tooling to handle them gracefully.
Types of Failures
LLM failures fall into two categories, and the right handling differs between them:
- Retryable Failures: Known, anticipated errors you can recover from, including rate limits (429), timeouts, and transient server errors. These are safe to retry with backoff.
- Fatal Failures: Unexpected errors that retrying won't fix, such as unhandled exceptions, malformed responses, or hard API errors. These should fail fast and surface to the caller.
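To make the split between the two categories concrete, here is a minimal standalone sketch of a classifier. It is not part of the Railtracks API: the function name `is_retryable`, the `status_code` attribute, and the exact set of 5xx codes treated as transient are all assumptions chosen for illustration.

```python
# Hypothetical helper for illustration only; not a Railtracks function.
# Assumption: 429 plus these 5xx codes are treated as transient.
RETRYABLE_STATUS_CODES = {429, 500, 502, 503, 504}


def is_retryable(error: Exception) -> bool:
    """Return True for anticipated, transient failures that are safe to retry."""
    # Timeouts and dropped connections are usually transient.
    if isinstance(error, (TimeoutError, ConnectionError)):
        return True
    # Assumption: the exception exposes an HTTP status as `status_code`,
    # as many HTTP client libraries do. Anything without one is fatal.
    status = getattr(error, "status_code", None)
    return status in RETRYABLE_STATUS_CODES
```

Anything the classifier rejects, such as a `ValueError` raised while parsing a malformed response, would be re-raised immediately rather than retried.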
Railtracks Tooling for Retries
Pass a RetryApproach to your LLM and it handles retry logic automatically. Its internal logic distinguishes errors that are worth retrying from those that are not. Here's an example using exponential backoff:
import railtracks as rt

exponential_retry = rt.llm.retries.ExponentialRetry(
    max_tries=5,
    base=2.0,  # delay will be 2^attempt seconds
    jitter=True,  # add random jitter to avoid thundering herd
)

# Now pass that exponential retry configuration into your LLM.
llm = rt.llm.OpenAILLM(
    model_name="gpt-4",
    retry_approach=exponential_retry,
)
Railtracks Recommendation
Jittered exponential backoff is the industry-standard approach and should be used in nearly all cases.
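For intuition about what the `base` and `jitter` settings control, the delay schedule can be sketched in plain Python. This is an illustrative sketch, not Railtracks' internal implementation: the function name `backoff_delay` is hypothetical, attempts are assumed to be numbered from 1, and the jitter shown is the "full jitter" variant (a uniform draw between 0 and the computed delay), which may differ from what the library does.

```python
import random


def backoff_delay(attempt: int, base: float = 2.0, jitter: bool = True) -> float:
    """Return the delay in seconds before retry number `attempt`."""
    # Exponential growth: base ** attempt (2, 4, 8, ... for base=2.0).
    delay = base ** attempt
    if jitter:
        # Assumption: "full jitter". Spreading the wait over [0, delay]
        # keeps many clients from retrying at the same instant.
        delay = random.uniform(0, delay)
    return delay


# Without jitter the schedule is deterministic.
schedule = [backoff_delay(n, jitter=False) for n in range(1, 5)]
print(schedule)  # [2.0, 4.0, 8.0, 16.0]
```

With jitter enabled, each delay is random but never exceeds the deterministic value for that attempt.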
Custom Retry Logic
If you want custom retry logic, create a class that inherits from RetryApproach. The example below implements a fixed-delay strategy by overriding approach_name and _compute_delay:
import railtracks as rt


class CustomRetry(rt.llm.retries.RetryApproach):
    @classmethod
    def approach_name(cls) -> str:
        return "custom"

    def _compute_delay(self, attempt: int) -> float:
        # Implement your custom delay logic here, e.g. a quadratic backoff.
        # For more complex logic, please create a new issue or reach out
        # directly to the Railtracks team.
        return 1.0  # fixed delay for every attempt
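To show how a per-attempt delay function like `_compute_delay` might feed a retry loop, here is a standalone sketch using the quadratic backoff mentioned in the comment above. Everything in it is hypothetical and independent of Railtracks: `quadratic_delay`, `retry_with_delay`, and their parameters are illustrative names, and the choice of which exceptions count as retryable is an assumption.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def quadratic_delay(attempt: int, scale: float = 0.5) -> float:
    """Quadratic backoff: scale * attempt^2 seconds."""
    return scale * attempt ** 2


def retry_with_delay(
    fn: Callable[[], T],
    compute_delay: Callable[[int], float],
    max_tries: int = 3,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call fn, retrying only on `retry_on` errors, up to max_tries attempts."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(1, max_tries + 1):
        try:
            return fn()
        except retry_on:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_tries:
                raise
            sleep(compute_delay(attempt))
    raise AssertionError("unreachable")
```

Exceptions outside `retry_on` are not caught, so fatal failures propagate on the first attempt. The `sleep` parameter is injectable so the loop can be tested without real waiting.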