fix(agent): update context compressor limits after fallback activation (#3305)

When _try_activate_fallback() switches to the fallback model, it updates the agent's model/provider/client but never touches self.context_compressor. The compressor keeps the primary model's context_length and threshold_tokens, so compression decisions use the wrong limits: a 200K primary → 32K fallback still uses 200K-based thresholds, causing oversized sessions to overflow the fallback.

Update the compressor's model, credentials, context_length, and threshold_tokens after fallback activation, using get_model_context_length() to look up the new model's context window.

Cherry-picked from PR #3202 by binhnt92.

Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>
2026-04-28 06:51:16 +08:00 · 2026-03-26 18:10:50 -07:00
parent 18d28c63a7
commit 60fdb58ce4
2 changed files with 108 additions and 0 deletions
--- a/run_agent.py
+++ b/run_agent.py
@@ -4134,6 +4134,25 @@ class AIAgent:
                or is_native_anthropic
            )

+            # Update context compressor limits for the fallback model.
+            # Without this, compression decisions use the primary model's
+            # context window (e.g. 200K) instead of the fallback's (e.g. 32K),
+            # causing oversized sessions to overflow the fallback.
+            if hasattr(self, 'context_compressor') and self.context_compressor:
+                from agent.model_metadata import get_model_context_length
+                fb_context_length = get_model_context_length(
+                    self.model, base_url=self.base_url,
+                    api_key=self.api_key, provider=self.provider,
+                )
+                self.context_compressor.model = self.model
+                self.context_compressor.base_url = self.base_url
+                self.context_compressor.api_key = self.api_key
+                self.context_compressor.provider = self.provider
+                self.context_compressor.context_length = fb_context_length
+                self.context_compressor.threshold_tokens = int(
+                    fb_context_length * self.context_compressor.threshold_percent
+                )
+
            self._emit_status(
                f"🔄 Primary model failed — switching to fallback: "
                f"{fb_model} via {fb_provider}"
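
The effect of the hunk above can be illustrated with a small standalone sketch. It is not the project's code: CompressorSketch, should_compress(), retarget(), the CONTEXT_LENGTHS table (a stand-in for get_model_context_length()), the model names, and the 0.5 threshold_percent are all assumptions made for illustration. Only the 200K/32K window sizes and the threshold formula come from the commit.

```python
from dataclasses import dataclass

# Hypothetical lookup table standing in for get_model_context_length().
CONTEXT_LENGTHS = {"primary-model": 200_000, "fallback-model": 32_000}


@dataclass
class CompressorSketch:
    """Illustrative stand-in for the agent's context compressor."""
    model: str
    context_length: int
    threshold_percent: float  # assumed value; the real default is not shown in the diff
    threshold_tokens: int = 0

    def __post_init__(self):
        # Same formula the commit uses when it recomputes the threshold.
        self.threshold_tokens = int(self.context_length * self.threshold_percent)

    def should_compress(self, prompt_tokens: int) -> bool:
        # Hypothetical decision rule: compress once the threshold is reached.
        return prompt_tokens >= self.threshold_tokens


def retarget(compressor: CompressorSketch, model: str) -> CompressorSketch:
    """Point the compressor at a new model and recompute its limits."""
    ctx = CONTEXT_LENGTHS[model]
    compressor.model = model
    compressor.context_length = ctx
    compressor.threshold_tokens = int(ctx * compressor.threshold_percent)
    return compressor


compressor = CompressorSketch("primary-model", 200_000, threshold_percent=0.5)
session_tokens = 40_000  # already larger than the fallback's 32K window

# Before the fix: thresholds still reflect the 200K primary model.
stale_decision = compressor.should_compress(session_tokens)

# After the fix: thresholds follow the 32K fallback model.
retarget(compressor, "fallback-model")
fresh_decision = compressor.should_compress(session_tokens)

print(stale_decision, fresh_decision, compressor.threshold_tokens)
```

Under these assumptions, the stale compressor sees no need to compress a 40,000-token session even though it would not fit the fallback's 32K window, while the retargeted compressor flags it immediately.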