docs: add Redis solution articles ported from internal Confluence #753
Port 23 troubleshooting and how-to articles for Alauda Cache Service for Redis OSS (redis-operator), filling gaps not covered by docs.alauda.io/redis/5.0/. Topics include sentinel password setup, dangerous-command ACL, slow log, custom commands, RedisInsight, RedisProxyOperator, Navicat client, cluster slot/node recovery, backup template compatibility, RedisShake migration, and an emergency playbook. Each doc carries an applicable-version callout and was reviewed against operator v5.0.x; legacy procedures are clearly scoped or routed to current alternatives.
Pull request overview
This PR adds a set of Redis OSS “Solutions” knowledge-base articles (ported from internal Confluence) under docs/en/solutions/ecosystem/redis, covering operational troubleshooting, recovery playbooks, and how-to guides for Alauda Cache Service for Redis OSS on Kubernetes.
Changes:
- Adds an emergency response playbook plus best-practices guidance for Redis Sentinel/Cluster deployments.
- Adds multiple troubleshooting/how-to articles (cluster slot/node repair, replication sync failures, Sentinel password/failover, maxmemory, slowlog, traffic limiting, migration via RedisShake, RedisInsight, and proxy deployment).
- Adds a backup/restore compatibility note for parameter templates.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 28 comments.
| File | Description |
|---|---|
| docs/en/solutions/ecosystem/redis/Redis_Emergency_Response_Playbook.md | New emergency response runbook for common operator/instance/cluster failure scenarios. |
| docs/en/solutions/ecosystem/redis/Redis_Best_Practices.md | New best-practices and sizing guidance for Sentinel/Cluster mode. |
| docs/en/solutions/ecosystem/redis/How_to_View_Redis_Slow_Logs.md | New slowlog inspection/configuration guide. |
| docs/en/solutions/ecosystem/redis/How_to_Troubleshoot_Cluster_Mode_Connection_Errors.md | New guidance for diagnosing non-cluster-aware client issues. |
| docs/en/solutions/ecosystem/redis/How_to_Trigger_Manual_Sentinel_Failover.md | New procedure for manual Sentinel failover. |
| docs/en/solutions/ecosystem/redis/How_to_Set_Sentinel_Node_Password.md | New procedure for setting/rotating Sentinel-side authentication. |
| docs/en/solutions/ecosystem/redis/How_to_Run_Redis_as_Root_User.md | New guidance for running Redis pods as root for specific storage backends. |
| docs/en/solutions/ecosystem/redis/How_to_Resolve_Master_Replica_Sync_Failure.md | New replication-buffer troubleshooting and tuning procedure. |
| docs/en/solutions/ecosystem/redis/How_to_Repair_Redis_Cluster_Slot_Anomalies.md | New manual slot repair procedures for Redis Cluster. |
| docs/en/solutions/ecosystem/redis/How_to_Recover_From_Redis_Cluster_Crash.md | New recovery procedure for corrupted/missing cluster state (nodes.conf). |
| docs/en/solutions/ecosystem/redis/How_to_Recover_From_Cross_Shard_Master_Corruption.md | New procedure for resolving operator reconciliation issues due to shard/pod role misalignment. |
| docs/en/solutions/ecosystem/redis/How_to_Rate_Limit_Redis_Traffic.md | New guidance for throttling/limiting traffic safely (connections, upstream throttling, BigKeys). |
| docs/en/solutions/ecosystem/redis/How_to_Migrate_Redis_Across_Clusters.md | New RedisShake-based cross-cluster migration procedures and version/image matrix. |
| docs/en/solutions/ecosystem/redis/How_to_Manually_Remove_Failed_Cluster_Nodes.md | New helper script and steps for CLUSTER FORGET cleanup of stale nodes. |
| docs/en/solutions/ecosystem/redis/How_to_Manage_Dangerous_Redis_Commands.md | New ACL-based guidance for managing dangerous commands, plus legacy operator methods. |
| docs/en/solutions/ecosystem/redis/How_to_Fix_Sentinel_Multi_Instance_Merge.md | New recovery guide for legacy Sentinel “merge via IP recycling” failure mode. |
| docs/en/solutions/ecosystem/redis/How_to_Deploy_RedisInsight_Web_Console.md | New Kubernetes manifest and connection notes for RedisInsight. |
| docs/en/solutions/ecosystem/redis/How_to_Deploy_Redis_Proxy_with_RedisProxyOperator.md | New guide for deploying Predixy-based proxy via redis-proxy-operator. |
| docs/en/solutions/ecosystem/redis/How_to_Connect_to_Redis_Sentinel_with_Navicat.md | New client configuration guide for Navicat with Sentinel. |
| docs/en/solutions/ecosystem/redis/How_to_Configure_Redis_MaxMemory.md | New guide for overriding operator maxmemory defaults (runtime vs customConfig). |
| docs/en/solutions/ecosystem/redis/How_to_Cleanup_Invalid_Cluster_Nodes.md | New procedure for orphaned node cleanup after pod IP recycling. |
| docs/en/solutions/ecosystem/redis/How_to_Add_Custom_Redis_Commands.md | New legacy guidance for re-enabling commands via rename-command (pre-ACL). |
| docs/en/solutions/ecosystem/redis/Backup_Restore_Template_Compatibility.md | New compatibility matrix + workaround for restoring RDB backups into AOF-template instances. |
products:
- Alauda Application Services
kind:
- Solution
| `slowlog-log-slower-than` | Minimum execution time (in microseconds) for a command to be recorded as a slow log entry. | Set to `0` to log every command. Set to a negative value such as `-1` to disable slow log recording. |
| `slowlog-max-len` | Maximum number of entries retained in the slow log. When the limit is reached, the oldest entry is removed. | Upstream Redis default: `128`. A value of `0` retains no entries. Set to a positive value such as `128` or `1024` for production use. |
```bash
redis-cli -h <redis-host> -p <redis-port> -a <password>
```
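Once connected, the two parameters above can be changed at runtime with `CONFIG SET`, and recent entries read back with `SLOWLOG GET`. The following is a minimal sketch, not the article's procedure: the helper names `ms_to_us` and `run_redis`, the 10 ms threshold, the 1024-entry limit, and the default host and port are all illustrative assumptions. It prints the commands by default (`DRY_RUN=1`) so they can be reviewed first, and it relies on the `REDISCLI_AUTH` environment variable instead of `-a` so the password does not appear in the process list.

```bash
# Sketch: tune and read the Redis slow log.
# Helper names and all values below are illustrative assumptions.
set -eu

REDIS_HOST="${REDIS_HOST:-127.0.0.1}"   # assumed default host
REDIS_PORT="${REDIS_PORT:-6379}"        # assumed default port
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the commands

# slowlog-log-slower-than is expressed in microseconds; convert from ms.
ms_to_us() {
  echo $(( $1 * 1000 ))
}

# Print the redis-cli invocation, or execute it when DRY_RUN=0.
# redis-cli reads the password from REDISCLI_AUTH when it is set.
run_redis() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "redis-cli -h $REDIS_HOST -p $REDIS_PORT $*"
  else
    redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" "$@"
  fi
}

run_redis CONFIG SET slowlog-log-slower-than "$(ms_to_us 10)"
run_redis CONFIG SET slowlog-max-len 1024
run_redis SLOWLOG GET 10
```

To execute for real, export `REDISCLI_AUTH` and run with `DRY_RUN=0`. Values applied with `CONFIG SET` are runtime-only, so they may not survive a pod restart unless the same settings are also present in the instance configuration.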
### 1. Create the Restore Instance With AOF Disabled

When you create the new Redis instance for the restore, override the parameter template to set `appendonly: "no"`. With AOF enabled, Redis rebuilds its dataset from the append-only file at startup and ignores `dump.rdb`; disabling AOF lets Redis load `dump.rdb` instead.

For example, on a `RedisFailover` resource:

```yaml
spec:
  redis:
    customConfig:
      appendonly: "no"
    restore:
      backupName: <backup-name>
```
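Because the restore silently depends on this one override, it can help to check the manifest before applying it. The sketch below is an assumption-laden illustration, not part of the article: the function names `aof_disabled` and `check_restore_manifest` are hypothetical, and the check is a plain text match rather than a YAML parse. It deliberately requires the value to be quoted, since a bare `no` is read as a boolean by YAML 1.1 parsers rather than as the string Redis expects.

```bash
# Sketch: refuse to proceed unless the restore manifest disables AOF.
# Function names are hypothetical; this is a text match, not a YAML parse.
set -eu

# Succeeds only if the file contains a quoted `appendonly: "no"` override.
# An unquoted `no` is rejected because YAML 1.1 parsers treat it as a boolean.
aof_disabled() {
  grep -Eq "^[[:space:]]*appendonly:[[:space:]]*[\"']no[\"'][[:space:]]*$" "$1"
}

# Report the result for the given manifest path and return non-zero on failure.
check_restore_manifest() {
  if aof_disabled "$1"; then
    echo "ok: AOF disabled in $1"
  else
    echo "refusing: $1 does not set appendonly to \"no\"" >&2
    return 1
  fi
}
```

A possible use, with `restore-instance.yaml` as a placeholder file name: `check_restore_manifest restore-instance.yaml && kubectl apply -f restore-instance.yaml`.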
| Pub/Sub | `psubscribe`, `publish`, `pubsub`, `punsubscribe`, `subscribe`, `unsubscribe` | `spublish`, `ssubscribe`, `sunsubscribe` |
| Stream | | `xack`, `xadd`, `xautoclaim`, `xclaim`, `xdel`, `xgroup`, `xinfo`, `xlen`, `xpending`, `xrange`, `xread`, `xreadgroup`, `xrevrange`, `xsetid`, `xtrim` |
| Transaction (not supported in Cluster mode) | `discard`, `exec`, `multi`, `unwatch`, `watch` | |
| Server Management | `command` (proxy), `config` (proxy), `info` (proxy) | `acl`, `bgrewriteaof`, `bgsave`, `command`, `config`, `dbsize`, `failover`, `flushall`, `flushdb`, `info`, `lastsave`, `latency`, `lolwut`, `memory`, `module`, `monitor`, `psync`, `replconf`, `replicaof`, `role`, `save`, `shutdown`, `slaveof`, `slowlog`, `swapdb`, `sync`, `time` |
- Solution
---

# Resolve Master-Replica Sync Failure
Mark each doc with the platform major(s) it applies to:

- 4.x only: features that require operator >= 3.18 (sentinel password, RedisProxyOperator, current best practices)
- 3.x only: legacy-bound docs (rename-command for <= 3.15, sentinel multi-instance merge unaffected on 3.18+/Redis 6+, RedisShake migration whose image table caps at 3.16.2)
- 3.x and 4.x: generic procedures (slow log, cluster slot ops, manual failover, recovery playbooks) that work identically on both majors
- Redis_Best_Practices.md (Resource Spec table): use uppercase G for the Total resources column values (e.g. 4.5c / 4.8G).
- How_to_Rate_Limit_Redis_Traffic.md: ~500K QPS -> ~500k QPS (SI kilo is lowercase).
- How_to_Recover_From_Cross_Shard_Master_Corruption.md: indent the ```bash code block 2 more spaces so it aligns with the parent list item's content per remark-lint-code-block-split-list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>