- Agents: Selection based on suitability and configuration.
- DaemonSet pod: Design, integration, and deployment.
- Logs: Source log configuration changes, enrichment, and filtering.
- Shipping: Agent-target integration and pipeline benchmark testing.
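The enrichment and filtering step listed under Logs could look roughly like the sketch below. It assumes JSON-formatted log lines. The function name, the `customer_id` and `node` fields, and the set of dropped levels are hypothetical choices for illustration, not part of any specific agent's API.

```python
import json
from typing import Optional

# Hypothetical filter rule: log levels considered too noisy to ship.
DROPPED_LEVELS = {"DEBUG", "TRACE"}


def enrich_and_filter(raw_line: str, customer_id: str, node_name: str) -> Optional[dict]:
    """Parse one log line, drop noisy levels, and attach routing metadata.

    Returns None when the record should be filtered out.
    """
    try:
        record = json.loads(raw_line)
    except json.JSONDecodeError:
        record = None
    if not isinstance(record, dict):
        # Keep unparseable or non-object lines instead of losing them.
        record = {"message": raw_line, "level": "UNKNOWN"}

    level = str(record.get("level", "UNKNOWN")).upper()
    if level in DROPPED_LEVELS:
        return None  # filtered out before shipping

    # Enrichment: normalise the level and tag the record with its origin.
    record["level"] = level
    record["customer_id"] = customer_id
    record["node"] = node_name
    return record
```

In practice most log agents expose this kind of logic through their own filter configuration; the sketch only shows the shape of the transformation.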
Managed AWS ES vs. ELK:
- AWS ES Service vs. Self-Managed: Choose between them based on the team's ELK familiarity, cluster size, visibility and extensibility, uptime SLA requirements, and cost.
- Bucket organization: Separation and storage of logs by customer VPC.
- Flexible retrieval: Functions to retrieve each customer's logs over adjustable time intervals.
- ES snapshots: Keep the most recent month of data in ES; store the rest as snapshots in S3.
- Archival: Keep the most recent year of data in S3 and move the rest to Glacier.
- Cluster size: Configure ES and Kubernetes cluster size based on the data volume estimates.
- Storage volume: Determine storage requirements for S3 and Glacier based on the data volume. How much is configured as infrequent vs. frequent access in S3?
- Costs: Estimate TCO.
- Service integrations: CloudTrail and CloudWatch; Slack, Jira, and other notification channels with PagerDuty.
- Solution integrations: Agents to CloudWatch, CloudWatch with the solution, the solution with S3 and Glacier, and the solution with PagerDuty.
- CloudWatch vs. ELK: Determine which alerts will be raised from CloudWatch vs. ELK.
- Custom Alerts: Alert customizations required for Slack messages and Jira tickets.
- Monitor Solution and Services: Monitor the agents and the solution for 24/7 uptime with Prometheus.
- Incident management: We act as a central system for all notification channel integrations.
- Extensibility: Send alerts from CloudWatch and ELK to SNS so that other systems and tools can subscribe to them in addition to PagerDuty.
- Agents: Set up the DaemonSet pod security policy.
- Solution: Security implementation for customer VPC to solution VPC traffic and for the ELK stack, S3, and PagerDuty.
- Logs: Custom tamper-proof implementation using a sign-and-sequence approach.
- Agents: Integrate the DaemonSet pod deployment with the Kubernetes cluster running in the customer VPC.
- Solution: Deploy the ELK stack on Kubernetes using Helm charts or Rancher, set up the S3 buckets, configure archival from S3 to Glacier, and set up PagerDuty services.
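The sign-and-sequence idea mentioned for tamper-proof logs is not spelled out here, so the following is only one possible reading. It assumes each record gets a monotonically increasing sequence number and an HMAC-SHA256 signature that also covers the previous record's signature. The function names and record layout are hypothetical.

```python
import hashlib
import hmac
import json


def _payload(seq: int, message: str, prev_sig: str) -> bytes:
    """Canonical bytes to sign: sequence number, message, and previous signature."""
    return json.dumps(
        {"seq": seq, "message": message, "prev": prev_sig}, sort_keys=True
    ).encode("utf-8")


def sign_records(messages, key: bytes):
    """Assign sequence numbers starting at 1 and chain-sign each message."""
    signed = []
    prev_sig = ""
    for seq, message in enumerate(messages, start=1):
        sig = hmac.new(key, _payload(seq, message, prev_sig), hashlib.sha256).hexdigest()
        signed.append({"seq": seq, "message": message, "sig": sig})
        prev_sig = sig
    return signed


def verify_records(records, key: bytes) -> bool:
    """Return True only if sequence numbers are gapless and every signature matches."""
    prev_sig = ""
    for expected_seq, record in enumerate(records, start=1):
        if record.get("seq") != expected_seq:
            return False  # a record was removed, inserted, or reordered
        expected = hmac.new(
            key, _payload(expected_seq, str(record.get("message", "")), prev_sig),
            hashlib.sha256,
        ).hexdigest()
        sig = str(record.get("sig", ""))
        if not hmac.compare_digest(expected, sig):
            return False  # content or chain was altered
        prev_sig = sig
    return True
```

Under these assumptions, editing a message, deleting a record from the middle, or reordering records makes verification fail. Dropping records from the end of the stream would not be caught by this sketch alone; that would need the latest sequence number to be recorded somewhere the attacker cannot change.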
Testing and Stabilization:
- Create a test environment in the client network.
- Continuous deployment and testing in the target environment are critical.