TeamPlus
Years of exp.: 6+ years of experience in cloud engineering, observability,
or SRE, with at least 2 years in a client-facing consulting, professional services, or
systems integrator role.
· Deep hands-on expertise with OpenTelemetry (collectors: deployment, DaemonSet, gateway; auto/manual instrumentation; OTLP; semantic conventions; distributed tracing; pipelines; fleet management; release management) and at least one major APM platform (Datadog, Grafana Stack, Elastic APM, Dynatrace, Honeycomb, etc.), delivered in production client environments.
· Strong cloud architecture and application solution architecture experience, preferably on AWS, with proven ability to design observable, resilient distributed systems using IaC tools (e.g., Terraform).
· Demonstrated success leading client projects end-to-end, including stakeholder management, workshops, and delivery under tight timelines.
· Proficiency in scripting and development using Python, Go, and JavaScript/TypeScript.
· Solid understanding of AI/ML concepts related to observability (monitoring LLM latency, token usage, vector search, GPU workloads).
· Solid understanding of Internal Developer Platform (IDP) technologies (e.g., Port, Backstage) and platform engineering concepts.
· Experience working in a consulting or services startup environment (or an equivalent high-velocity, client-driven setting) with high ownership and adaptability.