Yinon Mitin
Site Reliability Engineer | Kubernetes | AWS | Terraform | Observability | Linux | Networking
- yinon.mitin@gmail.com
- linkedin.com/in/yinon-mitin
- Location: Tel Aviv, IL
Summary
Platform and Site Reliability Engineer focused on production infrastructure, Linux, Kubernetes, cloud platforms, networking, automation, and observability. Builds and maintains CI/CD pipelines, Terraform and Helm infrastructure workflows, monitoring and alerting systems, secure access controls, backup/recovery procedures, and operational automation for business-critical environments. Combines commercial e-commerce infrastructure experience with ownership of distributed connectivity systems, applying SRE, DevSecOps, and platform engineering practices across reliability, scalability, security, and runtime operations.
Professional Experience
DevOps Engineer | Padani Jewelers (E-commerce) | 2020-2024
- Led automation for a large-scale Magento-to-Shopify migration, designing repeatable data flows, batch processing routines, validation steps, and operational handoff processes that reduced large-catalog deployment time by approximately 60%.
- Developed and maintained a production-facing Python ETL pipeline for extracting, validating, normalizing, transforming, and reconciling product data across business-critical systems.
- Built automated asset collection workflows from vendor and brand sources, applying SKU-level naming standards, packaging rules, integrity checks, and structured delivery into downstream systems.
- Operated recurring production workloads on Linux using cron, SQLite state tracking, structured logs, retry-safe routines, and automated dataset generation to reduce manual intervention and improve reliability.
- Built a microservices platform with AWS-based infrastructure provisioned via Terraform, workloads deployed to managed Kubernetes, and application releases managed with Helm.
- Implemented GitLab CI/CD pipelines to automate builds, validation, deployment flow, and operational repeatability across infrastructure and application changes.
- Added Prometheus- and Grafana-based monitoring and alerting to improve runtime visibility, release confidence, and service reliability.
- Improved operational troubleshooting by standardizing logs, dashboards, failure detection points, and repeatable recovery procedures for automation and infrastructure workloads.
Infrastructure Engineer & SRE | NDA | 2024-Present
- Designed and built a commercial VPN/connectivity platform from scratch, serving thousands of users at high traffic throughput with distributed ingress/egress points, load-balanced routing, autoscaling, and 24/7 operation.
- Defined the full topology with isolated Linux nodes for protocol termination, traffic routing, control-plane services, observability, log aggregation, backups, and administrative access.
- Implemented multi-protocol support across WireGuard, OpenVPN, VLESS, Xray-based stacks, and related tunneling technologies to serve heterogeneous clients, routing constraints, censorship-resistance scenarios, and protocol redundancy.
- Built resilient traffic paths using load balancers, reverse proxies, firewall boundaries, DNS controls, secure management channels, and service isolation to enforce defense-in-depth across production nodes.
- Implemented metrics, structured logs, uptime checks, dashboards, and alerting, with telemetry transported through encrypted tunnels rather than exposed public management endpoints.
- Built a Telegram-based commercial and administrative control plane for onboarding, subscription payments, access provisioning, operational notifications, and customer-facing automation.
- Designed a dedicated data-protection layer with separate log and backup nodes, encrypted backup workflows, external replicas, recovery procedures, and failure-isolation boundaries.
- Applied DevSecOps/SRE practices across access control, SSH hardening, firewall policy, secrets hygiene, patching, health checks, incident feedback, capacity planning, and repeatable recovery.
- Owns the platform end-to-end as a commercial infrastructure product, with uptime, security posture, customer impact, cost efficiency, automation quality, and reliability treated as core engineering requirements.
Independent Infrastructure & Platform Projects
HomeLab / Private Infrastructure Platform
- Built and currently operates an enterprise-grade private infrastructure environment with multiple routers, managed switches, servers, segmented networks, self-hosted services, private DNS, firewall policy, and controlled remote access.
- Designed VLAN segmentation, routing boundaries, DNS filtering, secure tunnels, and isolated service zones to enforce containment and least-privilege access between network zones.
- Operates self-hosted services for backup orchestration, storage management, media automation, monitoring, alerting, and lifecycle maintenance without unnecessary public exposure.
- Built resilient storage workflows with mirrored NAS, scheduled replication, external backups, validation routines, and recovery testing to reduce data-loss risk.
- Runs containerized Linux services with service discovery, reverse proxying, health checks, log collection, update routines, and maintenance automation.
- Operates the environment as a long-running private infrastructure platform for validating network architecture, service automation, observability, recovery workflows, security controls, and capacity planning under real operational constraints.
Skills
SRE / Platform Engineering: Production Ownership, Reliability Engineering, Incident Response, Troubleshooting, Runtime Analysis, Capacity Planning, Operational Automation, Backup & Recovery, Hardening, RCA
Infrastructure / Cloud: Linux, AWS, EC2, S3, RDS Concepts, VPS/VDS Infrastructure, Terraform, Docker, Kubernetes, Helm, Git, GitLab CI/CD, Infrastructure as Code
Networking / Security: TCP/IP, DNS, Firewall Policy, VLANs, Routing, Reverse Proxy, HAProxy, Nginx, Load Balancing, WireGuard, OpenVPN, VLESS, Xray-based Protocols, Secure Tunnels, Access Control
Observability: Prometheus, Grafana, Loki, Alertmanager, Metrics, Dashboards, Structured Logging, Uptime Monitoring, Telegram Alerts, OpenTelemetry, ELK Stack
Automation / Programming: Python, Bash, Cron, SQLite, ETL Automation, Deployment Automation, Validation Pipelines, Operational Runbooks, AI-assisted Debugging with Cursor / Claude Code
Cloud Exposure: AWS, GCP, Yandex Cloud
Education
| Computer Science | LIT | 2017-2020 |
| Software Engineering | Yandex University | 2017-2022 |
Languages
| English - C1 | Russian - Native | Hebrew - A2 |