About The Role
Teza is a systematic trading firm building quantitative strategies across multiple asset classes. We are looking for a DevOps Engineer to own and evolve our infrastructure platform — the systems that our quants, developers, and traders rely on every day.
Our infrastructure team is at an inflection point. We have a working platform that supports active trading and research, and we are now investing in making it more robust, observable, and developer-friendly. The successful candidate will have significant influence over the direction of our infrastructure — shaping tooling choices, establishing standards, and building systems that scale with the firm.
Location
Yerevan, Armenia; Austin, Texas; or London, UK
What We’re Building Toward
- Full-stack observability across all internal services — unified metrics, centralized logging, distributed tracing, and actionable alerting — so that engineers and traders have clear, real‑time visibility into system health and performance.
- Reliable, self‑service compute orchestration spanning Slurm (HPC/ML workloads), Hadoop (batch data processing), and Airflow (workflow scheduling) — enabling researchers and data engineers to run workloads at scale without infrastructure bottlenecks.
- Mature secrets management for trading credentials, API keys, certificates, and service‑to‑service authentication — with rotation policies, auditing, and tight integration into deployment workflows.
- Unified release pipelines that bring consistency to how diverse applications (trading strategies, data pipelines, and real‑time trading systems), each with its own build, test, and deployment needs, move from development to production.
- A well‑maintained platform foundation where shared services — identity management, GitHub Actions runners, VPN, observability tooling — stay current and reliable without disrupting active trading.
- Strong security posture across production and research environments — network segmentation, access controls, vulnerability management, and compliance — that evolves alongside the platform rather than being bolted on after the fact.
Requirements
- Experience building and operating internal platform services for development teams (CI/CD, compute, monitoring, developer tooling) — not just consuming them.
- Strong proficiency in Linux systems administration and container orchestration (Docker, Kubernetes, or similar).
- Deep understanding of the software development lifecycle and how infrastructure supports engineering teams — from local development through CI to production deployment.
- Familiarity with ML and data‑intensive workflow requirements: GPU scheduling, large dataset access patterns, experiment tracking, and reproducible compute environments.
- Proficiency with at least one major cloud provider (AWS, GCP, or Azure) including networking, IAM, and managed services.
- Experience designing and operating hybrid infrastructure — cloud, on‑premises, and colocation environments — with an understanding of the tradeoffs between them.
- Hands‑on programming ability in Python or another scripting language, sufficient to build tooling, automation, and infrastructure‑as‑code — not just run playbooks.
- Solid understanding of core network protocols and services: DNS, LDAP, SMTP, TLS, HTTP, and SSH.
- Practical knowledge of infrastructure security: firewall management, access control approaches (zero‑trust, bastion hosts), vulnerability scanning, patch management, and audit logging.
Nice to Have
- Experience in a trading firm or other environment with strict uptime and latency requirements.
- Familiarity with infrastructure‑as‑code tools (Terraform, Pulumi, Ansible).
- Experience with log aggregation and SIEM systems.
- Understanding of compliance frameworks relevant to financial services.
Benefits
- Health, vision, and dental insurance
- Flexible sick time policy
