SHORT DESCRIPTION:
We are looking for talented, experienced, and motivated individuals in the following role to join the EDP team to take EDP to its next level.
The Observability Architect is responsible for the architecture of the infrastructure-wide observability platform, which provides observability (metrics, events, alerting, remediation) for the core infrastructure technology teams (Network, Compute, Virtualization, Storage, Security, and Software) and for products provided to EDP Infrastructure customers. The infrastructure observability platform must also integrate with the EDP-wide observability platform, which encompasses the other EDP technical towers (k8s, DevOps, Data, IAM, etc.). The architect does this in conjunction with, and through consultation with, the other infrastructure technology architects and the lead Infrastructure Architect.
ABOUT THE CLIENT:
The EDP team is building an internal platform for software product developers to accelerate the development and delivery of software products that tackle the massive challenges facing the energy sector. The EDP Platform is a service-oriented, cloud-native platform that is being built to provide application teams with self-service capabilities to develop, run, and operate their software products. The EDP Platform provides services for application infrastructure, data, service lifecycle management, and application build and delivery, as well as services to operate software products. The EDP Platform is deployed as a hybrid cloud, encompassing both private cloud and select public clouds.
YOU WILL:
The architect is responsible for their technology domain, which covers the end-to-end lifecycle of the solutions, from hardware to software and, where required, custom solutions (i.e., code) to deliver the required results. Because the architect is responsible for the solution lifecycle, they are also responsible for ensuring the solution is correctly deployed by the engineering team. The architect is not responsible for engineering delivery but is responsible for ensuring the architecture is adhered to and for resolving design issues fed back from engineering.
As this is an architecture role, the individual is expected to be self-motivated and proactive: identifying and researching new ways of doing things and new technologies, and inventing new ways to provide the required solutions and overcome technical challenges.
The Observability Architect is responsible for the following technology areas:
- Infrastructure core metrics, events and alerting (Compute, Network, Storage, Security, IaaS)
- Customer metrics, events and alerting
- Security events, alerting and remediation
- IaaS metrics, events and alerting
- EDP-wide observability integration
- Internal Infrastructure Observability Platform
- Customer Observability Platform
- QA Team / Testing Observability Platform
YOU NEED:
This is a senior architectural role; therefore, the individual must have at least 7 years of experience working in the specified technology area and be able to demonstrate this experience, alongside real-world experience of the entire life cycle of products and/or resources.
The architect must have excellent problem-analysis and resolution competencies.
The individual must have the experience to think holistically, so that their work takes a forward view and enables solutions to be built upon with minimal disruption upstream or downstream. This means understanding not just the core competency but also the surrounding aspects and technologies, to ensure the architectures fit into the overall strategy.
The architect owns the architectures through their entire lifecycle and is therefore fully responsible for them.
Must-have skills (must have unless otherwise noted):
- MUST understand and have proven experience with observability (metrics, events, logging, alerting) for physical data center infrastructure, i.e., network devices, compute platforms, storage platforms, and security platforms.
- Have extensive experience with various observability platforms.
- Able to provide solutions using COTS, open-source, and custom solutions (code) to fulfil end-to-end requirements.
- OpenTelemetry
- At least 3 of Loki, Grafana, Prometheus, DataDog or AppDynamics, DynaTrace, ELK
- Reporting tools and customizing for specific audiences
- Integrations with custom-written software (e.g., Python) exposing metrics, events and alerting
- Integrations with hypervisor platforms (KVM, ESXi)
- Experience using ML as a forecasting technology for scaling infrastructure hardware based on usage stats/metrics
- eBPF experience to extract traffic flow metrics and analyze potential issues (performance, security)
- k8s CNI knowledge (e.g., Cilium) to extract traffic flow metrics and analyze potential issues (performance, security)
- Database knowledge to support observability platform data storage requirements (e.g., time-series, graph)
WOULD BE A PLUS:
There are no defined preferred competencies; additional skills will be evaluated and are of benefit but are not required. It is paramount that all skills in the “Must Have” section are proven at a senior level.
WE OFFER:
While we’re still crafting the details of this exciting opportunity, we encourage you to submit your resume and express your interest in being part of an exceptional team. Don’t miss out on the chance to be part of something special. Apply now and let’s shape the future together!