Lotus's

Department Manager - Site Reliability Engineer

Job Description

Key Responsibilities:

  • Lead and manage staff resources, assign work to the teams, and manage team schedules.
  • Lead and manage application services in the areas of responsibility to ensure BAU stability and meet the expected incident SLAs and system availability levels defined for peak and off-peak periods.
  • Perform root cause analysis (RCA) to troubleshoot issues immediately and resolve them (short term, medium term, and long term) within the incident SLA, taking both proactive and reactive action.
  • Perform BAU system setup, bug fixing, and small CRs following the IT implementation methodology (build, test, deploy), aligned with company security and with business objectives and strategy.
  • Lead and manage the system monitoring process to ensure data quality and integrity in production, so that data is always accurate and available to the key stakeholders and business processes that depend on it.
  • Lead and manage regular IT audit checks on recorded calls and incidents, and provide feedback to team members to ensure procedures are followed and quality improves.
  • Lead and manage regular system patch upgrades with the product owner and business stakeholders.
  • Lead and manage the IT service and support operating model and procedures in the area of responsibility to ensure the team can support BAU and business stakeholders smoothly, especially during month-end and year-end financial closing activities.
  • Manage the support workbook and controls. Ensure the knowledge base is well organized and kept up to date.
  • Be familiar with REST APIs (synchronous processes), message producer/consumer processes (asynchronous processes), and batch processes.
  • Be familiar with open-source monitoring tools such as the ELK stack and Grafana.
  • Be familiar with container technology such as Docker and Kubernetes (K8S).
  • Be familiar with cloud technology such as AWS, Azure, and Tencent Cloud.

Qualifications:

  • Bachelor's degree in Computer Science or a related field.
  • 6+ years of experience as an SRE or Support Engineer, including leadership experience.
  • Familiar with programming (Java, Go), basic SQL, Linux/Unix scripting, and cloud platforms (AWS, Azure, Tencent Cloud).
  • Experienced in ITIL (Ticket Management, Problem Management)
  • Familiar with container technology such as Docker, Kubernetes (K8S).
  • Skilled in monitoring tools (ELK stack, Grafana) and incident response, with experience in high-availability design.
  • Excellent communication and team mentorship skills, with experience leading cross-functional projects.
  • Good English proficiency

CP AXTRA | Lotus's

CP AXTRA Public Company Limited.

Nawamin Office: Buengkum, Bangkok 10230, Thailand

By applying for this position, you consent to the collection, use and disclosure of your personal data to us, our recruitment firms and all relevant third parties for the purpose of processing your application for this job position (or any other suitable positions within Lotus's and its subsidiaries, if any). You understand and acknowledge that your personal data will be processed in accordance with the law and our policy.

More Info

Industry: Other

Function: Technology

Job Type: Permanent Job

Date Posted: 16/11/2024

Job ID: 100528725

Last Updated: 17-11-2024 07:05:54 PM