SAP on AWS delivery

SAP S/4HANA implementation on AWS for a pharmaceutical business

Cloudwrxs supported the implementation and business go-live of SAP S/4HANA 2022 on AWS for a pharmaceutical subsidiary, alongside Solution Manager and Web Dispatcher. The programme included greenfield deployment, SAP Best Practices activation, legacy-data migration from ECC, transport and Basis support, backup and recovery design, and business continuity planning for production.

The implementation process unfolded in a clear delivery sequence

The programme was executed as a greenfield S/4HANA implementation with a strong Basis and infrastructure workstream running alongside the application rollout. The target landscape supported a pharmaceutical business and included SAP S/4HANA 2022, Solution Manager and Web Dispatcher.

The implementation involved migrating several years’ worth of legacy data from the client’s ECC system into the new S/4HANA system to support pharmaceutical business operations.
The implementation process unfolded as follows:

  • A Greenfield implementation of the S/4HANA system
  • Activation of SAP Best Practices
  • Customization of the system to meet specific needs
  • Migration of legacy data using the Migration Cockpit
  • Ensuring business continuity throughout the process
  • Successful business go-live

SAP on AWS planning decisions determined the shape of the deployment

Before any SAP system was deployed, the implementation had to settle a number of foundation decisions: the AWS Region closest to users and data centres, data residency and regulatory requirements, support for the required AWS services and EC2 instance families, multi-AZ needs for HA, the right deployment model, and the commercial impact of AWS service pricing.

Those choices set the course for the full implementation. Once they were clear, the project could move from design decisions into actual deployment and connectivity planning.

Region, service availability and pricing considerations for cloud deployment

What were the options?

When deploying an SAP S/4HANA system on AWS, you have the same options as on-premises:

  • A standalone installation, where the database, central services instance, and the dialog instance are kept on the same host.
  • A distributed installation, where each component is installed on separate VMs.
  • A highly available installation, which prevents unplanned downtime due to redundancy of components.

Development was built as a distributed S/4HANA environment

SAP Best Practices provide a preconfigured content library of end-to-end business processes, based on SAP’s extensive global implementation experience. These packages support rapid configuration and deployment, ensuring process consistency and reducing customization effort across SAP projects.

  • SAP S/4HANA
  • OS – SUSE Linux Enterprise Server 15 SP5

Filesystems

Application server filesystem mount points
Database server filesystem mount points
Central services filesystem layout
Shared filesystem layout for continuity planning

The development landscape ran SAP S/4HANA 2022 with the primary application server, ASCS and central services on SUSE Linux Enterprise Server 15 SP5, while the HANA database ran on a separate dedicated database host. Solution Manager 7.2 was also deployed with its own database host. This was a distributed deployment in which the database and application tiers were intentionally separated.

After the installation and post-installation work, multiple clients were created for specific business and functional needs. SAP Best Practices were then activated for Germany, Saudi Arabia and the UAE, with the group currency changed from the default USD to KWD to suit the client’s operating model.

    Client 100 – Customization & Development

This is a unique client. It is the origination client for all functional transports across the landscape. It is the only client in the landscape that cannot be recreated with a client copy. It cannot be refreshed; it can only be restored. This client is used by the ABAP programmers to create new ABAP code.

    Client 200 – Unit testing

This is the first client where official testing occurs. The Unit Test Client is for testing individual transactions and configuration; i.e., the smallest unit of a transaction or business process. Everyone works in the Unit Test Client. All transactions are executed in the Unit Test Client. ABAP code, Security Activity Groups, Data loads, Configuration, Master Data, Batch Jobs are all tested in this client. This client is the earliest version of what Production will look like with data in it.

    Client 300 – Sandbox client

Sandbox client is a separate, isolated environment used for testing and experimenting with configurations and customizations without affecting the main development or production systems.

Development clients 100, 200 and 300 in S/4HANA

Solution Manager and Site-to-Site VPN supported the programme controls

Once the deployment model was chosen, connectivity between the on-premises estate and AWS was established through a Site-to-Site VPN. The development architecture placed SAP S/4HANA, Solution Manager and HANA in a private subnet, with a jump server in a public subnet to provide controlled access into the landscape.

Solution Manager was used for change request management, system monitoring, EWA alerts, and ADS support for the S/4HANA development and quality systems. This kept the project governed while the implementation moved through build and testing.

  • Private-subnet architecture for SAP systems and database hosts
  • Jump server in a public subnet for secure administrative access
  • Site-to-Site VPN between customer network and AWS
  • Solution Manager used for change, monitoring, EWA alerts and ADS

High level Architecture

SAP on AWS high level architecture with VPN, jump server, public subnet and private subnet
SAP on AWS private subnet architecture with controlled administrative access
Architecture diagram showing application, database and connectivity layers

Solution Manager Filesystems

Solution Manager filesystem mount points
Solution Manager 7.2 ABAP filesystem layout
SAP NetWeaver Java 7.5 filesystem layout

Quality mirrored development but added scale for SIT and UAT

The quality system retained the same overall architecture as development, but a larger database instance was used because SIT and UAT required more data. An additional application server was also provisioned so load could be distributed between PAS and AAS.

This allowed the quality environment to behave much more like a realistic pre-production landscape while still keeping the build pattern familiar to the delivery teams.

  • SAP Change Request Management
  • System Monitoring
  • EWA alerts
  • ADS for S4 Development and Quality system

The architecture of the quality system remains largely the same as development, except for a few points:

  • Higher-capacity database host to support larger data volumes
  • Additional application server added for load distribution
  • SIT and UAT performed on a landscape close to production behaviour

Client 100 – System integration testing (SIT)

This client concerns the overall testing of a complete system of many subsystem components or elements.

Client 200 – User acceptance testing (UAT)

Also called application testing or end-user testing, UAT is a phase of software development in which the software is tested in the real world by its intended audience.

Quality clients for SIT and UAT

SAP on AWS production was deployed as a highly available, multi-AZ landscape

Before SAP HANA system replication and high-availability controls were discussed, the project first put database backup in place. AWS Backint was used to back up the HANA database to an S3 bucket, with daily backups scheduled through SAP HANA Cockpit.

Backint provided the interface between SAP HANA and external backup storage, while the AWS Backint Agent handled the transfer of backup and catalog files into Amazon S3 or AWS Backup. This supported full, incremental and differential backups, along with log and catalog protection.

  • ASCS and ERS distributed across Availability Zones
  • AAS included for resilience and workload distribution
  • Dedicated HANA database tier spanning Availability Zones
  • Web Dispatcher used for secure Fiori access

Filesystems

SAP S/4HANA application
Production application filesystem layout
SAP HANA database
Production database filesystem layout
SAP Web Dispatcher
Web Dispatcher filesystem layout

Clients in S/4HANA Production system

Client 100 – Production

The live customer client, used to record the customer’s business transactions.

Production client 100

The architecture diagram depicts the production environment, where SAP S/4HANA and the HANA database are hosted in a private subnet. A Web Dispatcher is deployed in the public subnet (DMZ) to provide secure access to the Fiori applications from the internet. All communication between AWS and the customer data centre is routed through a Site-to-Site VPN connection. High availability (HA) in an SAP S/4HANA production environment is essential to ensure uninterrupted business operations, minimal downtime, and continuous access to critical enterprise applications and data.
Before discussing business continuity, one more important topic needs to be covered: SAP HANA database backup. This is one of the most important checks that must be in place before SAP HANA database high availability is configured.

Application architecture diagram with public and private subnets

SAP HANA Database Backup Configuration

Before enabling SAP HANA system replication, ensure that backups are configured. In our case, we used AWS Backint to back up the HANA database to an S3 bucket and scheduled daily backups using SAP HANA Cockpit.

  • AWS Backint used to back up SAP HANA to Amazon S3
  • Daily backup scheduling managed in SAP HANA Cockpit
  • Support for full, incremental, differential and log backups
  • Backup readiness treated as a prerequisite for HA

The diagram above shows that the daily backup is triggered from SAP HANA Cockpit and configured using AWS Backint. The AWS Backint Agent then stores the backup files in your Amazon S3 bucket, based on the information provided in the AWS Backint Agent for SAP HANA configuration file.
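As an illustration of this flow, the agent configuration and a Backint-triggered backup might look like the sketch below. The YAML keys follow the AWS Backint Agent for SAP HANA parameter reference; the bucket, folder, Region, instance number and credentials are placeholders, not values from this project.

```shell
# Illustrative AWS Backint Agent configuration (all values are placeholders).
# The agent typically reads this file from:
#   /hana/shared/aws-backint-agent/aws-backint-agent-config.yaml
cat <<'EOF' > /hana/shared/aws-backint-agent/aws-backint-agent-config.yaml
S3BucketName: "my-hana-backup-bucket"
S3BucketFolder: "backups/S4H"
S3Region: "eu-central-1"
EOF

# Trigger a full data backup through the Backint interface with hdbsql
# (run as the <sid>adm user; instance number and password are placeholders).
hdbsql -i 00 -d SYSTEMDB -u SYSTEM -p '<password>' \
  "BACKUP DATA USING BACKINT ('COMPLETE_DATA_BACKUP')"
```

Daily scheduling itself was handled in SAP HANA Cockpit, as described above; the command here only shows the Backint path a scheduled backup takes.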

Now we come to the most important topic: business continuity. SAP is a mission-critical application. It provides a unified platform for managing business processes, streamlining operations, and improving overall efficiency. As a leading ERP (Enterprise Resource Planning) platform, it helps businesses of all sizes integrate data and processes across departments, enabling better decision-making and real-time insights.
Multi-AZ resilient deployment is tailored for business-critical workloads. All services that could become single points of failure are replicated across multiple Availability Zones to provide fault tolerance and maximize uptime.
The section below explains how SAP S/4HANA on AWS was configured using SUSE Linux Enterprise Server (SLES). It provides step-by-step instructions for configuring a Pacemaker cluster for the ABAP SAP Central Services (ASCS) and the Enqueue Replication Server (ERS) across EC2 instances in two Availability Zones within the same AWS Region.

Backint backup flow from HANA Cockpit to object storage

Common Single Points of Failure in SAP

  • SAP Central Services (SCS/ASCS): manages the lock table, message handling and the gateway for the ABAP stack. To avoid the SPOF, use an HA cluster with a failover node and the Enqueue Replication Server (ERS).
  • Enqueue Server: maintains the lock table for SAP transactions, which is critical for data consistency. To avoid the SPOF, implement ERS to replicate the lock table, with failover handled by Pacemaker.
  • Database: the central data repository; a failure halts all SAP operations. To avoid the SPOF, use HANA System Replication (HSR).
  • Application Server (PAS): with only one Primary Application Server, its failure affects logins. To avoid the SPOF, add redundant Additional Application Servers (AAS) behind a load balancer.

Table of common single points of failure and mitigations

High-Availability (HA) Strategy

A pair of cluster nodes deployed in isolated subnets located in separate Availability Zones, all within a single VPC and AWS Region.
The PAS and AAS are distributed across different availability zones or physical servers, and configured behind a load balancer or SAP Web Dispatcher for load distribution and failover.

Enqueue Replication Server (ERS) is used to replicate lock table data from ASCS, ensuring failover without data loss in case of a primary node failure.
The cluster nodes also need permissions to view and modify the route tables associated with the specified subnets, which the overlay IP mechanism depends on.

  • SUSE Linux Enterprise Server for SAP applications (SLES for SAP).
  • AWS – Overlay IP

The picture above is a high-level diagram of how to mitigate the SPOF at the ASCS/ERS level. ASCS is a single point of failure: if it fails, SAP communication and locking stop.
The SAP ASCS and ERS components are installed on two different nodes in two AZs. ERS runs on Node B, continuously replicating the lock table. If ASCS fails, it automatically moves to the node on which ERS is running, keeping the lock table consistent when ASCS restarts.

SAP on AWS ASCS and ERS failover architecture across two nodes

Pacemaker Architecture

The diagram above is a low-level architecture diagram showing the depth of the configuration: resources, Corosync and Pacemaker. Various cluster resources are configured to achieve business continuity.

Pacemaker and Corosync resource architecture

We created virtual IP, sapstartsrv and SAP instance resources for ASCS and grouped them together. In a failover, the group containing these cluster resources moves as a unit to the node on which ERS is running. In the same way, we configured cluster resources for ERS and grouped them in a separate group.
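A minimal crmsh sketch of that grouping, following the common SLES-on-AWS pattern. The SID (S4H), instance numbers, IP address, route table ID and profile path are placeholders, not the project's actual values; the sapstartsrv resource is omitted for brevity.

```shell
# Overlay IP resource for ASCS (aws-vpc-move-ip updates the VPC route table).
crm configure primitive rsc_ip_S4H_ASCS00 ocf:suse:aws-vpc-move-ip \
  params ip=192.168.10.10 routing_table=rtb-0123456789abcdef0 interface=eth0 \
  op monitor interval=60s timeout=60s

# SAP instance resource for ASCS (profile path is a placeholder).
crm configure primitive rsc_sap_S4H_ASCS00 ocf:heartbeat:SAPInstance \
  params InstanceName=S4H_ASCS00_s4hascs \
         START_PROFILE=/sapmnt/S4H/profile/S4H_ASCS00_s4hascs \
         AUTOMATIC_RECOVER=false \
  op monitor interval=120s timeout=60s

# Group the ASCS resources so they always fail over together.
crm configure group grp_S4H_ASCS00 rsc_ip_S4H_ASCS00 rsc_sap_S4H_ASCS00

# Keep ASCS and ERS apart in normal operation; on failure ASCS follows ERS.
crm configure colocation col_sap_S4H_separate -5000: grp_S4H_ERS10 grp_S4H_ASCS00
```

An equivalent group is configured for the ERS instance, so each group can move independently between the two nodes.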

Cluster resource grouping for central services
High availability cluster architecture with STONITH controls
Failover resource movement between cluster nodes

SAP HANA DB High Availability

In today’s fast-evolving digital landscape, the security, availability, and reliability of data are more critical than ever. Organizations increasingly rely on robust data management platforms like SAP HANA to ensure seamless operations, enable informed decision-making, and maintain a competitive edge. With its high-performance, in-memory architecture, SAP HANA plays a pivotal role in driving modern data strategies.
SAP HANA stands out as a powerful platform known for its exceptional speed and scalability. One of its key strengths lies in its system replication capabilities, which provide robust support for high availability, disaster recovery, and optimized data distribution—ensuring minimal disruption even during planned maintenance or unexpected failures.

SAP HANA System Replication is a critical feature designed to ensure high availability and business continuity. It enables real-time replication of data from a primary HANA system to one or more secondary systems, thereby minimizing downtime during planned maintenance or unexpected system failures.
This section provides insights into some of the essential terms and commonly used commands associated with HANA System Replication. Whether you are setting replication up for the first time or managing an existing configuration, understanding these elements is key to maintaining a resilient SAP HANA environment.

Every time we discuss Business Continuity, RTO and RPO naturally come into focus. Let’s break down what these terms really mean.
RTO (Recovery Time Objective) and RPO (Recovery Point Objective) are two key concepts in disaster recovery and business continuity planning, especially in IT and data systems.

Achieving High Availability

Recovery Point Objective (RPO)

RPO defines the maximum allowable amount of data loss that an organization can tolerate in the event of a disruption or system failure.
In simpler terms: How much data can you afford to lose?

Recovery Time Objective (RTO)

RTO refers to the maximum acceptable downtime an organization can tolerate before restoring its systems and services following a disruption.
In simpler terms: How quickly must the system be back up and running?

Availability, recovery time and recovery point comparison

While RPO focuses on acceptable data loss, RTO addresses the acceptable recovery time. Effectively managing both requires a combined approach, including replication methods, backup strategies, system validation, and automated failover mechanisms—ensuring consistency, high availability, and rapid recovery in the face of disruptions.

HANA replication mode comparison

Overview of SAP HANA Replication

SAP HANA replication is a robust mechanism designed to duplicate data from a primary HANA system to one or more secondary systems. This real-time or near-real-time replication ensures that the secondary system mirrors the primary system’s memory and data, enabling rapid switchover when needed.

SAP HANA System Replication is a high availability feature offered by SAP to enhance the resilience of SAP HANA environments. It helps minimize downtime caused by planned maintenance, hardware failures, or disaster scenarios. In this setup, the secondary SAP HANA instance is a mirror of the primary system, maintaining an identical number of active hosts. Each service on the primary node continuously synchronizes with its corresponding service on the secondary node, operating in real-time replication mode to copy and persist both data and logs—typically preloading them into memory to enable rapid failover.

Ultimately, SAP HANA replication empowers organizations to remain resilient, highly available, and ready to meet the demands of today’s dynamic digital world.

System replication overview between primary and secondary database hosts

One of the most important aspects we also want to discuss in this paper is database backup, because a completed backup is one of the criteria that must be fulfilled before HANA system replication can be started. Logs have to be backed up before they can be replicated from the primary node to the secondary node, so a backup configuration that stores the changes persisted at the database must be in place first.

HANA Cockpit backup configuration screen

SAP HANA System Replication on SLES for SAP Applications on AWS

SUSE’s approach automates the takeover process in SAP HANA system replication environments. While replicating data to a secondary SAP HANA instance ensures data availability, it doesn’t guarantee system continuity on its own. To enhance high availability, a cluster solution is required — one that manages the failover process and ensures seamless client access by handling the service address transition.

Cluster Solutions

SAP HANA deployments on AWS are architected to provide high availability and fault tolerance at the infrastructure level. However, failures at the SAP HANA database layer still require management. In the event of a hardware or software issue, a manual failover can be initiated using tools such as SAP HANA Cockpit, SAP HANA Studio, or the hdbnsutil command-line utility. These manual recovery procedures may lead to temporary disruptions in business operations.

The high availability setup for SAP HANA leveraging System Replication enables automated failover between the primary and secondary instances. Both instances are configured within a Pacemaker cluster, which operates at the OS level and integrates with the SAP HANA database through specialized hooks. This clustering solution continuously monitors the system and initiates automatic failover when needed. As a result, recovery can typically be achieved within minutes or even faster.

The Pacemaker cluster leverages a virtual IP address to route traffic to the active SAP HANA master instance. During a failover event, this virtual IP is reassigned to the standby instance, which is then promoted to become the new primary. On AWS, an overlay IP address is utilized for network configuration—this virtual IP consistently points to the active SAP HANA node, regardless of whether it resides on the original primary or the secondary system.

Architecture patterns

AWS organizes its infrastructure into distinct geographic locations known as regions and subdivides them further into Availability Zones (AZs). Deploying across multiple Availability Zones within a Region enhances fault tolerance and helps maintain consistent performance by reducing the impact of localized failures.

In a single-Region, multi-AZ setup, the secondary SAP HANA system can be deployed in a separate Availability Zone from the primary system within the same Region. This configuration supports fast failover during planned maintenance, storage issues, or localized disruptions, ensuring higher availability and operational continuity.

In our project, we configured an active/passive secondary system with the performance-optimized scenario. System replication restricts read access and SQL querying on the secondary system until a takeover occurs, which switches the active role from the primary to the secondary system. The secondary functions as a hot standby using the logreplay operation mode. In the event of a failure of the primary SAP HANA system, whether due to a node or a database instance issue, the cluster initiates a takeover. This approach lets the secondary node use pre-loaded data, making takeover significantly faster than a full local restart.

System replication for the production database is managed using the SAP HANA and SAP HANA Topology resource agents. The level of automation can be controlled using the AUTOMATED_REGISTER parameter. When enabled, the cluster automatically registers the former primary node as the new secondary after a failover.
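A hedged sketch of the corresponding crmsh configuration, based on the standard SUSE SAPHanaTopology/SAPHana pattern; the SID, instance number and timeout values are placeholders rather than the project's actual values.

```shell
# SAPHanaTopology gathers replication status on every node; run it as a clone.
crm configure primitive rsc_SAPHanaTopology_S4H_HDB00 ocf:suse:SAPHanaTopology \
  params SID=S4H InstanceNumber=00 \
  op monitor interval=10s timeout=600s

crm configure clone cln_SAPHanaTopology_S4H_HDB00 rsc_SAPHanaTopology_S4H_HDB00 \
  meta clone-node-max=1 interleave=true

# SAPHana controls the primary/secondary roles; AUTOMATED_REGISTER=true makes
# the cluster re-register the former primary as the new secondary after takeover.
crm configure primitive rsc_SAPHana_S4H_HDB00 ocf:suse:SAPHana \
  params SID=S4H InstanceNumber=00 PREFER_SITE_TAKEOVER=true \
         AUTOMATED_REGISTER=true DUPLICATE_PRIMARY_TIMEOUT=7200 \
  op monitor interval=60s role=Master timeout=700s \
  op monitor interval=61s role=Slave timeout=700s

# Multi-state wrapper: exactly one promoted (primary) copy across the two nodes.
crm configure ms msl_SAPHana_S4H_HDB00 rsc_SAPHana_S4H_HDB00 \
  meta clone-max=2 clone-node-max=1 interleave=true
```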

SAP on AWS multi-AZ database replication topology
HSR configuration screen

Cluster Installation

When using SLES for SAP from the AWS Marketplace, SUSE HAE packages are already included. Check that you’re running the latest versions, and update via zypper as needed. Ensure that the following packages are installed.

corosync, crmsh, fence-agents, ha-cluster-bootstrap, pacemaker, patterns-ha-ha_sles, resource-agents, cluster-glue

Before proceeding with cluster configuration, the Pacemaker service should be in a stopped state. Confirm its status and stop it if active.
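On SLES for SAP, the package check and the Pacemaker stop might look like the following sketch; the package names are taken from the list above.

```shell
# Install or refresh the SUSE HAE packages (SLES for SAP Marketplace images
# already include them, so this mostly confirms versions are current).
sudo zypper install -y corosync crmsh fence-agents ha-cluster-bootstrap \
  pacemaker patterns-ha-ha_sles resource-agents cluster-glue

# Pacemaker must not be running while the cluster is being configured.
sudo systemctl status pacemaker --no-pager || true
sudo systemctl stop pacemaker
```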

HSR Configuration

Enable Replication on the primary node
Primary node replication enablement command output
SAP on AWS replication configuration showing SIT system details
Register the secondary system
Secondary database registration command output
HANA Cockpit system replication status view
Replication mode and operation mode configuration
Database replication parameter screen
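The enable and register steps shown above can be sketched with hdbnsutil as follows; the site names, primary hostname and instance number are placeholders.

```shell
# On the PRIMARY node, as the <sid>adm user, with the database running:
hdbnsutil -sr_enable --name=SITE_A

# On the SECONDARY node, as the <sid>adm user, with the database stopped:
hdbnsutil -sr_register --remoteHost=hanaprim01 --remoteInstance=00 \
  --replicationMode=sync --operationMode=logreplay --name=SITE_B

# Start the secondary after registering; it then begins syncing from the primary.
sapcontrol -nr 00 -function StartSystem
```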

To view HANA system replication status from the OS level, run the Python script systemReplicationStatus.py as the <sid>adm user.

systemReplicationStatus.py command output
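A typical way to invoke the script, assuming SID S4H and instance number 00 (both placeholders):

```shell
# Run as the <sid>adm user; "cdpy" is the standard sidadm alias that changes
# into the instance's exe/python_support directory.
su - s4hadm
cdpy
python systemReplicationStatus.py

# Equivalent direct invocation via HDBSettings.sh:
/usr/sap/S4H/HDB00/HDBSettings.sh systemReplicationStatus.py
```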

Pacemaker Cluster

1. Corosync configuration

    The Pacemaker cluster service must be inactive during cluster configuration. Verify its status and stop the service if necessary.
    systemctl status pacemaker

2. Create encryption keys

    After creation, the authkey file is located at /etc/corosync/. Copy it to the same location on the second node, making sure that file permissions and ownership remain unchanged.
    SSH access to cluster nodes

3. Update the hacluster password

    hacluster password update command output
    Pacemaker cluster command output

4. Start the cluster

    Cluster startup command output
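Steps 2 to 4 above might be carried out as follows; the second node's hostname is a placeholder.

```shell
# 1) Generate the Corosync authentication key on node A
#    (written to /etc/corosync/authkey).
sudo corosync-keygen

# 2) Copy the key to node B, preserving permissions and ownership.
sudo scp -p /etc/corosync/authkey root@node-b:/etc/corosync/authkey

# 3) Set the same hacluster password on BOTH nodes.
sudo passwd hacluster

# 4) Start the cluster stack on both nodes and check membership.
sudo systemctl start pacemaker
sudo crm status
```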

Cluster bootstrap

The cluster bootstrap settings are typically applied before adding the HANA resources to the cluster, or during initial cluster setup. When the stonith-action parameter is set to “off”, the fencing agents shut the instance down during failover scenarios instead of rebooting it.
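A hedged sketch of typical bootstrap properties and the EC2 fencing resource on SLES; the instance tag and AWS CLI profile names are assumptions, not project values.

```shell
# Cluster-wide bootstrap settings (values follow the common SLES-on-AWS pattern).
crm configure property stonith-enabled=true
crm configure property stonith-action=off
crm configure rsc_defaults resource-stickiness=1000 migration-threshold=5000
crm configure op_defaults timeout=600

# EC2 fencing agent so the cluster can stop a failed node via the AWS API.
# Assumes the instances carry a "pacemaker" tag and an instance profile (or
# a configured AWS CLI profile) with the required permissions.
crm configure primitive res_AWS_STONITH stonith:external/ec2 \
  params tag=pacemaker profile=cluster \
  op start interval=0 timeout=180 \
  op monitor interval=300 timeout=60
```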

Cluster bootstrap configuration output
Cluster defaults and resource settings
STONITH - Grant permissions for both nodes to start/stop
STONITH permission configuration for both nodes
Overlay IP resource
Overlay IP resource command output
SAPHanaTopology
SAPHanaTopology resource configuration
SAPHana
SAPHana resource configuration
Constraints - Multi-state (MSL)
Multi-state resource constraint configuration
Cluster Status
Cluster status showing synchronized replication
Bootstrap and fencing configuration output

Takeover Procedure During An Outage

Initial Situation

    o SAP NetWeaver is connecting to SAP HANA via the DBSL (Database Shared Library)
    o Usually a virtual hostname (virtual IP address) is used to access the database host and the database instance on that host. The Domain Name System (DNS) translates virtual hostnames into the corresponding virtual IP addresses, which can move between network adapter ports.
    o SAP HANA System Replication is working and secondary is in a synchronous or asynchronous state with primary SAP HANA instance.
    o System Replication always tries to get in synchronous state
    o With SYNC setup the primary waits for secondary to confirm operation of COMMITs

Incident happens, Take-over executed

    o A cluster manager is checking on operational state of the setup and takes action if a failure is happening
    o In case of this failure the cluster manager would isolate the box (drag virt. IPs away, even send a STONITH command) to prevent any further usage of primary host
    o The orchestrating cluster manager also initiates the take-over, waits for the secondary to reach full operational state, and finally moves the virtual IP address to the secondary host’s network port.
    o With the move of the virtual IP address there is finally a live system behind this interface again, and SAP NetWeaver sessions and work processes can reconnect to the secondary database instance.

      Follow-up and re-initiate SAP HANA System Replication in reverse direction

    o Every committed transaction and related changes are available again on the take-over system.
    o The resynchronization between the new secondary and the primary instance starts automatically. The resync may take some time.
    o SAP HANA automatically chooses the optimal way to perform this resynchronization (delta transfer).
    o Only after this resync a takeover back to the initial situation (failback) can be started.
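For reference, a manual takeover and the subsequent reverse registration can be sketched as below, run as the <sid>adm user; hostnames and site names are placeholders. With AUTOMATED_REGISTER=true, the cluster performs the re-registration itself.

```shell
# Manual takeover, executed on the SECONDARY node:
hdbnsutil -sr_takeover

# Failback preparation: once the old primary is healthy again, register it
# as the new secondary so replication runs in the reverse direction.
hdbnsutil -sr_register --remoteHost=hanasec01 --remoteInstance=00 \
  --replicationMode=sync --operationMode=logreplay --name=SITE_A
```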

Client Connection

    Connecting clients to SAP HANA using a virtual IP (VIP) on AWS typically involves configuring High Availability (HA) solutions that utilize a floating IP address for failover scenarios. This ensures continuous access to the SAP HANA database even if the primary instance becomes unavailable.
    A virtual IP address (or overlay IP address) is configured within the HA cluster. This IP address is not permanently bound to a specific instance but floats between the active and passive nodes.
    Clients (e.g., SAP applications, SAP HANA Studio, custom applications) are configured to connect to the SAP HANA database using this virtual IP address.
    During a failover, the clustering solution automatically moves the virtual IP address to the new active SAP HANA instance. This ensures that clients can seamlessly reconnect to the database without manual reconfiguration, as the connection target (the VIP) remains constant.
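For example, a client connection through the virtual IP might look like the following hdbsql call; the IP address, port (3<nn>13 for the system database SQL port) and credentials are placeholders.

```shell
# Connect to SYSTEMDB via the overlay/virtual IP; after a failover the same
# address resolves to the newly promoted primary, so the client is unchanged.
hdbsql -n 192.168.10.20:30013 -d SYSTEMDB -u SYSTEM -p '<password>' \
  "SELECT DATABASE_NAME, ACTIVE_STATUS FROM SYS.M_DATABASES"
```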

Takeover procedure sequence for database outage
Failover takeover command output

A successful SAP on AWS go-live depends on more than the application build

A successful SAP S/4HANA implementation on AWS is not only about installing the application and migrating data. The wider landscape has to be ready for real business use: secure connectivity, clear transport control, monitoring, backup, recovery, high availability, and
a tested continuity model all need to be in place before go-live.

In this implementation, the greenfield S/4HANA build was supported by a structured AWS architecture across development, quality, and production. SAP Best Practices were activated, legacy ECC data was migrated using Migration Cockpit, and the production environment was
designed with business-critical availability in mind.

The most important lesson is that Basis, infrastructure, and continuity planning cannot sit behind the functional workstream. They have to move alongside it. When the platform foundations are designed early, tested properly, and aligned to the business operating model,
the go-live becomes a controlled transition rather than a technical risk event.

  • Treat the AWS foundation as part of the SAP implementation, not a separate infrastructure task.
  • Design development, quality, and production with clear differences in capacity, availability, and purpose.
  • Complete backup and recovery configuration before relying on high availability controls.
  • Use Solution Manager, monitoring, and transport governance to keep the programme controlled.
  • Build production around real business continuity requirements, not just technical deployment success.
  • Validate the go-live approach across application, database, network, and operational support layers.

Planning an SAP implementation on AWS?

Cloudwrxs can help with SAP Basis delivery, landscape design, backup and recovery, HA architecture and controlled go-live support on AWS.

Talk to Cloudwrxs