SQL Server Virtualization Strategy: Itanium Replacement with x86 on VMware

Modernizing enterprise infrastructure often involves replacing aging hardware that no longer meets performance, compatibility, or support requirements. In one such case, an organization was running SQL Server 2008 on an Itanium-based platform. Although this system had served its purpose for many years, it was reaching the limits of practical operation. Vendor support was waning, and the Itanium architecture was becoming increasingly incompatible with newer software ecosystems. Investing in another generation of Itanium hardware was not a viable path forward.

The organization sought to migrate its environment to modern x86-based systems, with virtualization as the primary deployment model. At the same time, it wanted to evaluate competing x86 hardware options to ensure the best long-term fit for its workloads. Performance comparisons were essential to confirm that the transition would improve both efficiency and scalability.

Legacy Environment

The original production system relied on an Itanium server equipped with eight cores running at 1.6 GHz each. While the architecture was originally designed to excel at highly parallel workloads, limitations in application compatibility and the pace of technological change had reduced its appeal for enterprise workloads such as SQL Server.

Itanium processors had specific strengths in certain scenarios but lagged behind the flexibility and speed of modern x86 processors for many common database operations. Additionally, with fewer operating systems and software vendors continuing active development for Itanium, the future of such deployments was increasingly uncertain.

Modern x86 Platforms Considered

Two high-performance x86 server models were selected for comparison:

  • Server A featured four Intel Xeon E7-8870 processors, each with ten cores operating at 2.4 GHz, supported by 512 GB of RAM.

  • Server B included four Intel Xeon E7-4850 processors, each with ten cores operating at 2.0 GHz, also with 512 GB of RAM.

Both systems shared an identical storage architecture. Each had two 8 Gb Fibre Channel HBAs limited to 4 Gb connectivity. These were connected to the same EMC DMX4 storage area network, configured with 32 spindles per 500 GB LUN and backed by 128 GB of controller cache. Multipathing was implemented using PowerPath and PowerPath VE for optimal redundancy and throughput.

The use of identical storage ensured that any observed performance differences would be attributable to CPU performance, memory behavior, or architectural nuances rather than disparities in disk systems.

Benchmarking Expectations

Published SPEC CPU2006 results provided an initial indication that the newer Xeon processors would outperform the older Itanium cores. However, experience shows that synthetic benchmarks often fail to reflect true production performance, particularly when complex database workloads and virtualization are involved. As a result, comprehensive testing under realistic conditions was required to obtain meaningful performance data.

VMware Test Environment

VMware vSphere 5.0 U1 was chosen as the virtualization platform for both x86 systems. Each host was configured identically, adhering to established performance optimization practices for both hardware and software layers. Both were connected to the same SAN LUNs to maintain storage consistency.

A single virtual machine was provisioned to match the specifications of the existing production system as closely as possible:

  • Two virtual sockets, each with four virtual cores

  • 128 GB of virtual RAM

  • Three terabytes of VMDK-backed virtual disks

  • SQL Server 2008 R2 installed and optimized using standard performance tuning procedures

By aligning virtual machine specifications with the current production server, the testing process ensured a direct comparison between the legacy Itanium environment and the proposed virtualized x86 systems.

Storage Performance Testing

The first phase of testing focused on storage performance using SQLIO, with a workload size of 50 GB. This workload mirrored the client’s average production database size. Tests were performed on Server A running Windows Server 2008 R2 in two configurations: natively on the hardware and as a guest operating system within VMware.
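
SQLIO reports its aggregate results as plain text. As a hedged sketch of how such runs might be tabulated (the label strings below are typical of SQLIO summary output, and the captured numbers are hypothetical, not figures from this engagement):

```python
import re

# SQLIO prints summary lines such as "IOs/sec:  1234.56" and
# "MBs/sec:    96.44". This helper extracts both figures from captured
# output so native and virtualized runs can be compared side by side.
def parse_sqlio_summary(text):
    iops = float(re.search(r"IOs/sec:\s*([\d.]+)", text).group(1))
    mbps = float(re.search(r"MBs/sec:\s*([\d.]+)", text).group(1))
    return {"iops": iops, "mbps": mbps}

# Hypothetical captured output, for illustration only.
native = parse_sqlio_summary("IOs/sec:  6100.00\nMBs/sec:   381.25")
virtual = parse_sqlio_summary("IOs/sec:  6710.00\nMBs/sec:   419.38")
gain = virtual["mbps"] / native["mbps"] - 1  # fractional advantage of the VM
```

With these invented numbers the computed gain is roughly 10 percent, matching the shape of the comparison described below.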

Surprisingly, the virtualized configuration consistently outperformed the native configuration by an average margin of 10 percent. This result was unexpected, as virtualization overhead typically yields parity with native performance, or a slight decrease, rather than a measurable gain. A likely explanation was superior HBA driver efficiency within the VMware virtual hardware layer compared to the native Windows drivers.

High-Load Storage Testing

Following the initial tests, a more demanding scenario was introduced using a 200 GB workload. This was designed to exceed the SAN controller’s read and write cache capacity, forcing the workload to rely more heavily on the physical disk arrays. As expected, throughput declined once the cache was saturated. However, the observed reduction was only 21 percent, indicating that the underlying storage hardware provided robust sustained performance even under heavy demand.
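
The effect of cache exhaustion is easiest to see as simple arithmetic. In the sketch below, only the 21 percent decline comes from the tests; the 400 MB/s starting figure is purely hypothetical:

```python
# Hypothetical cache-assisted throughput; only the 21 percent decline
# is taken from the measured results described in the text.
cached_mbps = 400.0
observed_drop = 0.21
sustained_mbps = cached_mbps * (1 - observed_drop)  # disk-bound throughput
```

Even after losing the cache, the array retains roughly four-fifths of its peak throughput, which is what made the result reassuring.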

The results also showed that both Server A and Server B performed similarly in storage throughput under identical conditions. While Server B exhibited a minor edge in IOPS and MB/s, the difference was not substantial enough to be a deciding factor in hardware selection for storage-intensive workloads.

Itanium Storage Results

The same storage tests were repeated on the production Itanium system connected to the identical SAN. In this case, performance fell noticeably short of the x86 systems. Despite identical connectivity and test parameters, throughput was significantly lower on Itanium. 

While the root cause was not fully investigated due to time constraints, probable factors included suboptimal HBA drivers, misaligned partitions, or limitations in the platform’s I/O handling. The results underscored the performance disadvantage of remaining on the Itanium architecture for storage-heavy SQL Server operations.

VMware High Availability Testing

Another important aspect of the evaluation was the measurement of recovery time during a host failure. VMware High Availability was tested by running a VM with three terabytes of SQL Server databases on Server B and then abruptly powering off the host. VMware HA responded by restarting the VM on an alternative host.

Across three trials, the SQL Server service was fully operational in approximately four minutes and thirty seconds after the failure. This recovery time was within acceptable limits for the organization’s requirements and demonstrated that VMware HA could provide reliable failover capabilities without extensive manual intervention.
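
To put the measured recovery time into availability terms, a small illustrative calculation helps. The failure rate assumed here is hypothetical and was not part of the test:

```python
# Translate the ~4.5-minute HA restart into annual availability.
# The two-failures-per-year rate is an assumption for illustration only.
recovery_minutes = 4.5
assumed_failures_per_year = 2
downtime_minutes = recovery_minutes * assumed_failures_per_year
minutes_per_year = 365 * 24 * 60  # 525,600
availability = 1 - downtime_minutes / minutes_per_year
```

Under that assumption, unplanned host failures alone would cost about nine minutes per year, keeping availability above 99.998 percent.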

SQL Server Performance Testing

With storage performance and high availability confirmed, the testing progressed to CPU and overall SQL Server workload performance. The DVDStore benchmarking utility from Dell was selected for this phase. This open-source tool generates realistic transactional workloads that stress CPU, memory, and storage simultaneously, making it ideal for simulating production-like conditions.

A 50 GB database was created within the virtual machine, and tests were conducted across a range of MaxDOP (maximum degree of parallelism) settings from 1 to 6. This allowed for evaluation of how SQL Server performed under both single-threaded and multi-threaded execution scenarios. The initial tests were run on Server A, after which the VM was migrated to Server B using vMotion. The same tests were then repeated to measure differences attributable to hardware.

Observations on CPU Speed and Architecture

One of the more interesting findings was that while Server A’s processors ran at a 20 percent higher clock speed than Server B’s (2.4 GHz versus 2.0 GHz), Server A’s observed advantage under the DVDStore workload was only 4.8 percent. When adjusting for clock speed, Server B’s architecture effectively delivered around 12 percent greater performance efficiency than Server A’s, indicating that factors beyond raw GHz contributed to SQL Server performance.
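
The roughly 12 percent efficiency figure can be reproduced with one plausible normalization. The source does not spell out its exact method, so treat this as a reading of the numbers rather than a derivation from it:

```python
# Server A's cores ran 20 percent faster (2.4 GHz vs. 2.0 GHz, per the
# E7-8870 and E7-4850 specifications) but delivered only 4.8 percent
# more DVDStore throughput. One plausible normalization: the fraction
# of the clock-predicted speedup that failed to materialize, i.e.
# Server B's approximate per-clock efficiency edge.
clock_ratio = 2.4 / 2.0      # speedup frequency alone would predict: 1.20
observed_ratio = 1.048       # measured throughput advantage
efficiency_edge = (clock_ratio - observed_ratio) / clock_ratio
```

This works out to just under 13 percent, in line with the "around 12 percent" figure quoted above.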

This highlighted the importance of considering not just CPU frequency but also architectural differences, memory bandwidth, and system-level optimizations when selecting hardware for SQL Server.

Comparing with Itanium CPU Performance

The same tests were conducted on the production Itanium environment for comparison. Under MaxDOP = 1, the virtualized x86 servers demonstrated a roughly 40 percent performance advantage over the Itanium system. This was significant because the client’s production environment was configured with MaxDOP = 1 for compatibility reasons.

Interestingly, as MaxDOP values increased, the Itanium platform’s relative performance improved. This was consistent with the processor’s design focus on parallel execution. However, in the client’s specific workload scenario, where lower MaxDOP values were required, the x86 platform still offered a clear advantage.

Impact of vCPU Allocation

Further testing explored the effects of virtual CPU allocation. The same VM configuration was tested with both 8 vCPUs and 32 vCPUs, ensuring alignment with NUMA nodes. Results showed that over-allocating vCPUs to a workload that did not fully utilize them actually degraded performance. This degradation was due to scheduling overhead within the hypervisor, where idle CPU cycles still had to be managed.

Only when workloads reached a level of concurrency sufficient to utilize all allocated vCPUs did the larger configuration pull ahead in performance. In this case, that occurred beyond MaxDOP = 3, at which point the 32-vCPU VM began to significantly outperform the smaller configuration.
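
A deliberately simplified toy model, which is not VMware’s actual co-scheduling algorithm, illustrates the shape of this behavior: useful work scales with the vCPUs the workload keeps busy, while every allocated vCPU pays a small coordination tax. The 1 percent tax value is invented for illustration:

```python
# Toy model only -- not the real ESXi scheduler. Useful work scales with
# how many vCPUs the workload can keep busy; every allocated vCPU,
# busy or idle, costs a small (invented) coordination tax.
def relative_throughput(vcpus, active_threads, tax_per_vcpu=0.01):
    useful = min(vcpus, active_threads)
    overhead_factor = 1 - tax_per_vcpu * vcpus
    return useful * overhead_factor

# Low concurrency: the 8-vCPU VM comes out ahead.
low_8, low_32 = relative_throughput(8, 2), relative_throughput(32, 2)
# High concurrency: the 32-vCPU VM pulls ahead.
high_8, high_32 = relative_throughput(8, 24), relative_throughput(32, 24)
```

The crossover mirrors the observed behavior: the smaller VM wins until parallelism is high enough to keep the extra vCPUs busy.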

Importance of Right-Sizing Virtual Machines

The vCPU allocation findings emphasized the importance of right-sizing virtual machines. Allocating more resources than a workload can actively use can lead to unnecessary scheduling delays and reduced efficiency. 

Instead, careful performance baselining is essential to determine the optimal resource configuration for each workload. By matching vCPU counts to actual usage patterns, organizations can maximize performance and minimize hypervisor overhead.

Deeper Analysis of CPU and SQL Server Performance

The DVDStore testing phase provided more than just a simple comparison of raw CPU power between the x86 systems and the Itanium platform. It revealed detailed patterns about how SQL Server workloads interact with CPU architecture, hypervisor scheduling, and system memory. These patterns are essential to understand when planning hardware refreshes, migrations, or virtualization projects.

The differences between Server A and Server B offered a case study in how clock speed, CPU design, and architectural efficiencies work together to determine actual database throughput. The findings also underscored that synthetic benchmarks such as SPEC CPU2006, while informative for general comparisons, do not always predict SQL Server’s real-world performance characteristics.

Performance Scaling and MaxDOP Behavior

The MaxDOP setting in SQL Server determines how many processor cores can be used to execute a single query. Lower settings limit queries to fewer cores, while higher settings allow broader parallelization. Choosing the right value for a given workload involves balancing throughput with overhead.

For the client’s workloads, the production environment had been operating with MaxDOP set to 1. This was done to address query plan stability, reduce certain concurrency issues, and control execution behavior in a predictable way. However, this setting also limited the platform’s ability to leverage multi-core CPU capabilities for single-query execution.

When tested under MaxDOP = 1, the virtualized x86 systems showed a dramatic performance lead over the Itanium platform. Server A and Server B both outpaced Itanium by approximately 40 percent in this configuration. This is critical because it represents the actual operational mode the client would continue using post-migration.

When MaxDOP was increased, the relative gap between x86 and Itanium narrowed. The Itanium processors began to perform more competitively as more cores could be engaged simultaneously. This behavior aligned with the platform’s original design, which favored highly parallel workloads. Nevertheless, in the context of the client’s actual requirements, the x86 systems maintained a clear advantage.

Comparing Server A and Server B in Context

Although Server A’s CPUs had a 20 percent higher clock speed than Server B’s, the DVDStore results did not show a 20 percent increase in workload throughput. The actual difference was just under 5 percent. This discrepancy might seem underwhelming at first glance, but when adjusted for clock speed, Server B’s architecture was about 12 percent more efficient. This efficiency gain could come from multiple factors, including improved cache design, better instruction pipelines, or differences in memory subsystem behavior.

The relatively small absolute performance difference between the two x86 servers raised important considerations. From a purely performance-driven standpoint, either system could support the client’s SQL Server workloads effectively. This allowed decision-making to include other factors such as acquisition cost, vendor support agreements, and operational considerations.

Impact of Virtual CPU Allocation

The project also examined how virtual CPU allocation affects performance. Two configurations were compared: one with 8 vCPUs and another with 32 vCPUs. Both were aligned to NUMA boundaries to ensure that memory access patterns were optimal and not artificially penalized by cross-node latency.

Under lighter workloads, the 8-vCPU configuration consistently outperformed the 32-vCPU setup. This counterintuitive result was due to the way the hypervisor schedules CPU resources. When too many vCPUs are assigned to a VM that does not actively use them all, scheduling delays occur. Even idle vCPUs require coordination, which adds overhead.

As workload concurrency increased, the performance gap narrowed. Beyond MaxDOP = 3, the 32-vCPU configuration began to take advantage of its additional resources, eventually surpassing the smaller configuration for highly parallel workloads. This confirmed that larger vCPU allocations are only beneficial when the workload has sufficient parallel execution demands to justify them.

Practical Lessons in Right-Sizing Virtual Machines

These results highlighted the importance of right-sizing virtual machines based on actual workload characteristics. Allocating excessive CPU resources can reduce performance rather than enhance it. The optimal approach is to monitor CPU utilization over time, identify peak concurrency requirements, and assign resources accordingly.

Right-sizing is not just about CPU. Memory, storage IOPS capacity, and network bandwidth should also be provisioned according to measured needs rather than theoretical maximums. Over-provisioning can lead to resource contention within the hypervisor, while under-provisioning can result in performance bottlenecks.

VMware vs Native Physical Performance

One of the most striking outcomes from the project was the observation that VMware, in certain scenarios, outperformed a native Windows Server 2008 R2 installation on identical hardware. In the 50 GB SQLIO workload test, the virtualized environment was consistently about 10 percent faster.

Typically, virtualization overhead is negligible when best practices are followed, but it is rare to see a consistent gain over native performance. The most likely explanation in this case was that VMware’s virtual HBA implementation, combined with optimized driver support for Windows Server in a virtualized context, resulted in lower I/O latency and better queue handling than the physical HBA drivers in the native installation.

This finding underscores that virtualization should not be assumed to always impose a performance penalty. In some cases, the virtualization layer may provide optimizations that outperform certain aspects of bare-metal configurations.

Storage Performance Under Load

When the storage tests were expanded to a 200 GB workload, the intent was to exceed the SAN’s controller cache capacity and force more direct disk activity. As expected, throughput decreased once the cache was saturated. However, the drop was limited to around 21 percent, which is a strong indicator of robust underlying disk performance.

This stability under load was important for the client because their workloads included both predictable transactions and occasional large reporting queries that could generate sustained I/O activity. Knowing that the SAN could handle cache exhaustion without severe performance degradation provided confidence in the storage layer’s resilience.

Comparative Storage Results for All Platforms

In storage testing, Server A and Server B performed nearly identically when connected to the same SAN. Differences in IOPS and throughput were within normal variation and not significant enough to influence purchasing decisions. Both systems provided storage performance that exceeded the Itanium platform by a substantial margin.

The Itanium system’s weaker performance in identical storage tests raised concerns about driver maturity and platform-specific storage handling. Without the ability to perform a deep root cause analysis, exact reasons remained speculative. However, it was clear that the Itanium configuration was less efficient at handling I/O under identical conditions.

Observations on VMware High Availability Recovery

High availability testing demonstrated that VMware HA could restart a SQL Server VM containing 3 TB of databases on an alternate host in approximately four minutes and thirty seconds. This recovery time was consistent across multiple trials and met the client’s service availability objectives.

The process was completely automated, requiring no manual intervention beyond initial configuration. This capability significantly reduced the potential downtime risk associated with host hardware failures. For mission-critical database services, such automated recovery features provide substantial operational assurance.

Balancing CPU and Storage Considerations

While storage performance is often the first metric examined during database performance testing, CPU and memory throughput are equally important. In this case, the choice between Server A and Server B could not be made solely on raw CPU benchmarks or clock speeds. The efficiency of query execution under real workloads, the interaction between CPU cores and memory, and the impact of hypervisor scheduling all played roles in determining final performance.

The data suggested that although Server B had an architectural efficiency advantage, Server A’s performance was sufficiently close that other considerations such as total cost of ownership and vendor relationship could take precedence.

Importance of Workload Profiling

The results of the project reinforced the value of detailed workload profiling before making infrastructure decisions. The client’s production workload profile was characterized by moderate concurrency, controlled MaxDOP settings, and predictable database sizes. By recreating these conditions in a controlled test environment, the performance results were directly applicable to their operational reality.

Workload profiling also enables better resource allocation. For example, knowing that the workload rarely exceeded MaxDOP = 3 meant that allocating excessive vCPUs would offer no performance benefit and might even hinder responsiveness. Similarly, understanding the size and access patterns of the database allowed for more accurate storage performance predictions.

Considering Future Scalability

Although the testing focused on the client’s current workload requirements, future scalability was also a consideration. Both x86 servers provided sufficient headroom for anticipated workload growth. The ability to adjust vCPU allocation, increase memory, and expand storage without replacing hardware was a significant advantage over the fixed capabilities of the Itanium system.

Additionally, VMware’s flexibility allowed for the possibility of consolidating multiple workloads onto the same hardware without sacrificing performance. This provided potential cost savings through improved hardware utilization.

Reliability and Operational Benefits of Virtualization

Beyond performance, virtualization brought operational advantages. Features such as vMotion enabled live migration of virtual machines between hosts with no downtime, facilitating hardware maintenance and load balancing. Snapshots provided an additional safety mechanism during patching or configuration changes. Resource pools allowed for controlled distribution of CPU and memory among workloads.

These capabilities contributed to improved service availability and reduced administrative overhead. For a mission-critical application like SQL Server, these benefits were as important as raw performance metrics.

Deep Dive into Storage Performance Behavior

One of the most unexpected findings in the migration process was the behavior of storage performance between the virtualized and native environments. When using SQLIO with a 50 GB workload, the virtualized environment outperformed the native installation by approximately ten percent. This was contrary to the typical expectation that a native installation would either match or slightly outperform a virtualized one. The difference likely stemmed from optimizations in the VMware storage stack and the improved performance of HBA drivers within the virtual environment.

Scaling the workload to 200 GB pushed beyond the limits of the SAN cache, forcing the system to operate more directly against the spinning disks. Even in this state, the performance only dropped by around twenty-one percent. This demonstrated that the SAN’s underlying disk configuration and controller cache were both highly effective at sustaining throughput under heavy load. For enterprise workloads, this level of disk performance is critical in ensuring consistent responsiveness during peak periods.

On the Itanium platform, results were significantly lower. Potential reasons included legacy driver inefficiencies, possible misaligned partitions, and hardware architecture differences that impacted I/O handling. Although the storage subsystem was shared across all test platforms, the way each platform processed I/O requests varied enough to cause noticeable discrepancies in measured results.

Comparative Analysis of CPU Architecture

The CPU architecture differences between the tested systems played an important role in shaping performance outcomes. Server A and Server B both used Intel Xeon processors, but with different clock speeds and microarchitectural optimizations. Server A, with a 20 percent higher clock speed, was expected to deliver proportionally higher throughput. However, the observed improvement in workload throughput was only 4.8 percent.

When the results were normalized for clock speed, Server B’s architecture still showed about a twelve percent efficiency improvement over Server A. This highlighted the importance of examining not only raw clock speed but also factors such as memory bandwidth, cache hierarchy, and instruction set enhancements.

The Itanium architecture, while designed with strong parallel processing capabilities, struggled under single-threaded workloads when MaxDOP was set to one. However, when MaxDOP was increased, Itanium performance improved noticeably. This was consistent with its design philosophy, which favors workloads that can fully exploit multiple execution threads and parallelism.

Virtualization and High Availability Performance

High availability testing provided valuable insights into operational resilience. In a simulated host failure, a virtual machine with 3 TB of SQL Server databases was automatically restarted on another host by VMware HA. The time to recover and bring SQL Server back online averaged around four minutes and thirty seconds over multiple trials.

This result demonstrated that even in large-scale configurations, VMware HA can provide a rapid recovery mechanism without the need for complex manual interventions. For organizations prioritizing uptime, this type of automated recovery capability is a compelling reason to adopt virtualization for mission-critical database systems.

Right-Sizing Virtual Machines for Workload Demands

During testing, it became clear that simply allocating the maximum possible number of virtual CPUs was not an optimal strategy. An eight-vCPU configuration performed better than a thirty-two-vCPU configuration when workloads were relatively small or when parallel processing demands were limited.

Performance scaling was observed only when the workload’s parallelism increased beyond MaxDOP = 3, at which point the larger vCPU allocation began to show its benefits. However, over-allocation could still introduce unnecessary scheduling overhead within the hypervisor, leading to inefficiencies.

This reinforces the value of regular workload baselining. By continuously monitoring CPU utilization patterns, organizations can right-size their virtual machines to match actual demand, thereby optimizing performance while minimizing resource waste.

Impact of Memory and Disk Configurations on SQL Server Performance

With 128 GB of assigned vRAM, the SQL Server instances were able to maintain a significant portion of the database in memory, reducing reliance on disk I/O for common operations. This configuration proved to be an effective way to absorb spikes in demand without introducing latency.

In addition, the use of VMDK-backed virtual disks provided flexible storage provisioning while still leveraging the SAN’s performance capabilities. The ability to resize and reconfigure storage without physical intervention contributed to the agility of the virtualized platform.

For the Itanium system, memory capacity was not the limiting factor, but differences in memory bandwidth and latency compared to the x86-based servers likely contributed to its relatively lower single-threaded performance.

Real-World Workload Testing with the DVDStore Benchmark

To simulate a transactional workload, the DVDStore benchmark was employed. With a 50 GB dataset, the benchmark stressed CPU, memory, and disk subsystems simultaneously. At low MaxDOP settings, the x86 virtual machines showed a significant advantage over the Itanium platform, delivering roughly forty percent higher throughput.

However, as MaxDOP increased, the gap narrowed due to the Itanium architecture’s strengths in parallel execution. This demonstrated that application-specific characteristics, such as query parallelism, can greatly influence the comparative performance of different hardware platforms.

Interestingly, while Server A’s higher clock speed helped in certain scenarios, the performance difference between Server A and Server B remained small in most real-world workloads. This suggests that for some database applications, architectural differences and system tuning have a greater impact than raw processor speed.

Observations on Virtualization Overhead

The assumption that virtualization inherently imposes a measurable performance penalty was challenged by the results of this project. In several cases, the VMware-hosted SQL Server instances performed on par with, or slightly better than, their native counterparts. This can be attributed to VMware’s optimization of hardware drivers, particularly for storage and network interfaces, and the ability to manage I/O operations more efficiently than some native OS configurations.

It is important to note that these results were obtained under carefully tuned conditions with best-practice configurations applied to both VMware and SQL Server. In less optimized environments, the balance could shift, highlighting the necessity of rigorous configuration management.

Long-Term Considerations for Virtualizing Mission-Critical Databases

The migration project also provided insight into long-term operational considerations for running large SQL Server deployments in a virtualized environment. Resource flexibility emerged as one of the key advantages. The ability to adjust CPU, memory, and storage allocations without downtime offered significant operational agility.

Additionally, the centralized management capabilities of VMware allowed for easier patching, maintenance, and monitoring. Performance baselining could be automated, making it easier to detect and address changes in workload patterns over time.

Lessons Learned from the Migration Process

A number of important lessons emerged from this migration and testing effort:

  • Performance differences between hardware platforms are not always predictable based solely on specifications.

  • Virtualization can sometimes enhance, rather than reduce, certain aspects of performance.

  • Optimal performance requires matching vCPU allocation to the actual workload’s needs, not just allocating the maximum available.

  • SAN performance characteristics, including cache size and disk configuration, have a major influence on database responsiveness.

  • Recovery times with modern virtualization high availability tools can meet the demands of mission-critical systems.

Broader Implications for Database Infrastructure Planning

These findings have broader implications for organizations considering a move away from legacy platforms such as Itanium. The combination of modern x86 hardware and enterprise virtualization platforms provides a viable path forward that can deliver both performance improvements and operational efficiencies.

In environments where legacy hardware support is limited, virtualization offers a way to extend the lifespan of critical applications while providing a foundation for future scalability. The key is to approach the migration with a detailed understanding of workload characteristics and a willingness to adapt configurations based on empirical performance data.

Conclusion

The migration and performance testing process revealed that modern x86 architectures combined with enterprise virtualization platforms offer a compelling alternative to legacy hardware for mission-critical SQL Server workloads. Through a structured approach involving baseline measurements, stress testing, and comparative benchmarking, it became evident that virtualization can not only match but sometimes exceed the performance of native deployments—particularly when leveraging optimized storage and network configurations.

One of the most significant takeaways was that theoretical hardware advantages, such as higher clock speeds or specialized architectures, do not always translate directly into superior real-world performance. Instead, the interplay between CPU efficiency, memory access speeds, storage subsystem capabilities, and database configuration ultimately determines workload responsiveness. The project demonstrated that right-sizing resources, fine-tuning parallelism settings, and aligning configurations with workload profiles are critical to achieving optimal performance.

High availability capabilities within virtualization platforms, such as rapid failover in the event of a host failure, further strengthen the case for virtualized SQL Server environments. The ability to reallocate CPU, memory, and storage resources dynamically provides the agility to adapt to changing workload demands without disruptive downtime.

For organizations evaluating infrastructure modernization paths, these findings underscore the importance of thorough testing before committing to a migration strategy. By validating performance expectations in a controlled environment, IT teams can reduce risk, plan for scalability, and ensure that mission-critical systems maintain high availability and reliability after the transition.

Ultimately, this project confirms that with careful planning, robust configuration, and an understanding of workload behavior, virtualization offers a powerful, flexible, and resilient foundation for running large-scale SQL Server deployments in the enterprise.