Commissioning a data center is as crucial a step as having a sound design. It is frequently mistaken for a single step in the process of establishing a new data center. This mistake can have serious implications across the data center life cycle, affecting warranty coverage, the operating model, the ability to achieve expected capacity, and the avoidance of operational errors.
Commissioning Goals
Testing at various points in the project is intended to achieve several important milestones:
- Establish the starting point of warranty periods
- Ensure that equipment is in working order upon arrival
- Provide evidence of a functioning system
- Give scientific proof of system capacity
- Validate design intent (and define design and implementation limitations)
Stages of Commissioning
There are several stages of commissioning. In some instances, it may be appropriate to bypass some of them, depending on how much customization is involved and how large the data center build will be.
The aim of the multiple layers of commissioning is to ensure a smooth and problem-free final commissioning exercise. With multiple stages of smaller, incremental testing, there is less chance that problems will first be discovered in the final testing stage.
In complex environments, these multiple test stages are invaluable in troubleshooting operational faults or malfunctions, which may be due to minor settings that require adjustment before optimal operation is achieved.
Factory Acceptance Testing
Commissioning begins with the manufacturing process. Manufacturers use testing at the factory (Factory Acceptance Testing) to establish that their components work to specification.
In many cases, obtaining the results of the Factory Acceptance Test (FAT) will suffice as evidence that the system performed to expectations while at the factory.
Factory Witness Testing
Another type of commissioning is Factory Witness Testing (FWT), which is used when a piece of equipment departs from its standard specification. For instance, a Computer Room Air Conditioner (CRAC) may have a standard rated cooling capacity of 90 kilowatts, while the project specification calls for a unit that provides 105 kilowatts. The manufacturer agrees to this capacity, and new parameters are established for testing the unit's capability in the factory.
As an oversight step, a member of the mechanical engineering team goes to the factory to witness the testing and provides a detailed, project-specific set of tests to run. This can be a critical step, as the factory "test bench" may be the only reasonable place to test the full limits of the equipment.
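As a rough illustration of the CRAC example above, the witnessed results could be checked against the uprated specification along the following lines. This is a minimal sketch, not a manufacturer's procedure: the function names, the test point labels, and the measured values are hypothetical. Only the 90 kW rating and the 105 kW specification come from the example.

```python
RATED_KW = 90.0       # standard rating from the example above
SPECIFIED_KW = 105.0  # uprated capacity called for by the project specification


def uplift_percent(rated_kw: float, specified_kw: float) -> float:
    """Return how far the specification exceeds the standard rating, in percent."""
    return (specified_kw - rated_kw) / rated_kw * 100.0


def meets_spec(measured_kw: dict, specified_kw: float) -> bool:
    """True only if at least one test point exists and all reach the specified capacity."""
    return bool(measured_kw) and all(kw >= specified_kw for kw in measured_kw.values())


# Hypothetical readings recorded by the witness at the factory test bench.
witnessed = {"test point A": 106.2, "test point B": 105.4}

print(f"Uplift over standard rating: {uplift_percent(RATED_KW, SPECIFIED_KW):.1f}%")
print("Unit meets specification:", meets_spec(witnessed, SPECIFIED_KW))
```

A single reading below 105 kW would make `meets_spec` return `False`, which is the kind of result that is far easier to resolve at the factory than after delivery.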
This is common practice for AC units (CRAC/CRAH), generators, UPSs, and electrical switchgear (ATS, STS) in many implementations. However, if your data center is a small implementation using well-established equipment from a credible manufacturer, the FWT step may not be required.
Startup Acceptance Testing
This stage is frequently mistaken for commissioning as a whole, as if the data center were commissioned once Startup Acceptance Testing (SAT) is done. Conversely, some may ask, "Why do we need this testing? We proved it worked at the factory, so that should be all the testing needed." Both views are common mistakes.
One of the goals set out at the start was to establish the warranty period. Many manufacturers define the start of the warranty period in two ways: 1) from the time the equipment is delivered to the site, and/or 2) upon installation of the equipment at the site, provided this occurs within 6 to 12 months (depending on the maker) of delivery. It is important to understand that, under all scenarios, the warranty will start once the equipment is "powered on".
Startup Acceptance Testing is a crucial stage in the life of many (often very expensive) components. Components should be powered on and their base configuration (operating set points) entered. Transformers should be load tested, generators should be started and load tested, UPSs should be load tested, and cooling units and their heat exchangers (chillers, cooling towers, condenser water) should be confirmed to be in good working order upon arrival.
Skipping this step, and later finding that equipment which has sat for six months or more without operation has failed, may void the warranty. It may also cause a delay, and will certainly increase cost to the project.
Functional System Testing
Often, however, Startup Acceptance Testing is not enough. This is where Functional System Testing (FST) helps to fill the gap. Powering up individual items of equipment, verifying that they "turn on", and perhaps even running them under a specific load still does not prove that the equipment within a system can function properly together.
Consider a bank of 3 generators. During Startup Acceptance Testing, the fuel supplies were connected to each generator individually, and each generator was started. Each generator was then placed on a load bank and run for 3 hours to prove that it could operate at maximum expected capacity under load.
However, the load bank was too small to test all the generators together, which is common. We have therefore not yet proved that the 3 generators (expected to be redundant, with the ability to carry the load even if one fails) will actually perform as such. In this case, we need to perform FST on systems within the data center.
Performing FST on the generator set will prove that the generators can run synchronized under load (even if the test load is only the capacity of 1 generator). Each generator is "failed" in succession to prove that the other two can continue to carry the load uninterrupted. The same type of functional testing can be used on chillers, UPSs, AC units, or any individual system. It is particularly valuable when the same setup is repeated across a data center (multiple halls, each with independent equipment serving that hall).
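The "fail each generator in succession" check can also be expressed as a short capacity calculation before the physical test is run. The sketch below is illustrative only: the generator names, the 1,000 kW ratings, and the 1,800 kW load are hypothetical figures, and a real FST measures behaviour on live equipment rather than computing it.

```python
def survives_single_failure(capacities_kw: dict, load_kw: float) -> dict:
    """For each generator, report whether the others can carry the load if it fails."""
    total = sum(capacities_kw.values())
    return {name: (total - cap) >= load_kw for name, cap in capacities_kw.items()}


# Hypothetical bank of 3 equal generators and a hypothetical critical load.
generators = {"GEN-1": 1000.0, "GEN-2": 1000.0, "GEN-3": 1000.0}
critical_load_kw = 1800.0

results = survives_single_failure(generators, critical_load_kw)
for name, ok in results.items():
    print(f"Fail {name}: remaining sets {'carry' if ok else 'cannot carry'} the load")
```

If the load in this sketch were raised above 2,000 kW, every entry would report a failure, meaning the bank would no longer be redundant at that load.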
For smaller implementations (data centers with only a single hall and one set of critical systems), this step is sometimes omitted, as any faults will still be picked up in the next stage of commissioning. For larger implementations, it helps to troubleshoot and isolate faults that are specific to one area of the data center and do not involve the larger set of shared systems.
Integrated Systems Testing
This is the final stage of commissioning, and arguably the most important step. While not necessarily advisable in most circumstances, it is theoretically possible to conduct only an Integrated Systems Test (IST) for the data center.
For data centers of 30 racks or fewer, this is not an unreasonable expectation, mainly because smaller data centers are simpler and usually implement less redundancy. But for most, the IST is the critical stage in commissioning. For very large data centers, having performed the prior stages of commissioning will make for a smoother IST and reduce the likelihood of a significant technical fault arising, minimizing the troubleshooting involved.
It is the IST stage that establishes the data center's design and operating expectations (or limitations). All redundancy is tested. Load banks approximating the maximum load for each data hall are deployed, and three modes of operation are established: Normal Mode, Maintenance Mode, and Emergency Mode.
It is this stage of commissioning that proves that fluctuations in the operating conditions of one system are balanced by another. For example, if the heat load increases in the data hall, the AC units and their respective chiller plant or heat exchangers must adjust to this increase and maintain temperature and humidity levels as defined in the operating set points.
If commercial (utility) power fails, the UPS must carry the load until the generators begin providing power to the site, and the site must then transition back to utility power when it is restored.
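The utility failure sequence can be sketched as a simple timeline, which is one way to state what the IST is expected to observe. Everything numeric here is an assumption made for illustration (generators online 45 seconds after the outage, utility restored after 30 minutes, 5 minutes of UPS autonomy), and the function names are hypothetical.

```python
def source_at(t: float, utility_lost: float, generators_online: float,
              utility_restored: float) -> str:
    """Return which source carries the load at time t (seconds) in the sequence."""
    if t < utility_lost or t >= utility_restored:
        return "utility"
    if t < generators_online:
        return "UPS"
    return "generators"


def ups_gap_covered(utility_lost: float, generators_online: float,
                    ups_autonomy_s: float) -> bool:
    """True if the UPS autonomy is at least as long as the gap before generator power."""
    return (generators_online - utility_lost) <= ups_autonomy_s


# Hypothetical timeline, in seconds from the moment utility power fails.
UTILITY_LOST, GENERATORS_ONLINE, UTILITY_RESTORED = 0.0, 45.0, 1800.0
UPS_AUTONOMY_S = 300.0

for t in (-10.0, 10.0, 600.0, 2000.0):
    print(f"t={t:>7.1f}s load carried by: "
          f"{source_at(t, UTILITY_LOST, GENERATORS_ONLINE, UTILITY_RESTORED)}")
print("UPS bridges the gap:", ups_gap_covered(UTILITY_LOST, GENERATORS_ONLINE, UPS_AUTONOMY_S))
```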
Maximum operating conditions will be tested (or as near as can be approximated). For example, if the data hall is expected to have 100 racks with an average load of 6 kilowatts per rack (600 kilowatts total load), and the room temperature is to be maintained at 24 °C, can the AC units maintain that temperature when that load is applied to the room? If not, how much load can they cool at maximum operation with the expected redundancy?
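The arithmetic behind this example is simple enough to sketch. The 100 racks and 6 kW per rack come from the text. The set of seven 100 kW AC units and the N+1 redundancy are assumptions chosen purely for illustration, as are the function names.

```python
def design_load_kw(racks: int, avg_kw_per_rack: float) -> float:
    """Total IT heat load the hall is expected to present."""
    return racks * avg_kw_per_rack


def redundant_cooling_kw(unit_capacities_kw: list, spare_units: int = 1) -> float:
    """Cooling capacity left after holding back the largest `spare_units` units."""
    keep = max(0, len(unit_capacities_kw) - spare_units)
    return sum(sorted(unit_capacities_kw)[:keep])


load_kw = design_load_kw(racks=100, avg_kw_per_rack=6.0)  # figures from the text
ac_units_kw = [100.0] * 7                                 # hypothetical: seven 100 kW units
usable_kw = redundant_cooling_kw(ac_units_kw)             # assumes N+1 redundancy

print(f"Design load: {load_kw:.0f} kW")
print(f"Cooling available with one unit held in reserve: {usable_kw:.0f} kW")
print("Cooling covers design load:", usable_kw >= load_kw)
```

The calculation only states what the design expects. The IST is what shows whether the installed system actually delivers it.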
If there are shortcomings or failures in operation, these are noted. They will either require remediation (in the case of, say, a generator failing to start), or operational expectations may need adjusting (in the case of the cooling capacity achieving a maximum of only 540 kilowatts of the expected 600 because of other losses in the system that cannot be overcome).
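To show what "adjusting operational expectations" can mean in practice, the 540 kW result can be translated back into rack terms. This is a sketch using the figures from the example (100 racks, 6 kW average, 600 kW expected, 540 kW achieved). The two adjustment options shown are illustrative assumptions, not the only possible responses.

```python
expected_kw = 600.0      # design expectation from the example
achieved_kw = 540.0      # maximum cooling proven during the IST
racks = 100
avg_kw_per_rack = 6.0

shortfall_kw = expected_kw - achieved_kw        # capacity that could not be proven
achieved_fraction = achieved_kw / expected_kw   # share of the design load supported

# Option 1 (assumed): keep all racks and lower the average load per rack.
adjusted_avg_kw_per_rack = achieved_kw / racks

# Option 2 (assumed): keep the design density and populate fewer racks.
racks_at_design_density = int(achieved_kw // avg_kw_per_rack)

print(f"Shortfall: {shortfall_kw:.0f} kW ({achieved_fraction:.0%} of design achieved)")
print(f"Option 1: {racks} racks at {adjusted_avg_kw_per_rack:.1f} kW average")
print(f"Option 2: {racks_at_design_density} racks at {avg_kw_per_rack:.1f} kW average")
```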
Commissioning with GDCE
At Global Data Center Engineering, we have commissioned hundreds of data centers around the world, in some of the most complex environments imaginable. We have commissioned facilities at all levels of data center redundancy and resiliency, in hot climates and cold, humid and dry, from deserts to mountains. We understand the challenges of optimal operation in each unique environment. Commissioning a data center at 3,000 meters is not the same as commissioning one at 3 meters above sea level. Let our experienced team of data center commissioning professionals work to ensure the successful implementation and launch of your next data center.