Maintaining Mission Critical Systems in a 24/7 Environment. Peter M. Curtis

blackout of 2003 emphasized the interdependencies across the critical infrastructure and the cascading impacts that occur when one component falters. Most ATMs in the affected areas stopped working, although several had backup systems that enabled them to function for a short period. Soon after the power went out, the Comptroller of the Currency signed an order authorizing national banks to close at their discretion. Governors in a number of affected states made similar proclamations for state‐chartered depository institutions. The end result was lost revenue and profits and a threat to confidence in our financial system. More prudent planning and the proper level of investment in mission critical infrastructure for electric, water, and telecommunications utilities, coupled with proactive preparation and operation of building infrastructure, could have saved the banking and financial services industry millions.

      At present, the risk of cascading power supply interruptions from the public electrical grid in the United States has increased because of the ever‐increasing reliance on computers and related technologies. This has occurred while investment in the reliability and security of the grid has not kept pace with the levels recommended by industry experts. Today there are trillions of devices and billions of people connected to the World Wide Web. As the number of computers and related technologies continues to multiply in this increasingly digital world, the demand for reliability increases as well. Businesses not only compete in the marketplace to deliver the goods and services they produce; they must now also compete to hire, from a dwindling pool of talent, the engineers who can design the infrastructure needed to obtain and deliver reliable power and cooling. That infrastructure keeps mission critical manufacturing and technology centers up and running and able to produce the very goods and services that sustain them. The idea that businesses must compete for the best talent to obtain reliable power is not new, nor are the consequences of failing to meet this challenge. Without reliable power there are no goods and services for sale, no revenues, and no profits, only losses. Hiring and keeping the best‐trained engineers, who employ the best analyses, make the best strategic choices, and follow the best operational plans to keep ahead of the power supply curve, is essential for any technologically sophisticated business to thrive and prosper. A key to success is to provide proper training and educational resources so that engineers can increase their knowledge and keep current on the latest mission critical technologies available around the world, which is one of the purposes of this content. In addition, companies need to pool their efforts toward improving educational opportunities and certification programs for young mission critical engineers to help address the shrinking workforce needed to sustain the growing mission critical industry.

      In the world of high‐powered business, owners of real estate have come to learn that they, too, must meet their tenants' demands for a reliable power supply. As more and more buildings are required to deliver service guarantees, management must decide what performance is required from each facility in the building. An availability level of 99.999% (about 5.26 minutes of downtime per year) allows virtually no facility downtime for maintenance or for any other planned or unplanned event. Moving toward high reliability is imperative. Moreover, the work of avoiding the landmines that can cause outages and unscheduled downtime never ends. Event planning and impact assessments are tasks that are never truly completed; they should be viewed afresh at least once every budget cycle.
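
The downtime budget implied by an availability percentage follows from simple arithmetic. The following is a minimal sketch, assuming a 365-day year; the function name and the constant are illustrative and do not come from the book.

```python
# Illustrative sketch: convert an availability percentage into a yearly
# downtime budget. Assumes a 365-day (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the allowed downtime, in minutes per year, for a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare "three nines", "four nines", and "five nines".
    for level in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(level)
        print(f"{level}% availability -> {minutes:.2f} minutes of downtime per year")
```

At 99.999% the budget is a little over five minutes for the whole year, which is why such a target leaves essentially no room for maintenance windows.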

      The evolution of data center design and function has been driven, in part, by the need for uninterrupted power. Data centers now employ many unique designs developed specifically to achieve uninterrupted power within defined project constraints: technological need, budget limitations, and the specific tasks each center must perform to function usefully and efficiently. Providing continuous operation under all foreseeable risks of failure, such as power outages, equipment breakdown, and internal fires, requires modern design and modeling techniques that enhance reliability. These include redundant systems and components, standby power generation, fuel systems, automatic transfer and static switches, pure power quality, UPS systems, cooling systems, raised access floors, and fire protection, as well as the use of Probabilistic Risk Analysis modeling software to predict potential future outages and to develop maintenance and upgrade action plans for all major systems. Each of these will be discussed in detail later.
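
To see why redundancy figures so prominently among these techniques, consider the textbook availability formulas for components in parallel and in series. This is a rough sketch only: it assumes failures are statistically independent, the availability values are hypothetical, and it is not the Probabilistic Risk Analysis software mentioned above.

```python
# Illustrative sketch of basic availability arithmetic for redundant designs.
# Assumes independent failures; real facilities share fuel, controls, and
# operators, so actual results are usually less favorable.
from typing import Iterable


def parallel_availability(unit_availability: float, units: int) -> float:
    """Availability of `units` identical components when any one can carry the load."""
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    # The group is down only when every unit is down at the same time.
    return 1.0 - (1.0 - unit_availability) ** units


def series_availability(availabilities: Iterable[float]) -> float:
    """Availability of a chain in which every element must be working."""
    result = 1.0
    for value in availabilities:
        result *= value
    return result


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one power path
    redundant = parallel_availability(single, 2)
    print(f"one path: {single:.4f}, two redundant paths: {redundant:.4f}")
    # A redundant pair feeding a single, non-redundant downstream element.
    print(f"with a 0.999 series element: {series_availability([redundant, 0.999]):.4f}")
```

Under these assumptions, duplicating a component sharply reduces its contribution to downtime, while any single non-redundant element in series caps the availability of the whole chain.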

      Critical industries require an extraordinary degree of planning and assessment. It is important to identify the best strategies for reaching the targeted level of reliability. To design a critical building with the appropriate level of reliability, the cost of downtime and the associated risks must be assessed. Downtime results from more than one type of failure: design failures, catastrophic failures, equipment failures, and failures due to human error. Each type requires a different approach to prevention. A solid and realistic approach to business resiliency must be a priority, especially because the present critical infrastructure is inevitably designed with all its eggs in one basket.

      Within banking and financial services, planning for the critical area places considerable pressure on designers to provide an infrastructure that can evolve to support continuous business growth. Routine maintenance and equipment upgrades alone do not ensure continuous availability. The 24/7 operation of such services means an absence of scheduled interruptions for any reason, including routine maintenance, modifications, and upgrades. The main question is how and why infrastructure failures occur. Employing new methods of distributing critical power, understanding capital constraints, and developing processes that minimize human error are key factors in improving recovery time when critical systems are affected by base‐building failures.
