The Sun SPARC Enterprise T5120/T5220 Servers implement several new features and
enhancements. Overall, the servers feature improvements in processor throughput and IO
technology.
The next few slides describe the system specifications for the Sun SPARC Enterprise
T5120/T5220 Servers, starting with the processor.
The Sun SPARC Enterprise T5120/T5220 Servers are equipped with a single on-board
UltraSPARC T2 processor that operates at 1.2 or 1.4 gigahertz (GHz). The processor
contains a 4 megabyte Level 2, or L2, cache, and four, six, or eight cores, each of
which has one floating point unit, or FPU, and supports eight threads.
Each core has a 16 kilobyte instruction cache and an 8 kilobyte data cache.
The CPU also supports up to 16 FBDIMM slots, each of which accepts a 1, 2, or 4
gigabyte DIMM, with a maximum of 64 gigabytes supported in the Sun SPARC
Enterprise T5120/T5220 Servers. The DIMMs are controlled by four memory channels,
each of which controls four DIMM slots.
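The memory arithmetic above can be sanity-checked with a short, illustrative Python sketch; the constants simply restate the figures quoted in this module:

```python
# Memory layout as described: four channels, four FBDIMM slots per channel,
# with DIMMs of up to 4 GB each at release.
CHANNELS = 4
SLOTS_PER_CHANNEL = 4
MAX_DIMM_GB = 4

total_slots = CHANNELS * SLOTS_PER_CHANNEL
max_memory_gb = total_slots * MAX_DIMM_GB

print(total_slots)    # 16 FBDIMM slots
print(max_memory_gb)  # 64 GB maximum
```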
The system I/O is built on the PCIe bus, which provides support for:
Three standard half length/half height PCIe expansion slots on three riser boards in the
Sun SPARC Enterprise T5120 server or
Six standard half length/half height PCIe expansion slots on three riser boards in the Sun
SPARC Enterprise T5220 server
In addition to PCI Express expansion cards, two of the expansion slots can alternatively
accept Sun proprietary XAUI-based cards to support 10 Gbps Ethernet networking.
Note: XAUI cards are only supported in slots 0 and 1.
Internal storage in the Sun SPARC Enterprise T5120 server is handled by four small form
factor (SFF), 2.5”, internal SAS hard disk drives. Both the Sun SPARC Enterprise T5120
and T5220 Servers currently support 73 or 146 gigabyte disks at 10,000 revolutions per
minute, or RPM. The disks are attached to a SAS and SATA disk backplane and
managed through an LSI 1068E SAS/SATA disk controller.
The Sun SPARC Enterprise T5220 server supports eight small form factor (SFF) SAS
disk drives.
System components that affect the Sun SPARC Enterprise T5120 server’s environment
include two 650 watt, hot-swappable power supply units (PSUs) and four hot-swappable
system fan trays, with two fans per tray, providing N+1 redundancy.
System components that affect the Sun SPARC Enterprise T5220 server’s environment
include two 750 watt, hot-swappable power supply units (PSUs) and three hot-swappable
system fan trays, with two fans per tray, providing N+1 redundancy.
For monitoring environmental conditions, both servers have an Integrated Lights Out
Management, or ILOM, controller with an RJ-45 serial management port and a
10/100BASE-T Ethernet management port.
System status LEDs provide system and component status information to administrators.
Essentially, any application that is vertically and/or horizontally scalable today can
benefit from CMT.
These include real-world applications such as:
Databases
Application servers
Transaction processing
And web-based services
Primary market segments that require these services include:
Financial services
Telecommunications
Government
Retail organizations
And professional services
With the introduction of the new processor and architecture, the Sun SPARC Enterprise
T5120/T5220 Servers are the right servers for customers whose applications are highly
threaded, using parallel threads, or require large instruction or data working sets. The
CMT chips can handle certain types of software and tasks better than other high-end 64-
bit chips. Any application that creates network traffic, such as Web services software,
Java, Web servers, and application servers, can benefit from spreading lots of software
threads across low-powered processor cores.
Several servers compete in the same market as the Sun SPARC Enterprise T5120/T5220
Servers. Competition arises primarily from IBM, HP, Fujitsu, and Dell.
The trend toward multi-core processors is nothing new: IBM currently ships servers with
the dual-core Power4+ and Power5 chips; HP uses dual-core PA-RISC processors;
Fujitsu implements the SPARC64-V chip; Dell ships servers with the Intel Xeon MP and
EM64T processors; and Advanced Micro Devices supplies its Opteron processor to the
industry.
Sun is well positioned to handle the boost in performance with the Solaris™ Operating
System. Unlike most desktop operating systems, the Solaris OS can leverage CMT to run
64 simultaneous threads. For threaded applications, future UltraSPARC™ processors will
deliver up to 50 times the performance of today's fastest UltraSPARC processor, without
a significantly higher cost-per-chip.
And just as important as speed, CMT enables administrators to consolidate the
infrastructure of dozens of servers onto a single server. Imagine the savings in
administration, maintenance, power, cooling, and floor space. The management and
maintenance savings are as phenomenal as the technology itself.
Both the Sun SPARC Enterprise T5120 and T5220 servers provide the same RAS
feature set:
Parity checking
N + 1 redundant hot-pluggable power supplies
N + 1 redundant hot-swappable fans
Hot-pluggable disk drives
Environmental monitoring
Remote access using ILOM
Several key technologies are being used in the Sun SPARC Enterprise T5120/T5220
Servers. These technologies help to increase bandwidth, connectivity, system throughput,
and computing throughput through enhancements to the architecture, processors, and
system busses.
The Sun SPARC Enterprise T5120/T5220 Servers offer the following major features:
Dense packaging in a reduced footprint
Implementation of a four, six, or eight core Sun UltraSPARC T2 processor
On-chip multithreading technology to maximize the use of the system processor
High performance busses and memory interconnect systems
Improved input and output (I/O) throughput through the incorporation of PCIe
Architectural enhancements to system platforms to improve system upgrades, memory
management, and the I/O infrastructure
Increased on-board network connectivity
The Sun SPARC Enterprise T5120/T5220 Servers are the first of Sun’s servers to
implement the Sun UltraSPARC T2 processor. This processor uses four, six, or eight 64-
bit SPARC cores, each of which supports eight threads running concurrently. The servers
support one processor, and therefore support a maximum of 64 threads. The Sun SPARC
Enterprise T5120/T5220 Servers use the on-chip multithreaded processor to minimize the
CPU idle time, thus ensuring that multiple threads have access to the processor and do
not have to wait for the processor to be completely freed to gain CPU access time.
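The thread count quoted above follows directly from the core and thread figures; a minimal sketch:

```python
# One UltraSPARC T2 processor, up to eight cores, eight threads per core.
PROCESSORS = 1
CORES_PER_PROCESSOR = 8
THREADS_PER_CORE = 8

max_threads = PROCESSORS * CORES_PER_PROCESSOR * THREADS_PER_CORE
print(max_threads)  # 64 hardware threads
```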
The processor is designed for highly threaded transactional processing. This means that
the time usually spent waiting for memory access has been minimized, which maximizes
core utilization.
The Sun SPARC Enterprise T5120/T5220 Servers implement PCI Express, or PCIe, as
the primary I/O bus on the server.
PCIe offers advantages in speed and data transmission rate; it operates at 2.5 GHz and
transmits data at approximately 250 Mbytes/sec in each direction. PCIe provides a true
bidirectional link, so with one link available, the server offers an aggregate data
transmission rate of approximately 500 Mbytes/sec.
The Sun SPARC Enterprise T5120 Server has two PCIe x4 slots and one PCIe x8 slot
while the Sun SPARC Enterprise T5220 Server has four PCIe x4 and two PCIe x8 slots.
Note: The x4 and x8 nomenclature refers to the number of lanes. Each lane supports 250
Mbytes/second.
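Putting the lane figures together, a slot's bandwidth scales linearly with its lane count. The sketch below is illustrative only, assuming the first-generation PCIe rate of roughly 250 Mbytes/sec per lane, per direction, as quoted above:

```python
MB_PER_LANE_PER_DIRECTION = 250  # approximate PCIe 1.x rate

def slot_bandwidth_mb(lanes, bidirectional=True):
    """Approximate slot bandwidth in Mbytes/sec."""
    per_direction = lanes * MB_PER_LANE_PER_DIRECTION
    return 2 * per_direction if bidirectional else per_direction

print(slot_bandwidth_mb(4, bidirectional=False))  # 1000 (x4, one direction)
print(slot_bandwidth_mb(8))                       # 4000 (x8, both directions)
```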
The Sun SPARC Enterprise T5120/T5220 Servers utilize the UltraSPARC sun4v
architecture, a hyper-privileged architecture.
Enhancements in this line include:
Use of the Sun UltraSPARC T2 processor, which supports multiple threads on multiple
cores and conforms to the new architecture by segregating hardware-specific drivers
from the OS.
Implementation of virtualization, which introduces a layer between the operating system
and the platform that removes the need for the operating system to have direct register
access to the processor, memory, and critical I/O devices. This benefits customers
because hardware can be upgraded without changing the software infrastructure.
The Sun SPARC Enterprise T5120/T5220 Servers provide support for a new Ethernet
driver, referred to as e1000g. This new driver provides the following features:
Support for data link provider interface (DLPI) version 2, which enables a data link
service user to access and use a variety of conforming data link service providers without
special knowledge of the provider's protocol;
Physical layer static configuration using FORTH code, or FCode properties;
Portability to Solaris X86 and SPARC platforms through device driver interface, or DDI
frameworks;
Fault management infrastructure support, which provides error handling and management
capabilities;
Message signaled interrupts, or MSI support, which allows the devices to communicate in
a peer-to-peer manner without the involvement of the host CPU.
These implementations bring the gigabit Ethernet interface in line with Sun’s other
network interface drivers.
The Sun SPARC Enterprise T5120/T5220 Servers implement RAS features, which
enable the system to maintain a higher uptime rate. RAS features are incorporated across
both hardware and software components, including:
System busses
PCI bridges
Processors
Memory management
System power
And system cooling
The inclusion of RAS features in these entry-level servers continues Sun’s commitment
to providing enterprise-level services to lower-cost solutions. The RAS features included
in the Sun SPARC Enterprise T5120/T5220 Servers include:
N+1 redundant hot-pluggable power supplies and hot-swappable fan modules
Extended ECC protection on L2 cache data path and memory interface
DRAM extended ECC, which allows the detection of up to 4 bits in error, as long as they
are on the same DRAM
Standardized error message generation
Environmental monitoring with the inter-integrated circuit (I2C) control serial bus
Remote access using ILOM-based hardware and software components
Remote monitoring using Sun Net Connect
The Sun SPARC Enterprise T5120/T5220 Servers implement several key technologies
that give them a great advantage in the entry-level server market for companies that
require database servers, web servers, and application servers.
In this module, we have discussed:
An overview of the Sun SPARC Enterprise T5120/T5220 Servers, including their
features, target markets, and target applications
The key technologies driving the Sun SPARC Enterprise T5120/T5220 Servers,
including the new processor, architecture, and use of a faster PCI-based bus.
The Sun SPARC Enterprise T5120/T5220 servers have the following circuit boards
installed in the chassis:
Motherboard
Power distribution board (2 in the T5220)
Paddle board
USB board
Disk backplane
Fan boards (2)
PCIe riser cards (3)
The motherboard is actually an assembly made up of the motherboard itself and a tray or
carrier. The motherboard assembly comes in several different versions, with the only
difference being processor speed and the number of cores.
The motherboard includes a direct-attach CPU module, slots for 16 DIMMs, memory
control subsystems, and all system controller (ILOM) logic.
In addition, a removable NVRAM contains all MAC addresses, the host ID, and OpenBoot
PROM configuration data. When replacing the motherboard, the NVRAM can be
transferred to the new board to retain system configuration data.
The service processor (ILOM) subsystem contains a PowerPC Extended Core, and a
communications processor that controls the host power and monitors host system events
(power and environmental). The ILOM controller draws power from the host’s 3.3V
standby supply rail, which is available whenever the system is receiving AC input power,
even when the system is turned off.
The power distribution board distributes the main 12V power from the power supplies to
the rest of the system. It is directly connected to the paddle card, and to the motherboard
via a bus bar and ribbon cable.
The power supply backplane carries 12V power from the power supplies to the power
distribution board via a pair of bus bars.
In the Sun SPARC Enterprise T5120, the power supplies connect directly to the power
distribution board.
The paddle board is an assembly made up of the board, a metal mounting bracket, and a
top cover interlock or “kill” switch.
The paddle board serves as the interconnect between the fan connector boards and the
SAS backplane.
The disk backplane includes the connectors for the SAS drives, as well as the
interconnect for the USB board, Power and Locator buttons, and system/component
status LEDs. There are two different SAS backplanes, depending on the server form
factor:
1U - Four-disk backplane
2U - Eight-disk backplane
The USB board connects directly to the SAS backplane. It is packaged with the DVD
drive as a single customer-replaceable unit (CRU).
There are three PCI riser cards per system, each attached to a slot at the rear of the
motherboard.
In 1U systems, each riser card supports one card.
In 2U systems, each riser supports two cards.
PCI riser cards come in two versions:
One that supports either an x8 PCIe card or an XAUI card, and
One that supports x16 PCIe cards.
Slots on the motherboard are keyed so that you can only plug in the correct type of riser
card into the motherboard.
Note: The slots that you see on the motherboard are not industry standard PCIe slots.
They are Sun proprietary slots that only accommodate the Sun riser cards.
Most of the electrical connectivity in the Sun SPARC Enterprise T5120/T5220 servers is
accomplished through connectors on the system’s infrastructure boards.
The only system cables in the chassis are:
PDB to MB ribbon cable
Horizontal to vertical PDB ribbon cable (2U only)
Disk backplane to MB cable (1 in the 1U, 2 in the 2U)
Top cover interlock switch cable
The diagram shown illustrates the overall system architecture for the Sun SPARC
Enterprise T5120/T5220 servers.
The top, center portion of the diagram depicts the UltraSPARC T2 CPU and the memory
architecture. The processor uses four memory channels, each of which manages one of
four banks, each with four FBDIMM slots.
There are three DC-DC converters required to deliver power to the processor and
FBDIMM connectors.
To the right of the UltraSPARC T2 processor, you will find the service processor
architecture. The service processor connects to the CPU through the serial system
interface (SSI) communications bus.
To the left of the UltraSPARC T2 processor is the LSI 1068E SAS/SATA disk
controller.
Finally, the lower half of the architectural diagram shows the I/O architecture. The
UltraSPARC T2 CPU interfaces with the I/O through three PCIe switches.
We will provide an in-depth explanation of each of these sections.
The illustration depicts all of the on-die components and data paths among the different
components for the UltraSPARC T2 processor.
Here, you have eight cores communicating through the Cache Crossbar (CCX) to a total
of 4 MB of 16-way set-associative L2 cache, and then through the memory controller
channels, labeled MCU 0 through 3.
You’ll notice in the diagram that each CPU core has its own floating point unit (FPU).
Each core also has its own crypto unit.
Also, two 10 Gb Ethernet (XAUI) interfaces and two PCI Express interfaces are
integrated onto the chip.
The UltraSPARC T2 processor is designed to operate with the first generation industry
standard Fully Buffered Dual In-line Memory Modules (FBDIMMs).
This feature dictates the memory architecture of the Sun SPARC Enterprise
T5120/T5220 servers.
The UltraSPARC T2 processor has four memory controllers, referred to as memory
branches. Each branch has two channels. Each channel supports 10 Southbound (from
MCU to memory) and 14 Northbound (from memory to MCU) high-speed serial lanes
utilizing differential pair signaling.
The UltraSPARC T2 processor allows up to a maximum of 32 FBDIMMs to be accessed,
but is limited to 16 FBDIMMs in the Sun SPARC Enterprise T5120/T5220 servers.
An FBDIMM module consists of standard DDR2 memory chips plus a high-speed serial
link device, called an Advanced Memory Buffer (AMB) chip, which buffers all memory
chip address, data, and control signals from the outside world. The AMB serializes this
information into high-speed, differentially driven data links that form point-to-point
connections from the processor to the first FBDIMM, or from one FBDIMM to another
FBDIMM in a daisy-chain fashion. Since the Sun SPARC Enterprise T5120/T5220
Servers utilize FBDIMM memory modules that contain the AMB chip within the module,
only the processor-to-DIMM and DIMM-to-DIMM link signals must be routed on the
motherboard.
The Sun SPARC Enterprise T5120/T5220 servers support three DIMM sizes at revenue
release: 1 GB, 2 GB, and 4 GB.
Because all four branches need to be populated, the following three memory
configurations are supported:
4 FBDIMMs, 8 FBDIMMs and 16 FBDIMMs.
In total, the system supports a maximum total system memory of 64 GB. The supported
DIMMs are FBDIMMs, which must all be the same density within a memory branch.
Individual, faulty FBDIMMs can be replaced, meaning FBDIMMs do not have to be
replaced in pairs.
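Assuming every FBDIMM in the system is the same size (a stricter condition than the per-branch rule, adopted here only to keep the sketch simple), the supported DIMM counts and sizes yield the following possible total capacities:

```python
DIMM_SIZES_GB = (1, 2, 4)           # sizes at revenue release
SUPPORTED_DIMM_COUNTS = (4, 8, 16)  # all four branches always populated

capacities_gb = sorted({count * size
                        for count in SUPPORTED_DIMM_COUNTS
                        for size in DIMM_SIZES_GB})
print(capacities_gb)  # [4, 8, 16, 32, 64]
```

Mixing densities across branches (while keeping each branch uniform) would allow additional intermediate totals.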
We’ll start our discussion of the I/O architecture with the two XAUI ports on the
UltraSPARC T2 processor. These control the two XAUI slots on the motherboard.
To the left of the XAUI ports is a PCIe port on the UltraSPARC T2 processor.
The first component on the PCIe port is the ST Probe. This is a Soft Touch probe which
is used for diagnostics.
Below the ST Probe is the first PCIe switch.
Coming off the bottom of the first switch is a PCIe x8 slot. Topologically, this slot is the
closest to the processor and has the least latency. All the other slots (two x4 slots in the
Sun SPARC Enterprise T5120, and an additional x8 slot and two additional x4 slots in
the Sun SPARC Enterprise T5220) are connected to a second PCIe switch, which is
cascaded off the first PCIe switch.
The third PCIe switch, shown on the left side of the diagram, controls the remaining
on-board I/O, which includes:
Four Gigabit Ethernet interfaces
Four USB ports
Two in the front
Two in the rear
DVD-ROM drive
The SAS/SATA controller provides support for embedded mirroring for the internal disks
of the Sun SPARC Enterprise T5120/T5220 servers. It supports RAID levels 0 and 1
(striping and mirroring, respectively), which allows you to mirror the boot drive.
Mirroring does not require software intervention; it is handled by the controller itself.
The SAS/SATA controller also provides external 32-bit support for Flash ROM and
non-volatile static random access memory (NVSRAM).
The controller implements the Fusion-MPT, or message passing technology architecture,
which features a performance-based message passing protocol. By managing all I/O and
coalescing interrupts to minimize system bus overhead, the controller requires small
device drivers that are independent of the I/O bus. This represents a savings as a single
device driver is used for SAS/SATA, SCSI, and fiber channel (FC).
The service processor FPGA controls several low-level system functions, including:
• Reset control
• Clock control
• Power control
• Interrupts for the ILOM and UltraSPARC T2 processor
A mailbox communications system is used by both the UltraSPARC T2 processor and the ILOM
to gain indirect access to these functions. The mailbox pointers are held in the FPGA registers
while data is maintained in the SRAM. Both the processor and the ILOM use the mailbox, and
therefore the SRAM, to pass data between each other.
The Virtual Blade System Controller (vBSC) controls most FPGA functionality and provides
interfaces for ILOM to call these functions when required, such as booting the system and
configuring specific options.
I2C Devices
The I2C bus is used on the service processor for the time-of-day (TOD) functionality and
several SEEPROMs, and to monitor the host server’s environmental monitoring devices.
The OSP card supports several devices monitored on the I2C bus for itself, including:
• IMAX
• ILOM
• NVRAM
• FRU PROM
• TOD clock
• Socketed system configuration PROM containing host identification and MAC address
The I2C bus is also responsible for the host server’s monitoring and control devices,
which include temperature sensors, power supply and fan status, and LEDs.
The serial port implements the full complement of RS-232-style modem controls. It can be
connected to a terminal server or to a modem to gain access to the service processor and
therefore console access to the Sun SPARC Enterprise T5120/T5220 servers.
The service processor draws power from the 3.3V standby that is routed through the
motherboard from the PSUs, regardless of whether the server is on.
When power is applied to the system and the 3.3VDC receives power, the service
processor boots.
As predicted by Gordon Moore in his article in Electronics magazine in 1965, the number
of transistors on a chip has doubled approximately every two years. This has come to be
known as Moore's Law.
In addition, clock frequencies doubled on roughly the same schedule. However, memory
speeds have not kept pace: DRAM has doubled in speed only every 6 years, leaving a
growing speed gap between processors and memory.
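The widening gap can be illustrated with the doubling periods quoted above (a rough model, not measured data):

```python
def speedup(years, doubling_period_years):
    """Growth factor after `years`, given a doubling period."""
    return 2 ** (years / doubling_period_years)

YEARS = 12
cpu_gain = speedup(YEARS, 2)   # processors double roughly every 2 years
dram_gain = speedup(YEARS, 6)  # DRAM doubles roughly every 6 years

print(cpu_gain)              # 64.0
print(dram_gain)             # 4.0
print(cpu_gain / dram_gain)  # 16.0 -- the relative gap after 12 years
```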
High-speed processors can execute instructions at a fast rate. An executing instruction
stream, called a thread, is made up of compute time and, for memory access instructions,
memory latency time.
The compute time can be fast, but overall processor performance is stalled whenever a
memory access instruction is performed. Due to the speed gap that exists between
processors and memory, the processor can spend up to 75 percent of its processing cycles
waiting on memory.
One way to hide this memory latency is to start another thread that the processor can
execute while the first thread is stalled.
This concept, known as multithreading, or MT, allows multiple instruction streams to be
executed within the same period of time.
An MT processor has multiple sets of registers and other thread states which allow
threads to execute either simultaneously, if the processor can physically support this, or
to be switched off when one thread is delayed waiting for data or instructions, such as
memory access.
Adding more transistors within the chip to support more threads in parallel improves the
overall processor performance and significantly lowers the effect of memory latency.
With multiple threads, cycles that would otherwise be wasted are available as compute
cycles for another thread.
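An idealized utilization model makes this concrete. Assuming each thread stalls on memory 75 percent of the time, as cited above, and that stalls overlap perfectly:

```python
def core_utilization(threads, stall_fraction=0.75):
    """Fraction of cycles doing useful work when `threads` share one core
    (idealized: stalls overlap perfectly and threads never contend)."""
    compute_per_thread = 1.0 - stall_fraction
    return min(1.0, threads * compute_per_thread)

print(core_utilization(1))  # 0.25 -- a single thread leaves the core 75% idle
print(core_utilization(4))  # 1.0  -- four threads can fully hide the latency
```

Real workloads fall short of this ideal, but the model shows why a handful of hardware threads per core recovers most of the otherwise wasted cycles.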
Another method of improving processor throughput is through chip multiprocessing, or
CMP.
CMP duplicates the processing unit of the processor, where multiple cores are included
on a single chip. Functionally, this allows one thread to be active at a time on each core.
This improves utilization of chip resources.
Because each of the multiple cores now has its own physical resources, each core
processes its threads independently of the other cores. The number of parallel threads
that can be executed has increased accordingly.
To realize the large opportunity for throughput that the processor provides, the operating
system must be able to effectively manage the individual cores and threads of the
processor.
Applications and the operating system should be able to fully utilize all cores so that
CPU utilization is high. Threads are switched in and out quickly, depending on their
needs. If a thread stalls on a memory access as the result of a cache miss, it can be
moved aside so that another thread executes its instructions in the meantime. Keep in
mind that each thread has its own set of registers, so its state is maintained.
The operating system, specifically the scheduler, is vital in scheduling threads for the
CMT environment. While the operating system appears to treat these cores as individual
processors, support is added to various OS subsystems to ensure that they are
CMT-aware.
The Solaris OS treats these logical processors like traditional symmetric
multiprocessing, or SMP, processors, scheduling runnable threads across them.
However, there is an awareness of grouping, in which threads are organized by the core
they are associated with. This helps the kernel with scheduling, as it alerts the system to
which threads are sharing resources so that it can better manage scheduling policies to
improve performance, primarily through cache utilization and maximizing aggregate data
path bandwidth.
Unless bound to a processor by the user, the OS scheduler must decide which threads
should be run on the same processor or on separate processors.
The scheduler uses several methods to balance loads and improve performance,
including:
Load balancing in which the scheduler uses a scheduling policy to distribute the
workload across the logical processors to help maximize per-chip resource availability. It
looks at the current load on a core to determine the best fit for the thread requiring
execution time. If the core is underloaded, it can handle another thread without straining
access to resources.
Thread-to-cache affinity looks at a thread’s timestamp to see when it last ran on its
prior CPU. If that was less than a predetermined time ago, the thread is reassigned to
that CPU. This method also takes into account the size of the cache to ensure the thread
has the resources required to complete its tasks efficiently.
Shared run queues provide an advantage with shared cache. Once a thread runs on a
strand within a specific core, there may be no disadvantage to it running on any other
strands in the same core. Hence, the logical processors on a core share a dispatch queue.
Updating CPU performance counters, or CPCs, to make them more flexible and available
to the kernel, as well as to help make better scheduling decisions based on workload
characterizations
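The load-balancing and affinity policies described above can be sketched as a toy scheduler. Everything here is hypothetical: the names `pick_cpu` and `AFFINITY_WINDOW` and the data shapes are invented for illustration and do not reflect the actual Solaris dispatcher code.

```python
import time

AFFINITY_WINDOW = 0.5  # hypothetical "recently ran" threshold, in seconds

def pick_cpu(thread, cpus, loads):
    """Prefer the thread's last CPU if it ran there recently (cache
    affinity); otherwise fall back to the least-loaded CPU (load
    balancing)."""
    last = thread.get("last_cpu")
    if last is not None and time.monotonic() - thread["last_run"] < AFFINITY_WINDOW:
        return last
    return min(cpus, key=lambda cpu: loads[cpu])

cpus = [0, 1, 2, 3]
loads = {0: 3, 1: 1, 2: 2, 3: 5}  # runnable threads queued per CPU

recent = {"last_cpu": 2, "last_run": time.monotonic()}
stale = {"last_cpu": 2, "last_run": time.monotonic() - 1.0}

print(pick_cpu(recent, cpus, loads))  # 2 -- reuses the still-warm cache
print(pick_cpu(stale, cpus, loads))   # 1 -- rebalances to the least-loaded CPU
```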
This module describes the CRU and FRU removal and replacement procedures for
the Sun SPARC Enterprise T5120/T5220 server.
In this module we will discuss the procedures that will assist you in replacing the various
hardware components of the Sun SPARC Enterprise T5120/T5220 server. Some of the
components are designed to be replaced by the customer. These components are referred
to as Customer Replaceable Units, or CRUs. Other components should be replaced by an
authorized, trained Sun engineer. These components are referred to as Field Replaceable
Units, or FRUs.
The following Sun SPARC Enterprise T5120/ T5220 components can be serviced by the
customer:
Fan modules
Power supply units, of which there are two, accessible from the rear of the server
PCI cards and riser cards
DDR2 FBDIMMs, which provide the system memory and are available in 1 GB, 2 GB, and 4 GB sizes
System battery
SAS disk drives
Rail mount kit, which is used to rack-mount the Sun SPARC Enterprise T5120/T5220
server
Cable management arm, which is used to streamline the cables at the rear of the server
The following Sun SPARC Enterprise T5120/ T5220 server components should be
serviced by an authorized, trained engineer:
When servicing the internal components to the Sun SPARC Enterprise T5120/T5220
server, be sure to:
Turn off all peripheral devices connected to the server.
Turn off the server itself, except in the case of hot-swap components.
Label and disconnect all of the cables coming into the server.
And finally, ensure that you follow ESD precautions.
Before working on the Sun SPARC Enterprise T5120/T5220 server, be sure to have:
A No. 1 flat-blade screwdriver and a No. 2 Phillips screwdriver
An electrostatic discharge (ESD) mat
A grounding wrist strap or foot strap
Hot-swappable components are those that you can install or remove while the system is
running, without affecting the system’s performance. However, you might have to
prepare the operating system before the hot-swap operation is performed.
The following components are hot-swappable in a Sun SPARC Enterprise
T5120/T5220 server:
Note: The system fans are hot-pluggable. No preparation of the operating system is
required before removing and replacing system fans.
The top cover for both the Sun SPARC Enterprise T5120 and T5220 server includes an
integrated, latched door for access to the hot plug fans.
Depending on the component that you are servicing, you might need to remove the top
cover.
Click the top link on your screen for a printable, text-based procedure on the removal
and replacement of the top cover.
Click the bottom link on your screen for an animated demonstration of the removal of the
top cover.
1. Push the two fan door latches toward the rear of the server and open the fan door.
2. While depressing the top cover push button latch, slide the cover toward the rear of the
chassis approximately 0.5 inch (12 mm).
3. Grasp the cover by its edges and lift it straight up from the chassis.
1. Set the cover on the chassis so that the tabs on the cover align with the notches in the
chassis.
2. Slide the cover toward the server front approximately 0.5 inch (12 mm), ensuring that the
push button latch engages.
3. Close the fan door and push down near the two latches until they snap in place.
For each of the CRUs listed, click the component name for a printable, text-based
procedure on the removal and replacement of that component.
Click the link provided to view an animated demonstration on servicing CRUs.
On the Sun SPARC Enterprise T5120, the Fan Fault indicators are located on the fan
board.
On the Sun SPARC Enterprise T5220, the Fan Fault indicators are located on the fan
modules.
4. Pull up on the fan module handle until the fan module is removed from the chassis.
Note: The removal procedure for the Sun SPARC Enterprise T5120 and T5220 fan
modules is the same. The only difference is that the Sun SPARC Enterprise T5120
server has 4 fan modules (8 fans) and the Sun SPARC Enterprise T5220 server has 3
fan modules (6 fans).
1. With the top cover door open, install the replacement fan module into the server. The fan
modules are keyed to ensure they are installed in the correct orientation.
2. Apply firm pressure to fully seat the fan module.
3. Verify that the Fan Fault indicator on the replaced fan module is not lit.
4. Close the fan door.
5. Verify that the Top Fan indicator, Service Required indicators, and the Locator
indicator/Locator button are not lit.
1. If the server is in a rack with a cable management arm attached, swivel open the cable
management arm to view the power supplies.
2. Identify which power supply you will replace. Each power supply has an amber LED that you
can view from the rear of the server. If the amber LED is on, the power supply is faulty and should
be replaced.
3. Disconnect the AC power cord from the power supply that you are replacing. The power
supplies are hot-swappable, so you do not have to shut down the server or disconnect the other
power supply.
Note: The Service Action Required LEDs on the front panel and back panel blink when a
power supply is unplugged.
a. Grasp the power supply handle and push the thumb latch toward the center of the power supply.
b. While continuing to push on the latch, use the handle to pull the power supply from the chassis.
1. Align the power supply with the empty bay in the chassis.
2. Press the power supply into the bay until it firmly engages the connector on the power
distribution board. It is fully seated when the thumb-latch clicks into place.
4. Swivel any cable management arm back into the closed position.
1. Unpackage the replacement PCIe or XAUI card and place it on an antistatic mat.
2. Locate the proper PCIe/XAUI slot for the card you are replacing or installing.
3. If necessary, review the PCIe and XAUI Card Guidelines to plan your installation.
4. Disconnect any data cables connected to the cards on the PCIe/XAUI riser being
removed. Label the cables to ensure proper connection later.
5. Remove the riser board.
a. Remove the #2 Phillips screw securing the riser to the motherboard.
b. Slide the riser forward and out of the system.
6. Insert the PCIe/XAUI card into the correct slot on the riser board.
7. Replace the riser board.
a. Slide the riser back until it seats in its slot in the back panel.
b. Replace the #2 Phillips screw securing the riser to the motherboard.
8. Re-install any data cables connected to the cards on the PCIe/XAUI riser being installed.
9. Install the top cover.
Note: The procedures are the same for both the Sun SPARC Enterprise T5120 and the
Sun SPARC Enterprise T5220.
4. Locate the memory module socket from which you will remove an FBDIMM. Press the FBDIMM
fault button.
The FBDIMM fault button is located on the motherboard near the FBDIMMs.
Faulty FBDIMMs are identified with a corresponding amber LED on the motherboard.
5. Push down on the ejector tabs on each side of the FBDIMM until the FBDIMM is released.
4. Locate the memory module socket in which you will install an FBDIMM.
5. Ensure that the ejectors, at each end of the memory socket, are fully open (rotated downward)
to accept the new FBDIMM.
7. Press the FBDIMM straight down until it snaps into place and the ejectors engage the cutouts
in the FBDIMM's left and right edges.
1. Power off the system and slide the system out of the rack.
2. Remove the top cover.
3. Remove PCIe/XAUI riser 0.
4. Using a small (No. 1 flat-blade) screwdriver, press the latch and remove the battery from
the motherboard.
Note: Install the new battery with the plus sign (+) facing up.
1. On the drive you plan to remove, push the hard drive release button.
Caution: The latch is not an ejector. Do not bend it too far to the left. Doing so can
damage the latch.
2. Grasp the latch and pull the drive out of the drive slot.
1. If necessary, remove the blank panel from the chassis.
Note: Sun SPARC Enterprise T5120 servers might have three blank panels covering
unoccupied drive slots. Sun SPARC Enterprise T5220 servers might have as many as
seven blank panels covering unoccupied hard drive slots.
Hard drives are physically addressed according to the slot in which they are installed. If you
removed an existing hard drive from a slot in the server, you must install the replacement drive in
the same slot as the drive that was removed.
3. Slide the drive into the drive slot until it is fully seated.
1. Slide the DVD/USB module into the front of the chassis until it seats.
2. Install the hard drive you removed during the DVD/USB module removal procedure.
1. Power off the system and slide the system out of the rack.
Note: If you are replacing a defective fan module connector board, remove only the fan
modules that are necessary to remove the defective fan module connector board.
4. Remove the Phillips screw that secures the fan module connector board to the chassis.
5. Slide the fan board toward the left side of the chassis approximately 0.5 inch (12 mm) to
disengage the fan board from the paddle board and the bottom of the chassis.
1. Power off the system and slide the system out of the rack.
2. Remove the top cover.
3. Remove all the disk drives from the server.
4. Remove the DVD, DVD carrier and USB board from the server.
5. Remove the 4 screws securing the disk cage assembly to the chassis. There are 2
screws on the side of the chassis near the right front and 2 screws on the side of the
chassis near the left front.
6. Slide the disk cage assembly toward the front of the chassis approximately 0.5 inch (12
mm). This releases the disk cage assembly from the chassis bottom.
7. Disconnect the disk cable from the disk backplane. The Sun SPARC Enterprise T5120
server has 1 disk cable. The Sun SPARC Enterprise T5220 server has 2 disk cables.
8. Remove the 2 screws that secure the disk backplane to the disk cage assembly. The Sun
SPARC Enterprise T5220 server has 4 screws securing the disk backplane to the disk
cage assembly.
9. Slide the disk backplane down approximately 0.25 inch (6 mm) to release the disk
backplane from the "fingers" on the disk cage assembly that protrude through keyhole
slots in the disk backplane.
10. Pull the disk backplane away from the disk cage assembly.
The motherboard assembly consists of the motherboard and the tray that the motherboard sits in.
They should be removed and installed as a single unit.
1. Power off the system and slide the system out of the rack.
2. Remove the top cover.
3. Remove the air baffle.
a. Open the air baffle.
b. Disengage the rear of the air baffle from the motherboard and rotate the air baffle
forward.
c. Press in the edges of the air baffle to disengage its pins from the chassis.
4. Disconnect the motherboard to power distribution board ribbon cable.
5. Disconnect the disk cable from the motherboard. The Sun SPARC Enterprise T5220
server has 2 disk cables.
6. Remove all PCI boards and riser cards.
7. Remove all FBDIMMs.
8. Remove the 4 screws connecting the motherboard to the bus bar.
9. Loosen the captive screw securing the motherboard to the chassis.
The captive screw is colored green, and is located to the left of the bus bar screws.
10. Using the green handles, slide the motherboard toward the back of the system, then tilt
the motherboard assembly and lift it out of the chassis.
Note: It is easier to service the power distribution board (PDB) with the bus bar
assembly attached.
1. Power off the system and slide the system out of the rack.
6. Remove the DVD, DVD carrier and USB board from the server.
7. Remove the 4 screws securing the disk cage assembly to the chassis. There are 2 screws on
the side of the chassis near the right front and 2 screws on the side of the chassis near the left
front.
8. Disconnect the disk cable from the motherboard, so that it does not obstruct access to the
power distribution board. Note: There are 2 disk cables connected to the motherboard in the Sun
SPARC Enterprise T5220 server.
10. Disconnect the top cover intrusion switch cable connector from the power distribution board.
11. Remove the 4 screws securing the power distribution board to the bus bar.
Note: The Sun SPARC Enterprise T5220 has 4 additional screws to remove. They are
connected to 2 additional bus bars that connect to a vertical power distribution board.
12. The Sun SPARC Enterprise T5220 server also has a ribbon cable that connects the
horizontal and vertical power distribution boards. This must be disconnected also.
13. Remove the single screw securing the power distribution board to the bottom of the chassis.
14. Slide the power distribution board toward the left approximately 0.5 inch (12 mm) to release
the power distribution board from the paddle board and the captive standoffs on the bottom of the
chassis.
1. Power off the system and slide the system out of the rack.
2. Remove the server top cover.
3. Lift the hinged air baffle.
4. Disconnect one end of the ribbon cable from the power distribution board and the other
end from the motherboard.
1. Power off the system and slide the system out of the rack.
6. Remove the DVD, DVD carrier and USB board from the server.
7. Remove the 4 screws securing the disk cage assembly to the chassis. There are 2 screws on
the side of the chassis near the right front and 2 screws on the side of the chassis near the left
front.
8. Disconnect the disk cable from the motherboard, so that it does not obstruct access to the
power distribution board.
10. Disconnect the top cover intrusion switch cable connector from the power distribution board.
11. Remove the 4 screws connecting the power distribution board to the bus bar.
12. Remove the single screw securing the power distribution board to the bottom of the chassis.
13. Slide the power distribution board toward the left approximately 0.5 inch (12 mm) to release
the power distribution board from the paddle board and the captive standoffs on the bottom of the
chassis.
15. Remove the 2 screws that secure the paddle board bracket to the chassis.
Note: Do not remove the 2 screws that secure the paddle board to the paddle board
bracket.
16. Slide the paddle board bracket toward the rear of the chassis approximately 0.5 inch (12 mm)
to release the paddle board bracket from the chassis.
17. Lift the paddle board and bracket out of the chassis.
To install the paddle board, reverse this procedure.
1. Power off the system and slide the system out of the rack.
2. Remove the top cover.
3. Remove the rightmost PCI card and riser.
4. The system configuration PROM is the chip located at j7901 on the motherboard. Lift it
straight up off the motherboard.
1. Align the light pipe assembly with the mounting holes on the hard drive cage.
2. Secure the light pipe assembly with a #2 Phillips screw.
3. Install the hard drive backplane.
4. Install the hard drive cage.
This module describes the procedures for configuring the Sun SPARC Enterprise
T5120/T5220 server service processor.
In this module we will discuss the details of the Sun SPARC Enterprise T5120/T5220
servers’ service processor. The service processor provides a built-in system management
application named ILOM to the server. The service processor is also known as the system
controller and is the main communication path to the rest of the system. Its key functions
are to configure, administer, and monitor the Sun SPARC Enterprise T5120/T5220
servers.
To set up the service processor with initial network configuration information, you must
establish a connection through ILOM to the service processor. Until the service processor
has an IP address assigned to it, you must use a serial connection to communicate with
the service processor. After establishing a serial connection to the service processor, you
can choose to configure the service processor with a static or DHCP IP address.
The default action for the service processor is to try to use DHCP for its network
configuration information. When you apply power to the system for the first time, ILOM
broadcasts a DHCPDISCOVER packet. If you have an established DHCP server on the
network, the DHCP server returns a DHCPOFFER packet containing an IP address and
other network configuration information to the service processor.
If you prefer to configure the service processor with a static IP address, and you have a
DHCP server established on your network, you can configure the static IP address prior
to attaching a LAN cable to the NET MGT port of the server.
Note: Sun recommends a static IP address for the service processor.
Whether static or DHCP IP addresses are assigned, you must initially establish a serial
console connection to communicate with ILOM.
You can access the ILOM CLI at any time by connecting a terminal or a PC running
terminal emulation software to the serial management port on the chassis.
To connect to the ILOM using a serial connection:
Step 2. Before trying to connect to the port, make sure that your display device is
properly configured. If you are using a laptop, desktop computer, or PDA, make sure
that a terminal emulation program is loaded and started.
For all devices, make sure that they are configured with the following default
parameters:
8N1: eight data bits, no parity, one stop bit
9600 baud
Hardware flow control disabled
Step 3. Connect a serial cable from the serial management port on the rear of the chassis
to a terminal device.
The serial management port is the left-most RJ-45 port, as viewed from the rear of the
chassis.
The CLI architecture is based on a hierarchical namespace, which is a predefined tree that
contains every managed object in the system. This namespace defines the targets for each
command verb.
The ILOM includes three namespaces:
The /SP namespace, the /SYS namespace, and the /HOST namespace.
The /SP namespace manages the ILOM. For example, you use this space to manage
users, clock settings, and other ILOM issues.
The /SYS namespace manages the host system. For example, you can change the host
state, read sensor information, and access other information for managed system
hardware.
The /HOST namespace manages and monitors the host operating system.
Use the set command to change properties and values for network settings.
Network settings have two sets of properties:
pending - which are the updated settings, not currently in use
active - read-only settings, currently in use by the ILOM
To change settings:
First, enter the updated settings as the pending settings.
Then, set the commitpending property to true.
To display network settings, type the command:
show /SP/network
If you are already in the /SP/network directory, type:
show
Follow these steps to assign a static IP address to the network management port:
Step 1. At the ILOM prompt, type the following command to set the working directory.
-> cd /SP/network
Step 2. Type the following commands to specify a static Ethernet configuration.
-> set pendingipaddress=129.144.82.26
-> set pendingipnetmask=255.255.255.0
-> set pendingipgateway=129.144.82.254
-> set pendingipdiscovery=static
-> set commitpending=true
Note: The network values shown are samples only. You must specify the IP address,
netmask, and gateway appropriate for your network configuration.
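Once the pending values are committed, you can confirm the new active settings with the
show command introduced earlier:
-> show /SP/network
The active properties in the output should now reflect the values you committed.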
Ensure that the same IP address is always assigned to an ILOM by either assigning a
static IP address to your ILOM after initial setup, or configuring your DHCP server to
always assign the same IP address to an ILOM.
This enables the ILOM to be easily located on the network.
When using the network management port, the wire speed is set to 10/100 megabit, full
duplex. Connection through this Ethernet port is allowed only after you have configured
the service processor using the serial port to have a valid IP address on your network.
The service processor accepts ssh only through its Ethernet port.
The first time the service processor is initialized, the following default conditions exist:
The network is enabled
DHCP is enabled
ssh service is enabled
The shell for the root user is the DMTF CLI
To create a new user with the ALOM shell as the interface, follow the procedure on
your screen. This procedure creates the user admin to emulate the ALOM administrator.
Step 1. Log in as the root user.
Step 2. Create the admin user.
-> create /SP/users/admin
Step 3. Set the role and cli mode.
-> set /SP/users/admin role=Administrator
-> set /SP/users/admin cli_mode=alom
Step 4. Log out and log back in as the admin user. In subsequent logins as admin, you’ll
get the ALOM shell.
Over the course of a product’s life cycle, new versions of firmware are released. To
verify which version of firmware your system is running, issue the version command. In
the command output, you are looking for the ILOM version, the SC firmware version,
and the OBP version.
If you are considering updating the ILOM firmware, be aware of the following:
It is likely that the firmware images available to download from the SunSolve database
are more current than the image installed on your service processor at the factory.
The BIOS and the SP firmware are simultaneously updated. A single firmware image
contains both the BIOS and the SP firmware.
A firmware upgrade will cause the server and ILOM to be reset. It is recommended that
you perform a clean shutdown of the server prior to the upgrade procedure.
An upgrade takes about five minutes to complete.
ILOM will enter a special mode to load new firmware. No other tasks can be performed
in ILOM until the firmware upgrade is complete and ILOM is reset.
Note: Ensure that you have reliable power before upgrading your firmware. If power to
the system fails (for example, if the wall socket power fails or the system is unplugged)
during the firmware update procedure, the ILOM could be left in an unbootable state.
From the service processor, you can power on the server by typing:
start /SYS
You can use the stop command to perform an orderly shutdown of the server followed by
a power off of the server, as shown on your screen.
You can also skip the orderly shutdown and force an immediate power off with the -force
option, as shown on your screen.
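As a reconstruction of what the screen shows, the orderly shutdown and the forced
power off look like this at the ILOM prompt:
-> stop /SYS
-> stop -force /SYS
The first command asks the host operating system to shut down gracefully before
removing power; the second removes power immediately, regardless of host state.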
Output from POST, OBP, and the Solaris OS is displayed to the system console, which is
accessible through the service processor.
To initiate a connection to the server console, execute the start /SP/console command
from the service processor prompt.
To terminate a connection to the server console, execute the stop /SP/console command
from the service processor prompt.
Working in a data center with thousands of servers in racks can sometimes pose a
problem when you are trying to locate one.
The locate LED is a white LED that you can light to help you find your server in a
crowded equipment room. The Locate LED has two states, fast blink and Off.
To turn on the locate LED, type:
-> set /SYS/LOCATE value=Fast_Blink
To turn off the locate LED, type:
-> set /SYS/LOCATE value=Off
The system event log accumulates various events, including administration changes to the
ILOM, software events, warnings, alerts, and events from the IPMI log.
You should note that the ILOM tags all events or actions with LocalTime=GMT (or
UTC). Browser clients show these events in local time, which can cause apparent
discrepancies in the event log: when an event occurs on the ILOM, the event log records
it in UTC, but a client shows it in local time.
To view and clear the system event logs, perform the following steps:
Step 1. Navigate to /SP/logs/event.
-> cd /SP/logs/event
Step 2. From the CLI, enter the show list command: -> show list
The event log scrolls onto your screen.
Step 3. To scroll down, press any key except ‘q’.
Step 4. To stop displaying the log, press ‘q’.
Step 5. To clear the system event log, use the command:
set clear=true
Step 6. The CLI asks you to confirm.
Type y.
The CLI clears the system event log.
Note: The system event log accumulates many types of events, including copies of entries
that IPMI posts to the IPMI log. Clearing the system event log clears all entries, including
the copies of the IPMI log entries. However, clearing the system event log does NOT
clear the actual IPMI log.
You must use IPMI commands to view and clear the IPMI log.
The service processor also allows you to create additional users. The tasks that can be
performed by a user are determined by the privileges that you assign to that user’s
account. You can have a maximum of ten user accounts, including root.
Each user account consists of a user name, a password, and a role.
As the root user, you can add, delete, and list users on the service processor.
To add a user, execute the create command, providing the following information:
username
password, and
role - either administrator or operator
You can remove users from the service processor using the delete command, as shown on
your screen.
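As a reconstruction of what the screen shows (the user name user1 is illustrative),
the add and remove operations look like this:
-> create /SP/users/user1 role=Operator
-> delete /SP/users/user1
In this firmware's CLI, the create command typically prompts for the password if one
is not supplied on the command line.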
To display information about all local user accounts, type show /SP/users.
As root user, you can use the set command to change passwords and roles for existing
user accounts.
For example, when changing the role for user1 from Administrator to Operator, type:
-> set /SP/users/user1 role=operator
To change user1's password, type:
-> set /SP/users/user1 password=new_password
To log into the Sun ILOM Web GUI, follow these procedures:
Step 1. Using secure http, type the IP address of the ILOM service processor into your
web browser, as the example on your screen shows.
The Java Web Console login screen is displayed.
Step 2. Type your user name and password.
Step 3. Click OK.
The ILOM Web GUI is displayed.
Click the link provided on your screen to view a demonstration using the Web GUI.
This module describes the procedures for performing the software installation of the
Sun SPARC Enterprise T5120/T5220 server.
In this module we will discuss the details of the Sun SPARC Enterprise T5120/T5220
servers. We’ll start with the set of service processor commands used to configure and
administer the server. Then we’ll move to the Open Boot PROM to see what changes
have been made there. And we’ll finish up talking about the Solaris OS, including
discussions on differences in command output, disk, and network multipathing, and
performing dynamic reconfiguration.
The Sun SPARC Enterprise T5120/T5220 server has a virtual keyswitch with four
different modes.
1. The first mode is normal in which the service processor uses the diagnostic settings
that you specified with the set command to determine how POST is executed.
2. The second mode is diag in which the vBSC sets the diagnostic level to the interactive
menus.
3. The third mode is stby. In this mode, the service processor prevents you from powering
up the server.
4. And in the fourth mode, locked, the service processor does not allow you to send a
reset to the server.
To set the virtual keyswitch, execute the set command with /SYS/keyswitch_state as the
target and provide a mode as a value for the target.
To view the current keyswitch setting, execute the show /SYS/keyswitch_state
command.
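For example, to prevent the server from being powered on and then confirm the setting
(the exact value capitalization may vary by firmware release), you might type:
-> set /SYS/keyswitch_state=Standby
-> show /SYS/keyswitch_state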
The bootmode target controls the behavior of the host server’s firmware while the host
server is initializing or being reset. There are options or properties that you can use to
control the booting of the server.
One property is state. There are two possible values for the state property.
normal - which means that at the next reset, the system retains its current NVRAM
variable settings, and
reset_nvram - which means that at the next reset, the system returns its non-volatile
random access memory, or NVRAM, variables to the default settings.
Another property is script. The value for the script property is a text string which is the
name of a script. By setting a bootscript string, you are controlling the host server’s Open
Boot PROM firmware method of booting. The string can be up to 64 bytes in length.
The bootmode target requires that you reset the host server within 10 minutes after
issuing the command. If you do not power cycle or reset the system within 10 minutes,
the host server ignores the bootmode target values.
To display the currently selected values, type:
-> show /HOST/bootmode
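For example, to have the host return its NVRAM variables to defaults at the next
reset and run a one-time bootscript, you might type the following (the setenv line is
an illustrative OBP command):
-> set /HOST/bootmode state=reset_nvram
-> set /HOST/bootmode script="setenv auto-boot? false"
Remember that the host must be reset within 10 minutes for these values to take
effect.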
You can power on the system by executing the start /SYS command at the service
processor prompt, or by pressing the power button located in the bottom left corner on the
front of the machine.
Output from POST, OBP, and the Solaris OS is displayed to the system console, which
is accessible through the service processor. To acquire the console, execute the
start /SP/console command from the service processor.
When you issue the command the system will prompt you “Are you sure you want to
start /SP/console (y/n)?”. You can suppress this prompt by using the option -script.
Although multiple users can connect to the system console from ILOM, only one user at
a time has write access to the console. This is referred to as a write-locked session. Any
characters that other users type are ignored. These are referred to as read-only mode
sessions, where users can only view the console.
If no other users have access to the system console, then the user entering the console
session first obtains the write lock automatically.
To see if there are other users connected into the service processor and to see whether
they are connected to the console, execute the show /SP/sessions command.
Note: Terminate a console session by typing #. (pound, period).
If the Solaris OS is running, you can use the stop /SYS command from the ILOM shell to
issue a graceful shutdown to Solaris. It is similar to one of the Solaris OS commands,
such as shutdown, init, or uadmin.
It can take up to 65 seconds for the stop command to completely shut down the
system. This is because ILOM attempts to wait for a graceful shutdown to complete
before the server is powered off.
You can also force the server to power down by executing the -> stop -force /SYS
command or by pressing and holding the power button on the front of the server. This
does an immediate shutdown regardless of the state of the host.
From the service processor console, you can issue the command: -> set
/HOST/send_break_action true to bring down the server to the Open Boot PROM
prompt, or ok prompt. It is the equivalent of executing an L1-A or Stop-A on a system
with a keyboard attached.
For the system to accept a break, the virtual keyswitch must not be in the locked position.
If it is in the locked position, ILOM returns an error message.
To reset the service processor or the server, execute the reset command from the service
processor.
To reset the service processor, type reset /SP
If you reset /SYS, the server reboots using the Open Boot PROM settings that you have
configured. The reset command does not perform a graceful shutdown of the operating
system.
You will be prompted to confirm a reset operation.
The -script option instructs ILOM to proceed without prompting the confirmation
question.
The Sun SPARC Enterprise T5120/T5220 servers run OBP 4.x. The capabilities of the
Open Boot PROM in these servers have decreased as a lot of its functionality has been
moved to the Hypervisor layer. Its key functions are to allow you to boot the operating
system, modify system startup parameters, load and execute programs, and get help in
troubleshooting.
The Solaris OS comes pre-installed on the Sun SPARC Enterprise T5120/T5220 servers
on the disk in slot 0. The operating system is not configured, that is, the sys-unconfig
command was run after the OS was installed. When you boot the system for the first time
from the disk, you are prompted to configure it for your environment. At the ok prompt,
boot from the disk that contains the Solaris operating system. You might want to
configure an alias for this as your boot disk.
There are some instances when you might want to reinstall or upgrade the Solaris OS in
your server. The Sun SPARC Enterprise T5120/T5220 servers support a minimum OS
release of the Solaris 10 Update 4 OS. It has an architecture type of sun4v, to coincide
with the ILOM firmware running on the service processor. The v represents the
virtualization of the hardware to the Open Boot PROM and the Solaris OS, which is
performed by the Hypervisor layer. You can install or upgrade the operating system
through traditional JumpStart procedures or by using the DVD-ROM.
Next, we’re going to take a look at some standard Solaris OS commands to see what is
reflected differently in their output.
The first is the date command. The system TOD (time of day) is managed by the service
processor and is provided to the Solaris OS by the Hypervisor layer. You must set the
date on the service processor using the set /SP/clock/datetime command. If you set the
date using the date command within the Solaris OS, it holds until your next re-POST,
when the service processor will then pass its date over to the Solaris OS once again.
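A hedged example of setting the service processor clock is shown below; the exact
datetime format expected is firmware-specific, so the value is left as a placeholder
(check your ILOM documentation for the format):
-> set /SP/clock/datetime=<datetime>
After the next POST, the Solaris OS receives this date from the service processor
through the Hypervisor layer.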
The second command is prtdiag. The prtdiag command displays system configuration and
diagnostic information. You’ll notice in the prtdiag output that the Solaris OS sees the
UltraSPARC T2 processor as 64 CPUs. It is an 8-core processor with 8 threads running
per core. You'll also notice that the prtdiag output shows only the total memory size;
memory specifics must be obtained from the service processor.
The next command is psrinfo. The psrinfo command reflects the same changes from a
CPU perspective as the prtdiag command. You will see that the Solaris OS is presented as
64 processors.
And finally, we’ll take a look at the output to ifconfig -a. This output represents the four
gigabit Ethernet interfaces as ipge interfaces, ipge0 through ipge3.
Multipathing software lets you define and control redundant physical paths to I/O
devices, such as storage arrays and network interfaces. If the active path to a device
becomes unavailable, the software can automatically switch to an alternative path and
maintain its availability. This is known as automatic failover. To take advantage of
multipathing capabilities, you must first configure the server with redundant hardware –
for example, multiple network interfaces going to the same subnet or two controllers
attached to the same storage array – and then configure the software to make use of it.
For the Sun SPARC Enterprise T5120/T5220 servers, three types of multipathing
software are available.
The first is the Solaris OS IPMP (IP Multipathing) which provides multipathing and load-
balancing capabilities for IP network interfaces.
Next, we have Sun StorEdge Traffic Manager Software (STMS), which enables I/O
devices to be accessed through multiple host controller interfaces from a single instance
of the I/O device. This software is fully integrated into the Solaris operating system.
And finally, we have Veritas Volume Manager (VxVM), which includes Dynamic
Multipathing (DMP). This software provides disk multipathing as well as disk load
balancing to optimize I/O throughput.
Within the Sun SPARC Enterprise T5120/T5220 servers, the SAS controller supports
hardware mirroring and striping using the Solaris OS raidctl utility.
A hardware RAID volume created using the raidctl utility behaves differently than one
created using software RAID. When volumes are created using hardware RAID, only one
device appears in the device tree. Member disk devices are invisible to the operating
system and are accessed only by the SAS controller.
Executing the raidctl command tells you whether there are any RAID volumes found. To
create a RAID volume, execute the raidctl -c primary_drive secondary_drive command.
The secondary drive will disappear from the device tree.
To set up a striped volume, execute the raidctl -c -r 0 primary_drive secondary_drive
tertiary_drive (and so on) command. In this case, all but the primary drive disappear
from the Solaris OS device tree.
To delete the hardware RAID volume, execute the raidctl -d mirrored_volume command.
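As a reconstruction, a complete mirror lifecycle might look like this on the Solaris
console (the device names are illustrative):
# raidctl
No RAID volumes found.
# raidctl -c c0t0d0 c0t1d0
# raidctl -d c0t0d0
After the create step, the secondary drive c0t1d0 disappears from the device tree;
after the delete, both member disks are visible to the operating system again.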
Several tools and features are available on the Sun SPARC Enterprise T5120/T5220
servers to help administrators monitor, troubleshoot, and diagnose issues. In this module,
we will discuss the following tools:
Hardware diagnostic tools
Firmware tests and commands
ILOM commands and logs
Software diagnostic tools
You can monitor and analyze the server status using a combination of these tools.
Click as indicated on your screen to view a block diagram showing the diagnostic
components available on the Sun SPARC Enterprise T5120/T5220 servers.
LEDs are placed throughout the Sun SPARC Enterprise T5120/T5220 servers to help
pinpoint problem components in the server, as well as to give a visual indicator of the
overall server status.
LEDs are found on the following chassis locations and server components:
The front and rear panels
The system fan modules
The power supplies
The disk drives
The DVD drive
Service indicators are categorized as either system status LEDs or component status
LEDs.
On the Sun SPARC Enterprise T5120/T5220 servers' front panel, system status LEDs are
on the left, and component status LEDs are on the right.
Looking at the system status LEDs on the front panel of the server, starting on the top,
and moving down, the system status LEDs are the:
The white-colored Locator indicator
The amber-colored Service Required indicator
The green-colored Power OK (running) indicator
Click the indicator name on your screen for additional information about the function of
that indicator.
Locator Indicator
The Locator indicator enables you to find a particular system. The LED is activated using one of
the following methods:
Power OK Indicator
The Power OK indicator provides the following indications:
• Off – Indicates that the system is not running in its normal state. System power may be
on or in standby mode. The service processor might be running.
• Steady on – Indicates that the system is powered on and is running in its normal
operating state. No service actions are required.
• Fast blink – Indicates the system is running at a minimum level in standby and is ready to
be quickly returned to full function. The service processor is running.
• Slow blink – Indicates that a normal transitory activity is taking place. This could indicate
the system diagnostics are running, or that the system is booting.
Now that we have discussed the system status LEDs, let’s examine the component status
LEDs.
Each disk drive has the following status LEDs as shown on your screen:
Blue-colored OK to remove
Amber-colored Fault
Green-colored Activity
The DVD drive has an activity LED.
On the right front of the server, going top to bottom, the following LEDs exist:
Amber-colored Fan fault
Amber-colored Power supply fault
Amber-colored Over temperature
Click the indicator name on your screen for additional information about the function of
that indicator.
The system status LEDs on the front panel of the server are replicated on the rear of the
chassis, as shown on your screen. Starting on the left, and moving to the right, they are:
The white-colored Locator indicator
The amber-colored Service Required indicator
The green-colored Power OK (running) indicator
Each power supply has three LEDs. From top to bottom, they are:
The green-colored PSU OK indicator
The amber-colored PSU fault indicator
The green-colored AC power indicator
The gigabit Ethernet ports each have two LEDs to show their current status. They are the:
Green-colored Link/activity indicator, and the
Amber-colored Speed indicator
Click the indicator name on your screen for additional information about the function of
that indicator.
PSU OK Indicator
When this LED is lit, it is OK to remove the power supply.
Link/Activity Indicator
The Link/activity indicator provides the following indications:
Speed Indicator
The speed indicator provides the following indications:
The Sun SPARC Enterprise T5120/T5220 status indicators conform to the American
National Standards Institute (ANSI) Status Indicator Standard (SIS).
The table on your screen describes the SIS standard LED behaviors and their meanings.
You can also obtain the LED status information through the service processor.
To view LED status information in the ALOM CLI, use the showenvironment command.
In the ILOM CLI, use the show /SYS/<component>/<property> command.
Click as indicated on your screen to view a demonstration of these commands.
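As a sketch of what such a session might look like (the /SYS/SERVICE path is an assumption for illustration; the actual component paths come from the show /SYS tree on your server):

```shell
# ALOM CMT compatibility shell: environmental status, including LED states.
sc> showenvironment

# ILOM CLI: query an individual indicator; SERVICE is an example target.
-> show /SYS/SERVICE
```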
Each DIMM slot on the motherboard has an associated fault LED that identifies a faulty
FBDIMM diagnosed by POST or FMA.
When power is removed, the FBDIMM fault LEDs are lit by pressing a fault reminder
button located on the motherboard.
To identify faulty FBDIMMs, follow this procedure:
Step 1. Unplug all power cords.
Step 2. Press the FBDIMM fault button.
The FBDIMM fault button is located on the motherboard near the FBDIMMs as shown.
Step 3. Note the location of faulty FBDIMMs.
Faulty FBDIMMs are identified with a corresponding amber LED on the motherboard.
Step 4. Ensure that all FBDIMMs are seated correctly in their slots. If re-seating the
FBDIMM does not fix the problem, remove and replace the faulty FBDIMM.
Note: The DIMM fault LEDs can be lit for only a minute or so with the fault reminder
button.
Firmware diagnostic tests are executed on both the Sun SPARC Enterprise T5120/T5220
server host and on its service processor. The purpose of these tests is to verify the core
functionality of the service processor and the host.
The output displayed during the resetting of the service processor is an excellent source
of diagnostic information.
Click as indicated on your screen to view a demonstration of the power reset sequence of
the service processor.
When the service processor has finished booting and the ILOM firmware has loaded, you
can log in to the service processor and power on the host server. You can do this using
the poweron command in the ALOM CLI, executing start /SYS in the ILOM CLI, or by
pressing the power button on the front panel of the chassis.
When power is applied to the host server, the vBSC, which is responsible for calling
POST and collecting POST status on completion, is initialized.
Hosted on the service processor, vBSC can be thought of as an extension of Hypervisor.
The functionality that OBP once provided has been moved to the service processor. This
eliminates the need to tie OBP-specific settings to the entire server.
When the server powers on, the following actions take place:
vBSC is initialized from the service processor.
POST is called from vBSC to perform a sanity check of the server as well as testing of
the components based on the settings passed to the host server by vBSC.
The ASR database is updated based on diagnostic tests performed.
And Hypervisor is started for the host server.
While POST is called automatically, the output is not displayed by default. Likewise,
higher levels of POST testing are not called by default. The server must be configured for
these actions to happen.
The ILOM variables that affect POST are diag_trigger, diag_verbosity, diag_level, and
diag_mode.
You can modify them with the set command in ILOM or the setsc command in ALOM,
and view their current settings by executing the show command in ILOM or the showsc
command in ALOM.
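For illustration, viewing and changing one of these variables in each CLI might look like the following sketch. In the ILOM CLI, the target path shown (/HOST/diag) is an assumption that varies by firmware release, so check your firmware documentation for the exact location.

```shell
# ALOM CMT compatibility shell: view and set a POST variable.
sc> showsc diag_level
sc> setsc diag_level max

# ILOM CLI equivalent; the /HOST/diag target is an assumption and may
# differ by firmware release.
-> show /HOST/diag
-> set /HOST/diag level=max
```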
The service processor’s setkeyswitch command also affects the behavior of POST
execution. If the keyswitch is set to the DIAG position, then vBSC sets the diagnostic
levels to the interactive menus. If the keyswitch is set to the NORMAL position, then the
service processor variables are used to determine if and how POST is executed.
Click as indicated on your screen to view a table of these ILOM variables and their
possible settings.
POST Variables
The following table lists ILOM variables that affect POST.
OBP tests are executed once POST has completed and the Hypervisor has been loaded.
OBP tests are affected by the diag-switch? OBP variable.
The value of diag-switch? affects the verbosity of OBP. If diag-switch? is set to true, the
output from the OBP initialization and probing is sent to the console. If diag-switch? is
set to false, no initialization or probing output is displayed.
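You can inspect and change this variable either at the ok prompt or, from the running Solaris OS, with the eeprom utility, as in this sketch:

```shell
# At the OpenBoot ok prompt:
ok printenv diag-switch?
ok setenv diag-switch? true

# Or from the running Solaris OS, as root:
eeprom diag-switch?
eeprom diag-switch?=true
```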
In addition to normal POST and OBP output, there are several commands and interfaces
that you can use to analyze and test the system hardware to assist in troubleshooting.
In the remainder of this module, we will look at analyzing POST output, running POST
menus, analyzing OBP diagnostic output, analyzing POST error messages, managing the
ASR database, and viewing logs, console messages, and faults.
There are two levels of POST tests that are executed on the host server. These are the
integrity POST and, if configured, EPOST.
Integrity POST executes the following tests on the server each time the system is
powered on:
Register tests on window registers and the Niagara CPU scratchpad
Floating point unit access to check the path from all threads
L2 access to check the path from all threads
Quick FPGA check by the master thread, the first thread to jump to POST, to check the
integrity of the SRAM
All threads returned from testing
EPOST executes tests for the remainder of the server based on the diag_level setting.
When executed with a diag_level set to min, EPOST:
Initializes registers and global variables.
Runs the basic memory tests in which all memory cells are touched with unique patterns
and checked with hardware ECC.
Tests I2C operation and clock frequency for the master thread only.
Performs a basic test on the JBUS-to-PCI-E bridge for the master thread only.
When executed with a diag_level set to max, EPOST:
Performs all the steps of the minimum diagnostic mode.
Runs memory tests on any arrays not covered by BIST, such as L1 tags and the internal
register arrays.
Tests the functional operation of the L1 and L2 caches and instruction and data memory
management unit and interrupts for all of the hardware strands.
Performs an extended memory test in addition to hardware ECC checks.
Tests the functionality of the JBUS-to-PCI-E bridge for the master hardware thread,
specifically direct memory access and interrupt testing.
The POST interactive menu mode is called when two conditions are met. The first is that
the system’s virtual keyswitch is set to DIAG mode and the second is that the service
processor diag_mode variable is set to menu. You can verify that these conditions are met
from the output of the showkeyswitch and showsc commands.
The POST menu mode provides access to all tests available within POST, including:
Built-in self-tests (BIST)
Tests performed when diag_level is set to min
Tests performed when diag_level is set to max
When POST has completed, OBP initializes and works with Hypervisor to build the
device tree structure. Even though the Open Boot PROM does not have as much
functionality as it previously did, you can still gather some information from it to help
you troubleshoot.
From the output of the banner command, you can see how much memory has passed
POST.
From the output of show-devs, show-disks, and show-nets, you can see the device tree
that has been built, along with the disks and network interfaces that the Open Boot
PROM sees.
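For example, from the ok prompt, banner reports the model, OBP version, and the amount of memory that passed POST, and the remaining commands display the device tree, disks, and network interfaces:

```shell
ok banner
ok show-devs
ok show-disks
ok show-nets
```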
POST displays a great deal of information on errors and warnings to assist you in
troubleshooting the cause of a problem. The level of information given is affected by the
value of the service processor diag_verbosity variable. Errors follow the standard format
that identifies:
The test that was executing
The hardware that was being tested at the time of the failure
The suggested repair instructions for which component to replace
The error message generated by the fault
Automatic System Recovery, or ASR, lets you manually manage blacklisted components
from the service processor, and also lets the service processor manage blacklisted
components automatically through the vBSC.
An ASR database is maintained with a list of any blacklisted items. When POST
completes, it returns a status of the components tested to the vBSC. If any of the
components are reported back as failed, the service processor attempts to unconfigure
them and map them out of the server. An event is then logged to the service processor
and to the console regarding this issue.
ILOM provides commands to manage the ASR database. The command that you use is
CLI dependent.
In the ILOM CLI, you can use the show /SYS command to query the component_state of
a specific component. You use the set command to disable and enable a specific
component.
You also use the set /SYS command to clear the ASR database for a specific component.
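As a sketch (the /SYS/MB/CMP0/P0 path is an example component, not taken from this course):

```shell
# Query the current state of a component.
-> show /SYS/MB/CMP0/P0 component_state

# Disable (blacklist) the component, then re-enable it, which also
# clears its entry from the ASR database.
-> set /SYS/MB/CMP0/P0 component_state=disabled
-> set /SYS/MB/CMP0/P0 component_state=enabled
```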
ASR Components
The following table lists the ASR components for the Sun SPARC Enterprise T5120/T5220
servers.
Both the ILOM and ALOM CLIs provide commands for viewing console messages,
system messages, and system faults. The types of messages include errors, faults, notices,
and general system information.
In the ILOM CLI, execute the show /SP/logs/event/list command to display messages,
notices, and events that have been sent to the service processor's event log.
The equivalent ALOM CLI command is the showlogs command.
The ALOM showfaults command displays current valid system faults.
The ALOM CLI also provides a command called consolehistory that shows the contents
of the boot log and the run log. The boot log contains the messages from POST, OBP,
and the booting of the Solaris OS. The run log contains everything in the boot log plus
the Solaris runtime messages.
The ILOM CLI does not provide the console history functionality.
Click the links on your screen to view a demonstration of these commands.
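In summary, the commands discussed above can be sketched as:

```shell
# ILOM CLI: display the service processor event log.
-> show /SP/logs/event/list

# ALOM CMT compatibility shell equivalents:
sc> showlogs              # service processor event log
sc> showfaults            # current valid system faults
sc> consolehistory boot   # POST, OBP, and Solaris boot messages
sc> consolehistory run    # boot log plus Solaris runtime messages
```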
Several tools and features are available on the Sun SPARC Enterprise T5120/T5220
servers to help administrators monitor, troubleshoot, and diagnose issues. In this section
we will discuss the following tools:
Software-based commands
Applications
Log files
You can monitor and analyze the server status using a combination of these tools.
Click as indicated on your screen to view a block diagram showing the diagnostic
components on both the host server and the service processor.
Diagnostic Components
The following illustration shows a block diagram depicting the diagnostic components of both the
host and the service processor.
The server host provides the majority of the diagnostic functions, including:
• Hardware tools
• POST, Hypervisor, and OBP
• Solaris OS fault management agents, kernel components, drivers, commands, and log
files
• Solaris-based applications and other management tools that talk to the Solaris OS
The Sun SPARC Enterprise T5120/T5220 servers implement the fault management
architecture (FMA) introduced in the Solaris 10 OS. Incorporated into both the hardware
and software of the Sun SPARC Enterprise T5120/T5220 servers, the FMA helps the
server achieve greater uptime by:
Automatically and silently diagnosing underlying problems
Using predictive self-healing
Disabling faulty components if necessary and possible
Issuing alerts on problems and logging events
Providing data to higher-level management services and, in the future, remote services
Be sure to keep your server fully patched so that it has the latest FMA agents.
There are several Solaris OS commands associated with FMA. These include:
The fmadm command, which lets you view, load, and unload modules. It also lets you
view and update the resource cache, which is a list of faulty resources as seen by the fault
management daemon, fmd.
The fmdump command, which enables system administrators to view any log files
associated with fmd and retrieve specific details of any diagnosis issued. By default, the
fmdump command lists the fault log, displaying the time, the ID associated with that
fault, and the message ID, SUNW-MSG-ID, that can be viewed on Sun’s message lookup
website.
The fmstat command, which reports the statistics of the fault management system. By
default, the fmstat command lists the active modules and the statistics associated with the
modules.
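A typical first look at the fault management system might use these commands; the option choices are common examples, not an exhaustive list.

```shell
# List the loaded fault management modules and any faulty resources
# in the resource cache.
fmadm config
fmadm faulty

# Display the fault log; -V adds full detail, including the SUNW-MSG-ID.
fmdump
fmdump -V

# Report per-module statistics for the fault manager.
fmstat
```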
Intermittent problems can often be difficult to diagnose. Diagnostic tools exercise the
components to the point where they display an emerging failing condition; these tools are
designed to stress the components to the point of failure. On the Sun SPARC Enterprise
T5120/T5220 servers, the Sun Validation Test Suite (SunVTS) is used to exercise the
server, as well as for hardware validation and repair verification.
The minimum version of SunVTS that supports the Sun SPARC Enterprise T5120/T5220
servers is SunVTS 6.4, which ships with the Solaris 10 Update 4 OS.
The following Sun SPARC Enterprise T5120/T5220 components can be diagnosed
through SunVTS:
CPU
FBDIMMs
I/O
Gigabit Network Ports
SAS Disks, Controller and Cables
DVD Device
Host-to-Service Processor Interface
SunVTS was modified for the Sun SPARC Enterprise T5120/T5220 as follows:
CPU/memory tests were updated to test the UltraSPARC T2 specific features.
cryptotest was enhanced to test the cryptographic unit on the UltraSPARC T2.
A new SunVTS test developed for the UltraSPARC T2 is Xnetlbtest, which provides
testing coverage for the two 10 Gigabit ports on the network interface unit of the
UltraSPARC T2 processor.
Additional complex testing in nettest and netlbtest includes the following features:
Spawns continuous Tx/Rx asynchronously to force the driver to exercise different DMA
channels
Provides classification, IP fragmentation, and variable length packets to cover jumbo
frame test
Supports back-to-back (port-to-port) loopback tests
Transmit rate can be varied by using delay between sends
Soft error threshold allows limit of packet drop in pass/fail criteria
Provides options of different payload data patterns
You can collect Sun SPARC Enterprise T5120/T5220 server status and configuration
information from several sources within the Solaris 10 OS. You can find system status,
such as error and informational messages, in the log files. You can also obtain system
status by executing specific Solaris OS commands.
The following sections describe the utilities that you can use to collect status and
configuration information on the Sun SPARC Enterprise T5120/T5220 servers.
The main system log for the Solaris 10 OS is the messages file located in the /var/adm
directory. Here, you can locate system status and error and informational messages by
filtering this file. This file can grow to be large, so it is important to select key values to
filter on, for example, cpu, mem, error, and so on.
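As a sketch of this kind of filtering (the sample file and keyword list are illustrative assumptions; on a live server you would run the same egrep against /var/adm/messages):

```shell
# A hypothetical excerpt of /var/adm/messages, used here so the filter
# can be demonstrated anywhere; on a live server, grep the real file.
cat > /tmp/messages.sample <<'EOF'
Jan 10 12:00:01 host genunix: [ID 123456 kern.notice] boot complete
Jan 10 12:05:42 host SUNW,UltraSPARC-T2: [ID 654321 kern.warning] WARNING: memory error detected
EOF

# Filter the log for likely hardware-related entries
# (the keyword list is only an example).
egrep -i 'error|warn|fault|cpu|mem' /tmp/messages.sample
```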
You can also obtain system status and configuration data through the use of the Solaris
OS utilities, such as:
prtdiag, which lists the available CPUs, the I/O configuration, and the PROM and ASIC
versions
iostat, which displays information on each I/O device, including I/O errors
prtconf, which displays the system device drivers
prtpicl, which displays platform-specific information stored in the platform information
and control library (PICL)
psrinfo, which displays which CPUs are available and their status
raidctl, which reports whether any RAID volumes are configured and, if so, what their
members are
and the Sun Explorer Data Collector
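Typical invocations of these utilities might look like the following sketch; the flags shown are common examples rather than required options.

```shell
prtdiag -v      # CPUs, I/O configuration, PROM and ASIC versions (verbose)
psrinfo -v      # status of each virtual processor
iostat -En      # per-device error statistics
prtconf         # system configuration and device drivers
prtpicl         # platform information from the PICL
raidctl         # configured RAID volumes and their members, if any
```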
The Sun Explorer Data Collector is a utility, made up of shell scripts and some binaries,
that automates the collection of system configuration data from Sun Solaris servers. It
collects a summary of installed software, firmware, and storage subsystem components
and saves it in a compressed tar format. This tool is accessible from SunSolve, located at
http://sunsolve.sun.com.
This web site provides more information on the utility, any patches that are needed for
the Sun SPARC Enterprise T5120/T5220 servers, and the link to the software download
website.
Another source of system status and configuration information is a system core dump. By
default, the Solaris 10 OS is configured to save a core dump when a system panic occurs.
To verify that core dumps are enabled, run the dumpadm command. If core dumps are
enabled, you also need to verify that you have enough swap space and file system space
for the core dump to be stored. You can do this with the swap -l and df -k commands,
respectively.
To test the savecore utility, perform a graceful shutdown of the Solaris OS. From the
OBP prompt, execute sync followed by reset-all. Watch for the savecore messages during
boot, and then verify that the savecore files are in the savecore directory.
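The verification steps above can be sketched as follows. The /var/crash path is the typical default savecore directory and is an assumption here; dumpadm reports the actual path configured on your server.

```shell
# Verify that crash dumps are enabled and note the savecore directory.
dumpadm

# Confirm there is enough swap space to hold the dump ...
swap -l

# ... and enough file system space in the savecore directory.
df -k /var/crash
```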