Building Data Center Operations Infrastructure (5): A Roundup of Server Provisioning Problems (DELL)

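1. Failed to obtain the management controller IP

The management controller (iDRAC) did not get an IP address via DHCP. Check the network cable and the controller configuration.

For the correct management controller configuration, see this link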

 

2. NO DHCP OR proxyDHCP 

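The internal NIC cannot get an IP address via DHCP. Check the DHCP service. The usual causes are:
1) The new machine's internal subnet has not been added to the DHCP configuration
2) The internal network cable is not plugged in properly
3) DHCP relay is not configured on the switch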
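
The first cause in the list above can be ruled out before a machine is even powered on. Here is a minimal Python sketch, assuming the subnets declared in the DHCP configuration have been copied into a list by hand. The function name `subnet_covering` and the example subnets are illustrative and do not come from any real setup.

```python
import ipaddress

def subnet_covering(ip, dhcp_subnets):
    """Return the first configured subnet (CIDR string) that contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for cidr in dhcp_subnets:
        if addr in ipaddress.ip_network(cidr):
            return cidr
    return None

# Hypothetical subnets, copied by hand from the DHCP server configuration.
configured = ["10.2.8.0/24", "10.2.9.0/24"]

for planned_ip in ("10.2.8.195", "10.3.1.20"):
    match = subnet_covering(planned_ip, configured)
    if match is None:
        print(f"{planned_ip}: no DHCP subnet covers this address")
    else:
        print(f"{planned_ip}: covered by {match}")
```

If the planned address is not covered, the subnet needs to be added to the DHCP configuration; the cable and DHCP relay checks still have to be done by hand.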

 

3. Media test failure

The network cable is not plugged in properly. Check the cabling.

 

4. Cannot retrieve NIC information

/admin1-> racadm get NIC.NICConfig
ERROR: SWC0244 : Invalid Fully Qualified Device Descriptor (FQDD).

-------------------------------------------------------------------------------
Valid Options:

System.Power
System.Power.Supply
iDRAC.IMC
LifecycleController.LCAttributes
System.LCD
iDRAC.SNMP.Alert
System.Location

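My guess is that this is caused by outdated firmware; to be confirmed.

DELL's reply:

The machine showing the problem, 7SK45W1, is from last year, so the cause may be an older iDRAC firmware version.
The BIOS.BiosBootSettings.BootSeq parameter is what we see on version 1.50.50.

1. The command-line format may differ in earlier versions. You can type BIOS and press Tab to see what auto-completion offers.

2. Please check whether you can log in to the iDRAC web interface normally. If you cannot, restart the iDRAC and try again.
There are two ways to restart and reinitialize the iDRAC:
(a. Press and hold the i button on the front panel for 20 seconds; the iDRAC will restart and reinitialize.  b. Power off, unplug the power cord, hold the power button for 15 seconds, then plug the power cord back in and power on.)

3. You can also update to a newer iDRAC firmware and try again. If the versions are far apart, flash to 1.40.40 first and then update to 1.56.55.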
http://ftp.dell.com/FOLDER02069189M/1/ESM_Firmware_V554G_WN32_1.56.55_A00.EXE
http://ftp.dell.com/FOLDER01526113M/1/ESM_Firmware_F5F8N_WN32_1.40.40_A00.EXE

(Log in to the iDRAC web interface, go to Update and Rollback, and upload the update file directly.)
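
The two-step upgrade advice in DELL's reply can be written down as a small planning helper. This is only a sketch: DELL's reply does not define how far apart "far apart" is, so the code assumes it means "older than 1.40.40". That threshold and the function names are my own assumptions.

```python
INTERMEDIATE = "1.40.40"  # stepping-stone release named in DELL's reply
TARGET = "1.56.55"        # final release named in DELL's reply

def parse_version(text):
    """Turn '1.40.40' into (1, 40, 40) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def upgrade_path(current):
    """Return the list of firmware versions to flash, in order.

    Assumption: "far apart" means the current version is older than
    the intermediate release; the original reply does not say so.
    """
    cur = parse_version(current)
    if cur >= parse_version(TARGET):
        return []
    if cur < parse_version(INTERMEDIATE):
        return [INTERMEDIATE, TARGET]
    return [TARGET]

print(upgrade_path("1.30.30"))
print(upgrade_path("1.50.50"))
```

Comparing tuples of integers avoids the trap of comparing version strings character by character, where "1.9.0" would sort after "1.40.40".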

 

5. Cannot set the boot order

First error:
/admin1-> racadm set BIOS.BiosBootSettings.BootSeq HardDisk.List.1-1,NIC.Integrated.1-2-1

ERROR: BOOT018: Specified boot control list is read-only.
Verify the dependencies of the objects under the specified group using
"racadm help <device class>.<groupname>", and retry the operation.

The disks may not have been configured into a RAID. Try rebuilding the RAID.

– – – – – – – – – – – – – – – – – – – – – – – –

Second error:
/admin1-> racadm set BIOS.BiosBootSettings.BootSeq HardDisk.List.1-1,NIC.Integrated.1-2-1

ERROR: BOOT016: Input source argument value for the boot device is incorrect or
     not found among the boot devices on the system.

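For this error, I went into the boot settings in the BIOS and found that the boot order listed only the first NIC; the second NIC was missing.

The cause of this problem:

It turned out that the first NIC was set to PXE while the second NIC was set to NONE, and that alone is enough to break the command. Annoyingly, DELL had not done this initialization before shipping!

Update, Wed May 14 18:57:49 CST 2014:

Today I ran into another cause of this problem: the disks were all bare drives with no RAID configured, so setting the boot order failed.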

– – – – – – – – – – – – – – – – – – – – – – – –

Third error:
/admin1-> racadm set BIOS.BiosBootSettings.BootSeq HardDisk.List.1-1,NIC.Integrated.1-2-1
[Key=BIOS.Setup.1-1#BiosBootSettings]
RAC1017: Successfully modified the object value and the change is in
pending state.
To apply modified value, create a configuration job and reboot
the system. To create the commit and reboot jobs, use "jobqueue"
command. For more information about the "jobqueue" command, see RACADM
help.
/admin1->
/admin1-> racadm jobqueue delete --all
/admin1-> racadm jobqueue create BIOS.Setup.1-1 -r pwrcycle -s TIME_NOW
ERROR: SUP002: Job creation failure. Retry the action. If this fails, reboot the iDRAC.

To restart the iDRAC:
/admin1-> racadm racreset
RAC reset operation initiated successfully. It may take a few
minutes for the RAC to come online again.

Also, to reset the iDRAC configuration to its defaults, use racadm racresetcfg

 

6. Errors when enabling PXE boot on a NIC

First error:

/admin1-> racadm set NIC.NICConfig.2.LegacyBootProto PXE
[Key=NIC.Slot.1-2-1#NICConfig]
RAC1017: Successfully modified the object value and the change is in
pending state.
To apply modified value, create a configuration job and reboot
the system. To create the commit and reboot jobs, use "jobqueue"
command. For more information about the "jobqueue" command, see RACADM
help.
/admin1-> racadm jobqueue create NIC.Embedded.2-1-1 -r pwrcycle -s TIME_NOW
ERROR: RAC944: Unable to create the configuration job. Run "racadm set LifecycleController.LCAttributes.LifecycleControllerState 1" to enable Lifecycle Controller, and retry the operation.

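As the message says, run the command quoted in the error to enable Lifecycle Controller, and then the job can be created.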

/admin1-> racadm set LifecycleController.LCAttributes.LifecycleControllerState 1
Object value modified successfully

/admin1-> racadm jobqueue create NIC.Embedded.2-1-1 -r pwrcycle -s TIME_NOW
RAC1024: Successfully scheduled a job.
Verify the job status using "racadm jobqueue view -i JID_xxxxx" command.
Commit JID = JID_057986530557
Reboot JID = RID_057986531087
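
When these steps are scripted, the job IDs printed above are needed for the follow-up `racadm jobqueue view -i ...` check. Here is a minimal Python sketch that pulls them out of the captured output. It assumes the output format is exactly as in the transcript above; the function name `parse_job_ids` is illustrative.

```python
import re

# Matches lines such as "Commit JID = JID_057986530557".
JOB_LINE = re.compile(r"^(Commit|Reboot) JID = (\S+)", re.MULTILINE)

def parse_job_ids(output):
    """Return a dict like {'commit': 'JID_...', 'reboot': 'RID_...'}."""
    return {kind.lower(): job_id for kind, job_id in JOB_LINE.findall(output)}

sample = """RAC1024: Successfully scheduled a job.
Verify the job status using "racadm jobqueue view -i JID_xxxxx" command.
Commit JID = JID_057986530557
Reboot JID = RID_057986531087
"""

ids = parse_job_ids(sample)
print("racadm jobqueue view -i " + ids["commit"])
```

An empty dict means no job was scheduled, which is the cue to look at the error line instead.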

Second error:
/admin1-> racadm set NIC.NICConfig.2.LegacyBootProto PXE
[Key=NIC.Slot.1-2-1#NICConfig]
RAC1017: Successfully modified the object value and the change is in
pending state.
To apply modified value, create a configuration job and reboot
the system. To create the commit and reboot jobs, use "jobqueue"
command. For more information about the "jobqueue" command, see RACADM
help.
/admin1-> racadm jobqueue create NIC.Embedded.2-1-1 -r pwrcycle -s TIME_NOW
ERROR: SWC0244 : Invalid Fully Qualified Device Descriptor (FQDD).

I do not know the cause of this error either, but after a manual reboot the machine reinitializes itself and the problem goes away.
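
The racadm errors collected in sections 4 to 6 all start with an error code, so a provisioning script can look the code up and print the fix that worked here. The following is a sketch only: the remedies are the ones described in this post, not an official DELL mapping, and the names `error_code` and `suggest` are made up for illustration.

```python
import re

# Matches "ERROR: BOOT018: ..." as well as "ERROR: SWC0244 : ...".
ERROR_LINE = re.compile(r"^ERROR:\s*([A-Z]+\d+)\s*:", re.MULTILINE)

# Remedies taken from the sections above; not an exhaustive or official list.
REMEDIES = {
    "SWC0244": "Possibly old iDRAC firmware; on jobqueue create, a manual reboot cleared it.",
    "BOOT018": "Disks may have no RAID configured; try rebuilding the RAID.",
    "BOOT016": "Check that the NIC is set to PXE (not NONE) and that the disks have a RAID.",
    "SUP002": "Reset the iDRAC with 'racadm racreset', then retry.",
    "RAC944": "Run 'racadm set LifecycleController.LCAttributes.LifecycleControllerState 1'.",
}

def error_code(output):
    """Return the first racadm error code found in output, or None."""
    match = ERROR_LINE.search(output)
    return match.group(1) if match else None

def suggest(output):
    """Return the remedy noted in this post for the error in output."""
    return REMEDIES.get(error_code(output), "Not covered in this post.")

print(suggest("ERROR: SUP002: Job creation failure. Retry the action."))
```

Errors without a code, such as the power action timeout in section 7, fall through to the default message.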

 

7. Power cycle fails

/admin1-> racadm serveraction powercycle
ERROR: Timeout while waiting for server to perform requested power action.

The web interface showed that the machine could not start. Restarting the iDRAC fixed it.

/admin1-> racadm racreset
RAC reset operation initiated successfully. It may take a few
minutes for the RAC to come online again.

/admin1-> racadm serveraction powercycle
Server power operation successful

Note: to check the power status, use racadm serveraction powerstatus
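
The recovery sequence above (power cycle, reset the iDRAC on timeout, power cycle again) is easy to automate. Below is a minimal Python sketch, assuming a hypothetical `run` callable that sends one racadm command to the iDRAC and returns its text output. That callable is not part of any library, and it has to reconnect by itself because the session drops when the iDRAC resets. The 180-second wait is also an assumption; the iDRAC only says it may take "a few minutes".

```python
import time

TIMEOUT_MARKER = "Timeout while waiting for server"

def powercycle_with_reset(run, wait_seconds=180, sleep=time.sleep):
    """Power cycle the server, resetting the iDRAC once if the action times out.

    run: hypothetical callable taking a racadm command string and
         returning its output as text.
    """
    output = run("racadm serveraction powercycle")
    if TIMEOUT_MARKER not in output:
        return output
    # Same recovery as in the transcript above: reset the iDRAC, wait, retry.
    run("racadm racreset")
    sleep(wait_seconds)  # assumed wait for the iDRAC to come back online
    return run("racadm serveraction powercycle")
```

Passing `sleep` as a parameter keeps the function easy to try out with a fake `run` and no real waiting.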

 

8. Connecting to the ILO (iDRAC) IP reports No more sessions are available for this type of connection!

For example:

$ ssh root@10.2.8.195
root@10.2.8.195's password:

No more sessions are available for this type of connection!

Connection to 10.2.8.195 closed.

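Accessing it from a browser reports the same error.

You can use ipmitool to cold-reset the management controller.
# ipmitool -I lanplus -H IP -U USERNAME -P PASSWORD mc reset cold

On DELL machines, IPMI over LAN must be enabled first, otherwise ipmitool reports Error: Unable to establish IPMI v2 / RMCP+ session.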

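(On the machine itself, you can simply run ipmitool mc reset cold. In actual testing this command does not reboot the operating system; it only restarts the management controller.)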
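
For scripting, the remote and local forms of the cold reset can be built by one helper. Here is a small Python sketch using only the ipmitool arguments shown above. The function name `cold_reset_command` and the sample credentials are placeholders, not real values.

```python
import shlex

def cold_reset_command(host=None, user=None, password=None):
    """Build the ipmitool argv for a cold reset of the management controller.

    With no host, the command targets the local controller (in-band);
    with a host, it goes over the network via the lanplus interface.
    """
    if host is None:
        return ["ipmitool", "mc", "reset", "cold"]
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-P", password, "mc", "reset", "cold"]

# Placeholder credentials, for illustration only.
cmd = cold_reset_command("10.2.8.195", "root", "PASSWORD")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a host that has ipmitool installed. Building the command as a list rather than a string avoids quoting problems when a password contains spaces or shell metacharacters.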

This session-limit problem can be fixed by upgrading the iDRAC firmware.
Download link for firmware version 1.57.57

The bug fix list for this release includes the following item:
– Fix for issues that cause iDRAC7 sluggish responsiveness after a prolonged period of time (approx. 45-100 days, depending on the usage). In some cases, if the iDRAC is not reset, the iDRAC may become unresponsive and requires a server AC Power on reset. This issue was introduced in firmware release 1.50.50 and fixed in 1.56.55.

 

How the provisioning system works is described here

 
