SUN280R CPU issue on 02/02/2008: (smikame@ipv4sec.com)
-------------------------------------------------------------------
paLevel 14 Interrupt
ok
nic[cpu0]/thread=2a100051d20: mutex_enter: adaptive at high PIL, lp=300001f0000 owner=2a100097d20 thread=2a100051d20
Fast Data Access MMU Miss
ok
ok reset-all
Fast Data Access MMU Miss
ok boot
Resetting ...
Corrected ECC Error
ok boot
FATAL: OpenBoot initialization sequence prematurely terminated.
FATAL: system is not bootable, boot command is disabled
-------------------------------------------------------------------

Before it died, this 280R box logged the following errors, which indicate a memory error on module J0304. It looks like data bit 117 is failing on J0304. Because the module is interleaved with the others, the cause could also be a problem with data bit 117 on the system bus, or something else.

Nov 20 07:49:38 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.79a2fbf0
Nov 20 07:49:38 venus Fault_PC 0x1000721c Esynd 0x01e8 J0304
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 288824 kern.info] [AFT0] errID 0x005bed2a.28e93798 Corrected Memory Error on J0304 is Intermittent
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 485157 kern.info] NOTICE: [AFT0] First Error Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005bed2a.28e93798
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 238593 kern.info] [AFT0] errID 0x005bed2a.28e93798 Data Bit 117 was in error and corrected
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 528784 kern.info] [AFT2] errID 0x005bed2a.28e93798 PA=0x00000000.79a2fbc0
Nov 20 07:49:38 venus E$tag 0x00000001.e6492492 E$state_7 Exclusive
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x00000000.00000000 0x00000000.00000000 ECC 0x000
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0x00000310.0162fbc0 0x00000310.0162fbc0 ECC 0x027
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00000310.01effbc0 0x00000310.016a7bc0 ECC 0x0a5
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0xffffffff.ffffffff 0x00000000.00000000 ECC 0x0ed
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 929717 kern.info] [AFT2] D$ data not available
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 335345 kern.info] [AFT2] I$ data not available
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 161744 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005bed2a.28e99940
Nov 20 07:49:38 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.79a2fbf0
Nov 20 07:49:38 venus Fault_PC 0x1009a338 Esynd 0x01e8 J0304
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 496037 kern.info] [AFT0] errID 0x005bed2a.28e99940 Corrected Memory Error on J0304 is Intermittent
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 297644 kern.info] [AFT0] errID 0x005bed2a.28e99940 Data Bit 117 was in error and corrected
Nov 20 07:49:44 venus SUNW,UltraSPARC-III+: [ID 205589 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005bed2b.8e8e9d58
Nov 20 07:49:44 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.67e2fd60
Nov 20 07:49:44 venus Fault_PC Esynd 0x01e8 J0304
Nov 20 07:49:44 venus SUNW,UltraSPARC-III+: [ID 788786 kern.info] [AFT0] errID 0x005bed2b.8e8e9d58 Corrected Memory Error on J0304 is Intermittent
Nov 20 07:49:44 venus SUNW,UltraSPARC-III+: [ID 598106 kern.info] [AFT0] errID 0x005bed2b.8e8e9d58 Data Bit 117 was in error and corrected
Nov 20 07:50:32 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.5410a020
Nov 20 07:50:32 venus Fault_PC 0x7816599c Esynd 0x01e8 J0304
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 781803 kern.info] [AFT0] errID 0x005bed36.f3c99c30 Corrected Memory Error on J0304 is Intermittent
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 330156 kern.info] [AFT0] errID 0x005bed36.f3c99c30 Data Bit 117 was in error and corrected
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 984193 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005bed36.f3c99c30
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 356798 kern.info] [AFT2] errID 0x005bed36.f3c99c30 PA=0x00000000.5410a000
Nov 20 07:50:32 venus E$tag 0x00000001.50000002 E$state_0 Exclusive
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0xdeadbeef.00000000 0x00000300.08624000 ECC 0x153
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0x00000000.00000000 0x000c0000.002a0000 ECC 0x098
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x686d616c.00000000 0x03f76002.00000000 ECC 0x091
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0x0000003c.00000001 0x00000001.00000000 ECC 0x024
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 929717 kern.info] [AFT2] D$ data not available
Nov 20 07:50:32 venus SUNW,UltraSPARC-III+: [ID 335345 kern.info] [AFT2] I$ data not available
Nov 20 07:50:34 venus pcisch: [ID 285080 kern.info] NOTICE: correctable error detected by pci0 (safari id 8) during
Nov 20 07:50:34 venus DVMA read transaction
Nov 20 07:50:34 venus pcisch: [ID 475334 kern.info] Transaction was a block operation.
Nov 20 07:50:34 venus pcisch: [ID 956438 kern.info] dvma access, Memory safari command, address 00000000.190e76e0, owned_in not asserted.
Nov 20 07:50:34 venus pcisch: [ID 863403 kern.info] AFSR=40000000.880001e8 AFAR=00000000.190e76e0,
Nov 20 07:50:34 venus quad word offset 00000000.00000002, Memory Module port id 8.
Nov 20 07:50:34 venus pcisch: [ID 916270 kern.info] syndrome bits 1e8
Nov 20 07:50:34 venus pcisch: [ID 545677 kern.info] mtag 0, mtag ecc syndrome 0
Nov 20 07:50:38 venus SUNW,UltraSPARC-III+: [ID 861323 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005bed38.592232a8
Nov 20 07:50:38 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.6950a1a0
Nov 20 07:50:38 venus Fault_PC Esynd 0x01e8 J0304
Nov 20 07:50:38 venus SUNW,UltraSPARC-III+: [ID 603687 kern.info] [AFT0] errID 0x005bed38.592232a8 Corrected Memory Error on J0304 is Persistent
Nov 20 07:50:38 venus SUNW,UltraSPARC-III+: [ID 790336 kern.info] [AFT0] errID 0x005bed38.592232a8 Data Bit 117 was in error and corrected
Nov 20 07:50:55 venus pcisch: [ID 285080 kern.info] NOTICE: correctable error detected by pci0 (safari id 8) during
Nov 20 07:50:55 venus DVMA read transaction
Nov 20 07:50:55 venus pcisch: [ID 475334 kern.info] Transaction was a block operation.
Nov 20 07:50:55 venus pcisch: [ID 956438 kern.info] dvma access, Memory safari command, address 00000000.4db2f9d0, owned_in not asserted.
Nov 20 07:50:55 venus pcisch: [ID 863403 kern.info] AFSR=40000000.480001e8 AFAR=00000000.4db2f9d0,
Nov 20 07:50:55 venus quad word offset 00000000.00000001, Memory Module port id 8.
Nov 20 07:50:55 venus pcisch: [ID 916270 kern.info] syndrome bits 1e8
Nov 20 07:50:55 venus pcisch: [ID 545677 kern.info] mtag 0, mtag ecc syndrome 0
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 263006 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005bed3f.d01d7578
Nov 20 07:51:11 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.79c3ce90
Nov 20 07:51:11 venus Fault_PC 0x10158aa4 Esynd 0x01e8 J0304
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 459810 kern.info] [AFT0] errID 0x005bed3f.d01d7578 Corrected Memory Error on J0304 is Intermittent
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 307376 kern.info] [AFT0] errID 0x005bed3f.d01d7578 Data Bit 117 was in error and corrected
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 870756 kern.info] [AFT2] errID 0x005bed3f.d01d7578 PA=0x00000000.79c3ce80
Nov 20 07:51:11 venus E$tag 0x00000001.e7012492 E$state_2 Exclusive
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x00000310.01c8ce60 0x00000310.028bce60 ECC 0x12d
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0xffffffff.ffffffff 0x00000000.00000000 ECC 0x0ed
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00000000.00000080 0x00000000.00000000 ECC 0x03e
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0x0000b4d1.00020000 0x00000000.00000000 ECC 0x04c
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 929717 kern.info] [AFT2] D$ data not available
Nov 20 07:51:11 venus SUNW,UltraSPARC-III+: [ID 335345 kern.info] [AFT2] I$ data not available
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 217001 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005bed41.db07c928
Nov 20 07:51:19 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.3f3f8be0
Nov 20 07:51:19 venus Fault_PC 0x10156f50 Esynd 0x01e8 J0304
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 112349 kern.info] [AFT0] errID 0x005bed41.db07c928 Corrected Memory Error on J0304 is Intermittent
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 601130 kern.info] [AFT0] errID 0x005bed41.db07c928 Data Bit 117 was in error and corrected
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 175834 kern.info] [AFT2] errID 0x005bed41.db07c928 PA=0x00000000.3f3f8bc0
Nov 20 07:51:19 venus E$tag 0x00000000.fc004010 E$state_7 Invalid
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x000007b6.00000002 0xffffffff.ffffffff ECC 0x1db
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0xffffffff.00000000 0x02008000.00000000 ECC 0x0c9
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00000000.00000000 0x00000000.00000000 ECC 0x000
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0x00000000.57d81002 0x00000000.00000000 ECC 0x17d
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 929717 kern.info] [AFT2] D$ data not available
Nov 20 07:51:19 venus SUNW,UltraSPARC-III+: [ID 335345 kern.info] [AFT2] I$ data not available
Nov 20 07:51:19 venus pcisch: [ID 285080 kern.info] NOTICE: correctable error detected by pci0 (safari id 8) during
Nov 20 07:51:19 venus DVMA read transaction
Nov 20 07:51:19 venus pcisch: [ID 475334 kern.info] Transaction was a block operation.
Nov 20 07:51:19 venus pcisch: [ID 956438 kern.info] dvma access, Memory safari command, address 00000000.511d8c90, owned_in not asserted.
Nov 20 07:51:19 venus pcisch: [ID 863403 kern.info] AFSR=40000000.480001e8 AFAR=00000000.511d8c90,
Nov 20 07:51:19 venus quad word offset 00000000.00000001, Memory Module port id 8.
Nov 20 07:51:19 venus pcisch: [ID 916270 kern.info] syndrome bits 1e8
Nov 20 07:51:19 venus pcisch: [ID 545677 kern.info] mtag 0, mtag ecc syndrome 0
Nov 20 07:51:25 venus pcisch: [ID 285080 kern.info] NOTICE: correctable error detected by pci0 (safari id 8) during
Nov 20 07:51:25 venus DVMA read transaction
Nov 20 07:51:25 venus pcisch: [ID 475334 kern.info] Transaction was a block operation.
Nov 20 07:51:25 venus pcisch: [ID 956438 kern.info] dvma access, Memory safari command, address 00000000.500e05f0, owned_in not asserted.
Nov 20 07:51:25 venus pcisch: [ID 863403 kern.info] AFSR=40000000.c80001e8 AFAR=00000000.500e05f0,
Nov 20 07:51:25 venus quad word offset 00000000.00000003, Memory Module port id 8.
Nov 20 07:51:25 venus pcisch: [ID 916270 kern.info] syndrome bits 1e8
Nov 20 07:51:25 venus pcisch: [ID 545677 kern.info] mtag 0, mtag ecc syndrome 0
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 770681 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005bed44.8b767208
Nov 20 07:51:31 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.501df1b0
Nov 20 07:51:31 venus Fault_PC 0x10157400 Esynd 0x01e8 J0304
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 845261 kern.info] [AFT0] errID 0x005bed44.8b767208 Corrected Memory Error on J0304 is Intermittent
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 698841 kern.info] [AFT0] errID 0x005bed44.8b767208 Data Bit 117 was in error and corrected
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 730112 kern.info] [AFT2] errID 0x005bed44.8b767208 E$tag PA=0x00000000.7a9df180 does not match AFAR=0x00000000.501df180
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 741677 kern.info] [AFT2] errID 0x005bed44.8b767208 PA=0x00000000.7a9df180
Nov 20 07:51:31 venus E$tag 0x00000001.ea492492 E$state_6 Exclusive
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x00000000.00000000 0x00000000.00000000 ECC 0x000
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0x00000310.025df180 0x00000310.025df180 ECC 0x021
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00000310.02237180 0x00000310.02117180 ECC 0x128
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0xffffffff.ffffffff 0x00000000.00000000 ECC 0x0ed
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 730112 kern.info] [AFT2] errID 0x005bed44.8b767208 E$tag PA=0x00000000.709df180 does not match AFAR=0x00000000.501df180
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 741677 kern.info] [AFT2] errID 0x005bed44.8b767208 PA=0x00000000.709df180
Nov 20 07:51:31 venus E$tag 0x00000001.c2122924 E$state_6 Modified
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x00) 0x00000000.00000000 0xbaddcafe.baddcafe ECC 0x0b8
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x10) 0xbaddcafe.baddcafe 0x00000300.051b7058 ECC 0x01d
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x20) 0x00000000.c1b61c1a 0x00000326.baddcafe ECC 0x09e
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 895151 kern.info] [AFT2] E$Data (0x30) 0xbaddcafe.baddcafe 0x00000001.00fc00fc ECC 0x128
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 929717 kern.info] [AFT2] D$ data not available
Nov 20 07:51:31 venus SUNW,UltraSPARC-III+: [ID 335345 kern.info] [AFT2] I$ data not available
Nov 20 07:51:31 venus pcisch: [ID 285080 kern.info] NOTICE: correctable error detected by pci0 (safari id 8) during
Nov 20 07:51:31 venus DVMA read transaction
Nov 20 07:51:31 venus pcisch: [ID 475334 kern.info] Transaction was a block operation.
Nov 20 07:51:31 venus pcisch: [ID 956438 kern.info] dvma access, Memory safari command, address 00000000.48e340c0, owned_in not asserted.
Nov 20 07:51:31 venus pcisch: [ID 863403 kern.info] AFSR=40000000.080001e8 AFAR=00000000.48e340c0,
Nov 20 07:51:31 venus quad word offset 00000000.00000000, Memory Module port id 8.
Nov 20 07:51:31 venus pcisch: [ID 916270 kern.info] syndrome bits 1e8
Nov 20 07:51:31 venus pcisch: [ID 545677 kern.info] mtag 0, mtag ecc syndrome 0
...skipping
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 466866 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005beeed.0a6a5c88
Nov 20 08:21:54 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.7f8352c0
Nov 20 08:21:54 venus Fault_PC Esynd 0x01e8 J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 468406 kern.info] [AFT0] errID 0x005beeed.0a6a5c88 Corrected Memory Error on J0304 is Sticky
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 592241 kern.info] [AFT0] errID 0x005beeed.0a6a5c88 Data Bit 117 was in error and corrected
Nov 20 08:21:54 venus unix: [ID 752700 kern.warning] WARNING: [AFT0] Sticky Softerror encountered on Memory Module J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 466354 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005beeed.0a6a5c88
Nov 20 08:21:54 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.7f8351c0
Nov 20 08:21:54 venus Fault_PC Esynd 0x01e8 J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 468406 kern.info] [AFT0] errID 0x005beeed.0a6a5c88 Corrected Memory Error on J0304 is Sticky
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 592241 kern.info] [AFT0] errID 0x005beeed.0a6a5c88 Data Bit 117 was in error and corrected
Nov 20 08:21:54 venus unix: [ID 752700 kern.warning] WARNING: [AFT0] Sticky Softerror encountered on Memory Module J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 198980 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005beeed.0b080af0
Nov 20 08:21:54 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.5000b890
Nov 20 08:21:54 venus Fault_PC Esynd 0x01e8 J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 856588 kern.info] [AFT0] errID 0x005beeed.0b080af0 Corrected Memory Error on J0304 is Sticky
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 679875 kern.info] [AFT0] errID 0x005beeed.0b080af0 Data Bit 117 was in error and corrected
Nov 20 08:21:54 venus unix: [ID 752700 kern.warning] WARNING: [AFT0] Sticky Softerror encountered on Memory Module J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 573716 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005beeed.0b080af0
Nov 20 08:21:54 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.7f8351c0
Nov 20 08:21:54 venus Fault_PC Esynd 0x01e8 J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 856588 kern.info] [AFT0] errID 0x005beeed.0b080af0 Corrected Memory Error on J0304 is Sticky
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 679875 kern.info] [AFT0] errID 0x005beeed.0b080af0 Data Bit 117 was in error and corrected
Nov 20 08:21:54 venus unix: [ID 752700 kern.warning] WARNING: [AFT0] Sticky Softerror encountered on Memory Module J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 294710 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005beeed.0ccc87a8
Nov 20 08:21:54 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.7f8fb280
Nov 20 08:21:54 venus Fault_PC Esynd 0x01e8 J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 351617 kern.info] [AFT0] errID 0x005beeed.0ccc87a8 Corrected Memory Error on J0304 is Sticky
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 354363 kern.info] [AFT0] errID 0x005beeed.0ccc87a8 Data Bit 117 was in error and corrected
Nov 20 08:21:54 venus unix: [ID 752700 kern.warning] WARNING: [AFT0] Sticky Softerror encountered on Memory Module J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 840909 kern.info] NOTICE: [AFT0] Corrected system bus (CE) Event detected by CPU0 at TL=0, errID 0x005beeed.0ccc87a8
Nov 20 08:21:54 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.7f46b610
Nov 20 08:21:54 venus Fault_PC Esynd 0x01e8 J0304
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 351617 kern.info] [AFT0] errID 0x005beeed.0ccc87a8 Corrected Memory Error on J0304 is Sticky
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 354363 kern.info] [AFT0] errID 0x005beeed.0ccc87a8 Data Bit 117 was in error and corrected
Nov 20 08:21:54 venus unix: [ID 752700 kern.warning] WARNING: [AFT0] Sticky Softerror encountered on Memory Module J0304

I reseated the DIMMs (4 cards in total, all in group 0), but the system still did not boot. The Sun 280R manual says at least one memory group of 4 slots (group 0 = J0100, J0202, J0304, J0406 or group 1 = J0101, J0203, J0305, J0407) must be filled for the system to boot. So I assumed that if one card is bad, the system will not boot unless there are 4 more cards in group 1. I therefore replaced all the DIMMs in group 0 with new cards, but it still failed. Because of this, I ran debug (POST diagnostics) at power-up, and the result indicated that the CPU is bad.

----------------------------------------------------------------------
0>ERROR: TEST = Safari quick check
0>H/W under test = Motherboard/Centerplane Safari, CPU Slot A
0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
0>MSG = ERROR:0 AFSR Error 00000002.000001e8, AFAR 00000400.04417010.
0>END_ERROR
0> CE bit: Correctable system data ECC error
0>ERROR: TEST = Safari quick check
0>H/W under test = Motherboard/Centerplane Safari, CPU Slot A
0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
0>MSG = Safari quick check error CPU_0 to IO-bridge_0
0>END_ERROR
0>ERROR: TEST = Safari quick check
0>H/W under test = Motherboard/Centerplane Safari, CPU Slot A
0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
0>MSG = *** Test Failed!! ***
0>END_ERROR
0>ERROR: TEST = Safari quick check
0>H/W under test = Motherboard/Centerplane Safari, CPU Slot A
0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
0>MSG = ERROR: Fatal CPU error on master, rolling over to new master.
0>END_ERROR
0>ERROR: TEST = Safari quick check
0>H/W under test = Motherboard/Centerplane Safari, CPU Slot A
0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
0>MSG = ERROR: Unknown type request 80000000.00001a00 in global_gnt.
0>END_ERROR
0>ERROR: No good CPUs left! Calling debug menu.
----------------------------------------------------------------------

To test this and find the root cause, I replaced the CPU in the old 280R. The debug test then ran without errors and Solaris booted normally. However, I had already started moving all the ce and qfe cards from the old 280R to the new 280R, so I decided to use the new 280R instead of the old one (and moved the 'good CPU' back), and swapped the hard disks over to avoid rebuilding the firewall. The new 280R did not have enough memory, so I am still using the original DIMMs. Service has been restored and is working fine on the new 280R (for now).
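
For anyone triaging a similar log, here is a minimal Python sketch that tallies the correctable-error records by module, classification and data bit. It rests on two assumptions inferred only from the log above, not from Sun documentation: the low 9 bits of AFSR carry the ECC syndrome (every AFSR ending in 1e8 appears with Esynd 0x01e8 or "syndrome bits 1e8"), and the PA printed in the AFT2 records is the AFAR rounded down to a 64-byte boundary. The regexes and the names summarize, line_base and SAMPLE are this sketch's own.

```python
import re
from collections import Counter

# Record shapes copied from the log above; the regexes are this sketch's own.
CLASS_RE = re.compile(r"Corrected Memory Error on (J\d{4}) is (\w+)")
BIT_RE = re.compile(r"Data Bit (\d+) was in error and corrected")
AFSR_RE = re.compile(r"AFSR[ =](?:0x)?([0-9a-f]{8})\.([0-9a-f]{8})")

SYND_MASK = 0x1FF  # assumption: low 9 bits of AFSR hold the ECC syndrome
LINE_MASK = ~0x3F  # assumption: PA is the AFAR rounded down to 64 bytes


def summarize(text):
    """Tally error classes, data bits and syndromes found in raw log text."""
    classes = Counter(CLASS_RE.findall(text))  # keys: (module, class)
    bits = Counter(int(b) for b in BIT_RE.findall(text))
    syndromes = Counter(
        int(hi + lo, 16) & SYND_MASK for hi, lo in AFSR_RE.findall(text)
    )
    return classes, bits, syndromes


def line_base(afar):
    """Round an AFAR down to the start of its 64-byte block."""
    return afar & LINE_MASK


# A few records taken verbatim from the log above.
SAMPLE = """\
Nov 20 07:49:38 venus AFSR 0x00000002.000001e8 AFAR 0x00000000.79a2fbf0
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 288824 kern.info] [AFT0] errID 0x005bed2a.28e93798 Corrected Memory Error on J0304 is Intermittent
Nov 20 07:49:38 venus SUNW,UltraSPARC-III+: [ID 238593 kern.info] [AFT0] errID 0x005bed2a.28e93798 Data Bit 117 was in error and corrected
Nov 20 07:50:34 venus pcisch: [ID 863403 kern.info] AFSR=40000000.880001e8 AFAR=00000000.190e76e0,
Nov 20 08:21:54 venus SUNW,UltraSPARC-III+: [ID 468406 kern.info] [AFT0] errID 0x005beeed.0a6a5c88 Corrected Memory Error on J0304 is Sticky
"""

if __name__ == "__main__":
    classes, bits, syndromes = summarize(SAMPLE)
    for (module, kind), count in sorted(classes.items()):
        print(f"{module} {kind}: {count}")
    for bit, count in sorted(bits.items()):
        print(f"data bit {bit}: {count}")
    for synd, count in sorted(syndromes.items()):
        print(f"syndrome 0x{synd:03x}: {count}")
    print(f"block base of AFAR 0x79a2fbf0: 0x{line_base(0x79a2fbf0):08x}")
```

The patterns are matched against the whole text rather than line by line, so the script works whether or not the log has kept its original line breaks. To run it against a real file, read /var/adm/messages (or wherever the messages were saved) into a string and pass that to summarize.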