Fall 2016 - September to December 2016 - Updated 2017-02-22 10:11 EST
A 1TB disk is slowly failing:
# smartctl -x /dev/sdd
[...]
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-- 118 089 006 - 191177437
3 Spin_Up_Time PO---- 093 092 000 - 0
4 Start_Stop_Count -O--CK 100 100 020 - 54
5 Reallocated_Sector_Ct PO--CK 099 099 036 - 73
7 Seek_Error_Rate POSR-- 087 060 030 - 507706352
9 Power_On_Hours -O--CK 070 070 000 - 26402
10 Spin_Retry_Count PO--C- 100 100 097 - 1
12 Power_Cycle_Count -O--CK 100 100 020 - 32
184 End-to-End_Error -O--CK 100 100 099 - 0
187 Reported_Uncorrect -O--CK 001 001 000 - 5143
188 Command_Timeout -O--CK 100 088 000 - 393568
189 High_Fly_Writes -O-RCK 001 001 000 - 1781
190 Airflow_Temperature_Cel -O---K 062 052 045 - 38 (Min/Max 29/48)
194 Temperature_Celsius -O---K 038 048 000 - 38 (0 16 0 0 0)
195 Hardware_ECC_Recovered -O-RC- 050 020 000 - 191177437
197 Current_Pending_Sector -O--C- 100 100 000 - 28
198 Offline_Uncorrectable ----C- 100 100 000 - 28
199 UDMA_CRC_Error_Count -OSRCK 200 199 000 - 39
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
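The raw values worth watching in a table like the one above can be pulled out with a small filter. This is only a sketch: the function name smart_warnings is hypothetical, the choice of four attributes is my assumption about which counters matter most here, and the field numbers assume the eight-column layout shown above (other smartctl output formats put the raw value in a different column).

```shell
# Print selected SMART attributes whose raw value is non-zero.
# Reads "smartctl -x" output on stdin.
# Assumption: eight-column layout as shown above, raw value in field 8.
smart_warnings() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Reported_Uncorrect|Current_Pending_Sector|Offline_Uncorrectable)$/ &&
        $8 + 0 > 0 { printf "%s raw=%s\n", $2, $8 }
    '
}
```

It would be used as: smartctl -x /dev/sdd | smart_warnings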
Today, a routine backup had an I/O error:
# ./dobackup.sh
dobackup.sh: Doing /mnt/ubuntu10.04c/. -> /mnt/1tbB/ubuntu10.04c max size 20G Tue Feb 21 10:57:42 EST 2017
rsync: readlink_stat("/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js") failed: Structure needs cleaning (117)
rsync: readlink_stat("/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/java") failed: Structure needs cleaning (117)
IO error encountered -- skipping file deletion
Checking the file in question with ls:
# ls -lF /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js
ls: cannot access '/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js': Structure needs cleaning
Checking the parent directory shows damage:
# ls -lF /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl
ls: cannot access '/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js': Structure needs cleaning
ls: cannot access '/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/java': Structure needs cleaning
total 16
dr-xr-xr-x 5 idallen idallen 4096 May 15 2006 ./
dr-xr-xr-x 3 idallen idallen 4096 May 15 2006 ../
dr-xr-xr-x 6 idallen idallen 4096 May 15 2006 common/
d????????? ? ? ? ? ? java/
d????????? ? ? ? ? ? js/
-r-xr-xr-x 1 idallen idallen 885 May 15 2006 version.htm
Find the inode number of this directory:
# ls -lidF /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl
20711024 dr-xr-xr-x 5 idallen idallen 4096 May 15 2006 ./
fsck
Confirm that /mnt/1tbB is a partition on the questionable /dev/sdd disk:
# df /mnt/1tbB
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdd1 961302560 751972852 199545712 80% /mnt/1tbB
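The same check can be scripted, which helps when several mount points need checking. A minimal sketch, with hypothetical function names df_source and on_disk; the prefix match is a simplification that suits names like /dev/sdd1 on /dev/sdd and would need tightening for other device naming schemes.

```shell
# Print the device (first field) from the data line of "df -P" output
# read on stdin. -P keeps each file system on one line.
df_source() {
    awk 'NR == 2 { print $1 }'
}

# Succeed if device $1 is disk $2 itself or a partition on it.
# Assumption: a plain prefix match is good enough (/dev/sdd1 on /dev/sdd).
on_disk() {
    case $1 in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example: dev=$(df -P /mnt/1tbB | df_source); on_disk "$dev" /dev/sdd && echo "$dev is on the failing disk"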
Unmount the file system and do a file system check and repair on it:
# umount /dev/sdd1
# fsck -v -C -f /dev/sdd1
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes
Inode 20711106 was part of the orphaned inode list. FIXED.
Inode 20711106, i_blocks is 19792652383982, should be 0. Fix<y>? yes
Inode 20711107 was part of the orphaned inode list. FIXED.
Inode 20711107, i_blocks is 19796947351278, should be 0. Fix<y>? yes
Inode 20711108 was part of the orphaned inode list. FIXED.
Inode 20711108, i_blocks is 19801242318574, should be 0. Fix<y>? yes
Inode 20711109 was part of the orphaned inode list. FIXED.
Inode 20711109, i_blocks is 19805537285870, should be 0. Fix<y>? yes
Inode 20711110 was part of the orphaned inode list. FIXED.
Inode 20711110, i_blocks is 19809832253166, should be 0. Fix<y>? yes
Inode 20711111 was part of the orphaned inode list. FIXED.
Inode 20711111, i_blocks is 19814127220462, should be 0. Fix<y>? yes
Inode 20711112 was part of the orphaned inode list. FIXED.
Inode 20711112, i_blocks is 19818422187758, should be 0. Fix<y>? yes
Inode 20711113 was part of the orphaned inode list. FIXED.
Inode 20711113, i_blocks is 19822717155054, should be 0. Fix<y>? yes
Inode 20711114 was part of the orphaned inode list. FIXED.
Inode 20711114, i_blocks is 19827012122350, should be 0. Fix<y>? yes
Inode 20711115 was part of the orphaned inode list. FIXED.
Inode 20711115, i_blocks is 19831307089646, should be 0. Fix<y>? yes
Inode 20711116 was part of the orphaned inode list. FIXED.
Inode 20711116, i_blocks is 19835602056942, should be 0. Fix<y>? yes
Inode 20711117 was part of the orphaned inode list. FIXED.
Inode 20711117, i_blocks is 19839897024238, should be 0. Fix<y>? yes
Inode 20711118 was part of the orphaned inode list. FIXED.
Inode 20711118, i_blocks is 19844191991534, should be 0. Fix<y>? yes
Inode 20711119 was part of the orphaned inode list. FIXED.
Inode 20711119, i_blocks is 19848486958830, should be 0. Fix<y>? yes
Inode 20711120 was part of the orphaned inode list. FIXED.
Inode 20711120, i_blocks is 19852781926126, should be 0. Fix<y>? yes
Pass 2: Checking directory structure
Inode 20711108 (/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/java) has invalid mode (00).
Clear<y>? yes
Inode 20711115 (/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js) has invalid mode (00).
Clear<y>? yes
Inode 20711106 (/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/common/scripts/strutils.js) has invalid mode (00).
Clear<y>? yes
Inode 20711107 (/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/common/scripts/switch.js) has invalid mode (00).
Clear<y>? yes
Entry '..' in <20711115>/<20711123> (20711123) has deleted/unused inode 20711115. Clear<y>? yes
Entry '..' in <20711115>/<20711141> (20711141) has deleted/unused inode 20711115. Clear<y>? yes
Entry '..' in <20711115>/<20711144> (20711144) has deleted/unused inode 20711115. Clear<y>? yes
Pass 3: Checking directory connectivity
Unconnected directory inode 20711123 (...)
Connect to /lost+found<y>? yes
Unconnected directory inode 20711141 (...)
Connect to /lost+found<y>? yes
Unconnected directory inode 20711144 (...)
Connect to /lost+found<y>? yes
Pass 4: Checking reference counts
Inode 20711024 ref count is 5, should be 3. Fix<y>? yes
Inode 20711109 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711110 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711111 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711112 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711113 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711114 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711116 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711117 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711118 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711119 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711120 (...) has invalid mode (00).
Clear<y>? yes
Unattached inode 20711121
Connect to /lost+found<y>? yes
Inode 20711121 ref count is 2, should be 1. Fix<y>? yes
Unattached inode 20711122
Connect to /lost+found<y>? yes
Inode 20711122 ref count is 2, should be 1. Fix<y>? yes
Inode 20711123 ref count is 3, should be 2. Fix<y>? yes
Inode 20711141 ref count is 3, should be 2. Fix<y>? yes
Inode 20711144 ref count is 3, should be 2. Fix<y>? yes
Pass 5: Checking group summary information
Block bitmap differences: -(82845914--82845917) -(82872976--82872995)
Fix<y>? yes
Free blocks count wrong for group #2528 (15, counted=19).
Fix<y>? yes
Free blocks count wrong for group #2529 (73, counted=93).
Fix<y>? yes
Free blocks count wrong (52332403, counted=52332427).
Fix<y>? yes
Directories count wrong for group #2528 (893, counted=889).
Fix<y>? yes
1tbB: ***** FILE SYSTEM WAS MODIFIED *****
2623987 inodes used (4.30%, out of 61054976)
5225 non-contiguous files (0.2%)
247 non-contiguous directories (0.0%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 2579521/463/2
191857963 blocks used (78.57%, out of 244190390)
16 bad blocks
25 large files
2308517 regular files
269852 directories
790 character device files
26 block device files
69 fifos
223385 links
44649 symbolic links (43033 fast symbolic links)
75 sockets
------------
2847361 files
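If the fsck output is captured to a file (an assumption: the session above was typed interactively and nothing was saved), the pathnames that Pass 2 cleared can be listed afterwards. These are the files and directories to look for later in the original. The function name cleared_paths is hypothetical, and the pattern is written against the message wording shown above.

```shell
# Read saved fsck output on stdin and print each pathname reported as
# "has invalid mode". Lines whose name is shown as (...) carry no
# pathname and are skipped because the match requires a leading "/".
cleared_paths() {
    sed -n 's|^Inode [0-9][0-9]* (\(/.*\)) has invalid mode.*$|\1|p'
}
```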
Re-mount the file system and look at the same directory. The damage is gone, but so are two damaged directories of files:
# mount /mnt/1tbB
# ls -liF /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl
total 16
20711024 dr-xr-xr-x 3 idallen idallen 4096 May 15 2006 ./
20711023 dr-xr-xr-x 3 idallen idallen 4096 May 15 2006 ../
20711025 dr-xr-xr-x 6 idallen idallen 4096 May 15 2006 common/
20711160 -r-xr-xr-x 1 idallen idallen 885 May 15 2006 version.htm
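The link count on ./ dropped from 5 to 3, which is the Pass 4 fix "Inode 20711024 ref count is 5, should be 3". On ext-family file systems a directory's link count is normally 2 plus the number of its subdirectories: one link for its name in the parent, one for its own ".", and one for each child's "..". With common/, java/ and js/ that made 5; with only common/ left it is 3. The sketch below computes the expected figure for any directory; the function name expected_links is hypothetical, and other file system types may count directory links differently.

```shell
# Print 2 plus the number of real subdirectories of directory $1,
# which is the link count an ext-family file system would normally show.
expected_links() {
    n=2
    for e in "$1"/*/ "$1"/.[!.]*/; do
        # Unmatched globs stay literal and fail -d; symlinks are skipped.
        [ -d "$e" ] && [ ! -L "${e%/}" ] && n=$((n + 1))
    done
    echo "$n"
}
```

The result can be compared with the second column of ls -ld for the same directory.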
Check the lost+found directory for this file system and see what was salvaged and ended up there:
# ls -lF /mnt/1tbB/lost+found/
total 44
-r-xr-xr-x 1 idallen idallen 1115 May 15 2006 #20711121
-r-xr-xr-x 1 idallen idallen 5217 May 15 2006 #20711122
dr-xr-xr-x 2 idallen idallen 4096 May 15 2006 #20711123/
dr-xr-xr-x 2 idallen idallen 4096 May 15 2006 #20711141/
dr-xr-xr-x 2 idallen idallen 4096 May 15 2006 #20711144/
drwx------ 5 root root 16384 Aug 3 2015 ./
drwxr-xr-x 8 idallen idallen 4096 Feb 3 08:56 ../
# find /mnt/1tbB/lost+found/
/mnt/1tbB/lost+found/
/mnt/1tbB/lost+found/#20711123
/mnt/1tbB/lost+found/#20711123/navanim1.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_it.gif
/mnt/1tbB/lost+found/#20711123/nfocbg.gif
/mnt/1tbB/lost+found/#20711123/patt_right.gif
/mnt/1tbB/lost+found/#20711123/tabspacer.gif
/mnt/1tbB/lost+found/#20711123/pdf.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_pt.gif
/mnt/1tbB/lost+found/#20711123/tabsbg.gif
/mnt/1tbB/lost+found/#20711123/navanim2.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_en.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_es.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_sv.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_nl.gif
/mnt/1tbB/lost+found/#20711123/navanim1_enfocsite.gif
/mnt/1tbB/lost+found/#20711123/tabsbg_bkup.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_fr.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_de.gif
/mnt/1tbB/lost+found/#20711141
/mnt/1tbB/lost+found/#20711141/options.js
/mnt/1tbB/lost+found/#20711141/locale.js
/mnt/1tbB/lost+found/#20711144
/mnt/1tbB/lost+found/#20711144/search.js
/mnt/1tbB/lost+found/#20711144/outlin1s.js
/mnt/1tbB/lost+found/#20711144/javascpt.js
/mnt/1tbB/lost+found/#20711144/outline.js
/mnt/1tbB/lost+found/#20711144/index.js
/mnt/1tbB/lost+found/#20711144/outlfast.js
/mnt/1tbB/lost+found/#20711144/search4s.js
/mnt/1tbB/lost+found/#20711144/panels.js
/mnt/1tbB/lost+found/#20711144/handler.js
/mnt/1tbB/lost+found/#20711144/tabs.js
/mnt/1tbB/lost+found/#20711144/search3s.js
/mnt/1tbB/lost+found/#20711144/search1s.js
/mnt/1tbB/lost+found/#20711144/outlsafe.js
/mnt/1tbB/lost+found/#20711144/search2s.js
/mnt/1tbB/lost+found/#20711144/index1s.js
/mnt/1tbB/lost+found/#20711121
/mnt/1tbB/lost+found/#20711122
Since this file system is a backup copy, I can discover the names of the three directories above by searching the original for the file names they contain:
# find /idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl | grep navanim1.gif
/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/images/navanim1.gif
The above shows that lost+found/#20711123 should be named /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/images. A similar search identifies the names of the other two directory inodes under lost+found. We move the three directories back where they belong:
# mkdir /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js
# mkdir /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/java
# mv /mnt/1tbB/lost+found/#20711123 /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/images
# mv /mnt/1tbB/lost+found/#20711141 /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/private
# mv /mnt/1tbB/lost+found/#20711144 /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/scripts
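The search used to name the three directories can be looped over everything in lost+found. This is a sketch only, with a hypothetical function name suggest_names. It takes the first regular file in each salvaged directory and lists every directory in the original tree that holds a file of that name. One name can turn up in more than one place, so the output is a list of candidates to check by hand, not a set of mv commands to run blindly.

```shell
# Usage: suggest_names LOSTFOUND_DIR ORIGINAL_ROOT
# For each salvaged directory (named #inode), print candidate original
# directories as "lost+found-entry -> original-directory".
suggest_names() {
    lostdir=$1
    origroot=$2
    for d in "$lostdir"/'#'*/; do
        [ -d "$d" ] || continue
        sample=
        for f in "$d"*; do
            [ -f "$f" ] && { sample=${f##*/}; break; }
        done
        [ -n "$sample" ] || continue
        # Note: -name treats glob characters in the sample as a pattern.
        find "$origroot" -type f -name "$sample" |
        while IFS= read -r hit; do
            printf '%s -> %s\n' "${d%/}" "$(dirname "$hit")"
        done
    done
}
```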
A comparison of the source and the backup copy shows that a few files have actually been lost to this I/O error. The error damaged some inodes, and the data they pointed to is gone:
# rsync -avxHs -n /idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/. /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/
sending incremental file list
./
common/scripts/strutils.js
common/scripts/switch.js
java/
java/private/
java/private/books.xml
java/private/locale.js
java/private/locale.xml
java/private/options.js
java/private/options.xml
js/
js/html/
js/html/indexsel.htm
js/html/navigate.htm
js/html/panel.htm
js/html/panelini.htm
js/html/tabs.htm
js/html/wwhelp.htm
sent 2,445 bytes received 82 bytes 5,054.00 bytes/sec
total size is 394,537 speedup is 156.13 (DRY RUN)
Doing a checksum on the two remaining files in lost+found and comparing sums with the above list of files identifies their names:
# cd /idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/
# sum js/html/tabs.htm js/html/wwhelp.htm /mnt/1tbB/lost+found/#2071112*
27775 2 js/html/tabs.htm
01742 6 js/html/wwhelp.htm
27775 2 /mnt/1tbB/lost+found/#20711121
01742 6 /mnt/1tbB/lost+found/#20711122
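With more than a couple of orphaned files, matching sums by eye gets tedious. The sketch below does the comparison in a loop. It swaps in cksum for the sum command used above (cksum output is the same on all POSIX systems), and the function name match_orphans is hypothetical. A matching checksum strongly suggests the same content but is not proof, so cmp can confirm each pair before anything is moved.

```shell
# Usage: match_orphans LOSTFOUND_DIR CANDIDATE_FILE...
# Print "orphan = candidate" for each pair with the same CRC and size.
match_orphans() {
    lostdir=$1
    shift
    for orphan in "$lostdir"/'#'*; do
        [ -f "$orphan" ] || continue
        osum=$(cksum < "$orphan")      # "CRC size", no file name
        for cand in "$@"; do
            [ -f "$cand" ] || continue
            if [ "$(cksum < "$cand")" = "$osum" ]; then
                printf '%s = %s\n' "$orphan" "$cand"
            fi
        done
    done
}
```

For example, from the original wwhimpl directory: match_orphans /mnt/1tbB/lost+found js/html/*.htm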
The other files in the html/ directory and other directories were damaged by the I/O error and are gone, but we can simply re-do the backup to recreate them. I should throw out this old disk!