Autopsy of a minor disk failure, fsck, lost+found

Ian! D. Allen – www.idallen.com

Fall 2017 - September to December 2017 - Updated 2017-02-22 10:11 EST

1 Disk attributes of damaged disk

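A 1TB disk is slowly failing. Its SMART attributes show 73 reallocated sectors, 28 sectors pending reallocation, and 5143 reported uncorrectable errors: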

# smartctl -x /dev/sdd
[...]
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   118   089   006    -    191177437
  3 Spin_Up_Time            PO----   093   092   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    54
  5 Reallocated_Sector_Ct   PO--CK   099   099   036    -    73
  7 Seek_Error_Rate         POSR--   087   060   030    -    507706352
  9 Power_On_Hours          -O--CK   070   070   000    -    26402
 10 Spin_Retry_Count        PO--C-   100   100   097    -    1
 12 Power_Cycle_Count       -O--CK   100   100   020    -    32
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   001   001   000    -    5143
188 Command_Timeout         -O--CK   100   088   000    -    393568
189 High_Fly_Writes         -O-RCK   001   001   000    -    1781
190 Airflow_Temperature_Cel -O---K   062   052   045    -    38 (Min/Max 29/48)
194 Temperature_Celsius     -O---K   038   048   000    -    38 (0 16 0 0 0)
195 Hardware_ECC_Recovered  -O-RC-   050   020   000    -    191177437
197 Current_Pending_Sector  -O--C-   100   100   000    -    28
198 Offline_Uncorrectable   ----C-   100   100   000    -    28
199 UDMA_CRC_Error_Count    -OSRCK   200   199   000    -    39
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning
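
Most of the table above is noise for a quick health check. The following is a minimal sketch, assuming the eight-column layout of the smartctl -x output shown above and a hand-picked list of attributes that usually point at media trouble. It prints only the watched attributes whose raw value is non-zero. The function name smart_media_errors and the choice of attributes are assumptions made for this example; they are not part of smartmontools.

```shell
# Read `smartctl -x` attribute lines on standard input and print
# "name raw_value" for watched attributes with a non-zero raw value.
# Assumes the eight-column layout shown above:
#   ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
smart_media_errors() {
    awk '
        $1 ~ /^[0-9]+$/ &&
        $2 ~ /^(Reallocated_Sector_Ct|Reported_Uncorrect|Current_Pending_Sector|Offline_Uncorrectable)$/ &&
        $8 + 0 > 0 { print $2, $8 }
    '
}

# Hypothetical usage (needs root and the smartmontools package):
#   smartctl -x /dev/sdd | smart_media_errors
```

Fed the table above, this sketch would report Reallocated_Sector_Ct, Reported_Uncorrect, Current_Pending_Sector and Offline_Uncorrectable, all of which are non-zero on this disk.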

2 I/O error during routine backup

Today, a routine backup had an I/O error:

# ./dobackup.sh
dobackup.sh: Doing /mnt/ubuntu10.04c/. -> /mnt/1tbB/ubuntu10.04c max size 20G Tue Feb 21 10:57:42 EST 2017
rsync: readlink_stat("/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js") failed: Structure needs cleaning (117)
rsync: readlink_stat("/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/java") failed: Structure needs cleaning (117)
IO error encountered -- skipping file deletion
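
Error number 117 is the Linux errno EUCLEAN ("Structure needs cleaning"), which a file system returns when it finds corrupted on-disk metadata. As a small sketch, assuming the rsync message format shown above, the affected path names can be pulled out of a saved backup log so that each one can be inspected. The function name euclean_paths and the log file name in the usage comment are inventions for this example.

```shell
# Read rsync output on standard input and print the path from every
# "Structure needs cleaning (117)" error line, one path per line.
euclean_paths() {
    sed -n 's/^rsync: [a-z_]*("\(.*\)") failed: Structure needs cleaning (117)$/\1/p'
}

# Hypothetical usage:
#   ./dobackup.sh 2>&1 | tee backup.log
#   euclean_paths <backup.log
```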

Checking one of the reported paths with ls fails with the same error:

# ls -lF /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js
ls: cannot access '/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js': Structure needs cleaning

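Listing the parent directory shows the damage. ls can read the names java and js from the directory, but it cannot read their inodes, so every attribute is shown as a question mark: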

# ls -lF /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl
ls: cannot access '/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js': Structure needs cleaning
ls: cannot access '/mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/java': Structure needs cleaning
total 16
dr-xr-xr-x 5 idallen idallen 4096 May 15  2006 ./
dr-xr-xr-x 3 idallen idallen 4096 May 15  2006 ../
dr-xr-xr-x 6 idallen idallen 4096 May 15  2006 common/
d????????? ? ?       ?          ?            ? java/
d????????? ? ?       ?          ?            ? js/
-r-xr-xr-x 1 idallen idallen  885 May 15  2006 version.htm

Find the inode number of this parent directory, so that it can be recognized in the fsck output later:

# ls -lidF /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl
20711024 dr-xr-xr-x 5 idallen idallen 4096 May 15  2006 ./
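
fsck reports problems by inode number, and entries salvaged into lost+found are named after their inode numbers, so it helps to be able to go both ways between path names and inode numbers. The following sketch uses GNU stat and find; the helper names inode_of and paths_of_inode are made up for illustration.

```shell
# Print the inode number of a path (GNU coreutils stat; symlinks are not followed).
inode_of() {
    stat -c '%i' -- "$1"
}

# Print every path under directory $1 (same file system only) that has
# inode number $2. Hard links to the same inode are all listed.
paths_of_inode() {
    find "$1" -xdev -inum "$2" -print
}

# Hypothetical usage:
#   inode_of /mnt/1tbB/ubuntu10.04c
#   paths_of_inode /mnt/1tbB 20711024
```

Searching a whole file system by inode number reads every directory, so on a large disk it can take a long time.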

3 Running a file system check: fsck

Confirm that /mnt/1tbB is a partition on the questionable /dev/sdd disk:

# df /mnt/1tbB
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/sdd1      961302560 751972852 199545712  80% /mnt/1tbB
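
The same check can be scripted. This is a minimal sketch, assuming POSIX df output with the device name in the first column; the function name fs_device is an assumption for this example.

```shell
# Print the device (first column of `df -P`) that holds the given path.
fs_device() {
    df -P -- "$1" | awk 'NR == 2 { print $1 }'
}

# Hypothetical usage on the system in this article:
#   fs_device /mnt/1tbB          # would print /dev/sdd1
# The parent disk of a partition can then be looked up with util-linux:
#   lsblk -no PKNAME /dev/sdd1
```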

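Unmount the file system, then force a full check and repair of it. The -f option forces the check even if the file system is marked clean, -C shows progress, and -v is verbose: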

# umount /dev/sdd1

# fsck -v -C -f /dev/sdd1

Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found.  Fix<y>? yes
Inode 20711106 was part of the orphaned inode list.  FIXED.
Inode 20711106, i_blocks is 19792652383982, should be 0.  Fix<y>? yes
Inode 20711107 was part of the orphaned inode list.  FIXED.
Inode 20711107, i_blocks is 19796947351278, should be 0.  Fix<y>? yes
Inode 20711108 was part of the orphaned inode list.  FIXED.
Inode 20711108, i_blocks is 19801242318574, should be 0.  Fix<y>? yes
Inode 20711109 was part of the orphaned inode list.  FIXED.
Inode 20711109, i_blocks is 19805537285870, should be 0.  Fix<y>? yes
Inode 20711110 was part of the orphaned inode list.  FIXED.
Inode 20711110, i_blocks is 19809832253166, should be 0.  Fix<y>? yes
Inode 20711111 was part of the orphaned inode list.  FIXED.
Inode 20711111, i_blocks is 19814127220462, should be 0.  Fix<y>? yes
Inode 20711112 was part of the orphaned inode list.  FIXED.
Inode 20711112, i_blocks is 19818422187758, should be 0.  Fix<y>? yes
Inode 20711113 was part of the orphaned inode list.  FIXED.
Inode 20711113, i_blocks is 19822717155054, should be 0.  Fix<y>? yes
Inode 20711114 was part of the orphaned inode list.  FIXED.
Inode 20711114, i_blocks is 19827012122350, should be 0.  Fix<y>? yes
Inode 20711115 was part of the orphaned inode list.  FIXED.
Inode 20711115, i_blocks is 19831307089646, should be 0.  Fix<y>? yes
Inode 20711116 was part of the orphaned inode list.  FIXED.
Inode 20711116, i_blocks is 19835602056942, should be 0.  Fix<y>? yes
Inode 20711117 was part of the orphaned inode list.  FIXED.
Inode 20711117, i_blocks is 19839897024238, should be 0.  Fix<y>? yes
Inode 20711118 was part of the orphaned inode list.  FIXED.
Inode 20711118, i_blocks is 19844191991534, should be 0.  Fix<y>? yes
Inode 20711119 was part of the orphaned inode list.  FIXED.
Inode 20711119, i_blocks is 19848486958830, should be 0.  Fix<y>? yes
Inode 20711120 was part of the orphaned inode list.  FIXED.
Inode 20711120, i_blocks is 19852781926126, should be 0.  Fix<y>? yes

Pass 2: Checking directory structure
Inode 20711108 (/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/java) has invalid mode (00).
Clear<y>? yes
Inode 20711115 (/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js) has invalid mode (00).
Clear<y>? yes
Inode 20711106 (/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/common/scripts/strutils.js) has invalid mode (00).
Clear<y>? yes
Inode 20711107 (/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/common/scripts/switch.js) has invalid mode (00).
Clear<y>? yes
Entry '..' in <20711115>/<20711123> (20711123) has deleted/unused inode 20711115.  Clear<y>? yes
Entry '..' in <20711115>/<20711141> (20711141) has deleted/unused inode 20711115.  Clear<y>? yes
Entry '..' in <20711115>/<20711144> (20711144) has deleted/unused inode 20711115.  Clear<y>? yes

Pass 3: Checking directory connectivity

Unconnected directory inode 20711123 (...)
Connect to /lost+found<y>? yes
Unconnected directory inode 20711141 (...)
Connect to /lost+found<y>? yes
Unconnected directory inode 20711144 (...)
Connect to /lost+found<y>? yes

Pass 4: Checking reference counts

Inode 20711024 ref count is 5, should be 3.  Fix<y>? yes
Inode 20711109 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711110 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711111 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711112 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711113 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711114 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711116 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711117 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711118 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711119 (...) has invalid mode (00).
Clear<y>? yes
Inode 20711120 (...) has invalid mode (00).
Clear<y>? yes
Unattached inode 20711121
Connect to /lost+found<y>? yes
Inode 20711121 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 20711122
Connect to /lost+found<y>? yes
Inode 20711122 ref count is 2, should be 1.  Fix<y>? yes
Inode 20711123 ref count is 3, should be 2.  Fix<y>? yes
Inode 20711141 ref count is 3, should be 2.  Fix<y>? yes
Inode 20711144 ref count is 3, should be 2.  Fix<y>? yes

Pass 5: Checking group summary information

Block bitmap differences:  -(82845914--82845917) -(82872976--82872995)
Fix<y>? yes
Free blocks count wrong for group #2528 (15, counted=19).
Fix<y>? yes
Free blocks count wrong for group #2529 (73, counted=93).
Fix<y>? yes
Free blocks count wrong (52332403, counted=52332427).
Fix<y>? yes
Directories count wrong for group #2528 (893, counted=889).
Fix<y>? yes

1tbB: ***** FILE SYSTEM WAS MODIFIED *****

     2623987 inodes used (4.30%, out of 61054976)
        5225 non-contiguous files (0.2%)
         247 non-contiguous directories (0.0%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 2579521/463/2
   191857963 blocks used (78.57%, out of 244190390)
          16 bad blocks
          25 large files

     2308517 regular files
      269852 directories
         790 character device files
          26 block device files
          69 fifos
      223385 links
       44649 symbolic links (43033 fast symbolic links)
          75 sockets
------------
     2847361 files
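
The Pass 5 numbers can be cross-checked with a little arithmetic. The transcript does not show the file system geometry, so the values below are assumptions inferred from the summary: 32768 blocks per group is the usual ext4 default for 4 KiB blocks, and 8192 inodes per group follows from 61054976 inodes spread over 7453 groups (244190390 blocks divided by 32768, rounded up). Running dumpe2fs -h on the device would confirm them. The helper names are invented for this sketch.

```shell
#!/bin/sh
# Assumed ext4 geometry (inferred, not shown in the transcript).
blocks_per_group=32768
inodes_per_group=8192

# Block group that holds a block number (first data block assumed to be 0).
block_group() { echo $(( $1 / blocks_per_group )); }

# Block group that holds an inode number (inode numbers start at 1).
inode_group() { echo $(( ($1 - 1) / inodes_per_group )); }

# Number of blocks in an inclusive range first--last.
range_len() { echo $(( $2 - $1 + 1 )); }

block_group 82845914              # prints 2528
block_group 82872976              # prints 2529
inode_group 20711106              # prints 2528
range_len 82845914 82845917       # prints 4
range_len 82872976 82872995       # prints 20
echo $(( 52332427 - 52332403 ))   # prints 24
```

Under these assumptions the two freed ranges (4 + 20 = 24 blocks) account exactly for the corrected free block count, and the damaged inodes and freed blocks all fall in block groups 2528 and 2529, the same groups named in the Pass 5 messages. That suggests the damage is confined to one small region of the disk.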

4 Re-mount the file system and examine lost+found

Re-mount the file system and look at the same directory. The errors are gone, but so are the two damaged directories, java/ and js/, along with everything under them:

# mount /mnt/1tbB

# ls -liF /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl
total 16
20711024 dr-xr-xr-x 3 idallen idallen 4096 May 15  2006 ./
20711023 dr-xr-xr-x 3 idallen idallen 4096 May 15  2006 ../
20711025 dr-xr-xr-x 6 idallen idallen 4096 May 15  2006 common/
20711160 -r-xr-xr-x 1 idallen idallen  885 May 15  2006 version.htm

Check the lost+found directory of this file system to see what fsck salvaged:

# ls -lF /mnt/1tbB/lost+found/
total 44
-r-xr-xr-x 1 idallen idallen  1115 May 15  2006 #20711121
-r-xr-xr-x 1 idallen idallen  5217 May 15  2006 #20711122
dr-xr-xr-x 2 idallen idallen  4096 May 15  2006 #20711123/
dr-xr-xr-x 2 idallen idallen  4096 May 15  2006 #20711141/
dr-xr-xr-x 2 idallen idallen  4096 May 15  2006 #20711144/
drwx------ 5 root    root    16384 Aug  3  2015 ./
drwxr-xr-x 8 idallen idallen  4096 Feb  3 08:56 ../

# find /mnt/1tbB/lost+found/
/mnt/1tbB/lost+found/
/mnt/1tbB/lost+found/#20711123
/mnt/1tbB/lost+found/#20711123/navanim1.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_it.gif
/mnt/1tbB/lost+found/#20711123/nfocbg.gif
/mnt/1tbB/lost+found/#20711123/patt_right.gif
/mnt/1tbB/lost+found/#20711123/tabspacer.gif
/mnt/1tbB/lost+found/#20711123/pdf.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_pt.gif
/mnt/1tbB/lost+found/#20711123/tabsbg.gif
/mnt/1tbB/lost+found/#20711123/navanim2.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_en.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_es.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_sv.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_nl.gif
/mnt/1tbB/lost+found/#20711123/navanim1_enfocsite.gif
/mnt/1tbB/lost+found/#20711123/tabsbg_bkup.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_fr.gif
/mnt/1tbB/lost+found/#20711123/searchbutton_de.gif
/mnt/1tbB/lost+found/#20711141
/mnt/1tbB/lost+found/#20711141/options.js
/mnt/1tbB/lost+found/#20711141/locale.js
/mnt/1tbB/lost+found/#20711144
/mnt/1tbB/lost+found/#20711144/search.js
/mnt/1tbB/lost+found/#20711144/outlin1s.js
/mnt/1tbB/lost+found/#20711144/javascpt.js
/mnt/1tbB/lost+found/#20711144/outline.js
/mnt/1tbB/lost+found/#20711144/index.js
/mnt/1tbB/lost+found/#20711144/outlfast.js
/mnt/1tbB/lost+found/#20711144/search4s.js
/mnt/1tbB/lost+found/#20711144/panels.js
/mnt/1tbB/lost+found/#20711144/handler.js
/mnt/1tbB/lost+found/#20711144/tabs.js
/mnt/1tbB/lost+found/#20711144/search3s.js
/mnt/1tbB/lost+found/#20711144/search1s.js
/mnt/1tbB/lost+found/#20711144/outlsafe.js
/mnt/1tbB/lost+found/#20711144/search2s.js
/mnt/1tbB/lost+found/#20711144/index1s.js
/mnt/1tbB/lost+found/#20711121
/mnt/1tbB/lost+found/#20711122

Since this file system is a backup copy, the original still exists. Searching the original for one of the file names listed above reveals the name of the directory that contained it:

# find /idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl | grep navanim1.gif
/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/images/navanim1.gif

The above shows that lost+found/#20711123 should be named /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/images. A similar search identifies the names of the other two directory inodes under lost+found. fsck cleared the damaged js and java directories, so we first recreate them and then move the three salvaged directories back where they belong:

# mkdir /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js
# mkdir /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/java
# mv /mnt/1tbB/lost+found/#20711123 /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/images
# mv /mnt/1tbB/lost+found/#20711141 /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/private
# mv /mnt/1tbB/lost+found/#20711144 /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/js/scripts
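
The name search used above for #20711123 can be repeated for every directory in lost+found. The following is a rough sketch, assuming GNU find and file names without newlines or shell pattern characters; the function name suggest_origins is invented for this example. It only prints suggestions and moves nothing.

```shell
# For each directory directly under lost+found ($1), take the name of one
# file inside it and list the directories under the original tree ($2)
# that contain a file with the same name. More than one candidate may be
# printed for a directory; the final choice has to be made by hand.
suggest_origins() {
    lost=$1 orig=$2
    for dir in "$lost"/*/; do
        [ -d "$dir" ] || continue
        dir=${dir%/}
        sample=$(find "$dir" -maxdepth 1 -type f | head -n 1)
        [ -n "$sample" ] || continue
        name=${sample##*/}
        find "$orig" -type f -name "$name" | while IFS= read -r hit; do
            printf '%s -> %s\n' "$dir" "${hit%/*}"
        done
    done
}

# Hypothetical usage:
#   suggest_origins /mnt/1tbB/lost+found /idallen/archive/SpeedTouch516
```

A file name alone can be ambiguous. In this case options.js and locale.js exist in both java/private and js/private in the original, so #20711141 would get two suggestions. The fsck Pass 2 messages settle it: the '..' entry of inode 20711141 pointed at inode 20711115, which was the js directory.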

A comparison of the source and the backup copy shows that a few files have actually been lost. The I/O error damaged some inodes, and the data those inodes pointed to is gone. A dry run of rsync lists what is missing from the backup:

# rsync -avxHs -n /idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/. /mnt/1tbB/ubuntu10.04c/idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/
sending incremental file list
./
common/scripts/strutils.js
common/scripts/switch.js
java/
java/private/
java/private/books.xml
java/private/locale.js
java/private/locale.xml
java/private/options.js
java/private/options.xml
js/
js/html/
js/html/indexsel.htm
js/html/navigate.htm
js/html/panel.htm
js/html/panelini.htm
js/html/tabs.htm
js/html/wwhelp.htm

sent 2,445 bytes  received 82 bytes  5,054.00 bytes/sec
total size is 394,537  speedup is 156.13 (DRY RUN)

Computing checksums of the two remaining files in lost+found and comparing them with the checksums of files from the above list identifies their names:

# cd /idallen/archive/SpeedTouch516/Documentation/HTML/ST706_es/wwhelp/wwhimpl/
# sum js/html/tabs.htm js/html/wwhelp.htm /mnt/1tbB/lost+found/#2071112*
27775     2 js/html/tabs.htm
01742     6 js/html/wwhelp.htm
27775     2 /mnt/1tbB/lost+found/#20711121
01742     6 /mnt/1tbB/lost+found/#20711122
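
The comparison above was done by eye on two candidate files. As a sketch of how it could be automated, the function below compares every regular file in lost+found against the original tree. It swaps the 16-bit sum checksum used above for SHA-256 (GNU sha256sum) to make accidental matches unlikely. The function name match_by_checksum is an assumption for this example, and file names containing newlines are not handled.

```shell
# For each regular file directly under lost+found ($1), print the files
# under the original tree ($2) that have the same size and SHA-256 checksum.
match_by_checksum() {
    lost=$1 orig=$2
    for f in "$lost"/*; do
        [ -f "$f" ] || continue
        want=$(sha256sum < "$f" | awk '{ print $1 }')
        size=$(wc -c < "$f" | tr -d ' ')
        # Only files of exactly the same size in bytes can match.
        find "$orig" -type f -size "${size}c" |
        while IFS= read -r cand; do
            got=$(sha256sum < "$cand" | awk '{ print $1 }')
            if [ "$got" = "$want" ]; then
                printf '%s = %s\n' "$f" "$cand"
            fi
        done
    done
}

# Hypothetical usage:
#   match_by_checksum /mnt/1tbB/lost+found /idallen/archive/SpeedTouch516
```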

The other files in the html/ directory and other directories were damaged by the I/O error and are gone, but we can simply re-do the backup to recreate them. I should throw out this old disk!

Author: 
| Ian! D. Allen, BA, MMath  -  idallen@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/
