
I was "zeroing" my drive with the dd tool when I got this input/output error:

Input/output error

Is my drive ready for the trash, or is there a way to fix it?

Thanks a lot for your answer :)

sudo dd if=/dev/zero of=/dev/zero bs=4096 status=progress

It works with bs=4096!

bs=4096

sudo badblocks -wsv /dev/sda

The read-write badblocks test is OK.

Pass completed, 0 bad blocks found (0/0/0 errors)

sudo smartctl -H /dev/sda

The quick report is OK (PASSED):

smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.18.0-15-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION === 
SMART overall-health self-assessment test result: PASSED

I ran this test:

sudo smartctl -t long /dev/sda

Here is the report obtained with:

sudo smartctl -a /dev/sda

smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.18.0-15-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     Hitachi HDS721050CLA360
Serial Number:    JP1572FN3RJ3WK
LU WWN Device Id: 5 000cca 399f48335
Firmware Version: JP2OA50E
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Tue Nov 26 16:03:40 2019 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline data collection: ( 4920) seconds.
Offline data collection capabilities:  (0x5b) SMART execute Offline immediate.
                Auto Offline data collection on/off support.
                Suspend Offline collection upon new
                command.
                Offline surface scan supported.
                Self-test supported.
                No Conveyance Self-test supported.
                Selective Self-test supported.
SMART capabilities:  (0x0003)
                     Saves SMART data before entering power-saving mode.
                     Supports SMART auto save timer.
Error logging capability:        (0x01)
                                 Error logging supported.
                                 General Purpose Logging supported.  
Short self-test routine  recommended polling time:   (   1) minutes.
Extended self-test routine recommended polling time:     (  82) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   136   136   054    Pre-fail  Offline      -       95
  3 Spin_Up_Time            0x0007   118   118   024    Pre-fail  Always       -       193 (Average 195)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       1533
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   140   140   020    Pre-fail  Offline      -       30
  9 Power_On_Hours          0x0012   099   099   000    Old_age   Always       -       11208
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1533
192 Power-Off_Retract_Count 0x0032   099   099   000    Old_age   Always       -       1533
193 Load_Cycle_Count        0x0012   099   099   000    Old_age   Always       -       1533
194 Temperature_Celsius     0x0002   222   222   000    Old_age   Always       -       27 (Min/Max 10/34)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 88 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss
where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec.
It "wraps" after 49.710 days.

Error 88 occurred at disk power-on lifetime: 11195 hours (466 days + 11 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 05 0b e4 5b 03
Error: UNC at LBA = 0x035be40b = 56353803 
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name      
-- -- -- -- -- -- -- --  ----------------  --------------------    
60 08 30 08 e4 5b 40 00  13d+15:30:26.073  READ FPDMA QUEUED
60 08 28 00 e4 5b 40 00  13d+15:30:26.072  READ FPDMA QUEUED
60 08 20 f8 e3 5b 40 00  13d+15:30:26.071  READ FPDMA QUEUED
60 08 18 f0 e3 5b 40 00  13d+15:30:26.070  READ FPDMA QUEUED
60 08 10 e8 e3 5b 40 00  13d+15:30:26.069  READ FPDMA QUEUED 

Error 87 occurred at disk power-on lifetime: 11194 hours (466 days + 10 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 05 0b e4 5b 03
Error: UNC at LBA = 0x035be40b = 56353803
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name   
-- -- -- -- -- -- -- --  ----------------  --------------------     
60 08 c0 08 e4 5b 40 00  13d+09:23:07.021  READ FPDMA QUEUED
60 08 b8 00 e4 5b 40 00  13d+09:23:07.019  READ FPDMA QUEUED
60 08 b0 f8 e3 5b 40 00  13d+09:23:07.018  READ FPDMA QUEUED
60 08 a8 f0 e3 5b 40 00  13d+09:23:07.017  READ FPDMA QUEUED
60 08 a0 e8 e3 5b 40 00  13d+09:23:07.015  READ FPDMA QUEUED

Error 86 occurred at disk power-on lifetime: 11189 hours (466 days + 5 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 05 0b e4 5b 03
Error: UNC at LBA = 0x035be40b = 56353803
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name   
-- -- -- -- -- -- -- --  ----------------  --------------------      
60 08 c0 08 e4 5b 40 00  12d+03:26:04.624  READ FPDMA QUEUED
60 08 b8 00 e4 5b 40 00  12d+03:26:04.624  READ FPDMA QUEUED
60 08 b0 f8 e3 5b 40 00  12d+03:26:04.623  READ FPDMA QUEUED
60 08 a8 f0 e3 5b 40 00  12d+03:26:04.622  READ FPDMA QUEUED
60 08 a0 e8 e3 5b 40 00  12d+03:26:04.621  READ FPDMA QUEUED

Error 85 occurred at disk power-on lifetime: 10943 hours (455 days + 23 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 0d 0b e4 5b 03
Error: UNC at LBA = 0x035be40b = 56353803
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name   
-- -- -- -- -- -- -- --  ----------------  --------------------   
60 20 18 00 58 44 40 00  48d+23:49:04.673  READ FPDMA QUEUED
60 08 10 30 e5 5b 40 00  48d+23:49:04.673  READ FPDMA QUEUED
60 10 08 08 e4 5b 40 00  48d+23:49:04.673  READ FPDMA QUEUED
ef 02 00 00 00 00 00 00  48d+23:49:04.671  SET FEATURES [Enable write cache]
60 20 18 00 58 44 a0 ff  48d+23:49:04.554  READ FPDMA QUEUED 

Error 84 occurred at disk power-on lifetime: 10943 hours (455 days + 23 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 0d 0b e4 5b 03
Error: UNC at LBA = 0x035be40b = 56353803
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name   
-- -- -- -- -- -- -- --  ----------------  --------------------     
60 20 18 00 58 44 40 00  48d+23:48:46.106  READ FPDMA QUEUED
60 08 10 30 e5 5b 40 00  48d+23:48:46.106  READ FPDMA QUEUED
60 10 08 08 e4 5b 40 00  48d+23:48:46.105  READ FPDMA QUEUED
ef 02 00 00 00 00 00 00  48d+23:48:46.104  SET FEATURES [Enable write cache]
60 20 18 00 58 44 a0 ff  48d+23:48:45.990  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     11208         -
# 2  Short offline       Completed without error       00%     11206         -
# 3  Short offline       Completed without error       00%     11206         -
# 4  Vendor (0x50)       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk. 
If Selective self-test is pending on power-up, resume after 0 minute delay.

After all this, I tried another dd run with the default bs value, just to be sure:

sudo dd if=/dev/zero of=/dev/zero status=progress

Currently, dd has written 78 GB without any error and it goes on...

Do you have any idea of what happened?

Thank you :)

  • This link can help you analyze the problem, and if you are lucky, revive it. - By the way, what kind of drive is it and how big is it (size in GB)? Was it writing to the end of the drive, or did the error occur well before it reached the end? If it is an HDD or SSD, you should also check the S.M.A.R.T. information. It will tell you the status of the drive (whether it is good or bad). – sudodus Nov 24 '19 at 19:18
  • If I understand correctly, the second run with dd was successful and the third run has passed the point where the first run failed. It seems to me that there was some temporary error, maybe a bad electrical connection or some disturbing event (electrical or mechanical shock). -- It is difficult to find an error when you cannot identify the conditions that make it happen again. – sudodus Nov 28 '19 at 08:08
  • Yes @sudodus, you are correct. Moreover, the third "dd" run has now fully completed! – Frankie Nov 28 '19 at 10:43
  • Just a comment on this for anyone reading it later -- the dd command can be very dangerous if you're not careful! And the text commands given here are not correct, you would never do if=/dev/zero of=/dev/zero; commands in the images make more sense; probably the drive remapped the bad sector eventually, but didn't record it in the Reallocated_Sector_Ct, which is annoying; Is the drive still functioning, Frankie? – SpinUp __ A Davis May 09 '21 at 17:49

1 Answer

I have encountered this problem with a bad block as well. The drive likely has a sector size of 512 bytes, but a block size of 4096 bytes, which is the minimum size available to disk I/O (see e.g. the output of blockdev --report). dd is fine with writing to healthy areas of the drive using the default size (512), but requires writing a whole block if there is a problem (bs=4096).

This answer provides the reasoning: to write only one sector of a block, the whole block must be read first, then the sector data will be changed and the block written back. In our case, the read fails, so that whole operation fails with bs=512. However, with bs=4096, no reading is necessary, since the whole block will be replaced; therefore, the operation succeeds.
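To make that concrete, here is a rough sketch of the arithmetic for the LBA reported in the question's SMART error log (56353803). It assumes 512-byte sectors and 4096-byte blocks, as described above; the helper name `lba_to_block` and the device name `/dev/sdX` are made up for illustration, and the dd line is left commented out because it destroys data.

```shell
#!/bin/sh
# lba_to_block: hypothetical helper that maps a 512-byte LBA to the index of
# the 4096-byte block containing it (assumes 4096 / 512 = 8 sectors per block).
lba_to_block() {
    echo $(( $1 / 8 ))
}

lba=56353803                      # LBA from the "UNC at LBA" lines in the error log
block=$(lba_to_block "$lba")
echo "LBA $lba lies in 4096-byte block $block"

# Overwriting only that block would look roughly like this (DESTRUCTIVE; /dev/sdX
# is a placeholder, double-check the device before running anything like it):
#   sudo dd if=/dev/zero of=/dev/sdX bs=4096 seek=$block count=1 oflag=direct
```

Because seek= counts in units of bs, the block index can be passed to dd directly; with bs=512 the same write would target a single sector and could trigger the failing read described above.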

The idea of using dd to write zeros to the drive is to force the drive firmware to confirm the health of every sector and remap any bad sectors to the reserved area. Once the bad block was successfully written to (then either confirmed healthy by the drive or remapped), the badblocks tool (which is a higher-level tool) was happy, because the bad block was hidden from it, and the smartctl tests passed as well.
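One way to check whether the drive really remapped anything is to compare the relevant SMART counters before and after the write. The following is only a sketch: `remap_counters` is a made-up helper, and it assumes the 10-column attribute layout shown in the question's `smartctl` report.

```shell
#!/bin/sh
# remap_counters: hypothetical helper, not part of smartmontools.
# Reads `smartctl -A`-style attribute rows on stdin and prints NAME=RAW_VALUE
# for the attributes that track remapped or suspect sectors
# (IDs 5, 196, 197 and 198). Assumes RAW_VALUE is the 10th column.
remap_counters() {
    awk '$1 == 5 || $1 == 196 || $1 == 197 || $1 == 198 { print $2 "=" $10 }'
}

# Typical use (needs root and a real device, so it is left commented out):
#   sudo smartctl -A /dev/sda | remap_counters
```

If all of these stay at 0 after the overwrite, as they do in the question's report, the drive either judged the sector healthy once it was rewritten or did not record the remap.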