Setup SMART Reporting via email

Discussion in 'Configuration' started by joeschmuck, Mar 3, 2012.

  1. Offline

    joeschmuck Old Man

    Member Since:
    May 28, 2011
    Messages:
    2,619
    Message Count:
    2,619
    Likes Received:
    86
    Trophy Points:
    48
    Occupation:
    Electrical Engineer, Data Analysis, and Management
    Location:
    Virginia
    joeschmuck, Mar 3, 2012

    This is a simple way to have SMART monitoring email you the status of your hard drives every day.

    The purpose is to have all your SMART-enabled drives report how they are doing daily. I wanted to see how many times the drives spun up and down, and to watch for flaky transmission errors, because I once had a bad SATA cable that was detected through SMART.

    This script does not persist across FreeNAS upgrades and must be placed back on the boot drive after each one. You can run it from one of your hard drives instead, but that will force the drives to spin up each time, which you may not want.

    This is a very simple script and implementing it will take very little time.

    First the basic instructions:
    NOTE: Do not type the single quotes, they are simply around the text to type. And it is assumed you already have FreeNAS email all set up. Sendmail will fail if it is not set up.

    1) SSH or use the console and log in as root/SU.

    2) Type 'mount -wu /'

    3) Type 'cd /conf/base/etc'

    4) Type 'ee esmart.sh'

    5) Cut and paste the script below into the editor.

    Here is the simple script:
    Code (text):
    #!/usr/local/bin/sh
    #
    # Place this in /conf/base/etc/
    # Call: sh esmart.sh /dev/ada0
    switch1=$1
    (
    echo "To: YourEmail@Address.net"
    echo "Subject: SMART Drive Results for ${switch1}"
    echo " "
    ) > /var/cover
    smartctl -i -H -A -n standby -l error "${switch1}" >> /var/cover
    sendmail -t < /var/cover
    exit 0

    # Options used above:
    # -n standby = Will not let the drive spin up if it's not currently spinning. This means that no data will be present if the drive is not running because smartctl exits out with an error condition. This is nice for those folks who like to use HDD Standby in FreeNAS.
    # -i = Device Info (Does not force a drive spinup)
    # -H = Device Health (Forces spinup)
    # -A = Only Vendor specific SMART attributes (Forces spinup)
    # -l error = SMART Error Log (Forces spinup)
    Here is the more complex script. It is not explained line by line like the basic script is further down, but you should be able to follow it. With the previous script, if the drive is in standby you get a report that doesn't tell you much because the drive is not running. This script polls the drive about once a minute until it is out of standby and then generates the report. It also cleans up the report somewhat and, more importantly, it can be run on all the drives at the same time, whereas with the previous script you could only run a CRON job on one drive, wait a minute, and then run another CRON job. This is by far the better script of the two.
    Code (text):
    #!/usr/local/bin/sh
    #
    # Place this in /conf/base/etc/
    # Call: sh esmart.sh /dev/ada0
    # switch1 is the drive to check (passed parameter)
    switch1=$1

    # This will use the characters after "/dev/" for the temp file names.
    # Example: /dev/ada0 becomes coverada0 or cover0ada0
    # This needs to be done to keep multiple jobs from using the same files.
    drv=`echo "$switch1" | cut -c6-`

    # Variable just so we can add a note that the drive was asleep when the
    # application started but is now awake.
    c=0

    # Process to run our check on the drive
    chkdrive()
    {
    smartctl -H -n standby -l error "${switch1}" > "/var/cover0${drv}"
    }

    (
    echo "To: youremail@address.net"
    echo "Subject: SMART Drive Results for ${switch1}"
    echo " "
    ) > "/var/cover${drv}"
    chkdrive
    # Loop only while smartctl reports standby (exit status 2 with -n standby).
    # Looping on any non-zero status would never end for a drive with logged errors.
    while [ $? -eq 2 ]
    do
      # Pause the checking of the drive to about once a minute if the drive is not running.
      sleep 59
      c=1
      chkdrive
    done

    if [ $c -eq 1 ]
    then
      echo "THE DRIVE WAS ASLEEP AND JUST WOKE UP" >> "/var/cover${drv}"
    fi

    # Remove the four automatic branding lines and append the rest to the email
    sed -e '1,4d' "/var/cover0${drv}" >> "/var/cover${drv}"
    sendmail -t < "/var/cover${drv}"

    # Cleanup our trash
    rm -f "/var/cover${drv}" "/var/cover0${drv}"
    exit 0

    # Options used above:
    # -n standby = Do not spin up a drive that is in standby (remove this to force a spinup)
    # -i = Device Info
    # -H = Device Health
    # -A = Only Vendor specific SMART attributes
    # -l error = SMART Error Log
    6) Edit youremail@address.net to the email address where you want the report sent.

    7) You can edit more of the script if you like, but you can stop here. Let's save the script: press Escape and save it.

    8) Test run the script by typing 'sh esmart.sh /dev/ada0'. NOTE: /dev/ada0 needs to be changed to your drive path. Depending on the drive adapter it could be different.

    9) Wait a few seconds and check your email.

    10) If it worked, let's copy it to /etc so you can run it now without having to reboot. Type:

    Code (text):
    cd /etc
    cp /conf/base/etc/esmart.sh .
    11) Type 'mount -r /'
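
    The more complex script builds its temp-file names by cutting the first five characters ("/dev/") off the device path. A minimal sketch of the same idea using POSIX parameter expansion, which strips the prefix only when it is actually there. The function name drive_suffix is hypothetical and not part of the original script:

```sh
# drive_suffix PATH: print the device name without a leading "/dev/".
# Hypothetical helper for illustration; the original script uses `cut -c6-`.
drive_suffix() {
    printf '%s\n' "${1#/dev/}"
}

# Example: build the per-drive temp file name used for the email header.
drv=$(drive_suffix /dev/ada0)
echo "/var/cover${drv}"
```

    Either form gives each drive its own set of files, which is what lets several cron jobs run at the same time.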

    Now let me explain this script so you can make changes to it if you like.

    The first few lines define this as a script and tell you how to call it.
    Next we assign the variable switch1 for use in the commands.
    Next is the code within the parentheses, which defines the email header and creates the file cover in the RAM-based area /var, so the hard drives are not accessed.

    Code (text):
    #!/usr/local/bin/sh
    #
    # Place this in /conf/base/etc/
    # Call: sh esmart.sh /dev/ada0
    # $1 is the command line variable /dev/ada0 in this example
    switch1=$1
    (
    echo "To: YourEmail@Address.net"
    echo "Subject: SMART Drive Results for ${switch1}"
    echo " "
    ) > /var/cover
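
    The header block above can also be written as a small function, which makes it easy to see what ends up in the file before sendmail reads it. This is only a sketch: make_header is a hypothetical name, and it ends the headers with a truly empty line (the conventional separator between mail headers and body) rather than the single space the script echoes.

```sh
# make_header TO SUBJECT: print a minimal mail header block.
# Hypothetical helper; the original script uses a ( ... ) group of echo lines.
make_header() {
    printf 'To: %s\n' "$1"
    printf 'Subject: %s\n' "$2"
    # An empty line separates the headers from the body.
    printf '\n'
}

# Example: the header the basic script writes for /dev/ada0.
make_header "YourEmail@Address.net" "SMART Drive Results for /dev/ada0"
```

    Redirecting the function's output to /var/cover gives the same starting file as the parenthesized block.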

    The key line in the script is:
    Code (text):
    smartctl -i -H -A -n standby -l error "${switch1}" >> /var/cover
    This does all the real work: it collects the data on the drive specified in switch1 and adds the results to the file cover. If the drive is not spinning, you get an email stating that the drive is in standby and that smartctl exited, with no data. This is fine for most people who want to minimize spinning up their hard drives, but if you really want the data every time you run this script, remove the '-n standby' portion.

    This section takes the file we created called cover and sends it via the sendmail application.
    Code (text):
    sendmail -t < /var/cover
    exit 0
    Now we will create a CRON job so this script runs once a day; you can choose how often you want it to run.
    12) Open FreeNAS GUI

    13) On left side window click on System, Cron Jobs, Add Cron Job

    14) Use the following settings: (You may set the time intervals to whatever you desire)
    User: root
    Command: sh /etc/esmart.sh /dev/ada0
    Description: ada0 SMART Results
    Minute: Each selected minute: 01
    Hour: Each selected hour: 01 (checking at 1 AM)
    Day of month: Every N day of month: 1
    Leave the rest at the default of all checked and click OK.

    15) Add additional Cron Jobs for each additional drive you have.

    16) Sit back and watch the reports come in.
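
    For readers who think in crontab terms, the GUI settings above correspond to a line with minute 1, hour 1 and every day. The sketch below only prints such lines, one per drive and one minute apart (the spacing the single-file script needs); it is an illustration of the schedule, not something to install, since FreeNAS creates the jobs from the GUI. The name cron_lines is hypothetical.

```sh
# cron_lines DRIVE...: print one crontab-style line per drive, starting at
# 01:01 and moving each later drive one minute on.
# Fields are: minute hour day-of-month month day-of-week command.
# Illustration only; on FreeNAS the jobs are created in the GUI.
cron_lines() {
    minute=1
    for dev in "$@"; do
        printf '%d 1 * * * sh /etc/esmart.sh %s\n' "$minute" "$dev"
        minute=$((minute + 1))
    done
}

cron_lines /dev/ada0 /dev/ada1
```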

    If you would rather just have one command to check all the drives, here is a script to check 4 drives and you can modify it as you see fit.

    Code (text):
    #!/usr/local/bin/sh
    #
    # Place this in /conf/base/etc/
    # Call: sh esmart.sh
    (
    echo "To: YourEmail@Address.net"
    echo "Subject: SMART Drive Results for all drives"
    echo " "
    ) > /var/cover
    smartctl -i -H -A -n standby -l error /dev/ada0 >> /var/cover
    smartctl -i -H -A -n standby -l error /dev/ada1 >> /var/cover
    smartctl -i -H -A -n standby -l error /dev/ada2 >> /var/cover
    smartctl -i -H -A -n standby -l error /dev/ada3 >> /var/cover
    sendmail -t < /var/cover
    exit 0

    # Options used above:
    # -n standby = Do not spin up a drive that is in standby
    # -i = Device Info
    # -H = Device Health
    # -A = Only Vendor specific SMART attributes
    # -l error = SMART Error Log
    Again, the '-n standby' will cause an issue at the point where a drive not spinning is encountered. Since I have a single pool of 4 drives, should my first drive exit due to not spinning, I can safely assume my other drives are not spinning either since I have the same HDD Standby (in FreeNAS GUI) settings for each.
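
    A sketch of one way to make a sleeping drive visible in the combined report: loop over the drives and label any that smartctl declined to wake. It assumes that exit status 2 is what smartctl returns for '-n standby' (the value the later script in this thread checks for). report_drives and the SMARTCTL override are hypothetical names added for illustration.

```sh
# Command used to query drives; overridable so the loop can be tried
# without real hardware.
SMARTCTL=${SMARTCTL:-smartctl}

# report_drives DRIVE...: print a report section per drive, noting drives
# that smartctl declined to wake up.
report_drives() {
    for dev in "$@"; do
        out=$("$SMARTCTL" -i -H -A -n standby -l error "$dev")
        rc=$?
        echo "=== $dev ==="
        if [ "$rc" -eq 2 ]; then
            # Assumption: 2 is smartctl's exit status for -n standby.
            echo "Drive is in standby; no data collected."
        else
            echo "$out"
        fi
        echo
    done
}
```

    In the script above, the four smartctl lines would become something like: report_drives /dev/ada0 /dev/ada1 /dev/ada2 /dev/ada3 >> /var/cover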

    -Mark
  2. Offline

    calgarychris Newbie

    Member Since:
    Aug 27, 2011
    Messages:
    91
    Message Count:
    91
    Likes Received:
    2
    Trophy Points:
    8
    calgarychris, Mar 14, 2012

    I haven't had a chance to implement this, but just wanted to say "Thank you" - this has to be one of the clearest, best written "how-to's". I appreciate you putting the time to document this clearly, including having a purpose at the top and step by step explanations of what it's doing. One of the things I struggle with as a newb is not knowing what I don't know.

    I will give this a whirl!

    Cheers
    Chris
  3. Offline

    joeschmuck, Mar 15, 2012

    Thanks. I try to write them the way I would like to see them, and most of my guides are in this type of format. Just make sure you have email set up in FreeNAS. If you are getting daily status reports already then you are all set.
  4. Offline

    calgarychris, Mar 15, 2012

    Ha, I'll be doing a search to see what other guides you've written! No dice with this one - I thought I'd set up email - I get daily security and status reports, but this one never came through. I'm not sure I've got email set up properly (on Settings - Email), as it didn't seem to ever like my smtp authentication username/pw but suddenly I started getting reports, so I figure it's working properly :p Anyway, when I run the script, it pauses like it's running in the background, but nothing comes through. I wasn't sure if I needed to chmod u+x on this one (I was working on another script that required that) - I did that but it didn't seem to help...

    Any ideas?

    Thanks
    Chris
  5. Offline

    joeschmuck, Mar 15, 2012

    The pause is normal, it's sendmail doing its thing.

    So I suspect your email is not set up properly. Here are some tips:

    Setup the root email address: (Note: This has nothing to do with the script but it should be correct so you get root emails like the two daily reports)
    1) In FreeNAS, Account Tab, Users
    2) Locate root, E-Mail, Change E-mail, and ensure it points to the address you want to send emails to.

    Setup the SMTP:
    1) If you have filled this page out then click on Send Test Email (at bottom of the page). See if you get the test email.

    What goes in each field:
    1) On FreeNAS, Settings Tab, Email
    2) From email: root@freenas.local (This really doesn't matter but it should be something)
    3) Outgoing mail server: Depends on your outgoing mail service. What are you using? Gmail, Verizon, Hotmail, etc.?
    4) Port to connect to: Normally 25, however Hotmail and Verizon are different, and I think Gmail might be different as well. It's a SPAM prevention measure.
    5) Use SMTP Authentication: Normally checked, as you normally have to enter a password.
    6) Username and password will need to be set.

    If you copied the script from above then you only need to edit one line, your email address to where it will be sent and that will get you an email, even if it doesn't contain the proper information. Of course you need to adjust the drive letters as needed.

    One last thing, the mail server that my email was going to flagged my first testing as SPAM only because I was sending a lot of emails in a very short duration. I was able to white list it so that didn't remain a problem. I don't think that is your problem since you didn't even get one email.

    Let me know how it goes.
  6. Offline

    survive Super Moderator

    Member Since:
    May 28, 2011
    Messages:
    830
    Message Count:
    830
    Likes Received:
    20
    Trophy Points:
    18
    Occupation:
    Senior Systems Engineer
    Location:
    Missouri, USA
    survive, Mar 15, 2012

    Hi joeschmuck,

    Great write-up, you rarely see something so efficient.

    Just so I'm clear, by default the SMART service & scheduled tests won't send email? Even if they detect errors or warnings? If so, what's the point of having it at all (I know you aren't a developer, but maybe you have some insight into what's going on)?

    -Will
  7. Offline

    survive, Mar 15, 2012

    Hi joeschmuck,

    I have a request for a modification to your script....

    I'm looking at the second script that tests all the drives and I'm wondering if it would be possible to modify it so it just cuts the "SMART overall-health self-assessment test result: " line out of each drive's results and sends those in the email? Maybe have the device tested added to each result so it looks like this:

    /dev/da0 SMART overall-health self-assessment test result: PASSED
    /dev/da1 SMART overall-health self-assessment test result: PASSED
    .....
    /dev/da6 SMART overall-health self-assessment test result: PASSED
    /dev/da7 SMART overall-health self-assessment test result: PASSED

    -Will
  8. Offline

    joeschmuck, Mar 16, 2012

    @survive,
    I'm sure there is a way, probably a very easy one, but I'm not well versed in shell scripting. I also wanted to do that, so this gives me the opportunity to make this change.

    For now, the closest I can come is to use only the -H parameter in the smartctl lines.

    I will post an update once I figure out how to make things better.
  9. Offline

    joeschmuck, Mar 16, 2012

    The SMART tests "should" send an email only if they detect a failure. In the earlier versions of FreeNAS this didn't work. I am under the impression that it does work now, but how do you test something like that unless you have a real failing drive?

    I wrote the script because there are some values that I wanted to see (Startup Count, Temperature, and UDMA CRC Error Count) so I could check for problems myself, and because after a self-test you never know the results unless you look at them. "No news is good news" doesn't sit well with me.

    Hope that helps some.

    -Mark
  10. Offline

    joeschmuck, Mar 16, 2012

    @ survive,
    Here is some crude coding that gets the job done and it's something simple to understand. Just update the code to reflect the drives you have.

    Code (text):
    #!/usr/local/bin/sh
    # Call: sh esmart.sh
    (
    echo "To: YourEmail@Address.net"
    echo "Subject: SMART Drive Results for all drives"
    echo " "
    ) > /var/cover0
    # For each drive: print its name, then the smartctl -H output with the
    # four banner lines removed.
    echo "/dev/ada0 " >> /var/cover0
    smartctl -H /dev/ada0 | sed -e '1,4d' >> /var/cover0
    echo "/dev/ada1 " >> /var/cover0
    smartctl -H /dev/ada1 | sed -e '1,4d' >> /var/cover0
    echo "/dev/ada2 " >> /var/cover0
    smartctl -H /dev/ada2 | sed -e '1,4d' >> /var/cover0
    echo "/dev/ada3 " >> /var/cover0
    smartctl -H /dev/ada3 | sed -e '1,4d' >> /var/cover0
    sendmail -t < /var/cover0
    exit 0
    The output is:
    Code (text):
    /dev/ada0
    SMART overall-health self-assessment test result: PASSED

    /dev/ada1
    SMART overall-health self-assessment test result: PASSED

    /dev/ada2
    SMART overall-health self-assessment test result: PASSED

    /dev/ada3
    SMART overall-health self-assessment test result: PASSED
    Again, quick and dirty, gets the job done. I'm sure there is a better way, you could make it a function and call it as well. I just wanted to get something out there for you. Also keep in mind that this will spin up a hard drive. If you don't want that you need to change the smartctl line to include the "-n standby" parameter.
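
    As a sketch of the "make it a function" idea, and of the one-line-per-drive format requested earlier in the thread, the filter below keeps only the overall-health line and puts the device name in front of it. health_line is a hypothetical name; the sample input is typed in by hand so the example runs without a drive.

```sh
# health_line DEVICE: read `smartctl -H` output on stdin and print only the
# overall-health line, prefixed with the device name.
# Hypothetical helper; not part of the original script.
health_line() {
    sed -n "s|^SMART overall-health self-assessment test result: .*|$1 &|p"
}

# Example with hand-typed input standing in for real smartctl output.
printf '%s\n' \
    '=== START OF READ SMART DATA SECTION ===' \
    'SMART overall-health self-assessment test result: PASSED' \
    '' | health_line /dev/ada0

# Real use would look like:
#   for dev in /dev/ada0 /dev/ada1; do
#       smartctl -H "$dev" | health_line "$dev" >> /var/cover0
#   done
```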

    Please let me know if this works for you.

    -Mark
  11. Offline

    calgarychris, Mar 19, 2012

    Still no luck for me - I somehow managed to stop my daily security reports too! It comes up with the message "Your email could not be sent. [Errno8], hostname nor servname provided or not known."

    Weird, as I have double checked the info on gmail itself. There is a guide here (http://www.sw33tcode.com/?p=7) but it says "To fix this I appended google’s dns server to my /etc/resolv.conf" and refers to nameserver 8.8.8.8, which I'm not familiar with. I don't know what google's dns server is, or whether I should be editing resolv.conf. Any thoughts on this?

    Thanks
    Chris
  12. Offline

    joeschmuck, Apr 28, 2012

    Updated Code - Runs SMART test and sends results.

    UPDATE: This code runs the SMART Short test and after 5 minutes it will email you the results. In the subject line it will say PASSED or PROBLEM.

    Notes: If you change this to run the long test then you must change the wait time appropriately. For my Samsung 2TB drives the wait should be a minimum of 255 minutes according to the drive. I took the sleep timer and set it to 5 hours for a long test.
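
    Instead of editing the sleep value by hand, the wait could be looked up from the test type. A minimal sketch using the two figures quoted in this post (5 minutes for the short test, 255 minutes for the long test on the author's Samsung drives); wait_seconds is a hypothetical name and the numbers should be adjusted to what your own drives report.

```sh
# wait_seconds TYPE: print how long to sleep after starting a self-test.
# 300 s = 5 minutes (short test); 15300 s = 255 minutes (long test on the
# author's drives). Other drives will need other values.
wait_seconds() {
    case "$1" in
        short) echo 300 ;;
        long)  echo 15300 ;;
        *)     echo "unknown test type: $1" >&2; return 1 ;;
    esac
}

# Real use would look like:
#   smartctl -t long /dev/ada0 && sleep "$(wait_seconds long)"
wait_seconds short
```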

    I recommend creating two versions of this, a short and long test version then you can simply call the version you want to run. Here is the short test version.

    Call this script as follows: sh /etc/esmart.sh drive
    example: sh /etc/esmart.sh /dev/ada0

    Code (text):
    #!/usr/local/bin/sh
    #
    # Place this in /conf/base/etc/
    # Call: sh esmart.sh /dev/ada0
    # switch1 is the drive to check (passed parameter)
    switch1=$1

    # This will use the characters after "/dev/" for the temp file names.
    # Example: /dev/ada0 becomes coverada0 or cover0ada0 or cover1ada0
    # This needs to be done to keep multiple jobs from using the same files.
    drv=`echo "$switch1" | cut -c6-`

    # Variable just so we can add a note that the drive was asleep when the
    # application started but is now awake.
    c=0

    ### Run SMART Quick Test
    runsmartshort()
    {
    ### If changing to the long SMART test, swap the hash marks on the three pairs of lines below.
    ### You may edit the sleep to whatever your drive recommends for the test to finish.
    smartctl -t short "${switch1}"
    # smartctl -t long "${switch1}"
    echo "Short Test Running, waiting 5 minutes for test to finish."
    # echo "Long Test Running, waiting 255 minutes for test to finish."
    sleep 300
    # sleep 15300
    }

    ### Process to run our check on the drive: error log plus self-test log.
    # Output cover0
    chkdrive()
    {
    smartctl -n standby -l error -l selftest "${switch1}" > "/var/cover0${drv}"
    }

    ### Process to create the email header
    # Input cover1, output cover.
    # The first line of cover1 (normally "No Errors Logged") finishes the Subject line.
    makeheader()
    {
    (
    echo "To: youremail@address.net"
    printf "Subject: SMART Drive Results for %s - " "${switch1}" ; cat "/var/cover1${drv}"
    echo " "
    ) > "/var/cover${drv}"
    }

    ### Process to create the email header for failure
    # Input none, output cover.
    makeheaderfailure()
    {
    (
    echo "To: youremail@address.net"
    # The trailing \n ends the Subject line so the body does not run into it.
    printf "Subject: SMART Drive Results for %s - PROBLEM\n" "${switch1}"
    echo " "
    ) > "/var/cover${drv}"
    }

    ### Process for normal results
    # Input is cover0, output is cover1
    procnormal()
    {
    ### Delete lines 1 through 5 leaving the status returned, cover0 cannot be changed here.
    sed '1,5d' "/var/cover0${drv}" > "/var/cover1${drv}"

    ### If the drive was asleep we can add a line so the user knows it was sleeping
    if [ $c -eq 1 ]
    then
    (
    echo " "
    date
    printf "The drive was sleeping and just woke up."
    echo " "
    ) >> "/var/cover1${drv}"
    fi
    }

    # Process to cleanup our trash files
    cleanup()
    {
    rm -f "/var/cover${drv}" "/var/cover0${drv}" "/var/cover1${drv}"
    }

    ### Lets test the drive
    runsmartshort

    ### Lets call chkdrive, output is cover0
    chkdrive
    ### If chkdrive returns a value 2 for sleeping then loop
    while [ $? -eq "2" ]
    do
      ### Pause the checking of the drive to about once a minute if the drive is not running.
      ### This can be changed to more or less frequent, it's a personal choice.
      sleep 59
      c=1
      chkdrive
    done

    ### A non-zero status here is treated as an error. Note that $? is the status
    ### of the while loop above, which is 0 when the loop body never ran.
    if [ $? -ne "0" ]
    then
      makeheaderfailure
      ### Send the failure header followed by the raw smartctl output
      cat "/var/cover${drv}" "/var/cover0${drv}" > "/var/cover1${drv}"
    else
      procnormal
      makeheader
      ### Chop off all but the most recent 5 test results
      sed '11,40d' "/var/cover${drv}" > "/var/cover1${drv}"
    fi

    sendmail -t < "/var/cover1${drv}"

    ### Call Cleanup Process
    cleanup
    exit 0
    Example of the output. I have it set to give you the last 5 tests rather than all tests.
    Code (text):
    From: root@freenas.local
    To: youremail@address.net
    Subject: SMART Drive Results for /dev/ada0 - No Errors Logged

    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed without error       00%      2646         -
    # 2  Extended offline    Completed without error       00%      2400         -
    # 3  Short offline       Completed without error       00%      2168         -
    # 4  Extended offline    Completed without error       00%      2002         -
    # 5  Extended offline    Completed without error       00%      1936         -
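
    The script trims the log by deleting fixed line numbers. A sketch of an alternative that counts self-test entries instead, so it keeps working if the number of header lines changes; it relies on each entry starting with "#" as in the output above, and on the most recent test being listed first. last_tests is a hypothetical name.

```sh
# last_tests N: copy stdin to stdout, keeping only the first N self-test
# entries (lines starting with "#"); all other lines pass through.
last_tests() {
    awk -v n="$1" '/^#/ { if (++seen > n) next } { print }'
}

# Real use would look like:
#   smartctl -l selftest /dev/ada0 | last_tests 5
```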
  13. Offline

    Wolfeman0101 FreeNAS Aware

    Member Since:
    Jun 14, 2012
    Messages:
    197
    Message Count:
    197
    Likes Received:
    0
    Trophy Points:
    16
    Wolfeman0101, Jul 2, 2012

    Thanks a lot for this script!
  14. Offline

    joeschmuck, Jul 2, 2012

    No problem, hope it works out for you.
  15. Offline

    Wolfeman0101, Jul 3, 2012

    I'm having 1 issue. I placed the esmart.sh in the /conf/base/etc/ folder but when I try to run it using "sh /etc/esmart.sh /dev/ada1" I get an error:

    Code (text):
    [root@freenas] /conf/base/etc# sh /etc/esmart.sh /dev/ada1
    /etc/esmart.sh: Can't open /etc/esmart.sh: No such file or directory
    [root@freenas] /conf/base/etc#
  16. Offline

    joeschmuck, Jul 3, 2012

    Reboot the NAS. Once you have rebooted it, the esmart.sh file is copied into /etc/ for use. The file needs to be placed into /conf/base/etc because /etc/ lives in RAM and is wiped by a system reboot or power off.
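
    As an alternative to rebooting, step 10 of the guide copies the file into /etc by hand. A small sketch of that copy which does nothing when the file is already there; install_if_missing is a hypothetical name and the paths in the comment are the ones used in this thread.

```sh
# install_if_missing SRC DST: copy SRC to DST only when DST does not exist.
# Hypothetical helper mirroring step 10 of the guide.
install_if_missing() {
    if [ -f "$2" ]; then
        echo "$2 already present"
    else
        cp "$1" "$2" && echo "copied $1 to $2"
    fi
}

# On the NAS this would be run as root:
#   install_if_missing /conf/base/etc/esmart.sh /etc/esmart.sh
```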

    Hope that helps. If you already rebooted and it still doesn't work, let me know.
  17. Offline

    Wolfeman0101, Jul 6, 2012

    The reboot worked and my tests are running nightly. 2 of the tests output as expected but 2 are giving me a weird email. I get nothing in the body, and this as the subject:
    Code (text):
    SMART Drive Results for /dev/ada3 - ATA Error Count: 1 CR = Command Register [HEX] FR = Features Register [HEX] SC = Sector Count Register [HEX] SN = Sector Number Register [HEX] CL = Cylinder Low Register [HEX] CH = Cylinder High Register [HEX] DH = Device/Head Register [HEX] DC = Device Command Register [HEX]
  18. Offline

    joeschmuck, Jul 6, 2012

    Well I have a few questions...

    Since the code is working for 2 drives, the code itself is working.

    I'm thinking of things which might be different, such as: are they all the same model hard drive? Are they connected to the same controller?

    What you might try is to open a shell as root and enter 'smartctl -t short /dev/ada3' and see what pops up. Run 'smartctl -l selftest /dev/ada3' after 3 to 4 minutes to see if the test finished and what the results are. Maybe the drive does not support SMART testing, or maybe not over that specific interface? There are too many possibilities without knowing what really happens and your hardware configuration. If this works then how long does it take to complete the test? Is 5 minutes too short to wait for a result?
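
    One way to answer the "is 5 minutes too short" question is to read the wait the drive itself announces when the test starts. The sketch below assumes the wording printed by smartctl 5.42, as quoted in this thread ("Please wait 2 minutes for test to complete."); other versions may phrase it differently, in which case nothing is printed. test_wait_minutes is a hypothetical name.

```sh
# test_wait_minutes: read `smartctl -t ...` output on stdin and print the
# number of minutes the drive asks you to wait, if such a line is present.
test_wait_minutes() {
    sed -n 's/^Please wait \([0-9][0-9]*\) minutes for test to complete\.$/\1/p'
}

# Real use would look like:
#   minutes=$(smartctl -t short /dev/ada3 | test_wait_minutes)
```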

    Are you using the script I posted on 4-28-12 (most recent code)? If you made any changes feel free to post them here or PM me if you like.

    Please let me know what you find out.
  19. Offline

    Wolfeman0101, Jul 7, 2012

    No, the 2 that are having the problem are different models but on the same controller.

    Code (text):
    [Derp@freenas] /mnt/Vol1# smartctl -t short /dev/ada2
    smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-RELEASE-p9 amd64] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 2 minutes for test to complete.
    Test will complete after Sat Jul  7 09:48:51 2012

    Use smartctl -X to abort test.
    [Derp@freenas] /mnt/Vol1# smartctl -l selftest /dev/ada2
    smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-RELEASE-p9 amd64] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed without error       00%     17846         -
    # 2  Short offline       Completed without error       00%     17841         -
    # 3  Short offline       Completed without error       00%     17817         -
    # 4  Short offline       Completed without error       00%     17793         -
    # 5  Short offline       Completed without error       00%     17771         -
    # 6  Short offline       Completed without error       00%     17759         -
    # 7  Short offline       Completed without error       00%     11317         -
    # 8  Short offline       Completed without error       00%      5860         -
    # 9  Short offline       Completed without error       00%      1837         -
    #10  Short offline       Completed without error       00%       672         -

    [Derp@freenas] /mnt/Vol1#
    Yes I'm using the 4-28-12 version.
  20. Offline

    joeschmuck, Jul 7, 2012

    So the test is completing fine.

    If you don't mind some troubleshooting: I think you might have 2 drives which have had problems, and the '-l error' is echoing that. The script I wrote would have listed the error in the subject line; I guess I only planned for a simpler fail message. Since I don't have any failed drives, it's not something I could readily test.

    Options:
    Type 'smartctl -n standby -l error -l selftest /dev/ada2' and see what it outputs. I think the '-l error' will possibly bring some problems to light.

    Or

    Type 'smartctl -l error /dev/ada2' and I'll bet it gives error info.


    If that doesn't work try this...
    Change the script at the end where sendmail is located to: (adds the hash disabling the sendmail line and prints the entire email)

    #sendmail -t < /var/cover1${drv}
    cat /var/cover1${drv}

    I hope you don't have two drives failing. If you do then maybe this was a good thing to find out before data loss occurs.
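
    One possible explanation for a report that lands in the subject line instead of producing a PROBLEM message, offered as a hypothesis rather than a confirmed diagnosis: in POSIX shells the exit status of a while loop is zero when its body never runs, so a $? test placed after "done" does not see chkdrive's status for a drive that was already awake. A minimal sketch that stores the status in a variable instead; poll_until_awake and POLL_SECONDS are hypothetical names, and the command passed in stands for chkdrive.

```sh
# poll_until_awake CMD [ARGS...]: run CMD, retry while it exits with 2
# (taken here to mean "drive in standby"), then return CMD's final status.
poll_until_awake() {
    "$@"
    rc=$?
    while [ "$rc" -eq 2 ]; do
        # About once a minute by default, as in the script.
        sleep "${POLL_SECONDS:-59}"
        "$@"
        rc=$?
    done
    return "$rc"
}

# In the script this would replace the chkdrive/while pair:
#   poll_until_awake chkdrive
#   if [ $? -ne 0 ]; then ...failure branch...; fi
```

    With the status kept in rc, a non-zero result from the first chkdrive call reaches the failure branch even when no polling was needed.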
