
Thread: Setup SMART Reporting via email

  1. #11
    Still no luck for me - I somehow managed to stop my daily security reports too! It comes up with the message "Your email could not be sent. [Errno 8] hostname nor servname provided, or not known".

    Weird, as I have double checked the info on gmail itself. There is a guide here (http://www.sw33tcode.com/?p=7) but it says "To fix this I appended google’s dns server to my /etc/resolv.conf" and refers to nameserver 8.8.8.8, which I'm not familiar with. I don't know what Google's DNS server is, or whether I should be editing resolv.conf. Any thoughts on this?

    Thanks
    Chris
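
    For context: 8.8.8.8 is one of Google's public DNS servers, and the "hostname nor servname" message generally means the mail server's name could not be looked up. A minimal sketch for seeing which nameservers a resolv.conf file currently lists; the function name list_nameservers is made up for illustration, and the file argument defaults to /etc/resolv.conf.

    ```shell
    # list_nameservers [FILE]: print the address from each "nameserver" line.
    # FILE defaults to /etc/resolv.conf; the function name is hypothetical.
    list_nameservers()
    {
        awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
    }
    ```

    Running "list_nameservers" with no argument reads the system file. If it prints nothing, no nameserver is configured, which would fit the error above.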

  2. #12
    Senior Member joeschmuck
    Join Date
    May 2011
    Location
    Dark Side of the Moon
    Posts
    1,355

    Updated Code - Runs SMART test and sends results.

    UPDATE: This code runs the SMART short test and, after 5 minutes, emails you the results. The subject line ends with the drive status (for example "No Errors Logged") or PROBLEM.

    Notes: If you change this to run the long test, you must change the wait time to match. For my Samsung 2TB drives the drive reports a minimum wait of 255 minutes, so I set the sleep timer to 5 hours for a long test.
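
    The sleep command takes seconds, so the drive's figure in minutes has to be converted. A small sketch of that arithmetic; the function name wait_seconds and its optional safety margin are hypothetical additions, not part of the script.

    ```shell
    # wait_seconds MINUTES [MARGIN_MINUTES]: seconds to sleep for a self-test.
    # Both the function name and the optional margin are hypothetical.
    wait_seconds()
    {
        echo $(( ($1 + ${2:-0}) * 60 ))
    }
    ```

    With these definitions, "wait_seconds 255" gives 15300 (255 minutes) and "wait_seconds 255 45" gives 18000 (5 hours).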

    I recommend creating two versions of this, one for the short test and one for the long test, so you can simply call the version you want to run. Here is the short test version.

    Call this script as follows: sh /etc/esmart.sh drive
    example: sh /etc/esmart.sh /dev/ada0

    Code:
    #!/bin/sh
    #
    # Place this in /conf/base/etc/
    # Call: sh esmart.sh /dev/ada0
    # switch1 is the drive to check (passed parameter)
    switch1=$1
    
    # This will use the characters after "/dev/" for the temp file names.
    # Example: /dev/ada0 becomes coverada0 or cover0ada0 or cover1ada0
    # This needs to be done to keep multiple jobs from using the same files.
    drv=`echo $switch1 | cut -c6-`
    
    # Variable just so we can add a note that the drive was asleep when the
    # application started but is now awake.
    c=0
    
    ### Run SMART Quick Test
    runsmartshort()
    {
    ### If changing to long SMART test, swap the hash marks from the three lines below.
    ### You may edit the sleep to whatever your drive recommends for the test to finish.
    smartctl -t short ${switch1}
    # smartctl -t long ${switch1}
    echo "Short Test Running, waiting 5 minutes for test to finish."
    # echo "Long Test Running, waiting 255 minutes for test to finish."
    sleep 300
    # sleep 15300
    }
    
    ### Process to run our check on the drive, set up for "-l error" and "-l selftest".
    # Output cover0
    chkdrive()
    {
    smartctl -n standby -l error -l selftest ${switch1} > /var/cover0${drv}
    }
    
    ### Process to create the email header
    # Input cover1, output cover.
    makeheader()
    {
    (
    echo "To: youremail@address.net"
    printf "Subject: SMART Drive Results for ${switch1} - " ; cat /var/cover1${drv}
    echo " "
    ) > /var/cover${drv}
    }
    
    ### Process to create the email header for failure
    # Input none, output cover.
    makeheaderfailure()
    {
    (
    echo "To: youremail@address.net"
    printf "Subject: SMART Drive Results for ${switch1} - PROBLEM\n"
    ### Blank line ends the mail headers
    echo ""
    ) > /var/cover${drv}
    }
    
    ### Process for normal results
    # Input is cover0, output is cover1
    procnormal()
    {
    ### Delete lines 1 through 5 leaving the status returned, cover0 cannot be changed here.
    sed '1,5d' /var/cover0${drv} > /var/cover1${drv}
    
    ### If the drive was asleep we can add a line so the user knows it was sleeping
    if [ $c -eq 1 ]
     then
    (
    echo " "
    date
    printf "The drive was sleeping and just woke up."
    echo " "
    ) >> /var/cover1${drv}
    fi
    }
    
    # Process to cleanup our trash files
    cleanup()
    {
    rm /var/cover${drv}
    rm /var/cover0${drv}
    rm /var/cover1${drv}
    }
    
    ### Lets test the drive
    runsmartshort
    
    ### Lets call chkdrive, output is cover0
    chkdrive
    ### Save the exit status now; $? is overwritten by every later command,
    ### and a while loop that never runs resets it to 0.
    rc=$?
    ### If chkdrive returns a value 2 for sleeping then loop
    while [ "$rc" -eq 2 ]
    do
    ### Pause the checking of the drive to about once a minute if the drive is not running.
    ### This can be changed to more or less frequent, it's a personal choice.
      sleep 59
      c=1
      chkdrive
      rc=$?
    done
    
    ### If chkdrive returns a value other than 0 before or after sleeping, error.
    if [ "$rc" -ne 0 ]
    then
    makeheaderfailure
    ### Put the failure header (cover) in front of the raw smartctl output (cover0)
    cat /var/cover${drv} /var/cover0${drv} > /var/cover1${drv}
    else
    procnormal
    makeheader
    ### Chop off all but the most recent 5 test results
    sed '11,40d' /var/cover${drv} > /var/cover1${drv}
    fi
    
    sendmail -t < /var/cover1${drv}
    
    ### Call Cleanup Process
    cleanup
    exit 0
    Example of the output. I have it set to show only the last 5 tests instead of all tests.
    Code:
    From: root@freenas.local
    To: youremail@address.net
    Subject: SMART Drive Results for /dev/ada0 - No Errors Logged
    
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed without error       00%      2646         -
    # 2  Extended offline    Completed without error       00%      2400         -
    # 3  Short offline       Completed without error       00%      2168         -
    # 4  Extended offline    Completed without error       00%      2002         -
    # 5  Extended offline    Completed without error       00%      1936         -
    FreeNAS 8.3.1-Release-p1 w/MiniDLNA Plugin + MiniDLNA Automatic Scan Fix
    Gigabyte P45T-ES3G | Intel E8500 (3.2GHz) CPU | 16GB DDR3 1066 RAM
    Six WD Red WD20EFRX NAS Hard Drives (RAIDZ2, 7.3TB usable space)
    Adata PD7 USB Flash Drive (4GB) for OS | 1GB Patriot Xporter USB Flash Drive for Scripts & Plugins
    APC Back-UPS Pro BR1000G
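
    The script's checks on $? rely on smartctl's exit status, which is a bitmask rather than a simple pass/fail code. A rough sketch for decoding it, assuming the bit meanings listed in the EXIT STATUS section of the smartctl man page; the function name describe_smartctl_status is hypothetical and the descriptions are shortened paraphrases.

    ```shell
    # describe_smartctl_status STATUS: print one line per bit set in STATUS.
    # The function name is hypothetical; bit meanings are paraphrased from
    # the EXIT STATUS section of the smartctl man page.
    describe_smartctl_status()
    {
        status=$1
        [ "$status" -eq 0 ] && { echo "no bits set"; return 0; }
        bit=0
        for meaning in \
            "command line did not parse" \
            "device open failed or device is in low-power mode" \
            "a SMART or ATA command failed" \
            "SMART status check returned DISK FAILING" \
            "prefail attributes at or below threshold" \
            "attributes were at or below threshold in the past" \
            "error log contains records of errors" \
            "self-test log contains records of errors"
        do
            # Test each bit in turn, lowest bit first
            if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
                echo "bit $bit: $meaning"
            fi
            bit=$(( bit + 1 ))
        done
    }
    ```

    Calling it right after smartctl, for example "describe_smartctl_status $?", shows why a drive can return a non-zero status even though its latest self-test completed without error.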

  3. #13
    Senior Member
    Join Date
    Jun 2012
    Posts
    128
    Thanks a lot for this script!

  4. #14
    Senior Member joeschmuck
    Join Date
    May 2011
    Location
    Dark Side of the Moon
    Posts
    1,355
    No problem, hope it works out for you.

  5. #15
    Senior Member
    Join Date
    Jun 2012
    Posts
    128
    Quote: Originally posted by joeschmuck
    No problem, hope it works out for you.
    I'm having 1 issue. I placed the esmart.sh in the /conf/base/etc/ folder but when I try to run it using "sh /etc/esmart.sh /dev/ada1" I get an error:

    Code:
    [root@freenas] /conf/base/etc# sh /etc/esmart.sh /dev/ada1
    /etc/esmart.sh: Can't open /etc/esmart.sh: No such file or directory
    [root@freenas] /conf/base/etc#

  6. #16
    Senior Member joeschmuck
    Join Date
    May 2011
    Location
    Dark Side of the Moon
    Posts
    1,355
    Reboot the NAS. Once you have rebooted it the esmart.sh file is copied into /etc/ for use. The file needs to be placed into /conf/base/etc because /etc/ lives in RAM and is killed after a system reboot or power off.

    Hope that helps. If you already rebooted and it still doesn't work, let me know.
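
    To see at a glance whether a script has reached both places, something like the following could be used. This is only a sketch: the function name where_is and its argument order are made up, and the two default directories are the ones named in this thread.

    ```shell
    # where_is NAME [PERSISTENT_DIR] [LIVE_DIR]: report where a script exists.
    # Function name and argument order are hypothetical; the default
    # directories are the ones named in this thread.
    where_is()
    {
        name=$1
        persistent=${2:-/conf/base/etc}
        live=${3:-/etc}
        for dir in "$persistent" "$live"
        do
            if [ -f "$dir/$name" ]; then
                echo "$dir/$name: present"
            else
                echo "$dir/$name: missing"
            fi
        done
    }
    ```

    "where_is esmart.sh" reporting the file as present under /conf/base/etc but missing under /etc would match the situation before a reboot.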

  7. #17
    Senior Member
    Join Date
    Jun 2012
    Posts
    128
    Quote: Originally posted by joeschmuck
    Reboot the NAS. Once you have rebooted it the esmart.sh file is copied into /etc/ for use. The file needs to be placed into /conf/base/etc because /etc/ lives in RAM and is killed after a system reboot or power off.

    Hope that helps. If you already rebooted and it still doesn't work, let me know.
    The reboot worked and my tests are running nightly. 2 of the tests output as expected but 2 are giving me a weird email. I get nothing in the body and this as the subject:
    Code:
    SMART Drive Results for /dev/ada3 - ATA Error Count: 1 CR = Command Register [HEX] FR = Features Register [HEX] SC = Sector Count Register [HEX] SN = Sector Number Register [HEX] CL = Cylinder Low Register [HEX] CH = Cylinder High Register [HEX] DH = Device/Head Register [HEX] DC = Device Command Register [HEX]
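
    A likely explanation for the long subject: the script's makeheader writes everything left in the cover1 file straight after "Subject:", and smartctl indents the register legend that follows "ATA Error Count". Mail software generally treats indented lines after a header as a continuation of that header, so they end up in the subject. A minimal sketch that keeps only the first line of the report for the subject; the function name subject_line is hypothetical.

    ```shell
    # subject_line DEVICE FILE: build a one-line Subject header from the
    # first line of FILE. The function name is hypothetical.
    subject_line()
    {
        printf 'Subject: SMART Drive Results for %s - %s\n' "$1" "$(head -n 1 "$2")"
    }
    ```

    The rest of the report would then go into the body after a blank line, rather than being appended to the header.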

  8. #18
    Senior Member joeschmuck
    Join Date
    May 2011
    Location
    Dark Side of the Moon
    Posts
    1,355
    Quote: Originally posted by Wolfeman0101
    The reboot worked and my tests are running nightly. 2 of the tests output as expected but 2 are giving me a weird email. I get nothing in the body and this as the subject:
    Code:
    SMART Drive Results for /dev/ada3 - ATA Error Count: 1 CR = Command Register [HEX] FR = Features Register [HEX] SC = Sector Count Register [HEX] SN = Sector Number Register [HEX] CL = Cylinder Low Register [HEX] CH = Cylinder High Register [HEX] DH = Device/Head Register [HEX] DC = Device Command Register [HEX]
    Well I have a few questions...

    Since the script works for 2 of the drives, the script itself is working.

    I'm thinking of things which might be different: are they all the same model of hard drive? Are they connected to the same controller?

    What you might try is to open a shell as root and enter 'smartctl -t short /dev/ada3' and see what pops up. Run 'smartctl -l selftest /dev/ada3' after 3 to 4 minutes to see if the test finished and what the results are. Maybe the drive does not support SMART testing, or maybe not over that specific interface? There are too many possibilities without knowing what really happens and your hardware configuration. If this works, how long does the test take to complete? Is 5 minutes too short to wait for a result?

    Are you using the script I posted on 4-28-12 (most recent code)? If you made any changes feel free to post them here or PM me if you like.

    Please let me know what you find out.

  9. #19
    Senior Member
    Join Date
    Jun 2012
    Posts
    128
    Quote: Originally posted by joeschmuck
    I'm thinking of things which might be different such as, are they all the same model hard drive? Are they connected to the same controller?
    No, the 2 that are having the problem are different models but on the same controller.

    Quote: Originally posted by joeschmuck
    What you might try is to open a shell as root and enter 'smartctl -t short /dev/ada3' and see what pops up. Run 'smartctl -l selftest /dev/ada3' after 3 to 4 minutes to see if the test finished and what the results are. Maybe the drive does not support SMART testing, or maybe not over that specific interface? There are too many possibilities without knowing what really happens and your hardware configuration. If this works, how long does the test take to complete? Is 5 minutes too short to wait for a result?
    Code:
    [Derp@freenas] /mnt/Vol1# smartctl -t short /dev/ada2
    smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-RELEASE-p9 amd64] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 2 minutes for test to complete.
    Test will complete after Sat Jul  7 09:48:51 2012
    
    Use smartctl -X to abort test.
    [Derp@freenas] /mnt/Vol1# smartctl -l selftest /dev/ada2
    smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-RELEASE-p9 amd64] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed without error       00%     17846         -
    # 2  Short offline       Completed without error       00%     17841         -
    # 3  Short offline       Completed without error       00%     17817         -
    # 4  Short offline       Completed without error       00%     17793         -
    # 5  Short offline       Completed without error       00%     17771         -
    # 6  Short offline       Completed without error       00%     17759         -
    # 7  Short offline       Completed without error       00%     11317         -
    # 8  Short offline       Completed without error       00%      5860         -
    # 9  Short offline       Completed without error       00%      1837         -
    #10  Short offline       Completed without error       00%       672         -
    
    [Derp@freenas] /mnt/Vol1#
    Quote: Originally posted by joeschmuck
    Are you using the script I posted on 4-28-12 (most recent code)? If you made any changes feel free to post them here or PM me if you like.

    Please let me know what you find out.
    Yes I'm using the 4-28-12 version.
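
    The "Please wait 2 minutes" line in the output above suggests the wait could be read from smartctl itself instead of being hard-coded. A minimal sketch, assuming the output keeps the wording shown in this post; the function name wait_minutes is hypothetical.

    ```shell
    # wait_minutes: read "smartctl -t" output on stdin and print the number
    # from the "Please wait N minutes" line. The function name is hypothetical.
    wait_minutes()
    {
        sed -n 's/^Please wait \([0-9][0-9]*\) minute.*/\1/p'
    }

    # Possible use inside the script (not run here):
    #   mins=$(smartctl -t short /dev/ada2 | wait_minutes)
    #   sleep $(( (mins + 1) * 60 ))
    ```

    The extra minute in the commented usage is just a safety margin. If the line is missing from the output, the function prints nothing, so a real script would need a fallback value.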

  10. #20
    Senior Member joeschmuck
    Join Date
    May 2011
    Location
    Dark Side of the Moon
    Posts
    1,355
    So the test is completing fine.

    If you don't mind some troubleshooting: I think you might have 2 drives which have had problems, and the '-l error' option is reporting that. The script as written puts the error output in the subject line, whereas I only planned for a simpler fail message. Since I don't have any failed drives, it's not something I could readily test.

    Options:
    Type 'smartctl -n standby -l error -l selftest /dev/ada2' and see what it outputs. I think the '-l error' will bring some problems to light.

    Or

    type 'smartctl -l error /dev/ada2' and I'll bet it gives some error info.


    If that doesn't work try this...
    Change the end of the script, where sendmail is called, as follows (the hash disables the sendmail line, and cat prints the entire email instead):

    #sendmail -t < /var/cover1${drv}
    cat /var/cover1${drv}

    I hope you don't have two drives failing. If you do then maybe this was a good thing to find out before data loss occurs.
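
    The same debugging switch could be kept in the script permanently instead of editing it each time. A sketch only; the function name deliver and the DRYRUN variable are hypothetical and not part of the posted script.

    ```shell
    # deliver FILE: send FILE with sendmail, or print it when DRYRUN=1.
    # The function name and the DRYRUN variable are hypothetical.
    deliver()
    {
        if [ "${DRYRUN:-0}" -eq 1 ]; then
            cat "$1"
        else
            sendmail -t < "$1"
        fi
    }
    ```

    With this in place the sendmail line would become deliver /var/cover1${drv}, and running the script as "DRYRUN=1 sh /etc/esmart.sh /dev/ada2" would print the email instead of sending it.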
