Arun
Resolved
1 votes
Hi Friends,

I am using two Internet connections on the WAN side (ClearOS 7).
Both are 4G connections using SIM cards, and they keep dropping. So I implemented a small bash script (run as a cron job) to send a mail alert whenever a WAN connection goes down.

MY ISSUE:
-------------
During fail-over, the mail command fails to resolve the SMTP server. The portion of the script below runs in a loop, and the mail command keeps failing until the failed connection is restored. As connection1.log (below) shows, the nslookup command resolves the SMTP server, yet the mail command still fails.

- - - - 
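# Ping through the first WAN interface; act only if no probe gets a reply.
if ! ping -c 3 -I enp2s0 8.8.8.8 > /dev/null 2>&1
then
    # Log what 8.8.8.8 returns for the SMTP host when queried directly.
    nslookup smtp.gmail.com 8.8.8.8 >> connection1.log
    echo "Auto Generated MSG - Pls Restart 4GRouter-1" | mail -s "4G Alert: Connection-1 Down" linkdownalert@yahoo.com
fi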
- - - -
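
A note on why the two commands can disagree: `nslookup smtp.gmail.com 8.8.8.8` sends the query straight to 8.8.8.8, while `mail` and `ping` go through the system resolver configured on the gateway. The sketch below compares the two paths. It assumes `getent` and `dig` are installed (`dig` normally ships in the same package as `nslookup`); the function names are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Sketch only: function names are illustrative, not from the original script.

# Resolve HOST through the system resolver (the path mail and ping use).
resolves_via_system() {
    getent hosts "$1" > /dev/null 2>&1
}

# Resolve HOST by querying SERVER directly, bypassing /etc/resolv.conf.
# Succeeds only if dig exits cleanly and returns at least one record.
resolves_via_server() {
    local out
    out=$(dig +short +time=2 +tries=1 "@$2" "$1" 2> /dev/null) && [ -n "$out" ]
}

# Turn the two exit statuses (0 = success) into a one-line diagnosis.
diagnose_dns() {
    local system_rc=$1 direct_rc=$2
    if [ "$system_rc" -eq 0 ]; then
        echo "system resolver OK"
    elif [ "$direct_rc" -eq 0 ]; then
        echo "system resolver failing, direct query OK: check configured upstream DNS"
    else
        echo "no DNS reachable"
    fi
}

# Example use inside the alert branch:
#   resolves_via_system smtp.gmail.com; s=$?
#   resolves_via_server smtp.gmail.com 8.8.8.8; d=$?
#   diagnose_dns "$s" "$d" >> connection1.log
```

If the log shows the second diagnosis during an outage, the gateway's configured upstream DNS is the likely suspect rather than the remaining WAN link.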



I tested manually by disconnecting one WAN connection. The results were as follows:

[root@gateway ~]# ping smtp.gmail.com
ping: unknown host smtp.gmail.com
[root@gateway ~]# echo "Auto Generated MSG - Pls Restart 4GRouter-1" | mail -s "4G Alert: Connection-1 Down" kcarun@yahoo.com
[root@gateway ~]# Could not resolve host: smtp.gmail.com
"/root/dead.letter" 11/361
. . . message not sent.

[root@gateway ~]#



The contents of connection1.log are as follows:

---------------------------------------------------------------------
Sat Oct 22 08:17:13 AST 2016 Disconnection Detected
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached

;; connection timed out; trying next origin
;; connection timed out; no servers could be reached

Sat Oct 22 08:18:15 AST 2016 Router not responding.(may be rebooting)
Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Sat Oct 22 08:21:15 AST 2016 Router starts responding...
Router down Duration : 4 minutes and 2 seconds
Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
smtp.gmail.com canonical name = gmail-smtp-msa.l.google.com.
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.108
Name: gmail-smtp-msa.l.google.com
Address: 74.125.206.109

Mail Alert Sent.....
Sat Oct 22 08:23:01 AST 2016 Connection just Restored
Internet Outage Duration : 5 minutes and 48 seconds
---------------------------------------------------------------------






I ran this from the command line during the downtime:

[root@gateway ~]# ping smtp.gmail.com
ping: unknown host smtp.gmail.com
[root@gateway ~]# echo "Auto Generated MSG - Pls Restart 4GRouter-1" | mail -s "4G Alert: Connection-1 Down" linkdownalert@yahoo.com
[root@gateway ~]# Could not resolve host: smtp.gmail.com
"/root/dead.letter" 11/361
. . . message not sent.

[root@gateway ~]#
In Mail
Saturday, October 22 2016, 08:20 AM

Accepted Answer

Sunday, October 23 2016, 12:07 PM - #Permalink
Resolved
1 votes
I've completely misread the information provided.

All the DNS Server screen does is populate the hosts file. An entry there for 192.168.0.1 is fine. The other two for Google do nothing particularly helpful and can be removed.

In Webconfig > Network > Settings > IP Settings, edit your WAN interfaces and uncheck the "Automatic DNS Servers" box. You should then be able to edit the DNS servers on the IP Settings page. Change these to 8.8.8.8 and 8.8.4.4 (if you want GoogleDNS). This is the same as editing /etc/resolv-peerdns.conf manually and then restarting dnsmasq. (I got this hint from your post in the dnsmasq thread: I received an automatic e-mail with your original post before you asked for it to be deleted.)
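
As a rough sketch of the manual route (not an official ClearOS procedure), the helper below writes a resolver file listing the given DNS servers. The function name `write_peerdns` is hypothetical, and restarting dnsmasq with `systemctl` is an assumption about the host. If "Automatic DNS Servers" is still checked, the DHCP client may regenerate the file and overwrite the change.

```shell
#!/bin/bash
# Sketch only: write_peerdns is a hypothetical helper, not a ClearOS tool.

# write_peerdns FILE SERVER...: write one "nameserver" line per server to FILE.
write_peerdns() {
    local file=$1
    shift
    # Refuse to write an empty resolver list.
    [ "$#" -gt 0 ] || return 1
    # printf repeats its format once per remaining argument.
    printf 'nameserver %s\n' "$@" > "$file"
}

# Example (as root on the gateway; the restart command is an assumption):
#   write_peerdns /etc/resolv-peerdns.conf 8.8.8.8 8.8.4.4
#   systemctl restart dnsmasq
```

Using the Webconfig page remains the safer option, since it keeps the ClearOS configuration and the file in agreement.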
Responses (12)
  • Accepted Answer

    Saturday, October 22 2016, 09:24 AM - #Permalink
    Resolved
    1 votes
    I'm not sure how to troubleshoot this, as I don't know how ping and nslookup do their lookups (from cached results or otherwise). How have you configured your DNS servers? Are they the ISP's servers, public servers, or your own fully fledged DNS server? When you have multiwan with different ISPs, it is not good to use the ISPs' DNS servers. You should use a public service such as GoogleDNS or OpenDNS (or your own server such as BIND or Unbound).
  • Accepted Answer

    Saturday, October 22 2016, 12:13 PM - #Permalink
    Resolved
    1 votes
    In addition to Nick's comments...

    Are both of the 4G connections using sim-cards going to the same provider? If so, are they on the same sub-net?

    It would be good to see an extract of /var/log/syswatch from just before, during and after a connection goes down (within code tags, please).

    What is the reason for using fail-over rather than the default Multi-WAN weighting with both WANs active at the same time?
  • Accepted Answer

    Sunday, October 23 2016, 08:04 AM - #Permalink
    Resolved
    1 votes
    Are your routers in router or bridge mode? What is the output of "ifconfig | grep ^e -A 1"?
  • Accepted Answer

    Sunday, October 23 2016, 10:09 AM - #Permalink
    Resolved
    1 votes
    Since you have a DNS server showing as 192.168.0.1 - I assume you are running dnsmasq.
    Is dnsmasq running correctly? Why isn't 192.168.0.1 the first DNS, rather than third, to take advantage of caching?

    You didn't answer re. fail-over versus weighting - perhaps the best way is to show us your /etc/clearos/multiwan.conf file

    Is your ClearOS system up-to-date? Output of "rpm -qa | grep multiwan"

    From your /var/log/syswatch extract you should have had continual Internet access as there is no indication that the enp3s0 interface failed. Can we have the full ifconfig output for the WAN interfaces so we can see the traffic distribution between them...
  • Accepted Answer

    Sunday, October 23 2016, 02:21 PM - #Permalink
    Resolved
    1 votes
    There are programs you can use to test DNS lookup response times, but I'm not sure there is any point in setting your modem/routers as the primary and secondary DNS servers. Dnsmasq acts as a cache, but if your routers do as well, then you are caching a cache. Also, if your 192.168.10.1 router goes down, every DNS lookup will have to time out before it switches to your secondary DNS, the other router. This could really slow down DNS lookups when you lose that WAN. I'd just settle for 8.8.8.8 and 8.8.4.4 (actually I use OpenDNS or Unbound for DNS).
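
One way to compare resolvers, as a hedged sketch: `dig` prints a `;; Query time: N msec` line, which can be extracted for each server. The function names below are hypothetical, and `dig` is assumed to be installed on the gateway.

```shell
#!/bin/bash
# Sketch only: helper names are illustrative.

# Read dig output on stdin and print the query time in milliseconds.
parse_query_time() {
    awk '/^;; Query time:/ { print $4; exit }'
}

# Print "SERVER TIME_MS" for one lookup of HOST against SERVER,
# or "SERVER timeout" if dig produced no timing line.
time_lookup() {
    local host=$1 server=$2 ms
    ms=$(dig +time=2 +tries=1 "@$server" "$host" 2> /dev/null | parse_query_time)
    echo "$server ${ms:-timeout}"
}

# Example:
#   for s in 8.8.8.8 8.8.4.4 192.168.10.1; do
#       time_lookup smtp.gmail.com "$s"
#   done
```

Running the loop while one WAN is down should make the time-out penalty described above visible for the router on the failed link.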
  • Accepted Answer

    Arun
    Sunday, October 23 2016, 07:32 AM - #Permalink
    Resolved
    0 votes
    Hi Nick Howitt & Tony Ellis, as per your advice, here are more details for further analysis.

    My DNS server configuration:

    https://i.imgur.com/xSTKgBd.png

    Yes, I am using Multi-WAN in my environment, and this issue happens if one connection goes down.

    Yes, both 4G connections are from the same provider but on different subnets.

    Syswatch log (from disconnection to restoration; I restarted the router here):



    [root@gateway ~]# tail -f /var/log/syswatch
    Sun Oct 23 09:31:43 2016 info: system - heartbeat...
    Sun Oct 23 09:36:46 2016 info: system - heartbeat...
    Sun Oct 23 09:41:48 2016 info: system - heartbeat...
    Sun Oct 23 09:46:50 2016 info: system - heartbeat...
    Sun Oct 23 09:51:52 2016 info: system - heartbeat...
    Sun Oct 23 09:56:55 2016 info: system - heartbeat...
    Sun Oct 23 10:01:57 2016 info: system - heartbeat...
    Sun Oct 23 10:06:59 2016 info: system - heartbeat...
    Sun Oct 23 10:12:03 2016 info: system - heartbeat...
    Sun Oct 23 10:17:05 2016 info: system - heartbeat...
    Sun Oct 23 10:20:09 2016 debug: enp2s0 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)
    Sun Oct 23 10:20:11 2016 info: enp2s0 - ping check on server #1 failed - 8.8.8.8
    Sun Oct 23 10:20:18 2016 info: enp2s0 - ping check on server #2 failed - 54.152.208.245
    Sun Oct 23 10:20:18 2016 warn: enp2s0 - connection warning
    Sun Oct 23 10:20:30 2016 debug: enp2s0 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)
    Sun Oct 23 10:20:32 2016 info: enp2s0 - ping check on server #1 failed - 8.8.8.8
    Sun Oct 23 10:20:39 2016 info: enp2s0 - ping check on server #2 failed - 54.152.208.245
    Sun Oct 23 10:20:39 2016 warn: enp2s0 - connection warning
    Sun Oct 23 10:20:51 2016 debug: enp2s0 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)
    Sun Oct 23 10:20:53 2016 info: enp2s0 - ping check on server #1 failed - 8.8.8.8
    Sun Oct 23 10:21:00 2016 info: enp2s0 - ping check on server #2 failed - 54.152.208.245
    Sun Oct 23 10:21:00 2016 warn: enp2s0 - upstream connection problems with your ISP
    Sun Oct 23 10:21:00 2016 info: system - changing active WAN list - enp3s0 (was enp3s0 enp2s0)
    Sun Oct 23 10:21:00 2016 info: system - current WANs in use - enp3s0
    Sun Oct 23 10:21:00 2016 info: system - restarting firewall
    Sun Oct 23 10:21:12 2016 debug: enp2s0 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)
    Sun Oct 23 10:21:14 2016 info: enp2s0 - ping check on server #1 failed - 8.8.8.8
    Sun Oct 23 10:21:21 2016 info: enp2s0 - ping check on server #2 failed - 54.152.208.245
    Sun Oct 23 10:21:21 2016 warn: enp2s0 - upstream connection problems with your ISP
    Sun Oct 23 10:21:33 2016 debug: enp2s0 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)
    Sun Oct 23 10:21:35 2016 info: enp2s0 - ping check on server #1 failed - 8.8.8.8
    Sun Oct 23 10:21:42 2016 info: enp2s0 - ping check on server #2 failed - 54.152.208.245
    Sun Oct 23 10:21:42 2016 warn: enp2s0 - upstream connection problems with your ISP
    Sun Oct 23 10:21:54 2016 debug: enp2s0 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)
    Sun Oct 23 10:21:56 2016 info: enp2s0 - ping check on server #1 failed - 8.8.8.8
    Sun Oct 23 10:22:03 2016 info: enp2s0 - ping check on server #2 failed - 54.152.208.245
    Sun Oct 23 10:22:03 2016 warn: enp2s0 - upstream connection problems with your ISP
    Sun Oct 23 10:22:13 2016 info: system - heartbeat...
    Sun Oct 23 10:22:15 2016 debug: enp2s0 - ping check on server #1 failed - 8.8.8.8 (ping size: 1)
    Sun Oct 23 10:22:17 2016 info: enp2s0 - ping check on server #1 failed - 8.8.8.8
    Sun Oct 23 10:22:20 2016 info: enp2s0 - ping check on server #2 passed - 54.152.208.245
    Sun Oct 23 10:22:20 2016 info: system - changing active WAN list - enp3s0 enp2s0 (was enp3s0)
    Sun Oct 23 10:22:20 2016 info: system - current WANs in use - enp3s0 enp2s0
    Sun Oct 23 10:22:20 2016 info: system - restarting firewall
    Sun Oct 23 10:22:41 2016 info: enp2s0 - ping check on server #1 passed - 8.8.8.8
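
To pull just the fail-over events out of a log like the one above, a filter along these lines may help. It is a minimal sketch: the function name `wan_changes` is hypothetical, and it assumes the syswatch message format stays as shown here.

```shell
#!/bin/bash
# Sketch only: wan_changes is a hypothetical helper.

# wan_changes FILE: for each "changing active WAN list" entry in FILE,
# print the time of day followed by the new list and the "(was ...)" note.
wan_changes() {
    sed -n 's/^.* \([0-9:]\{8\}\) [0-9]\{4\} info: system - changing active WAN list - \(.*\)$/\1 \2/p' "$1"
}

# Example:
#   wan_changes /var/log/syswatch
```

For the extract above, this would list one entry at 10:21:00 (enp2s0 dropped) and one at 10:22:20 (enp2s0 restored).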



    One additional piece of information: when I click the Support menu in the ClearOS Webconfig page, I get a pop-up error, "DNS Lookup Failed".


    https://i.imgur.com/Qb2HRoW.png

    I hope this helps with further diagnostics.
  • Accepted Answer

    Arun
    Sunday, October 23 2016, 08:40 AM - #Permalink
    Resolved
    0 votes
    Nick Howitt wrote:

    Are your routers in router or bridge mode? What is the output of "ifconfig | grep ^e -A 1"?


    1)Routers are not in bridge mode

    2) Output of " ifconfig | grep ^e -A 1"

    [root@gateway ~]# ifconfig | grep ^e -A 1
    enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
    inet 192.168.10.3 netmask 255.255.255.0 broadcast 192.168.10.255
    --
    enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
    inet 192.168.20.4 netmask 255.255.255.0 broadcast 192.168.20.255
    --
    enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
    inet 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255
    [root@gateway ~]#

  • Accepted Answer

    Arun
    Sunday, October 23 2016, 11:11 AM - #Permalink
    Resolved
    0 votes
    I am not sure about dnsmasq; I think this system is not using it (I am using the default installation of ClearOS 7).


    [root@gateway ~]# cat /etc/clearos/multiwan.conf
    MULTIPATH="on"
    MULTIPATH_WEIGHTS="enp2s0|1 enp3s0|1"
    EXTIF_BACKUP=""
    EXTIF_STANDBY=""
    [root@gateway ~]#



    [root@gateway ~]# rpm -qa | grep multiwan
    app-multiwan-core-2.2.1-1.v7.noarch
    app-multiwan-2.2.1-1.v7.noarch
    [root@gateway ~]#



    [root@gateway ~]# ifconfig
    enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
    inet 192.168.10.3 netmask 255.255.255.0 broadcast 192.168.10.255
    inet6 fe80::c24a:ff:fe02:acd9 prefixlen 64 scopeid 0x20<link>
    ether c0:4a:00:02:ac:d9 txqueuelen 1000 (Ethernet)
    RX packets 452853168 bytes 523653322393 (487.6 GiB)
    RX errors 0 dropped 0 overruns 0 frame 0
    TX packets 298430568 bytes 43236189694 (40.2 GiB)
    TX errors 1168 dropped 49632 overruns 0 carrier 0 collisions 0

    enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
    inet 192.168.20.4 netmask 255.255.255.0 broadcast 192.168.20.255
    inet6 fe80::c24a:ff:fe02:74f7 prefixlen 64 scopeid 0x20<link>
    ether c0:4a:00:02:74:f7 txqueuelen 1000 (Ethernet)
    RX packets 467165704 bytes 545652884137 (508.1 GiB)
    RX errors 0 dropped 0 overruns 0 frame 0
    TX packets 312450650 bytes 48837773783 (45.4 GiB)
    TX errors 1068 dropped 44979 overruns 0 carrier 0 collisions 0

    enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
    inet 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255
    inet6 fe80::4637:e6ff:fedf:de17 prefixlen 64 scopeid 0x20<link>
    ether 44:37:e6:df:de:17 txqueuelen 1000 (Ethernet)
    RX packets 611518636 bytes 94141137829 (87.6 GiB)
    RX errors 0 dropped 0 overruns 0 frame 0
    TX packets 917009196 bytes 1066812967458 (993.5 GiB)
    TX errors 1 dropped 0 overruns 0 carrier 0 collisions 0

    lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
    inet 127.0.0.1 netmask 255.0.0.0
    inet6 ::1 prefixlen 128 scopeid 0x10<host>
    loop txqueuelen 0 (Local Loopback)
    RX packets 82667 bytes 7445167 (7.1 MiB)
    RX errors 0 dropped 0 overruns 0 frame 0
    TX packets 82667 bytes 7445167 (7.1 MiB)
    TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

    [root@gateway ~]#


    Do I need to move "192.168.0.1 gateway.qf2.com" to the first line?

    [root@gateway ~]# cat /etc/hosts
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
    8.8.8.8 google-public-dns-a.google.com
    8.8.4.4 google-public-dns-b.google.com
    192.168.0.1 gateway.qf2.com
    [root@gateway ~]#
  • Accepted Answer

    Arun
    Sunday, October 23 2016, 12:00 PM - #Permalink
    Resolved
    0 votes
    Maybe this is the issue? Are more entries required in the file below?


    [root@gateway dnsmasq]# cat /etc/resolv-peerdns.conf
    ; generated by /usr/sbin/dhclient-script
    nameserver 192.168.10.1
    [root@gateway dnsmasq]#


    /etc/dnsmasq.conf file details for reference

    [root@gateway dnsmasq]# cat /etc/dnsmasq.conf
    bogus-priv
    cache-size=5000
    conf-dir=/etc/dnsmasq.d
    dhcp-authoritative
    dhcp-lease-max=1000
    domain-needed
    domain=qf2.com
    expand-hosts
    no-negcache
    port=53
    resolv-file=/etc/resolv-peerdns.conf
    strict-order
    user=nobody
    [root@gateway dnsmasq]#
    [root@gateway dnsmasq]#
  • Accepted Answer

    Sunday, October 23 2016, 11:55 AM - #Permalink
    Resolved
    0 votes
    I'm not so sure about 192.168.0.1. I think it should be removed, otherwise DNS lookups could loop. There should only be external resolvers here or, perhaps, an internal resolver which has its own external resolvers. To take advantage of DNS caching on the LAN, the DHCP server should be configured with a DNS server of 192.168.0.1.

    One thing about multiwan which I am never sure about is what happens when the next hop is an unbridged router/modem. Logically, to me, the ClearOS WAN still appears to work because it has a functioning connection to the router/modem. It makes sense to me to try to bridge the modem/router if it supports such a mode.
  • Accepted Answer

    Arun
    Sunday, October 23 2016, 12:43 PM - #Permalink
    Resolved
    0 votes
    Hi Nick Howitt,

    Your thinking was correct. The issue is now solved (I have tested this).

    The current modifications are as follows:

    https://i.imgur.com/AY4EXGf.png


    Thank you very much, Mr. Tony Ellis & Mr. Nick Howitt, for your valuable contributions!
  • Accepted Answer

    Arun
    Monday, October 24 2016, 05:56 AM - #Permalink
    Resolved
    0 votes
    Nick Howitt wrote:

    There are programs you can use to test DNS lookup response times, but I'm not sure there is any point in setting your modem/routers as the primary and secondary DNS servers. Dnsmasq acts as a cache, but if your routers do as well, then you are caching a cache. Also, if your 192.168.10.1 router goes down, every DNS lookup will have to time out before it switches to your secondary DNS, the other router. This could really slow down DNS lookups when you lose that WAN. I'd just settle for 8.8.8.8 and 8.8.4.4 (actually I use OpenDNS or Unbound for DNS).


    As you suggested, I changed the DNS server IPs to 8.8.8.8 and 8.8.4.4. Thank you very much!