Announcement

Collapse
No announcement yet.

Sudden high CPU usage

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Sudden high CPU usage

    Hi,

    We are running NxFilter 4.6.8.1 on Ubuntu 18.04. For the last three days, the NxFilter process suddenly started consuming 300% CPU during daytime - approx. 07:30 to 18:00.

    Here's the weekly VM CPU usage:

    Click image for larger version

Name:	nxfilter-cpu.png
Views:	158
Size:	36.8 KB
ID:	2606

    And here's the daily usage, where the CPU ramping up can be seen:

    Click image for larger version

Name:	nxfilter-cpu2.png
Views:	94
Size:	31.6 KB
ID:	2607

    Some details of this installation:

    NxFilter 4.6.8.1, connected to Active Directory, approx. 500 users and 1000 IPs. Log retention is 1 day, database size is around 11 GB. Using Cloudflare as upstream DNS, no DNS over HTTPS.
    Categorization databse is Globlist, classifier is off, default policy is forbid all except the whitelisted domains/keywords.


    Here are the thing we tried while trying to find out what's going on:

    1. Restarted NxFilter
    2. Monitoring UDP/53 traffic on all server interfaces - no flooding or unusual traffic can be seen.
    3. Dropped all UDP/53 traffic to the VM for a minute - no change in CPU usage.
    4. Switching in log4j.properties from WARN to INFO and then DEBUG to see what's going on - nothing interesting on INFO, too much to filter on DEBUG.


    I can provide more info if needed, if logs are requested, please specify at which level.

    Thanks in advance!

  • #2
    It's more about how many requests it gets. Show me your usage report from 'Report > Usage'. And you can see the number of request and UDP packets in /nxfilter/log/nxfilter.log file.

    Code:
     INFO [11-18 23:20:04] - DSf, udpCnt = 8, tlsCnt = 0, aclDropCnt = 0, queryCnt = 8, customCnt = 0, authRediCnt = 0, authDropCnt = 0, refreshCnt = 7, pstCacheCnt = 4605, respCacheHitCnt = 7, cacheHitRatio = 87.5%.
     INFO [11-18 23:20:04] - MHMr, Stats, reqCnt = 14466, blockCnt = 1070, domainCnt = 5000, userCnt = 1, tokenNameCnt = 0, cltIpCnt = 1, malwareCnt = 0, fwdErrCnt = 0, domainTestCnt = 0, xUserCnt = 1, xIpCnt = 1, xReqCnt = 14468.
    udpCnt is for the count of UDP packets for one minute and reqCnt is the daily count of DNS request.

    Comment


    • #3
      Thank you!

      Here's the requested Usage, I only have data for the last three days:


      Click image for larger version  Name:	nxfilter-usage.png Views:	0 Size:	45.4 KB ID:	2613

      And here are udpCnt and reqCnt during the period of 300% CPU usage:

      Code:
      INFO [11-19 15:47:18] - DSf, udpCnt = 3715, tlsCnt = 0, aclDropCnt = 0, queryCnt = 3715, customCnt = 0, authRediCnt = 237, authDropCnt = 1, refreshCnt = 179, pstCacheCnt = 0, respCacheHitCnt = 1097, cacheHitRatio = 29.53%.
      INFO [11-19 15:47:18] - MHMr, Stats, reqCnt = 39963, blockCnt = 7283, domainCnt = 1144, userCnt = 179, tokenNameCnt = 0, cltIpCnt = 397, malwareCnt = 0, fwdErrCnt = 0, domainTestCnt = 0, xUserCnt = 0, xIpCnt = 0, xReqCnt = 12080.
      And this is now (off-hours), at 30-40% CPU usage:

      Code:
      INFO [11-19 20:03:08] - DSf, udpCnt = 1940, tlsCnt = 0, aclDropCnt = 0, queryCnt = 1940, customCnt = 0, authRediCnt = 627, authDropCnt = 0, refreshCnt = 135, pstCacheCnt = 0, respCacheHitCnt = 380, cacheHitRatio = 19.59%.
      INFO [11-19 20:03:08] - MHMr, Stats, reqCnt = 10732, blockCnt = 492, domainCnt = 1062, userCnt = 102, tokenNameCnt = 0, cltIpCnt = 280, malwareCnt = 0, fwdErrCnt = 0, domainTestCnt = 0, xUserCnt = 0, xIpCnt = 0, xReqCnt = 3704.
      If I am not mistaken, 1/3 of the traffic is served with 1/6 of the CPU usage at the moment. Is is possible some client to send such requests to overload the server like that?

      I would love your insight on that.

      EDIT: Half an hour later, down to 0-3% CPU:

      Code:
      INFO [11-19 20:34:01] - DSf, udpCnt = 1252, tlsCnt = 0, aclDropCnt = 0, queryCnt = 1252, customCnt = 0, authRediCnt = 50, authDropCnt = 7, refreshCnt = 186, pstCacheCnt = 0, respCacheHitCnt = 414, cacheHitRatio = 33.07%.
      INFO [11-19 20:34:01] - MHMr, Stats, reqCnt = 26853, blockCnt = 880, domainCnt = 1561, userCnt = 104, tokenNameCnt = 0, cltIpCnt = 297, malwareCnt = 0, fwdErrCnt = 0, domainTestCnt = 0, xUserCnt = 0, xIpCnt = 0, xReqCnt = 8866.
      Last edited by azidhaka; 11-19-2023, 05:36 PM.

      Comment


      • #4
        udpCnt= 3715 means that your NxFilter needs to handle 3715 UDP packets a minute or 3715 DNS queries a minute. That's more than 60 queries for a second. If they are new domains, there's no caching works. So, it's like you send more than 60 DNS queries to your upstream server and wait for the answers.

        It depends on a lot of factors but 3715 is a big number. You need to find out who sends what kind of DNS queries at the time. In some cases, there's some malware sending type 255 or ANY type queries. This happens mostly when you run NxFilter on cloud without proper security measure. But it can happen in a local network as well. And there are some security softwares sending and receiving excessive amount of TXT queries. If it's one of these cases, you can block their type on 'DNS > Server Protection'.

        If it's about normal users sending too many queries then you have to put more servers or use NxRelay especially when you run it on cloud.
        Last edited by support200; 11-19-2023, 06:16 PM.

        Comment

        Working...
        X